Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

descriptors (getset, wrapper, method), __new__, __init__, builtin_function_or_method, hack js inheritance to match py #1092

Merged
merged 761 commits into from Feb 16, 2021

Conversation

s-cork
Copy link
Contributor

@s-cork s-cork commented May 7, 2020

It aims to: close #1043, close #1090, close #1075, close #942, close #1093, close #1084, close #1094, close #819, close #1195


The aim of the pr is to:

  1. improve inheritance of type objects (force type onto their prototypical chains!)
  2. implement getset_descriptor, 'wrapper_descriptor', 'method_descriptor'
  3. implement mappingproxy for type objects
  4. native implementations of property, classmethod, staticmethod
  5. create the type builtin_function_or_method aka Sk.builtin.sk_method
  6. new api for building native typeobjects

inheritance (type and object and maybe metaclasses)

since the implementation of #1131 all type objects are instances of Sk.builtin.type and this pr uses the benefits that prototypical inheritances brings.

When doing lookups we no longer have to walk the mro for single inherited objects and do a direct access of the property.

Paving the way for metaclasses:
You can now inherit from type. Though meta classes are not supported since kwargs are not supported in the compiler for class B(metaclass=A)

class A(type): pass # works fine
>>> A.__mro__
(<class '__main__.A'>, <class 'type'>, <class 'object'>)

Multiple inheritance

since multiple inheritance breaks protoypical inheritance I set a new flag sk$prototypical on all objects.

  • This flag is created in $mroMerge function.
  • If this flag returns false we take the slow path in various instances (like typeLookup) and check up the MRO.
    If this flag is true we take the fast path and simply do the equivalent of
   if (this.prototype.sk$prototypical) {
        return this.prototype[jsName];
    } else {
        ... // check the MRO
    }

__new__ and __init__

  • by implementing tp$init and tp$new we get to be stricter about how the constructor can be used internally.
  • All constructors now enforce new as a convention
  • python will be forced to use the tp$call tp$new tp$init methods in order to create an object.
  • some constructors allow python objects for convenience but this is usually a slower path and not equivalent to python in all cases. e.g. new Sk.builtin.float_(new Sk.builtin.int_(1)) is valid because nb$float is defined. But passing a python string to the float constructor won't work.

getset_descriptors

to implement getset_descriptors effectively I move all klass attributes onto their prototype.
this improves the logic for klass.tp$getattr

implementing getset_descriptors allows for the removal of __class__, __name__, __bases__, __mro__, __dict__ from the code base, apart from as getset_descriptors on either Sk.builtin.type or Sk.builtin.object

tp$mro is now an array. Since this is critical for typeLookup it seems sensible.

Similarly all sk$klass objects get a __dict__ getset_descriptor which returns the $d for the instance of that klass... removing the need to hack __dict__


mapping proxy

a mappingproxy is a non-writeable dict like object. By implementing a mappingproxy I remove the $d from all type objects. This removes the requirement to look inside the $d for lookups. When the user asks for klass.__dict__ they hit the type getset_descriptor for __dict__ which gathers up the attributes on the prototype and returns a mapping_proxy

the mappingproxy implementation is close to Cpython and can also be used with the types module with tests.


Suggested new API for building native types:

Sk.builtin.property = Sk.abstr.buildNativeClass("property", {
    constructor: function property(fget, fset, fdel, doc) {...
    },
    slots: {
        tp$new: Sk.generic.new,
        tp$init: function (args, kwargs) {...},
        tp$descr_get: function (obj, type) {...},
        tp$descr_set: function (obj, value) {...},
        tp$doc: "..."
    },
    methods: {
        getter: {
            $meth: function getter (fget) {...},
            $flags: { OneArg: true },
            $textsig: "...",
            $doc: "..."
        },
        ...
    },
    getsets: {
        __doc__: {
            $get: function () { return this.prop$doc },
            $set: function (value) {...},
            $doc: "..."
        },
        ...
    }
});

A limitation of this API is that you can only have one base which is true of all builtins.
there are other options that be tagged onto the object literal including:

  • constructor: the JS implementation - how to create an instance of the object (defaults to new Function)
  • slots: any slot functions (defining any slot function gives the equivalent dunder method)
  • getsets: getset_descriptors e.g. __dict__
  • methods: e.g. str.count
  • classmethods: e.g. dict.fromkeys
  • base: defaults to Sk.builtin.object
  • meta: defaults to Sk.builtin.type
  • proto: anything here will be added onto the klass.prototype
    • useful for private methods that are neither slots nor getsets nor methods
  • flags: anything here will be added directly onto the klass

There are a bunch of other changes that are part of this PR

  • move Sk.builtin.Genric* to Sk.generic.*
    • a place for templates like Sk.generic.new... etc
  • wrapper_descriptors
    • eliminates the need to define both the slot and the function on builtin type
      • this is done in Sk.abstr.setUpSlots
    • allows for full support of dynamically created magic methods
    • stops the requirement to use dunder methods when slot functions should be used
  • builtin_function_or_method type (Sk.buitlin.sk_method) used by:
    • method_descriptors mean the call signature can be defined and separates the function.tp$call api from sk_method.tp$call api
  • showcase math with the new api - all Sk.builtin.func become Sk.builtin.sk_method objects with little fuss.
  • showcase collections with the new api (including an improved namedtuple (close Named Tuple #1040) implementation)
  • showcase itertools with the new api - these are now all type objects rather than hacked generators
  • showcase operator with the new api and adding missing features.
  • skulpt flags:
    • sk$acceptable_as_base_class - not all builtins are subclassable
    • sk$object - a way to determine that you have a python object e.g.Sk.asserts.assert(obj.sk$object)
    • sk$prototypical
    • sk$builtinBase - the most builtin base of an object
    • sk$baseClass - a flag on type objects that are direct children of object
  • all private methods now have $ rather than also ending in an _
  • generally building on top of the ideas from Proposed Breaking Changes for Performance #1014

@s-cork s-cork force-pushed the prototypical_getsets_mappingproxy branch from 5d51d05 to a450657 Compare May 7, 2020 02:34
@s-cork s-cork changed the title Prototypical getset_discriptors and mappingproxy hack js inheritance to match py, full descriptoror support, __new__, __init__, builtin_method_or_function type May 23, 2020
@s-cork s-cork changed the title hack js inheritance to match py, full descriptoror support, __new__, __init__, builtin_method_or_function type full descriptoror support, __new__, __init__, builtin_method_or_function type, hack js inheritance to match py, May 23, 2020
@s-cork s-cork changed the title full descriptoror support, __new__, __init__, builtin_method_or_function type, hack js inheritance to match py, descriptors (getset, wrapper, method), __new__, __init__, builtin_method_or_function, hack js inheritance to match py May 23, 2020
@s-cork s-cork changed the title descriptors (getset, wrapper, method), __new__, __init__, builtin_method_or_function, hack js inheritance to match py descriptors (getset, wrapper, method), __new__, __init__, builtin_function_or_method, hack js inheritance to match py May 23, 2020
@s-cork s-cork mentioned this pull request May 26, 2020
@s-cork
Copy link
Contributor Author

s-cork commented May 28, 2020

Ran speed tests on a version this branch with fields.py taking the avg of 10 loops.

this branch

avg in repl: 2.40860002041
avg in browser: 2.670786499977112

master

avg in repl: 2.813100004196167
avg in browser: 3.221490049362183

so approx 17% improvement...

@albertjan
Copy link
Member

Holy moly that's some improvement @s-cork

@albertjan
Copy link
Member

I'll see if I can find some time today to go through this 😄

@s-cork
Copy link
Contributor Author

s-cork commented May 29, 2020

I'll see if I can find some time today to go through this 😄

I wouldn't look at this branch! I haven't pushed any changes to this branch since I started hacking too deep! And I went quite deep...

It's far from ready for review. But if you're curious about current progress, it can be found here. It's mostly me hacking away with these ideas, but it does present some nice features that I hope will make it into this pr.
https://github.com/s-cork/skulpt/tree/pr/hacking_for_skulpt

The branch mostly works - (I'm down to to 24 failing py unit 2 tests...)

interesting files to look at:


Once you start implementing descriptors the game changes... And I've had a bit too much time...

At this stage I'm most curious about - what you think of the new api...


p.s. I wouldn't check the speeds right now. After my day of hacking I lost it somewhere so will need to check that!
p.p.s I got the speed improvements back - I had a few rogue debugger statements (one of which was in genericGetAttr!

@albertjan
Copy link
Member

I'm still trying to get my head around all this but I have a few questions:

  • Our main API that is used outside of this repository is probably Sk.builtin.func this would continue to work right? Sk.builtin.func(function() { return Sk.ffi.remapToPy("test"); })
  • a module is still just a function that returns a map?
  • What will happen with Sk.builtin.buildClass?

These are the things that are most exposed to the outside and would be most painful to break.

@s-cork
Copy link
Contributor Author

s-cork commented May 30, 2020

In my head i would leave the Sk.builtin.func and Sk.misceval.buildClass api
untouched.

In fact Sk.miscval.buildClass was perfectly primed for this since to build
the class it does Sk.misceval.callsimArray(Sk.builtin.type, args)

Which now uses the tp$call method of type rather than the constructor
function.

Similarly the compiler does tp$call ?
And now all builtin types have that method. So no need to use
applyOrSuspend.

The new api doesn’t replace the existing api by any means, it is a
complement - a way to build a native skulpt type with skulpt slots more
concisely.

With Sk.misceval.buildClass you get functions and attributes. With
Sk.abst.buildNativeClass you get slot_wrappers, method_descriptors and
getset_descriptors.

Internally we can use it. And anyone who wants to build a skulpt_slot based
implementation could use it.

My aim with this pr will be to just show the bare bones - which I hope will
be adapting type, adding descriptors, adding tp$new and tp$init methods and
removing all the slot wrappers that we’ve written internally as
Sk.builtin.funcs

A benefit with this pr will hopefully be that:
Instead of defining nb$add as a js func and then __add__ as an
Sk.builtin.func, just define the former and let Sk.abstr.setUpSlots do all
the work for you.


and no change to how to build a module.

@s-cork
Copy link
Contributor Author

s-cork commented May 31, 2020

Update: I did some more hacking today and my times for fields.py for 100 Ticks - avg of 10 loops

avg in repl: 1.06900002956
avg in browser: 1.246094465255737
around 60% improvement from master

I think this is because I was working on our use of number slots.
removing the need to check __add__ etc since we're guaranteed to define nb$add if __add__ is defined and vice versa.
I mean I hope that's what it was - fields.py does a lot of number calculations...


and on pystone.py (10000 passes avg of 10 loops)

hacking branch: 0.823798500001 seconds
master branch: 2.10569200001 seconds

@albertjan
Copy link
Member

That's massive @s-cork. 🎉 🚀

@s-cork
Copy link
Contributor Author

s-cork commented Jun 2, 2020

update on this branch - I now have a version that's passing tests so I've pushed to this branch. It's still quite rough around the edges...

@s-cork s-cork closed this Jun 2, 2020
@s-cork s-cork reopened this Jun 2, 2020
@meredydd
Copy link
Contributor

meredydd commented Jun 2, 2020

This is magnificent stuff -- it makes the type system simultaneously cleaner and more correct, and makes using Skulpt from JS a whole lot easier. I look forward to going over it in more detail.

Am I right in thinking this is a breaking change? (At very least, it moves the Generic* functions.) I suggest we try to get it reviewed and merged basically as soon as @bnmnetp cuts the next release :)

@s-cork
Copy link
Contributor Author

s-cork commented Jun 3, 2020

Am I right in thinking this is a breaking change? (At very least, it moves the Generic* functions.)

I have yet to document anything but things that break - but the short answer is - yes - it's breaking.

Particularly on constructing a builtin e.g. the only valid call to tuple's constructor is:

new Sk.buitin.tuple(Array|undefined)
Sk.builtin.tuple(Array) -> "assertion fail must be called with 'new'"
new Sk.builtin.tuple(pyIterable) -> "assertion fail must be called with an Array"
/*to convert a pyIterable to a tuple you'd need to use*/
Sk.misceval.callsimArray(Sk.builtin.tuple, [pyIterable])

Typically a constructor should only ever be called with a JS object (this improves the speed and is useful for the compiler which already calls the constructor with the JS object). The one exception is str mostly because we've used this throughout the codebase as a conversion tool. (But that one is fairly tolerable because all py objects have a tp$str method inherited from object so new Sk.builtin.str(pyObject) is really a short hand for pyObject.tp$str())


Other breaking changes off the top of my head

Sk.builtin.object.prototype.GenericGetAttr -> Sk.generic.getAttr
Sk.builtin.object.prototype.GenericSetAttr -> Sk.generic.setAttr
Sk.builtin.object.prototype.GenericPython(Get/Set)Attr -> removed
Sk.builtin.type.typeLookup -> removed
all private methods on builtins (will) now have $ rather than a trailing underscore e.g.
Sk.builtin.list.prototype.list_sort_ -> Sk.builtin.list.prototype.$list_sort

(not especially breaking but notable)

  • Py2 methods -> Py3 methods
    (unbound methods are no longer a thing, so im_klass is no longer a thing)
  • Old style classes no longer supported -
    (all classes inherit from object in py2 and py3)

I'll try and separate some logic in this pr and create a separate pr for small non breaking changes that can be pre-release changes - e.g. everywhere I found we weren't using new but should have been.

@s-cork
Copy link
Contributor Author

s-cork commented Jun 5, 2020

a very minor detail - related to implementing slots in this pr - in python 3.8:

bpo-36793: Removed __str__ implementations from builtin types bool, int, float, complex 
and a few classes from the standard library. They now inherit __str__() from object.

This seems sensible to me. removes some repeat code in the code base. I'll put this into this pr unless any strong objections. Only changes I think 1 test in our library for multiple inheritance where it should override __repr__ instead of __str__

@s-cork
Copy link
Contributor Author

s-cork commented Jun 15, 2020

An update on this branch. I've recently been working on int and unifying int and long.

I've got it to a place where py2 compatibility is maintained by making long a trivial subclass of int.
There were a few tests where arithmetic operations on longs may become ints in py2 mode.
In py3 subclassing from int should now work as expected.

I used JSBI over biginteger. https://github.com/GoogleChromeLabs/jsbi


Things I'm struggling with:
Closure compiler - @rixner - I saw you did some work on this in your biginteger pr request but I wasn't sure how to achieve what you did.
I couldn't actually fully remove biginter without closure complaining.
And similarly it complains for jsbi.

I'm doing well with speed. fields.py is now consistently running at 0.6 seconds on this branch (in the repl). Which is around 80% speed improvement on master! and only 3 times slower than Cpython.

though I have an oddity that if I remove a function on Sk like Sk.setupDictIterators which is no longer called in the codebase - the speed becomes twice as slow. And I've no idea why. 🤷‍♂️


lots left to do... but just in case you felt this was going cold.

@s-cork
Copy link
Contributor Author

s-cork commented Jun 21, 2020

Another update on this branch.
I've now merged master into this branch and am relatively close to a pull request that's worth reviewing.
I've merged #1013 into this branch.
As well as fixing #949 using the ideas @acbart suggested.

I've got a slightly adapted tact for name mangling. You can see the approach in the string constructor. There also seemed to me, no reason to treat mangling any different for names/words so I unified the approach.


Speed update
I upped the TICKS in the fields.py test from 100 to 1000 in the repl taking the average of as many rounds as I was prepared to wait...

this_branch: 4.4  seconds (repl py3)
master:      28.5 seconds (repl py3)
Cpython:     2.2  seconds (ipython terminal macos)
Brython:     130  seconds (console firefox)

so overall approx 85% improvement on master and running at 2x Cpython 🔥

This was referenced Dec 19, 2020
@bennorth
Copy link
Contributor

I can add another success report for this PR.

A colleague and I are working an educational Python environment called Pytch, which aims to be a stepping stone between MIT's Scratch and Python. We've built on Skulpt, making heavy use of its suspensions to implement event-triggered concurrency.

If you haven't come across Scratch, it's an excellent system for learning programming, where you write programs by dragging/dropping blocks. There's a nice runtime with interactive sprites, built-in graphics, sounds, etc., so it's easy for novice programmers to concentrate on the interesting bits. Once they're skilled in Scratch, people often want to move on to Python. But then not only do they have to deal with a whole new way of expressing their programs (as text, getting all the syntax right), but they also have to deal with a whole new runtime, leaving behind their sprites, sounds, and so on. Pytch lets them stay in that world, keeping all the understanding they've built up about it, and concentrate just on the change from block-based to text-based programming.

I made an experimental branch of Pytch, and merged in @s-cork's branch. I had to make a handful of changes to Pytch: using new for builtins was mechanical; I'd made a few assumptions about Skulpt internals which needed updating; I changed to the new buildNativeClass() in some cases. Then Pytch's own suite of unit tests all passed, as did its browser-based UI tests.

More info:

  • Pytch — to see how we're using Skulpt, choose a tutorial and click the "Try the finished project!" button.
  • Pytch VM source — this is a fork of the Skulpt codebase, and adds logic for event-driven concurrency and other Scratch-like concepts. The handful of changes I needed to make post-merge of this PR are in the branch expt/try-s-cork-descriptors-rewrite.

Copy link
Contributor

@meredydd meredydd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1:1 with @s-cork, tested, and in production on Anvil for months now!

@meredydd
Copy link
Contributor

meredydd commented Feb 16, 2021

Hi all,

I think it's time to get this sucker merged! I have gone over it with a fairly fine tooth-comb with @s-cork, and all my feedback is already incorporated in the current state of the PR.

We have been running it in production at Anvil for most of a year now (which means we've shaken out all the bugs our users could find, and some of them ride Skulpt quite hard). It's getting traction elsewhere, too: At least @acbart's, @bennorth's and @alexwenbj's projects are using it in production.

Waiting for it to land is also blocking up Skulpt development - either PRs are stacking on top of it (#1196, #1240, #1241) or others are nervous to contribute to something that's about to be refactored.

It's a big win for python compatibility, and the performance gain is simply eye-popping (@rixner should be happy ;) ). It translates beyond benchmarks, too - since we shipped it, we got a lot of "Is it just me, or has everything got noticeably faster?" on the Anvil forums.

@bnmnetp I think we should just hit the button, and as soon as possible.

@LeszekSwirski
Copy link
Contributor

FWIW I've looked at the getattr and call internals (mostly with an eye on performance gotchas more than correctness) and those looked good to me.

@bnmnetp
Copy link
Contributor

bnmnetp commented Feb 16, 2021

Trying this out on my Fork. Things looked very promising until I get to document.

This simple program:

import document

t = document.getElementById('text1')
print('value = ', t.value)

Which has always worked fine now fails with TypeError: could not convert object of type '' to str on line 3

    mod.getElementById = new Sk.builtin.func(function (id) {
        var result = document.getElementById(id.v);
        if (result) {
            return Sk.misceval.callsimArray(mod.Element, [result]);
        }
        return Sk.builtin.none.none$;
    });

What needs to be updated in the definition to get it to work again? In debugging it fails in the Sk.miscval.callSimArray line..

@bnmnetp
Copy link
Contributor

bnmnetp commented Feb 16, 2021

Ok, so I've tracked this down a bit further. The error is coming from a line:

new Sk.builtin.str(self.checked)

where self.checked evaluates to false

@bnmnetp
Copy link
Contributor

bnmnetp commented Feb 16, 2021

Nevermind -- It turns out that I should have been making a new bool not a new string. In my defense this particular line of code is at least 8 years old. But oddly has been working many thousands of times a day until I test merged this PR. 😄

@bnmnetp
Copy link
Contributor

bnmnetp commented Feb 16, 2021

And to be fair new Sk.builtin.str(Sk.builtin.bool(false)) works just like it should.

Merging...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants