-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compiling Grace to Classical Platforms: Black Equalities and Homer's Device. #134
Comments
I'm a bit puzzled as to what is new here. We have known for a long time that we need to turn the intensional definition of inheritability into an extensional one, to get consistency across implementations. That is, as long as we believe that consistency across implementations is desirable! Rice's theorem tells us that the general case is undecidable, so any extensional rules that we write down are doing to exclude some correct and useful programs; this seems to me to (now) be a reasonable tradeoff (although around 2012 I think that I argued otherwise). I'm also a bit puzzled by the so-called Black Equalities. Does the "Black" refer to me, or to the generic idea of something that is bad? What you call the Forward Black Equality is, as I read the Spec, part of the language definition: we define the class syntax as a shorthand for a certain construction of a method wrapping an object constructor. What you call the Reverse Black Equality does not, as you point out, exist at all, nor ever has, so demonstrating that it does not exist is hardly new. The class syntax is not, and and never has been, intended to be sufficient to express all ways of using an object constructor. It is more expressive than the old "dotted" class syntax, but still not fully general; we knew that when we chose it. As far as compiling Grace goers, the obvious thing to do is to translate class declarations into method declarations that contain an object constructor. Then one has to think about the best way to implement object constructors in one's target language. For JavaScript, minigrace still does what Michael made it do: generate a singleton object. This is actually a bad translation, for a couple of reasons. First, it rebuilds the static part of the object anew on every execution of the object constructor. Second, JavaScript runs faster if the methods are in the prototype. For Smalltalk, my intention is to generate a new class corresponding to every object constructor, and to compile the object constructor into a message send to this class. (Note that I have not yet implemented; progress on SmallGrace remains frustratingly slow). I think that essentially the same translation could be used for Javascript as well, and would probably result in a significant speedup — 5 to 10 times would not surprise me. For Java, one could, as @kjx has pointed out in several papers, use anonymous classes to represent object constructors, but these are compiled into ordinary classes, so one could also use those directly. In Smalltalk (and in Java), one also has to come up with a way of representing lexical scope; I plan to use blocks, but one could also use arrays. [I plan to continue this later, but and submitting it now to avoid losing my text when the browser closes.] |
Maybe it is time to concede that our ambitions for inheritable expressions are just too hard to specify. I notice that, for example, the collections library doesn’t seem to use inheritance directly — instead there is “use”ing of traits. I believe all uses of inheritance in the objectdraw dialect are from classes.
We can still use the “one-way” Black equality to go from class to method, but not the reverse. That leaves us to think of a class as a specially annotated method that tail-returns an explicit object expression. If that simplifies the description of the language and has no impact on the actual usability of the language then that seems like a big win. With our goal of a language for novices, there seems to be no real loss in this restriction.
I’m not sure what the concern is about implementing the language directly in another language's objects (rather than “rolling our own”, which i believe is what minigrace does). It can make the implementation more efficient, but at this point I’m not sure I care much if Grace runs on the JVM using the Java model of objects.
Some of the examples below are not actually correct, but I’m not sure that matters for the point your making. For example
method foo2(x) {
var v := 0 // lexically private variable
object {
method item {x + v}
method tick {v := v +1}
}
print "Blah”
}
is not inheritable, as it returns “done” rather than an object.
The key problem of the Black equalities is that both forms are supposed to be equal in all circumstances --- in particular, both must be inheritable i.e. inherited from via an inherit clause of an object constructor or class. This causes a problem when compiling or translating Grace programs to Newspeak or the JVM or whatever --- i.e. to platforms that have classical inheritance models --- because on those platforms, for something to be inheritable it must be explicitly declared as a class.
This is even the case in O'CAML, whose class model is otherwise quite close to Grace, but where only class declarations are inheritable --- i.e. O'CAML has only forwards Black equalities. There seem to be several problems here in practice:
how to encode "classes"
how to ensure "classes" can both be inherited from and can act as factory methods.
how to work out what Manifest Inheritability means
avoiding Homer's device. Or not.
I think we’re in good shape with 1, and minigrace handles 2. Point 3 is the tough one. I don’t really see the need to avoid “Homer’s device"
Manifest Inheritability
The current spec is unclear on what arrangements of methods-returning-objects are inheritable, and, famously, what a "manifest" expression is that can resolve to a class. The problem here is actually not a lack of clarity, but rather two intensional definitions:
Inheritable: "An inherit or use clause contains a Manifest Expression that creates a new object, such as a request on a class or trait. "
Manifest: "Grace must be able to determine the shape of the object that is being inherited on a module-by-module basis.... [must not be overridden] .. [must return a] fresh object whose shape is statically determinable"
For example, technically something like this could well be inheritable:
method evil {
if (random.boolean)
then { object { var x: = 3; def y = 2;} }
else { object { var x: = 3000; def y = 2000 } }
}
While technically this is a tail-call and we know the exact structure no matter which branch is taken, allowing this seems to serve very little purpose. If someone has a compelling example (i.e, beyond just esoteric weird code) of why we should do this, I’d be interested in seeing it.
while a family polymorphism --- which is actually statically for every concrete subclass --- is banned:
class collection is abstract {
class iterator is abstract {
method next { ... }
}
}
class array {
inherit collection
alias collectionIterator = iterator
class iterator {
inherit collectionIterator
}
}
In practice, this will mean that semantics will depend on particular implementations, or at least the platform model underlying those implementations.
Homer's Device
Here's another example that is apparently correct, manifest and inheritable:
// module top level
class one { ... }
method two { one }
method three { two }
method four { three }
class five {
inherit four
}
Each of these methods tail-return a fresh new object, but of course there is know way of telling short of full symbolic execution. Apparently all of these would have to be a translated both as a method and as a class..
Homer's Device offers one solution to this. Effectively the language could require every inheritable method could have to return a fresh object explicitly. The above example is possible, but only with the circumlocution of Homer's device: making everything inheritable a class (or with two-way Black equivalences, at least a method that explicitly returns a fresh object.
// module top level
class one { ... }
class two { inherit one }
class three { inherit two }
class four { inherit three }
class five {
inherit four
}
This seems fine to me.
Solutions
Extensional definitions
Wee could replace (or augment) the intensional definitions with extensional definitions saying precisely what is manifest, what is inheritable, etc. Then at least we could hope for some consistency across implementations.
Annotations
Annotations can (as always) come to the rescue: we could have a is inheritable annotation that hints to the compiler this method is really a class. Most likely, we would still need rules for those annotations, which would have to be the extensional definitions. Presumably, though, an annotation-specific variant of the Black Equality would apply: inheritance should be possible for any declaration that meets the (extensional) criteria, whether or not annotated. There's sense to that: otherwise, well, annotations just turn into a third potential syntax for class declarations.
My substitute annotation for “is inheritable” is simply “class”. I.e., we can only inherit from classes, which abbreviate methods with extra “class” translation for inheritance.
O'CAML option
Finally, there is always the O'CAML solution: permit inheritance only from syntactic class declarations. We keep forwards Black equalities but sacrifice the reverse; Homer's device would be required for forwarding methods. Class would behave like factory methods when involved, but only declared classes could be inherited (and declared traits used). We'd still need to resolve manifestness, with an explicit extensional definition, but I think things might get more straightforward with this approach.
Barring some compelling examples of why this isn’t expressive enough, I’d lean this way right now.
… Worst of both worlds
I guess we're I'm staggering towards is that Grace's current inheritance model gets us the worst of two worlds. Black Equalities, forwarding methods, and particularly the flexible inheritability definition seem to require a very dynamic solution in practice. On the other hand, the current manifest rules impose arbitrary restrictions that forbid useful idioms like family polymorphism while permitting random inheritance (or at least something very close to it).
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#134>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABuh-kAGT7aKNsBSc7d34SdxAb0Z-qahks5s7O_8gaJpZM4QudB->.
|
so it's certainly work checking that out - are there any important idioms that rely on the bidirectional equalities?
well you might not, but Moth can run about 1/2-1/4 as fast as Java. More to the point, if you want Grace to interoperate with any other (object-oriented) language's objects, you'll probably hit the same problems -- the obvious other case being JS ECMAScript 6. |
well yup. It's just that we've now hit that issue. again. which is still not resolved.
exclude correct programs, sure. Whether any useful programs are excluded is a different question.
well yes, I do recall that.
yep (see your comment about 2012). I don't think there's anything necessarily new here -- although perhaps the relationship between reverse equality in Grace, and compiling out to other class-based OOLs/platforms wasn't obvious to me before now.
Depends on the compilation target as to what is "obvious". I guess one point of this is if we're compiling (and want good interoperability) with object-oriented platforms (JVM, C#, arguably late JS, a Smalltalk without the NS bytecodes) then we will have this problem. The other option (and what my prototype did) was translate methods returning fresh objects as classes, and just fail on the rest.
I guess my rant is about the problems with this. Basically unless classes are lexically scoped within methods (and they're not in NS, and certainly not in ST) then things will be much harder than we expected. I think you'll probably need to sort all the scope yourself (although, perhaps in Smalltalk, you could get by e.g. abusing dynamically allocated pool variables :-). And then, you still have the problem you originally pointed out way back when that requests from an inherit/use context must resolve to a "class" while the same request in any other context must resolve to a "fresh instance of the class". |
I'm at a loss to understand what the problem is for which we are looking for a solution. Is it the difficulty of pinning down what is inheritable? If so, why don't we try to do this, before concluding that it is impossible? I completely disagree with your remark that the language definition should depend on the compilation target. Classes are defined by a translation to methods that contain object constructors, and that definition is independent of the compilation target. I don't see any reason to revisit that. What has changed? Here is an attempt at extensional definitions:
An expression is reusable if
A method is generative if:
This is probably not correct, but I don't think that it's so far off. Pleas tell me why I am wrong! This example qualifies
and I was actually a bit surprised that minigrace rejects it (because thought that I had made that legal, but then, I'm a lousy implementor) |
Here is a document that I prepared a while ago on how I plan to implement objects in Smalltalk. I've not talked about this because I didn't see any point until I had tried it out. I have hand-coded the specific example from this document, but don't yet have a code generator. The basic idea is to turn class parameters into instance variables, and lexical-scoped variable accesses into requests on a reader block and a writer block. Grace code:
Smalltalk re-writing:
This code depend on three classes, corresponding to three object constructors on lines 3, 7 and 11 Class GraceObject_Line03
Class GraceObject_Line07
Class GraceObject_Line11
This document does not address Grace's strange requirement that every object constructor and generative method be compiled into two versions: one for reuse and one for normal use. This is the same in JavaScript, Smalltalk, Java, Python or whatever. My JavaScript implementation makes clever use of lexical scope so that the initialize method is able to access the necessary variables; this will be tricker in Smalltalk. If you want to suggest language changes based on implementation difficulty, I think that getting rid of this feature would be a prime candidate! |
OK - but wouldn't it be simpler & quicker to just "box" the lexical variables - instead of keeping them locally, stick them into a extra object at one level of indirection and bind to that? I think that's what e.g. Scala & co do.
technically it's not requirement on compilation - I think one extra word in a frame should do it (either null, normal request, or the id of a the new object being created from a "reuse request".
I can think of three options:
|
so thinking about this overnight, I was about to make a big fool of myself & say that the whole lexical scope stuff you outlined above isn't necessary. On further reflection, that's wrong: it is necessary for "leaf classes" or object constructors at the bottom of the inheritance hierarchy --- but it isn't necessary for any inheritable classes. Going back to your rubric:
then the only classes one can inherit from are at the top level of a module. That's a pretty massive restriction, or, in other words, a pretty massive syntactic constraint on what can be inherited. Given that constraint, it seems the only lexical bindings visible in those top-level classes are
At point 1, there are only a fixed number of modules per program, right? Can't expect more than few thousand. They're not dynamically generated. So, in fact, those variables could be captured by e.g. Smalltalk pool variables (assuming Pharo still has pool variables). At point 2, the issue is mostly pseudo-private variables in the method scope surrounding the object constructor (like Given all this, it seems that permitting inheritance only from declared classes cannot be too onerous a restriction in practice --- I argue it is much less of restriction than permitting inheritance only from top-level classes! If we really, truly, want private variables, well perhaps we just permit them explicitly. This solution does solve the "feature" that every method must be "compiled twice" - only actual declared classes would need to be "compiled twice". You couldn't' forward pseudo class requests using normal methods, you'd have to use classes and Homer's device. |
(Now, given that argument, for my next trick I want to argue that we should remove the top-level restriction and go to something much more general - but keep the "inherit only from classes" restriction. This partially undercuts the argument above, but I think the resulting design is still less restrictive than the top-level-only-but-can-inherit-from-methods-returning-object-constructors-design. I'd like to think this should be a separate argument, but it's probably not) |
for what it's worth: I disagree with this too! But I think those dependencies are at least lurking behind lots of what we do. The operational semantics for unrestricted inherits clauses are clear - the request from an inherits clause must dynamically return a fresh object: if it doesn't that's a dynamic error. I surmise these can be implemented quite efficiently if we're "rolling our own" object model, and most likely on top of JS too. I bet the semantics for types and annotations as requests can be equally clear - either run all the requests for annotations as part of constructing the object, or even on each request or variable access. Any language level restrictions above the core semantics are assumptions about the implementation. |
I looked at this; it's what the Smalltalk compiler does to implement lexical scoping for blocks. It treats the lexical variables as array elements, and replaces reads and writes with array at: and array at:put:. It is a bit quicker — between 10% and 50%, depending on the proportion of object constructions to object accesses, and on the ratio of accesses from the outer scope to those from the inner scope. But it is certainly not simpler! I chose the block encoding (which Smalltalk will then implement using the Array encoding) because I thought that it would be easier for me to debug. I figured I could always switch once I had something working. Better still would be to generate the byte-codes directly, but with my limited programming abilities, and I thought that would be the hardest option to debug.
I'm not sure what you are saying here. The way that minigrace compiles object constructors and generative methods is to compile an object-build function and an object-initialize function for both. The build function has the same arguments as the class or factory method, plus extra arguments to encode the object being built onto, and the alias and exclude clauses. These two functions are then called in different orders to achieve the two different semantics of object construction (where the object being built onto is a new empty object, and the alias and exclude clauses are null) and of inheritance (where the object being built onto is the result of the build higher up the inheritance chain, and the alias and exclude clauses may not be null). The object initialize functions are also called at a different time in the two cases, as we have discussed in several papers. There is a small problem translating this to a language without lexical scope, because the initialize functions need to access the variables created by the build functions — in other words, they need to close over them. In JavaScript the initialize function for an object is returned as the result of the build function; in Smalltalk, I think that the build function will return a block. If sort of compilation scheme is what you mean by "one extra word in a frame" then I'm with you. If you have a radically simpler implementation in mind, then please explain it to me, because I would love to do something simpler! For example, I tried quite hard not to add the alias and exclude clauses as arguments, but it's not so simple to do without. For example, it does not work to let the build function add all the methods and then remove the excluded ones, because one of the excluded methods may have overridden an inherited method. |
I'm at a loss to understand why @kjx has started this crusade to inherit only from things that are declared syntactically to be clases. I've asked for an explanation a couple of times. I'm sure that there is a good reason, but I don't yet "get it". If this is going to become a serious proposal, then we need a story for how we are going to handle generative methods that don't meet the restrictions of the class syntax. The example that I give above is paradigmatic: the constructor declares some state. This ability is what makes traits work in Grace. You say: "The operational semantics for unrestricted inherits clauses are clear". That's not true. You say that they are an error if the method being requested is not generative (I think that's what you mean by "not fresh"), but why should that be the case? We can just build our object on whatever is returned by inherit — if we want to. We decided that we didn't want to because there was a consensus that not just static typing, but also program understandability, requires that we can figure out statically the shape of the object being generated. Has that changed? When I said, in my definition of generative, that the target of the request must be a module, I meant the ultimate target, not the target of every request in the chain. So if the inherit clause is
then Your claim that there is only a small and bounded collection of differently-shaped objects in any program is true, I think, under fairly weak assumptions. We would have to exclude only some fairly strange recursive inheritance situations which I'm not sure that I'm devious enough to come up with! |
that those semantics are not a requirement on compilation (or rather, that an implementation) has to build two versions of every method. I wrote a bit more on this last night (then deleted it by accident, then typed it in again) -- see #136 (which I kind-of assumed GitHub would tell you all about).
I think one could just compile one version with an extra extra argument - just as in compiling C++ to C you need an extra argument for a receiver, for Grace you need an extra extra argument which tells you if you're being called as a "normal request" or as an "inheritance request." I don't know if I'm right or not. See #136 |
I don't think I'm on a "crusade". I'm trying to explain Grace to Richard, who is building Moth. We're trying to find out how much of Newspeak we can retain. (or not). To do that, we have to try and work out what things in the spec mean - more than "whatever minigrace (or Kernan) does". I guess I'm also asking what we want to do - or what we want Grace to do - for better or worse, implementation leaks back into specification - manifest is only one such requirement (but it's a big one). Being able to compile easily to Java style classes gives one kind of language. If we're only ever going to dynamic things like JS, perhaps we don't need manifest and perhaps it just confuses matters. Thus #136 takes the opposite approach and tries to come up with a completely dynamic operational semantics that doesn't have manifest built in. |
Yeah - lexically private. I thought I talked about that in the long post that opened the issue: I think you can compile to Java-style private variables, or we could just add java-style private into Grace classes if that's want we want.
it makes stateful traits work in Grace. Stateless traits would work without it --- we obviously haven't talked enough recently because I thought you weren't keen on stateful traits. If we want stateful traits, we could remove the restrictions on traits so they can be directly stateful, which may be easier for some kinds of implementation than the fully general method-returns-object-constructor idiom. |
Ok so that wasn't clear to me from your definition - I thought you meant just the reverse - "top level only". If we wanted to make compilation very static, then "top level only" would have quite a few advantages. But it does restrict the language even further. |
fair enough. I figured I'd cheated just saying that without writing them down, and of course it took longer than I thought to write them and was more fiddly. I'm sure this can be better expressed, and I'm not sure I got them right anyway - thus #136.
We could, but that way leads to JavaScript - wouldn't that allow unlimited modifications to objects post-hoc? You could break encapsulation whenever you wanted by just using inherit.
I don't think it has changed - certainly not yet. I'm interested in exploring whether or not at least some of the restrictions could change or be removed. I'm very interested in trying to be precise about what we actually mean ---and for better or for worse, trying to be more precise seems to lead to questions at last about whether what we think we actually mean is what we think we should mean... I now see two separate restrictions here, and I see them as different.
A static type checker puts a lot of additional restrictions on programs, but we currently don't require all Grace programs to be statically checkable. I am interested in teasing out:
I guess I'm happy with implementations that e.g relax rules that are there to support static checking, or tied to particular static implementations, because a dialect checker can reimposed those rules. I'm rather less happy with implementations that change the fundamental language model. Unfortunately, this discussion shows that we have different notions of what rules are fundamental vs what rules are pragmatically useful, as well as what the rules mean (or are supposed to mean).
probably. I'm at least interested in trying to be precise about those assumptions, and also to ask: are those assumptions (or that result) a fundamental part of the Grace object model or an accident of particular implementation strategies. |
Turns out it's not the first time we've been around this particular roundabout - #73 |
another example of "kjx's useless, distracting, and pointless F**king around" as we put it in the discussion this morning. |
The Black Equalities
Grace is harder to translate into classical OO metamodels than one might expect.
One problem are the Black Equalities - the two way relationship - forward from a class declaration:
to a method returning an object constructor
and reverse in the opposite direction. The forward Black equality is injective but not surjective: every class declaration maps to a method retuning an object constructor --- but there are some inheritable arrangements of methods returning object constructors that do not have associated class declarations, e.g.:
that is, the reverse Black equality is a partial function. (Arguably, because the reverse equality is not total, the method-returning-object form is more fundamental than the class form).
The key problem of the Black equalities is that both forms are supposed to be equal in all circumstances --- in particular, both must be inheritable i.e. inherited from via an
inherit
clause of an object constructor or class. This causes a problem when compiling or translating Grace programs to Newspeak or the JVM or whatever --- i.e. to platforms that have classical inheritance models --- because on those platforms, for something to be inheritable it must be explicitly declared as a class.This is even the case in O'CAML, whose class model is otherwise quite close to Grace, but where only class declarations are inheritable --- i.e. O'CAML has only forwards Black equalities. There seem to be several problems here in practice:
Encoding classes
The short answer about how to encode class-like things is that they'll have to be encoded as platform classes on any platform that, well, requires classes to support inheritance. This isn't so much of a problem for Grace class declarations, but can I fear cause problems for methods-returning-objects that do not have a backwards Black equality, i.e. that cannot be translated to Grace class declarations. It may be possible to move the body of the method into e.g. a Java-style constructor in the underlying platform, but this is not immediately obvious to me that it works in every case. Presumably all code before the tail-return would have to be moved into the body of the constructor, and that code would prefix the translated body of the Grace class:
This also would mean that every potential class, i.e. every Grace method that could tail-return an object constructor, would have to be translated into a platform class.
Factories and Inheritance
A Grace "class" (modulo Black equalities) can be called in two contexts: as a factory method for inheritance. Both contexts must be preserved in the translation. The translation here pretty much has to be that a factory method is built to handle requests in the standard method call context, presumably by simply invoking the platform class.
For inheritance, an inherits clause must be translated so that it can find the underlying platform class to inherit from (assuming we have a platform like Newspeak or Smalltalk where inheritance is a reflexive operation from class meta-objects). This means that Grace classes declarations of whatever form must be translated so that they can be found by an inherits clause that wants to inherit from them e.g. by having the last expression mangled to invoke the underlying class...
mangled to equivalent of
to inherit from something like
Note that if classes are purely static as in C++ or Java (or rather: if superclasses must be purely static) then it's not clear how a translation could work at all.
Manifest Inheritability
The current spec is unclear on what arrangements of methods-returning-objects are inheritable, and, famously, what a "manifest" expression is that can resolve to a class. The problem here is actually not a lack of clarity, but rather two intensional definitions:
For example, technically something like this could well be inheritable:
while a family polymorphism --- which is actually statically for every concrete subclass --- is banned:
In practice, this will mean that semantics will depend on particular implementations, or at least the platform model underlying those implementations.
Homer's Device
Here's another example that is apparently correct, manifest and inheritable:
Each of these methods tail-return a fresh new object, but of course there is no way of telling, short of full symbolic execution. Apparently all of these would have to be a translated both as a method and as a class..
Homer's Device offers one solution to this. Effectively the language could require every inheritable method could have to return a fresh object explicitly. The above example is possible, but only with the circumlocution of Homer's device: making everything inheritable a class (or with two-way Black equivalences, at least a method that explicitly returns a fresh object).
Solutions
Extensional definitions
We could replace (or augment) the intensional definitions with extensional definitions saying precisely what is manifest, what is inheritable, etc. Then at least we could hope for some consistency across implementations.
Annotations
Annotations can (as always) come to the rescue: we could have a
is inheritable
annotation that hints to the compiler this method is really a class. Most likely, we would still need rules for those annotations, which would have to be the extensional definitions. Presumably, though, an annotation-specific variant of the Black Equality would apply: inheritance should be possible for any declaration that meets the (extensional) criteria, whether or not annotated. There's sense to that: otherwise, well, annotations just turn into a third potential syntax for class declarations.O'CAML option
Finally, there is always the O'CAML solution: permit inheritance only from syntactic class declarations. We keep forward Black equalities but sacrifice the reverse; Homer's device would be required for forwarding methods. Classes would behave like factory methods when invoked, but only declared classes could be inherited (and declared traits used). We'd still need to resolve manifestness, with an explicit extensional definition, but I think things might get more straightforward with this approach.
Worst of both worlds
I guess where I'm staggering towards is that Grace's current inheritance model gets us the worst of two worlds. Black Equalities, forwarding methods, and particularly the flexible inheritability definition seem to require a very dynamic solution in practice. On the other hand, the current manifest rules impose arbitrary restrictions that forbid useful idioms like family polymorphism while permitting random inheritance (or at least something very close to it).
The text was updated successfully, but these errors were encountered: