Implement class declarations #32

Open
masak opened this issue Oct 10, 2015 · 51 comments

@masak
Owner

masak commented Oct 10, 2015

Proposed syntax:

class Q::Statement::My {
    my ident;
    my assignment = None;

    sub Str {
        if assignment == None {
            return "My" ~ children(ident);
        }
        return "My" ~ children(ident, assignment);
    }

    sub run(runtime) {
        if assignment == None {
            return;
        }
        assignment.eval(runtime);
    }
}

(Here's the Perl 6 class for comparison:)

role Q::Statement::My does Q::Statement {
    has $.ident;
    has $.assignment;
    method new($ident, $assignment = Empty) { self.bless(:$ident, :$assignment) }
    method Str { "My" ~ children($.ident, |$.assignment) }

    method run($runtime) {
        return
            unless $.assignment;
        $.assignment.eval($runtime);
    }
}

A couple of things of note:

  • Much like Python, we re-use lexically declared variables and subs to create attributes and methods.
  • Unlike Python, there's no __init__ method. You're not expected to need one.
  • Variables with an assignment on them are optional. Variables that are not assigned are mandatory.
  • Perhaps most crucially, there is no instance context. There's no self, there's no this. The variables declared with my in the class block happen to be lexically available in the methods, but that's all.
  • As a consequence, there's no private state. You can't even emulate it by having the methods close over a lexical defined outside of the class. (Or you could, but it would be "static", not per instance.)
  • There's no class inheritance.
  • Just as with arrays and object literals, everything's immutable from construction-time onwards. If you want to mutate something, you call setProp (or butWith or whatever we end up calling it).
  • Anything other than my or sub inside the class block is conservatively disallowed.
    • Any "ordinary statements" are disallowed because they don't further the function of a class declaration. Use a BEGIN block instead. On similar grounds, BEGIN blocks are out.
    • constant declarations might be OK, though. They'd be a kind of once-only variables, valid throughout the class. If we wanted to be fancy, we could even allow MyClass.someConstant to resolve. But I doubt we'd want to be that fancy.
    • I nearly wanted to allow other class declarations inside the class body. Felt it could help to establish namespaces. (And then we'd probably switch from Q::Statement::My to Q.Statement.My.) Eventually decided against it because it feels like it conflates two very different things: the closed declaration of a class and its attributes/methods, and the open-ended declaration of namespaces.
    • Finally, macros would have been really really cool to allow as methods. Unfortunately, macros run long before the late-bound method dispatch happens. Even if we had types and knew what class the macro was run on, we still wouldn't know what instance it was called on.

The syntax for creating an object mirrors object literals quite a lot.

my declaration = Q::Statement::My {
    ident: Q::Identifier {
        name: "foo"
    },
    assignment: None
};

(In this case, we could've skipped passing assignment, as it's already optional.)

We could have used the new keyword in the construction syntax, but I don't feel that it adds anything.

As a nice retrofit, writing Object { ... } also works, and means the same as an ordinary object literal. Unlike user-defined classes, the Object type doesn't check for required or superfluous properties.
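
To make the construction rules above concrete (no default means mandatory, a default means optional, superfluous properties are rejected, and everything is immutable after construction), here is a rough Python analogy. It is only a sketch: `StatementMy` and `but_with` are stand-in names, and a frozen dataclass is an approximation, not how 007 would implement this.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Any, Optional


@dataclass(frozen=True)
class StatementMy:
    ident: str                        # no default: mandatory
    assignment: Optional[Any] = None  # has a default: optional


def but_with(obj, **changes):
    """Return a modified copy; the original object stays untouched."""
    return replace(obj, **changes)


decl = StatementMy(ident="foo")          # assignment may be skipped
updated = but_with(decl, assignment=42)  # "mutation" produces a new object
```

Leaving out `ident`, passing an unknown property, or assigning to `decl.ident` after construction all raise an error in this sketch.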

Tuple classes

After we get types, I'd very much like for us to get a "tuple class" syntax, where you don't name your attributes, but only specify their types as a tuple:

class Q::Statement::Sub (Q::Identifier, Q::Parameters, Q::Block) { ... }

And the corresponding constructor syntax would look like a function call:

my function = Q::Statement::Sub(
    Q::Identifier("fib"),
    Q::Parameters(Q::Identifier("n")),
    Q::Block(Q::Statements(...))
); 

Internally, the attributes of a tuple class could have 0-based indices instead of names, and so you could still index them with getProp et al. This type of class would really shine if we had ADTs and case matching, the issue which see.
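
A small Python sketch of the tuple-class idea, assuming positional fields that are checked against a declared tuple of types and indexed from 0. The `TupleClass` base and the simplified `Sub` signature are hypothetical, made up for illustration.

```python
class TupleClass(tuple):
    """Base for tuple-like classes: positional values checked against `types`."""
    types = ()

    def __new__(cls, *values):
        if len(values) != len(cls.types):
            raise TypeError(
                f"{cls.__name__} takes {len(cls.types)} values, got {len(values)}"
            )
        for index, (value, expected) in enumerate(zip(values, cls.types)):
            if not isinstance(value, expected):
                raise TypeError(f"{cls.__name__}[{index}] must be {expected.__name__}")
        return super().__new__(cls, values)


class Identifier(TupleClass):
    types = (str,)


class Sub(TupleClass):
    types = (Identifier, list)   # hypothetical: a name plus a list of parameters


fib = Sub(Identifier("fib"), [Identifier("n")])
name = fib[0][0]                 # attributes are reached by 0-based index
```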

Meta-object protocol

If you ask type(Q::Statement::My), you get back the answer Type. (Much like Python.) Here's a rough attempt to get Type to "close the loop", meta-object-wise. Using the conjectural/future type syntax.

class Type {
    my name: Str;
    my tuplish: Int;    # since we don't have Bool
    my properties: Array<PropSpec>;
}

class PropSpec {
    my name: Str | Int;
    my type: Type;
}
@masak
Owner Author

masak commented Oct 11, 2015

  • constant declarations might be OK, though. They'd be a kind of once-only variables, valid throughout the class. If we wanted to be fancy, we could even allow MyClass.someConstant to resolve. But I doubt we'd want to be that fancy.

Nah. We should avoid doing this, for two reasons:

  • It complicates the nice-and-simple role of the class as a constructor-ish callable.
  • It's hanging data off the wrong peg. Yes, constant declarations do run at BEGIN time, but everything else in that block properly belongs to the instance, and mixing class/instance initialization in the same block is part of the disease that is OO.

So, no constant. If you want this, you can still emulate it with something like this:

constant RingBearer::defaultRingCount = 74;

class RingBearer {
    my ringCount = RingBearer::defaultRingCount;

    # ...
}

@masak
Owner Author

masak commented Oct 11, 2015

  • As a consequence, there's no private state. You can't even emulate it by having the methods close over a lexical defined outside of the class. (Or you could, but it would be "static", not per instance.)

Thinking about this one more iteration, this is a feature. Notice in particular that you can still get per-instance closure semantics by providing your own methods through the constructor syntax.

This leads to a situation where the class author controls the per-class mutable state, and the class consumer controls the per-instance mutable state. And there's a bit of a hand-over and negotiation taking place during object construction, in which the consumer gets the last word. I like everything about that.

@masak
Owner Author

masak commented Oct 14, 2015

It's funny, we use the {} constructor syntax to create objects that are like records, and the () syntax to create tuple-like objects. It's quite clear what the [] constructor syntax would do if we employed it:

my args = Q::Arguments [
    Q::Literal::Int(42),
    Q::Infix::Addition(Q::Literal::Int(2), Q::Literal::Int(2)),
];

Array-like types! Currently, we have three types that would benefit from being constructed that way. In Perl 6 terms, it would be does Positional types.

The tricky part would be how to declare them. I've gone through a number of different possibilities, but no syntax feels perfect:

class Q::Arguments: Array<Q::Expr>;

This one is very clear, but it presupposes #33 and type annotations, which are not supposed to be core (while the class declaration syntax is). Then I thought maybe this form:

class Q::Arguments [Q::Expr];

This works, but then I felt it clashes in an unfortunate way with the proposed fixed array types syntax. (Though note the lack of : in the declaration; this isn't a type annotation.)

My current favorite is this:

class Q::Arguments {
    my q::arguments;
}

That is, no new syntax. If the class happens to have (a) exactly one attribute, (b) which happens to be the .lc of the class name — then it's of array type.

I know it's a little bit "magical". But I think it will have fairly few false positives. And I can't think of a solution with less egregious flaws.

Oh, and this one doesn't presuppose type annotations. If you do have type annotations loaded, there would also be a (c) condition: the annotated type of the attribute is Array<T> for some type T.

Basically, the values constructed would be Val::Array values. Except the container has a type name, not just Array. (Very similar to Object versus typed objects.) Also, the elements are typed whenever there's a type annotation on the attribute.

Naturally, all the functions/methods that work on ordinary arrays (elems reversed sorted join filter map) would work on these array types. And, most importantly, postfix:<[]>. This, in fact, would be the whole point of having these, that they behave like normal arrays.

(Note: Q::Statements currently doesn't qualify for arrayhood because it has two attributes. This is temporary and will be fixed.)
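
For a feel of what "an array with a type name" could look like, here is a Python approximation. The `Arguments` class and its `int` element type are placeholders (a real version would check for Q::Expr); the point is only that the wrapper keeps its own name while behaving like a normal array.

```python
from collections.abc import Sequence


class Arguments(Sequence):
    """A named, element-checked wrapper that otherwise behaves like a list."""
    element_type = int   # placeholder for Q::Expr in this sketch

    def __init__(self, elements):
        elements = list(elements)
        for element in elements:
            if not isinstance(element, self.element_type):
                raise TypeError(f"expected {self.element_type.__name__}")
        self._elements = elements

    def __getitem__(self, index):   # the postfix:<[]> equivalent
        return self._elements[index]

    def __len__(self):
        return len(self._elements)


args = Arguments([42, 4])
```

Because `Sequence` supplies iteration, `reversed`, and membership tests on top of `__getitem__` and `__len__`, the usual array helpers work without further code.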

@vendethiel
Collaborator

Can we use something like newtype Arguments = (Expr, Foo)?

@masak
Owner Author

masak commented Oct 15, 2015

Another issue not addressed yet: subtype relations.

I'm a bit hesitant with this one, because I think class inheritance tends to be abused much of the time to shunt methods around between classes and smear them out across a code base. Also, given that what we already have relies quite heavily on lexical scopes and implicitly created constructor forms, straightforward extends-type syntax doesn't seem like a very good fit.

If not for that, we could've simply done

class Q::Statement::My is Q::Statement {
    # ...
}

and been done with it.

Here's what I think we should try instead. A class S is a subtype of a class T (S <: T) iff the attributes of T are a prefix of the attributes of S. In practice, this means that the deriving class still needs to repeat the attributes from the base class. (And in case of a rename refactor, all those attributes need to be found and renamed together, or the link will be lost.)

...so maybe we should do both, as a kind of failsafe? 😋 Seriously, though, I think just relying on implicit subtyping as above would be fine, at least for the rather small Q type hierarchies we have.
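
The prefix rule is easy to state as code. This Python sketch treats a class as nothing more than its ordered list of attribute names; the attribute lists are hypothetical examples.

```python
def is_subtype(s_attrs, t_attrs):
    """S <: T iff T's attribute list is a prefix of S's attribute list."""
    return s_attrs[:len(t_attrs)] == t_attrs


point = ["x", "y"]                   # hypothetical base class attributes
colored_point = ["x", "y", "color"]  # repeats the base attributes, then adds one
renamed = ["x0", "y", "color"]       # a rename on one side only breaks the link

ok = is_subtype(colored_point, point)
broken = is_subtype(renamed, point)
```

The `renamed` case shows the refactoring hazard mentioned above: nothing complains, the subtype relation just quietly disappears.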

@masak
Owner Author

masak commented Oct 17, 2015

Array-like types!

As I'm using this in practice, it becomes clear that this is a general phenomenon where I just saw the specific case of the array until now.

my id = Q::Ident "x";
my lit = Q::Literal::Int 7;

Both of these feel very natural too. The common denominator is not a single array attribute, but a single attribute, period.

By the proposed "attr name mirrors class name" convention, the classes would be implemented like this:

class Q::Ident {
    my q::ident: Str;
}

class Q::Literal::Int {
    my q::literal::int: Int;
}

It strikes me that I would also be OK with the convention being that the sole attribute be called value, a bit like in Java annotations. (See the JLS, Example 9.6.1-3.)

@masak
Owner Author

masak commented Oct 17, 2015

The common denominator is not a single array attribute, but a single attribute, period.

And yes — in practice this gives us Ven++'s newtype functionality. Single-attribute types are simply wrappers for a value. Under the typed 007 extensions, they are type-checked wrappers for a value. In the cases when one wants several typed values, one uses tuple types.

@masak
Owner Author

masak commented Oct 17, 2015

A class S is a subtype of a class T (S <: T) iff the attributes of T are a prefix of the attributes of S.

Thinking about this a bit more: the one place where I see this working less well is for empty class declarations.

From the 007 sources:

role Q {
}

role Q::Literal does Q {
}

So every Q node type would inherit from Q (which is nice). But so would all non-Q types (which, ugh). Similarly, Q and non-Q types alike would inherit from Q::Literal.

So maybe just give up and write it explicitly as is Q::Literal, etc. 😞

I think the prefix thing is still worthy, though. Maybe the thing we should keep is that a class that declares itself a subtype must also declare all the attributes of its supertype. (In any order.) Under the typing regime, we'd also require that these attributes all be assignable to the corresponding attributes in the supertype. (That is, they can be the same type or narrower.)
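
Both halves of this comment can be sketched in Python: the empty-class pitfall of the implicit rule, and the weaker check that would remain once subtyping is declared explicitly. The function names and the `unrelated` class are made up for the example.

```python
def is_prefix_subtype(s_attrs, t_attrs):
    """Implicit rule: T's attributes are a prefix of S's attributes."""
    return s_attrs[:len(t_attrs)] == t_attrs


def check_declared_subtype(sub_attrs, super_attrs):
    """Explicit rule: a class declaring `is T` must redeclare all of T's
    attributes, in any order. Raises TypeError naming what is missing."""
    missing = [name for name in super_attrs if name not in sub_attrs]
    if missing:
        raise TypeError("missing supertype attributes: " + ", ".join(missing))


q_literal = []                 # an empty class, like the roles quoted above
unrelated = ["host", "port"]   # hypothetical non-Q class

# The pitfall: an empty attribute list is a prefix of every attribute list.
everything_matches = is_prefix_subtype(unrelated, q_literal)

# With an explicit declaration, order no longer matters, only completeness.
check_declared_subtype(["y", "x", "color"], ["x", "y"])
```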

@masak
Owner Author

masak commented Oct 17, 2015

Maybe the thing we should keep is that a class that declares itself a subtype must also declare all the attributes of its supertype.

And all the methods, presumably. I wonder how tempted we'd be to have some syntactical convenience for allowing methods to be "inherited" over from base classes.

Interestingly, this makes it very feasible to also have multiple inheritance. Under the typing regime, there'd even be some welcome conflict detection, in which the same method name with wildly different signatures will simply be un-implementable in the deriving class.

@masak
Owner Author

masak commented Oct 17, 2015

It strikes me that I would also be OK with the convention being that the sole attribute be called value, a bit like in Java annotations.

Not to mention that this would mesh quite well with #41. In fact, I think that settles it: value it is.
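
With the sole attribute named value, a single-attribute class is just a (possibly type-checked) wrapper. A minimal Python sketch, where the `Wrapper` base class is an assumption of the example rather than anything in 007:

```python
class Wrapper:
    """A class with exactly one attribute, called `value`."""
    value_type = object

    def __init__(self, value):
        if not isinstance(value, self.value_type):
            raise TypeError(f"{type(self).__name__} wraps {self.value_type.__name__}")
        self.value = value

    def __eq__(self, other):
        # Same wrapped value but a different wrapper type is a different thing.
        return type(self) is type(other) and self.value == other.value


class Ident(Wrapper):
    value_type = str


class LiteralInt(Wrapper):
    value_type = int


ident = Ident("x")
lit = LiteralInt(7)
```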

@masak
Owner Author

masak commented Oct 28, 2015

Rethinking the MOP and the names a little, I think this is what I want:

  • Type
    • T::Class
    • T::Tuple
    • T::Enum

I'm already going with the class and enum keywords, so maybe the middle one should use the tuple keyword for consistency?

@masak
Owner Author

masak commented Oct 28, 2015

There's a total collision between subcalls foo(1, 2, 3) and tuple creation syntax MyTuple (1, 2, 3).

(That space there is just convention. It's good for humans, but 007 ignores it.)

The verdict here is that if MyTuple isn't defined as a tuple by the time you parse MyTuple (1, 2, 3), then the parser assumes it's a sub. If that assumption turns out to be wrong, then the user gets an error and the program doesn't compile.

This is the best we can do (same as Perl 6, by the way). The alternative is to hedge the parser and go eeee maybesubcallmaybetuplecreation, and leave something weird in the Qtree that can collapse later... and we're not doing that. We have enough interesting problems as it is.
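
The rule amounts to a one-shot decision at parse time plus a later check. Here is a toy Python version; the node shapes, `CompileError`, and both function names are invented for the sketch and do not reflect the actual 007 parser.

```python
class CompileError(Exception):
    pass


def parse_call_like(name, known_tuples):
    """Classify `name(...)` using only what has been declared so far."""
    return ("tuple", name) if name in known_tuples else ("subcall", name)


def check(node, known_subs):
    """Later pass: a sub call must name an actual sub, or compilation fails."""
    kind, name = node
    if kind == "subcall" and name not in known_subs:
        raise CompileError(f"{name} is not a sub")
    return node


declared = {"MyTuple"}
early = parse_call_like("MyTuple", declared)    # tuple creation
guess = parse_call_like("LateTuple", declared)  # assumed to be a sub call
```

The parser never revisits `guess`; if `LateTuple` turns out to be a tuple declared further down, `check` rejects the program instead of reinterpreting the node.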

@masak
Owner Author

masak commented Jun 15, 2016

  • As a consequence, there's no private state. You can't even emulate it by having the methods close over a lexical defined outside of the class. (Or you could, but it would be "static", not per instance.)

Having tried out objects in 007 a bit, I think the above design decision would be a mistake. Private state is available through lexical closure in normal objects. Not having that feature through classes would be a great pity, and would make Plain Old Objects strictly more powerful (and more desirable) than classes. Below is the current implementation of the Runtime object in runtime.007 to illustrate how closure is used in the new method to hold private state.

my Runtime = {
    new(config) {
        my out = config.output;
        my pads = [];
        sub enter() { pads.push({}) }
        sub leave() { pads.pop() }

        sub declare_var(name) {
            pads[pads.elems() - 1][name] = None;
        }
        sub find(name) {
            for pads.reverse() -> pad {
                if pad.has(name) {
                    return pad;
                }
            }
            throw new Exception { message: "Cannot find variable " ~ name };
        }
        sub get_var(name) {
            my pad = find(name);
            return pad[name];
        }
        sub put_var(name, value) {
            my pad = find(name);
            pad[name] = value;
        }

        my eval_of_type = {
            # properties omitted for this example
        };
        sub eval(q) { return eval_of_type[type(q).name](q); }

        my run_of_type = {
            # properties omitted for this example
        };
        sub run(q) { run_of_type[type(q).name](q); }

        return { run, get_var, put_var };
    }
};

Here's what I would like to be able to write with the class syntax instead:

class Runtime {
    has out;

    my pads = [];
    my eval_of_type = {
        # properties omitted for this example
    };
    my run_of_type = {
        # properties omitted for this example
    };

    sub enter() { pads.push({}) }
    sub leave() { pads.pop() }

    sub declare_var(name) {
        pads[pads.elems() - 1][name] = None;
    }

    sub find(name) {
        for pads.reverse() -> pad {
            if pad.has(name) {
                return pad;
            }
        }
        throw new Exception { message: "Cannot find variable " ~ name };
    }

    sub eval(q) { return eval_of_type[type(q).name](q); }

    method run(q) { run_of_type[type(q).name](q); }

    method get_var(name) {
        my pad = find(name);
        return pad[name];
    }

    method put_var(name, value) {
        my pad = find(name);
        pad[name] = value;
    }
}

Both the object form and the class form clock in at exactly 40 lines. But the class form looks cleaner IMO.

Here are the significant differences from what has been proposed so far in this issue:

  • There are now has and method keywords. They are like my and sub respectively, but unlike those they declare visible properties/methods.
  • From a Perl 6 point of view has foo is much like has $.foo, and my foo is much like has $!foo.
  • Similarly, sub foo is much like Perl 6's method !foo, and method foo is like method foo.
  • In other words, everything in the class block has access to the instance context (but there's no this, you just access private and public stuff directly by name).
  • I'm open to having a constructor keyword with things that run after all the initializers. In this example I didn't need that, though.
  • In some sense, one could think of the class syntax being "compiled down to" or otherwise standing in for the object/closure syntax. Also in some sense, the class block now corresponds to the new() method block, in that in both cases, each instance has one, and that's where all the private state lives.
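
The "compiled down to closures" reading in the last bullet can be shown in a few lines of Python. This is a cut-down analogue of the Runtime example, not a translation: it exposes `declare_var` publicly (the 007 version keeps it private) so that the sketch can be exercised on its own.

```python
def new_runtime():
    pads = [{}]   # private state: reachable only through the closures below

    def find(name):   # private helper, like a `sub` in the class block
        for pad in reversed(pads):
            if name in pad:
                return pad
        raise KeyError(f"Cannot find variable {name}")

    def declare_var(name):
        pads[-1][name] = None

    def get_var(name):
        return find(name)[name]

    def put_var(name, value):
        find(name)[name] = value

    # Only the members meant to be visible are handed out, like `method`.
    return {"declare_var": declare_var, "get_var": get_var, "put_var": put_var}


first = new_runtime()
second = new_runtime()   # each call gets a fresh frame, so a fresh `pads`
first["declare_var"]("x")
first["put_var"]("x", 5)
```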

@vendethiel
Collaborator

-1 to changing my's behavior in class as compared to Perl 6, I'd argue either it's lexical or it's not. The other solution would be executable class bodies, scala-ish variant where everything in the class body is run in order

@masak
Owner Author

masak commented Jun 15, 2016

-1 to changing my's behavior in class as compared to Perl 6, I'd argue either it's lexical or it's not.

It is lexical. What's changed isn't the behavior of the my variables; those work the same as usual. The change is that the class block (the { ... } in class Runtime { ... }) creates a new Frame with each new instance, in analogy with how a sub block creates a new Frame each time it's invoked.

@masak
Owner Author

masak commented Jun 15, 2016

The other solution would be executable class bodies, scala-ish variant where everything in the class body is run in order

We kinda have this, except that we're conservatively restricting the allowable statements to my, has, sub and method. (And conjecturally constructor.)

@masak
Owner Author

masak commented Jun 15, 2016

I think it would be nice to start implementation of this feature pre-v1.0. I suggest we try doing it behind a feature flag (rather than a branch). I don't want v1.0 to block on classes, but it might benefit everyone to have actual classes to try out before v1.0.

@vendethiel
Collaborator

One interesting thing that's always worked in Elixir, and feels nice (though a bit surprising): class bodies are compile-time evaluated context.

defmodule MyModule do
  for name <- ["a", "b", "c"] do
    def unquote(name), do: unquote(name)
  end
end

(yes, def "a"(..) ... is valid syntax)
So basically, class bodies are executable

@masak
Owner Author

masak commented Jul 16, 2016

At first glance, I'm surprised to see an unquote outside of a quasi. But I guess that's an Elixir thing?

@vendethiel
Collaborator

Yes, that's how they do "executable class bodies". The for's body is quoted

@masak
Owner Author

masak commented Jul 19, 2016

I see.

Though I'm unsure what you're suggesting here for 007. Would you like the class block to be an implicit quasi block, so that unquotes work inside of it? Or would it be limited to for loops in the class block?

Is there somewhere I can read about Elixir's compile-time evaluation of class bodies?

@vendethiel
Collaborator

I'm just trying to create a discussion about how languages with macros without a MOP usually manage those things.
For Perl 6, it's still a totally open discussion as well, not sure how to intermix those two :)

@masak
Owner Author

masak commented Jul 20, 2016

Ah, I see. Yes, that discussion is very much worth having. Thank you for bringing it up.

I have no idea how the balance will come out between macros/Q and MOP... but in some ways I'm not so troubled by it, as long as they interoperate well enough.

@masak
Owner Author

masak commented Jul 21, 2016

Thinking about this, I realize what we do need a good story for is how to inline a bunch of stuff (attributes and methods) into a class using unquotes and macros.

Something like this:

class XYZ {
    has normal_attr;
    method normal_method() {
        # ...
    }
    macro_with_more_stuff();
}

Or like this:

quasi {
    class {{{Q::Identifier @ ident}}} {
        has normal_attr;
        method normal_method() {
            # ...
        }
        {{{Q::MemberList @ more_stuff}}}
    }
}

We should take a leaf from JSX here, which essentially allows an array of elements and text nodes to be injected in element position; the array melts away, and the elements and text nodes are injected in sequence without any intervening structure.
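
The JSX-style "melting away" is a one-level flatten of whatever sits in member position. A Python sketch, with class members represented as plain strings purely for illustration:

```python
def splice_members(members):
    """Flatten one level: a list in member position contributes its elements."""
    flat = []
    for member in members:
        if isinstance(member, list):
            flat.extend(member)   # the wrapping list "melts away"
        else:
            flat.append(member)
    return flat


more_stuff = ["has extra_attr", "method extra_method"]   # hypothetical macro output
body = splice_members(["has normal_attr", "method normal_method", more_stuff])
```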

@masak
Owner Author

masak commented Jul 28, 2017

-1 to changing my's behavior in class as compared to Perl 6, I'd argue either it's lexical or it's not.

Belatedly I'm coming around to this. It's a lot more reasonable for my in a class block to mean "per-class lexical variable, kinda like a static variable in other languages" than for it to mean "private member".

I hope to throw up a complete class syntax/semantics proposal soonish.

@masak
Owner Author

masak commented Sep 6, 2017

(...where "soonish" apparently means "more than one month later"...)

class Runtime {
    property out;

    private property pads = [];
    private property eval_of_type = {
        # properties omitted for this example
    };
    private property run_of_type = {
        # properties omitted for this example
    };

    private method enter() { pads.push({}) }
    private method leave() { pads.pop() }

    private method declare_var(name) {
        pads[pads.elems() - 1][name] = None;
    }

    private method find(name) {
        for pads.reverse() -> pad {
            if pad.has(name) {
                return pad;
            }
        }
        throw new Exception { message: "Cannot find variable " ~ name };
    }

    method eval(q) { return eval_of_type[type(q).name](q); }

    method run(q) { run_of_type[type(q).name](q); }

    method get_var(name) {
        my pad = find(name);
        return pad[name];
    }

    method put_var(name, value) {
        my pad = find(name);
        pad[name] = value;
    }
}

The four possible member declarations, in other words, are:

  • property and private property
  • method and private method

Private properties and methods I imagine will be handled with Symbols as described in #207. In other words, they'll be inaccessible from outside view. Both the declaration and every use in the class is desugared behind the scenes to deal in symbols. You access a private property or method as foo, not as self.foo. This cuts down immensely on the need to write self.
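
As an illustration of the symbol-keyed desugaring (my own Python approximation, not the design in #207), a private member can be stored under a key that only the class body's scope ever holds:

```python
class Symbol:
    """A key that is equal only to itself, regardless of its description."""

    def __init__(self, description):
        self.description = description


def make_stack():
    pads = Symbol("pads")   # private: this key never leaves the scope
    obj = {pads: []}

    def push(item):
        obj[pads].append(item)   # inside, the member is reached via the symbol
        return len(obj[pads])

    obj["push"] = push           # public: an ordinary string key
    return obj


stack = make_stack()
```

Neither the string "pads" nor a freshly made `Symbol("pads")` finds the private slot. One caveat of the analogy: a Python dict still lets you enumerate its keys, so this only models lookup by name, not true inaccessibility.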

I like this syntax. The only objection I have myself is that private property and the like are a bit long to type. I tried to do it with sigils, but nothing really felt right.

@masak
Owner Author

masak commented Sep 6, 2017

The above proposal also completely avoids changing the behavior of my, which ought to make @vendethiel happy.

@masak
Owner Author

masak commented Sep 7, 2017

I came up with a variant that's virtually keyword-free:

class Runtime {
    out;

    !pads = [];
    !eval_of_type = {
        # properties omitted for this example
    };
    !run_of_type = {
        # properties omitted for this example
    };

    !enter() { pads.push({}) }
    !leave() { pads.pop() }

    !declare_var(name) {
        pads[pads.elems() - 1][name] = None;
    }

    !find(name) {
        for pads.reverse() -> pad {
            if pad.has(name) {
                return pad;
            }
        }
        throw new Exception { message: "Cannot find variable " ~ name };
    }

    eval(q) { return eval_of_type[type(q).name](q); }

    run(q) { run_of_type[type(q).name](q); }

    get_var(name) {
        my pad = find(name);
        return pad[name];
    }

    put_var(name, value) {
        my pad = find(name);
        pad[name] = value;
    }
}

I think I like that better. It's strangely consistent both with the 007 object syntax (except semicolons separate members instead of commas separating properties) and with Perl 6's private-slot twigil (except in 007 it's a sigil, kinda).

@masak
Owner Author

masak commented Sep 7, 2017

The method constructor(x, y, z) is what new MyClass(x, y, z) calls under the hood. I guess Object should have a constructor() by default. The constructor is inherited, just like other methods. (Are they in ES6? Yes, seems so:)

$ node
> class A { constructor(x) { console.log(x) } }
[Function: A]
> class B extends A {}
[Function: B]
> new B("OH HAI")
OH HAI
B {}

I feel C# got it right with the new MyClass(x, y, z) { prop1 = 1, prop2 = 2 } combination of constructor and initializer block. It's kind of the best of both worlds: you can do special things in the constructor if you want, but you can still pass in boring properties in any order in the initializer block too. Let's try and steal that for 007.
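
Roughly what the constructor-plus-initializer combination buys, rendered in Python. The `new` helper and the `Point` class are hypothetical; the sketch only shows the two phases running one after the other.

```python
class Point:
    def __init__(self, x, y):   # the constructor: positional, may do special things
        self.x = x
        self.y = y
        self.label = None       # a plain property with a boring default


def new(cls, *args, **props):
    """Run the constructor, then apply the initializer block on top."""
    obj = cls(*args)
    for name, value in props.items():
        if not hasattr(obj, name):
            raise AttributeError(f"{cls.__name__} has no property {name!r}")
        setattr(obj, name, value)
    return obj


p = new(Point, 1, 2, label="home")
```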

@masak
Owner Author

masak commented Sep 7, 2017

...oh, but does that syntax mean that we again run into trouble with if statements and the like? It does, doesn't it? 😞 Need to think about that.

@masak
Owner Author

masak commented Sep 7, 2017

self is available inside public and private methods, including of course constructor. Due to self just being an invisibly declared lexical variable, it's also available inside nested subs in methods.

However, if we allow sub declarations in classes (which we can now do if we want), then self is not available. Probably we should have some custom error message in that case.

We can try to make self available in field defaults too, but that's not a priority.

a = 1;
b = 2;
sum = self.a + self.b;    # like this

Maybe allowing that will have unintended consequences, such as users declaring all their methods as sub expressions or arrow functions? Ah, gotta trust the users on this one.

It should be a redeclaration error to try to declare a parameter self in a method (private or public), even though the declaration of self is not visible. Nested subs in the method are allowed to shadow self, though.

@masak
Owner Author

masak commented Oct 8, 2017

As soon as we allow both property initializers and constructors, the question naturally arises which order they should run in. I'd say property initializers first, then the constructor, without hesitation.

Furthermore, initializers in base classes should also run before the constructor, ordered from base class to derived class. (In other words, in declaration order.)

The above follows C#'s design. Java does it a bit differently: it inlines the initializers into each class's constructor. This ultimately can lead to surprises during initialization, especially when a base class's constructor calls a method overridden by a derived class, which in turn expects the derived class's initializers to have run. They have in C# but not in Java.
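
The difference between the two orderings shows up in a small simulation. This Python sketch models classes as dicts and implements the two schemes as described in this comment; it makes no claim about the exact rules of C# or Java, and all names are made up.

```python
def describe(obj):
    # Stands in for a method overridden by the derived class; it relies on
    # a field that the derived class's initializer sets.
    return obj.get("greeting")


base = {
    "initializers": lambda: {"log": []},
    "constructor": lambda obj: obj["log"].append(describe(obj)),
}
derived = {
    "initializers": lambda: {"greeting": "hello"},
    "constructor": lambda obj: None,
}


def init_all_first(chain):
    """Proposed order: all initializers (base to derived), then constructors."""
    obj = {}
    for cls in chain:
        obj.update(cls["initializers"]())
    for cls in chain:
        cls["constructor"](obj)
    return obj


def init_inlined(chain):
    """Inlined order: each class's initializers run just before its constructor."""
    obj = {}
    for cls in chain:
        obj.update(cls["initializers"]())
        cls["constructor"](obj)
    return obj


proposed = init_all_first([base, derived])["log"]   # base constructor sees "hello"
inlined = init_inlined([base, derived])["log"]      # base constructor sees None
```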

@masak
Owner Author

masak commented Mar 25, 2018

I realize I need to think more about what exact MOP instructions the various bits of the class syntax (properties, constructors, methods, private/public) desugar to.

@masak
Owner Author

masak commented Apr 18, 2018

...and inheritance.

After we have those things in place, we can remove the "needs more design" label from this issue.

To be honest, I think such a design would help with the issues identified in #266 also. In particular, it would help nail down how enums and modules (our two "special" meta-types right now) would work.

@masak
Owner Author

masak commented Jun 18, 2018

Not to add to all the self-bikeshedding on syntax (except really yes)...

I was enjoying The Left Hand of Equals when on page 8 this syntax showed up:

class point(x0, y0) → Point {
    method x { x0 }
    method y { y0 }
    method leftEquals(other : Object) {
        match (other)
            case { p : Point → (x.leftEquals(p.x)) && (y.leftEquals(p.y)) }
            case { _ → false }
    }
}

(The Point { ... } syntax does something with anonymously subclassing an interface, like in Java. That's not really the important part.)

I was struck by how neatly class definition is made into a type of function/routine definition. Specifically, the constructor is represented as parameters on the class body itself. I'm surprised I haven't seen this in a language somewhere. Maybe it exists but I'm unaware of it?

Anyway, I'm wondering if we should steal it for 007.

(Note that Python uses parentheses after the class name too, but for inheritance, as in the is keyword. It's one of the few things I find "too cute" about Python. Sure, it's a parameter to the class, I guess. And in Python it makes somewhat more sense, because everything is evaluated at runtime, including class definitions.)

Here's how I would write the above as 007 code, maybe:

class Point(@has x, @has y) {
    method leftEquals(other) { # not really interested in this method but here goes
        return other ~~ Point
            && x.leftEquals(other.x)
            && y.leftEquals(other.y);
    }
}

Here, the @has annotation (using #257 syntax) automatically binds the x and y parameters to their respective fields, so that (a) we don't have to declare them with has x; has y; and (b) we don't have to initialize them.

I'm thinking any additional constructor logic one might want can just be written at the top of the class body, maybe?

Note that this pretty squarely gives the class one constructor... but 007 already had it that way, so that's OK.

Does this play well with inheritance? Well, kinda:

class ColoredPoint(x, y, @has c) is Point(x, y) {
    # ...
}

Here, the is inheritance declaration and the super() call are baked into one, sort of. Again, is this enough for the general case? Again... maybe?

In this case, x and y don't need the @has annotation, because Point takes care of that. It'd work to have it, but it's redundant.

This comment is just a bikesh^Wsuggestion. I'm torn, but kinda curious. I find the syntax suggestive, not least because it would rhyme well with #307 and (even moreso) #308. (In that class declaration and class construction would look alike, just as function declarations and function calls do.)

To top it off, this syntax (and @has too) would fit well into the #306 story.

@masak
Owner Author

masak commented Jun 18, 2018

I think the strongest argument against a class Point(x, y) { .. } syntax is that 007 would end up innovating (aka "grinding an ideological axe"), which (a) 007 has not set out to do, and (b) has backfired before (see #147, for example).

I think the closest precedent here is our tiebreaker language, JavaScript, which indeed puts parameters on its "classes" in the old-skool constructor syntax. But that's only barely comparable, and in the new ES6+ class syntax, JavaScript explicitly does not go that route, even though it could. (It would be interesting to know if TC39 discussed a syntax like that.)

Maybe simply the fact that it's an unusual syntax, whereas 007 tries to be as syntactically "uninteresting" as possible, is an argument against.

@masak
Owner Author

masak commented Jun 24, 2018

Furthermore, initializers in base classes should also run before the constructor, ordered from base class to derived class. (In other words, in declaration order.)

Coming back to this issue — ow, my head, all the ideas — I think this one means that initializers can't just be sugar for initializations in the constructor; they need to be their own thing.

This complicates things a little bit. But I suspect that extra complication might still be better than Java's surprising behavior.
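
A minimal Python sketch of what "their own thing" could mean, assuming a made-up convention where each class lists its initializers in an `initializers` attribute and a hypothetical `construct` helper runs them all, base class first, before any constructor code. None of these names come from 007.

```python
class Base:
    # Assumed convention for this sketch: (field name, thunk) pairs,
    # kept separate from the constructor.
    initializers = [("log", lambda: [])]

    def __init__(self):
        # Calls a method that a derived class may override.
        self.describe()

    def describe(self):
        self.log.append("base")


class Derived(Base):
    initializers = [("tag", lambda: "derived")]

    def describe(self):
        # Safe even when called from Base.__init__: tag is already set.
        self.log.append(self.tag)


def construct(cls, *args):
    obj = object.__new__(cls)
    # Walk from the most basic class to the most derived one...
    for klass in reversed(cls.__mro__):
        # ...running only the initializers declared directly on that class.
        for name, thunk in vars(klass).get("initializers", []):
            setattr(obj, name, thunk())
    # Only now does constructor logic run.
    obj.__init__(*args)
    return obj


d = construct(Derived)
```

In Java, the derived class's field initializers run only after the base constructor has finished, so the overridden `describe` would see an unset `tag` at this point. Running every initializer first avoids that.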

@masak
Owner Author

masak commented Jun 25, 2018

I think we need to start somewhere. The very smallest bit of syntax/semantics I can see that would be useful is just public properties and public methods. This in itself would help improve runtime.007.

I'm still very much on the fence about whether to prefer the long syntax or the short syntax. For now, let's go with the short one. But having any of them would be useful and open up new things to try in 007.

@masak
Owner Author

masak commented Jul 4, 2018

I was struck by how neatly class definition is made into a type of function/routine definition. Specifically, the constructor is represented as parameters on the class body itself. I'm surprised I haven't seen this in a language somewhere. Maybe it exists but I'm unaware of it?

Holy gosh! Simula does it this way! 😱

@masak
Owner Author

masak commented Jul 4, 2018

This changes everything!

(😛)

@masak
Owner Author

masak commented Jul 12, 2018

In this talk, Hejlsberg laments a little bit that JavaScript didn't get a syntax where the parallel between a class and a function was made more explicit and the class took in its constructor parameters as a parameter list after the class name.

@masak
Owner Author

masak commented Jul 31, 2018

Beyond all the syntactic self-bikeshedding I've inflicted on myself, I'm plagued by a number of orthogonal Tough Tradeoffs on the semantic level:

  • Early vs Late. Preferably types in general and classes in particular should be so static that they could be erased at runtime -- even if we don't actually do that in practice. But then classes cannot have closure semantics, right?

  • Bound methods. This is one I feel I have figured out: like in Python, methods defined in a class resolve to a bound method when accessed on an instance, so that detaching a method from its object and calling it as a function works. (I feel this one is pretty non-negotiable. Making it work is basically a kind of eta rewriting. It's highly surprising that this doesn't work in JavaScript.)

  • Should methods in object literals bind like that? Should they allow self? I don't know. Willing to drop this one if it becomes in any way problematic.

  • Type vs static vs undefined instance. I find at various different times I want the value MyClass to do three distinct things: (a) specify instances of that class: fields and methods, (b) be a "static level" where class-wide fields and methods could reside, (c) be an "empty instance" containing default field values and unbound methods. Pretty sure it's (a) I want, but that doesn't mean the other two use cases don't make themselves felt sometimes.

  • Fields and methods distinct? I mean, they're definitely in the same (property access) namespace already, like in Python. The bound method/descriptor thing makes it possible for them to be the exact same thing, but... where/how does the method get looked up? Presumably methods are not assignable; what guarantees that?
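
For reference, this is the Python behavior the bound-methods point above refers to: accessing a method on an instance already yields a bound method, so it can be detached and called like a plain function. The `Counter` class is just an illustration.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


c = Counter()
inc = c.increment  # detaching: attribute access already produces a bound method
first = inc()      # calling it as a plain function still updates c
second = inc()
```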

@masak
Owner Author

masak commented Jul 31, 2018

Re some of the above questions, I found it useful to think of it in the very concrete example of the following method lookup:

my a = [1, 2];
a.push(3);

a is an Array, of course. The a.push goes to look up a method (possibly having looked for a field first), and whatever happens inside the conceptual black box in between, at the end of that process the Array type's push method gets called with a as the invocant and 3 as the single argument.

Where does it find (the functional value representing) the push method? Well, the "right peg" in this case is somewhere in a method lookup table inside the Array type object. Nominally, evaluating a.push on its own needs to have already triggered the descriptor semantics, so that a.push is actually more like Array.findMethod("push").bind(a). (This intermediate form can of course be optimized away in theory whenever we're making the call right away.)

There needs to at least be a distinction between fields and methods in the sense that a field gets found immediately on the instance, whereas a method goes through the above dance. Still, the interface looks like a property access in both cases.
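
The lookup described above can be sketched in Python roughly as follows. Everything here (`TypeObject`, `Instance`, `find_method`, `get_property`, `array_push`) is a hypothetical name made up for the sketch, not part of 007's implementation, and `functools.partial` stands in for `.bind(a)`.

```python
import functools


class TypeObject:
    """Hypothetical type object holding a method lookup table."""

    def __init__(self, name, methods):
        self.name = name
        self.methods = methods

    def find_method(self, name):
        return self.methods[name]


class Instance:
    def __init__(self, type_object, fields):
        self.type_object = type_object
        self.fields = fields


def get_property(obj, name):
    # A field is found immediately on the instance...
    if name in obj.fields:
        return obj.fields[name]
    # ...whereas a method is looked up on the type and bound to the invocant.
    method = obj.type_object.find_method(name)
    return functools.partial(method, obj)


def array_push(invocant, value):
    invocant.fields["elements"].append(value)


Array = TypeObject("Array", {"push": array_push})
a = Instance(Array, {"elements": [1, 2]})

push = get_property(a, "push")  # like Array.findMethod("push").bind(a)
push(3)
```

From the outside, `get_property` looks the same for fields and methods; the difference is only in where the value is found and whether it gets bound.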

@masak
Owner Author

masak commented Jul 31, 2018

Yet another dimension to this: is it possible to make methods behave indistinguishably from func/arrow function props?

@masak
Owner Author

masak commented Nov 4, 2018

I think the strongest argument against a class Point(x, y) { .. } syntax is that 007 would end up innovating (aka "grinding an ideological axe"), which (a) 007 has not set out to do, and (b) has backfired before (see #147, for example).

The solution to this conundrum is this: make the safe syntax the default. Have the exotic syntax available through a module.

There needs to be a way to displace/supersede grammar rules -- probably something with exactly the same category and declarative prefix as an existing rule needs to have an is replacing trait (possibly which even identifies the name of the old rule), otherwise the declaration is considered a compile-time conflict.

I'm thinking it'd also make sense to conservatively forbid an already displacing rule to be displaced by another rule... In the future when we understand the dynamics better, we might relax that.
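
A small Python sketch of those two rules, assuming a made-up `Grammar` registry keyed on category and declarative prefix. The names are hypothetical, and so is the choice to reject a `replacing` declaration when there is nothing to replace.

```python
class RuleConflict(Exception):
    """Raised where the real thing would report a compile-time conflict."""


class Grammar:
    def __init__(self):
        # Maps (category, prefix) to (rule name, whether it displaced a rule).
        self.rules = {}

    def declare(self, category, prefix, name, replacing=False):
        key = (category, prefix)
        existing = self.rules.get(key)
        if existing is None:
            if replacing:
                # Assumption of this sketch: replacing nothing is an error.
                raise RuleConflict(f"{name}: nothing to replace for {key}")
            self.rules[key] = (name, False)
            return
        old_name, old_is_displacing = existing
        if not replacing:
            # Same category and prefix without the trait: conflict.
            raise RuleConflict(f"{name} clashes with {old_name}")
        if old_is_displacing:
            # Conservatively forbid displacing an already displacing rule.
            raise RuleConflict(f"{old_name} already displaced a rule")
        self.rules[key] = (name, True)


g = Grammar()
g.declare("statement", "class", "safe-class")
g.declare("statement", "class", "exotic-class", replacing=True)
```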

@vendethiel
Collaborator

The solution to this conundrum is this: make the safe syntax the default. Have the exotic syntax available through a module.

But Simula (and C#) do it that way, so it's not innovating ;-).

@masak
Owner Author

masak commented Nov 4, 2018

C#, really? Not in this brief introduction to C# classes. Could you link me to where C# does this?

Anyway, even with some of those languages having that syntax, it would be "exotic" for 007 to step away from the way Perl 6, Python, and JavaScript do it — which differs in the details but is otherwise fairly consistent.

@vendethiel
Collaborator

Well, I suppose what I said was misleading. The feature I'm talking about is called "records" but is only coming with C# 8.

@masak
Owner Author

masak commented Nov 4, 2018

And C#'s upcoming records don't appear to be the same, syntactically, as Simula and what I'd like for 007's "exotic" syntax — the former is syntactic sugar for a very predictable value class (well, a "record"), while the latter contains the constructor arguments up in the class declaration header while still having an explicit body (where the constructor parameters are in scope).
