Clarify the scoping nature of fields #61

Open
leonerd opened this issue Jul 1, 2022 · 39 comments
Labels
Feedback (Cor Proposal Feedback)

Comments

@leonerd
Collaborator

leonerd commented Jul 1, 2022

The RFC as it stands does not give enough clarity on the scoping nature of field names. In particular, all of the examples simply use unique field ... names at the level of the class block itself, so they don't sufficiently explain various corner cases.

Let's for now entirely ignore the generated method names for :reader or :writer attributes, or the constructor behaviour of a :param, and think purely about the field as an internal storage mechanism.

I can imagine any of several different models.

1. my-like

Since we're already saying that fields are private within a given class, and that methods even in a directly derived subclass cannot see them, I wonder if they are lexically scoped within the block that declares them. I could see a model in which they scope the same as my variables. Compare the analogy of:

package Red {
  { my $x = 1; sub one { return $x; } }
  { my $x = 2; sub two { return $x; } }
}

class Blue {
  { field $f { 1 }  method one { return $f } }
  { field $f { 2 }  method two { return $f } }
}

This has a notable upside, in that if an individual method wanted to do some sort of per-instance memoization, it can hide a field variable in its own little block to act as a storage cache; in a similar way to the way regular subs can:

class Example {
  {
    field $cache;
    method calculate_thing {
      return $cache //= do { expensive code goes here };
    }
  }
  # other code and methods here can't see $cache
}

That variable is now hidden from all the other methods, and name clashes don't matter.

I like this model because it means that these fields (names beginning with $ or other sigils) scope the same way as lexical variables. Similar looking things behave similarly.
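For anyone who wants to try the analogy in today's Perl, here is a minimal sketch that fakes a block-scoped per-instance field with inside-out storage. It is only an emulation under stated assumptions: the %cache hash keyed by refaddr, the $computations counter and the 6 * 7 "expensive" value are all invented for the demonstration, and nothing here is the proposed field implementation.

```perl
use strict;
use warnings;

package Example {
    use Scalar::Util qw(refaddr);

    our $computations = 0;    # demo-only counter (hypothetical, not part of the proposal)

    sub new { my ($class) = @_; return bless {}, $class }

    {
        # Stand-in for a block-scoped `field $cache`: one slot per object,
        # keyed by the object's address, visible only inside this block.
        my %cache;

        sub calculate_thing {
            my ($self) = @_;
            return $cache{ refaddr($self) } //= do {
                $computations++;    # pretend this is the expensive part
                6 * 7;
            };
        }

        # Inside-out storage needs manual cleanup when an object goes away.
        sub DESTROY {
            my ($self) = @_;
            delete $cache{ refaddr($self) };
        }
    }

    # Code out here cannot name %cache at all.
}

my $first  = Example->new;
my $second = Example->new;
$first->calculate_thing for 1 .. 3;    # computed once, then served from the cache
$second->calculate_thing;              # separate object, separate slot
print "computations: $Example::computations\n";
```

The manual DESTROY is the kind of bookkeeping a real block-scoped field would presumably make unnecessary.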

2. method-like

Alternatively, I could imagine that the set of field names declared on a class is visible once throughout the class. By analogy to how package subs are callable even between physically-separate parts of the same package, I could imagine that fields are similarly visible:

package Red { sub func { ... } }
# now we're back in main package

package Red { func() } # calls the function defined earlier

class Blue { field $f; }

class Blue { method m { $f++ } }

In this arrangement, it doesn't matter what block-level scope a field is declared in. Once created for a given class it now exists, just once, in any code that's part of that class.

I don't like this model because it makes these fields scope the same as methods, even though they look superficially like lexicals. It's a subtle and complex model to try to explain to people.

3. Some other model?

There may be some other ideas, but so far I can't think of another good one.

@leonerd
Collaborator Author

leonerd commented Jul 1, 2022

Given the questions above, there are now a few ideas for test cases. In each case, what should the correct behaviour be? Should it be a compile-time failure? If not, and it does at least compile, what would the correct result be?

Duplicated fields in the same block

class Test1 {
   field $x;
   field $x;
}
ok(Test1->new);

Duplicated fields in their own block

class Test2 {
  { field $y; method one { $y++; return $y; } }
  { field $y; method two { $y++; return $y; } }
}
my $obj = Test2->new;
$obj->one;
is($obj->two, 1, 'Each method had its own private $y');

Class code split across two locations

class Test3 {
  field $z;
  ADJUST { $z = "value"; }
}

class Test3 {
  method z { return $z; }
}
is(Test3->new->z, "value", 'Field $z visible across scopes');

Given the competing concerns, they can't all pass - the model implied by test 2 passing means that test 3 must fail and vice versa.

So what do we prefer?

@leonerd
Collaborator Author

leonerd commented Jul 1, 2022

My personal feeling is that we should use model 1 (my-like). This would make Test2 pass as written. Test1 should probably compile OK but yield a warning, something about "duplicate declaration of field at same scope", similar to the warning you get from { my $x; my $x; }. This means Test3 would be a compile-time error, because $z is not visible in the second class block.
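As a point of reference for the Test1 case, this small sketch captures the warning that plain Perl already emits for { my $x; my $x; }. It only shows the existing behaviour for my variables; whether field should warn the same way is exactly what is being discussed. The string eval is just a device to compile the snippet while a warning handler is installed.

```perl
use strict;
use warnings;

my @warnings;
{
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # String eval so the redeclaration is compiled at run time,
    # while the handler above is active.
    eval q{
        use warnings;
        { my $x; my $x; }
        1;
    } or die $@;
}

print "compiled with ", scalar(@warnings), " warning(s)\n";
print $warnings[0] if @warnings;
```

Note that the snippet still compiles and runs; the duplicate declaration is reported but not rejected.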

@Ovid
Collaborator

Ovid commented Jul 1, 2022

I much prefer the my-like solution. I would probably write it like this:

class Example {
    method calculate_thing {
       field $cache;
       return $cache //= do { expensive code goes here };
    }
    # other code and methods here can't see $cache
}

But use an outer block if more than one method needs it:

class Example {
    {
        field $cache;
        method calculate_thing {
            return $cache //= do { expensive code goes here };
        }
        method needs_the_cache {
            my $foo = $cache->stuff;
        }
    }
    # other code and methods here can't see $cache
}

@Ovid
Collaborator

Ovid commented Jul 1, 2022

For duplicated fields in the same block:

class Test1 {
   field $x;
   field $x;
}

At minimum, that should be a "redefined" warning, but I think it might be better if it's fatal. Much safer that way because it's easy to overlook/swallow warnings.

For duplicated fields in their own block:

class Test2 {
  { field $y; method one { $y++; return $y; } }
  { field $y; method two { $y++; return $y; } }
}

With your proposal for the my-like syntax, I think the above is quite reasonable, though again, I might prefer this:

class Test2 {
    method one { field $y; $y++; return $y; }
    method two { field $y; $y++; return $y; }
}

I think that gives us a powerful new mechanism for managing state, and even allowing it to not be exposed to other parts of the class which don't need it. Encapsulated encapsulation.

That being said, what does this mean?

method one { field $y :param; }
method two { field $y :param; }

To my mind, any field not declared in the outer scope should not have :param, :reader or :writer attributes. In fact, maybe no attributes should be allowed (though the postfix default block should be, I think).

For class code split across two locations: definitely a compile-time error.

@Ovid
Collaborator

Ovid commented Jul 1, 2022

Also, this is the full list of field attributes suggested in the RFC: https://github.com/Ovid/Cor/blob/master/rfc/attributes.md

I realize that not all will be implemented.

Ovid added the Feedback (Cor Proposal Feedback) label Jul 1, 2022
@Ovid
Collaborator

Ovid commented Jul 1, 2022

I wrote this:


> For duplicated fields in the same block:
>
> class Test1 {
>    field $x;
>    field $x;
> }
>
> At minimum, that should be a "redefined" warning, but I think it might be better if it's fatal. Much safer that way because it's easy to overlook/swallow warnings.


Now that I think about it, it must be an error because a warning would be a disaster. For regular Perl, we have this:

my $x = ...
# lots of code here
my $x = ...

And that generates a warning, but it's usually harmless. In fact, you can usually just remove the my from the second declaration and you're good to go (though maybe you wanted a different variable name).

The reason this is harmless is because in procedural code, this is just a control flow issue. You've gone straight from the first to the second $x. In OO code, it's not procedural and we don't know the order in which the code is going to execute. Sure, we assume that the second field $x is going to overwrite the first, but let's look at this:

class Example {
    use Some::Other::Class;
    field $x :param :reader {42};

    method frobnicate ($val) {
        return $val < $x ? "frob" : "nicate";
    }

    # much later in the same class
    field $x { Some::Other::Class->new }

    method uh_oh ($thing) {
        if ( $x->handles($thing) ) {
            ...
        }
    }
}

What does that second $x do? It's clearly not semantically equivalent to the first. If it was scoped (e.g., my-like), we could wrap it in a block and make it work, but as it stands now, what does $instance->x return? Or do we still have a reader? Is it allowed in the constructor?

If we make this fatal instead of a warning and need to back it out later, no one will be relying on that behavior. However, if we make it a warning and not fatal, someone might rely on that behavior and if it turns out to be a serious design flaw, we might be stuck with it.

I recommend playing it safe and making that fatal.

@leonerd
Collaborator Author

leonerd commented Jul 1, 2022

@Ovid

> I much prefer the my-like solution. I would probably write it like this:
> {code with field inside method}

Yeah that makes sense. Once you can hide a field inside any scope, it feels natural to hide it even within a single method. It has parallels to state variables in that case.
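To make the parallel (and its limit) concrete, here is a small sketch of what a state variable inside a method does today. The Counter class and its bump method are made up for illustration. The state variable is shared by every object, whereas a field declared inside a method would presumably hold one value per instance.

```perl
use strict;
use warnings;
use feature 'state';

package Counter {
    sub new { my ($class) = @_; return bless {}, $class }

    sub bump {
        state $n = 0;    # one $n for the whole sub, shared by every object
        return ++$n;
    }
}

my $one = Counter->new;
my $two = Counter->new;
$one->bump;
my $seen = $two->bump;    # continues from the first object's count
print "second object sees: $seen\n";
```

So the hiding is similar, but the storage is per-sub rather than per-object.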

> To my mind, any field not declared in the outer scope should not have :param, :reader or :writer attributes. In fact, maybe no attributes should be allowed (though the postfix default block should be, I think).

Certainly I think those ones shouldn't be permitted - :param, or any of the others that create accessor methods. Some third-party attributes would be fine though - :isa for example. I think it would need a case-by-case decision on whether they can be permitted.

@HaraldJoerg
Contributor

Allowing several fields in a class to have the same name like this feels ... itchy. With Object::Pad an object can have several fields with the same name - iff these fields come from different parent classes or roles. This is an enhancement compared to Moo*, and it makes some sense. It is still possible to describe an object's state in words if you think of the field name as its "given name" and the class/role holding its definition as its "surname".

What Perl's traditional OO (including Moo*, but not inside-out) has given us for free is an incredibly easy and robust way to serialize objects, and I am probably not the only one who uses that a lot. Part of it is already gone with Object::Pad: Data::Dumper can not round-trip Object::Pad objects. Even if it could be taught to serialize (that can be viewed as "describing an object's state"), de-serialization is no longer that simple.

You can, of course, go Meta with Object::Pad for serialization, and I have some hope this will be available for Corinna.

But can you? How do you call $metafield = $metaclass->get_field( $name ) if $name isn't unique?

Anecdotal evidence: I had to go Meta for serialization to be able to convert between different versions of an Object::Pad class. Storable sort of works sometimes, but fails in interesting ways if the order of field declarations in the class is changed.
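For context, the "free" round trip that hash-based objects get looks roughly like the sketch below. The Box class name and its keys are invented for the example. It relies only on Data::Dumper and a string eval, and it works because the whole object state is an ordinary hash.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Terse  = 1;    # emit a bare expression instead of "$VAR1 = ..."
$Data::Dumper::Indent = 1;

# A classic hash-based object; no class definition is needed just to bless it.
my $box  = bless { name => 'box', edges => [ 1, 2, 3 ] }, 'Box';
my $text = Dumper($box);

# Evaluating the dumped text rebuilds an equivalent, separate object.
my $copy = eval $text or die "round trip failed: $@";

print ref($copy), " with ", scalar @{ $copy->{edges} }, " edges\n";
```

With opaque field storage there is no hash to dump, which is why a meta-level API would be needed instead.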

@Ovid
Collaborator

Ovid commented Jul 1, 2022

> Certainly I think those ones shouldn't be permitted - :param, or any of the others that create accessor methods. Some third-party attributes would be fine though - :isa for example. I think it would need a case-by-case decision on whether they can be permitted.

I think the general rule would be: if a field attribute exposes the field outside of the class, it's only allowed in the top-level class scope. Thus, :param, :name, :reader, :writer, :predicate, and :handles would be disallowed. :common would also be disallowed because that shares the field across classes. :isa would be fine, as would :lazy if there's any way that could be implemented.

@Ovid
Collaborator

Ovid commented Jul 1, 2022

> But can you? How do you call $metafield = $metaclass->get_field( $name ) if $name isn't unique?

That's a thorny question. At first blush, I would suggest that it only pulls top-level fields, but if you want to use the MOP to clone something, you're stuck unless we find a clean way to declare scope in the MOP and that's not good.

For now, I would suggest that duplicate names are forbidden until we hash this out. It's the simplest way to resolve the issue (though I can still see the scope issues).

class Blue { field $f; }

class Blue { method m { $f++ } }

But let's look at procedural code:

package Blue {
    my $f = 3;
    sub foo { say $f }
}
package Blue {
    sub bar { say $f }
}

That won't work and I'd suggest that, by analogy, it wouldn't work in Corinna, either. Lexical scope is lexical scope.
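The procedural case can be checked directly. This sketch compiles the two package blocks inside a string eval, purely so the compile error can be captured and printed instead of aborting the script; under strict, the second block cannot see $f.

```perl
use strict;
use warnings;

# Compile the example at run time so the failure can be inspected.
my $compiled = eval q{
    package Blue {
        my $f = 3;
        sub foo { return $f }
    }
    package Blue {
        sub bar { return $f }    # $f is not in scope here
    }
    1;
};
my $error = $@;

print $compiled ? "compiled\n" : "compile error: $error";
```

The same package name does not help: the my variable belongs to the first block, not to the package.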

But yeah, this is a problem:

class Example {
    field $x {2};
    {
        field $y {2};
        method foo {
            # can access $x and $y
        }
        method bar {
            # can access $x and $y
        }
    }
    method baz {
        # can access $x, but not $y
    }
}

Thus, in the MOP, we'd need some way of saying which methods can access those fields. Either "all" (the common case) or a list of methods, identified by signature (because eventually we're likely to get multisubs and (hope, hope, hope), return values can be included in the signatures).

@HaraldJoerg
Contributor

> For now, I would suggest that duplicate names are forbidden until we hash this out.

I agree (even if you drop "For now"). The benefits of using duplicate names are too small.

> Thus, in the MOP, we'd need some way of saying which methods can access those fields.

And then we might want lexical methods so that method names are no longer unique. I think the chances that this gets in the way are bigger than the chances that it actually helps to implement interesting things.

I also would forbid more than one class block with the same name. A package is just a namespace, so there's nothing Perl can do about it, but IMHO a class should be "sealed" when the declaring block is closed (unless Meta wizardry is applied). In that case there are no scoping issues with fields.

@Ovid
Collaborator

Ovid commented Jul 1, 2022

> I also would forbid more than one class block with the same name. A package is just a namespace, so there's nothing Perl can do about it ...

For now. In the future, it's entirely possible (even preferable) we can disentangle classes and packages. @leonerd Thoughts?

@haarg

haarg commented Jul 1, 2022

I think for now it's fine to make it an error to define a field with a given name multiple times per class. But we should make sure everything else in the implementation would allow it, so we don't paint ourselves into a corner. Things like anonymous classes or roles may be desirable in the future, which could easily make it impossible to have a unique 'name' for a given field.

@aquanight

aquanight commented Jul 2, 2022

Might I toss in a 3rd possible way to look at field declarations?

3. our-like

package Red {
    { our $x; sub one { return $x; } }
    { our $x; sub two { return $x; } }
    eval q{ sub three { return $x; } }; # Global symbol '$x' requires explicit package name ($x not in scope here)
    # these are the same $x (specifically $Red::x)
}

class Blue {
    { field $x; sub one { return $x; } }
    { field $x; sub two { return $x; } }
    # These are the same $x (the field '$x' within Blue objects) but $x is only visible within the block it appears
    eval q{ sub three { return $x; } };  # Global symbol '$x' requires explicit package name ($x not in scope here)
}
package Yellow;
our $x;

class Blue;
field $y; # Creates the field $y in Blue and links the lexical $y to it. Note that $y has file visibility.

class Other;
sub bark {
    say $x; # $Yellow::x
    eval q{ say $y; }; # '$y' is unavailable (in-scope lexical attempting to refer to a field of another class)
}

class Blue;
sub four { return $y; } # Legal again since we are back in the bounds of class Blue.

Where you have multiple 'fields' in the same class, they're really all the same field just available in different lexical scopes. If none of them have attributes, there's no problem. However, attributes in general should probably only appear on at most one definition but there's not any obvious reason it needs to be any specific definition other than to be able to distinguish "I meant for there to be no attributes" from "I just didn't put attributes here".
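The package half of this analogy runs in current Perl, so here is a minimal check of it. The set_x, get_x and three sub names are chosen for the demo. Both our declarations alias the same $Red::x, yet code outside the two blocks cannot use the short name under strict.

```perl
use strict;
use warnings;

package Red {
    { our $x; sub set_x { $x = $_[0] } }    # alias to $Red::x, scoped to this block
    { our $x; sub get_x { return $x } }     # a second alias to the same variable
}

Red::set_x(42);
my $value = Red::get_x();

# Outside both blocks the short name $x is not declared, so this fails to compile.
my $compiled = eval q{ package Red; sub three { return $x } 1 };
my $error    = $@;

print "get_x sees $value\n";
print "outside the blocks: ", ( $compiled ? "compiled" : "compile error" ), "\n";
```

An our-like field would behave the same way: one underlying slot, with visibility granted block by block.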

@HaraldJoerg
Contributor

Another thing to consider with regard to scoping: Currently Corinna has no way to make a field visible to sub-classes unless this field is at the same time made available to the world at large with a method (same for Object::Pad). This is no decent encapsulation, and in my opinion worse than having a field always visible to the whole code of the current class.

If this is ever to change, we need something that can be made visible to the sub-class, and therefore outside of the class where it has been declared. This makes it different from my variables and subroutines which can not be exported in Perl. It does not make it our-like either because this is related to the package's symbol table, available to everyone. Maybe it is method-like because it creates a :isa-visibility.

@HaraldJoerg
Contributor

@leonerd started this with:

> Let's for now entirely ignore the generated method names for :reader or :writer attributes

This reminds me of a discussion in #44 ("my $f vs. field $f :shared for class data") ... in particular an observation by @Ovid, who pointed out that using my (or state) runs into a dilemma:

  • You cannot use any standard slot attributes on them
  • Or you must teach every developer when they can and cannot use slot attributes on them

This also applies to a field with a scope smaller than the class.

@leonerd
Collaborator Author

leonerd commented Jul 3, 2022

Hrm; @HaraldJoerg makes a good point. If they do scope the same way as my lexicals, it somewhat cuts off the ability to ever do subclass-visible fields in future. Whereas keeping them in a different sort of structure makes that at least plausible.

@aquanight

I will say that nonlexical fields are likely incompatible with being able to split class definitions, or really any other situation in which variables can be "pushed" into scope from a distance (such as subclass-visible fields). Consider a variation of @leonerd's example of doing this:

class Test3 {
  field $z;
  ADJUST { $z = "value"; }
}

package AnotherPackage;
our $z = "Foo";

class Test3 {
  method z { return $z; }
}

# Exactly one of these can pass:
is(Test3->new->z, "value", 'Field $z visible across scopes');
is(Test3->new->z, "Foo", 'our $z remained visible in 2nd scope');

What should $z refer to here? With lexical fields, it must clearly refer to $AnotherPackage::z as directed by the lexical our statement. But if we insert nonlexical fields, we now have the issue that the 2nd class block wants to bring in an implicit $z as the field over the top of an existing lexical $z.

By requiring the 2nd block to restate field $z we eliminate the uncertainty (and could trigger a '"field" variable $z masks earlier declaration' warning if needed). (We could trigger the warning anyway, but now someone has to go find just where did this masking $z come from.)

My gut says subclass-visible fields are best done with our-like fields unless we want "spooky action at a distance" situations.
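The lexical reach of our that this argument depends on can be seen with ordinary packages. In this sketch Test3 is a plain package standing in for the class (an assumption, since the split class syntax is the open question), and the new and z subs are written by hand.

```perl
use strict;
use warnings;

package AnotherPackage;
our $z = "Foo";    # lexical alias to $AnotherPackage::z, visible to the end of the file

package Test3;
sub new { my ($class) = @_; return bless {}, $class }
sub z   { return $z }    # resolves to $AnotherPackage::z, not to anything in Test3

package main;
my $seen = Test3->new->z;
print "Test3->z returned: $seen\n";
```

This is the existing binding that an implicitly imported field $z would have to silently override.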

@leonerd
Collaborator Author

leonerd commented Jul 4, 2022

Thinking further about inheritable fields for subclassing, I think it can be done within the syntax, while still keeping a lexical-like model. The trick is to be explicit about which fields are being inherited, rather than just pulling them all in by default. That feels more in keeping with the "nothing-by-default" way these all work.

I've written some notes about it in the Object::Pad queue, where I'll have a go at it first. https://rt.cpan.org/Ticket/Display.html?id=143645

Anyway, because of that, I don't think we really need to consider too much how inheriting these will work at the moment, because the explicit keyword approach probably solves it better anyway.

@leonerd
Collaborator Author

leonerd commented Jul 4, 2022

Various thoughts above seem to be converging on the idea that we can have both toplevel fields at full class scope, and buried fields within smaller scopes or individual methods. I'm a little concerned we haven't been very clear on how this will work.

For example, it's fairly clear that class C1; field $x; is toplevel, whereas anything inside a method is buried, but what about some of these?

class C2 { field $x; }

class C3; { field $x; }

class C4; method m {}  { field $x; }

I think most folks would want the C2 example to be toplevel. Most folks would consider the C4 example to be buried. So what about C3? That single semicolon might make a lot of difference here. I wonder if it's a bit subtle and hard to explain.

@Ovid
Collaborator

Ovid commented Jul 4, 2022

> Thinking further about inheritable fields for subclassing, I think it can be done within the syntax, while still keeping a lexical-like model. The trick is to be explicit about which fields are being inherited, rather than just pulling them all in by default. That feels more in keeping with the "nothing-by-default" way these all work.

I'm still not convinced we need them, but they're effectively protected attributes, similar to what Java offers via the protected keyword. However, there's a fair amount of controversy as to whether or not these are a good idea.

But ... you can find plenty of people arguing that protected fields are better than private. Plenty of flame wars on that front.

When Java first came out, many developers simply declared their fields as public, because hey, we have types and that saves us, right?

You can imagine how well that worked out. So they learned to make their fields private and only provide readers/writers if needed (and eventually built tooling to automate that).

Making my fields directly available to a subclass feels like the same thing: a massive encapsulation violation and now I've tightly coupled the classes.

Or the subclasses can use the properly defined interface and never have to worry because you've silently changed this:

field $name :reader;

To this:

field $item_name :reader('name');

@Ovid
Collaborator

Ovid commented Jul 5, 2022

Update: Somehow I thought this was on the implementation discussion board, not a discussion issue. This makes part of what I've written in the following irrelevant.

I've been thinking about this a lot and I've reached some tentative conclusions. I really like a lot of the ideas being discussed here. That being said, @leonerd asked about these cases:

class C2 { field $x; }

class C3; { field $x; }

class C4; method m {}  { field $x; }

Now I'm confused. Corinna deliberately uses a postfix-block syntax for the initial pass to ensure we have no leakage of the class, field, role or method keywords beyond that scope. It also helps to ensure compatibility because, barring a few edge cases, the syntax is illegal in older versions of Perl.

The above appears to suggest we're now considering something file-scoped. As near as I can tell, this violates what the PSC asked for. When I spoke to them about including Corinna in the core, I was very disappointed that they asked for a much more limited breadth of work (I say "breadth" rather than "scope" to avoid confusion with "file-scoped" or "block-scoped"). However, I have to admit that the PSC was correct. The smaller the scope, the fewer bugs we're likely to have and the easier we can test the overall behavior. We've had enough bugs even with smaller, simpler features (pseudohashes, try/catch, etc) that with a change as large as Corinna, it's worth being more careful.

@leonerd also wrote:

> Various thoughts above seem to be converging on the idea that we can have both toplevel fields at full class scope, and buried fields within smaller scopes or individual methods. I'm a little concerned we haven't been very clear on how this will work.

I'm very concerned about this. We spent years trying to get a good, solid definition of Corinna and much of it follows the developer's motto: "we do these things not because they are easy, but because we thought they would be easy." There are still plenty of edge cases we haven't considered (introspection hasn't been fully-defined, for example).

I think some of these ideas considered here are powerful, but if we get the specification wrong, we not only risk introducing serious bugs, but could also risk the second-system effect.

I think our principle of parsimony is our friend here. In short, don't put in changes we can't back out of unless those changes are absolutely necessary for Corinna to function. If we later find out that we're being too restrictive, we can loosen up those changes. But if we loosen them up at first, we don't want people relying on what could be a mistake (or worse, discovering we could have done it better in a different way, but one that's not compatible with previous choices).

So to reiterate what I think would follow from this:

  • Postfix-block syntax for the first pass
  • No duplicate field variables in a class
  • Only allow field definitions at the top-level
  • Violations are fatal

Yes, it's frustrating, but it's safe and we can revisit those ideas later when we see how the system works as a whole.

@HaraldJoerg
Contributor

> Making my fields directly available to a subclass feels like the same thing: a massive encapsulation violation and now I've tightly coupled the classes.

That's why I like @leonerd's suggestion to have both parent and child class explicitly declare that.

In my opinion, inheritance should only be used if there is some coupling between the classes because there's always the chance that a change in the base class steps on the sub-classes' toes. You should be able to say :isa with a straight face.

Example: A Box :isa Object. Also, a Box { has @edges }. Now I want to extend like this: A Box :isa Polyhedron and a Polyhedron :isa Object. The Polyhedron { has @edges } now. If class Box wants to define where its particular edges are... then I must write class Polyhedron { has @edges :writer }. Which in turn makes the edges writable for everyone and an application may choose to re-shape the box at will.

If I can write something like Box :isa Polyhedron { has @edges :inherit }, then I can initialize or ADJUST the array like I could before. I do not consider this a massive encapsulation violation, but rather as getting my privacy back.

I do not request unconditional availability of fields to sub-classes. But right now the language forces me to widen the public interface, and that's something I'd like to see changed in the future.

@Ovid
Collaborator

Ovid commented Jul 5, 2022

@HaraldJoerg If we go this route, I would instead suggest something like this:

class Box :isa(Object) {
    # field attributes only permitted on scalars
    has $edges :writer(:trusted) :reader(:trusted);
}

class Polyhedron {
    method surface_area () {
        my $edges = $self->edges;
        ...
    }
}

In the above, by declaring $edges to have protected/trusted readers and writers, our subclass can see those methods, but other classes cannot, limiting our interface's size. Thus, rather than inventing something new, we can use well-known OO semantics already established in other languages such as C++ or Java.

@Ovid
Collaborator

Ovid commented Jul 5, 2022

Per conversation with @leonerd on IRC:

10:57 LeoNerd: To be clear: I'm not looking at adding these things now, I'm thinking about what we might like in future so as not to paint ourselves into a "we can't have that" corner later on

That makes perfect sense and I misunderstood that.

@leonerd
Collaborator Author

leonerd commented Jul 5, 2022

@Ovid

> If we go this route, I would instead suggest something like this:
> ...

That would be quite a significant departure from what we have now. Currently Perl classes (of any flavour - classical or Corinna-shaped) don't have the concept of methods that aren't a public free-for-all. We could consider what it would mean to have non-public methods, and then the idea of non-public accessors for fields might fall out of that; but that feels like a different path than also exploring the idea of optionally opening up field visibility to subclasses.

@leonerd
Collaborator Author

leonerd commented Jul 5, 2022

Other thoughts (copied from IRC) on the subject of inheritable fields:

If inheritable fields exist, they provide a way for subclasses to provide different initialisation expressions for fields:

class Cat {
  field $noise :inheritable { "meow" }
  method greet { say $noise }
}

class Lion :isa(Cat) {
  field $noise :inherit { "roar" }
}

Without inheritable fields, it's hard to imagine how to write that without making some sort of builder method. About the best I can come up with is

class Cat {
  field $noise { $self->_build_noise }
  method _build_noise { "meow" }
  ...
}

class Lion :isa(Cat) {
  method _build_noise { "roar" }
}

which feels a lot worse. It's syntactically noisier (pardon the phrasing), it exposes extra methods in the API (even if they're discouraged from public use by that leading underscore), and it's much slower to execute because now the constructor has to perform a dynamic method dispatch, whereas previously it did not.
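For comparison, the builder-method pattern can be run today with classic hash-based classes. This is a sketch of the same Cat/Lion shape, assuming a hand-written constructor and a noise hash key in place of the field. The dynamic dispatch in the constructor is the $self->_build_noise call.

```perl
use strict;
use warnings;

package Cat {
    sub new {
        my ($class) = @_;
        my $self = bless {}, $class;
        # Dynamic method dispatch at construction time: the subclass decides.
        $self->{noise} = $self->_build_noise;
        return $self;
    }
    sub _build_noise { return "meow" }
    sub greet        { my ($self) = @_; return $self->{noise} }
}

package Lion {
    use parent -norequire, 'Cat';    # Cat lives in this file, so nothing to load
    sub _build_noise { return "roar" }
}

my $cat_says  = Cat->new->greet;
my $lion_says = Lion->new->greet;
print "$cat_says\n$lion_says\n";
```

The extra _build_noise method is callable by anyone, which is the API leak described above.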

@Ovid
Collaborator

Ovid commented Jul 6, 2022

> That would be quite a significant departure from what we have now. Currently Perl classes (of any flavour - classical or Corinna-shaped) don't have the concept of methods that aren't a public free-for-all.

I don't understand that. Corinna defines :private and in current Perl I can do my $method = sub {...} and later call $method. So it sounds like I'm not understanding your meaning.

So the pros of making variables inheritable:

  • Compile-time checks
  • Performance boost
  • The symmetric :inheritable/:inherit makes this safer than just making it available (I like this idea)
  • A somewhat more concise syntax.

And the cons?

  • "Methods as API" are well-known and well-understood
  • How do we protect against things like field $im_an_invoice :inherit { $im_a_filehandle };?
  • Or a more subtle example: field $celsius :inherit { $fahrenheit };
  • Marking something as :trusted can have a corresponding :trust in the subclass for added safety
  • It's much easier to change the internals of a method than to change a variable
  • Worse: we're building even more scaffolding to support inheritance and that's a problem

One of the primary ideas of Corinna is to offer affordances for safe behaviors, but letting the developer go through just a touch of extra work if they want to do something bad (e.g., the builders in @leonerd 's last example).

Fundamentally, this is the reason why Alan Kay left inheritance out of the definition of OO. The idea of "subclasses as specialized parents" isn't enforceable. My field $im_an_invoice :inherit { $im_a_filehandle }; above shows how easy it is to get wrong, but field $feet :inherit { $meters }; makes this even harder. Both $feet and $meters can be floats and it can be very hard to detect this error and the Mars Climate Orbiter burns up on re-entry.

We know how troublesome inheritance is, to the point where some OO languages (Go, VB variants) don't even allow it. I don't advocate eliminating inheritance, but we shouldn't be encouraging it by making bad design decisions more convenient. Unless ...

What would make me change my mind and be more open to inheritance would be anything, ANYTHING, ANYTHING which would make it safer. We all know that we shouldn't $reach->{inside} and my clients pay me to fix people doing that again and again. Yet this mistake is not just allowed, but encouraged by some developers because "it's easy." In many ways, Perl developers are lazy in a bad way. Rather than spend any time thinking about a design, we think of clever hacks to work around system limitations.

Between safe and easy, give me safe. For small systems, this isn't as much of an issue, but systems grow and it becomes an issue. I work on big systems and issues like this are a train wreck. So what can we do to make it safer?

Types/type constraints would be a huge boon, but we don't have them. Some way of enforcing contracts? Nope. Most developers still don't understand the Liskov Substitution Principle, so dropping an instance of a subclass somewhere where we have an instance of the parent isn't safe.

So we have no safety at all in inheritance, but at least I can do this:

class ConstrainedPoint {
    field ($x, $y) :param :predicate :reader; # no defaults, so it's not *really* safe
    field $limit :param {10}; # oops, no types :(

    method set_x ($new_x) { if ( -$limit < $new_x < $limit ) { $x = $new_x } else { die ... } }
    method set_y ($new_y) { if ( -$limit < $new_y < $limit ) { $y = $new_y } else { die ... } }
}

class PointWithDefault :isa(ConstrainedPoint) {
    ADJUST () {   # use default
        $self->set_x(0) unless $self->has_x;
        $self->set_y(0) unless $self->has_y;
    }
}

In the above, I could have just used the :inheritable/:inherit syntax and it would have made it much easier:

class PointWithDefault :isa(ConstrainedPoint) {
    field ($x,$y) :inherit {0}
}

But now, any code in PointWithDefault can set $x and $y directly and ignore the constraints. This violates the entire point of inheritance: creating a specialization of the parent. With the hoops the set_x and set_y make you jump through, it's much safer. No, I don't like that syntax either, but it's what we have to work with now.

If you can provide a way that :inheritable/:inherit can be done safely by respecting the parent contract, I would be much more open to this idea. With the current limits of Perl, I don't see how this could be done.

@Ovid
Collaborator

Ovid commented Jul 6, 2022

I think another way of putting this: I would much prefer if inheritance in Perl could be more akin to a subtype. In a subtype, you know that it respects the parent's contract. In Perl, a subclass is just syntactic sugar for reusing some of the parent's behavior. But many developers don't realize that. In fact, we often assume that a subclass would be a subtype, but there's no guarantee of this in Perl, so our expectations don't match our reality.

But I also see no way we can get our expectations to match our reality because currently, Perl is so crippled by lack of types that building large systems is hard. However, Moo/se along with Type::Tiny has made this so much easier and safer, so to a certain extent, Corinna is a step backwards (albeit a temporary one). Since we can't have types or contract enforcement, don't give the person writing a class tools to make the issue worse.

@thoughtstream

For whatever it's worth, my own view on these topics is very clear.

30+ years of writing and teaching OO has convinced me beyond a shadow of a doubt that providing any kind of mechanism to access encapsulated data outside its own class...is a disaster. Even when such a mechanism is provided with only the very best intentions and in the expectation that it would only be used in cases of dire need and design extremity.

Whenever I teach OO design or implementation, I make it crystal clear that allowing any other class or external code to directly manipulate the internal state or implementation of an object (whether that's via protected inheritance [C++], explicit trust mechanisms [Raku], or just inadequate encapsulation semantics [Perl]) is a critical design flaw: either in the OO language itself, or at least in the (mis-)use of that language within a particular OO design.

Yes, it can be extremely convenient and efficient to access a base class's fields from within its derived classes. But in my experience of my own systems, and the systems of hundreds of students and clients, eventually such direct accessing leads to subtle bugs and irretrievable painted-into-a-nasty-corner situations.

I am so adamant about this, that I even argued against :writer, as this provides direct access to state (and not just to child classes, but to everyone).

In designing Corinna, we did everything we could to discourage inheritance, having learned from long and bitter personal experience that inheritance cannot successfully be made to serve the two incompatible masters of interface consistency and implementation sharing. Inheritance of implementation is a failed model, which is why we see the ever-increasing popularity of aggregation (role-based) mechanisms instead.

Hence, I strongly believe that we should not even consider the possibility of providing at some future date a mechanism to make fields accessible outside their classes.

(Having ranted sufficiently on my personal OO bête noire ;-) let me address the original question(s) posed in this discussion. I strongly believe that:

  • fields should be strictly lexically scoped
  • it should be a fatal error to redeclare the same field in the same lexical scope
  • only fields in a direct scope of the class (i.e. in the outermost block of the class, or in a nested basic block within the outermost block of the class) should be permitted to take interface attributes; in other words, it should be illegal to declare a field inside any block governed by a method keyword, or indeed any other keyword except class
  • later we might wish to allow other per-object variables to be declared inside methods, to act as their private per-object caches; these, however, should be declared with a keyword other than field (perhaps subfield or objstate or cache?)
  • classes should not be able to be split and defined across two or more class declarations (the second one should always be a fatal error)
  • the class Name; form of class declaration should not be allowed (or, if it must be allowed, it should at least be vociferously discouraged and vituperatively disparaged!)

Most of those beliefs follow self-evidently from my preceding rant, but I do need to clarify the very first point (and my response to the original question): why I think fields should be strictly lexical, rather than method-like or package scoped...

Good OO design is all about maximizing encapsulation and decoupling, by minimizing the number of code lines/scopes/files in which internal object state is accessible. Making fields lexical accomplishes that goal: the particular piece of state stored in a field variable is only accessible from the line it was declared to the end of the surrounding block. That's simple and predictable for users and can be efficiently implemented as well. It allows different items of object state (i.e. fields) to be segregated into separate sub-blocks, thereby minimizing intra-class coupling and reducing the chance of unintended enbugging within a single class. You simply can't accidentally mess up the state of a field if that field isn't even in the lexical scope of the current method.

If fields were to have method-like scope, then that greatly complicates the compiler's task of checking and resolving field names across separate scopes. And, more importantly, it greatly complicates the user's/tester's task of making the same checks and resolutions. Moreover, it immediately eliminates any possibility of restricting a given field to anything less than an entire class scope, which removes a powerful and safe tool for decoupling and encapsulating state within a class.

For these same reasons, it would be disadvantageous to implement fields with package-like scope.

Moreover, if fields are anything other than lexically scoped, and if we're going to allow split class declarations (but please don't!) then we're explicitly providing a mechanism to allow users to completely side-step encapsulation.
If you decide you'd like direct access to the $bar field of class Foo, then you just add an extra:

class Foo {
     # Mess about with $bar in here
}

If field $bar is lexically scoped, any attempt to access $bar in that new (partial) declaration would provoke an instant compiler error. On the other hand, if $bar is scoped like a method or a package variable, then it has to be freely accessible to this "burglar" class redeclaration as well. That's not good.

Whether it's this evil trick, or an official syntax for :inherited fields, doesn't really matter. Providing any kind of mechanism to directly access class or object data outside the class itself is always a terrible idea. Not just because it undermines encapsulation, encourages interclass coupling, and inevitably leads to hard-to-find bugs. But also because providing such a mechanism always initially seems like a terrific idea...which subsequently makes it feel like directly accessing class or object data outside the class is a good (or at least an acceptable) practice. Which it never is.

An OO language can only be as good as its worst feature, and public/protected data is by far the worst feature of most existing OO languages...including "classic" OO Perl.

So please let's not add "modern" OO Perl to the ranks of the damned. ;-)

@Ovid
Collaborator

Ovid commented Jul 6, 2022

Damian (@thoughtstream), thank you for that.

Side note: Corinna switched from double inheritance to single inheritance and now I'm beginning to regret any inheritance in its design. It's too buggy, doesn't do what people think it does (particularly in a language like Perl), and people abuse it for all sorts of uses. Sadly, I've had some large(ish) OO systems I've tried to evict inheritance from and I've found edge cases where it's really not feasible.

method foo($bar) {
    # some code here
    $self->next::method($bar);
    # more code here
}

Inheritance gives me very fine-grained control over the timing of when I can call a superclass method. Roles make it much harder.

That being said, I'm still sorely wishing I could have killed inheritance entirely, but I was pretty sure that Corinna would have been rejected had I gone that far.

@HaraldJoerg
Contributor

@Ovid:

Types/type constraints would be a huge boon, but we don't have them.

I am a bit tired of hearing this argument. In Perl, the bignum "types" and things like Type::Tiny make the border between types and objects a somewhat blurry one. They also nicely show the possibilities - and limits - of this approach. As shown by your $celsius example, types are no silver bullet for scoping considerations.

I think another way of putting this: I would much prefer if inheritance in Perl could be more akin to a subtype.

That makes it another overlap between types and objects.

In my understanding the wording :isa(Parent) makes this subtype intention a lot clearer than extends Parent; and all my Object::Pad code follows this pattern. I find it helpful for teaching OO, and for reading code, when the constructs can be understood with plain language. (I'd also love has for attributes but this has been ruled out because it reminds people too much of has in Moo*.) I can't help but quote "Programming Perl" here:

Languages were first invented by humans, for the benefit of humans. In the annals of computer science, this fact has occasionally been forgotten.

I also agree with the observation:

Fundamentally, this is the reason why Alan Kay left inheritance out of the definition of OO. The idea of "subclasses as specialized parents" isn't enforceable.

Indeed, Perl is not particularly good at enforcing things. Corinna can encourage the use of inheritance as subtypes by wording (:isa), by documentation, by guidance and by features, but it can not enforce it. But alas, this also goes back to Perl's roots. Here's a quote by Larry Wall:

Perl doesn't have an infatuation with enforced privacy. It would prefer that you stayed out of its living room because you weren't invited, not because it has a shotgun.

I am including those classic quotes because I think that Corinna is bound by the spell "Perl will stay Perl".

So what if we drop inheritance? Inheritance isn't for formal strictness, it is for convenience. I am aware that this convenience has been used and abused in various ways. I can write my example without inheritance:

class Polyhedron {
    has $object { Object->new }
    has $edges :param { [] } # was: has $edges :inheritable
}

class Box {     # was: class Box :isa(Polyhedron)
    has $polyhedron { Polyhedron->new( edges => [...] ) } # was: has $edges :inherit { ... }
}

That way, the $edges are nicely encapsulated. However, the "subtype" relation is no longer visible, and in the Box class I need to add delegations to the $polyhedron field because I can't apply its methods directly to Box objects. That's code I need to write, document, test, and bugfix. Replacing code by declarations is one of the things which made Moo* so attractive.
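The delegation cost described here can be made concrete. The following is a minimal sketch in plain Perl 5 (blessed hashes rather than Corinna syntax, so it runs without the experimental class feature). The methods edge_count and describe are hypothetical; they only stand in for whatever a real Polyhedron would offer.

```perl
use strict;
use warnings;

package Polyhedron {
    sub new {
        my ($class, %args) = @_;
        return bless { edges => $args{edges} // [] }, $class;
    }
    # Hypothetical public methods, invented for this illustration.
    sub edge_count { scalar @{ $_[0]{edges} } }
    sub describe   { 'polyhedron with ' . $_[0]->edge_count . ' edges' }
}

package Box {
    sub new {
        my ($class) = @_;
        # A box has 12 edges; the labels are placeholders.
        my $polyhedron = Polyhedron->new( edges => [ map { "e$_" } 1 .. 12 ] );
        return bless { polyhedron => $polyhedron }, $class;
    }
    # Every Polyhedron method that Box should offer needs a
    # hand-written forwarder like these two.
    sub edge_count { $_[0]{polyhedron}->edge_count }
    sub describe   { $_[0]{polyhedron}->describe }
}

my $box = Box->new;
print $box->describe, "\n";
```

Note that Box->new->isa('Polyhedron') is false in this sketch, which is the lost "subtype" visibility mentioned above, and each forwarder is extra code to document and test.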

I understand that it is very desirable to read code which has been written using a clean OO system. But I also expect that the motivation to write code for a too restricted system in the first place isn't overwhelming. Object inheritance has been available to Perl programmers for quite some time now. Not including it in Corinna because it can be abused or because Alan Kay didn't like it seems to target the wrong audience.

@thoughtstream

As regards @leonerd's Cat/Lion example...

I understand why $sound is being implemented as a heritable field (trading efficiency in the greet method for extra memory footprint in every object), but I have several problems with the approach. Besides the obvious one that we're paying for a marginal performance improvement in greet with an extra memory overhead in every single object. ;-)

My first issue is that the hierarchy itself is wrong. A lion isn't a cat. Not in the housecat-that-meows sense.
"Meow" isn't the universal cat noise that lions subsequently override with "roar".
In fact, there is no universal basic feline noise.

Which means our inheritance hierarchy should be more like this:

class Feline {
    method greet { say $self->sound }
    method sound;   # abstract
}

class HouseCat :isa(Feline) {
    method sound { 'meow' }
}

class Lion :isa(Feline) {
    method sound { 'roar' }
}

But this shows us that both greeting and sound are behaviours, not states.
Making a sound is not something housecats and lions are; it's something they each do differently.
Similarly, performing a greeting is not something felines are; it's something they all do in the same way.
Moreover, greet is not something that only felines do. Canines and corvids and cetaceans do it too.

So greeting is really a role they perform:

role Greeting {
    method greet { say $self->sound }
    method sound;   # abstract
}

class HouseCat :does(Greeting) {
    method sound { 'meow' }
}

class Lion :does(Greeting) {
    method sound { 'roar' }
}

class Raven :does(Greeting) {
    method sound { 'Nevermore.' }
}

As for the original space-vs-time efficiency tradeoff...that's only necessary because Corinna currently lacks parameterizable roles.

In Raku, for example, the same classes could be implemented without the performance hit of calling the sound method within greet and without the memory overhead of allocating a per-object $sound field. Like so:

role Greeting[$sound] {
    method greet { say $sound }
}

class HouseCat does Greeting['meow']       {}
class Lion     does Greeting['roar']       {}
class Raven    does Greeting['Nevermore.'] {}

If we're thinking about future compatibility,
I believe that's the direction we should be heading.

@perigrin

I recently started writing something sufficiently complex with the class feature in 5.37.9, and I ran smack into this problem.

Let's take Damian's most recent example[1]:

role Greeting {
    method greet { say $self->sound }
    method sound;   # abstract
}

class HouseCat :does(Greeting) {
    method sound { 'meow' }
}

class Lion :does(Greeting) {
    method sound { 'roar' }
}

class Raven :does(Greeting) {
    method sound { 'Nevermore.' }
}

Every class must have a sound method for internal state keeping purposes, not because it makes any sense to the class's API. We're simply repeating Moose's mistakes.

I agree with Damian that the correct answer is Polymorphic Traits, but then I've read SICP twice (most recently the JavaScript version) and I'm all aboard the substitution model party bus. But I'm not sure when we will have polymorphic traits, simply because we don't generally have polymorphic modules in Perl, so it's not a common concept for Perl programmers.

IMO we should have a solution to this that isn't polymorphic traits, and doesn't require polluting the public API for state maintenance that should be internal to the class. I think we do, and I'll get to that in a minute, but first let's start with Damian's … vehement … commentary on encapsulation.

Yes, it can be extremely convenient and efficient to access a base class's fields from 
within its derived classes. But in my experience of my own systems, and the systems of 
hundreds of students and clients, eventually such direct accessing leads to subtle bugs 
and irretrievable painted-into-a-nasty-corner situations.

I am so adamant about this, that I even argued against :writer, as this provides direct 
access to state (and not just to child classes, but to everyone).

I entirely agree with him, and it's because I agree with him that I really dislike the solution of having methods simply to transmit state for inheritance. But I disagree that this is violating the encapsulation of the parent class because Perl isn't Go.

In Go you define an object by declaring a struct and methods that act upon that struct:

type base struct {
	value string
}

func (b *base) say() {
	fmt.Println(b.value)
}

You can do "inheritance" in Go by embedding a struct; the child struct can be substituted for the parent and it will simply pass all calls along to just that embedded struct.

type child struct {
	base  //embedding
	style string
}

func main() {
	base := base{value: "somevalue"}
	child := &child{
		base:  base,
		style: "somestyle",
	}
	child.say()
}

In this system the border between what is the child and what is the parent is very obvious, and the encapsulation between them makes some sense. They're fundamentally different data structures.

Perl also isn't Java. Java is a bit more muddled but still has a distinct line between parent and child classes.

class Vehicle {
  private String brand = "Ford";

  public void honk() {
    System.out.println(this.brand + " says Honk!");
  }
}

class Car extends Vehicle {
  private String model = "Mustang";

  public void honk() { 
      System.out.println(this.brand + " " + this.model + " says Honk!");
  }
  
  public static void main(String[] args) {
    Car myFastCar = new Car();
    myFastCar.honk(); // never reached: compilation fails in Car.honk() because brand has private access in Vehicle
  }
}

The public, private, protected access system sets up the boundary for us, and the opaque instance layout reinforces that distinction.[2] This is reinforced in Java by not inheriting constructors, and by the fact that a child class can hide the parent class variables. The language is designed around the idea that child and parent classes are separate entities, each with their own first class attributes.

Perl doesn't work that way, nor is it just an implementation detail … we chose to have it work this way. Primarily because until now we chose not to have attributes as first class citizens, and now because we're choosing not to have constructors.

use 5.39.7;
use Data::Dumper;
use experimental 'class';

class Base { field $value :param = "Ford"; method honk() { say $self->value } }
class Child :isa(Base) { field $style :param; }

my $child = Child->new(value => "somevalue", style => "somestyle"); # doesn't blow up because we referred to `value`

Child doesn't contain an embedded instance of Base, nor did we ever suggest it would via our syntax. An instance of Child is a data structure that combines fields defined in both Base and Child, but it's a single data structure. In My Opinion[3] access to that data from inside Child isn't violating the encapsulation of Base because there is no Base as distinct data. I would argue that by not providing a mechanism to Child to access fields defined in Base we're saying that an instance of Child can't have access to its own state simply because that state definition was made outside the scope of the Child class definition.[4]

I think the time for enforcing a distinction between parent and child class data was in 1993 when we chose not to have first class object data to begin with.

I like Aquanight's suggestion about our as a model for attributes. We're creating a lexical alias to data that exists in a larger context than the scope of the lexical. It may be the Javascript I've been writing recently but to me instances are functionally more akin to globrefs than anything else currently in Perl.

package A { our $b; sub foo { $b } } 
package B { our @ISA = qw(A) } 
B->foo(); # do the method lookup so it gets cached in the B:: stash
say for keys *{B::}->%*; # prints ISA, foo, and can
say for keys *{A::}->%*; # prints b and foo

B doesn't just point to A … it pulls a copy of the foo coderef into its own stash. The fact we don't do this with package data is fundamentally why packages aren't classes IMO.

Returning to the problem at hand, I think the Cat example is doing us a disservice because it leads us to think that our imperfect metaphor is the problem (e.g. Not All Cats). I'm going to replace it with an example from some recent code of mine.

class Games::ECS::System {
    field $ecs;
    method ecs($new=undef) { $new ? $ecs = $new : $ecs }

    method components_required { [] }
    method update(@entities) { ... }
}

class HealthBarRendererTest :isa(Games::ECS::System) {
    method components_required { [qw(Position Health)] }
    # ... removed test-harness code
}

The components_required for a System are intrinsic to a System: not all systems require the same components, but it's an error if they don't require any.[5] The fact that I have public access to that state as well is an encapsulation decision that shouldn't be forced upon me because of the nature of our inheritance system. And yet, like with Moose, the current system forces that. If we treated the fields as a lexical alias though:

class Games::ECS::System {
    field $ecs;
    field @components_required = qw();

    method ecs($new=undef) { $new ? $ecs = $new : $ecs }
    method components_required { @components_required } 

    method update(@entities) { ... }
}

class HealthBarRenderer :isa(Games::ECS::System) {
    field @components_required :inherit = qw(Position Health);
    # ... removed test-harness code
}

Now I can choose my public API and which pieces of state I wish to encapsulate independently of each other.

I remember being distinctly frustrated when I wrote the ECS::System code because the first idioms I reached for didn't work, and then I had to implement what was, in my mind, a kludge. I remembered having these discussions and saying "let them eat cake" when it wasn't my head on the block. However, I think the lexical alias to instance storage makes the most sense to me … and if we want to make a more solid distinction between parent / child / role data … we need to discuss that at a more fundamental level, because that is a much bigger change to Perl.

Footnotes

  1. Funnily enough this exact problem came up in some "real world" code I was working on.

  2. The result I got for java object dumper is a thing to behold

  3. It's not humble at this point, though I don't have Damian's academic experience in either direction here

  4. I phrased this carefully because we'll have the same issues with stateful traits. I remember having these arguments on the Moose mailing list with Ovid.

  5. I think this just made the argument for class data for me. I've always hated the idea of class data, but here we are … sigh

@Ovid
Collaborator

Ovid commented Mar 26, 2023

I think we have an X/Y problem, but it's not your fault. There's a fundamental issue in Perl that keeps cropping up, we've never fixed it, so we just ignored it.

class Base { field $value :param = "Ford"; method honk() { say $self->value } }
class Child :isa(Base) { field $style :param; }

my $child = Child->new(value => "somevalue", style => "somestyle"); # doesn't blow up because we referred to `value`

Child doesn't contain an embedded instance of Base, nor did we ever suggest it would via our syntax. An instance of Child is a data structure that combines fields defined in both Base and Child, but it's a single data structure.

Sorry for a very tiny nitpick, but I think it's important here. The Child class does not combine those fields. (Even if it currently does, that's an implementation detail we should be free to change.) It does not have access to the fields of the base class because all field $value does is declare a lexical instance variable. It's the attributes after it which expose it. Also, your code is a touch incorrect. method honk() { say $self->value } will blow up because you didn't provide a reader. It should be method honk() { say $value }.

The unavoidable coupling between the classes occurs in the constructor (perhaps this is a design mistake).

The avoidable coupling occurs when developers choose to expose fields via :reader, or :writer, or by wrapping them in methods that let the outside world inspect/modify them.

I would argue that by not providing a mechanism to Child to access fields defined in Base we're saying that an instance of Child can't have access to its own state simply because that state definition was made outside the scope of the Child class definition.

This is where this breaks down for me. Let's say we do that, and let's say I decide to port jsxl to Perl. I want to know that I can provide a public API and have a private implementation. If various developers choose to subclass my class, I can't know all of them, nor can I know what they're doing. This is where my client code constantly breaks because different teams are (ab)using inheritance all the time and a change to the base class breaks children.

My hypothetical JSXL class works great, people like the functionality, and my public API is the contract I have given to people. Barring something significant, it should not change. But people complain that the module is slow and some kind soul provides me with an XS implementation. So now I have JSXL::PP, JSXL::XS, and my JSXL module loads the XS if it's there, or falls back to the pure Perl if it's not. Awesome performance speedup!
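The loading scheme described here is a common Perl idiom, and a minimal sketch follows. Everything in it is an assumption made for illustration: JSXL::XS is not expected to be installed, JSXL::PP is defined inline so the file is self-contained, and the transform method is a placeholder rather than anything from a real library.

```perl
use strict;
use warnings;

# Pure-Perl backend, defined inline so the sketch is self-contained.
package JSXL::PP {
    sub new       { bless {}, shift }
    # Placeholder behaviour: return a shallow copy of the input hash.
    sub transform { my ($self, $data) = @_; return +{ %$data } }
}

package JSXL {
    # Prefer the XS backend when it can be loaded, otherwise fall back.
    # JSXL::XS is hypothetical, so the require fails and PP is chosen.
    my $backend = eval { require JSXL::XS; 1 } ? 'JSXL::XS' : 'JSXL::PP';

    sub backend { $backend }
    sub new     { my ($class) = @_; return bless { impl => $backend->new }, $class }
    # The public API forwards to whichever backend was selected.
    sub transform { my ($self, $data) = @_; return $self->{impl}->transform($data) }
}

my $jsxl = JSXL->new;
print 'backend: ', JSXL->backend, "\n";
```

Callers that stick to JSXL->new and transform never notice which backend is in use. Code that reaches into the backend's internals is what breaks when the implementation is swapped.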

And then the hate mail comes in because people have been relying on my private implementation and they can't do that with the XS version. A class is only there to share its API. The class is not just a bundle of state with some behaviors on top. Inside of the class, the state and behavior are tightly coupled and allowing you to mess with one without respecting the other is like changing the spark plugs in a car and installing copper spark plugs without knowing the engine requires platinum spark plugs.

Encapsulation is about respecting that coupling of state+behavior and only relying on the public API. That goes for subclasses, too. In fact, inheritance should be about sharing an API, not the implementation, but I won't win that fight (and I confess I often don't respect that for my own code, so it would be a weak argument, but then, I try to avoid inheritance now).

The real problem is the problem that Perl keeps running into again and again: respecting boundaries. If I publish some code, some rando out there can do just about anything they want to with it, including doing things against my will.

By the dying light the seven sublimating deities of the twilight realm, do not touch my privates!

Call me Woke.

But meh, do whatever you want with your own privates. That's your business, not mine. Your Perl can be TIMTOWTDI, but let me trust my own code.

And that brings us to what I think is the X/Y problem:

class JSXL::Extended :isa(JSXL) {
    ...
}

If JSXL is my class, whose is JSXL::Extended? If it's my class, I should be allowed to shoot myself in the foot. If it's your class, no footshooting allowed. But Perl has never had a method of handling that. This is a fundamental problem Perl has never solved (witness the various inside-out OOP systems on the CPAN which try to address that).

As an aside, if we think of web servers and web browsers as objects, they are completely separated by an API. They provide isolation, a complete hiding of state+process which goes beyond encapsulation. It's so powerful that if the web server crashes on a request, your browser does not. However, I never proposed that for Corinna because I am a Bear of Little Brain and I had no idea how to do that.

Getting back to our problem.

I've given plenty of thought about how this could be accomplished. I toyed with :authority(OVID) and friends, but wasn't sure how we could make that work without significant changes. Another thought was:

class JSXL {
    # field declarations
    # method declarations
    class JSXL::Extended :inner {
        ...
    }
}

In the above, JSXL::Extended would have access to the lexical variables in JSXL, perhaps along with the private methods. You're writing the code, so you can do what you want, even though I have doubts about the approach.

But as systems grow, that's not going to be scalable. You really should not have deep object hierarchies and should instead resort to composition over inheritance. But Perl is Perl and Perl is the Perl community, so this might be an acceptable compromise:

class JSXL {
    # field declarations
    # method declarations
    class JSXL::Extended :trusted;
}

# in another file
class JSXL::Extended :isa(JSXL) {
    ...
}

In that example, the JSXL class must explicitly list what classes are permitted to extend it and the JSXL::Extended can have some kind of access. If we don't have a class Random::Class :trusted; declaration inside of JSXL, then you writing class Random::Class :isa(JSXL) {...} fails at compile time.

Thus, the owner of JSXL has full control. That being said, this is not well-thought out, so I'm not suggesting this approach.
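The :trusted attribute does not exist, but the allow-list idea can be approximated in today's Perl. This is a minimal sketch under two loud assumptions: it uses plain blessed hashes instead of Corinna classes, and it rejects untrusted subclasses at construction time (runtime), not at compile time as the proposal imagines.

```perl
use strict;
use warnings;

package JSXL {
    # Allow-list of subclasses the author has agreed to support.
    my %trusted = map { $_ => 1 } qw(JSXL::Extended);

    sub new {
        my ($class, %args) = @_;
        die "$class is not a trusted subclass of JSXL\n"
            unless $class eq __PACKAGE__ || $trusted{$class};
        return bless { %args }, $class;
    }
}

package JSXL::Extended { our @ISA = ('JSXL'); }
package Random::Class  { our @ISA = ('JSXL'); }

my $ok  = JSXL::Extended->new;
my $err = eval { Random::Class->new; 1 } ? '' : $@;
print "allowed: ", ref($ok), "\n";
print "refused: $err";
```

A subclass can still override new and sidestep the check, so this only raises the bar; it does not enforce the boundary.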

However, if someone goes into JSXL and locally hacks it to have class Random::Class :trusted;, then all bets are off. However, they've chosen to do something naughty and that's on them.

And they leave the company and someone else comes along and updates to a new version of JSXL to fix a critical bug and is trying to figure out why code breaks left and right. This is not a hypothetical problem. We discovered at one company that the lead dev, who had left the company, had hacked CGI.pm on the production servers to handle some new features they wanted, but since it wasn't in source control, we had no idea what the hell was going on.

This, in fact, brings us to a second problem that I've wanted to deal with, but we can't do that right now. Imagine a cpanfile with the following:

requires_verified JSXL => 'v1.2.3';

Imagine if that pointed to a canonical, secure set of SHA-3 hashes. When JSXL and all related files are loaded, hashes are calculated and things blow up if the hashes don't match. No one can go in and locally hack JSXL to do naughty things.
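The check itself is simple to sketch. The example below is only an illustration, with assumptions stated up front: requires_verified is hypothetical, a temporary file stands in for the installed module, and SHA-256 from the core Digest::SHA module is swapped in for SHA-3, which would need a non-core module.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);
use File::Temp qw(tempfile);

# Return true when the file's SHA-256 digest matches the pinned value.
sub verify_file {
    my ($path, $expected_hex) = @_;
    my $actual = Digest::SHA->new(256)->addfile($path, 'b')->hexdigest;
    return $actual eq $expected_hex;
}

# Stand-in for an installed module file.
my ($fh, $path) = tempfile( UNLINK => 1 );
binmode $fh;
my $source = "package JSXL; 1;\n";
print {$fh} $source;
close $fh or die "close failed: $!";

# The digest a trusted index would publish for this release.
my $pinned = sha256_hex($source);
print "pristine: ", ( verify_file($path, $pinned) ? "ok" : "MISMATCH" ), "\n";

# Simulate a local hack of the installed file.
open my $out, '>>', $path or die "open failed: $!";
binmode $out;
print {$out} "# local hack\n";
close $out or die "close failed: $!";
print "hacked:   ", ( verify_file($path, $pinned) ? "ok" : "MISMATCH" ), "\n";
```

A real mechanism would also have to protect the list of pinned digests and run before the module is compiled, neither of which this sketch attempts.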

They could then hack the cpanfile to have requires JSXL => '= v1.2.3';, so we give them a way of working around the issue, but at least they have to jump through multiple hoops before they do something naughty.

At the end of the day, we can't build perfectly secure, safe systems, but we should at least provide tools which make it easier for me, as a module author, to know that I can provide an interface and have the freedom to change internals. Later, we want a way for you, as a module consumer, to know you're getting only the module you requested.

In other words, we can respect boundaries both ways.

Your Example

Looking at your example (and yes, please bring an ECS system to Perl! Pretty please!)

class Games::ECS::System {
    field $ecs;
    method ecs($new=undef) { $new ? $ecs = $new : $ecs }

    method components_required { [] }
    method update(@entities) { ... }
}

class HealthBarRendererTest :isa(Games::ECS::System) {
    method components_required { [qw(Position Health)] }
    # ... removed test-harness code
}

Here's what I would do (assuming I've understood correctly, which might not be true):

class Games::ECS::System {
    # this should probably be a method for strict validation
    # because we do not yet have data constraints
    field $ecs;
    field $components_required :param = [];
    method update(@entities) { ... }
}

class HealthBarRendererTest {
    use Games::ECS::System;
    field $ecs = Games::ECS::System->new( components_required => [qw(Position Health)] );

    # ... more code here
}

In the above, we compose rather than inherit, and we rely on the public interface without violating encapsulation. Further, if other people are allowed to instantiate Games::ECS::System, you've already been eating your own dog food, so you can have greater confidence that you've provided what is needed in the contract.

@Ovid
Collaborator

Ovid commented Mar 26, 2023

As an aside, my "compose rather than inherit" example might be even more generalizable if we use dependency injection, but it really depends on how the class is set up.
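
A rough sketch of what the dependency-injection variant could look like, assuming perl 5.38+ with the class feature enabled. The :param on $system and the delegating components_required method are illustrative assumptions, not anyone's actual code from this thread.

```perl
use v5.38;
use feature 'class';
no warnings 'experimental::class';

class Games::ECS::System {
    field $components_required :param = [];
    method components_required { $components_required }
}

class HealthBarRendererTest {
    # Hypothetical: the collaborator is passed in rather than built here,
    # so a test can hand over a stub with the same public interface.
    field $system :param;

    # Delegate through the public API only; no access to the system's fields.
    method components_required { $system->components_required }
}

my $system = Games::ECS::System->new(
    components_required => [qw(Position Health)],
);
my $test = HealthBarRendererTest->new( system => $system );
say join ',', $test->components_required->@*;    # prints Position,Health
```

Whether this is worth it depends on how often the collaborator needs to be swapped out; for a single fixed dependency, constructing it in the field initializer is simpler.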

@perigrin

perigrin commented Jul 16, 2023

Ok, this is verging on necro-threading but I honestly have been trying to formulate my thoughts for a response. It boils down to two problems I have, one technical and one philosophical.

Problem 1: Technical

Encapsulation is about respecting that coupling of state+behavior and only relying on the public API. That goes for subclasses, too. In fact, inheritance should be about sharing an API, not the implementation [...]

One of the criticisms leveled, correctly, at Moose was that it required the creation of public APIs that had no purpose other than to simplify internal logic.

has _bar => (
    is => 'ro', # now my class has a public `_bar()` method for internal purposes
    lazy => 1,
    default => sub { HTTP::Thin->new()->request(GET 'http://example.com')->as_json() }
);

Basically you're saying inheritance shouldn't/cannot violate encapsulation without explicitly creating behavior to mediate it. Unfortunately, the only solution at present is to have public behavior mediate private implementation details from parent to child. Yuck. I think you (@Ovid) agree with "yuck" because you mention how you've gone through a lot of exercises to claw back some semblance of encapsulation via protected or trusted or inner classes.

I'm utterly on board with the first part: state access must be mediated by behavior. I'm utterly appalled by the second: behavior must be public. We also have a second problem …

Problem 2: Philosophical

The Child class does not combine those fields. (Even if it currently does, that's an implementation detail we should be free to change) It does not have access to the fields of the base class because all field $value does is declare a lexical instance variable.

What is an instance? Given the following code (borrowed just now from an early draft of a tutorial):

class Action { 
    field $entity :param; 
    method log_action($logger) { 
        $logger->printf('%s performed %s', $entity->name, blessed $self);
    }
    method perform() { ... } 
}

class MovementAction :isa(Action) { 
    field $dx :param; 
    field $dy :param; 
    method perform() {
        # TODO: $entity->move($dx, $dy)
    }
}

my $action = MovementAction->new(entity => $player, dx => 0, dy => 1);

What is $action an instance of? Is it an instance of both Action and MovementAction? [1] Based on the logic of what you're saying, that state belongs only to the class that defines it, this is the only rational definition I can think of. That is going to be a new formalism for most programmers [2]. Also, we immediately imply otherwise in our public constructor syntax, where we pass state for the parent class via the child.

I think ultimately the answer is that inheritance is a mistake, period. Unfortunately we're not off the hook, because we're going to see similar problems with private state in roles, which in my experience have similar issues with behavior [3]. State should be lexically scoped, but also sometimes needs to be communicated via non-public channels in a controlled manner. Ultimately, I think any mechanism we include to provide polymorphism short of an actual type system is going to cause this problem.

Where does that leave us (me?)?

Assuming the above, I think the right answer is to have a protected keyword similar to my:

    protected method entity() { $entity }

which either compiles to something like:

   method entity() { 
       die "invalid caller" unless caller() eq __PACKAGE__;
       $entity;
   }

or, better, is entirely invisible outside the scope of a class or its subclasses. I think Damian's concerns would be mitigated if it could only be declared on methods, but I can foresee a world with a :reader attribute and an analogous :protected that simply expose behavior for consumption within a limited scope … if you want to allow manipulation of the parent state, you need to write your own protected method to explicitly allow that.

This would allow providing some way to communicate some private state in a controlled fashion without opening the world to public consumption of our implementation details.
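
For what it's worth, the runtime-check variant can be approximated today. This is only a sketch, assuming perl 5.38+ with the class feature; it uses an isa test on the caller instead of the strict eq comparison above, on the assumption that "protected" should admit subclasses as well as the declaring class. The describe method is made up for illustration.

```perl
use v5.38;
use feature 'class';
no warnings 'experimental::class';

class Action {
    field $entity :param;

    # Emulated "protected" reader: only Action and its subclasses may call it.
    method entity {
        my $caller = caller;    # package name of the calling code
        die "invalid caller: $caller\n" unless $caller->isa(__PACKAGE__);
        return $entity;
    }
}

class MovementAction :isa(Action) {
    # Hypothetical method showing a subclass reading parent state
    # through the mediated reader rather than the field itself.
    method describe { 'moving ' . $self->entity }
}

my $action = MovementAction->new( entity => 'player' );
say $action->describe;    # prints "moving player"

# A call from outside the class hierarchy (here, package main) is rejected.
my $ok = eval { $action->entity; 1 };
print "rejected: $@" unless $ok;
```

The obvious weakness is that the check happens at run time and the method is still visible to can() and friends, which is why the "entirely invisible" option seems preferable if it can be implemented.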

Footnotes

  1. I'm not trying to be pedantic. I'm trying to understand the paradigm well enough to explain it to others.

  2. see also my struggling with it … and I have more than an average level of exposure to Object Oriented thinking

  3. see also: arguments on the Moose mailing list back in the day

@perigrin

Having spent more time thinking and reading: alternatively we could just enable shadowing of parent fields.

class Base { 
    field $f :param; 
    method f() { $f } 
}
class Child :isa(Base) { 
    field $f :param; # currently fails: f => is already allocated in the constructor
}

class Child2 :isa(Base) { 
    field $f :param = 1; # currently fails: f => is already allocated in the constructor
    method f() { $f }
}

my $child = Child->new(f => 1);
say $child->f; # would print undef because the f() method wasn't shadowed
my $c2 = Child2->new(); 
say $c2->f(); # prints 1\n

We get 0 behavior re-use but we have a much more solid contract with behavior than anything else I can think of.

@Ovid
Collaborator

Ovid commented Jul 17, 2023

Agreed that :protected (or :trusted) would only apply to methods. The main problem I foresee is this:

method execute () {
    $self->one;
    $self->two;
    $self->three;
}

method two () :trusted { ... }

In the above, two is called after one and before three. The question is whether that's a behavioral-flow requirement or incidental. If it's the latter, maybe we're OK. If it's the former, then subclasses overriding two (and they should have :override or else it's fatal) might call it without correctly calling one and three. It could be a tricky thing to get right. However, the parent class, by having a :trusted attribute on the method, is saying "I'm designed to be subclassed", so any trusted method must have its use documented, and that becomes part of the class contract with subclasses, but not with the outside world.
