Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visibility of fields in other instances #64

Open
leonerd opened this issue Jul 12, 2022 · 27 comments
Open

Visibility of fields in other instances #64

leonerd opened this issue Jul 12, 2022 · 27 comments

Comments

@leonerd
Copy link
Collaborator

leonerd commented Jul 12, 2022

We've already talked lots on the contentious subject of whether subclasses can see fields of their parent class (#61, etc...). Lets ignore that for now and just stick to fields within the same class.

It would be useful if certain code within a class could see fields of other instances, within that class. Primarily this is useful for things like operators.

Consider our dear old standard friend, the simple 2D point:

class Point {
   field $x :reader :param;
   field $y :reader :param;
}

We'd like to write an equality test operator for this. It could easily be written using those :reader methods:

class Point {
   ...
   use overload '==' => method ($other) {
      $x == $other->x and $y == $other->y;
   };

It is a little unsatisfying though to have to use the public reader API to access these fields. It relies on us always having those methods, and it means that every test requires two more dynamic method dispatches.

Compare to the way you'd probably write this in classical perl, which doesn't need any method invocations at all:

package Point {
   ...
   use overload '==' => sub ($self, $other) {
      $self->{x} == $other->{x} and $self->{y} == $other->{y};
   };

I can't help thinking we might be missing a trick here. Some nice way we could write this better.

@abraxxa
Copy link

abraxxa commented Jul 12, 2022

But that implies, that the compared $other object is of the same class.

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

Should this be in the Discussions instead? They're available on this repo.

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

Liskov implies that that overloading must be inherited (and can be overridden in some cases). Thus, this:

   use overload '==' => method ($other) {
      $x == $other->x and $y == $other->y;
   };

Must work even for a subclass because ->x and ->y are part of the public interface. But this is problematic:

   use overload '==' => sub ($self, $other) {
      $self->{x} == $other->{x} and $self->{y} == $other->{y};
   };

The above isn't not guaranteed to work with subclassing because the public interface of the subclass is not guaranteed to correspond to any particular internals:

 class PointDelta :isa(Point) {
    field ( $dx, $dy ) :param;

    method x { return $dx + $self->x }
    method y { return $dy + $self->y }
}

I think it's very important we focus on getting our interfaces correct before we start looking at optimizations.

Update: Also, I realize you're talking about fields within the same class, but as @abraxxa points out, this requires asserting that $other is exactly the same class, not a related class, or a class with the same interface/role, or a subclass.

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

Steve McConnell writes in Code Complete, 2nd Edition:

It’s almost impossible to identify performance bottlenecks before a program is working completely. Programmers are very bad at guessing which four percent of the code accounts for 50 percent of the execution time, and so programmers who optimize as they go will, on average, spend 96 percent of their time optimizing code that doesn’t need to be optimized. That leaves little time to optimize the four percent that really counts.

He also writes:

Focusing on optimization during initial development detracts from achieving other program objectives. Developers immerse themselves in algorithm analysis and arcane debates that in the end don’t contribute much value to the user. Concerns such as correctness, information hiding, and readability become secondary goals, even though performance is easier to improve later than these other concerns are. Post hoc performance work typically affects less than five percent of a program’s code. Would you rather go back and do performance work on five percent of the code or readability work on 100 percent?

His book is incredibly well-researched and provides plenty of research to back up his assertions. We programmers are awful at guessing what's actually going to be our main issues.

@haarg
Copy link

haarg commented Jul 12, 2022

What about attributes without public readers?

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

If the attribute doesn't have a public reader, it's not part of the API. Thus, subclasses won't worry about it, nor should related classes.

Pseudocode:

class Point {
    has ($x, $y) :param; # no reader

    # for the sake of argument, we assume that the $MOP is also injected
    # into methods
    user overload '==' => method ($other_point) {
        # get corresponding $MOP
        my $other_mop = $MOP->mop_for_same_class($other_point); # exception if not the same class
        my $x2 = $other_mop->get_field('$x')->value; # guaranteed to exist
        my $y2 = $other_mop->get_field('$y')->value; # guaranteed to exist

        return ( $x == $x2 && $y == $y2 );
    }
}

The above is cumbersome and not well thought out, but the MOP is there to handle edge-cases. So I would suggest that we will need more work designing an effective MOP rather than injecting new language features into Corinna.

Happy to hear counter-arguments. I see lots of code smells in my example.

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

Hmm, I need more practical examples. I can't see the point (no pun intended) of a Point class which doesn't allow readers on $x and $y. As a result, if we should not expose this state, why would we expose equality of state?

Thus, without seeing some kind of concrete use case, it's hard for me to come up with a better solution at this time.

@haarg
Copy link

haarg commented Jul 12, 2022

The original discussion was not about the API or about subclasses. It was about the current class being able to access fields from other instances.

Any solution requiring the MOP that is not an extension or debugging tool is a failure IMO. A class should never have to use it against itself.

Cloning is an example of an operation where you would want to modify the internals of another instance without providing any direct external access to the fields.

@haarg
Copy link

haarg commented Jul 12, 2022

I could see a solution to this being twigil-like. $other!field to access $field in $other. And then $!field could imply $self, and be basically the same as $field. The specific spelling of this could probably be better.

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

Any solution requiring the MOP that is not an extension or debugging tool is a failure IMO.

Could you define "extension" in this context?

@haarg
Copy link

haarg commented Jul 12, 2022

An extension to the object system itself, along the lines of MooseX modules.

@abraxxa
Copy link

abraxxa commented Jul 12, 2022

If it is planned to support serialization of objects, this serialization can be used to compare objects by serializing both and do a simple string eq.

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

I think that since a class is an expert in its problem domain and thus an expert on its internals, it should be reasonable to allow a class to read the instance data of other instances of itself (not subclasses). After all, we can assume that if a class is instantiated, any other instantiations it encounters should be identical. However, that presupposes:

  • If an instance is a thawed, earlier version with different internals might cause failures
  • A class isn't modified by some runtime role application (this implies that roles may not inject state)

Another possibility is that a class can call private methods on other instances of itself, but that implies that those are effectively trusted methods and that violates encapsulation (a private method often changes internal state, but the control flow may be changed by another instance class calling that method, so this is a very bad idea).

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

The simplest solution is to say that if a class wishes to compare itself to some other class, it must provide some public way of providing comparable data (provide a :reader or a method which returns a string representation or data structure). For the cases where we need to compare objects, our interface, and thus our contract, grows slightly, but we have safety in falling back to a well-understood model rather than jumping through hoops to provide something new and possibly making bad design choices (such as some of my ideas above).

@abraxxa
Copy link

abraxxa commented Jul 12, 2022

Regarding freeze/thaw (serializing/deserializing): we could use semantic versioning and only allow thawing of frozen state up to the same minor or major version.

@Ovid
Copy link
Collaborator

Ovid commented Jul 12, 2022

@abraxxa That requires the developer to manually add version numbers, doesn't it?

@abraxxa
Copy link

abraxxa commented Jul 12, 2022

I was still thinking class = package which has a version.
But yes, that would require that a class has a (meta?) version attribute that‘s used to ensure this kind of safety.
But freeze/thaw isn‘t widely used, so I‘m not sure if we should invest too much brainpower into it.
Nothing stops anyone from adding a role that handles it for their class.

@HaraldJoerg
Copy link
Contributor

But yes, that would require that a class has a (meta?) version attribute that‘s used to ensure this kind of safety.

Depending on the use case, you might need that recursively for all fields which are objects, and also for the class inheritance hierarchy and every role (BTDT).

Nothing stops anyone from adding a role that handles it for their class.

In the current concept (and the current Object::Pad implementation), a role is of very limited use with regard to freeze/thaw. A role has no special access to object fields. A role can require a method from the consuming class to produce the class fields in a serializable way: this would also work for inheritance. But it does not work for fields from other roles consumed by the class unless each of these fields has a public reader method.

The current MOP of Object::Pad, on the other hand, allows to examine any object because you can pass it an object instance. Therefore, you can just write a serializer in a separate piece of code and don't need to add it to a role.

Passing another instance would also be required for a MOP-based comparison of two objects as drafted by @Ovid above. Compared to just making the relevant fields readable, this looks icky.

@thoughtstream
Copy link

The point of encapsulation is to ensure that the state of an object cannot be
changed by unrelated code. That’s why code that is outside the class (or, more
precisely, outside the class definition’s block scope) should not have access
to any field of that class. This is both Right and Good. ;-)

But the fields of another object are in the block scope of their shared class,
so whether an object’s fields should be accessible to another object of the same
class then becomes less a question of ethics...and more a question of theology.

From one (entirely reasonable) point of view, a declared field of any object
clearly is in the lexical scope of all subsequent methods of the defining class.
Hence there is no reason it should not be accessible to those methods, even if
some other object of the same class is currently invoking said methods.

However, from another (equally reasonable) point of view, fields are per-object
entities. So they are technically only in scope of the current instance methods
of the corresponding object. That’s why you can’t access an object’s field from
a :common method of the same class. Common methods are per-class, not per-object.
Likewise—in at least some OO formulations—ordinary methods are considered
to be per-object, not per-class. Hence they should only have access to the
current instance of the field belonging to the current object.

Looking at it from another perspective, encapsulation is all about being able to
determine the behaviour of a class, and of its instances, by looking only at the
code defined in that class. Allowing an instance method to access the fields of
other instances does not break such a contract.

But from the perspective of an individual object, the implied guarantee of its
state consistency is at least weakened when other objects can access that state.
An object can no longer rely on its state only changing when a method is called
on that object itself; its state may now change when a method is called on some
other instance of its class, (or, indeed, on a class method, since a class method
can also presumably access the internal fields of any object passed to it).

This, to my mind, quickly starts to become dubious from a software engineering
perspective. Individual objects can no longer rely on their own integrity of state;
state consistency is reduced to only a class-level property.

In practice, of course, direct access to another object’s fields is only
problematical if it is write-access. If we permit only read-only access
to those fields, objects can never fall out of a consistent state unexpectedly.
So, from a philosophical point of view, I have no objection to read-only
inter-object field access within a class.

However, there are also practical issues. The most significant of which is,
I think, that Corinna classes will not (at least initially) support type
constraints on method parameters. And will almost certainly not require
such constraints, if they ever do become available.

To see why this is an issue, let’s look at @leonerd’s example:

   use overload '==' => method ($other) {
      $x == $other->x and $y == $other->y;
   };

The parameter $other has no type specifier, so that second argument to
the == operator could be almost anything:

    my $origin     = Point->new( x=>0, y=>0 );
    my $chromosome = DNA::Seq->new( pairs => \@seq );

    if ($origin == $chromosome) {
        say "Matched";
    }

This (one hopes!) would fail with a “No such method (x) in class DNA::Seq”
exception. Unless, of course, that class does indeed have x() and y()
accessors. In which case, it just silently does the wrong thing. :-(

But suppose we adopt some mechanism to allow direct field access.
Perhaps, @haarg's proposed $object!field syntax (though I would want
to think about possible syntaxes at much greater length before committing
to any particular one!)

Now the equality operator code is:

   use overload '==' => method ($other) {
      $x == $other!x and $y == $other!y;
   };

and the exception is something like: “Cannot access field $x of class DNA::Seq
from within a method of class Point”
. And we get that exception even if
DNA::Seq does happen to have an $x field. That’s a pretty clear win.

However, the win comes at a cost. In order to produce that exception, every
direct $object!field access has to check the class of $other against the
current class...at runtime...every time. Which means that the attempted field
access can’t be hardwired into the code by the compiler so as to improve
performance and enable compile-time error detection. Which seems to reduce
the value of the feature somewhat.

At least, until we can write:

   use overload '==' => method (Point $other) {
      $x == $other!x and $y == $other!y;
   };

...and shift the checking and access-hardwiring to compile-time.

All in all, I’m not entirely convinced that a slightly safer, but
probably no more efficient, mechanism for inter-object field access
is worth the additional cognitive load of adding yet another variable
syntax to Perl, or the inherent disadvantages of weakening inter-object
isolation.

That said, I’m not completely opposed to the idea either...provided that
any such direct field access is strictly read-only, and strictly confined to
the lexical scope of the field itself.

And, of course, I’d be far more enthusiastic if Corinna also had
parameter types, so that direct accesses could provide
better performance and compile-time error-detection.

@Ovid
Copy link
Collaborator

Ovid commented Jul 13, 2022

@thoughtstream wrote:

That said, I’m not completely opposed to the idea either...provided that any such direct field access is strictly read-only, and strictly confined to the lexical scope of the field itself.

I almost like this, but references are what's killing us every time. We would need deep cloning (which isn't always possible) and that is expensive. Copy-on-write would be lovely, but we don't have that and it's unlikely we'll get it soon. So we have no clean solution here.

@thoughtstream
Copy link

@Ovid noted:

references are what's killing us every time.

Indeed. And since Corinna does not (as yet) allow field @array or field %hash,
we can assume that a non-trivial percentage of fields will contain array or hash references. :-(

@Ovid
Copy link
Collaborator

Ovid commented Jul 13, 2022

@thoughtstream Actually, Object::Pad allows that and I'm pretty sure it's going to be in Corinna (but probably not with attributes).

@leonerd
Copy link
Collaborator Author

leonerd commented Jul 13, 2022

https://github.com/leonerd/perl5/blob/feature-class/t/class/field.t#L54

@thoughtstream
Copy link

Thanks for the correction, @Ovid and @leonerd.

However, I think my point still stands. If container fields take no attributes, then their contents can't be publicly initialized (via field @container :param) or retrieved (via field %container :reader), so we're probably going to see scalar fields co-opted for that: field $container_ref :param :reader

And @Ovid is quite right that this isn't even the most problematic case. Deep-copying such "container reference fields" (to ensure that they're read-only to other objects of the same class) would be feasible, albeit inefficient. But we're also going to see fields that contain references to closures, to methods, to filehandles, or to other objects. Deep-cloning those into immutability is not feasible at all.

@HaraldJoerg
Copy link
Contributor

In the same spirit I'd expect that a class method would also be able to use fields of its objects? That would make sense for functions which operate on a list of objects (e.g. sort objects according to a field).
Anyway, the feature is limited to simple cases where the fields in question are neither in a parent class nor in a role... and it would rise the question whether proper subtypes are allowed to stand in for an object.

@aquanight
Copy link

Regarding the issue of cross-instance field access... at least in the case of comparison there is a possible alternative approach:

class Point {
   ...
   method $equals ($other_x, $other_y) {
     $x == $other_x && $y == $other_y;
   }
   use overload '==' => method ($other) {
      $other->$equals($x, $y);
   };

Similar technique can probably be leveraged in most other cases. Note there's no dynamic dispatch here, and no need to dynamically recover fields individually or by name. There's (hopefully) a runtime check on $other but you needed that anyway.

@Ovid
Copy link
Collaborator

Ovid commented Dec 29, 2022

A blog post will be coming later, but for now ...

I ported two-decade old code to use the new object-oriented syntax for Perl. This is a modestly popular module and at least 34 other CPAN distributions use it. I often find it being used at client sites. In porting it, I learned the Corinna spec is flawed because I couldn't port it cleanly. Oh shit!

I am the lead designer of Corinna. Years of effort, building on top of extensive Perl history, taking the knowledge gleaned from Moo/se and many other OOP alternatives, books read, articles read, and an entire design team putting this together, but there are still some pretty glaring conceptual design flaws in Corinna. How did we screw up so badly?

After a couple of hours of writing and rewriting the explanation of what Corinna did wrong, including exploring ways of fixing the issue, I discovered Corinna is not flawed. It was my original OO design that was broken. Badly. The design of Corinna exposed design flaws in my code that I had never seen before.

Many Perl developers are going to discover they don't actually understand class-based OOP (including me, apparently).

I've already gotten grief from people who are pissed that OOP best practices that have been accepted for decades are provided in Corinna. I expect more grief in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants