Access to constructor params from within ADJUST and field-initialiser blocks #77

Open
leonerd opened this issue Aug 27, 2022 · 40 comments

@leonerd
Collaborator

leonerd commented Aug 27, 2022

Current state in Object::Pad

First, a little background on the current state of Object::Pad. There are ADJUST blocks, and there are field-initialiser blocks. They behave in very different ways.

The ADJUST keyword creates an entire (anonymous) function which behaves as such when invoked. It is passed a hashref to the constructor params as its first argument. It can access that with $_[0] or shift or optionally by providing a signatured parameter name:

ADJUST { say "The foo param was $_[0]->{foo}"; }
ADJUST { say "The foo param was ", shift->{foo}; }
ADJUST ($params) { say "The foo param was $params->{foo}"; }

(Of course, these examples are stupid in that they're doing things that :param would be far better suited to, but I didn't want to cloud up the examples with more complex real-world examples.)

By comparison, field-initialiser blocks are simple blocks, being parsed by no small amount of parser trickery into believing they're all just sequential blocks of the same function. That function is a single (anonymous) function stored as part of the class, used to initialise all the fields. These initialiser blocks do not currently have any access to the constructor parameters; which is somewhat limiting.

Both ADJUST and field-initialiser blocks get access to a $self lexical, the same as other methods.

Current state in feature-class branch

The current feature-class branch provides ADJUST blocks that allow either of the first two forms, but not the third form (with a signature parameter). The branch does not currently provide field-initialiser blocks at all.

The Question

And now we get on to my question. I would like to unify these two things together and at the same time make them better. In particular, I would like both ADJUST and field-initialiser to be simple blocks within one (implied) function rather than having one entersub overhead for every block. Additionally, I would like all of them to have the same access to the constructor parameters.

In particular, it would be great if we could say that a field-initialiser block is really just the same as an ADJUST block that assigns the field from its result, give or take some code that inspects the parameters hash to see if the caller already provided a value.

field $f { EXPR }   <==>   field $f;
                           ADJUST { $f = exists $params{f} ? $params{f} : do { EXPR }; }
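
As a rough illustration of that equivalence in plain Perl (not Object::Pad itself), the "use the caller's value if present, otherwise evaluate the block" step could be sketched like this. The helper name `init_field` and the use of a coderef for EXPR are assumptions for the sketch; it also uses `delete` rather than a plain lookup so that consumed names disappear from the hash.

```perl
use strict;
use warnings;

# Hypothetical helper: emulates "field $f { EXPR }" as an ADJUST-like step.
# $name is the parameter name; $default is a coderef standing in for EXPR.
sub init_field {
    my ($params, $name, $default) = @_;
    return exists $params->{$name}
        ? delete $params->{$name}    # caller supplied a value: consume it
        : $default->();              # otherwise evaluate the initialiser block
}

my %params = ( x => 10 );
my $x = init_field(\%params, x => sub { 1 });   # taken from the params
my $y = init_field(\%params, y => sub { 2 });   # falls back to the default
print "x=$x y=$y\n";
```

Note the default coderef is only invoked when the parameter is absent, matching the `do { EXPR }` arm above.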

The trouble, though, is how to provide this params hash(ref). It can't just come as the first value in @_, because any block could too easily break access to it for all the later ones by just doing shift, or something equally dumb. It needs to be provided by something guaranteed to be visible to all the blocks, that no earlier block can get in the way of. So far I can only think of two ideas, neither of which is great:

Special %params lexical

In the same way that $self is special, make %params special in these blocks:

  field $x { delete $params{x} }

  field $y; ADJUST { $y = delete $params{y} }

It's a simple idea and easily understandable, but it does mean that no class is permitted a field called %params itself, because then there'd be no way for an ADJUST block to see it. Of course we're already in that situation at the moment with $self, so it doesn't make the problem much worse. But perhaps enough classes might themselves want a field %params, that this would become awkward.

Use the %_ superglobal

By considering analogy to the @_ superglobal already in perl, we can consider using the %_ hash as a storage of these name/value pairs:

  field $x { delete $_{x} }

  field $y; ADJUST { $y = delete $_{y} }

I think there's a certain neatness to this, and a certain symmetry with using $_[0], $_[1], etc... However, some folks might object to it on grounds that "newbies to perl might be confused by lots of symbol syntax". Personally I'm not very swayed by this particular argument, but it seems important to some. Another potential argument against doing this is that it seems weird to introduce a new use for %_ while at the same time trying to get rid of @_ in favour of function signatures.
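
For anyone unfamiliar with %_, here is a small plain-Perl sketch of what the idea would feel like. It only imitates the proposal by localising %_ by hand inside an ordinary sub; the sub name `build` is made up for the example, and nothing here is how Object::Pad or the feature-class branch actually behaves.

```perl
use strict;
use warnings;

# Hypothetical constructor-like sub that exposes its name/value pairs as %_.
sub build {
    local %_ = @_;               # %_ is an ordinary global hash, exempt from strict vars
    my $x = delete $_{x};        # what "field $x { delete $_{x} }" might do
    my $y = delete $_{y};        # what "ADJUST { $y = delete $_{y} }" might do
    my @unknown = sort keys %_;  # anything left over was not consumed
    return ($x, $y, \@unknown);
}

my ($x, $y, $unknown) = build( x => 1, y => 2, z => 3 );
print "x=$x y=$y unknown=@$unknown\n";
```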

Other Designs

Another way to look at it entirely, is to observe that the main reason for wanting the hash(ref) of params available in these blocks in the first place, was to do things with constructor parameters that aren't just simply "copy the value into a field". This was originally discussed for Object::Pad in RT137209. I say "discussed" - I thought out loud on a few ideas but nobody else has commented so far.

In many ways the current solution of just passing a hash(ref) around isn't very nice. It means that the actual parameter names used by the construction process overall are not manifestly expressed anywhere, and only become apparent in side-effects of the actual process of constructing an object. You can only find out those names by running it and seeing if it complains about missing or extra ones. It would be nice to find a nicer overall design for this sort of pattern, but so far one has not emerged.

@leonerd
Collaborator Author

leonerd commented Aug 29, 2022

Latest thoughts: While it doesn't help the field-initialiser block, an ADJUST block could still work with something that happens superficially to look like a signature. The same parser hackery that applies to allowing each ADJUST block to just resume parsing of a single containing sub can be employed to adjust the pad names before resuming. It would permit:

ADJUST (%params) { say join "|", keys %params }
ADJUST (%args)   { say "Foo is $args{foo}" }
ADJUST           { say "params aren't visible here at all" }

which would allow ADJUST blocks to request access to the params, while neither eating up a reserved name nor using the %_ superglobal. Though it still doesn't solve the wider metadata/declarative question.

@haarg

haarg commented Aug 29, 2022

I don't like the idea of "Special %params lexical". %params is not obviously magic, and someone could easily want to name a field or other variable with that name. $self is only acceptable as a reserved variable because it is so well established in perl that we can't really use anything else for the instance variable. But no such consensus exists for constructor params.

%_ is better, in that it is obviously "special" and maps well to the old @_. But we're trying to move away from @_.

Of the options presented so far, something declared by the user seems the only really acceptable option. As stated, it's not obvious how to combine that with initializer blocks, but I'm not sure that combining them is actually a necessary feature.

@leonerd
Collaborator Author

leonerd commented Sep 1, 2022

Thinking further about it, I'm not sure I like the exact syntax of this latter approach. It's nice to imagine giving an ADJUST block a way to request the name of a lexical hash created in scope that aliases the parameters hash, so it can delete on it, but I don't think the syntax given here is quite the right thing:

ADJUST (%params) { say join "|", keys %params }

Mostly because this really does something very different to the similar-looking syntax of

method f (%params) { ... }

What we're doing in the ADJUST block is really something a bit more like a refalias. Signatures don't support that yet, but if they did I imagine it'd look like method f (\%params) { ... } and so maybe that suggests similar for ADJUST. But that's what we'd always write, so it seems a bit noisy.
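
To make the copy-versus-alias distinction concrete, here is a sketch using the existing (experimental) refaliasing feature in a plain sub. The sub names are invented for the example; the point is only that a slurpy-style copy cannot `delete` from the caller's hash, whereas an alias can.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my %ctor_params = ( foo => 1, bar => 2 );

# Copy semantics, like a slurpy "(%params)" signature:
# deleting from the copy leaves the original untouched.
sub consume_copy {
    my %params = %{ $_[0] };
    delete $params{foo};
    return;
}

# Alias semantics, roughly what an ADJUST block would need:
# %params here *is* the caller's hash, so the delete is visible outside.
sub consume_alias {
    \my %params = $_[0];
    delete $params{bar};
    return;
}

consume_copy(\%ctor_params);
consume_alias(\%ctor_params);
print join(",", sort keys %ctor_params), "\n";   # prints "foo"
```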

Another potential downside to this current notation is that it doesn't leave us any room for adding more things in there. Again by analogy to slurpy parameters, it feels like %params ought to be last, with nothing else after it. I don't know what else we might want to pass in, but currently there isn't space for it. So boo. :(

I almost wonder if the ADJUST block wants an attribute to give it the name of a new lexical to create, instead:

ADJUST :please_provide_me_the_parameters_in_a_hash_called(%params) { ... }

I'll take opinions on a nicer name ;)

@leonerd
Collaborator Author

leonerd commented Sep 1, 2022

And now some more examples on the "declared metadata" front...

Looking around at various actual ADJUSTPARAMS blocks I have in real code right now, it seems most of them operate on a theme of taking some fixed set of named constructor arguments, and using them to set up the values of some fields by using code, rather than simply storing the values as passed in.

For example, a few operate on a theme of taking the name of a file to be opened, opening it in the ADJUST block, then storing the filehandle in a field of the instance:

https://metacpan.org/release/PEVANS/Electronics-PSU-DPSxxxx-0.02/source/lib/Electronics/PSU/DPSxxxx.pm#L67-80

Somewhat paraphrased:

field $fh :param { undef };
ADJUSTPARAMS ( $params ) {
   unless( $fh ) {
      my $path = delete $params->{path};
      # code here to open $path and store it in $fh
   }
}

It feels like this is a fairly common sort of pattern - parameters that are passed in, not to be stored directly, but instead to be used as part of a larger block of code that is evaluated at construction time which eventually yields the initial value for a field to be stored.

It's annoying that these parameters aren't declared anywhere, and only work as a side-effect of running the actual code. There's no predeclared metadata, so things like name-clashes aren't visible like they are for :param:

$ perl -MObject::Pad -E 'role R1 { field $x :param } role R2 { field $x :param } class C1 :does(R1) :does(R2) {}'
Named parameter 'x' clashes with the one provided by role R2 at -e line 1.

While it isn't yet a thing, it is tempting to imagine that perhaps some way to attach documentation snippets could be found for :param-annotated fields. If such a thing existed, then perhaps a sufficiently-advanced IDE would be able to display such information about named parameters being passed into constructors when they were encountered. These additional parameters being eaten by ADJUST blocks wouldn't be visible in that way either. Notice in the above example code, the string "path" only appears once, in the delete $params->{path} expression. It doesn't appear anywhere else - specifically, not in any compiletime-declared metadata about the class.

It's tempting to think that perhaps what is wanted is a new class-level keyword for declaring that named parameters exist, so that they can have such information attached.

param path :apidoc(The path to the filesystem device to open, if 'fh' is not provided);

However, that's just dangling out in space somewhere separate, there's nothing to attach it to the specific value within the ADJUST block. It'd be too easy to become separated from it.

It feels like the core of why this is hard to design comes from the fact that "named params" want to be two things at once:

  • Values passed in lexical variables directly into the body of an ADJUST block
  • Toplevel named declared things that are members of a class, known about at compiletime

There doesn't at the moment appear to be a syntax idea that somehow addresses both concerns.

@aquanight

One possible answer to this: revisit the notion of "lexical fields" (at the very least still outside of methods) and through this, not needing to provide any parameters to ADJUST:

Using the file example above:

field $fh :param { undef };
{
    field $path :param { undef };
    ADJUST {
        defined($fh) != defined($path) or die;
        $path and (open $fh, "<", $path or die "$path: $!\n");
        # In theory, when this ADJUST ends, $path is no longer visible to anything.
        # Thus it could be released from the object.
    }
}

@Ovid
Collaborator

Ovid commented Sep 13, 2022

I think we might revisit this by not looking at ADJUST as a phaser. The original intent was to look at this similar to Moo/se, but use different names for the methods to ensure they weren't viewed as the same thing as Moose methods. In particular, Moo/se has:

# called before new
around "BUILDARGS" => sub ( $orig, $class, @args ) { ... }; 

sub new ($class, $args) { ... } # implicit. We don't write this directly

# called after new
sub BUILD ($self, @args) { ... }

For Corinna, I envisioned something vaguely like this (with naming suggested by @thoughtstream, with NEW calling each of these in turn, but respecting inheritance by calling parent versions, too):

# before NEW
method CONSTRUCT :common (@args) {
     # $class is available, but not $self
     # must return even-sized list of key/value pairs that
     # are passed to new()
}

method NEW (%args) { ... } # called implicitly using args returned by CONSTRUCT

# after NEW
method ADJUST (%args) {
    # both $class and $self available. Do what you will before
    # new object is returned
}

This is somewhat described here in the Wiki, but I see now that it didn't make it into the official RFC. When I saw you were using ADJUST differently, I said nothing because I was tired of how divisive so many arguments about Corinna were. That's my fault. I should have said something. Burn out.

CONSTRUCT is used to take whatever is passed to new and return K/V pairs. After all CONSTRUCT methods are called, the final result goes to NEW (internal method). It's awkward because it looks like a phaser (and those don't take args) and it kind of looks like an instance method, but it's not quite one.

ADJUST is described here and is called after NEW.

However, the Moo/se paradigm is only awkward (in my opinion) because Moo/se is limited by Perl's current syntax. The overall idea seems solid.

@leonerd
Collaborator Author

leonerd commented Sep 13, 2022

Your suggestion there doesn't seem to answer any more clearly how to handle these different params. Perhaps a concrete example is needed.

Let's imagine some UI widget that needs a title and a colour specified in red/green/blue components. The title is just stored directly as text, but the red/green/blue are actually forwarded to the constructor of a "Colour" object, which is then stored. Right now in Object::Pad or feature-class I can write the following and it technically works:

class Widget {
   field $title :param;

   field $colour;
   ADJUST ($params) {
      $colour = Colour->new( delete %{$params}{qw( red green blue )} );
   }
}

my $widget = Widget->new( title => "The text", red => 100, green => 30, blue => 0 );

This has some nice properties:

  • All the ADJUST blocks across every role/component class are run, independently. Each gets to inspect the params and delete from it the ones they consume
  • Any left-over params that no block deleted are known to be unhandled, so we can complain about them.

E.g. consider what happens to someone who does Widget->new( cyan => 123 ).
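
A plain-Perl sketch of that "each block consumes what it knows, then complain about the rest" flow follows. The `@adjust_blocks` list and the `construct` sub are hypothetical stand-ins for what the class machinery does; the colour is stored as a plain array to keep the example self-contained, and unlike a real :param field the title is not checked for being required.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for ADJUST blocks contributed by different classes/roles.
my @adjust_blocks = (
    sub { my ($self, $params) = @_;
          $self->{title} = delete $params->{title} },
    sub { my ($self, $params) = @_;
          $self->{colour} = [ delete @{$params}{qw( red green blue )} ] },
);

sub construct {
    my (%params) = @_;
    my $self = {};
    # Every block runs independently against the same shared params hash.
    $_->($self, \%params) for @adjust_blocks;
    # Whatever no block deleted is known to be unhandled.
    if (my @extra = sort keys %params) {
        die "Unrecognised parameters for constructor: @extra\n";
    }
    return $self;
}

my $ok  = construct( title => "The text", red => 100, green => 30, blue => 0 );
my $err = eval { construct( cyan => 123 ); 1 } ? "" : $@;
print $err;   # prints "Unrecognised parameters for constructor: cyan"
```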

It has some downsides to it though. Most notably, the words red, green and blue only appear in the behavioural code; in the body of the ADJUST block in that hash-slice delete expression. They don't otherwise appear anywhere that is inspectable as declarative metadata. For example, we can imagine what might eventually happen when IDEs and code editors start to understand this syntax.

class Widget {
   field $title :param :is(Text) :apidoc("The title text to display at the top");
....

my $widget = Widget->new( title => ... 

If the user hovers the mouse over title here perhaps they'll be reminded the type and the apidoc help string for this parameter.

Where in the class declaration can we attach similar information about the red, green and blue constructor params?

With reference to your suggested methods above (CONSTRUCT, NEW, ADJUST) would you be able to find any better answers on this? Specifically, where red/green/blue are handled, where any metadata about them might be declared (currently we have nowhere), and where the unrecognised cyan param would be complained about.

@Ovid
Collaborator

Ovid commented Sep 14, 2022

With reference to your suggested methods above (CONSTRUCT, NEW, ADJUST) would you be able to find any better answers on this? Specifically, where red/green/blue are handled, where any metadata about them might be declared (currently we have nowhere), and where the unrecognised cyan param would be complained about.

I'm still not seeing the problem. Currently, this can be done in Moose with MooseX::StrictConstructor and BUILDARGS. The lack of metadata about the documented arguments is either:

  • Just fine because that's what the class designer intended or ...
  • Terrible because they didn't think about it

So in the latter case:

class Widget {
   field $title :param;
   field ($red, $green, $blue ) :param;
   field $colour;

   ADJUST ($params) {
      $colour = Colour->new( delete %{$params}{qw( red green blue )} );
   }
}

my $widget = Widget->new( title => "The text", red => 100, green => 30, blue => 0 );

Note that the colors don't have readers or writers, as per your original code, so I think this satisfies that and it's more correct because the user must supply them. (You can always return undef from an initializer block if they're optional).

If my code sample doesn't address your issue, can you help me understand why? If this is solvable under Moose but not Corinna, can you show me a Moose example?

@HaraldJoerg
Contributor

The issue with your code example is that the "fields" $red, $green and $blue are just waste after the construction of the object, but can't be disposed of. This is not a showstopper, but I consider it a minor flaw.

As for the comparison with BUILDARGS: This will get more pronounced as soon as inheritance comes into play. In Moose, BUILDARGS is a method, thus can be chained to call the parent's BUILDARGS whenever it sees fit. Also, BUILDARGS can be used to define arbitrary constructor APIs, not just key/value pairs. I think we might need more procedural control in ADJUST blocks, too.

Here's an example which can be made to work with Moose, but not with Object::Pad:

use Object::Pad;

class Complex {
    field $real :param;
    field $imaginary :param;
}

class Real :isa(Complex) {
    ADJUST {
       # Here, I should be able to set $imaginary to 0.  But how?
       # $imaginary isn't available, and setting imaginary => 0
       # has no place to set it
    }
}

my $real = Real->new(real => 123);
# Required parameter 'imaginary' is missing for Complex constructor

@leonerd
Collaborator Author

leonerd commented Sep 14, 2022

The issue with your code example is that the "fields" $red, $green and $blue are just waste after the construction of the object, but can't be disposed of. This is not a showstopper, but I consider it a minor flaw.

It could be considered a major flaw. Right now in this particular example they're just numbers being the colour components; so it's a fairly harmless problem. But consider if those were large object references themselves. Those dead fields needlessly holding them are going to inflate the refcount, delaying their destruction. That at least wastes memory and depending on what other behaviours might be attached to destruction time of those objects, might actually be a bug, if some other code was relying on timely destruction of them.
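
The destruction-timing concern can be shown with a small plain-Perl sketch. `Resource`, `make_clean` and `make_holding` are invented names for the illustration; plain hashes stand in for object instances, and the `leftover` slot plays the role of a field that is only ever used during construction.

```perl
use strict;
use warnings;

my @log;

package Resource {
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { push @log, "destroyed $_[0]{name}" }
}

# Only uses the argument during construction.
sub make_clean   { my ($res) = @_; return { derived => "from $res->{name}" } }
# Also parks the argument in a slot nothing reads afterwards.
sub make_holding { my ($res) = @_; return { derived => "from $res->{name}", leftover => $res } }

my $clean;
{
    my $res = Resource->new("A");
    $clean = make_clean($res);
}   # last reference gone: "A" is destroyed here
push @log, "after clean scope";

my $holding;
{
    my $res = Resource->new("B");
    $holding = make_holding($res);
}   # "B" survives, because $holding->{leftover} still references it
push @log, "after holding scope";

undef $holding;   # only now is "B" destroyed
print "$_\n" for @log;
```

If anything relied on "B" being cleaned up when the caller's scope ended, the dead slot has silently changed that behaviour.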

We really mustn't store construction-time parameters as fields unless we intend to store them as fields. Using the fields just as temporary storage during construction time and then not touching them afterwards isn't really a good solution.

@leonerd
Collaborator Author

leonerd commented Sep 14, 2022

If my code sample doesn't address your issue, can you help me understand why?

@HaraldJoerg mostly explained that in the reply; I expanded on it above.

If this is solvable under Moose but not Corinna, can you show me a Moose example?

Well I don't know Moose anywhere near enough to suggest how to do it there, but consider this in classical perl:

package Widget;
sub new ($class, %params) {
   my $title = delete $params{title};
   my $colour = Colour->new( delete %params{qw( red green blue )} );

   die "Unrecognised construction parameters - " . join(", ", sort keys %params) if keys %params;

   my $self = bless {
      title => $title,
      colour => $colour,
   }, $class;
   return $self;
}

We haven't pointlessly stored the plain red, green or blue values, other than by forwarding them to the constructor of the Colour instance. We've still complained to our caller about any other unrecognised construction-time params.

@leonerd
Collaborator Author

leonerd commented Sep 14, 2022

@HaraldJoerg

Here's an example which can be made to work with Moose, but not with Object::Pad:

Ah, that example is showing off an equally-valid, yet quite different problem. That's more about how subclasses can alter the definitions of existing fields on their parent classes. It's quite common in Moose for example to alter an existing field; in your example case:

has '+imaginary' => default => sub { 0 };

The + symbol says not to add a new field here, but instead we're altering the definition of an existing one for members of this subclass.

For your usecase, it's as if we'd want to allow something like:

class Complex {
  field $re :param :reader :writer;
  field $im :param :reader :writer;
  ...
}

class Real :isa(Complex) {
   inherited field $im { 0 };  # give it a default value
}

though ideally you'd want to try to remove the ":param" part of it; making it a construction-time error to explicitly pass it.

However, that being the case, it does sort of suggest that at least this example, saying that a Real number is a subclass of a Complex one, is perhaps not the best example. What should be the correct behaviour of

my $z = Real->new( re => 4, im => 3 );

?

@HaraldJoerg
Contributor

@leonerd:

my $z = Real->new( re => 4, im => 3 );

That should be an error. I could live with a sloppy implementation where im is silently ignored, but not with one which would accept the value of 3.

I might be able to come up with a more complex, but more relevant example at some time.

I can not quote any relevant literature on that, but in my belief a class is responsible for the complete API of its constructor. Saying "ah, and in addition you can pass all parameters of the base class" is frequent practice and sort of acceptable if you don't want to duplicate documentation, but not correct. The fact that class Real is based on Complex is an implementation detail, a user of Real should not need to know about this.

So, some instance within class Real should be able to set up the construction parameters for its base class (and also, set up initialization parameters for every role it :does). If we have an explicit mechanism for that, then we can also solve the problem of conflicting :param names in base classes and roles. Unlike Moose, we do not have the problem that everything needs to live in one single hash after construction, roles and parent classes have their own spaces for fields!

@Ovid
Collaborator

Ovid commented Sep 14, 2022

The issue with your code example is that the "fields" $red, $green and $blue are just waste after the construction of the object, but can't be disposed of. This is not a showstopper, but I consider it a minor flaw.

It could be considered a major flaw. Right now in this particular example they're just numbers being the colour components; so it's a fairly harmless problem. But consider if those were large object references themselves. Those dead fields needlessly holding them are going to inflate the refcount, delaying their destruction. That at least wastes memory and depending on what other behaviours might be attached to destruction time of those objects, might actually be a bug, if some other code was relying on timely destruction of them.

Now I see your concern, but I don't see it as a "major flaw" in terms of Corinna. It's a major flaw in terms of the general expressiveness of most OO systems. With those systems that define a constructor as the name of the class, you could do this (pseudo-code):

class Widget {
   field $title :param;
   field ($red, $green, $blue ) :param;
   field $colour;

   method Widget :constructor ($params) {
      $colour = Colour->new( delete %{$params}{qw( red green blue )} );
   }
}

For those types of OO systems (e.g., Java), no field would have a value without being explicitly assigned in the Widget constructor. The problem goes away.

With current Corinna syntax, it can still be fixed (keeping in mind that ADJUST is called after the constructor):

class Widget {
   field $title :param;
   field ($red, $green, $blue ) :param;
   field $colour;

   ADJUST ($params) {
      $colour = Colour->new( delete %{$params}{qw( red green blue )} );
      undef $_ foreach $red, $green, $blue;
   }
}

In the above, we undefine those values, but the metadata is still available. That solves the issue, yes?

I think (I could be wrong) that this is an edge case that's less likely to be a serious issue, but it's easy to solve if it arrives. I acknowledge that you might write code that often hits this issue. I'm unsure of the last time that I have. However, I do admit that I've written plenty of code where arguments are required for construction, but not post-construction.

@Ovid
Collaborator

Ovid commented Sep 14, 2022

However, that being the case it does sort of suggest at least this example, saying that a Real number is a subclass of a Complex one, is perhaps not the best example. What should be the correct behaviour of

my $z = Real->new( re => 4, im => 3 );

Well, that's annoying  😃 I'm not sure of the best approach here.

In classical Perl terms, the constructor for Real might look like this:

sub new ( $class, %args ) {
    my @keys = keys %args;
    if ( 1 == @keys && 're' eq $keys[0] ) {
        return bless $class->SUPER::new(%args, im => 0), $class;
    }
    croak("...");
}

In Moose, I think this would work:

has '+im' => (
    default  => 0,
    init_arg => undef,
);

You'd probably want MooseX::StrictConstructor with that.

So for Corinna, possibly something like this?

class Complex {
  field $re :param :reader :writer;
  field $im :param :reader :writer;
  ...
}

class Real :isa(Complex) {
   field $im :overrides { 0 };  # give it a default value
}

If we wanted $im to remove the :writer, we don't have a syntax for that. But Liskov tells us that any place we have an instance of a class, we should be able to substitute an instance of the subclass and have it work. So we'd have to inherit all of the parent field behavior except if we have :overrides, it removes the :param for its definition (leaving all other attributes intact, unless otherwise specified) and if we wanted that as a :param, we'd have to add that attribute back in explicitly.

Or we could say that :overrides removes all parent attributes from the field and if we want to respect Liskov, we have to manually add them back: field $im :overrides :reader :writer {0}; That's probably a cleaner solution, even if it means that Liskov will often be unhappy if we copy the attributes incorrectly.

@HaraldJoerg
Contributor

If we wanted $im to remove the :writer, we don't have a syntax for that.

I could argue that the pure existence of the writer makes the class unsuitable for subclassing with a class where the value is supposed to stay fixed. As the author of a subclass I have to check the complete contract of the parent class. If I want to subclass anyway, I have an explicit way to express that. Depending on the actual situation, either of these implementations might be appropriate:

method set_im :override ($whatever) {
    die "Reality alert: You promised this would not happen for this Real object.";
}
method set_im :override ($whatever) {
    $whatever and warn "Enhanced reality mode active for this Real object."
}

I think it is totally acceptable if cases like this can not be solved in a declarative way.

But also: We digress :) The auto-generation of readers and writers is a really nice convenience feature, but unrelated to object construction.

@aquanight

aquanight commented Sep 15, 2022

The Real vs Complex issue looks like it would be one solved by leveraging CONSTRUCT rather than ADJUST. (Fortunately, CONSTRUCT is still relevant to constructor params!) CONSTRUCT looks to me to be the phaser/method that occurs at exactly the right time to give the idea that the top-level class controls the shape of new. To that end I would suggest that only the top-level class's CONSTRUCT is called by default: no other class or role runs CONSTRUCT phases unless the toplevel class explicitly calls them. (This is, incidentally, currently how Object::Pad's BUILDARGS is working.)

Consider, for example, this implementation of Complex and Real:

class Complex {
    field $re :param :reader;
    field $im :param :reader;
    CONSTRUCT { # Not called by Real::CONSTRUCT
        if (@_ == 1) { die unless shift() =~ /^(.*)[+-](.*)i/i; return re => $1, im => $2; }
    }
}
class Real :isa(Complex) {
    # Remove im as a constructor parameter.
    CONSTRUCT {
        if (@_ == 1) { return re => shift(), im => 0; }
        my %args = @_;
        exists $args{im} and die;
        $args{im} = 0;
        return %args;
    }
    method im :override () { 0; }
}

To make this work would require changing the specification to say that only the top-most class's CONSTRUCT phase is called. (Note that Complex's CONSTRUCT dies. This means Real's attempt to rewrite could only work if Real is the first (and possibly only) class to receive CONSTRUCT.) That phase is responsible for making the named argument list look like whatever it needs to look like for the benefit of itself and also its superclass(es) and role(s). To that end, it should have the option of being able to explicitly call superclass/role CONSTRUCT phases directly, and dispensing with the returned argument list as it sees fit (even as far as saying it's on the caller to deal with an odd-sized list). E.g. %args = (%args, $self->ParentClass::CONSTRUCT(...));

(The idea that a phase called CONSTRUCT exists to reshape the new argument list and ADJUST for second-stage field initializing is starting to seem a little weird to me the more I keep typing this. If we go this route, perhaps we should consider reversing those names?)

Speaking of, I mentioned previously the idea that we could allow "constructor-only fields" (fields that may or may not be :param, but are only used in field initializers and ADJUST blocks) to be released from an object when the constructor finishes. This would alleviate the "held reference" problem @leonerd mentioned previously, and it would have an interesting secondary effect: in the above Real class, by overriding $im's :reader with a basic method that doesn't reference $im at all, $im could theoretically be dropped from Real objects (unless of course something like $r->Complex::im() is a thing some jerk could do), making a Real take up less space than Complex.

@thoughtstream

I'm planning a much (much much) longer response to @leonerd's original question,
with what I hope may be a clean and acceptable solution to the various issues raised
in this discussion.

But, for now, just a quick word about Square inheriting Rectangle or Real inheriting Complex.

Unless the classes in question produce immutable objects, these are the classic examples
of invalid OO design. They shouldn't work, and we shouldn't help them to work.

Note, however, that the solution to @leonerd's original question that I will be proposing
(later today, I hope!) will also cleanly handle this inheritance problem correctly
...at least for immutable objects.

@thoughtstream

Unsurprisingly, I have observations and comments on this issue.
I apologize in advance for their extent. ;-)

Some preliminary thoughts

  • ADJUST (and whatever mechanism is provided to pre-tweak constructor args)
    both really need to be phasers, not methods. If they’re methods, the methods
    will appear to be magical (being auto-redispatched up/down the hierarchy).
    Phasers are called automatically, in specific sequences, so there’s nothing
    magic about ADJUST doing that too.

  • Furthermore, if they’re methods, they will pollute the class’s API
    and could also be called explicitly after construction,
    which is almost certainly a Bad Idea.

  • The current design allows us to fake an argument-list pre-tweaking behaviour
    via something like:

         method new_other :common (@args) {
             return $class->new( tweak_those(@args) );
         }
    
  • This is not a great solution, however, as nothing marks these methods
    as being constructor-ish. Except perhaps the naming convention...which is
    inherently fragile, and not helpful to automated code analysis tools.

  • Object::Pad currently offers the BUILDARGS() method, which “lurks inside”
    any call to new() (a good thing!) and is automatically pre-applied to
    the constructor args (another good thing!), but which also requires
    explicit redispatch to ensure that any inherited BUILDARGS() also get
    called. This is inherently fragile and pollutes the API (as above).
    Like every other aspect of automated construction, it would be cleaner
    and safer as a phaser instead.

  • More generally, though, allowing more than one class to pre-tweak
    the arguments of new() introduces complications and opportunities
    for enbugging the class hierarchy.

  • It also increases the coupling between classes in a hierarchy,
    which means that those enbuggings can be extremely hard to detect
    or correct, and can affect distant and seemingly unrelated code.

  • That’s why we originally designed Corinna’s constructors to accept
    only a properly formatted key/value list, from which every class
    up the hierarchy then extracts its own initializer values, by name.

  • Given all those considerations, I believe that any argument
    pre-tweaking mechanism needs to be integrated directly “behind”
    new() (for affordance), but must be a phaser (for robustness).

  • Moreover, only the most derived pre-tweaking mechanism should
    ever be invoked, and it should be incapable of redispatch
    (another advantage of making it a phaser).

  • That mechanism should also be required to return a properly formatted
    key/value list, which the hierarchical initialization mechanism
    can then use (or complain about, if it’s still wrong).

  • The other requirement in this issue is a way to declaratively
    specify all expected/required/optional named constructor args
    (hereafter “NCAs”), and automatically track whether they have
    been “consumed” by the various automatic initialization mechanisms
    (i.e. by field initializer blocks or by ADJUST blocks).

  • This is necessary so that the constructor can complain about
    required NCAs that are missing and/or about unexpected extra NCAs.

  • With regard to tracking unexpected/unused constructor arguments,
    I strongly agree that we need to find a way to make this feature
    fully automatic and declarative. If argument tracking requires the
    user code to manually delete entries from some parameter hash in
    an ADJUST and/or a field initializer block...that is inherently
    fragile and will lead to endless misery amongst everyday developers.
    We definitely need a better answer than that.

Design requirements

Given all of the above, here’s my list of design requirements
for object initialization:

  1. All initialization syntax must be as declarative as possible.

  2. As much behaviour as possible must also be automatic
    (i.e. does not have to be explicitly coded by the user).

  3. Named constructor arguments (“NCAs”) must be automatically
    marked as having been “consumed” by the initializer in all cases.
    It must not be possible to accidentally manually mark or unmark
    a “consumed” NCA.

  4. It must be also possible for an ADJUST or a field initializer block
    to access an NCA without it being automatically marked as “consumed”
    (so that two or more initializer blocks can both access that NCA,
    with only one of them actually “consuming” it).

  5. Which NCA(s) a given field “consumes” must be inferable by humans,
    compilers, IDEs, and other tools...purely from static syntax.

  6. Initializer blocks should be a kind of implicit ADJUST phaser.
    That is, they should both use exactly the same underlying mechanism,
    merely providing different syntactic affordances to it.
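Requirements 3 and 4 can be sketched in plain Perl today. This is only an illustration of the bookkeeping, not how Object::Pad or Corinna does it: the NCATracker package and its consume, peek, and unconsumed methods are hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Object::Pad or Corinna): wraps the
# constructor's key/value list and records which keys have been consumed.
package NCATracker {
    sub new {
        my ($class, %args) = @_;
        return bless { args => {%args}, consumed => {} }, $class;
    }

    # Fetch a value AND mark it as consumed (requirement 3).
    sub consume {
        my ($self, $name) = @_;
        die "Missing '$name' argument in call to new()\n"
            unless exists $self->{args}{$name};
        $self->{consumed}{$name} = 1;
        return $self->{args}{$name};
    }

    # Fetch a value WITHOUT marking it as consumed (requirement 4).
    sub peek {
        my ($self, $name) = @_;
        return $self->{args}{$name};
    }

    # Names that nothing consumed; a constructor would complain about these.
    sub unconsumed {
        my ($self) = @_;
        my @names = grep { !$self->{consumed}{$_} } sort keys %{ $self->{args} };
        return @names;
    }
}

my $nca   = NCATracker->new( title => 'The text', size => 10, colour => 'red' );
my $title = $nca->consume('title');    # consumed
my $size  = $nca->peek('size');        # read, but not consumed
my @extra = $nca->unconsumed;          # 'colour' and 'size'
print "Unexpected arguments: @extra\n";
```

In the actual proposal the calls to consume would never be written by hand; they would be implied by the parameter names, which is the point of requirement 5.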

Design

My thinking for this design was as follows: If all initialization options
need to be declarative and automatic, then they all need to be hooked
onto the declarations of automated components. Specifically: hooked
onto the declarations of fields and ADJUST phasers.

So far, in Corinna, we’ve done that by the clever trick of using
the name of a field variable to determine the name of its
associated named constructor argument, as well as the
names of any autogenerated accessors for that field.

So why not use the same trick in reverse? Why not associate specific
named constructor arguments with identically named variables that have
been declared as part of initialization blocks and phasers?

In other words: if field initializer blocks and ADJUST phasers
had parameters, like subs and methods do, then the names of
those parameter variables could tell the initialization process
which NCAs to pass into each initialization block.

Better still, those parameter names would also declare
(statically, in the source code) that the corresponding
named constructor arguments should be marked as having
been “consumed” by that initialization block.

The only problem is that, in current Perl syntax,
raw blocks and phaser blocks can’t have parameters. :-(

So we simply change that reality...

A. Pre-tweaking constructor arguments

  • The BUILDARGS method becomes a phaser...and maybe gets a better,
    less-Moosey name. @aquanight referred to something similar
    as CONSTRUCT, but I think that name’s misleading. This phaser
    doesn’t construct the object; it simply provides the
    proper key/value list of named arguments that the underlying
    constructor was expecting to get from new().

  • So, in this proposal I’ll name it: NEWARGS.
    (Though, to be honest, I’m not strongly attached to,
    or even particularly happy with, that name.)

  • Multiple classes in a hierarchy can each have a NEWARGS phaser,
    but only the most-derived available NEWARGS phaser is ever called.
    And because they’re phasers, there’s no way to redispatch to an
    ancestral NEWARGS. (This is a feature, not a bug!)

  • In practice, this means that any call to any NEWARGS must
    return a key/value list that is suitable for initializing
    the fields of its own class, and those of all ancestral classes
    as well.

  • This makes each class entirely responsible for adapting
    its chosen new() API(s) to the standard key/value NCA format
    automatically accepted by all of its ancestors. This maximizes
    class independence and minimizes undesirable coupling between
    classes within a hierarchy.

  • Unlike other Perl phasers (so far!), the NEWARGS phaser
    must be defined with a signature/parameter list,
    which is placed between the keyword and the block.
    This parameter list uses the same syntax and provides
    the same features as modern Perl subroutine signatures.
    For example:
    NEWARGS($x, $y, $z, @etc){ tweak_those_args_here() }

  • The automatically provided new() constructor can now accept
    any list of arguments, but it immediately calls the
    most-derived NEWARGS (if any), binding new()’s
    argument list to the parameters of that NEWARGS.

  • Or, to look at it another way, the signature of a class's NEWARGS
    constrains the signature of the class's new() method.

  • After the NEWARGS block executes, the key/value list
    it produces is then passed on to the initialization process.
    If there was no available NEWARGS phaser to invoke,
    the original constructor argument list is passed along
    unchanged.

  • In order to produce its key/value list, NEWARGS must arrange
    for that list to be the final evaluated expression in its block.
    Because, despite having a parameter list, a NEWARGS phaser
    is still a block, not a subroutine, and you cannot use
    a return in it. In fact, a return in a NEWARGS block
    should be flagged as a fatal compile-time error.

  • Runtime errors within a NEWARGS – for example: missing arguments,
    or a failure to extract a suitable key/value list – can be signaled
    by throwing an exception within the phaser block.

  • Note that, because they cannot use return statements,
    NEWARGS phasers are inevitably going to be structured
    as tedious if...elsif...else cascades, as can be seen
    in the first few examples below.

  • This infelicity could be avoided if classes were allowed to define
    multiple NEWARGS phasers, with only one of them
    ever being called for any given construction. Specifically,
    the initialization process would execute only the first
    NEWARGS phaser whose signature successfully binds
    to the actual argument list that was passed to the
    original call to new.

  • See Speculative examples (if multiple NEWARGS were supported)
    below for why this would be a markedly superior approach.
    (And, yes, this would require a kind of limited multiple dispatch,
    so I don't actually expect we'll get it, despite its manifest
    superiority. Breathe, @leonerd, breathe! ;-)
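The "only the most-derived NEWARGS is called" rule can be emulated in current Perl, as a rough sketch. Everything here is an assumption made for illustration: the %NEWARGS registry and the tweak_args function are hypothetical stand-ins for the phaser, and because the stand-ins are ordinary subs they use return, which the real phaser block would not allow.

```perl
use strict;
use warnings;
use mro;

# Hypothetical registry standing in for NEWARGS phasers: class name => coderef.
my %NEWARGS;

# Apply the most-derived registered tweak for $class, exactly once.
sub tweak_args {
    my ($class, @args) = @_;
    for my $ancestor (@{ mro::get_linear_isa($class) }) {
        my $tweak = $NEWARGS{$ancestor} or next;
        my @kv = $tweak->(@args);
        die "NEWARGS for $ancestor returned an odd-sized list\n" if @kv % 2;
        return @kv;    # stop here: ancestral tweaks are never reached
    }
    return @args;      # no NEWARGS anywhere: pass the list along unchanged
}

@Real::ISA = ('Complex');

$NEWARGS{Complex} = sub {
    my ($n, @etc) = @_;
    return ($n, @etc) if @etc;                                 # re=>1, im=>2
    return (re => $1, im => $2) if $n =~ /^(.*)[+-](.*)i$/i;   # '1+2i'
    die "Unexpected arg: $n\n";
};

$NEWARGS{Real} = sub {
    my (@args) = @_;
    return (re => $args[0], im => 0) if @args == 1;
    my %args = @args;
    die "Can't specify imaginary component for real number\n"
        if exists $args{im};
    return (%args, im => 0);
};

my %real    = tweak_args('Real', 42);          # Real's tweak only
my %complex = tweak_args('Complex', '1+2i');   # Complex's tweak only
```

Note that Real's tweak has to produce a list that also satisfies Complex's fields, since Complex's own tweak is never consulted when a Real is built.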

B. Field initialization via ADJUST phasers

  • The ADJUST phaser now allows for an optional signature/parameter list,
    specified between the keyword and the block, with the same
    syntax and features as Perl subroutine signatures. For example:
    ADJUST($size){ $max_size = $size < 0 ? $DEFAULT_SIZE : $size }

  • When a parameterized ADJUST is invoked, each of its named scalar
    parameter variables is bound to the corresponding named constructor
    argument, and that NCA is then marked as having been “consumed”.
    So, the above example would mark the 'size' NCA as “consumed”.

  • If there is no suitable NCA for a parameter, it is a fatal
    error: Missing 'size' argument in call to new()...
    unless that parameter was also specified with a default value,
    which is then used instead. For example:
    ADJUST ($size = 10) { $max_size = $size < 0 ? $DEFAULT_SIZE : $size }

  • If the parameter list for the ADJUST ends in a slurpy array
    or hash, all other named constructor arguments (i.e. that were not
    bound to any preceding scalar parameter) are copied into that final
    slurp parameter. These extra arguments are not marked as
    having been “consumed”. This allows an ADJUST to access named
    constructor arguments that “belong” to (i.e. are formally
    “consumed” by) other initializers.

  • Having received the appropriate named constructor arguments
    via its parameter list, the ADJUST block can then use
    those parameter values to initialize any field variable currently
    in scope (or, indeed, to perform any other appropriate action
    or side-effect).

  • Ideally, a given class could have as many distinct ADJUST phasers
    as the user wished, with each of them being called in turn and
    being passed the appropriate subset of constructor arguments,
    as determined by their individual signatures.

  • As mentioned earlier, the signature/parameter list on an
    ADJUST phaser is entirely optional. ADJUST phasers specified
    without a signature receive no constructor arguments and
    do not mark any as “consumed”.

  • Such unparameterized ADJUST phasers are still useful.
    For example, to verify post-initialization object integrity.
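The binding rules in this section (bind by name, fall back to a default, die if neither is available, copy leftovers into a slurpy without consuming them) can be sketched as an ordinary function. The bind_adjust name and its calling convention are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical emulation of a parameterized ADJUST. $spec is a list of
# [ name, has_default, default ] triples; $slurpy requests the leftovers.
sub bind_adjust {
    my ($args, $consumed, $spec, $slurpy) = @_;
    my @values;
    for my $param (@$spec) {
        my ($name, $has_default, $default) = @$param;
        if (exists $args->{$name}) {
            $consumed->{$name} = 1;          # bound by name, so consumed
            push @values, $args->{$name};
        }
        elsif ($has_default) {
            push @values, $default;          # nothing consumed
        }
        else {
            die "Missing '$name' argument in call to new()\n";
        }
    }
    if ($slurpy) {
        my %named = map { ($_->[0] => 1) } @$spec;
        # Leftovers are copied into the slurpy but NOT marked as consumed.
        push @values, map { ($_ => $args->{$_}) }
                      grep { !$named{$_} } sort keys %$args;
    }
    return @values;
}

my %args = (size => -1, title => 'The text');
my %consumed;

# Roughly equivalent to: ADJUST ($size = 10, %rest) { ... }
my ($size, %rest) = bind_adjust(\%args, \%consumed, [ ['size', 1, 10] ], 1);
```

After this call only 'size' is marked as consumed; 'title' is visible in %rest but still belongs to whichever initializer formally consumes it.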

C. Field initialization via field initializer blocks

  • The optional initializer block for a field also needs a way to
    declare a signature/parameter list, so as to provide it with
    access to the appropriate named constructor argument(s).

  • Unfortunately, whilst there is a feasible syntactic slot for adding
    signatures to the phaser syntax (i.e. between its keyword and its block),
    there is no feasible syntactic slot for adding a signature to
    a bare Perl block.

  • Rather than invent a special parameter syntax just for field initializer blocks,
    I propose that we add an extra attribute to field declarations,
    within which the (optional) signature of the field’s initializer block
    may be declared.

  • Specifically, we add an :adjust(...) attribute, in whose parens
    the signature for the field initializer block is placed.
    For example:
    field $title :adjust($label) { ucfirst lc $label }

  • This would be exactly equivalent to:
    field $title;  ADJUST ($label) { $title = ucfirst lc $label }

  • Just as with a parameterized ADJUST, if a field is declared with
    an :adjust(...) attribute, when the field’s initializer block is
    executed the appropriate named constructor argument would be marked
    as having been “consumed”. Hence, in the above example, the presence
    of :adjust($label) means that the value of the 'label' constructor
    argument would be passed into the field’s initializer block
    and the 'label' constructor argument would also be marked as
    having been “consumed”.

  • Apart from the DRY-ness of not having to repeat the field variable
    within the block, the advantage of using an :adjust($name) attribute,
    rather than an ADJUST phaser, is that the attribute directly
    associates the 'label' constructor argument with the $title field,
    allowing the object initialization process (not to mention humans,
    IDEs, and other software entities) to easily detect this association.

  • Just like an ADJUST block, an :adjust(...) can specify two or more
    scalar parameters, with optional default values, as well as a final
    slurpy array or hash. All those parameters are then available in the
    field's initializer block, and all the corresponding named constructor
    arguments are marked as having been used (except, of course, for any
    that end up in the trailing slurpy).

  • If a field specifies both a :param and an :adjust(...), then the
    direct assignment of the identically named constructor argument
    (as implicitly specified by the :param) is attempted first,
    and the :adjust(...) and initializer block are ignored,
    unless there is no suitable named constructor argument for
    the :param. Only those named constructor arguments that are
    actually used (i.e. either by the :param or else by the
    :adjust(...)) are marked as “consumed”.

  • If a field specifies an initializer block but no :adjust(...)
    attribute, the initializer block is passed no constructor args,
    and does not mark any as “consumed”.
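The precedence rule for a field with both :param and :adjust(...) can be sketched in plain Perl. The init_colour and build functions are hypothetical, and a joined string stands in for a Colour object so that the sketch has no dependencies.

```perl
use strict;
use warnings;

# Hypothetical emulation of:
#     field $colour :param :adjust($r, $g, $b) { Colour->new($r, $g, $b) }
sub init_colour {
    my ($args, $consumed) = @_;

    # The :param is tried first; if it succeeds, the initializer is skipped.
    if (exists $args->{colour}) {
        $consumed->{colour} = 1;
        return $args->{colour};
    }

    # Otherwise fall back to the :adjust(...) parameters.
    my @missing = grep { !exists $args->{$_} } qw(r g b);
    die "Missing arguments 'colour' or 'r', 'g', and 'b' in call to new()\n"
        if @missing;
    $consumed->{$_} = 1 for qw(r g b);
    return join ',', @{$args}{qw(r g b)};    # stand-in for Colour->new(...)
}

# Stand-in for the constructor: initialize, then reject unconsumed NCAs.
sub build {
    my (%args) = @_;
    my %consumed;
    my $colour = init_colour(\%args, \%consumed);
    my @extra  = grep { !$consumed{$_} } sort keys %args;
    die "Unexpected arguments (@extra) in call to new()\n" if @extra;
    return $colour;
}

my $from_parts  = build(r => 100, g => 30, b => 0);   # uses :adjust path
my $from_colour = build(colour => 'red');             # uses :param path
```

Passing both 'colour' and 'r'/'g'/'b' makes build die, because the :param path pre-empts the initializer and leaves the three component arguments unconsumed.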

Examples

Phew! That’s an awful lot to assimilate.

Let’s see if a few examples can make these ideas clearer...

Example 1: Coloured widgets

    class Widget {

        NEWARGS (@args) {
            # Allow Widget->new( $title, $rgb )...
            if (@args == 2) {
                (title=>$args[0], red=>$args[1], green=>$args[1], blue=>$args[1]);
            }

            # Allow Widget->new( $title, $r, $g, $b )...
            elsif (@args == 4) {
                (title=>$args[0], red=>$args[1], green=>$args[2], blue=>$args[3]);
            }

            # Otherwise expect Widget->new( %named_args )...
            else {
                @args;
            }
        }

        # Automatic initialization from 'title' constructor arg...
        field $title :param;

        # This field is NOT automatically initialized....
        field $colour;

        # Pull out the 'red', 'green', and 'blue' named constructor args
        # and initialize the $colour field with a Colour object...
        ADJUST ($red, $green, $blue) {
            $colour = Colour->new($red, $green, $blue);
        }

        # Pull out the 'label' named constructor arg to initialize this field,
        # or else use the value of the earlier $title field, if no 'label' arg provided...
        field $ID :adjust($label = $title) { $label =~ s/\W/_/gr }
    }

    # Exercise the class...
    $widget1 = Widget->new( title => "The text", red => 100, green => 30, blue => 0 );
    $widget2 = Widget->new( "The text", 100, 30, 0 );
    $widget3 = Widget->new( "The text", 0 );
    $widget4 = Widget->new( "The text" );       # ...DIES

Example 2: Coloured widgets (with cleaner and smarter named colour options)

    class Widget {
        field $title  :param;

        # The :param will first try to initialize $colour via a 'colour' named constructor arg.
        # Otherwise, the :adjust(...) will try to initialize via 'r'/'g'/'b' NCAs...
        field $colour :param :adjust($r, $g, $b) { Colour->new($r, $g, $b) };
    }

    # Specify individual colour components directly...
    my $widget1 = Widget->new( title => "The text", r => 100, g => 30, b => 0 );

    # Specify colour as a pre-built Colour object...
    my $widget2 = Widget->new( title => "The text", colour => Colour->new(100, 30, 0) );

    # Error because no NCA to initialize $colour field...
    # ("Missing arguments 'colour' or 'r', 'g', and 'b' in call to new()...")
    my $widget3 = Widget->new( title  => "The text" );

    # Error because 'colour' NCA will pre-empt 'r'/'g'/'b', leaving them "unconsumed"
    # ("Unexpected arguments 'r', 'g', and 'b' in call to new()...")
    my $widget4 = Widget->new( title  => "The text",
                               colour => Colour->new(100, 30, 0),
                               r=>0, g=>0, b=>0);

Example 3: “Ceci n’est pas un Complexe”

    class Complex {
        # Immutable object state (or this would be a crime against Liskov!)...
        field $re :param :reader;
        field $im :param :reader;

        NEWARGS ($N, @etc) {
              @etc > 0                            ?  ($N, @etc)                  # ->new( re=>1, im=>2 )
            : $N =~ /^(?<re>.*)[+-](?<im>.*)i$/i  ?  (%+)                        # ->new( '1+2i' )
            :                                        die "Unexpected arg: $N";   # ->new( 'snorkel' )

        }
    }

    class Real :isa(Complex) {
        NEWARGS (@args) {
            if (@args == 1) {
                (re => $args[0], im => 0);
            }
            else {
                my %args = @args;
                die "Can't specify imaginary component for real number"
                    if exists $args{im};
                (%args, im=>0);
            }
        }
    }

Speculative examples (if multiple NEWARGS could be supported)

    class Widget {

        NEWARGS ($title, $rgb,     ) { title=>$title, red=>$rgb, green=>$rgb, blue=>$rgb }
        NEWARGS ($title, $r, $g, $b) { title=>$title, red=>$r,   green=>$g,   blue=>$b   }
        NEWARGS (%named_args)        { %named_args }

        field $title :param;

        field $colour;
        ADJUST ($red, $green, $blue) { $colour = Colour->new($red, $green, $blue) }

        field $ID :adjust($label = $title) { $label =~ s/\W/_/gr }
    }


    class Complex {
        field $re :param :reader;
        field $im :param :reader;

        NEWARGS ($num)  { $num =~ /^(?<re>.*)[+-](?<im>.*)i$/i  ? %+  : die }
        NEWARGS (%args) { %args }
    }

    class Real :isa(Complex) {
        NEWARGS ($num)  { re => $num, im => 0 }
        NEWARGS (%args) { die if exists $args{im}; (%args, im=>0); }
    }

Conclusion

Well, that’s what I’d like to see, anyway.

It’s certainly not perfect, but I think it cleanly balances power and safety,
and I believe that it covers most of @leonerd’s issues and desiderata,
without mangling Perl syntax too badly.

I am, of course, as always, more than happy to answer any questions,
clarify any mysteries, fill in any omissions, debate any controversies,
or brainstorm any possible improvements to this proposal.

Thanks for coming to my TED talk. :-)

@Ovid
Collaborator

Ovid commented Sep 15, 2022

The current design allows us to fake an argument-list pre-tweaking behaviour via something like:

 method new_other :common (@args) {
     return $class->new( tweak_those(@args) );
 }

This is not a great solution, however, as nothing marks these methods as being constructor-ish. Except perhaps the naming convention...which is inherently fragile, and not helpful to automated code analysis tools.

Agreed, but this could be done:

 method create :new (@args) {
     return $class->new( tweak_those(@args) );
 }

That would automatically be a class method and would be expected to return an instance of something. This could be useful for factory classes where it's clear that what is returned is not an instance of $class.

That being said, I'm not a fan of a proliferation of different ways of naming constructors. The multi-method approach you describe seems warranted.

I'm also not a fan of the name NEWARGS, though I can't think of anything better.

@thoughtstream

That being said, I'm not a fan of a proliferation of different ways of naming constructors.

Yes, I think there's much to be said for avoiding the potential Babel of:

    $obj = MyClass->new(%args);

    $obj = YourClass->new(red=>$r, green=>$g, blue=>$b);
    $obj = YourClass->new_from_rgb($r, $g, $b);
    $obj = YourClass->new_from_colour($colourobj);

    $obj = TheirClass->obj(@args);
    $obj = TheirClass->make(@args);
    $obj = TheirClass->reify(@args);
    $obj = TheirClass->create(@args);
    $obj = TheirClass->realize(@args);
    $obj = TheirClass->manifest(@args);
    $obj = TheirClass->actualize(@args);
    $obj = TheirClass->instantiate(@args);
    $obj = TheirClass->corporealize(@args);

Or, to put it another way, a :new attribute would certainly help the compiler, the IDE, and other tools
to understand the method's intended purpose, but might actually make things worse for humans.

I'm also not a fan of the name NEWARGS, though I can't think of anything better.

BUILDARGS is better, or perhaps even INITARGS, because this phaser
builds the required arguments for the initialization process. I just worry that
BUILDARGS is already saddled with too many Moose-y associations/preconceptions.
And that INITARGS is ungainly and a little obscure.

Arguably, this phaser implements a user-redefinable interface for the signature of new(),
so it could possibly be named: NEWSIG

But none of these really “sparks” for me, so I left it at NEWARGS,
which is sufficiently infelicitous as to encourage us all to strive
for something better. ;-)

@leonerd
Collaborator Author

leonerd commented Sep 16, 2022

Wow, much good thought here I see. I'll respond to a few quick bits to get them out of my head but I will write up a longer response later too.

All of @thoughtstream's "Preliminary Thoughts" look good. Nothing I disagree with there.

Of the design itself: It's far from a "no", but I am slightly cautious about trying to add too much meaning to the names of signature parameters. Not so much because it's a bad idea here, but simply that if we do it here we should consider wider Perl language overall and what it might mean to do similar things elsewhere. It may well be something we decide we want to do elsewhere too, but if we do we should aim to make something consistent across the whole language.

I could easily live with multi NEWARGS(...) support in S:K:MultiSub. I think it would be quite neat and powerful as an extension of the other multi-sub thoughts actually.

@thoughtstream

@leonerd wrote:

I am slightly cautious about trying to add too much meaning
to the names of signature parameters.

Caution is always a wise starting position. :-)

Not so much because it's a bad idea here, but simply that if we do it here we
should consider wider Perl language overall and what it might mean to do similar
things elsewhere. It may well be something we decide we want to do elsewhere too,
but if we do we should aim to make something consistent across the whole language.

The consistent thing (that we’re having to work around here, because Perl doesn’t have it)
would be universal support for parameters being bound by name rather than by position.

In Raku we can say:

sub foo (:$r, :$g, :$b) {...}    # Named parameters...

foo(b=>1, r=>2, g=>3);           # ...are bound to named arguments

...and each value ends up bound to the correct parameter, despite
the args not being passed in the same order as their appropriate params.

That’s effectively what I’m suggesting for ADJUST (...), and :adjust(...),
except that I’m not proposing a special parameter syntax to specify “by name”
rather than “by position” binding. Instead, I'm suggesting that these constructs
just have that behaviour as a special (magic) case.
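For comparison, the usual way to get a similar effect in Perl as it stands is to accept a flat key/value list and pull values out by name. A minimal sketch (the function name and variables are illustrative only):

```perl
use strict;
use warnings;

# Named arguments arrive as a flat key/value list; the keys select the values,
# so the order in which the caller passes them does not matter.
sub foo {
    my (%arg) = @_;
    my ($red, $green, $blue) = @arg{qw(r g b)};
    return "r=$red g=$green b=$blue";
}

print foo(b => 1, r => 2, g => 3), "\n";    # prints "r=2 g=3 b=1"
```

The difference is that here the name-to-variable mapping is written out by hand inside the body, whereas the proposed ADJUST (...) and :adjust(...) forms would derive it from the parameter names themselves.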

Of course, it would be vastly better if this didn’t need to be a special case.
But that implies a rather significant addition to the Perl signature syntax and
semantics. Which I was extremely reluctant to suggest, given how hard
it’s been so far to get general buy-in and approval for adding any subset
of Corinna to Perl.

If Perl already had named parameter binding, then the solutions
I would have proposed might have been something like:

        ADJUST (red=>$r, green=>$g, blue=>$b) {   # Hypothetical named param syntax
            $colour = Colour->new($r, $g, $b);
        }

        field $ID :adjust(label => $l = $title) { $l =~ s/\W/_/gr }

In which case those “match the named constructor arg to the identically named adjuster parameter”
behaviours would no longer be special magic, but just the regular everyday semantics
of “bind by name” parameters.

I could easily live with multi NEWARGS(...) support in S:K:MultiSub.
I think it would be quite neat and powerful as an extension of the
other multi-sub thoughts actually.

That, of course, is the other thing I was trying to work around in my proposal:
the complete lack of multiple dispatch in standard Perl.

If Perl already supported multimethods, we wouldn’t need NEWARGS at all.
We could simply specify that the automatically supplied default new() method
is always implemented as a multi.

Then, if class designers wanted to support alternative constructor signatures,
they would simply add extra variants of new(), each of which would simply
redispatch to the default new:

    class Widget {

        multi method new ($title, $rgb) {
            $class->new( title=>$title, red=>$rgb, green=>$rgb, blue=>$rgb);
        }

        multi method new ($title, $r, $g, $b) {
            $class->new( title=>$title, red=>$r,   green=>$g,   blue=>$b );
        }

        ...
    }


    class Complex {
        field $re :param :reader;
        field $im :param :reader;

        multi method new ($num) {
            die if $num !~ /^(?<re>.*)[+-](?<im>.*)i$/i;
            $class->new(%+);
        }
    }

    class Real :isa(Complex) {
        multi method new ($num)  { $class->new(re => $num, im => 0) }
    }

(Though, to be honest, the NEWARGS phaser mechanism I proposed
is still far more declarative and automatic than the above alternative.)

Alas, Perl has neither “bound-by-name” parameters, nor multimethods.
So I had to colour within the lines of existing Perl constructs and syntax.
Which led directly to those pesky special cases and behaviours in the proposal
I offered. That’s kinda what I meant when I said: “It’s not perfect...”

But given the improbability of convincing the Powers That Be to add either
named parameter binding or multiple dispatch to Perl, I feared that in this case
Perfect might well be the enemy of Good (and, worse still, the fatal nemesis
of Approved).

@Ovid
Collaborator

Ovid commented Sep 16, 2022

Alas, Perl has neither “bound-by-name” parameters, nor multimethods. So I had to colour within the lines of existing Perl constructs and syntax.

At the risk of being controversial by not being cautious, @leonerd has written Syntax::Keyword::MultiSub which gives us multiple dispatch. There are a couple of bugs which need to be resolved, but those don't look insurmountable.

I would, however, prefer sticking to the KIM syntax because multi is an adjective, not a noun.

method new :multi ($num) { ... }

Evolving Perl to stick with this syntax will make it more consistent and predictable over time. I suspect this would also make Paul's proposal for a metaprogramming API a touch easier.

@Ovid
Collaborator

Ovid commented Sep 16, 2022

(Though, to be honest, the NEWARGS phaser mechanism I proposed is still far more declarative and automatic than the above alternative.)

True, but multi-method constructors seem (to me) to be easier to understand because they omit some of the complexity and having :multi is extremely useful outside of this one context.

@thoughtstream

thoughtstream commented Sep 17, 2022

@Ovid observed:

At the risk of being controversial by not being cautious, @leonerd has written
Syntax::Keyword::MultiSub which gives us multiple dispatch. There are a couple
of bugs which need to be resolved, but those don't look insurmountable.

My understanding is that Syntax::Keyword::MultiSub currently only provides
multiple dispatch for sub declarations. In this context, I believe
we would need method declarations instead.

BTW, ever since a much earlier private discussion with @Ovid on the need for
multimethods in Corinna, I too have been working on a multi implementation
(a module called Multi::Dispatch).

In contrast to @leonerd’s wisely cautious and elegantly minimalist approach,
my version is wildly ambitious, absurdly overpowered, and includes every
conceivable MD feature culled from half a dozen other high-level languages,
plus at least three additional kitchen sinks of my own devising.
Including, as it happens, multimethods.

It is a pure Perl implementation, so it runs appreciably slower than @leonerd’s
XS-driven version. Though, at just under 1 million dispatches-per-second
on my M1 Powerbook, it’s certainly no slouch and definitely Real World usable.

(Not entirely) coincidentally, Multi::Dispatch also solves the named parameter
binding problem, because one of its kitchen sinks is full strict destructuring
of keyed values within hashlike parameters.

I had been planning to unleash this module on an unsuspecting Perl world
some time in the next month or two, but would be happy to offer @Ovid
and @leonerd a sneak preview in the next few weeks, if either of them
were interested.

However, despite all that, I am not really convinced that either
Syntax::Keyword::MultiSub or Multi::Dispatch is the right solution here.
See below.

I would, however, prefer sticking to the KIM syntax because multi is an adjective,
not a noun.

method new :multi ($num) { ... }

Evolving Perl to stick with this syntax will make it more consistent and predictable
over time. I suspect this would also make Paul's proposal for a metaprogramming API a
touch easier.

I can’t speak to the meta-API issue, and despite my previous championing of KIM syntax,
I actually think KIM would be wrong here...and, more especially, wrong for a hypothetical
generalized Perl-wide multiple-dispatch mechanism.

Multisubs and multimethods are fundamentally different entities from subs and methods,
with fundamentally different declaration constraints and fundamentally different
dispatch behaviours. They even have (or, at least, should have) fundamentally different
internal control constructs (specifically for redispatch); constructs that ought to be
illegal in regular subs and methods.

In short, they are sufficiently distinct that I believe they need a separate keyword.
Not a prefix modifier (like Syntax::Keyword::MultiSub provides), but actual distinct
declarators. In Multi::Dispatch, after much soul-searching and much trial and error,
I eventually chose multi (for MD subs) and multimethod (for MD methods).

But, once again, I fear we are straying from the main issue here.
When @Ovid begins his campaign to add MD to Perl (breathe, @Ovid, breathe! ;-)
I will be more than happy to argue KIM-vs-keyword with him at much greater length.

But multi-method constructors seem (to me) to be easier to understand
because they omit some of the complexity

Agreed. Though they also thereby omit some of the safety and robustness and
convenience and syntactic mellifluousness of an automated declarative mechanism.

and having :multi is extremely useful outside of this one context.

This to me is the clincher. Multiple dispatch is indeed tremendously useful
in other contexts (including other in-class contexts) and it would be ideal
if Perl were to support it.

However, I fear that designing, marketing, selling, and then implementing
a full MD mechanism for Perl is going to be even more controversial,
more challenging, and more “illegitimi carborundum” than shepherding
through this wonderful new OO mechanism has proved to be. :-(

Hence, I suspect it will be some considerable time, if ever, before
we can rely on a built-in MD solution to solve this problem.

And, with the most genuinely profound respect to @leonerd
(and somewhat less respect to myself), I do not believe that
we can – or should – rely on CPAN solutions for issues like this
that we discover within the current design of Corinna.

Happily, in this particular case, multimethods are not actually required.
If we decide that the NEWARGS approach is simultaneously both too complex
and too limited in scope, then we should just shelve the entire idea of
pre-tweaking named constructor arguments...at least for the time being.

Instead, we can mandate that the constructor argument API for Corinna
is fixed and immutable (new must always be passed a key/value list)
and then users who feel they need some other API can just supply their own.

Either via a singly dispatched new method with an internal if cascade:

    class Widget {

        method new :common (@args) {
            if (@args == 2) {
                my ($title, $rgb) = @args;
                $class->next::method(title=>$title, red=>$rgb, green=>$rgb, blue=>$rgb);
            }
            elsif (@args == 4) {
                my ($title, $r, $g, $b) = @args;
                $class->next::method(title=>$title, red=>$r,   green=>$g,   blue=>$b);
            }
            else {
                $class->next::method(@args);
            }
        }

        ...
    }

...or else via some third-party MD solution (such as Multi::Dispatch):

    class Widget {
        use Multi::Dispatch;

        multimethod new :common ($title, $rgb) {
            $class->next::method(title=>$title, red=>$rgb, green=>$rgb, blue=>$rgb);
        }
        multimethod new :common ($title, $r, $g, $b) {
            $class->next::method(title=>$title, red=>$r,   green=>$g,   blue=>$b);
        }
        multimethod new :common (%args) {
            $class->next::method(%args);
        }

        ...
    }

@Ovid
Collaborator

Ovid commented Sep 17, 2022

If we decide that the NEWARGS approach is simultaneously both too complex
and too limited in scope, then we should just shelve the entire idea of
pre-tweaking named constructor arguments...at least for the time being.

I'm actually OK with that and saying it's out-of-scope for the MVP. I don't particularly see a problem with saying, "we have a consistent syntax and you have to use it for now." I'd rather we see how simple we can make things for the MVP and verify that the overall thing works. Once we release, we can see how it plays out in the real world. (Though this means people are going to write method create :common (...) {...} to achieve this.)

I suspect that this might be a minority opinion.
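
For concreteness, here is a minimal sketch of that "alternative named constructor" workaround in plain Perl 5 (no Corinna or Object::Pad syntax). Everything in it is an assumption made for illustration: the hand-written new stands in for the generated key/value constructor, and create is a hypothetical positional wrapper around it.

```perl
use strict;
use warnings;

package Widget;

# Stand-in for the generated constructor: accepts only a key/value list.
sub new {
    my ($class, %args) = @_;
    for my $key (qw(title red green blue)) {
        die "Missing required argument '$key'\n" unless exists $args{$key};
    }
    return bless { %args }, $class;
}

# Hypothetical alternative constructor taking positional arguments.
sub create {
    my ($class, $title, @rgb) = @_;
    @rgb = (@rgb) x 3 if @rgb == 1;    # a single value means a grey level
    die "Expected a title plus 1 or 3 colour values\n" unless @rgb == 3;
    return $class->new(
        title => $title,
        red   => $rgb[0],
        green => $rgb[1],
        blue  => $rgb[2],
    );
}

package main;

my $grey = Widget->create('The text', 50);
my $rgb  = Widget->create('The text', 100, 30, 0);
print "$grey->{title}: $grey->{red}/$grey->{green}/$grey->{blue}\n";
print "$rgb->{title}: $rgb->{red}/$rgb->{green}/$rgb->{blue}\n";
```

The key/value new stays the single canonical entry point, and any other calling convention is just an ordinary class method layered on top of it.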

@abraxxa

abraxxa commented Sep 17, 2022

+1 from me
Better to have no API (for now) than one we want/need to deprecate later.

@HaraldJoerg
Contributor

The NEWARGS approach enables a case I have occasionally: A subclass with a restricted constructor. I'll take @thoughtstream's solution to my Real example, but drop the multi dispatch feature:

class Complex {
    field $re :param :reader;
    field $im :param :reader;
}

class Real :isa(Complex) {
    NEWARGS (%args) { die if exists $args{im}; (%args, im=>0); }
}

This is nice, and the only downside is that there's no declaration to show that im is not available as a constructor attribute.
A "loss" of constructor param declaration in the %arg soup also happens if the fields in question are provided by a base class or role. I'm modifying the widget example to add a role:

role Appearance {
    field $colour :param;
}
class Widget :does(Appearance) {
    field $title :param;
    NEWARGS (%args) {
        (title => $args{title},
         colour => Colour->new(@args{qw(red green blue)}),
        )
    }
}
my $widget = Widget->new(title => "The text", red => 100, green => 30, blue => 0 );

This works, but as in @leonerd's initial example:

Most notably, the words red, green and blue only appear in the behavioural code

...only this time it is in NEWARGS, not in ADJUST.

The obvious solution - Appearance could accept red, green and blue - might not always be available.

@thoughtstream

@HaraldJoerg observed:

the only downside is that there's no declaration to show that im
is not available as a constructor attribute.

Setting aside the crimes against Liskov Substitutability (;-),
a sufficiently advanced multiple dispatch technology
would allow this constraint on the derived class constructor
to be expressed declaratively. For example:

    use Multi::Dispatch;

    class Complex {
        field $re :param :reader;
        field $im :param :reader;
    }

    class Real :isa(Complex) {
        multimethod new (%{ re => $re }) { # Named args, but only allowed to have 're' key
            $class->next::method(re=>$re, im=>0);
        }
    }

A "loss" of constructor param declaration in the %arg soup also happens if the fields
in question are provided by a base class or role.

Again, an S.A.M.D.T. can restrict named argument lists
to specific pre-declared keys:

    role Appearance {
        field $colour :param;
    }

    class Widget :does(Appearance) {
        field $title :param;

        multimethod new (%{ title=>$t, red=>$r, green=>$g, blue=>$b}) {
            $class->next::method( title => $t, colour => Colour->new($r, $g, $b) );
        }
        multimethod new (%{ title=>$t, color=>$c }) {
            $class->next::method( title => $t, colour => $c );
        }
    }

    my $widget = Widget->new(title => "The text", red => 100, green => 30, blue => 0 );

@Ovid
Collaborator

Ovid commented Sep 18, 2022

As an aside, I see an issue with this:

    class Widget {

        multimethod new (%{ title=>$t, red=>$r, green=>$g, blue=>$b}) { ... }
        multimethod new (%{ title=>$t, color=>$c }) { ... }
    }

In particular, maintainability. There's nothing to stop someone from writing this:

    class Widget {

        multimethod new (%{ title=>$t, red=>$r, green=>$g, blue=>$b}) { ... }

        # thousands of lines of code

        multimethod new (%{ title=>$t, color=>$c }) { ... }
    }

And then the poor maintenance programmer (often me, damn it), is unaware that there's something else they have to think about when maintaining the code.

There's an old joke about doctors where the patient complains that "it hurts when I do X" and the doctor replies "then stop doing X."

The Perl developer response to my concern is typically "then don't do X." There's a certain merit to this. class Person :isa(Invoice) is a big bucket of "no," but the computer can't know the meaning of what I'm writing, so we're told "don't do X." But we have unclear specs. We have organically growing code. We have deadlines. We make mistakes. I tend to get hired to work on large systems and they're always a mess—always—even when it's clear that the original code was solid.

When the software can protect us, it should. In this case, we have things that are semantically related, but that's not expressed in the code. So instead, I've long thought I'd prefer something like this (abominable) syntax:

    class Widget {
        multimethod new {
            args (title=>$t, red=>$r, green=>$g, blue=>$b) { ... }
            args (title=>$t, color=>$c)                    { ... }
            args ($title)                                  { ... }
        }
    }

With the above, developers wouldn't be grouping multis by happenstance. They'd be doing so because the syntax forces them to be more organized, making the code easier for others to develop.
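
A rough way to get the same "all variants in one place" effect today, in plain Perl 5, is a single dispatch table. This is only a sketch under stated assumptions: the table, the dispatch_new helper, and the idea of keying variants by their sorted argument names are all hypothetical, and it handles only named-argument variants (not the positional $title form).

```perl
use strict;
use warnings;

# All variants of the hypothetical constructor live in one table,
# keyed by the sorted list of argument names each variant accepts.
my %NEW_VARIANTS = (
    'blue,green,red,title' => sub {
        my %arg = @_;
        return "rgb($arg{red},$arg{green},$arg{blue}) $arg{title}";
    },
    'color,title' => sub {
        my %arg = @_;
        return "$arg{color} $arg{title}";
    },
    'title' => sub {
        my %arg = @_;
        return "default $arg{title}";
    },
);

# Pick the variant whose key set matches the arguments exactly.
sub dispatch_new {
    my (%args) = @_;
    my $signature = join ',', sort keys %args;
    my $variant = $NEW_VARIANTS{$signature}
        or die "No variant of new() accepts: $signature\n";
    return $variant->(%args);
}

print dispatch_new(title => 'A', color => 'red'), "\n";
print dispatch_new(title => 'B', red => 1, green => 2, blue => 3), "\n";
```

Because every variant has to be registered in the one table, a maintainer cannot add a new one thousands of lines away without touching the same declaration.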

@thoughtstream

class Person :isa(Invoice) is a big bucket of "no,"
but the computer can't know the meaning of what I'm writing

Hmmmmmmmmmmmmm.

Given the astonishing advances we’re currently seeing in machine learning
(e.g. plausibly relevant synthetic images generated from pure text descriptions),
I wonder how hard it would actually be to create a module that points out
these types of code malapropisms??? One might imagine:

    use reasonable 'semantics';

    class Car :isa(Wheel) {...}
    # Compile-time error: Category mistake (a car is not a wheel)

    class Real   :isa(Complex)   {...}
    class Square :isa(Rectangle) {...}
    # Compile-time error: Liskov violations

    class Utils :does(Maths) :does(Stats) :does(Transforms) {...}
    # Compile-time warning: Possible "God object" detected

Hmmmmmmmmmmmm.

When the software can protect us, it should.
... So instead, I've long thought I'd prefer
something like this (abominable) syntax:

That’s extremely interesting. I don’t recall having seen that particular approach
to multiple dispatch in any programming language I’ve studied.

However, although I certainly agree this kind of restriction
would be much kinder to hard-pressed maintenance personnel,
it would also undermine one of the great advantages of
multiple dispatch: the clean mechanism MD provides for extending
the behaviour of an existing multi to cover new cases as they are added.

For example, imagine a multi-based implementation of some
Data::Dumper-like facility:

    multi dd (\@a)    { '['.join(',', map {dd($_)} @a) . ']' }
    multi dd (\%h)    { '{'.join(',', map {dd($_).' => '.dd($h{$_})} keys %h).'}'}
    multi dd (\$s)    { '\(' . dd($$s) . ')' }
    multi dd (Num $n) { $n }
    multi dd (Str $s) { '"' . quotemeta($s) . '"' }

When someone creates a new class, and would like the dd multi
to subsequently handle objects of that class as well, they can simply write
(in their own module):

    class MyClass {
        method serialize () {...}

        multi dd :export  (MyClass $obj) { $obj->serialize }
    }

And now the client code’s dd understands those kinds of objects as well.

More relevantly, however, is the fact that, just like regular methods,
multimethods are inherited within class hierarchies, and composed via roles.
So you inevitably get fragmentation of your multimethod anyway:

    class MyBase {
        multimethod do_something ($x, $y) {...}
    }

    # thousands of lines of code
    # or in a completely separate source file

    role Something {
        multimethod do_something ($z) {...}
    }

    # thousands more lines of code
    # or in another completely separate source file

    class MyDer :isa(MyBase) :does(Something) {
        multimethod do_something ($x, $y, $z) {...}
    }

This has to be supported, no matter what, so if you attempted to impose
contiguity on variant declarations via syntax, I suspect developers
would simply take to splitting their scattered multimethod declarations
out into scattered roles (just to spite you ;-)

Nevertheless, I can see the desirability of encouraging developers
to “cuddle” their multimethod variants. So (in Multi::Dispatch at least)
I now plan to offer a loading option that complains at compile-time
when it finds non-contiguous variant declarations within a single namespace.
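
The core of such a check can be sketched in a few lines of plain Perl. This is not Multi::Dispatch code; noncontiguous_variants is a hypothetical helper, and it assumes the variant names have already been collected in source order for one namespace. It only detects variants separated by another multi, not by arbitrary intervening code.

```perl
use strict;
use warnings;

# Given multi names in declaration order, return the names whose
# variants resume after a different multi has been declared in between.
sub noncontiguous_variants {
    my @names = @_;
    my (%finished, %reported, @offenders);
    my $previous;
    for my $name (@names) {
        if (defined $previous && $previous ne $name) {
            $finished{$previous} = 1;    # the run of $previous has ended
        }
        if ($finished{$name} && !$reported{$name}++) {
            push @offenders, $name;      # $name resumed after its run ended
        }
        $previous = $name;
    }
    return @offenders;
}

my @bad = noncontiguous_variants(qw(new new dd new));
print "Non-contiguous: @bad\n";
```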

Thank-you for the idea.

[Obligatory "We're straying from the topic again" reminder...ironically, by the main offender]

@Ovid
Collaborator

Ovid commented Sep 19, 2022

In that case, it seems we're down to three options (someone correct me if I'm wrong)

  1. Damian's suggestion of using ADJUST/NEWARGS
  2. The above, with multidispatch lovelies
  3. Do nothing and require only one way of key/value initialization

I don't think we're getting multidispatch soon, so scratch that. The "do nothing" option raises the question: does this create an unsolvable problem? I'm not convinced that it does. And where workarounds are needed, how likely are they to come up in practice?

That being said, "do nothing", while sounding tempting, doesn't seem like the best approach to me because at the very least, we need ADJUST and if we implement just that, we might find it is not forward-compatible with our needs. However, this means adding arguments to phasers, along with adding an :adjust attribute to fields. @leonerd: thoughts?

@HaraldJoerg
Contributor

I have been musing for some time now whether it is possible to merge the :param and :adjust attributes into one.

Rationale: Both attributes describe part of how a field gets its initial value. :param states that a value can be provided by the caller of the constructor, and we already have :param(NAME) if we want different names for the field and parameter. So maybe we could extend :param to carry the declaration of the parameters which are passed to the initializer block?

Here are some examples:

   field $name :param;
   field $name :param(name);          # same behavior as previous line

The caller must pass a value for name => $scalar to the constructor (unchanged to the current RFC). The key "name" is marked as used by the constructor.

    field $name :param(label);

The caller must pass a value for label => $scalar to the constructor (also unchanged). The key "label" is marked as used by the constructor.

Now let's add initializer blocks:

    field $name :param { $name // 'N.N.' } # default is 'N.N.'
    field $name :param { $name } # defaults to undef
    field $name :param(label) { $label } # label is optional, defaults to undef

This is a change to the current spec (and to the behavior of Object::Pad): The caller may pass a value name => $scalar in the parameter list. The parameter name is available as a scalar $name in the initializer block. The initializer block, if present, is always executed (unlike in Object::Pad today), and the value of its last statement becomes the initial value of the field.

So one example we had before could be written like this:

    field $title :param(title,label) { $title // ucfirst lc $label }

Both title and label are marked as used by the constructor.

The widget example:

class Widget {
   field $title :param;
   field $colour :param(red,green,blue) { Colour->new( $red, $green, $blue ) };
}

my $widget = Widget->new( title => "The text", red => 100, green => 30, blue => 0 );

Like in @leonerd's initial example, it is not possible to specify a Colour object directly. You need NEWARGS wizardry if you want to allow both types of invocation.

The fact that initializer blocks are always run has some interesting consequences.
I guess I myself will make that error sooner or later:

class Breakfast {
    has $drink :param { Coffee->new; }
}

This is no longer specifying a default. Instead, it ignores whatever parameter has been given to the constructor.
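
To make the pitfall concrete, here is a small emulation in plain Perl 5. It is only a sketch of the proposed "always run" semantics: build_field is a hypothetical helper, and strings stand in for the Coffee object.

```perl
use strict;
use warnings;

# Emulates the proposed semantics: the initializer always runs and
# receives the named constructor argument (possibly undef).
sub build_field {
    my ($initializer, $param_value) = @_;
    return $initializer->($param_value);
}

# Like { Coffee->new; }: never looks at the parameter.
my $ignores_param = sub { 'Coffee' };

# Like { $drink // Coffee->new }: the parameter wins when it is defined.
my $uses_default = sub { my ($drink) = @_; $drink // 'Coffee' };

print build_field($ignores_param, 'Tea'), "\n";   # the caller's value is discarded
print build_field($uses_default,  'Tea'), "\n";   # the caller's value is kept
print build_field($uses_default,  undef), "\n";   # falls back to the default
```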

As another example, array fields can be initialized from constructor parameters like this:

class Polygon {
    field @points :param { @$points }
}

Also, initializer blocks can now be used to validate the input, which is kinda nice because it happens early in the construction process and in many cases there will be a 1:1 relationship between a param and a field.

A construction parameter can be used to build more than one field:

class Prism {
    field $height   :param { $height // 1.0 }
    field $size     :param { $size   // 1.0 }
    field @vertices :param(sides,size,height) { ...; }
    field @faces    :param(sides) { ...; }
}
my $p = Prism->new(sides => 6); # create a hexagonal prism

There's still a need for ADJUST phasers in case the fields cannot be initialized one-by-one, or when constructors should have side-effects (e.g. logging), or when validating "the whole thing" is necessary.

@thoughtstream

I think @HaraldJoerg’s “initializer-blocks-always-run” proposal
has some good points, but I also think it has some issues
(even apart from his own has $drink :param { Coffee->new; }
counterexample).

Most importantly, the current behaviour of initializer blocks
(even without my proposed :adjust($init_param) addition)
provides a critical feature: they specify a default initialization
behaviour when the corresponding named constructor arg (“NCA”)
is missing.

For example:

    field $colour :param :adjust($r=0.5, $g=0.5, $b=0.5) { Colour->new(r=>$r, g=>$g, b=>$b) }

...allows users to either pass a colour => $colourobj argument to the
constructor, or else pass r=>$red, g=>$green, b=>$blue instead,
or else pass nothing and get grey as a default.

I can’t really see how to achieve that under @HaraldJoerg’s proposal.

And here's an even simpler example:

    # Initialize from the 'name' NCA, or else from the 'label' NCA...
    field $name :param  :adjust($label) { $label }

Moreover, this example tells the compiler to mark the 'name' NCA as consumed
if it was present, in which case the additional presence of a 'label' NCA
is automatically reported as an error. And if the 'name' NCA is missing,
then that's a (different) error, unless the 'label' NCA is present. And if
they're both missing, then that can be reported accurately too ("...was expecting
'name' or 'label' argument..."). I can’t see how to achieve all that under
@HaraldJoerg’s proposed unified semantics.

The issue here is that :param means: “expect an NCA with the specified name”,
whereas :adjust means: “if you didn’t get the expected NCA, do this instead
with this other specified NCA”. Those two behaviours are intrinsically sequential,
and the second is contingent on the first. Hence they are not really unifiable.
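
That sequential, contingent behaviour can be emulated in plain Perl 5 roughly as follows. This is a sketch only: init_field is a hypothetical helper, not part of any proposal, and the error wording is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper emulating ":param, else :adjust" for one field.
# $args is the hashref of named constructor arguments (NCAs).
sub init_field {
    my ($args, $param, $fallback, $initializer) = @_;

    # Step 1 (:param): use the expected NCA if it was passed.
    if (exists $args->{$param}) {
        die "Cannot pass both '$param' and '$fallback'\n"
            if exists $args->{$fallback};
        return delete $args->{$param};    # mark the NCA as consumed
    }

    # Step 2 (:adjust): only reached when step 1 found nothing.
    die "Was expecting '$param' or '$fallback' argument\n"
        unless exists $args->{$fallback};
    return $initializer->(delete $args->{$fallback});
}

my $name = init_field({ label => 'widget' }, 'name', 'label', sub { $_[0] });
print "$name\n";
```

All three error cases fall out of the ordering of the two steps, without the initializer code itself having to test anything.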

I would also point out that my previous proposal already allows for the
kinds of “always executed” initializer blocks @HaraldJoerg is proposing.
You just leave out the :param specifier, in which case the initializer
block is always executed. Here are all his examples, under my previous proposal:

    field $name :param        { 'N.N.' } # default is 'N.N.'
    field $name :param        { undef  } # defaults to undef
    field $name :param(label) { undef  } # label is optional, defaults to undef
    # If no 'title' NCA, use 'label' NCA instead...
    field $title :param :adjust($label) { ucfirst lc $label }
    class Widget {
        field $title  :param;
        # Must pass 'red', 'green', and 'blue' NCAs; can't pass 'colour' NCA...
        field $colour :adjust(red,green,blue) { Colour->new($red,$green,$blue) };
    }
    class Breakfast {
        has $drink :param { Coffee->new; }   # Now not an error (Coffee is the default)
    # or...
        has $drink        { Coffee->new; }   # Also not an error (Coffee is mandatory...and obviously so)
    }
    class Polygon {
        field @points :adjust($points) { @$points }
    }
    class Prism {
        field $height   :param { 1.0 }
        field $size     :param { 1.0 }
        field @vertices :param :adjust(sides) { ...; }
        field @faces    :param :adjust(sides) { ...; }
    }
    my $p = Prism->new(sides => 6); # create a hexagonal prism

In summary: I completely understand (and approve of) the desire to
reduce new constructs wherever possible, especially by combining
them when they turn out to actually be just two “views” of a single
underlying concept. And I'm very glad that smart and dedicated people
like @HaraldJoerg are thinking in those terms about this design.

However, I don't believe that explicit initialization (:param) and
fallback default initialization (:adjust) are merely two aspects
of one thing. They are two distinct and contingent steps in the overall
initialization process...and I believe they need two distinct and contingent
constructs to specify them.

I specifically chose the name :adjust to imply that the second phase
happens after normal initialization (just like with ADJUST phasers),
but perhaps another name would make it clearer that such adjustments
only occur when :param initialization is absent...or fails. Perhaps :default?
(I'm less keen on :fallback, but I guess that would work too.)

@HaraldJoerg
Contributor

HaraldJoerg commented Oct 7, 2022

@thoughtstream says:

    field $colour :param :adjust($r=0.5, $g=0.5, $b=0.5) { Colour->new(r=>$r, g=>$g, b=>$b) }

...allows users to either pass a colour => $colourobj argument to the
constructor, or else pass r=>$red, g=>$green, b=>$blue instead,
or else pass nothing and get grey as a default.

I can’t really see how to achieve that under @HaraldJoerg’s proposal.

I omitted that intentionally (mumbling about NEWARGS). Actually, this is possible in an initialiser block:

field $colour :param(colour,r=0.5,g=0.5,b=0.5) { $colour // Colour->new(r=>$r, g=>$g, b=>$b) }
# or (as attributes are a pain to parse)
field $colour :param(colour,r,g,b) { $colour // Colour->new(r=>$r//0.5, g=>$g//0.5, b=>$b//0.5) }

I can narrow down the difference in interpretation with @thoughtstream's quote:

The issue here is that :param means: “expect an NCA with the specified name”,

In my interpretation, :param means "The initial value for this field is to be derived from the NCAs given in parens". It does not need a distinction between NCAs specified in :param and NCAs specified in :adjust. After all, :adjust also means "expect an NCA with the specified name" in this example:

field @points :adjust($points) { @$points }

@thoughtstream

field $colour :param :adjust($r=0.5, $g=0.5, $b=0.5) { Colour->new(r=>$r, g=>$g, b=>$b) }
I can’t really see how to achieve that under @HaraldJoerg’s proposal.

field $colour :param(colour,r=0.5,g=0.5,b=0.5) { $colour // Colour->new(r=>$r, g=>$g, b=>$b) }

But this doesn’t fully achieve the same effect. For a start, unlike my example, it doesn’t specify
that the 'colour' NCA and the 'r'/'g'/'b' NCAs are mutually exclusive, so the compiler
(or an IDE, or a static analyser, or a linter, or a refactorer, etc.) would have no way to
automatically detect or report the error in: Widget->new(colour=>$c, r=>$red, g=>$green, b=>$blue)

So then we’d start seeing:

    field $colour :param(colour,r=0.5,g=0.5,b=0.5) {
        die "Can't use 'colour' and 'r'/'g'/'b' at the same time"
            if defined $colour && grep { defined } $r, $g, $b;
        $colour // Colour->new(r=>$r, g=>$g, b=>$b)
    }

More generally, the behaviour of the :adjust version:

    field $colour :param :adjust($r=0.5, $g=0.5, $b=0.5) { Colour->new(r=>$r, g=>$g, b=>$b) }

is maximally declarative. The only code required is the code that provides the actual default/fallback value.
Whereas, the behaviour of:

    field $colour :param(colour,r=0.5,g=0.5,b=0.5) { $colour // Colour->new(r=>$r, g=>$g, b=>$b) }

is emergent. The default/fallback only occurs because of the // in the code.

Which means it’s much easier to enbug that version:

    field $colour :param(colour,r=0.5,g=0.5,b=0.5) { $colour || Colour->new(r=>$r, g=>$g, b=>$b) }
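
For readers less familiar with the distinction, here is a tiny plain-Perl illustration (the two helper subs are hypothetical names). For an ordinary object reference the two operators behave the same; the difference bites when a legitimately false value is passed, such as r => 0, or when the object overloads boolean conversion.

```perl
use strict;
use warnings;

# Defined-or: falls back only when the value is undef.
sub with_defined_or { my ($value, $default) = @_; return $value // $default }

# Logical-or: falls back for every false value (undef, 0, '', '0').
sub with_logical_or { my ($value, $default) = @_; return $value || $default }

for my $red (0, 0.8, undef) {
    my $shown = defined $red ? $red : 'undef';
    printf "r=%-5s  //: %-4s  ||: %s\n",
        $shown, with_defined_or($red, 0.5), with_logical_or($red, 0.5);
}
```

With ||, an explicit r => 0 is silently replaced by the default.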

The same issues apply even more to the alternative formulation @HaraldJoerg suggested:

    field $colour :param(colour,r,g,b) { $colour // Colour->new(r=>$r//0.5, g=>$g//0.5, b=>$b//0.5) }

This has all the same non-declarative disadvantages as the previous version but, in addition,
now the compiler (or IDE, or static analyser, or linter, or refactorer, or a human reader of the code)
can’t even determine syntactically that the 'r'/'g'/'b' NCAs are optional or what
their individual fallback values are.

And the possibility for unintended effects is even greater:

    field $colour :param(colour,r,g,b) { $colour || Colour->new(r=>$r//0.5, g=>$g//0,5, b=>$b/0.5) }

In my interpretation, :param means "The initial value for this field is to be derived from
the NCAs given in parens".
It does not need a distinction between NCAs specified in :param
and NCAs specified in :adjust.

I’m arguing that we do need that distinction. In order to generate correct error messages.
And, more importantly, in order to avoid the kind of emergent (mis)behaviour that
this declarative OO mechanism is supposed to be supplanting.

@aquanight

I'm starting to think we're looking at putting WAY too much into a :param attribute, especially for something that needs an MVP implementation. What started as a simple "put this NCA into a field for me" is looking like it's being slowly turned into a field-by-field subroutine signature.

I'm going to once more reiterate my own proposal for this solution space.

  1. Any NCA will go into exactly one field marked with :param (or :param(NAME) to have the NCA named differently from the field). No two fields at any point in construction may examine the same NCA (adjudicated per hash-key lookup).
  2. :param fields with an initializer block produce an optional NCA. If the NCA is present, the initializer block is skipped. If a :param field appears without an initializer block and the NCA is not present, a fatal error is produced at that point and no further initializer or ADJUST blocks run.
  3. No NCA provided to the top level class is seen by a base class or role: instead the class being constructed shall use $self->Base::new(...) or $self->Role::new(...) to initialize base classes and roles.
  4. No parameters are provided to ADJUST or initializer blocks. Not via @_, not via %_ or some magic lexical, none of that. If you want to look at NCAs, you look at previously declared fields marked with :param.
  5. A single hash field can be marked :param with no name permitted. If this is done, the hash receives and consumes all NCAs not yet consumed by any prior field. Since processing happens in lexical order, no field declared after this hash field could usefully use :param. If no hash field is marked :param, any unused NCA will produce a fatal error no later than after the last field with :param is processed or the last ADJUST block runs.
  6. Any field not used in any method will be freed (and thus any reference it contains released) before new returns to its caller. This may require multiple passes to prune out fields used in private methods that are themselves only used during construction.

The idea here is to achieve declarative NCAs yet keep things simple enough to be possible for an MVP.
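
A rough plain-Perl emulation of rules 1, 2 and 5 may make the intended flow easier to follow. It is a sketch under stated assumptions: consume_ncas and the field-description hashes (name, init, slurpy) are hypothetical, and earlier fields are handed to later initializers via a hashref rather than as real lexicals.

```perl
use strict;
use warnings;

# Each field consumes at most one NCA, in declaration order. Leftover
# NCAs are fatal unless a slurpy hash field has swallowed them.
sub consume_ncas {
    my ($fields, %ncas) = @_;
    my %object;
    for my $field (@$fields) {
        my ($name, $initializer) = @{$field}{qw(name init)};
        if ($field->{slurpy}) {
            $object{$name} = { %ncas };    # take everything still unconsumed
            %ncas = ();
        }
        elsif (exists $ncas{$name}) {
            $object{$name} = delete $ncas{$name};    # initializer is skipped
        }
        elsif ($initializer) {
            $object{$name} = $initializer->(\%object);
        }
        else {
            die "Missing required argument '$name'\n";
        }
    }
    die "Unused arguments: @{[ sort keys %ncas ]}\n" if %ncas;
    return \%object;
}

my @fields = (
    { name => 'r' }, { name => 'g' }, { name => 'b' },
    { name => 'colour',
      init => sub { my $o = shift; "rgb($o->{r},$o->{g},$o->{b})" } },
);
my $obj = consume_ncas(\@fields, r => 1, g => 2, b => 3);
print "$obj->{colour}\n";
```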

Since our colour example keeps coming around, that results in:

field $r :param;
field $g :param;
field $b :param;
field $colour :param { Colour->new(r => $r, g => $g, b => $b) };

Item 3 has the goal of allowing derived classes to have absolute control over the initialization of their superclasses. This was also the motivation behind item 5 (to allow the derived class to delegate unknown NCAs to base class). If a base class or role isn't given a ->new() call by the subclass, the overall construction logic should automatically construct the base and roles as if they had no parameters.

@aquanight

aquanight commented Oct 7, 2022

(Just to reiterate something: the "fields not used beyond construction can be freed" would extend to ANY field, :param or not. Among other things, this means something like this is not as wasteful as it looks:

field $with_rgb { 1 };
field $r :param { $with_rgb = undef; };
field $g :param { $with_rgb = undef; };
field $b :param { $with_rgb = undef; };
field $with_colour { 1 };
field $colour :param { $with_colour = undef; $with_rgb or die; Colour->new(r => $r, g => $g, b => $b); };
ADJUST { !$with_rgb != !$with_colour or die; }

)

@thoughtstream

I agree with @aquanight that my :adjust proposal and @HaraldJoerg's extended :param proposal
are both inappropriate in the MVP (let alone in the MMVP).

I'd be fine if the first release of Corinna has only simple :param and unparameterized initializer blocks.
For the MMVP, it doesn't matter that we can't (yet) specify alternative initialization NCAs for a field.

But this discussion is also about what Corinna should provide long-term, as it evolves.
It's critical to work through such matters now, so that we don't make decisions in the MMVP
that paint us into a suboptimal corner further down the line.

That said, if you truly believe that:

    field $with_rgb { 1 };
    field $r :param { $with_rgb = undef; };
    field $g :param { $with_rgb = undef; };
    field $b :param { $with_rgb = undef; };
    field $with_colour { 1 };
    field $colour :param { $with_colour = undef; $with_rgb or die; Colour->new(r => $r, g => $g, b => $b); };
    ADJUST { !$with_rgb != !$with_colour or die; }

...is an adequate long-term alternative to something like:

    field $colour :param :adjust($r, $g, $b) { Colour->new(r => $r, g => $g, b => $b) };

...or perhaps:

    NEWARGS ($r, $g, $b) { colour => Colour->new(r => $r, g => $g, b => $b) };
    field $colour :param;

...then I suspect our criteria for programming language design are just too fundamentally different
for us ever to agree.

Mind you, I'm not saying which of us is right; just that one design approach must die
at the hand of the other...for neither can live while the other survives. ;-)
