Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix returning from constructors #845

Merged
merged 4 commits into from
Apr 8, 2021

Conversation

ChayimFriedman2
Copy link
Contributor

  1. Do not allow returning with a value
  2. Return the instance, correctly, even when the user returned explicitly

 1. Do not allow returning with a value
 2. Return the instance, correctly, even when the user returned explicitly
@mhermier
Copy link
Contributor

Can you separate the 2 changes of behavior? While I understand 1 and find it interesting, I don't see what you are trying to fix with 2 and you don't provide test cases for it.

@MasFlam
Copy link
Contributor

MasFlam commented Nov 24, 2020

What about foreign constructors? A way to ensure they don't return anything other than this is to check if the slot 0 was changed when the foreign function returns

@mhermier
Copy link
Contributor

The foreign constructors are likely to be another subject (related though). The automatically generated static part of the constructor relies on the fact that the instance is returned as the first entry on the stack. It is good for performance, but questionable in term of security/usability. Ideally the stack should be shifted by 1 cloning entry 0, so we can safely ignore any returned values of the constructor, but it has speed implications.

@mhermier
Copy link
Contributor

Even if interesting, I have 2 big cons about that change:

  • Technically speaking, forbidding returns in constructors should also forbid one liner method block and as a consequence the one used for default constructor: construct new() { }
  • While it tries to formalize the body of constructors, it does contradict with something that it is not currently possible to do (because @munificent was not sure about the usage, when I exposed it, so he banned it): constructor functions. The concept is that since constructor names are arbitrary, a constructor can also act has a regular function. It means that when called from the class it returns and instance, while it still remains callable as a regular method. The only usage I found, that is not really made up, is about method chaining, but I suspect it should able to be generalized for other usage.

@ChayimFriedman2
Copy link
Contributor Author

Can you separate the 2 changes of behavior? While I understand 1 and find it interesting, I don't see what you are trying to fix with 2 and you don't provide test cases for it.

Previously, the following code:

class C {
  construct new() {
    return
  }
}
System.print(C.new())

Printed null, and it's derived directly from how return without value is desugared: return null. Now, it'll be desugared to return this inside constructor, keeping the correct behavior.
And I have a test for it, actually: give a look at https://github.com/wren-lang/wren/pull/845/files#diff-cb6241655f8ee0f56fd7fab4d61978f1a97e99f053d607f6abac8919f554155c.

Technically speaking, forbidding returns in constructors should also forbid one liner method block and as a consequence the one used for default constructor: construct new() { }

This is not true: this case was already handled. See

wren/src/vm/wren_compiler.c

Lines 1659 to 1666 in 44d6d20

if (isInitializer)
{
// If the initializer body evaluates to a value, discard it.
if (isExpressionBody) emitOp(compiler, CODE_POP);
// The receiver is always stored in the first local slot.
emitOp(compiler, CODE_LOAD_LOCAL_0);
}
, esp. line 1662.

While it tries to formalize the body of constructors, it does contradict with something that it is not currently possible to do (because @munificent was not sure about the usage, when I exposed it, so he banned it): constructor functions. The concept is that since constructor names are arbitrary, a constructor can also act has a regular function. It means that when called from the class it returns and instance, while it still remains callable as a regular method. The only usage I found, that is not really made up, is about method chaining, but I suspect it should able to be generalized for other usage.

I don't understand your point, sorry.

The foreign constructors are likely to be another subject (related though). The automatically generated static part of the constructor relies on the fact that the instance is returned as the first entry on the stack. It is good for performance, but questionable in term of security/usability. Ideally the stack should be shifted by 1 cloning entry 0, so we can safely ignore any returned values of the constructor, but it has speed implications.

Yes that's right: when I had to implement return without value from constructor, I had two options: do what I did, or do what you suggested. I decided to go with the first option, even though I didn't realized that there will be performance implications to the second, because I found it consistent with rejecting returning a value.

Detecting invalid foreign constructors is hard, because they can put what they want in slot 0. Instead I opt for trusting the user, like Wren does for most C APIs:

Wren’s embedding API defines the border between those worlds, and takes on some of the characteristics of C. When you call any of the embedding API functions, it assumes you are calling them correctly. If you invoke a Wren method from C that expects three arguments, it trusts that you gave it three arguments.

In debug builds, Wren has assertions to check as many things as it can, but in release builds, Wren expects you to do the right thing. This means you need to take care when using the embedding API, just like you do in all C code you write. In return, you get an API that is quite fast.

(from https://wren.io/embedding/).

@mhermier
Copy link
Contributor

mhermier commented Nov 24, 2020 via email

@ChayimFriedman2
Copy link
Contributor Author

First, I don't understand why this PR prevents that.

Second, IMO, the right way to do that is to use normal methods:

class HTML {
  construct new() {}
  header(h) {
    _header = h
    return this
  }
  body(b) {
    _body = b
    return this
  }
}
var html = HTML.new().header("foo").body("bar")

This is how fluent interfaces are built in general.

@mhermier
Copy link
Contributor

mhermier commented Nov 24, 2020 via email

@ChayimFriedman2
Copy link
Contributor Author

What???

@ruby0x1
Copy link
Member

ruby0x1 commented Nov 24, 2020

class HTML {
  contruct header(h) {
    _header = h
  }
  construct body(b) {
    _body = b
  }
}
var html = HTML.header("foo").body("bar")

This is equivalent, as constructor returns this without a need to return it manually.
The confusing thing here is that you're allocating multiple instances (this is not typical in method chaining...)

@ChayimFriedman2
Copy link
Contributor Author

Yeah, that what I meant when I said:

I don't understand why this PR prevents that.

@ruby0x1
Copy link
Member

ruby0x1 commented Nov 24, 2020

As @avivbeeri mentions elsewhere, HTML.header("foo").body("bar") is not valid either, you can't call a constructor on an instance. It's a class method only.

class HTML {
  contruct header(h) {
    _header = h
    return HTML
  }
  construct body(b) {
    _body = b
    return HTML
  }
}
var html = HTML.header("foo").body("bar")

This case would make that code work, but this goes back to returning random/unrelated values from a constructor which has a well defined purpose of creating an instance.

@ChayimFriedman2
Copy link
Contributor Author

ChayimFriedman2 commented Nov 24, 2020

class HTML {
  contruct header(h) {
    _header = h
    return HTML
  }
  construct body(b) {
    _body = b
    return HTML
  }
}
var html = HTML.header("foo").body("bar")

This case would make that code work, but this goes back to returning random/unrelated values from a constructor which has a well defined purpose of creating an instance.

Nope, it will create an instance with only _body set but _header will be null.

@ruby0x1
Copy link
Member

ruby0x1 commented Nov 24, 2020

The confusing thing here is that you're allocating multiple instances (this is not typical in method chaining...)

@ChayimFriedman2
Copy link
Contributor Author

ChayimFriedman2 commented Nov 24, 2020

But if you operate on the instance, you're not.

If you return the class, then it's not different from creating a new instance, yup.

Edit:
So this code:

class HTML {
  construct new() {}
  header(h) {
    _header = h
    return this
  }
  body(b) {
    _body = b
    return this
  }
}
var html = HTML.new().header("foo").body("bar")

And this code:

class HTML {
  contruct header(h) {
    _header = h
  }
  construct body(b) {
    _body = b
  }
}
var html = HTML.header("foo").body("bar")

Err now, but if this will be supported, they'll work.
But this code:

class HTML {
  contruct header(h) {
    _header = h
    return HTML
  }
  construct body(b) {
    _body = b
    return HTML
  }
}
var html = HTML.header("foo").body("bar")

Doesn't err (without this PR), but ending up creating multiple instance and not setting all properties.

@mhermier
Copy link
Contributor

mhermier commented Nov 25, 2020 via email

@ChayimFriedman2
Copy link
Contributor Author

I agree that this could be useful. Something like Python's __new__() method.

However, @ruby0x1 already said that this is a bug. And even if we want to allow returning a value, we probably still want return without value to return the instance and not null.

@mhermier
Copy link
Contributor

mhermier commented Nov 26, 2020 via email

@ruby0x1 ruby0x1 merged commit 041f1ba into wren-lang:main Apr 8, 2021
@ruby0x1
Copy link
Member

ruby0x1 commented Apr 8, 2021

While this particular solution creates a minor edge case*, I've run into this and seen testers run into this enough times that I think having an explicit error message is better. Each time I've seen it, it caused a lot of confusion.

Returning from a constructor is valid, and it shouldn't break the class instance you tried to construct when you do so.

It should also make it clear when you're using the return in a constructor incorrectly, as returning a value from a constructor isn't valid - the return result is always supposed to be 'this'. Silently ignoring a return value is not as good as an error message in this case.

*: return this throws an error, even though it's technically valid. For now this is a better state than silent and confusing failure.

@ChayimFriedman2 ChayimFriedman2 deleted the fix-return-in-constructors branch April 8, 2021 12:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants