Wrong method resolution when using default values #10231

Closed
Sija opened this issue Jan 9, 2021 · 16 comments · Fixed by #10711
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:semantic

Comments

@Sija (Contributor) commented Jan 9, 2021

class Foo
  def bar(a = 0, b = 0, c = 0, d = 0)
  end
 
  def bar
  end
end
 
foo = Foo.new
foo.bar 1, 1, 1, 1

fails to compile with error:

error in line 10
Error: wrong number of arguments for 'Foo#bar' (given 4, expected 0)

Overloads are:
 - Foo#bar()

https://carc.in/#/r/a8sd


It works when we switch the order of definitions:

class Foo
  def bar
  end
 
  def bar(a = 0, b = 0, c = 0, d = 0)
  end
end
 
foo = Foo.new
foo.bar 1, 1, 1, 1

https://carc.in/#/r/a8sf


Originally reported on IRC by hightower2

@straight-shoota (Member) commented Jan 9, 2021

Wow, I'd have expected an error like this would have been noticed before.

Maybe everyone just writes their argless overloads first...? ;)

straight-shoota added the kind:bug and topic:compiler:semantic labels on Jan 9, 2021
@asterite (Member) commented Jan 9, 2021

Because the second definition collides with the first one, it wins. The first one disappears.

@straight-shoota (Member)

Why can it collide when the first one has four arguments and the second one has none?

@Blacksmoke16 (Member)

@straight-shoota For foo.bar, should it use the one with no args or the one with 4 args that all have defaults?

@Sija (Contributor, author) commented Jan 9, 2021

@Blacksmoke16 IMO the last one defined should be used. OTOH, calling foo.bar(1, 2, 3, 4) is supposed to pick the definition with the proper argument arity - in this case Foo#bar(a, b, c, d).

@asterite (Member) commented Jan 9, 2021

The question is: what happens when you call it without args? The second one always wins, and that's the one that stays.

I don't know how this should be changed. Maybe give an error in this case because it's likely it's a mistake.

@Sija (Contributor, author) commented Jan 10, 2021

I don't know how this should be changed. Maybe give an error in this case because it's likely it's a mistake.

@asterite In order to allow the mentioned case to work, the obvious thing to do IMO would be to leave both variants accessible. That way the last defined method would win, but the previous one would still be reachable via a different argument arity. Does that make sense?

@asterite (Member)

I don't think so, it will be confusing.

@straight-shoota (Member)

@asterite When both are defined in the same scope, #10071 produces an appropriate error. That's definitely good because you can easily change the methods.

But I'm not sure about what to do when they're not in the same scope. A typical use case would probably be reopening a type to add a method def overriding only the zero-args overload. An example would be Int#round and Number#round(digits = 0, base = 0) if they were defined on the same type.

Usually an overload with different arguments adds a different way to call a method. An overload with equal arguments overrides the previous one. If you add an overload with both characteristics (different arguments and equal arguments at the same time, because of default values), I would intuitively expect each aspect to be interpreted accordingly: the overload with identical arguments would be overridden, but the overload with different arguments would be added as a new one.

It would probably be fine if that was a proper error, but IMO it should be more explicit. The error message from the OP is also really confusing.

@asterite (Member)

Sounds good. The logic for this is in restrictions.cr if someone wants to fix this.

@straight-shoota (Member)

I fear changing the restriction logic isn't that simple. For methods with optional arguments, it really depends on the call arguments.

In this example, bar is only a restriction of bar(x = true) when it is called without an argument. But when called with an argument, it's not a restriction.

def bar(x = true)
  1
end

def bar
  'c'
end

bar 1 # => 1
bar   # => 'c'

So it doesn't seem trivial to change this. I'm not super familiar with the code, so maybe I'm missing something.

I'm not sure how to move forward with this. Restrictions for methods with optional arguments can't really be resolved statically. So I suppose that would require changes to the method lookup which would then have to filter out non-restriction overloads. That should probably work.

Another idea would be to factor out optional arguments, so that bar(x = true) is duplicated as bar and bar(x). That would allow proper restrictions based on arity, but it would also add exponential numbers of overloads for methods with many optional arguments. So it's probably not a good idea unless we could limit the expansion to conflicting overloads.
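To get a feel for the growth this factoring implies, here is a small hypothetical Python sketch (all names are mine, not anything from the compiler). Positional optionals alone expand linearly, one variant per prefix of supplied arguments, but since optional parameters can also be supplied by name in any subset, a full factoring would need an exponential number of variants:

```python
from itertools import combinations

def factored_overloads(required, optional_named):
    """Enumerate the variants produced if each optional (named-suppliable)
    parameter were factored out into its own overload: one variant per
    subset of defaults that the caller actually provides."""
    variants = []
    for k in range(len(optional_named) + 1):
        for subset in combinations(optional_named, k):
            variants.append((tuple(required), subset))
    return variants

# def bar(x = true) factors into just: bar and bar(x)
print(len(factored_overloads((), ("x",))))             # 2
# but four independently-suppliable optionals already give 2**4 variants
print(len(factored_overloads((), ("a", "b", "c", "d"))))  # 16
```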

@HertzDevil (Contributor) commented Feb 7, 2021

(I suggest changing the terminology here so that if a.restriction_of?(b) then a is subsumed by b, and if additionally !b.restriction_of?(a) then a is more specialized than b.)

The second bar overload is still more specialized than the first one, because every admissible set of arguments for def bar is also admissible for def bar(x = true), with no arguments being the only possibility; but not vice-versa, as providing any argument would match only the first overload, not the second.

The original issue stems from the fact that both overloads subsume each other, so Crystal regards them as redefinitions. Method resolution will work as long as either overload does not subsume the other. Informally speaking:

Because the second definition collides with the first one, it wins. The first one disappears.

Crystal considers that the first definition also wins, which is the real reason the two definitions collide, hence the first one disappearing. This is how the compiler currently uses arity to determine subsumption order:

# A def with more required arguments than the other comes first
if min_size > other.max_size
  return true
elsif other.min_size > max_size
  return false
end

The two overloads in OP's example have min_size = 0, max_size = 4 and min_size = 0, max_size = 0 respectively. Neither overload can be determined to subsume the other at this stage, and no parameters are ever considered afterwards, so by default both overloads "win" and there is a redefinition.
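The quoted check can be mirrored in a tiny Python model (the real implementation lives in restrictions.cr; the function name below is mine) to show why OP's overloads tie in both directions:

```python
# Toy model of the arity-based subsumption check quoted above.
def arity_orders_before(min_size, max_size, other_min, other_max):
    """True/False if arity alone decides the order, None if it cannot."""
    if min_size > other_max:
        return True   # self requires more args than other can accept
    if other_min > max_size:
        return False  # other requires more args than self can accept
    return None       # arity check is inconclusive

# OP's overloads: bar(a = 0, b = 0, c = 0, d = 0) vs bar
print(arity_orders_before(0, 4, 0, 0))  # None: inconclusive in one direction
print(arity_orders_before(0, 0, 0, 4))  # None: inconclusive in the other
# With no later tie-breaker, both "win" and Crystal treats them as a redefinition.
```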

But this is incorrect even in the case without default arguments; if two overloads have incompatible numbers of (required positional) parameters, method resolution will work regardless of their relative order:

def foo(x, y); end
def foo(x, y, z); end

def bar(x, y, z); end
def bar(x, y); end

foo 1, 2 # always matches the 2-arg overload; always refutes the 3-arg overload
bar 1, 2 # always matches the 2-arg overload; always refutes the 3-arg overload

@HertzDevil (Contributor)

Another thing is you can erase an overload by successively defining overloads that Crystal thinks are redefinitions of the previous one:

def foo; end
def foo(x = 0); end
def foo(x); end

foo # Error: wrong number of arguments for 'foo' (given 0, expected 1)

The minimal fix is as below:

# If self can take fewer arguments than other's required arguments, self isn't stricter than other
return false if min_size < other.min_size

# If self can take more arguments than other's required + optional arguments, self isn't stricter than other
return false if max_size > other.max_size

However this breaks a few places, for example these overloads of Enumerable#join:

  • def join(separator = "")
  • def join(io : IO, separator = "")

Neither overload subsumes the other; only the first overload matches if 0 arguments are passed, and only the second matches if 2 arguments are passed. We would still like the second overload to have higher priority regardless of definition order, because its first parameter is required (the type restriction doesn't matter here) compared to the first overload where it's optional. This leads to the following:

# A def with more required positional arguments than the other comes first
if min_size > other.min_size
  return true
elsif other.min_size > min_size
  return false
end

# A def with fewer optional positional arguments than the other comes first
if max_size < other.max_size
  return true
elsif other.max_size < max_size
  return false
end
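A toy Python model of this revised ordering (names hypothetical, restrictions ignored) confirms it resolves the Enumerable#join case the way we want:

```python
def orders_before(min_size, max_size, other_min, other_max):
    """Toy model of the revised check: required arity first, then optional arity.
    True/False if arity decides the order, None if the defs stay tied."""
    # A def with more required positional arguments than the other comes first
    if min_size > other_min:
        return True
    if other_min > min_size:
        return False
    # A def with fewer optional positional arguments than the other comes first
    if max_size < other_max:
        return True
    if other_max < max_size:
        return False
    return None

# join(separator = "")          -> min_size 0, max_size 1
# join(io : IO, separator = "") -> min_size 1, max_size 2
print(orders_before(1, 2, 0, 1))  # True: the IO overload comes first, as desired
```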

This too is still broken, this time in File.new:

  • private def initialize(@path, fd, blocking = false, encoding = nil, invalid = nil)
  • def self.new(filename : Path | String, mode = "r", perm = DEFAULT_CREATE_PERMISSIONS, encoding = nil, invalid = nil)

Once again, neither overload subsumes the other because the second argument is required in one overload and optional in the other. This time the private overload has higher priority.

Perhaps we should do the arity checks after the pairwise type restriction checks, or we could order public defs before private defs (with protected ones in between). I'll return to this tomorrow...

@HertzDevil (Contributor)

We could break the logic down into two kinds of subsumption here:

  • Parameter subsumption: The parameters of two overloads are compared pair-wise, and their type restrictions determine whether the parameter in one overload could take a wider set of values than the corresponding parameter in the other overload. (In an ideal world this would also take free variable substitutability into account.) Then a is subsumed by b if a's parameters, when substituted into b in a canonical manner (i.e. not passing positional parameters as named arguments), satisfy b's parameter type restrictions.
  • Argument subsumption: Two overloads that accept compatible arguments differ in how arguments are matched to parameters; the more specific an overload's parameters are, the fewer combinations of arguments it could take. This is where we prefer required parameters over named ones, over single splats; and prefer required named parameters over double splats.

The two kinds of rules cause a conflict in cases like:

def foo(x : IO, y = nil); end # a
def foo(x     , y      ); end # b

a is more specialized than b because x : IO is subsumed by x and not the other way round, yet b is more specialized than a because y is more specific than y = nil. I think we do want to assign an overload order here (otherwise a lot of code involving default restriction-less parameters and different overload arities will break).

The examples in the standard library show that we cannot fix argument subsumption in isolation (the OP's issue) without considering parameter subsumption. The current restriction logic interleaves the two kinds of rules, but I think we should completely prioritize parameter rules over argument rules. That is,

  • First compare all the corresponding type restrictions.
  • Only if the type restrictions cannot determine overload order does the compiler check for presence of default parameter values and splats and so on. (OP's defs will not be considered redefinitions as long as this part includes the minimal fix in my previous comment.)

@HertzDevil (Contributor)

I have a work-in-progress fix in this branch, and some specification to go with it. Essentially, if parameter restrictions cannot show that either def is stricter than the other, the compiler would arrange signatures of positional parameters in the following order:

# ...
def foo(x0, x1); end
def foo(x0, x1, x2 = 0); end
def foo(x0, x1, x2 = 0, x3 = 0); end
# ...
def foo(x0, x1, x2 = 0, x3 = 0, *xs); end
def foo(x0, x1, x2 = 0, *xs); end
def foo(x0, x1, *xs); end
def foo(x0); end
def foo(x0, x1 = 0); end
def foo(x0, x1 = 0, x2 = 0); end
# ...
def foo(x0, x1 = 0, x2 = 0, *xs); end
def foo(x0, x1 = 0, *xs); end
def foo(x0, *xs); end
def foo; end
def foo(x0 = 0); end
def foo(x0 = 0, x1 = 0); end
# ...
def foo(x0 = 0, x1 = 0, *xs); end
def foo(x0 = 0, *xs); end
def foo(*xs); end # least specific overload

and signatures of named parameters in the following order:

def foo(*, n); end
def foo(*, n, **ns); end
def foo; end
def foo(*, n = 0); end
def foo(*, n = 0, **ns); end
def foo(**ns); end

Then it combines the signatures from the positional parameters and each of the named parameters to determine which def comes first. If two signatures are different, they will never be considered redefinitions of each other.
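The positional ordering listed above can be captured by a simple sort key; the following Python sketch is my own reading of that listing, not the branch's actual code:

```python
# Each positional signature modeled as (required, optional, has_splat).
def specificity_key(sig):
    required, optional, has_splat = sig
    # More required params first; non-splat before splat at equal required arity;
    # among non-splat defs, fewer optionals first; among splat defs, more optionals first.
    return (-required, has_splat, -optional if has_splat else optional)

sigs = [
    (2, 0, False), (2, 1, False), (2, 2, False),  # foo(x0, x1) .. foo(x0, x1, x2 = 0, x3 = 0)
    (2, 2, True), (2, 1, True), (2, 0, True),     # .. foo(x0, x1, *xs)
    (1, 0, False), (1, 1, False), (1, 2, False),
    (1, 2, True), (1, 1, True), (1, 0, True),
    (0, 0, False), (0, 1, False), (0, 2, False),
    (0, 2, True), (0, 1, True), (0, 0, True),     # foo(*xs) is least specific
]

# The listing above is already in specificity order under this key.
print(sorted(sigs, key=specificity_key) == sigs)  # True
```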

The fix would probably be considered a breaking change due to reasons mentioned in #10594.

@HertzDevil (Contributor)

Macros are never reordered, but the compiler still detects redefinitions and overwrites old ones, albeit using different checks and not supporting previous_def, so they are subject to the same issue:

macro foo(x = nil)
end

macro foo(x)
end

foo # Error: wrong number of arguments for macro 'foo' (given 0, expected 1)

It is very easy to adopt the same algorithm for macros by simply skipping the checks for restrictions (because there are none, see #596) and for blocks (because all macros implicitly allow them even without a block parameter).
