RFC: Support __len metamethod for tables and rawlen function #536

Merged
merged 4 commits into from
Jun 28, 2022

@zeux (Collaborator) commented Jun 13, 2022

@zeux added the rfc (Language change proposal) label Jun 13, 2022
@@ -20,7 +20,7 @@ Supporting `__len` would make it possible to implement a custom integer based co

## Design

-`#v` will call `__len` metamethod if the object is a table and the metamethod exists; the result of the metamethod will be returned.
+`#v` will call `__len` metamethod if the object is a table and the metamethod exists; the result of the metamethod will be returned if it's a number (an error will be raised otherwise).
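
The rule in this hunk can be modelled in plain Lua as a rough sketch. `checked_len` is a hypothetical helper, not part of Luau or of this RFC's implementation, and the error text is made up for illustration. It assumes a runtime that provides `rawlen` (Lua 5.2+, or a Luau build with this RFC implemented).

```lua
-- Hypothetical model of the proposed `#v` semantics for tables.
-- Assumes `rawlen` is available (Lua 5.2+, or Luau with this RFC implemented).
local function checked_len(v)
    local mt = getmetatable(v)
    local handler = nil
    if type(mt) == "table" then
        handler = rawget(mt, "__len")
    end
    if handler == nil then
        return rawlen(v) -- no metamethod: primitive length
    end
    local result = handler(v)
    if type(result) ~= "number" then
        -- Illustrative message; the real runtime error text may differ.
        error("__len must return a number", 2)
    end
    return result
end

local stack = setmetatable({ count = 3 }, {
    __len = function(self) return self.count end,
})
local broken = setmetatable({}, {
    __len = function() return "three" end,
})

print(checked_len(stack))          --> 3
print(rawlen(stack))               --> 0 (no array elements; __len is bypassed)
print(pcall(checked_len, broken))  -- false plus an error message
```

The caller of `checked_len` either gets a number or an error, which is the guarantee the revised wording adds.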

This seems a bit inconsistent with other metamethods.

__concat, __unm, and all the binary/math operator metamethods don't seem to do any checks.
__eq, __le and __lt silently convert the return value to a boolean, seemingly via truthiness rules.

On the other side of the argument, __tostring does enforce a string return.

Collaborator Author

I don't think __unm/__concat are comparable:

__unm doesn't have to return a number; in fact, it should likely return a value of the same type as its argument for arithmetic consistency.
__concat is likely to be used in contexts where "concatenation" should similarly preserve the source type instead of manufacturing a string; for example, container types may override .. to mean "container concatenation", e.g. returning a new container that contains elements from both sides.

I agree about __eq etc.: ideally they would be restricted to booleans, but that change isn't strictly necessary. Every type is implicitly convertible to a boolean, so the caller already has a guarantee that a == b always evaluates to a boolean.

However, not every type can be converted to a number, so # either has to fail on non-numbers or return a value of arbitrary type to the caller. The latter is worse from a type safety perspective: today # can only return a number, and it's not clear why # should be able to return a value of another type, so it seems beneficial to keep it that way.

Essentially, the general rule would be: if it makes sense semantically to return an arbitrary type (like __unm/__concat), we should not restrict the returned type; if there's realistically only one semantically meaningful return type, it's better to guarantee to the caller that they get either a result of that type or an error.
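
As an illustration of that distinction, here is a small sketch in which `__concat` returns a new container and `__eq` returns a truthy non-boolean. `List` and `new` are hypothetical names invented for this example; nothing here comes from the RFC itself.

```lua
-- Illustrative container type; `List` and `new` are hypothetical names.
local List = {}
List.__index = List

local function new(items)
    return setmetatable(items, List)
end

-- __concat returns a new container rather than a string.
List.__concat = function(a, b)
    local out = {}
    for _, v in ipairs(a) do out[#out + 1] = v end
    for _, v in ipairs(b) do out[#out + 1] = v end
    return new(out)
end

-- __eq returns a truthy non-boolean; `==` still yields a boolean.
List.__eq = function(a, b)
    return "equal"
end

local joined = new({ 1, 2 }) .. new({ 3 })
local same = new({}) == new({})

print(getmetatable(joined) == List) --> true
print(#joined)                      --> 3
print(same)                         --> true
```

The result of `..` keeps the container type, while the result of `==` is coerced to a boolean, so neither needs a runtime type check to keep the caller's expectations intact.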

@Blockzez Blockzez Jun 14, 2022

When # invokes __len, would it cast the result to an unsigned 8/16/32/64/128/etc.-bit integer (if possible), akin to some standard library functions (e.g. math.random for signed and bit32 for unsigned)? Or would it allow every value of the VM 'number' data type (most commonly IEC 559 binary64), so that instead of just nonnegative ℤ the result could include ℝ (actually a subset of ℚ with a denominator of 2^k), ±Infinity, and ±NaNQ(x)/±NaNS(x)/Ind?
I believe that # currently only returns nonnegative integers, so it might as well cast the result to a nonnegative integer data type.

Collaborator Author

The path of least resistance / maximum performance would be to preserve the number as is (which would allow the full range of IEEE values). I'm not sure truncating the number silently is very useful, and we don't really have a precedent of erroring on inexact conversions I believe.
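
A short sketch of what "preserve the number as is" means in practice. It assumes a runtime where `#` consults `__len` on tables (Lua 5.2+, or Luau with this RFC implemented); the variable names are made up for the example.

```lua
-- Assumes a runtime where `#` consults __len on tables
-- (Lua 5.2+, or Luau with this RFC implemented).
local fractional = setmetatable({}, { __len = function() return 2.5 end })
local unbounded = setmetatable({}, { __len = function() return math.huge end })
local negative = setmetatable({}, { __len = function() return -1 end })

print(#fractional)             --> 2.5
print(#unbounded == math.huge) --> true
print(#negative)               --> -1
```

No truncation or range check is applied, so code that relies on `#` being a nonnegative integer has to validate the value itself when it accepts arbitrary tables.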


## Motivation

Lua 5.1 invokes `__len` only on userdata objects, whereas Lua 5.2 extends this to tables. In addition to making `__len` metamethod more uniform and making Luau

@Blockzez Blockzez Jun 13, 2022

I am sure that Lua 5.1 invokes the __len metamethod on everything except tables and strings, not just on userdata, akin to the __add metamethod being invoked on everything except numbers and strings that can be cast to numbers.
Did I miss something in the statement "Lua 5.1 invokes __len only on userdata objects"? #number errors with "attempt to get length of a number value", which is a sign that no raw operation is performed and that the VM falls back to the metamethod, so __len ought to be invocable on numbers (__tostring is the exception to that rule, since it is invoked on strings!).

Should __len be invoked on strings as well? To add to that, should __add be invoked on numbers, and should __concat be invoked on strings? Should we check for the metamethod first and perform the raw operation only afterwards? (Vanilla Lua never had this, though, and it is likely too esoteric a feature to add.) For example:

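-- Hypothetical semantics: the outputs below assume # consults __len on strings
-- before the raw operation, which no current Lua or Luau version does.
local metatable_of_the_strings = getmetatable("") -- shared string metatable

function metatable_of_the_strings.__len(self)
    return tonumber(self) or 0
end
print(#"1234") --> 1234
print(#"42") --> 42
print(#"hello world") --> 0
print(rawlen("1234")) --> 4
print(rawlen("42")) --> 2
print(rawlen("hello world")) --> 11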

Collaborator Author

While you're technically correct that a host program, or debug.setmetatable, can set metatables on other object types, Luau doesn't support debug.setmetatable, so this is written assuming a user-space program that doesn't use the debug API.
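
For reference, the stock-Lua behaviour being discussed can be sketched as follows. The branch is guarded because Luau does not provide `debug.setmetatable`; the variable `observed` is only there for the example.

```lua
-- Stock Lua exposes debug.setmetatable; Luau does not, so the branch is guarded.
local observed = nil
if debug and debug.setmetatable then
    -- All numbers share a single metatable in stock Lua.
    debug.setmetatable(0, { __len = function(n) return n end })
    local n = 7
    observed = #n               -- no primitive length for numbers, so __len runs
    debug.setmetatable(0, nil)  -- remove the shared metatable again
end
print(observed) -- 7 in stock Lua, nil where debug.setmetatable is unavailable
```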

@zeux changed the title from "Support __len metamethod for tables and rawlen function" to "RFC: Support __len metamethod for tables and rawlen function" Jun 17, 2022
@zeux merged commit fd82e92 into master Jun 28, 2022
@zeux deleted the zeux-len branch June 28, 2022 16:07
AllanJeremy pushed a commit to AllanJeremy/luau that referenced this pull request Jun 30, 2022