Support number types #816

Open
xiaq opened this issue Apr 7, 2019 · 7 comments

@xiaq xiaq commented Apr 7, 2019

Numbers in Elvish, like 42, -4, 3.14159, 0xcafe, 1e100, Inf, etc., are just strings. This is a compromise: the basic idea is to support writing filenames and flags such as 42 and -4 without requiring quotes, and without making them "sometimes strings, sometimes numbers". So, to avoid sloppiness, they are always strings. As a result, the math functions in Elvish all expect strings; those strings just need to contain well-formed numbers.

This approach actually works pretty well in Elvish, and causes few problems or surprises, as long as the programmer stays within Elvish land.

However, there is one area where this approach falls short: interoperability with serialization formats that do have number types, which is to say, most of them. As a result, it is not possible to convert JSON to an Elvish value and back to JSON and get exactly the same thing, because JSON numbers are converted to Elvish strings. A simple example:

~> echo '[1,42]' | from-json | to-json
["1","42"]
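
The information loss can be modeled outside Elvish. The following is a minimal Python sketch, not Elvish's implementation: `to_stringly` is a hypothetical helper that mimics "every number becomes a string", and the two round-trip functions compare that behavior with a round trip that keeps number types.

```python
import json


def to_stringly(value):
    """Hypothetical model of the current behavior: JSON numbers become strings."""
    if isinstance(value, bool) or value is None:
        return value  # booleans and null are not numbers; leave them alone
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, list):
        return [to_stringly(v) for v in value]
    if isinstance(value, dict):
        return {k: to_stringly(v) for k, v in value.items()}
    return value


def round_trip_stringly(text):
    """Decode JSON, turn numbers into strings, encode again."""
    return json.dumps(to_stringly(json.loads(text)), separators=(",", ":"))


def round_trip_typed(text):
    """Decode and re-encode JSON while keeping number types."""
    return json.dumps(json.loads(text), separators=(",", ":"))


print(round_trip_stringly("[1,42]"))  # ["1","42"]
print(round_trip_typed("[1,42]"))     # [1,42]
```

Only the second function is an identity on this input, which is the property the proposal is after.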

To solve this problem, the best approach so far would be to introduce explicit number types for the sake of interoperability. Let's say that we introduce a number builtin for constructing such explicitly typed numbers. Now the JSON array [1,42] will be converted into the Elvish list [(number 1) (number 42)], and it will be converted back to the JSON array [1,42] - no information loss.

Because it is common to manipulate JSON data and then output it back, all the builtin math functions should also output explicitly typed numbers when at least one of their arguments is an explicitly typed number. Yet, to retain the old behavior, when all of the arguments are strings, the output should still be a string. Examples:

~> + 1 2 # all arguments are strings
▶ 3
~> + (number 1) 2
▶ (number 3)
~> echo '[1,42]' | from-json | each [v]{ + $v 1 } | to-json
[2,43]
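
The dispatch rule described above can be sketched in Python. This is only an illustration under stated assumptions: typed numbers are modeled as Python `float` values, untyped numbers as `str`, and the names `add` and `format_number` are hypothetical, not part of Elvish.

```python
def format_number(x):
    """Render a float the way a numeric string would be written (3, not 3.0)."""
    return str(int(x)) if x.is_integer() else str(x)


def add(*args):
    """Hypothetical model of the proposed `+` builtin.

    Strings are parsed as numbers. If every argument is a string, the result
    is a string; if at least one argument is a typed number (a float in this
    sketch), the result is a typed number.
    """
    total = sum((float(a) for a in args), 0.0)
    if all(isinstance(a, str) for a in args):
        return format_number(total)
    return total


print(repr(add("1", "2")))  # '3'  (all arguments are strings)
print(repr(add(1.0, "2")))  # 3.0  (one typed argument, typed result)
```

Code that never constructs a typed number only ever sees strings, which is why the types stay optional under this design.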

It is worth stressing that the use of the explicit number type is entirely optional for code that stays within the "Elvish value universe". This is because all of the math functions will continue to accept string arguments, and output strings when all arguments are strings. One does not even need to know that explicit number types exist.

Another interesting possibility for the future is to introduce not just one number type but several, in order to interoperate with formats that distinguish (for instance) integers and floating-point numbers. All of those types will still be entirely optional, however.

@xiaq xiaq changed the title Support explicit number types Support explicit number types for interoperability with serialization formats Apr 7, 2019

@xiaq xiaq commented Apr 7, 2019

In fact, use of explicitly typed numbers in Elvish code for purposes other than interoperability with serialization formats is not only optional, but should maybe even be discouraged; as such, the relatively clumsy syntax of (number 3) is intentional.


@xiaq xiaq commented Apr 9, 2019

As a start, let's simply borrow Go's type names. For instance, the double-precision floating-point number type is float64, and a number of that type is written like (float64 42).


@VictorLowther VictorLowther commented Apr 9, 2019

I would suggest different semantics: everything stays stringly up front, but behind the scenes every value gets converted to the type most suited to the current operation and stays that way until a different type is needed, since basically every value has a canonical string representation anyway. That is what Tcl wound up having to do to get good performance in its everything-is-a-string world.


@xiaq xiaq commented Apr 9, 2019

This is how Elvish behaves now. It doesn't solve the problem of JSON interoperability.

I have researched Tcl's solution, and the answer is that Tcl does not have a solution: writing JSON from Tcl-native values requires a schema specifying the type of each field.

xiaq added a commit that referenced this issue Apr 9, 2019
Float64 values are printed like (float64 42). However, the float64
builtin does not actually exist yet, and builtin math functions do not
accept float64 arguments either.

However, after this CL, from-json | to-json is now an identity operator.
Previously:

~> echo '[1]' | from-json | to-json
["1"]

Now:

~> echo '[1]' | from-json | to-json
[1]

It is also possible to use from-json to construct float64 values before
the float64 builtin is actually implemented.

@clouds56 clouds56 commented Apr 12, 2019

I'm new to Elvish and have some concerns. Correct me if I'm wrong.

Supported types

  1. Should really large integers be supported? For example, factorial 100 should be
(number 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000)
  2. Should we support complex numbers, e.g. (complex128 (float64 3) (float64 4)) as 3+4i?

My opinion: for 1, support them using big.Int/big.Float; for 2, not for now.

Convert between types

  1. How do numbers convert between types (e.g. from float64 to int32): implicitly or explicitly?
  2. How do numbers convert to strings: implicitly or explicitly?

I prefer implicit conversion between the different number representations (i.e. float64 and int64 in Go, and big.Int and big.Float for large numbers), just like Python. We could automatically choose an appropriate type for the result.

  • + (int32 1) (float64 2.1) would result in (float64 3.1), while + (big.Int 10^100) (big.Int -10^100+1) would result in (int32 1). In this case no overflow would happen (except OOM).
  • But a float-like value would never be automatically or implicitly converted to an integer.
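
Since the comparison is to Python, the behavior being referred to can be shown directly. This sketch only demonstrates Python's own numeric rules; it says nothing about how Elvish would implement them, and the variable names are illustrative.

```python
import math

# Arbitrary-precision integers: 100! is computed exactly, with no overflow.
fact100 = math.factorial(100)

# Integer + float is implicitly promoted to float.
mixed = 1 + 2.1

# Large intermediate values, small exact result.
small = 10**100 + (-(10**100) + 1)

# Float to integer is never implicit; the conversion has to be requested.
truncated = int(3.9)

print(type(fact100).__name__, len(str(fact100)))  # int 158
print(type(mixed).__name__)                       # float
print(type(small).__name__, small)                # int 1
print(truncated)                                  # 3
```

Note that Python has a single integer type, so the "shrink back to int32" step in the example above has no direct equivalent here; that part would be a design decision for Elvish.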

Conversion from and to strings should be explicit.

  • Think about a syntax: {(float64 11.2)} would convert it to the string 11.200000 when it is the only element in the braces.
  • What about the format of the output string? What if we want 1.120000e1?

Strongly typed

Could a function declare that it only accepts number-like objects?

@xiaq xiaq changed the title Support explicit number types for interoperability with serialization formats Support explicit number types Apr 15, 2019

@xiaq xiaq commented Apr 15, 2019

@clouds56 Thanks for your comment. Indeed, the topics you have touched upon are very interesting:

  • In future I'd like to model Elvish's number system after Scheme's numerical tower, including high-precision number types.

  • Braces are a grouping construct and shouldn't change the type.

  • Static typing should be part of a larger system, and should apply to lists, maps, functions in addition to numbers.


@xiaq xiaq commented Apr 18, 2019

After some more thought, I've come up with some design changes and clarifications.

A: Builtin functions that previously output numbers as strings will be outputting typed numbers.

Previously, + 1 2 would output 3 (a string); now it will output (float64 3) (a typed number). Maybe it should output (int 3) instead; the exact type is not yet final.

The reason for this change is as follows. When we restrict ourselves to arithmetic functions, the original design makes a lot of sense, as the user can very conveniently switch between the two output modes by deciding what types the arguments are.

However, there are functions that do not take number arguments, but output numbers. Notably, count. We can make count continue to output strings, but it makes less sense now that Elvish actually has number types.

I think it is worthwhile to reorient a bit and treat typed numbers slightly more seriously. Previously my idea was that "typed numbers are only necessary when dealing with serialization formats, and you can forget about them if you are not dealing with those formats"; now it seems simpler to embrace typed numbers, while still supporting the use of plain strings as numbers as a convenience.

What does not change, though, is that all the arithmetic functions will continue to accept string arguments. You will never have to write + (float64 1) (float64 2) to do a simple addition; + 1 2 will always be valid.

With this change, however, it is important that typed numbers interoperate well with strings. Hence points B and C:

B: Typed numbers only have the clumsy-looking syntax in their representation, not their stringification.

Like Python and some other languages, Elvish values can be converted to a string in 2 ways: representation (resembling Python repr) and stringification (resembling Python str). The (float64 2) syntax seen earlier only applies to the representation, not stringification. When stringified, (float64 2) becomes 2, indistinguishable from the string "2". This means that typed numbers will look quite different when passed to put than when passed to echo or to-string:

~> put (number 2)
▶ (number 2)
~> echo (number 2)
2
~> to-string (number 2)
▶ 2

C: Typed numbers can be concatenated with strings, in which case they stringify implicitly.

Example:

~> put 'Two: '(number 2)
▶ 'Two: 2'
~> echo 'Two: '(number 2)
Two: 2
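
Points B and C map closely onto Python's repr/str split, so they can be sketched there. This is an analogy only: the `Number` class and the `concat` helper are hypothetical stand-ins, not Elvish's implementation.

```python
class Number:
    """Hypothetical stand-in for an Elvish typed number."""

    def __init__(self, value):
        self.value = float(value)

    def _digits(self):
        # Write whole values without a trailing ".0", e.g. 2 rather than 2.0.
        v = self.value
        return str(int(v)) if v.is_integer() else str(v)

    def __repr__(self):
        # Representation: the deliberately clumsy, unambiguous form (point B).
        return f"(number {self._digits()})"

    def __str__(self):
        # Stringification: indistinguishable from the plain string.
        return self._digits()


def concat(*parts):
    """Concatenation stringifies every part implicitly (point C)."""
    return "".join(str(p) for p in parts)


two = Number(2)
print(repr(two))             # (number 2)
print(str(two))              # 2
print(concat("Two: ", two))  # Two: 2
```

The representation keeps the type visible for debugging, while anything that ends up in text output sees only the digits.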
@xiaq xiaq changed the title Support explicit number types Support number types Apr 18, 2019
xiaq added a commit that referenced this issue Apr 26, 2019
This addresses #816.
xiaq added a commit that referenced this issue Apr 27, 2019
xiaq added a commit that referenced this issue Apr 27, 2019
xiaq added a commit that referenced this issue Apr 27, 2019
This addresses #816.
xiaq added a commit that referenced this issue Apr 27, 2019
This addresses #816.
@xiaq xiaq removed the type:enhancement label Oct 18, 2019
@xiaq xiaq added P:Number System and removed A:Language labels Dec 28, 2019