Skip to content

Conversation

@sabiwara
Copy link
Contributor

@sabiwara sabiwara commented Sep 21, 2023

Edit: this PR approach has changed following discussions and is now closer to #12950.

====

This PR changes how we translate signed literals in erlang within matches.

If my understanding is correct, according to https://www.erlang.org/patches/otp-26.1#incompatibilities, +0.0 and 0.0 will have a different signification in matches going forward, so I thought it would be better to emit more precise erlang code.

Before (in matches), +0.0 in Elixir would become 0.0 in Erlang:

{:float, 1, 0.0}

After this PR, we keep +0.0:

{:op, 1, :+, {:float, 1, 0.0}}

This is consistent with what we already emit outside of matches, and with what the erlang parser emits:

iex> {:ok, tokens, _} = :erl_scan.string '+0.0 = X.'
{:ok, [{:+, 1}, {:float, 1, 0.0}, {:=, 1}, {:var, 1, :X}, {:dot, 1}], 1}
iex> :erl_parse.parse_exprs tokens
{:ok, [{:match, 1, {:op, 1, :+, {:float, 1, 0.0}}, {:var, 1, :X}}]}

@josevalim
Copy link
Member

I have a PR for this here: 67a4acf - it seems you did additional things I should have considered on the module checker. :)

However, I chose to not merge it because the difference in semantics will only happen on Erlang/OTP 27. But I guess the sooner the best?

@josevalim
Copy link
Member

Ah, I see, my changes did not require changes to module checker.

@sabiwara I choose to keep the numbers as -0.0 and 0.0 in the Elixir pass because anyone can do: unquote(-0.0). So we cannot rely on the unary +/- being "visible". It is best to always encode them as floats and then specially emit positive or negative floats to the Erlang compiler.

@josevalim
Copy link
Member

I sent my previous work as a PR here: #12950. :) Sorry for the duplicate work!

@sabiwara
Copy link
Contributor Author

Right, I totally forgot about unquote...

Maybe it doesn't have to be a deal breaker, since floats are still valid AST:

  • you can still unquote(-0.0) if it's OK for your use case
  • if you need the +0.0 distinction for a match, you can create the AST for it:
ast = quote(do: +0.0)
...
unquote(ast)

Sorry for the duplicate work!

Not at all, it was an interesting problem to think about. Thanks for pushing, will check it tomorrow!

@josevalim
Copy link
Member

if you need the +0.0 distinction for a match, you can create the AST for it:

Yeah, the issue is that -0.0 wouldn't be enough, they would need to check if positive or negative and then generate the quote, which pushes too much complexity upstream. That's why I went with checking in the Erlang pass. :)

@sabiwara
Copy link
Contributor Author

Yeah, the issue is that -0.0 wouldn't be enough, they would need to check if positive or negative and then generate the quote, which pushes too much complexity upstream. That's why I went with checking in the Erlang pass. :)

Actually I think the issue also exists on your branch? I'm getting:

iex(1)> q = quote do: unquote(-0.0) = +0.0
{:=, [],
 [-0.0, {:+, [context: Elixir, imports: [{1, Kernel}, {2, Kernel}]], [0.0]}]}
iex(2)> Code.eval_quoted q
warning: pattern matching on 0.0 is equivalent to matching only on +0.0 from Erlang/OTP 27+. Instead you must match on +0.0 or -0.0
└─ nofile

{0.0, []}

But also, I'm not sure how big an issue it is:

  • a lot of ast injected in unquote comes from a quote somewhere else I think? (for macros)
  • (sorry probably a half baked thought) we could maybe provide some helper like Macro.escape_float which would add signs when required for the kind of macros that need it and actually face the warning?

@josevalim
Copy link
Member

Ah, that’s true, both fail. So I wonder why I went with my approach? Because I initially started with the same patch as you but it didn’t work for a reason…

@josevalim
Copy link
Member

Maybe my changes were due to warnings? We want to print warnings but we don’t want Erlang to warn, so we need to always rewrite the floats on the Erlang side, so I moved all handling up?

@sabiwara
Copy link
Contributor Author

sabiwara commented Sep 21, 2023

Yeah I planned to add the warning as a follow-up but haven't started yet. So I'm a step behind you on this one.
My thinking was: I can't ask the user to replace 0.0 by +0.0 if they end up as 0.0 anyway, so I started by making the generated erlang AST consistent with the original elixir AST.

@josevalim
Copy link
Member

In order for you to not warn on the Erlang side, you also need to add the same code I did that converts the floats 0.0 or -0.0 into an operator call, as the literals numbers may appear in the AST anyway.

In other words, we both need to handle 0.0 and -0.0 in both literal and as unary call formats. Therefore, we might as well normalize them into a single format and I think the literal number format is more concise anyway?

@sabiwara
Copy link
Contributor Author

sabiwara commented Sep 21, 2023

In order for you to not warn on the Erlang side, you also need to add the same code I did that converts the floats 0.0 or -0.0 into an operator call, as the literals numbers may appear in the AST anyway.

I will try to add the warning and see how it goes 👍

In other words, we both need to handle 0.0 and -0.0 in both literal and as unary call formats. Therefore, we might as well normalize them into a single format and I think the literal number format is more concise anyway?

I agree about the conciseness part, which made me hesitant about my proposal. But since this would only affect matches and would make it consistent with we already do for expressions, I thought it might be OK.
On the other hand, I was worried that handling 0.0 as a special case may introduce more unpredictable AST, make it harder to reason about and potentially lead to more edge cases / bugs.

0.0 specific fix Propagate original signs to erlang
0.0 = 0.0 {:match, _, {:op, _, :+, {:float, _, 0.0}}, {:op, _, :+, {:float, _, 0.0}}} {:match, _, {:float, _, 0.0}, {:float, _, 0.0}}
+0.0 = +0.0 {:match, _, {:op, _, :+, {:float, _, 0.0}}, {:op, _, :+, {:op, _, :+, {:float, _, 0.0}}}} {:match, _, {:op, _, :+, {:float, _, 0.0}}, {:op, _, :+, {:float, _, 0.0}}}
+1.0 = +1.0 {:match, _, {:float, _, 1.0}, {:op, _, :+, {:float, _, 1.0}}} {:match, _, {:op, _, :+, {:float, _, 1.0}}, {:op, _, :+, {:float, _, 1.0}}}

(mind the double +: +0.0 becomes +(+0.0) 😉)

@sabiwara
Copy link
Contributor Author

I pushed 49efd36 which added the same warning as you as a first step.
Doesn't address the unquote thing either, but at least we can compare apples to apples :)

Comment on lines +97 to +100
assert {{:ok, :float}, [diagnostic]} = quoted_pattern_with_diagnostics(0.0)

assert diagnostic.message =~
"pattern matching on 0.0 is equivalent to matching only on +0.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggled way too much to disable this warning 😅
It is also on #12950 I believe

@sabiwara
Copy link
Contributor Author

sabiwara commented Sep 22, 2023

OK, that wouldn't fix the unquote thing totally, but, what if Macro.escape/1 would escape +/-0.0 as {:+, _, [0.0]}?
And invite people to escape their floats in macros like we do for almost everything?

The warning could also mention "if this code was generated from a macro, please make sure to use Macro.escape/1 on float values", this way the coder or library author has a good pointer to address it.


expand(Zero, S, #{context := match} = E) when is_float(Zero), Zero == 0.0 ->
elixir_errors:file_warn([], E, ?MODULE, invalid_match_on_zero_float),
Escaped = {{'.', [], [erlang, '+']}, [], [0.0]},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this consider if we have -0.0 or 0.0 accordingly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this current implementation makes it so +0.0 repeats the plus operator twice in the Erlang AST, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to address these, I think you will pretty much need to add the same code I did on elixir_erl and expand_remote_arg. Which basically means our code is ultimately the same but moved to different places.

With that said, I would prefer to keep the AST smaller (a single float) than bigger. So, if you agree with the assessment above, let's merge my PR and then you push your pattern tests fixes?

Copy link
Contributor Author

@sabiwara sabiwara Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this consider if we have -0.0 or 0.0 accordingly?

Yes, definitely 👍

Also, this current implementation makes it so +0.0 repeats the plus operator twice in the Erlang AST, right?

As far as I checked, it doesn't, this piece only gets called for 0.0, not +0.0. But #12950 does repeat it: +0.0 becomes {:op, 4, :+, {:op, 4, :+, {:float, 4, 0.0}}} outside of matches, and -0.0 becomes -(+0.0).

With that said, I would prefer to keep the AST smaller (a single float) than bigger.

I understand, but in this case, shouldn't we do it for non matches too? Right now, only matches are smaller (which is a bit ironic given that it is the only place we would need the sign info 😉).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but in this case, shouldn't we do it for non matches too?

I am fine with always doing it. :) Can you please amend this PR then so we always convert to -0.0 or +0.0 as literals instead of AST and we build the AST back in the Erlang pass? I will close mine. Awesome job!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I had in mind. However, I am slightly confused with something: given we rewrite +0.0 AST into +0.0 literal, won't we always warn? Even when def foo(+0.0) is written? Since +0.0 (AST) will expand to -0.0 (literal)?

handle 0.0 in Macro.escape/1 (necessary to fix the bug below)

Be careful with this. If +0.0 (literal) becomes +0.0 (AST), then expanding +0.0 (AST) cannot become +0.0 (literal), otherwise you may introduce an infinite loop. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle 0.0 in Macro.escape/1 (necessary to fix the bug below)

Also, can we say from the code that the user intends to match only on +0.0/-0.0 or both? If you only have a value, you don't know the original context. Maybe the code was @x some_expr_that_returns_float(). So I don't think we should change Macro.escape yet!

Copy link
Contributor Author

@sabiwara sabiwara Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I had in mind. However, I am slightly confused with something: given we rewrite +0.0 AST into +0.0 literal, won't we always warn?

We don't since we go in the rewrite clause instead when we expand {:., _, [Kernel, :+]}, _, [0.0]}, calling expand_remote(:erlang, _, :+, _, false, [0.0], ...) (here), we never expand 0.0.
There is no warning when running tests either (except the one I captured).

Even when def foo(+0.0) is written?

Yup, it's fine as well 👍

Screenshot 2023-09-22 at 23 36 35

Be careful with this. If +0.0 (literal) becomes +0.0 (AST), then expanding +0.0 (AST) cannot become +0.0 (literal), otherwise you may introduce an infinite loop. :)

Great point, I thought about a risk of infinite loops but maybe it's more dangerous than I thought, I'll be extra careful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can we say from the code that the user intends to match only on +0.0/-0.0 or both? If you only have a value, you don't know the original context. Maybe the code was @x some_expr_that_returns_float(). So I don't think we should change Macro.escape yet!

Right, good point... Even though, it might still make sense:

  • in your example, if some_expr_that_returns_float returns 0.0 which is +0.0, and we match on +0.0, this is not necessarily a bug but could be arguably the right thing to do?
  • the difference is only if we emit the warning or not, but the behavior is otherwise this one anyway
  • I think the warning makes sense when users try to match on "0.0, the syntax", but not on "0.0, the value" (which is +0.0). x = some_expr_that_returns_float(); ^x = 0.0 won't warn either and will suffer the same issue, right?
  • without escaping, users couldn't write an attr containing 0.0 like @x [+0.0, ...] and use it on a match

Thinking in terms of prop-based testing, escaping would be required in order to respect the unquote(Macro.escape(ast)) = unquote(Macro.escape(ast)) "invariant" which I think holds true for all escapable values? (I might be wrong)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in your example, if some_expr_that_returns_float returns 0.0 which is +0.0, and we match on +0.0, this is not necessarily a bug but could be arguably the right thing to do?

The issue is that the code today matches both, so they need to decide if the intent is to match one or both. The whole issue is the ambiguity, there is no right answer unless we know if the value originally had a sign or not.

@sabiwara sabiwara changed the title Proposal: Translate signed literals with sign in matches Warn when matching on 0.0 and generate erl AST for +/-0.0 Sep 22, 2023
@sabiwara sabiwara merged commit 9c0fa3c into elixir-lang:main Sep 22, 2023
@sabiwara sabiwara deleted the erl-literal-sign branch September 22, 2023 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants