New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse integer division operator #6470

Merged
merged 4 commits into from Aug 4, 2018

Conversation

@bcardiff
Member

bcardiff commented Jul 30, 2018

In the path to implement #2968 first the operators needs to be allowed.

This PR allows // and &// operators to be defined.
The behavior will need to wait to the next release.
// will raise on overflow and &// won't. Similar to + vs &+.

// will continue to mean empty regex when possible:

  • foo // bar is equivalent to foo.//(bar)
  • foo // is a syntax error (from now on)
  • foo(//) is a call passing empty regex

The ASTNode#to_s will use %r() as empty regex always since // will not always work.

There are some refactors on specs and code for allowing operators to be used as methods and macro names introducing a state ivar @wants_def_or_macro_name that works similar to @slash_is_regex configuring the behavior of the lexer depending on the context.

@bcardiff bcardiff added this to the 0.26.0 milestone Jul 30, 2018

@straight-shoota

This comment has been minimized.

Contributor

straight-shoota commented Jul 30, 2018

// will be allowed to overflow and &// won't.

Isn't it the other way around?

@bcardiff

This comment has been minimized.

Member

bcardiff commented Jul 30, 2018

Yes @straight-shoota I mean to overflow as "raise overflow". &// won't raise. I've just edited for clarity.

@RX14

This comment has been minimized.

Member

RX14 commented Jul 31, 2018

&//=

This reminds me of perl. I don't like it.

@RX14

This comment has been minimized.

Member

RX14 commented Jul 31, 2018

I really don't like all these new operators being proposed. Wrapped(T) will work fine, and integer division works fine as-is. Why change?

I really don't think we should introduce all these operators, they're just more things for a newcomer to learn to save a few characters typing (.to_f) in a fairly rare situation. I think it's a bad tradeoff.

@bcardiff

This comment has been minimized.

Member

bcardiff commented Jul 31, 2018

I would expect that from all the operators added in this PR // will be at far the most used. The rest are for consistency. Implementing only // would leave things unfinished in this topic.

The division operator / can be improved, this has been discussed already in #2968 and #6223. The main issue is semantic one. / performs different operators depending on the numeric type. The fact that we are used to an overload behaviour of / does not mean it can't be improved.

Adding .to_i/.to_f to force one or another will make the code use more (or less) precision that it should because .to_i/.to_f has fixed output type.

The wrapper api, can be implemented on top of this. But not the other way around. So operators are more powerful and expressive.

For wrapper to work well with overloads you will need Wrapper(T) < T which is something that is not supported. Otherwise the type is used only locally without passing them around.

The discussion left (hence this PR) is not about if the language needs to add this, but if the implementation is good enough.

@RX14

This comment has been minimized.

Member

RX14 commented Jul 31, 2018

Wrapping(T) is a discussion for another issue, but i'm yet to see a place where numbers with wrapping arithmetic performed on them are actually passed around significantly outside of a single class. I don't see a reason why ease of use of unchecked arithmetic is a big deal because it's just so rare. Adding a whole set of new operators just for something so rarely used seems to be like a gross premature optimization and a bad idea.

Back to this issue, I'm not sure what your concern is about .to_i/.to_f, because the current method clearly works, because it's the current method. I haven't seen any friction in that area and I've been on IRC helping newbies for a long time. I can't see the motivation.

I'm really against adding these operators because what we have now just isn't broken, so lets not fix it. Change for the sake of change is a bad thing.

@RX14

This comment has been minimized.

Member

RX14 commented Jul 31, 2018

I just want data that this is actually a problem in the real world, and not a solution to a problem that we've invented in our heads.

@bcardiff

This comment has been minimized.

Member

bcardiff commented Jul 31, 2018

Back to this issue, I'm not sure what your concern is about .to_i/.to_f, because the current method clearly works, because it's the current method.

Let's use an example, if avg method wants to be coded that will work with any kind of numbers you will end up adding a to_f to force non-integer division, because / currently is both (int and non-int). I think that that already needs to be improved because it can be more clear.

The to_f will make the result of the operation Float64, but if the input numbers are Float32 then why the operation needs to be carried on 64 bits? Only because the lack / to express the floating point division in the most appropriate type given the operands.

Again, the fact that we are used to C like math operators does not make it the best option.

@RX14

This comment has been minimized.

Member

RX14 commented Jul 31, 2018

@bcardiff I think issues like that are fairly rare, and val = val.to_f unless val.is_a? Float is a very short snippet for the small amount of methods that operate on generic Numbers.

@bcardiff

This comment has been minimized.

Member

bcardiff commented Jul 31, 2018

That snippet is a workaround and harder to come up when learning the language. Having different operators is cleaner. Something new to learn today, yes. But is a more consistent set of operators.

Try to explain that line when coding a method that manipulate numbers vs explain that / is floating point division and // is floor division always.

  1. Splitting / and // was already agreed and pros/cons were exposed.
  2. Leaving // to raise upon overflow and avoid adding &// seems incomplete.
  3. Excluding //= and &//= when /= is already supported seems incomplete also.

I could understand removing 3 from the PR, if and only if /= is also removed. But it seems arbitrary since +=, &+=, etc are allowed. So they are implemented.

@RX14

This comment has been minimized.

Member

RX14 commented Aug 1, 2018

@bcardiff it's not just something new to learn today, it's something new that every newcomer from now until forever will have to learn.

At the end of the day though, I'm more concerned about the overflow stuff adding a bazillion more operators than this PR only adding two.

@asterite

This comment has been minimized.

Contributor

asterite commented Aug 1, 2018

  • I think having / and // operators is more intuitive than just / (many people find it strange that 1/2 == 0)
  • The & operators will be used in very specific circumstances, so this doesn't add a lot of complexity. Much less with division.
  • I think the decision about introducing this // operator is already taken. Now it's time to review the code.
when StringInterpolation
visit_interpolation(exp) { |s| Regex.append_source s, @str }
if (exp = node.value).is_a?(StringLiteral) && exp.value.empty?
# // is not allways an empty regex, sometimes is an operator

This comment has been minimized.

@r00ster91

r00ster91 Aug 1, 2018

Contributor

Typo: allways -> always

@j8r

Generally there is no point of using case statements for only one single condition vs if.

There is strictly no performance difference between the two. However using a case adds one more line to the code because the condition is splitted in two lines, which is IMO less clear than on one line with if

it "lexes #{string.inspect}" do
lexer = Lexer.new string
unless (v = slash_is_regex).nil?

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

Replace simply by if v = slash_is_regex

This comment has been minimized.

@straight-shoota

straight-shoota Aug 1, 2018

Contributor

This is not the same. slash_is_regex is Bool? so you need to test for .nil?

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

Ok 👍 - the language evaluate if the expression is true and not nil, and we just want to know if it isn't nil.

This comment has been minimized.

@bcardiff

bcardiff Aug 1, 2018

Member

That would be abusing that the default value of slash_is_regex is false.

@@ -287,8 +288,17 @@ module Crystal
line = @line_number
column = @column_number
char = next_char
if !@slash_is_regex && char == '='
if (@wants_def_or_macro_name || !@slash_is_regex) && char == '/'
case next_char

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

Can be replaced by a smaller if statement

This comment has been minimized.

@bcardiff

bcardiff Aug 1, 2018

Member

Using the case is more about following the pattern of the rest of the lexer.
But I don't follow which is the proposed smaller if.

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

This is fine - this just makes the lexer looks a bit more complex that it is, that's it :)
Here it would be

if next_char == '/'
  next_char :"//="
else
  @token.type = :"//"
end

This comment has been minimized.

@bcardiff

bcardiff Aug 1, 2018

Member

I thought you ment the if (@wants_def_or_macro_name ....
There are some if in the lexer, bust mostly case even single when-ed (?) case. I found them easier to read the lexer if all the structures are the same and not case by case based that we will end up having if your suggestions are applied.

At there is no real change the the actual code.

So case are kept. But thanks.

when '/'
case next_char
when '/'
case next_char

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

ditto

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

With ifs it would be (start at line 648, replacing the 2 cases):

raise "&/ is not a valid operator. Did you mean &// ?" if next_char != '/'
if next_char == '='
  next_char :"&//="
else
  @token.type = :"&//"
end

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

At a note, the previous code was:

case next_char
when '/'
  case next_char
  when '='
    next_char :"&//="
  else
    @token.type = :"&//"
  end
else
  raise "&/ is not a valid operator. Did you mean &// ?"
end
@@ -470,6 +489,13 @@ module Crystal
else
symbol "&*"
end
when '/'
case next_char

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

ditto

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

And here it would be:

if next_char == '/'
  next_char_and_symbol "&//"
else
  symbol "&/"
end
@@ -412,7 +426,12 @@ module Crystal
symbol "*"
end
when '/'
next_char_and_symbol "/"
case next_char

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

ditto

This comment has been minimized.

@bcardiff

bcardiff Aug 1, 2018

Member

ditto ditto :-)

This comment has been minimized.

@j8r

j8r Aug 1, 2018

Contributor

It would be here

if next_char == '/'
  next_char_and_symbol "//"
else
  symbol "/"
end
@ysbaddaden

Let's merge!

I regularly fumble on this in Ruby or put to_f in too many places because I can't trust / to rrturn a float. This is inconsistent and I'm happy we finally fix this in Crystal, with a minimal change for floor division (just add a /).

@RX14 points were already made/refuted in related issues.

@RX14

This comment has been minimized.

Member

RX14 commented Aug 1, 2018

I forgot we'd already merged all the &+ overflow operators. Since we've already - in my opinion - gone overboard with the number of operators in the language you might as well go and merge this PR.

@oprypin

This comment has been minimized.

Contributor

oprypin commented Aug 1, 2018

I would prefer &//= to be removed, along with &+= &-= &*=. If someone wants to be this specific, they can write a = a &+ 1 rather than a &+= 1.

And if anyone else is thinking: how can integer division overflow? 🤔
@RX14 had to explain to me that it happens (only?) with Int32::MIN / -1

@bcardiff bcardiff force-pushed the bcardiff:feature/parse-int-div-ops branch from def342d to a85334f Aug 1, 2018

@bcardiff

This comment has been minimized.

Member

bcardiff commented Aug 1, 2018

I fixed the typo and rebased. :shipit: !

@oprypin I will still raise the card of completeness regarding the &+= and others. And they will be actually used in Hasher and others. All the op= are syntax sugar and not a core part of the language.

@asterite

This comment has been minimized.

Contributor

asterite commented Aug 1, 2018

By the way: https://stackoverflow.com/questions/30378591/overflow-division-operator-not-recognized

I would recommend not having the &/ operator in Crystal, just like in Swift. It's an extra check for something that can rarely happen.

@RX14

This comment has been minimized.

Member

RX14 commented Aug 1, 2018

completeness for completeness's sake seems like a very bad idea to me. It just expands the set of operators.

I'd be much happier if the only operators we were adding were &+, &-, &*, // and //=.

@bcardiff

This comment has been minimized.

Member

bcardiff commented Aug 1, 2018

Since division by zero would still be raised in &// is not that that operator is free of exception.
Ok, I will remove &// and &//=.

bcardiff added some commits Jul 6, 2018

Add lexer of // //= operators
Refactor specs to assert operators all together
Add // //= operators parsing
Allow // to be used as empty regex in some context only.
Use %r() for empty regex when in to_s of AST.
Reorganize DefOrMacroCheck constants
the position of :"[]?" was off, new line added to compare them easily
Refactor: simplify parsing of def and macro name
DRY logic of allowed operators names for def and macro

@bcardiff bcardiff force-pushed the bcardiff:feature/parse-int-div-ops branch from a85334f to 5cb2886 Aug 2, 2018

@bcardiff bcardiff changed the title from Parse integer division operators to Parse integer division operator Aug 2, 2018

@sdogruyol

Great work 💯 Thank you @bcardiff 👍

@bcardiff bcardiff merged commit db838c7 into crystal-lang:master Aug 4, 2018

4 checks passed

ci/circleci: test_darwin Your tests passed on CircleCI!
Details
ci/circleci: test_linux Your tests passed on CircleCI!
Details
ci/circleci: test_linux32 Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@RX14 RX14 added the topic:compiler label Aug 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment