Optimise integer parsing #5859

Merged
merged 4 commits into from Mar 10, 2017

fishcakez
Member

  • Use single binary match context
  • Do minimal pattern matches and maths

do_parse(rest, base, parse_digit(char))
else
:error
digits = [{?0..?9, -?0}, {?A..?Z, 10-?A}, {?a..?z, 10-?a}]
Member

We need spaces before/after -.


for {chars, diff} <- digits, char <- chars do
defp parse_digits(<<unquote(char), rest::binary>>, base, sign)
when base > unquote(char+diff) do
Member

Spaces around + as well :bowtie:
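
The clause generation in the diff above can be illustrated with a small standalone sketch. The module name `DigitSketch` and the function `value/2` are hypothetical and not part of this PR; only the `digits` table (written with the requested spacing) comes from the diff. One clause is generated per character at compile time, and the guard rejects digits that are not valid in the given base.

```elixir
defmodule DigitSketch do
  # Hypothetical module for illustration; not the code merged in this PR.
  # Each entry pairs a character range with the offset that maps a
  # character to its numeric value (e.g. ?a + (10 - ?a) == 10).
  digits = [{?0..?9, -?0}, {?A..?Z, 10 - ?A}, {?a..?z, 10 - ?a}]

  # Generate one clause per character at compile time.
  for {chars, diff} <- digits, char <- chars do
    digit = char + diff

    def value(unquote(char), base) when base > unquote(digit),
      do: {:ok, unquote(digit)}
  end

  # Anything else is not a digit in the given base.
  def value(_char, _base), do: :error
end
```

With this sketch, `DigitSketch.value(?f, 16)` matches the generated clause for `?f`, while `DigitSketch.value(?9, 8)` falls through to the catch-all clause because the guard `base > 9` fails.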

def parse(binary, base) when is_binary(binary) and base in 2..36 do
parse_in_base(binary, base)
def parse(<<bin::binary>>, base) when base in 2..36 do
parse_sign(bin, base)
Member

Would

{sign, rest} = parse_sign(bin)
parse_digits(rest, base, sign)

work in the same way? We can keep it a bit more tidy by not calling parse_digits/3 from parse_sign/2. But maybe we do bad stuff with the binary context this way 🤔

Member Author

That would lose a binary match context optimisation.

Member

I feared so. Sad! It makes the code a bit convoluted, and it's really hard to see why just by reading it. Maybe we can add a comment about why it's done this way, not sure. We could in the meantime rename parse_sign/2 to parse_sign_and_digits/2, wdyt?

Member Author

Actually we can keep the single context here. Also I'll move the parse_sign/2 logic into parse/2.
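
A rough sketch of the shape being discussed, for readers unfamiliar with it: the sign is consumed in the head of the entry function, and the rest of the binary is passed straight to a function that matches on it again, rather than being returned in a `{sign, rest}` tuple first. The module `ParseSketch` and its helper names are hypothetical, it handles base 10 only, and it is not the implementation merged here. Whether the compiler actually keeps a single match context depends on the OTP release; the Erlang compiler's `bin_opt_info` option can be used to check.

```elixir
defmodule ParseSketch do
  # Hypothetical, base-10-only sketch; not the implementation merged in this PR.

  # The sign is matched in the entry function, and `rest` is handed
  # directly to a function that pattern matches on the binary again.
  def parse(<<?-, rest::binary>>), do: negate(first_digit(rest))
  def parse(<<?+, rest::binary>>), do: first_digit(rest)
  def parse(<<bin::binary>>), do: first_digit(bin)

  # At least one digit is required, otherwise parsing fails.
  defp first_digit(<<char, rest::binary>>) when char in ?0..?9,
    do: more_digits(rest, char - ?0)

  defp first_digit(_), do: :error

  # Accumulate further digits; stop at the first non-digit and
  # return the value together with the unparsed remainder.
  defp more_digits(<<char, rest::binary>>, acc) when char in ?0..?9,
    do: more_digits(rest, acc * 10 + (char - ?0))

  defp more_digits(rest, acc), do: {acc, rest}

  defp negate({int, rest}), do: {-int, rest}
  defp negate(:error), do: :error
end
```

For example, `ParseSketch.parse("-42abc")` consumes the sign and two digits and returns the remainder `"abc"` alongside the integer, while `ParseSketch.parse("-")` returns `:error` because no digit follows the sign.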

do_parse(rest, base, parse_digit(char))
else
:error
digits = [{?0..?9, -?0}, {?A..?Z, 10-?A}, {?a..?z, 10-?a}]
Member

We need spaces around - and +. :bowtie:

@fishcakez
Member Author

I have fixed the spaces.

Member

@lexmag lexmag left a comment

Wonderful. 💛

@fishcakez
Member Author

fishcakez commented Mar 10, 2017

Simplified the logic by removing the sign argument.

Simple benchmark run as elixir -r file.exs:

defmodule Bench do
  # Returns the average time per iteration in microseconds.
  def bench(n) do
    {time, _} = :timer.tc(__MODULE__, :test, [n])
    div(time, n)
  end

  # Parses the same 900-byte binary of digits n times.
  def test(n) do
    binary = :binary.copy(<<"123456790">>, 100)

    _ = for _ <- 1..n do
      Integer.parse(binary)
    end

    :ok
  end
end

IO.inspect(Bench.bench(1000))

About 21x faster on OTP 19.2.1: 4662us -> 222us per call.

@fishcakez fishcakez merged commit 8b0b15e into master Mar 10, 2017
@fishcakez fishcakez deleted the jf-int-parse branch March 10, 2017 19:34