New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise integer parsing #5859

Merged
merged 4 commits into from Mar 10, 2017

Conversation

Projects
None yet
4 participants
@fishcakez
Member

fishcakez commented Mar 10, 2017

  • Use single binary match context
  • Do minimal pattern matches and maths
do_parse(rest, base, parse_digit(char))
else
:error
digits = [{?0..?9, -?0}, {?A..?Z, 10-?A}, {?a..?z, 10-?a}]

This comment has been minimized.

@whatyouhide

whatyouhide Mar 10, 2017

Member

We need spaces before/after -.

for {chars, diff} <- digits, char <- chars do
defp parse_digits(<<unquote(char), rest::binary>>, base, sign)
when base > unquote(char+diff) do

This comment has been minimized.

@whatyouhide

whatyouhide Mar 10, 2017

Member

Spaces around + as well :bowtie:

def parse(binary, base) when is_binary(binary) and base in 2..36 do
parse_in_base(binary, base)
def parse(<<bin::binary>>, base) when base in 2..36 do
parse_sign(bin, base)

This comment has been minimized.

@whatyouhide

whatyouhide Mar 10, 2017

Member

Would

{sign, rest} = parse_sign(bin)
parse_digits(rest, base, sign)

work in the same way? We can keep it a bit more tidy by not calling parse_digits/3 from parse_sign/2. But maybe we do bad stuff with the binary context this way 🤔

This comment has been minimized.

@fishcakez

fishcakez Mar 10, 2017

Member

That would lose a binary match context optimisation.

This comment has been minimized.

@whatyouhide

whatyouhide Mar 10, 2017

Member

I feared so. Sad! It makes the code a bit convolute and it's really hard to get why just by reading it. Maybe we can add a comment about why it's done this way, not sure. We could in the meantime rename parse_sign/2 to parse_sign_and_digits/2, wdyt?

This comment has been minimized.

@fishcakez

fishcakez Mar 10, 2017

Member

Actually we can keep the single context here. Also ill move the parse_sign/2 logic into parse/2.

do_parse(rest, base, parse_digit(char))
else
:error
digits = [{?0..?9, -?0}, {?A..?Z, 10-?A}, {?a..?z, 10-?a}]

This comment has been minimized.

@lexmag

lexmag Mar 10, 2017

Member

We need spaces around - and +. :bowtie:

@fishcakez

This comment has been minimized.

Member

fishcakez commented Mar 10, 2017

I have fixed the spaces.

@lexmag

lexmag approved these changes Mar 10, 2017

Wonderful. 💛

fishcakez added some commits Mar 10, 2017

@fishcakez

This comment has been minimized.

Member

fishcakez commented Mar 10, 2017

Simplifed the logic by removing the sign argument.

Simple benchmark run as elixir -r file.exs:

defmodule Bench do

  def bench(n) do
    {time, _} = :timer.tc(__MODULE__, :test, [n])
    div(time, n)
  end

  def test(n) do
    binary = :binary.copy(<<"123456790">>, 100)
    
    _ =for _ <- 1..n do
      Integer.parse(binary)
    end
    
    :ok
  end
end

IO.inspect Bench.bench(1000)

x21 faster on OTP 19.2.1 : 4662us -> 222us.

@fishcakez fishcakez merged commit 8b0b15e into master Mar 10, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@fishcakez fishcakez deleted the jf-int-parse branch Mar 10, 2017

ckampfe added a commit to ckampfe/elixir that referenced this pull request Jul 22, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment