Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ragel-bitmap for less memory #1343

Open
wants to merge 2 commits into
base: master
from
Open

Conversation

@kddeisz
Copy link

kddeisz commented May 30, 2019

Hi friends -

This is an attempt at mitigating very large memory usage as per #1342. ragel-bitmap is a gem I made yesterday that has a replace-inline tool for ruby code generated by ragel that replaces integer arrays with bitmap objects with #[] defined so they act the same. @schneems found this patch reduces memory consumption by about 23mb.

In terms of performance, I haven't found any kind of significant difference. That being said, my performance measurement abilities are relatively rudimentary, so I would be happy to be coached on this if someone else wanted to take a shot at it.

@kddeisz

This comment has been minimized.

Copy link
Author

kddeisz commented May 30, 2019

Looks like the tests are failing on older versions of ruby because of a bundler incompatibility unrelated to this code.

.travis.yml Outdated
@@ -3,7 +3,8 @@ cache: bundler

bundler_args: --without local_development
before_install:
- gem install bundler

This comment has been minimized.

Copy link
@ahorek

ahorek May 30, 2019

Contributor

I don't think we have to install the bundler explicitly, #1308

This comment has been minimized.

Copy link
@kddeisz

kddeisz May 30, 2019

Author

Happy to remove, was just trying to get the tests to pass.

@ahorek

This comment has been minimized.

Copy link
Contributor

ahorek commented May 30, 2019

ruby 2.6.3

           2.6.6     22.999  (± 8.7%) i/s -    456.000  in  20.074763s

           2.7.1     57.845  (± 5.2%) i/s -      1.155k in  20.035455s

      2.8.0 pre      63.768  (± 4.7%) i/s -      1.275k in  20.048792s

    ragel-bitmap      6.435  (± 0.0%) i/s -    129.000  in  20.110172s

jruby 9.2.7.0

           2.6.6     11.895  (? 8.4%) i/s -    237.000  in  20.026944s

           2.7.1     32.772  (?12.2%) i/s -    645.000  in  20.011202s

       2.8.0 pre     33.888  (?11.8%) i/s -    664.000  in  19.991385s

    ragel-bitmap      3.733  (? 0.0%) i/s -     74.000  in  20.029995s
@schneems

This comment has been minimized.

Copy link

schneems commented May 30, 2019

@ahorek can you share the benchmark script you used to generate that please?

@ahorek

This comment has been minimized.

Copy link
Contributor

ahorek commented May 30, 2019

sure, it's nothing special, just parsing a few real world emails

benchmark script
require "bundler/inline"

begin
  file = File.binread(ARGV[0])
rescue
  puts "Usage: ruby #{__FILE__} mail.eml" 
  exit
end

gemfile(true) do
  source "https://rubygems.org"

  gem "mail", git: "https://github.com/kddeisz/mail.git", branch: "use-ragel-bitmap"
  gem "benchmark-ips"
end

Benchmark.ips do |x|
  x.config(:time => 20, :warmup => 10)
  x.report('test') do
    Mail.new(file).to_s
  end
end

these bitmaps will be stored as bigints and each lookup has to be computed which is much more expensive then a simple array reference
https://github.com/kddeisz/ragel-bitmap/blob/bdfd37dba7543d13ed66b2677ebbd240afe28b4d/lib/ragel/bitmap.rb#L18

@kddeisz

This comment has been minimized.

Copy link
Author

kddeisz commented May 31, 2019

Hmm... I suppose it depends on the use case. For the most part I'd imagine people will trade the speed for the memory, but some folks might be parsing tons and tons of emails. I think it'd be pretty easy to add an option to ragel-bitmap like:

class Ragel::Bitmap
  class << self
    attr_reader :get_speed_lose_memory
    def get_speed_lose_memory!
      @get_speed_lose_memory = true
    end

    def from(width, numbers)
      if get_speed_lose_memory
        reverse_engineer_bitmap(width, numbers)
      else
        new(width, numbers)
      end
    end
  end

  @get_speed_lose_memory
end

That way people could call Ragel::Bitmap.get_speed_lose_memory! before they require their parsers. Maybe that could then be exposed on the Mail module as a config option.

@schneems

This comment has been minimized.

Copy link

schneems commented May 31, 2019

Thanks @ahorek, i'm able to reproduce your results. I'm getting about 10 i/s with this pr and 148 i/s from the currently released gem.

For my case the only reason a parser is being required is for the valid_email gem. It only validates the email address, and it doesn't really get called that often. Being able to tune performance/memory would be nice.

@fcheung

This comment has been minimized.

Copy link

fcheung commented Jun 1, 2019

I tried using a string as the backing store for the bitmap instead of a bignum ( fcheung/ragel-bitmap@8dd2279 ) I'm seeing similar memory decreases (~23mb). I still get a performance hit, but it is less noticeable.

With the benchmark script above & a random email from my inbox I get 158 i/s on master, and 97i/s with my branch. Still quite a big hit, but getting closer.

@jhawthorn

This comment has been minimized.

Copy link

jhawthorn commented Jun 2, 2019

Nice! I did some testing of byteslice.unpack vs getbyte based on @fcheung's string approach.

A = Array.new(20_000, 123)
B = A.pack("c*")
W = A.pack("n*")

Benchmark.ips do |x|
  x.report('array', <<-end)
    A[5678]
  end

  x.report('words.byteslice.unpack', <<-end)
    idx = 5678 << 1
    W.byteslice(idx, 2).unpack("n")
  end

  x.report('getbyte', <<-end)
    B.getbyte(5678)
  end

  x.report('bytes << 8 | bytes', <<-end)
    (B.getbyte(5678) << 8) | B.getbyte(5678)
  end

  x.report('words | words', <<-end)
    idx = 5678 << 1
    (W.getbyte(idx) << 8) | W.getbyte(idx + 1)
  end
end

Result:

               array     39.446M (± 0.2%) i/s -    197.320M in   5.002250s
words.byteslice.unpack    5.025M (± 2.7%) i/s -     25.372M in   5.052152s
             getbyte     31.348M (± 4.2%) i/s -    156.432M in   5.014899s
  bytes << 8 | bytes     12.455M (± 1.7%) i/s -     62.581M in   5.025988s
       words | words      8.540M (± 2.1%) i/s -     42.712M in   5.003991s

So from this microbenchmark we get the best performance from shift-ing and or-ing two getbyte calls. A little over 2x faster than byteslice.unpack, but still a little more than 3x slower than plain array access.

I made a branch to test this and used the benchmark above: ruby benchmark.rb spec/fixtures/emails/mime_emails/raw_email2.eml. I ~79% the speed of master. (I'm curious what others get with this approach, I'm on a fast computer and testing a different file)

FWIW, I left arrays with values >= 2**16 alone, since we only have one such array and that space could be considered "less wasted" anyways.

Before

                test    319.684  (±13.8%) i/s -      6.240k in  20.016511s

After

                test    252.325  (±13.5%) i/s -      4.944k in  20.085762s
@ahorek

This comment has been minimized.

Copy link
Contributor

ahorek commented Jun 2, 2019

@kddeisz's approach has a design flaw, the resident memory decreases, but it allocates 3 magnitudes more memory during runtime. So if you want to save the memory, in reality it'll do the opposite.

here's a comparsion how much memory is needed to parse a single email ( https://github.com/SamSaffron/memory_profiler )

total allocated
   438035 master
232125152 ragel-bitmap

@jhawthorn's version looks promising.

I hope we could find a good tradeoff between memory / performance. Thank you all for working on this!

btw: here's an interesting talk about the topic https://www.youtube.com/watch?v=ubykHUyNi_0

@schneems

This comment has been minimized.

Copy link

schneems commented Jun 2, 2019

@kddeisz kddeisz force-pushed the kddeisz:use-ragel-bitmap branch from 2bd3329 to 15a7f8e Jul 10, 2019
@kddeisz

This comment has been minimized.

Copy link
Author

kddeisz commented Jul 10, 2019

Okay friends - I finally got a minute to get back to this. I've updated and pushed up a new version of ragel-bitmap based on @jhawthorn's approach, which was based on @fcheung's approach. All of the tests are passing locally.

Could you folks that are better at benchmarking than me give this a shot? I think the strings are going to end up yielding much better results than bignums.

EDIT: Tests on 1.8 were failing because apparently 1.9 introduced getbyte. I've now added a conditional polyfill for that and it should be working now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.