Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data read from a UDP socket retains buffer size causing large memory overhead #5148

Closed
jsvd opened this issue Apr 23, 2018 · 10 comments
Closed

Comments

@jsvd
Copy link
Contributor

jsvd commented Apr 23, 2018

This happens on jruby 9k (not sure about 1.7.x):

require 'socket'
udp = UDPSocket.new(Socket::AF_INET)
udp.bind("localhost", 3300)
num_packets = 10000
data = []

puts "Listening for UDP data"
consumer = Thread.new do
  while true do
    begin
      payload, client = udp.recvfrom_nonblock(65536)
      next if payload.empty?
      data << payload
    rescue IO::EAGAINWaitReadable
      sleep 1
    end
  end
end

producer = Thread.new do
  socket = UDPSocket.new
  socket.connect("localhost", 3300)
  num_packets.times.each { |i| socket.send("test.#{i}", 0) }
end

producer.join
sleep 10000000

When this is run, a large part of the heap will be used by 64k RubyStrings containing only "test.#{i}":

screen shot 2018-04-23 at 17 59 56

This is causing: elastic/logstash#9346

For reference, this doesn't happen on jruby-1.7.27:

screen shot 2018-04-23 at 18 45 59

@enebo
Copy link
Member

enebo commented Apr 23, 2018

This was introduced in commit 388030b. The intention was to avoid always copying the backing array. I personally can see why this is unexpected behavior but I can also see that reverting will affect people who know how much data they will receive (well its UDP but let's pretend it gives more or less what you expect). So I wonder if we should apply a heuristic here? Don't copy if n% of length is filled from the recvfrom (n == 100%/80%???), but if not then call with copy constructor which will right-size the String made.

I am looking for opinions. The revert and the heuristic are trivial to do but what is the best way to give the speed benefit while not creating this massive over allocation in a case where you get much less back than you asked for?

@enebo enebo added this to the JRuby 9.1.18.0 milestone Apr 23, 2018
@headius
Copy link
Member

headius commented Apr 23, 2018

So we discussed and looked into this a bit over hangouts, and have the following findings:

  • The non-copying, non-cropping, allocate all 64k at once behavior does appear to match CRuby. In their code, here's the logic for allocating the buffer, and no subsequent use of this string appears to do any buffer-cropping:
    buflen = NUM2INT(len);
    str = rsock_strbuf(str, buflen);
  • Although our behavior seems to match CRuby, we agree this isn't an efficient use of resources when the requested size is significantly larger than the actual data size.

At this point I'm leaning toward having a heuristic, where if the requested size is greater than (for example) 1.5x the actual data size, we crop it when returning the string result. However, in both JRuby and CRuby it appears that every call to recvfrom(64k) is going to allocate (and possibly throw away, if we copy) a 64k buffer every time.

@enebo
Copy link
Member

enebo commented Apr 23, 2018

Yeah so the second idea we talked about was potentially doing n recvfroms with a bounded allocated buffer in a loop and building up a string. This felt a bit dubious to us because you are requesting something fairly specific (a single 64k recv_from). Us doing n of those calls for you maybe is a semantic gap too far.

As an aside, does UDP really ever reliably read a 64k data packet? I think that is a theoretical limit but one I doubt will work very often. Seems like reducing that size in your code could end up speeding things up because we won't speculatively try and allocate a large buffer for your call.

@jsvd
Copy link
Contributor Author

jsvd commented Apr 24, 2018

As an aside, does UDP really ever reliably read a 64k data packet?

It's a matter of safe defaults, we don't want to surprise users by defaulting to truncation at an arbitrary length.

As for CRuby doing the same, I can see that both 1.7.27 and cruby 2.4.2 have a much lower memory footprint at the end of the script being executed:

Ruby 9k:

USER               PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
joaoduarte       58015   0.0  5.5  5006212 927808 s006  S+   10:27AM   0:12.17 java -Djdk.home= -Djruby.home=/Users/joaoduarte/.rvm/rubies/jruby-9.1.16.0 -Djruby.script=jruby -Djruby.shell=/bin/sh -Djffi.boot.library.path=/Users/joaoduarte/.rvm/rubies/jruby-9.1.16.0/lib/jni:/Users/joaoduarte/.rvm/rubies/jruby-9.1.16.0/lib/jni/Darwin -Xss2048k -Dsun.java.command=org.jruby.Main -cp  -Xbootclasspath/a:/Users/joaoduarte/.rvm/rubies/jruby-9.1.16.0/lib/jruby.jar -Dfile.encoding=UTF-8 -Djava.security.egd=file:/dev/urandom -Xmx1g org/jruby/Main udp_script.rb

JRuby 1.7.27

USER               PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
joaoduarte       58179   0.0  1.0  4977304 168476 s006  S+   10:28AM   0:17.57 java -Djdk.home= -Djruby.home=/Users/joaoduarte/.rvm/rubies/jruby-1.7.27 -Djruby.script=jruby -Djruby.shell=/bin/sh -Djffi.boot.library.path=/Users/joaoduarte/.rvm/rubies/jruby-1.7.27/lib/jni:/Users/joaoduarte/.rvm/rubies/jruby-1.7.27/lib/jni/Darwin -Xss2048k -Dsun.java.command=org.jruby.Main -cp  -Xbootclasspath/a:/Users/joaoduarte/.rvm/rubies/jruby-1.7.27/lib/jruby.jar -Dfile.encoding=UTF-8 -Djava.security.egd=/dev/./urandom -Xmx1g org/jruby/Main udp_script.rb

CRuby 2.4.2

USER               PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
joaoduarte       58333   0.0  0.3  3107672  55676 s006  S+   10:31AM   0:01.29 ruby udp_script.rb

@enebo
Copy link
Member

enebo commented Apr 24, 2018

@jsvd well that is interesting. Either something we missed in MRI upon making the string originally or something else which is reducing the unused backing store in the COW stuff in MRIs String impl.

Yeah I can see the use case that when you don't know what will be arriving you just grab as much as you can. My question was mostly just me remembering how flaky UDP can be.

@headius
Copy link
Member

headius commented Apr 24, 2018

@jsvd Hmm I'll have to go over the code in MRI again then. If they're truncating the buffer somewhere, I can't see it. There might be a realloc call hidden in there somewhere hiding the truncation.

jsvd added a commit to jsvd/logstash-input-udp that referenced this issue May 2, 2018
on JRuby 9k, a read from the UDP socket will not truncate unused
buffer space so we have to force it using String#s
jsvd added a commit to logstash-plugins/logstash-input-udp that referenced this issue May 9, 2018
on JRuby 9k, a read from the UDP socket will not truncate unused
buffer space so we have to force it using String#s

ensure the cloned string is UTF_8 encoded
@headius
Copy link
Member

headius commented May 22, 2020

I'm not sure if I missed this the first time around (it's clearly in the code) but it seems MRI resizes immediately after the recv:

https://github.com/ruby/ruby/blob/master/ext/socket/init.c#L204

Looking into a fix now.

@headius
Copy link
Member

headius commented May 22, 2020

FWIW I ran into this bug because it's marked for 9.1.18 but never got closed or moved to a new release.

@headius headius modified the milestones: JRuby 9.1.18.0, JRuby 9.3.0.0 May 22, 2020
@headius
Copy link
Member

headius commented May 22, 2020

Ok rb_str_set_len does not resize; it just forces the String's reported size to a specific value. Still looking for how they avoid keeping a 64k string around.

@headius
Copy link
Member

headius commented May 22, 2020

Yeah I still have no answers. I pinged my favorite CRuby expert @tenderlove to see if he can figure out why MRI doesn't keep 64k buffers around, but otherwise I'm going to proceed with a 1.5x heuristic to copy rather than wrap the buffer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants