sending UTF-8 data over SSL can result in lost data #20

bporterfield opened this Issue Mar 1, 2012 · 4 comments



Not quite sure yet if this belongs here. Requests seem to hang over SSL when sending back data read from a file. The file contains a mix of single-byte and double-byte UTF-8 characters (the double-byte ones are all at the front in my example, but not in the real case). Over HTTP it works fine, and the same app works over both HTTPS and HTTP in MRI.

Rack app:

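require 'rack/utils'
require 'openssl'
require 'webrick'
require 'webrick/https'

class SslEncodingIssue
  def call(env)
    # file can be downloaded from
    out = File.read("gistfile1.txt")  # assumed: File.read (call lost from the page)

    headers = {}
    headers["Content-Length"] = [out].inject(0) { |l, p| l + Rack::Utils.bytesize(p) }.to_s
    headers["Content-Type"] = "text/html"

    [200, headers, [out]]
  end
end

# assumed: the key/certificate constructors and the run call below are
# reconstructed; only the file names and the options survive from the page
pkey = OpenSSL::PKey::RSA.new(File.read("keypair.pem"))
cert = OpenSSL::X509::Certificate.new(File.read("cert.pem"))

Rack::Handler::WEBrick.run(SslEncodingIssue.new,
  :Port => 3000,
  :SSLEnable => true,
  :SSLVerifyClient => OpenSSL::SSL::VERIFY_NONE,
  :SSLCertificate => cert,
  :SSLPrivateKey => pkey,
  :SSLCertName => [ [ "CN", WEBrick::Utils::getservername ] ])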

Run rackup and ping http://localhost:3000/ - for me, the request hangs for a bit, and curl responds with: * transfer closed with 2 bytes remaining to read.

Change :SSLEnable => true to :SSLEnable => false and hit the http URL, and the problem goes away. I've put this test case into WEBrick, but was having the same issue with another server.

Problem is very dependent on length of output - remove a few lines of dots and the issue does not occur.

Happy to provide more info!


Additional info:

jruby 1.6.6 (ruby-1.9.2-p312) (2012-01-30 5673572) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_29) [darwin-x86_64-java]


Created ticket on jira (jruby-ossl README says to report there):


Writing to an SSL socket with jruby-ossl in 1.9 mode doesn't appear to support multi-byte character encodings at all.

The function that writes to the buffer expects characters to be single-byte. For example, this line uses String#[] to access characters, but decides which characters to pick during the loop based on a value returned in bytes, not characters. syswrite returns the actual number of bytes written, which corrupts the buffer when characters are multibyte: the value of 'remaining' can be off and, at least in my test case, becomes negative.
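
To make the mismatch concrete, here is a simplified model of such a write loop. It is not the jruby-ossl source: flush_by_chars and max_bytes are hypothetical names, and the partial write is simulated by truncating each chunk to max_bytes bytes. The loop counts 'remain' in characters but subtracts a byte count, as described above.

```ruby
# Simplified model of a buffered write loop. This is NOT the jruby-ossl source;
# flush_by_chars and max_bytes are hypothetical names used for illustration.
def flush_by_chars(buffer, max_bytes)
  written  = String.new          # bytes that reached the "socket" (binary)
  remain   = buffer.length       # counted in CHARACTERS
  nwritten = 0
  while remain > 0
    chunk = buffer[nwritten, remain]                          # character-indexed slice
    sent  = chunk.dup.force_encoding("BINARY")[0, max_bytes]  # simulated partial write
    written << sent
    nwrote = sent.bytesize       # like syswrite: a count of BYTES
    remain   -= nwrote
    nwritten += nwrote
  end
  [written.force_encoding("UTF-8"), remain]
end

data = "\u00e9\u00e9abc"         # 5 characters, 7 bytes

# Partial writes of 4 bytes: the second slice starts at character 4, not byte 4,
# so "ab" is skipped and only "\u00e9\u00e9c" is sent.
lost_out, lost_remain = flush_by_chars(data, 4)

# A single full write: 5 characters minus 7 bytes leaves remain at -2.
neg_out, neg_remain = flush_by_chars(data, 100)
```

In the first call two bytes never reach the output, which is at least consistent with the "2 bytes remaining to read" message from curl, though the real buffer sizes differ.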

It seems to work if I change the buffering to respect actual bytes in the String:

This doesn't work without String#byteslice, which is new in 1.9.3, so to get around it:
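
As an illustration of byte-based buffering (a sketch only, not the actual patch), the loop below counts and slices in bytes throughout. byte_slice, flush_by_bytes and max_bytes are hypothetical names; the fallback branch assumes that slicing a binary-encoded copy is an acceptable stand-in where String#byteslice is missing.

```ruby
# Byte-based variant of a buffered write loop. A sketch only, not the actual
# patch; byte_slice, flush_by_bytes and max_bytes are hypothetical names.

# Uses String#byteslice where available (Ruby 1.9.3+); otherwise slices a
# binary-encoded copy, which is indexed by byte.
def byte_slice(str, offset, length)
  if str.respond_to?(:byteslice)
    str.byteslice(offset, length)
  else
    str.dup.force_encoding("BINARY")[offset, length]
  end
end

def flush_by_bytes(buffer, max_bytes)
  written  = String.new          # bytes that reached the "socket" (binary)
  remain   = buffer.bytesize     # counted in BYTES
  nwritten = 0
  while remain > 0
    chunk = byte_slice(buffer, nwritten, remain)
    sent  = byte_slice(chunk, 0, max_bytes).force_encoding("BINARY")  # simulated partial write
    written << sent
    nwrote = sent.bytesize       # bytes written, same unit as remain
    remain   -= nwrote
    nwritten += nwrote
  end
  [written.force_encoding("UTF-8"), remain]
end

data = "\u00e9\u00e9abc"         # 5 characters, 7 bytes

# Even when a write splits a two-byte character, every byte is sent exactly once.
bytes_out, bytes_remain = flush_by_bytes(data, 3)
```

Because both the counter and the slice offsets are in bytes, a partial write that ends in the middle of a character is simply continued from the next byte.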

There's still the fact that this line ignores encoding differences in strings - I'm not sure how that's handled in Ruby, but potentially more issues there.


I've found the fix I was using (a monkey-patched String#byteslice) was much too slow, so for now I'm getting by with a force_encoding call in do_write:
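
A rough sketch of why relabelling the buffer as binary helps (this is not the actual do_write change; flush_binary and max_bytes are hypothetical names). String#force_encoding does not transcode, it only changes the encoding tag, so on a binary string length equals bytesize and String#[] indexes bytes. The sketch works on a copy, which is an assumption made here to avoid changing the caller's string.

```ruby
# Sketch of the force_encoding idea; not the actual do_write change.
# flush_binary and max_bytes are hypothetical names.
def flush_binary(buffer, max_bytes)
  buffer   = buffer.dup.force_encoding("BINARY")  # relabel only; bytes are unchanged
  written  = String.new          # bytes that reached the "socket" (binary)
  remain   = buffer.length       # equal to bytesize on a binary string
  nwritten = 0
  while remain > 0
    sent = buffer[nwritten, remain][0, max_bytes]  # [] indexes bytes here
    written << sent
    nwrote = sent.bytesize
    remain   -= nwrote
    nwritten += nwrote
  end
  [written.force_encoding("UTF-8"), remain]
end

data   = "\u00e9\u00e9abc"       # 5 characters, 7 bytes
binary = data.dup.force_encoding("BINARY")

# On the binary copy, character count and byte count agree.
same_units = (binary.length == binary.bytesize)

binary_out, binary_remain = flush_binary(data, 4)
```

The loop itself is unchanged from a character-indexed version; only the encoding tag differs, which is why this is cheap compared with emulating byteslice.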
