Binary downloads converted from source data? #503

Closed
jdbo opened this Issue · 3 comments

2 participants

@jdbo

capybara-webkit appears to be applying some sort of conversion to binary files downloaded through it (i.e. downloading the file attachment via page#body).

For example, the original file (a JPEG image) begins:

FF D8 FF E0 00 10 4A 46 49 46 00 01 01 00 00 01 00 01 00 00 FF E1 8D 08 45 78 69 66 00 00 49 49 (or "ˇÿˇ‡�JFIF����ˇ·ç�ExifII")

while the downloaded file begins:

C3 BF C3 98 C3 BF C3 A0 10 4A 46 49 46 01 01 01 01 C3 BF C3 A1 C2 8D 08 45 78 69 66 49 49 2A 08 (or "√ø√ò√ø√†�JFIF����√ø√°¬ç�ExifII*")

Not all elements are being converted (the similar runs of "4A 46 49 46"/"JFIF", for example), but more than enough changes are being applied to make the downloaded file unusable as a JPEG.
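
For what it's worth, the two byte dumps above are consistent with each byte being read as a Latin-1 (ISO-8859-1) character, NUL bytes being dropped, and the result being re-encoded as UTF-8 (FF becomes C3 BF, D8 becomes C3 98, and so on). Below is a minimal Ruby sketch that reproduces the observed bytes from the original ones. It only illustrates the apparent transformation, inferred from the dumps; it is not capybara-webkit's actual code.

```ruby
# Bytes copied from the two dumps above.
original_hex = "FF D8 FF E0 00 10 4A 46 49 46 00 01 01 00 00 01 " \
               "00 01 00 00 FF E1 8D 08 45 78 69 66 00 00 49 49"
observed_hex = "C3 BF C3 98 C3 BF C3 A0 10 4A 46 49 46 01 01 01 " \
               "01 C3 BF C3 A1 C2 8D 08 45 78 69 66 49 49 2A 08"

# Turn the hex dump back into a binary string.
original = [original_hex.delete(" ")].pack("H*")

# Assumed transformation: drop NUL bytes, treat every remaining byte as a
# Latin-1 character, then encode those characters as UTF-8.
converted = original.delete("\x00")
                    .force_encoding("ISO-8859-1")
                    .encode("UTF-8")

converted_hex = converted.bytes.map { |b| format("%02X", b) }.join(" ")
puts converted_hex

# The observed dump ends with two extra bytes (2A 08) that come from data
# past the 32 original bytes shown, so only the prefix is compared.
puts observed_hex.start_with?(converted_hex)
```

The bytes below 0x80 (such as "JFIF") pass through unchanged under this transformation, which would explain why only some runs differ.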

I'm assuming that this conversion is happening at the driver/capybara-webkit level, but if you believe that it's happening elsewhere (within capybara itself, for example), please let me know. Any idea what this conversion might be, and/or how to bypass it?

I'm running the latest macports of qt4-mac (qt4-mac @4.8.4_6) on OS X 10.7.5 with capybara (2.0.3) and capybara-webkit (0.14.2), w/ ruby-1.9.2-p290 under RVM.

I'd post both images but the converted one is no longer recognized as such; I can email both files directly upon request.

@mhoran
Collaborator

This should be fixed on master, could you give it a shot? We've not released a new version yet as we're waiting for Capybara 2.1 to be released.

@jdbo

Thanks for the quick reply!

Unfortunately I'm still encountering some funky behavior around downloading/saving binaries (testing with images); testing as follows:

  • directly visiting the URL for an image and using page#save_page to download the image (downloading a PNG saves the following string: "\u0089PNG\r\n\u001A\n")
  • navigating through a site and using an image downloading link is still converting the binary data (as described above), but now also wrapping that data with <html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;"> and </pre></body></html> (it should be noted that the site I'm working with appears to be returning odd response headers, though I haven't been able to reproduce that outside of capybara)
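
A quick way to tell whether a downloaded body is still a valid image is to compare its leading bytes against the format's signature: PNG files start with 89 50 4E 47 0D 0A 1A 0A and JPEG files with FF D8 FF. The sketch below does that in Ruby; `SIGNATURES` and `image_type` are hypothetical names made up for illustration, not part of capybara or capybara-webkit.

```ruby
# Well-known leading bytes ("magic numbers") of the two formats discussed here.
SIGNATURES = {
  :png  => [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*"),
  :jpeg => [0xFF, 0xD8, 0xFF].pack("C*")
}

# Hypothetical helper: returns :png, :jpeg, or nil for unrecognized data.
def image_type(data)
  # Compare raw bytes regardless of the encoding the string was tagged with.
  bytes = data.dup.force_encoding("BINARY")
  SIGNATURES.each do |type, magic|
    return type if bytes.start_with?(magic)
  end
  nil
end

intact    = [0xFF, 0xD8, 0xFF, 0xE0].pack("C*") # start of the original JPEG
corrupted = [0xC3, 0xBF, 0xC3, 0x98].pack("C*") # start of the downloaded file

p image_type(intact)    # => :jpeg
p image_type(corrupted) # => nil
```

A body wrapped in `<html>...<pre>` markup, as in the second case above, would also come back as `nil`, since it starts with the markup rather than the image signature.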

BTW, I had a few issues installing from master:

  • I had to switch from macports to homebrew to work around PCH (precompiled header?) "is a directory" errors
  • I had to install via gem specific_install -l https://github.com/thoughtbot/capybara-webkit.git, as installing locally built gems failed because of a missing "lib/capybara_webkit_builder" file during installation (for reasons unknown to me, extconf.rb appeared to want this file present before the file was installed). Please note that I'm pretty new to manually building/installing gems.

Let me know if you have any questions; BTW, I tested under the latest stable rvm with the latest stable ruby 1.9.3 on OS X 10.7.5.

@mhoran closed this issue from a commit

@mhoran: Don't cast raw frame content to QString
The conversion is lossy and drops non-ASCII characters.

Fixes #503.
e72b48f

@mhoran
Collaborator

I found a bug in the Body command: it was converting the raw content into a QString, which is a lossy conversion (non-ASCII characters are dropped). Regarding the odd response headers: if the JPEG content is being served as "text/html", unfortunately there's not much we can do. The Body command looks at the content type to determine whether to return the DOM (in the case of "text/html") or the raw content (which is what was happening here, except that the content was then being converted to ASCII characters).
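
The content-type decision described above can be modeled roughly as follows. This is a simplified Ruby sketch for illustration only: the real Body command lives in capybara-webkit's Qt/C++ code, and `body_for` with its `dom_html` and `raw_bytes` parameters are hypothetical names that do not appear in the project.

```ruby
# Hypothetical model of the decision: return the serialized DOM for
# "text/html" responses, and the untouched raw bytes for everything else.
def body_for(content_type, dom_html, raw_bytes)
  # Ignore parameters such as "; charset=utf-8" and letter case.
  media_type = content_type.to_s.split(";").first.to_s.strip.downcase

  if media_type == "text/html"
    dom_html
  else
    # Keep the bytes as-is; tagging them as binary avoids any
    # character-set conversion along the way.
    raw_bytes.dup.force_encoding("BINARY")
  end
end

jpeg_start = [0xFF, 0xD8, 0xFF, 0xE0].pack("C*")
dom        = "<html><head></head><body></body></html>"

p body_for("image/jpeg", dom, jpeg_start).bytes.to_a
p body_for("text/html; charset=utf-8", dom, jpeg_start)
```

Under this model, a JPEG served with a "text/html" content type takes the first branch, which matches the wrapped `<html>...<pre>` output reported earlier in the thread.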

@youpy referenced this issue from a commit in youpy/capybara-webkit

@mhoran: Don't cast raw frame content to QString
The conversion is lossy and drops non-ASCII characters.

Fixes #503.
8235f82