capybara-webkit appears to be applying some sort of conversion to binary files downloaded through it (i.e. downloading the file attachment via page#body).
For example, the original file (a JPEG image) begins:
FF D8 FF E0 00 10 4A 46 49 46 00 01 01 00 00 01 00 01 00 00 FF E1 8D 08 45 78 69 66 00 00 49 49 (or "ˇÿˇ‡�JFIF����ˇ·ç�ExifII")
while the downloaded file begins:
C3 BF C3 98 C3 BF C3 A0 10 4A 46 49 46 01 01 01 01 C3 BF C3 A1 C2 8D 08 45 78 69 66 49 49 2A 08 (or "√ø√ò√ø√†�JFIF����√ø√°¬ç�ExifII*")
Not every byte is converted (the "4A 46 49 46"/"JFIF" run is identical in both, for example), but more than enough changes are applied to make the downloaded file unusable as a JPEG.
I'm assuming that this conversion is happening at the driver/capybara-webkit level, but if you believe that it's happening elsewhere (within capybara itself, for example), please let me know. Any idea what this conversion might be, and/or how to bypass it?
I'm running the latest MacPorts build of qt4-mac (qt4-mac @4.8.4_6) on OS X 10.7.5 with capybara (2.0.3) and capybara-webkit (0.14.2), with ruby-1.9.2-p290 under RVM.
I'd post both images but the converted one is no longer recognized as such; I can email both files directly upon request.
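The two hex dumps above can be compared mechanically. The sketch below is only an inference from those dumps, not a description of capybara-webkit's code: it assumes that each byte is treated as a character code point, that 0x00 bytes are lost, and that the result is written back out as UTF-8. The helper names `hex_to_bytes` and `simulate_text_conversion` are hypothetical.

```ruby
# Byte prefixes copied from the report above.
ORIGINAL_HEX   = "FF D8 FF E0 00 10 4A 46 49 46 00 01 01 00 00 01 " \
                 "00 01 00 00 FF E1 8D 08 45 78 69 66 00 00 49 49"
DOWNLOADED_HEX = "C3 BF C3 98 C3 BF C3 A0 10 4A 46 49 46 01 01 01 " \
                 "01 C3 BF C3 A1 C2 8D 08 45 78 69 66 49 49 2A 08"

# Hypothetical helper: turn a space-separated hex dump into byte values.
def hex_to_bytes(hex)
  hex.split.map { |h| h.to_i(16) }
end

# Hypothetical model of the suspected conversion (an assumption, see above):
# drop NUL bytes, treat every remaining byte as a Unicode code point,
# and encode those code points as UTF-8.
def simulate_text_conversion(bytes)
  bytes.reject(&:zero?).pack("U*").bytes.to_a
end

converted  = simulate_text_conversion(hex_to_bytes(ORIGINAL_HEX))
downloaded = hex_to_bytes(DOWNLOADED_HEX)

# Compare against the same number of bytes from the downloaded prefix.
puts converted == downloaded.first(converted.length)
```

Under these assumptions the 32 original bytes shrink to 23 code points and expand to 30 UTF-8 bytes, which match the first 30 bytes of the downloaded dump. The trailing "2A 08" presumably comes from later in the original file, beyond the quoted prefix. This would also explain why ASCII runs such as "JFIF" survive unchanged.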
This should be fixed on master; could you give it a shot? We haven't released a new version yet, as we're waiting for Capybara 2.1 to be released.
Thanks for the quick reply!
Unfortunately I'm still encountering some funky behavior around downloading/saving binaries (testing with images); testing as follows:
<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">
BTW, I had a few issues installing from master:
gem install_specific -l https://github.com/thoughtbot/capybara-webkit.git
Let me know if you have any questions; BTW, I tested under the latest stable rvm with the latest stable ruby 1.9.3 on OS X 10.7.5.
Don't cast raw frame content to QString
The conversion is lossy and drops non-ASCII characters.
I found a bug in the Body command: it was converting the raw content into a QString, which is a lossy conversion (non-ASCII characters are dropped). Regarding the odd response headers: if the JPEG content is being served as "text/html", unfortunately there's not much we can do. The Body command looks at the content type to decide whether to return the DOM (in the case of "text/html") or the raw content. The raw-content path was being taken here, but the result was being converted to ASCII characters.
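The decision described above can be illustrated in Ruby. This is a minimal sketch of the idea only, not capybara-webkit's actual (C++) implementation: `body_for` and its parameters are hypothetical names, and the content-type check is assumed to be a simple prefix match.

```ruby
# Hypothetical stand-in for the content-type decision described above;
# not capybara-webkit's real code.
def body_for(content_type, dom_html, raw_bytes)
  if content_type.to_s.start_with?("text/html")
    # HTML responses: return the serialized DOM as text.
    dom_html
  else
    # Everything else: hand back the bytes untouched, tagged as binary,
    # so no text encoding step can alter or drop them.
    raw_bytes.dup.force_encoding(Encoding::BINARY)
  end
end

jpeg_prefix = [0xFF, 0xD8, 0xFF, 0xE0].pack("C*")
body = body_for("image/jpeg", "<html></html>", jpeg_prefix)

# The JPEG marker bytes come back exactly as they went in.
puts body.bytes.to_a == [0xFF, 0xD8, 0xFF, 0xE0]
```

The point of the sketch is the second branch: binary content stays a byte string from end to end. It also shows why a JPEG served as "text/html" cannot be rescued this way, since it would take the first branch.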