Impossible to access files with an accent in the name #1294
What's the operating system of the host machine?
OS X (local server)
Do you know what S3 does as far as collation goes? It may be that you have to give it strings in a particular canonical form for things to work properly (and it's worth remembering that OS X gives you an almost, but not quite, canonically decomposed form for file names).
@fcheung What do you mean by collation? I'm providing a simple string :s
Nothing's ever simple when Unicode is involved. Collation means how strings are compared and ordered, which is important in a Unicode world because different languages can have different rules and because something like é can be stored in more than one way. I've no idea what S3 does in this respect. I would naively assume that you'd be OK if you're just using the string returned from an S3 list operation. It's probably also worth checking that fog is doing the percent escaping properly: have a look with tcpdump or similar to see what the request actually looks like.
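To make the percent-escaping point concrete, here is a minimal sketch of how the two ways of storing é end up looking different once escaped. The file name in the thread is not shown, so "Pé" is only a stand-in, and `ERB::Util.url_encode` is used purely for illustration; it is not necessarily what fog calls internally.

```ruby
require 'erb'

# Hypothetical stand-in names: both print as "Pé" on screen.
composed   = "P\u00E9"   # é as a single code point (U+00E9)
decomposed = "Pe\u0301"  # "e" followed by the combining acute accent (U+0301)

# Percent-escape each string the way it could appear in a request path.
escaped_composed   = ERB::Util.url_encode(composed)
escaped_decomposed = ERB::Util.url_encode(decomposed)

puts escaped_composed    # P%C3%A9
puts escaped_decomposed  # Pe%CC%81
```

If a capture of the real request shows one of these forms while the stored key uses the other, the lookup is asking for a different key.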
If I retrieve the file list from S3, select the correct file, call .key and then .inspect on the string (a string with which I can retrieve the file without any problem), I get exactly the same result as .inspect on my previous string :s Any idea why?
Do the raw bytes differ (i.e. s.bytes.to_a)?
Nope :s
Any idea what to try next? I'm starting to go mad :s
Those byte sequences look different to me
Which is P, e, 0xcc, 0x81 (0xcc 0x81 is the UTF-8 encoding of the combining acute accent, U+0301). So both byte sequences result in the same sequence of glyphs when printed on screen, but S3 appears to consider them different. Which one of those strings works? Fred.
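For anyone following along, the byte comparison can be reproduced with a short sketch. The actual file name is not shown in the thread, so the two strings below are assumed stand-ins that only share the leading "Pé".

```ruby
# Hypothetical stand-ins: identical on screen, different underneath.
composed   = "P\u00E9"   # precomposed é (U+00E9)
decomposed = "Pe\u0301"  # e + combining acute accent (U+0301)

# Dump the raw bytes as hex, mirroring the s.bytes.to_a check.
composed_hex   = composed.bytes.map { |b| format('0x%02x', b) }
decomposed_hex = decomposed.bytes.map { |b| format('0x%02x', b) }

p composed_hex            # ["0x50", "0xc3", "0xa9"]
p decomposed_hex          # ["0x50", "0x65", "0xcc", "0x81"]
p composed == decomposed  # false: Ruby compares the bytes, not the glyphs
```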
Sorry @fcheung, I meant "Nope, they differ" :s Both strings print the same on screen, but only the first one (the one retrieved from directory.files) returns the correct file. Thanks for your help so far.
Hello Etienne, just saw it on Twitter. You have two kinds of é
Which uploader are you using (paperclip, carrierwave, dragonfly, something homemade)?
Hey @maximeg ;) I'm using jQuery File Upload without any specific uploader (very large files), which POSTs an S3 URL to a very simple controller that stores the value in a Postgres database. Isn't it possible to convert an NFD string to NFC?
I think I get it... On Mac OS, filenames are in NFD, so when your jQuery thing sends back the filename of the just-uploaded file, it sends NFD... and maybe PostgreSQL doesn't care. (Filenames on the Web and on Linux are typically NFC, and S3 does the math.)
Long day ;) Here is the final solution: In the Gemfile:
=> https://github.com/knu/ruby-unf
Then a simple .to_nfc on the path_name!
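As a rough sketch of the same idea without the extra gem: Ruby 2.2 and later ship `String#unicode_normalize`, which is swapped in here for the `.to_nfc` call from ruby-unf used above. The helper name `nfc_key` and the sample string are made up for illustration.

```ruby
# Hypothetical helper: normalize a path name to NFC before using it as an S3 key.
# Uses the standard library's String#unicode_normalize (Ruby >= 2.2)
# in place of the .to_nfc method that ruby-unf provides.
def nfc_key(path_name)
  path_name.unicode_normalize(:nfc)
end

from_os_x  = "Pe\u0301"            # NFD-style name, as OS X would report it
normalized = nfc_key(from_os_x)

p normalized.bytes                 # [80, 195, 169]
p normalized == "P\u00E9"          # true: now matches the precomposed form
```

The normalized string is then the one to hand to the fog lookup, assuming the key stored on S3 is in NFC, as it turned out to be in this thread.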
Thanks for working through this! @EtienneDepaulis - what a tricky one... Thanks for sharing the solution.
Hello,
We are using fog to manage files on our S3 server.
We have a file with an accent in its name, which shows up in the list if we do:
But when we try to access it with the same path name:
fog returns nil.
Any idea?
(.encoding on the string returns #<Encoding:UTF-8>)