Incoming message non-English characters are shown as ** #128
Comments
I am unable to reproduce this, either manually by sending emails directly, or via functional tests. Considering this request: http://informatazyrtare.org/sq/request/special_characters I note that the raw_email is correctly stored in encoded form: http://informatazyrtare.org/sq/admin/request/show_raw_email/28 If I download it (using the download link on that page), and then run:
The message is displayed correctly on my system (in the Holding Pen, since I don't have the original request). Do you still get the asterisks if you do the same? It's interesting to note that there are two asterisks for each single Unicode character, which suggests a UTF-8 encoding issue, as if something is trying to convert it to ASCII.
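The two-asterisks-per-character pattern is consistent with UTF-8 bytes being read one at a time. A minimal Ruby sketch of that effect follows; the `ascii_mangle` helper is hypothetical and only imitates the symptom, it is not a description of what elinks does internally:

```ruby
# Hypothetical helper: imitate a tool that reads UTF-8 input one byte at a
# time and prints "*" for every byte outside 7-bit ASCII.
def ascii_mangle(text)
  text.bytes.map { |b| b < 128 ? b.chr : "*" }.join
end

puts "ë".bytes.length             # number of bytes UTF-8 uses for "ë"
puts ascii_mangle("Përshëndetje") # each "ë" comes out as two asterisks
```

Characters such as ë and ç each take two bytes in UTF-8, which would explain why every one of them turns into exactly two asterisks.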
Also... what are
...and you're running Ruby 1.8.x, right?
I am running Ruby 1.8.7. I downloaded the message and ran cat .. ./script/mail, and the result is still the same.
So, the problem is that elinks is used to generate the plain text email based on the HTML version. Elinks is receiving UTF-8 but treating it as some single-byte character set, perhaps ASCII. The invocation https://github.com/sebbacon/alaveteli/blob/master/app/models/incoming_message.rb#L832 calls through to the plain-textify method without passing a charset. In this particular example, however, the email supplied wrongly declares its charset to be The reason this doesn't work on the IZ server and works everywhere else is also unclear. The Possibly the correct thing to do these days is to assume UTF-8 in the first instance (given that here we have a broken charset in any case).
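To make the "assume UTF-8 in the first instance" idea concrete, here is a rough sketch. The `decode_body` helper and its parameters are hypothetical and are not taken from the Alaveteli code or from the fix commit; it only shows the order of the checks:

```ruby
# Hypothetical helper: treat the raw bytes as UTF-8 when they are valid
# UTF-8, and only fall back to the charset the email declares when they
# are not.
def decode_body(raw_bytes, declared_charset)
  utf8 = raw_bytes.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  begin
    raw_bytes.dup.force_encoding(declared_charset).encode(Encoding::UTF_8)
  rescue ArgumentError, EncodingError
    # Unknown or unusable declared charset: keep the text, mark bad bytes.
    utf8.scrub("?")
  end
end

# Valid UTF-8 bytes are used as they are, whatever the header claims.
puts decode_body("Përshëndetje".b, "us-ascii")
```

The trade-off is that a single-byte body which happens to be valid UTF-8 would be misread, but that is rare for real text, and a wrongly declared charset no longer mangles a genuine UTF-8 message.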
Fixed in dbac412. (Though note that the test, while it failed on the IZ server, never failed for me anywhere else.)
When looking into the performance issues on
Fixed with:
Not sure how it ended up in this state, though.
Reopening this though I think the issue might be slightly different. Incoming emails such as this one are still displaying double-byte characters as asterisks. It is part of the elinks conversion. The really puzzling thing is that if I get the incoming message from the command line, and cause it to regenerate the text version, it works:
Viewing the request from the web browser now shows the correct characters. However, if I reset the cache again as above, but omit the final step ( This is very confusing! I can only assume the difference is somehow between the environment the web server's running in, and the environment that my rails console is running in (e.g. Apache is running as
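One way to test the "different environment" theory would be to capture the same snapshot from the rails console and from inside a web request, then compare the two. This is only a sketch: `locale_snapshot`, `snapshot_diff`, and the list of keys are hypothetical choices, not part of Alaveteli. `HOME` is included on the assumption that elinks picks up per-user configuration from the home directory of whichever user runs it:

```ruby
# Hypothetical diagnostic: collect the settings most likely to differ between
# a console session and the web server process.
LOCALE_KEYS = %w[LANG LC_ALL LC_CTYPE HOME USER].freeze

def locale_snapshot(env = ENV)
  snapshot = LOCALE_KEYS.map { |key| [key, env[key]] }.to_h
  snapshot["default_external"] = Encoding.default_external.name
  snapshot
end

# Names of the settings whose values differ between two snapshots.
def snapshot_diff(first, second)
  (first.keys | second.keys).reject { |key| first[key] == second[key] }
end

puts locale_snapshot.inspect
```

Logging `locale_snapshot` from both contexts and passing the two hashes to `snapshot_diff` would show at a glance whether the locale or the home directory differs.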
I should add I am unable to reproduce this locally, so it appears to be something to do with the server settings, as was the case previously.
Also, if I run the
I just double checked this. The So the input to the command isn't the problem. The command is always invoked the same, viz. If I run that command from a bash shell against the utf-8 input, then I get utf-8 output. If I cause that command to be run from Rails, via the console using If I cause that command to be run from Rails, via a web browser, I get the double byte characters replaced by asterisks. The asterisks can be reproduced by forcing elinks to assume ASCII for the input ( Passenger is running Rails as However, it is clearly some kind of elinks-related setting that is overriding or ignoring the codepage we're trying to set from the command line, because I have fixed this by adding
When an authority replies to a request with characters like ë, ç, they get replaced with **
This does not happen with outgoing messages.