Is it possible that Patron could handle this incorrect encoding name in header and fall back on the charset specified in the response body?
As of now I get the following error:
ArgumentError: unknown encoding name - iso-88591
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/response.rb:69:in `force_encoding'
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/response.rb:69:in `convert_to_default_encoding!'
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/response.rb:42:in `block in initialize'
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/response.rb:41:in `each'
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/response.rb:41:in `initialize'
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/session.rb:222:in `handle_request'
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/session.rb:222:in `request'
from /home/username/shared/bundle/ruby/1.9.1/gems/patron-0.4.16/lib/patron/session.rb:125:in `get'
Patron currently doesn't parse the content for anything and I would like to keep it that way. What I will do is make the charset coercion optional and allow users to specify which Content-Types should be coerced and a fallback type.
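To make the proposal concrete, here is a rough sketch of what optional coercion with a fallback might look like. It is not Patron's code: `coerce_body`, its `coerce_types` list, and its `fallback` argument are hypothetical names made up for illustration. It only reads the `Content-Type` header, never the body, and falls back when Ruby does not recognize the charset name.

```ruby
# Hypothetical helper, NOT part of Patron's API. All names are assumptions.
# Tags the body with the header's charset only for the listed media-type
# prefixes, and uses `fallback` when the charset is missing or unknown.
def coerce_body(body, content_type, coerce_types = ["text/"], fallback = Encoding::BINARY)
  media_type, *params = content_type.to_s.split(";").map(&:strip)

  # Content-Types outside the allow list are left as raw bytes.
  unless coerce_types.any? { |prefix| media_type.to_s.start_with?(prefix) }
    return body.dup.force_encoding(Encoding::BINARY)
  end

  # Pull the charset parameter out of the header, if there is one.
  charset = params.map { |param| param[/\Acharset=(.+)\z/i, 1] }.compact.first

  encoding = begin
    charset ? Encoding.find(charset.delete('"')) : fallback
  rescue ArgumentError
    # Unknown name such as "iso-88591": use the fallback instead of raising.
    fallback
  end

  body.dup.force_encoding(encoding)
end

bad  = coerce_body("abc", "text/html; charset=iso-88591")
good = coerce_body("abc", "text/html; charset=ISO-8859-1")
puts bad.encoding   # the fallback encoding
puts good.encoding  # the encoding named in the header
```

With something along these lines, a misspelled charset would no longer abort the request; the caller would just get the body tagged with whatever fallback they chose.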
This needs to be tackled a bit more broadly IMO. The problem you saw is not unique: in Russia, for example, it was long customary to force charset headers onto pages that had an entirely different charset specified in the HTML. Parsing HTML is out of scope for Patron (I agree with @toland on this), but I think there could be an extra method on the Response, called something like binary_body, for cases where the encoding detection failed or didn't work for some reason. The user would then be able to fall back to handling the encoding manually, covering all possible scenarios such as a corpus-based charset guesser, an HTML parser, or whatevs.
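A minimal sketch of the binary_body idea, assuming a stand-in class rather than Patron's real Response: `SketchResponse` and `charset_from_meta` are hypothetical names, and the regex is a deliberately naive user-side guess, not a real HTML parser.

```ruby
# Stand-in for illustration only; this is NOT Patron's Response class.
class SketchResponse
  # Raw bytes exactly as received, tagged as binary (ASCII-8BIT).
  attr_reader :binary_body

  def initialize(raw_body, charset)
    @binary_body = raw_body.dup.force_encoding(Encoding::BINARY)
    @charset = charset
  end

  # Body tagged with the header charset; raw bytes if the name is unusable.
  def body
    @binary_body.dup.force_encoding(Encoding.find(@charset))
  rescue ArgumentError, TypeError
    # ArgumentError: unknown encoding name; TypeError: charset was nil.
    @binary_body
  end
end

# User-side fallback (hypothetical): look for a charset in a <meta> tag.
def charset_from_meta(bytes)
  bytes[/<meta[^>]+charset=["']?([\w-]+)/i, 1]
end

raw  = "<meta charset=\"iso-8859-1\">caf\xE9".dup.force_encoding(Encoding::BINARY)
resp = SketchResponse.new(raw, "iso-88591") # broken name from the header

name = charset_from_meta(resp.binary_body)
text = resp.binary_body.dup.force_encoding(name).encode("UTF-8")
puts text
```

The library stays out of HTML parsing here; it only guarantees that the untouched bytes are always reachable, and the caller decides how to guess the encoding.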
I found a site that returns an incorrect encoding name in the response header:
But the response body has a correct encoding specified