HAML and 1.9.2 Encoding #519
Comments
This is the layout file from a completely fresh generated application w/ Sequel & Sqlite3 and Haml: https://gist.github.com/743e07cb1ecdc25bac52 The 'index' action was literally an empty file, à la 'touch app/base/index.haml' |
I think it is a Tilt issue: rtomayko/tilt#83 |
If so, why would it work in plain Sinatra+HAML, which Wayne tested separately? Dave, have you experienced this before in any of your apps? |
For clarification, I ran
So the file was identical. |
Additionally,
|
For a bit of an exercise, I went through and found all the areas in Sinatra mentioning encodings (v1.2.6):
Most interesting: def force_encoding
References: def content_type (references default_encoding) |
More research, the only lines actually causing this error were these two characters:
and
With those two symbols removed, the error goes away. So to reproduce this in Padrino, simply do:
# app/layouts/base.haml
!!!
%html
  %head
    %title Encoding Fail
    %meta{ :content => "text/html; charset=UTF-8", "http-equiv" => "Content-Type" }
  %body
    %p ∴ Foo
    %p – Bar
    %p= yield
Render anything with that template and you will see:
Getting a bit closer... |
If I look at the backtrace in Padrino when that encoding issue occurs, I see:
is fundamentally still doing the rendering as you'd expect. So the default encoding is properly set, if I were to add: raise settings.default_encoding I see: |
Brilliant deductive work thus far! |
@wayneeseguin @DAddYE Ok I have figured out the exact source of the issue in Padrino. At least we tracked it down. I have a Padrino app and a Sinatra app both with the same template files. The Padrino app returns:
and the Sinatra app returns with no errors. I tracked the culprit in Padrino to the "set_encoding" method:
def set_encoding
  if RUBY_VERSION < '1.9'
    $KCODE='u'
  else
    Encoding.default_external = Encoding::UTF_8
    Encoding.default_internal = Encoding::UTF_8 # <- THIS
  end
  nil
end
By setting
# PLAIN SINATRA
get "/" do
  Encoding.default_internal = Encoding::UTF_8
  haml :simple, :layout => true
end
will have an exception and:
# PADRINO
get "/" do
  Encoding.default_internal = nil
  haml :simple, :layout => true
end
works like a charm. |
Quick fix for right now @wayneeseguin is to simply set:
# config/boot.rb
Padrino.after_load do
  Encoding.default_internal = nil
end
Padrino.load!
and Padrino appears to render UTF-8 perfectly fine. Unexpectedly it is actually setting the @DAddYE How should we proceed? My feeling is to simply remove that declaration? Although why does that cause this exception to occur? |
That seems very counter-intuitive (what is actually happening); however, thank you for the fix! I am very excited about using Padrino, this speed bump aside :) |
Glad I could fix this with only about an hour of investigation. I was actually surprised and tried almost everything else before removing that line as the culprit. I actually prefer that internal encoding being set to UTF-8. It also feels like it must be a bug of some sort in HAML. As if trying to force the 'internal' encoding to be UTF-8 exposes an issue with HAML itself. Anyhow glad the fix was pretty easy and glad to hear you are using Padrino :) |
@nex3 @chriseppstein If you can read through this and give your opinion, I would be grateful. The tl;dr is:
# PLAIN SINATRA
get "/" do
  Encoding.default_internal = Encoding::UTF_8
  haml :simple, :layout => true
end
Fails with an exception in Sinatra:
but:
# PLAIN SINATRA
get "/" do
  Encoding.default_internal = nil
  haml :simple, :layout => true
end
Renders without issue given the layout:
# app/layouts/base.haml
!!!
%html
  %head
    %title Encoding Fail
    %meta{ :content => "text/html; charset=UTF-8", "http-equiv" => "Content-Type" }
  %body
    %p ∴ Foo
    %p – Bar
    %p= yield
Any ideas? I am on the latest haml (3.1.1) and ruby 1.9.2 |
Interesting also encoding reference: http://haml-lang.com/docs/yardoc/file.HAML_REFERENCE.html |
Yeah I saw that, but why does that mean that default_internal cannot be set to UTF8? |
Mmm, no idea. I'm interested in what @rkh thinks about that. |
An interesting thing is that with:
Encoding.default_internal = nil
Haml::Template.options[:encoding] = "utf-8"
but the strange thing is that Haml set |
With |
By default Haml thinks that the source is |
Here's what's happening, as far as I can tell:
The fundamental problem here is that |
@nex3 Even with Encoding.default_external set to 'utf-8', Sinatra still throws an exception, so that doesn't seem to fix the issue. As far as I can tell, the problem persists with both internal and external set to 'utf-8'. Why would that be? Assuming we don't want to set the encoding in every single file, what's the best fallback? Set default_internal to nil? |
If Setting |
It should not be set to US-ASCII, but to ASCII-8BIT, since we use Note: I do not know if this fully applies for Padrino, this is what's going on in vanilla Sinatra. |
Well, there's the problem. Haml's getting passed an ASCII-8BIT file by Tilt, which it then tries to re-encode to UTF-8, causing the failure. @rkh: the policy of passing binary data to template engines seems mistaken to me. Haml, for instance, assumes it's being passed a well-formed and properly-encoded document, an assumption that a binary-flagged UTF-8-encoded document violates. |
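The failure mode described here can be reproduced in plain Ruby, without Sinatra, Tilt, or Haml. The following is only a sketch: `reencode` and `try_reencode` are hypothetical helpers standing in for "transcode the template source to the internal encoding", and are not Haml's actual code.

```ruby
# Sketch only: a stand-in for "re-encode the template source", not Haml's code.

# Hypothetical helper: transcode to `internal` when one is configured,
# otherwise hand the source back untouched.
def reencode(source, internal)
  internal ? source.encode(internal) : source
end

# Returns :ok or :conversion_error instead of raising, for easy comparison.
def try_reencode(source, internal)
  reencode(source, internal)
  :ok
rescue Encoding::UndefinedConversionError
  :conversion_error
end

# "\u2234" is the "therefore" sign from the layout; String#b returns the same
# bytes tagged as ASCII-8BIT, the way a binary read would deliver them.
binary_source = "%p \u2234 Foo".b

puts try_reencode(binary_source, Encoding::UTF_8) # high bytes have no mapping from binary
puts try_reencode(binary_source, nil)             # no internal encoding, nothing to transcode
puts try_reencode("%p Foo".b, Encoding::UTF_8)    # pure ASCII converts cleanly
```

Only the binary-tagged source containing non-ASCII bytes fails, and only when a target encoding is set, which matches the observation that removing the two symbols or setting default_internal to nil makes the error go away.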
@nex3: I disagree, parsing as ASCII-8BIT is the only option we have. We cannot know any encoding before Haml parses the source code. Otherwise we would have to re-implement simple parsers for every supported template engine (currently 22, not counting those added by extensions) to figure out what encoding the files are in. What if it is some Japanese encoding, or UTF-16 (likely on Windows)? Simply assuming the file is in UTF-8 would break that. Moreover, Rack follows a strict "everything we are not 100% sure of is ASCII-8BIT" policy, with the basic idea that this will have Strings behaving exactly as on 1.8. In contrast to Rails, Sinatra/Tilt is not opinionated and we cannot simply assume files are in UTF-8. Might be an alternative option to not pass |
Note: It is not just that way because we thought it makes sense (I had everything default to UTF-8 at one point), but because we got actual bug reports for this. There are still ppl out there using different encodings. |
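To make the "simple parser per engine" cost concrete, here is a rough sketch of what sniffing an encoding declaration for a single engine might involve. It assumes a Haml-style first-line comment such as `-# coding: utf-8`; `CODING_COMMENT`, `declared_encoding`, and `tag_source` are hypothetical names and are not part of Tilt, Sinatra, or Haml.

```ruby
# Hypothetical helpers, not part of Tilt, Sinatra, or Haml.

# Matches a "-# coding: NAME" (or "coding=NAME") comment on the first line.
CODING_COMMENT = /\A-#.*?coding[:=]\s*([\w.-]+)/

# Returns the Encoding named in the first-line comment, or `default` when
# there is no comment or the name is not a known encoding.
def declared_encoding(source, default)
  name = source[CODING_COMMENT, 1]
  name ? Encoding.find(name) : default
rescue ArgumentError
  default
end

# Retags a copy of the binary-read source without touching its bytes.
def tag_source(source, default = Encoding::UTF_8)
  source.dup.force_encoding(declared_encoding(source, default))
end

puts declared_encoding("-# coding: Shift_JIS\n%p Foo".b, Encoding::UTF_8)
puts declared_encoding("%p Foo".b, Encoding::UTF_8)
puts tag_source("%p \u2234 Foo".b).encoding
```

Every engine has its own comment syntax, so a helper like this would need a counterpart for each one, which is the maintenance burden being objected to.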
@rkh: Ruby has a built-in mechanism for specifying what encoding external files are ( You seem to assume that all files will specify their encodings internally. But as @nesquena points out, people don't want to add an encoding comment to each file. They want to be able to set a default, and you're ignoring that default. The way Rails does this, for reference, is not to force everything into UTF-8. Rails assumes external files use |
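The two policies being debated can be compared in plain Ruby. This sketch assumes `Encoding.default_external` is set to UTF-8 (as Padrino's set_encoding does) and uses a hypothetical `read_both_ways` helper with a temp file in place of a real template.

```ruby
require "tempfile"

# Mirror Padrino's set_encoding so the sketch does not depend on the locale.
Encoding.default_external = Encoding::UTF_8

# Hypothetical helper: write the text to a temp file, then read it back
# once as binary and once tagged with Encoding.default_external.
def read_both_ways(text)
  Tempfile.create("layout") do |file|
    file.binmode
    file.write(text)
    file.flush
    binary  = File.binread(file.path)
    default = File.read(file.path, encoding: Encoding.default_external)
    [binary, default]
  end
end

binary, default = read_both_ways("%p \u2234 Foo\n")
puts binary.encoding               # the binary read is tagged ASCII-8BIT
puts default.encoding              # the second read carries default_external
puts binary.bytes == default.bytes # the bytes themselves are identical
```

The bytes on disk are the same either way; the only difference is the encoding tag the string carries when it reaches the template engine.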
@rkh @nex3 Yes, we are using the exact same Sinatra and Tilt rendering as you guys at the end of the day. It is indeed the Tilt based ASCII-8BIT encoding which causes the exception with HAML when default_internal is set. If we can fix this for Sinatra and Tilt then it will be fixed in Padrino as well. Seems this might actually be a Tilt issue where it needs to respect the Encoding defaults if set in 1.9? |
The Tilt ASCII-8BIT encoding issue presents a bit of a problem for us in the meantime. Still, I suspect the best thing for Padrino to do is leave the default encoding set to UTF8 as it is now, since this is actually correct, and let these changes be made downstream. Man I hate encoding issues. Just to restate for Padrino users, the quick workaround hack if anyone experiences this problem is to:
# config/boot.rb
Padrino.after_load do
  Encoding.default_internal = nil
end |
@nex3: The default value of There is also a corresponding Tilt issue, btw: rtomayko/tilt#75 |
Sure, using Having the default be BINARY and ignoring the standard mechanism for overriding that default is not the proper solution. |
Ok, but what is? Loading UTF-8 files as US-ASCII is neither, nor is having US-ASCII templates that work properly and then insert UTF-8 values from the HTTP communication. |
The UTF-8-over-HTTP problem is going to exist just as much with BINARY templates as with From a template engine's point of view, Rails' solution is pretty much ideal: use |
It is not ideal, |
Perhaps we should consider moving our discussion back to rtomayko/tilt#75 since that is where this should probably be fixed. I am closing this ticket since nothing needs to change in Padrino, and there's an easy hacky workaround if anyone encounters this problem. Thanks guys! |
So, I'm tired of waiting for Tilt to fix it and bored of adding magic comments and BOM markers to my templates. I have a tiny monkeypatch to help Haml deal with utf-8 files. Hope this helps those who don't feel like setting
module Tilt
  class HamlTemplate
    def prepare
      @data.force_encoding Encoding.default_external # magic line
      options = @options.merge(:filename => eval_file, :line => line)
      @engine = ::Haml::Engine.new(data, options)
    end
  end
end |
Thanks for that patch. I agree with @nex3 that tilt should respect default_external - that was the first thing I thought of, since this is the built-in mechanism for resolving such issues and making the code bullet-proof. |
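For anyone worried about templates that are not actually in the default encoding, a slightly more defensive variant of the "magic line" is possible. This is a sketch only: `tag_with_default_external` is a hypothetical helper, not something Tilt or Haml provides, and falling back to binary is just one possible policy.

```ruby
# Hypothetical variant of the "magic line": retag binary data with the
# external encoding, but keep it binary when the bytes are not valid in it.
def tag_with_default_external(data, external = Encoding.default_external)
  tagged = data.dup.force_encoding(external)
  tagged.valid_encoding? ? tagged : data
end

utf8_bytes   = "%p \u2013 Bar".b   # en dash, as in the layout
latin1_bytes = "%p \xE9t\xE9".b    # ISO-8859-1 bytes, not valid UTF-8

puts tag_with_default_external(utf8_bytes, Encoding::UTF_8).encoding
puts tag_with_default_external(latin1_bytes, Encoding::UTF_8).encoding
```

force_encoding only changes the tag and never the bytes, so a wrong guess does not corrupt the data; the valid_encoding? check simply avoids handing the engine a string that claims to be UTF-8 but is not.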
Use tilt 1.4.0+ (fixes haml & UTF-8 issue #519)
Chatting with wayneeseguin: http://twitter.com/#!/wayneeseguin/statuses/67353013299322880
Looks like he is experiencing encoding errors when he uses Padrino with HAML but not with Sinatra+Haml. It works if he does:
in each HAML file but this shouldn't be necessary. We need to investigate why this happens and correct the issue for the next release.