Encoding Logic and Fallback... #213
Comments
Any reason you can't set Encoding.default_external = Encoding::UTF_8? Falling back to UTF-8 seems like a good idea, but it might also lead to inconsistencies. It would mean that files with only ASCII characters would have ASCII encoding, while files with UTF-8 characters would have UTF-8 encoding. This could cause odd errors if you're concatenating them, or manipulating them together in any other way.
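As a rough illustration of the suggestion above, the following sketch shows how the encoding tag on a file's contents follows Encoding.default_external, and what Ruby reports when an ASCII-tagged string meets a UTF-8 one. The helper read_under_default is hypothetical and exists only for this example; it is not part of Tilt.

```ruby
require "tempfile"

# Hypothetical helper: read a file while a given default external encoding
# is in effect, then restore the previous default.
def read_under_default(path, encoding)
  original = Encoding.default_external
  Encoding.default_external = encoding
  File.read(path) # text-mode read: the string is tagged with the default
ensure
  Encoding.default_external = original
end

Tempfile.create("template") do |file|
  File.binwrite(file.path, "h\u00E9llo") # UTF-8 bytes on disk

  as_utf8 = read_under_default(file.path, Encoding::UTF_8)
  puts as_utf8.encoding         # prints UTF-8
  puts as_utf8.valid_encoding?  # prints true

  as_ascii = read_under_default(file.path, Encoding::US_ASCII)
  puts as_ascii.encoding        # prints US-ASCII
  puts as_ascii.valid_encoding? # prints false
end

# Mixing an ASCII-only string tagged US-ASCII with a UTF-8 string.
ascii_only = "hello ".dup.force_encoding(Encoding::US_ASCII)
utf8       = "w\u00F6rld"
puts Encoding.compatible?(ascii_only, utf8) # prints UTF-8
puts (ascii_only + utf8).encoding           # prints UTF-8
```

In this simple case Ruby treats the two strings as compatible because one of them is pure ASCII. Whether mixed tags cause trouble in a real application depends on what the strings contain and how they are combined.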
My rationale...
The fact is, my setup is not unusual. I run a standard operating system (openSUSE), standard web server (Nginx), standard RVM installation with Ruby 2.0.p-xxx, and my templates are as standard as you can get. The fact that under such a typical operating environment Tilt fails to load a template encoded in the most common (and most technically superior) encoding, UTF-8, to me highlights a problem that probably needs to be addressed. Sure, it's a combination of shortfalls all the way down to the OS level, but Tilt has to deal with it like all higher-level libraries should. No one should have to discover, like I did, that pretty much every *nix distro falls back to POSIX as its default locale when the environment is cleared, with no means to change it to anything else; in other words, locale is completely dependent on environment variables. Most init.d scripts, such as the Nginx one for my openSUSE, clear the environment, which is why this problem arises.
On top of that, the locale of the server is barely relevant to what the templates of my application/website are encoded in. I develop on my desktop computer, not on my server, so the encoding of my files is going to reflect the locale on my local computer and whatever software I used to generate the template. The templates I create could be deployed to any server; imagine a popular open-source application using Tilt that's deployed all around the world. So there are a couple of reasons not to use the external default encoding.
With that said, I can see your point in regards to my suggestion, so perhaps instead of falling back to UTF-8, UTF-8 can be Tilt's default encoding, overridable through a setting in Tilt, and
Thoughts?
Default behavior now:
Suggested default behavior:
So, basically, the suggestion is that Tilt should ignore Encoding.default_external (set by the system or app) in favor of standard UTF-8, and force some users to set Tilt.default_encoding to something exotic if they need to. @Wardrop am I seeing your point right?
This needs some clarifications:
You can set Encoding.default_external = 'UTF-8'
Tilt already has a global default encoding: Encoding.default_external. I'm finding it hard to find cases where you want Tilt's (global) default encoding to be different from Encoding.default_external. Is it likely that your templates are UTF-8, but every other file you read is US-ASCII?
Also, what if you use Tilt to load templates from multiple locations with different encodings? Then the global default encoding is useless anyway, and you'll have to pass
The way I see it:
To clarify, this discussion is all about default behaviours. It's not that Tilt doesn't provide plenty of means to override the default behaviour. Ruby's Encoding.default_external can be set at many levels, and Tilt provides its own means of overriding this, by either setting
The point of this discussion is that, in what likely makes up a significant portion of production environments, the current default usually resolves to something non-UTF-8, whilst I'd hazard a guess that the [vast] majority of templates are encoded in UTF-8. As it stands, the average user is likely to run into an encoding issue at some point between development and deployment, if not some time after, should that deployment environment change in some way. I suppose there's an expectation these days that UTF-8 should always "just work" without having to make any special considerations. Am I being too idealistic? I feel the current default behaviour would make a better fallback strategy than a primary strategy. I believe UTF-8 should be assumed, unless where
That's what I'd like to see.
My main issue is adding another global setting. If people use Tilt in an environment that doesn't use UTF-8 they now need to set two settings (
Tilt is also a library, not a framework. I expect people to build tools on top of Tilt and these tools can be more opinionated about encodings. For now I'd rather keep it as close as possible to Ruby's encoding system.
It's an awkward situation, and really, Encoding.default_external should ideally always be UTF-8 on a modern western operating system, but unfortunately there are scenarios where that is not the case (as we've discussed). I believe setting the encoding on a per-application basis is perhaps the best option, and like you said, this is something I can default to UTF-8 in my framework.
Yep, it's that encoding subject again. I have a problem with Tilt forcing Encoding::default_external and exploding when it doesn't work out. The problem is that a lot of web servers, and thus Ruby (such as in the case of Phusion Passenger), are run in the context of a clear environment, equivalent to env -i. Under this context, the locale always reverts to "POSIX" on many *nix distros, under which Encoding::default_external defaults to US-ASCII.
What happens in this scenario is that Tilt tries forcing the encoding of what's typically a UTF-8 file to ASCII. It then does a valid_encoding? check, finds it's false, and raises an error.
I think a better behaviour would be to load the file without the binary switch (so Ruby reads in the string using the default external encoding), check if the encoding is valid, and if not, force the encoding to UTF-8. If it's still invalid, THEN by all means raise the error. UTF-8 is a very common and compatible encoding. Most other encodings will still work when forced to UTF-8, so in my opinion, if the default encoding fails, UTF-8 should at least be attempted as a sane fallback behaviour.
Otherwise, Tilt is always at the mercy of the environment it's being run in, making for sensitive applications that are prone to breaking. Robustness should be the goal here.
Thoughts?
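The fallback proposed in this description could be sketched roughly as follows. This is not Tilt's implementation: read_with_utf8_fallback is a hypothetical helper, and for simplicity it reads the raw bytes and tags them with the primary encoding instead of doing a text-mode read, which is equivalent for the purpose of the validity check.

```ruby
require "tempfile"

# Hypothetical helper (not part of Tilt): tag the file's bytes with the
# primary encoding, fall back to UTF-8 if they are invalid in it, and raise
# only if neither encoding fits.
def read_with_utf8_fallback(path, primary = Encoding.default_external)
  data = File.binread(path)

  [primary, Encoding::UTF_8].each do |encoding|
    data.force_encoding(encoding)
    return data if data.valid_encoding?
  end

  raise EncodingError, "#{path} is not valid #{primary} or UTF-8"
end

# Usage: a UTF-8 template read while the primary encoding is US-ASCII,
# as it would be under a cleared environment.
Tempfile.create("template") do |file|
  File.binwrite(file.path, "h\u00E9llo")
  text = read_with_utf8_fallback(file.path, Encoding::US_ASCII)
  puts text.encoding # prints UTF-8
end
```

Note that an ASCII-only file keeps the primary encoding's tag under this scheme, so the resulting encoding depends on the file's content.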