New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

html_error_template doesn't handle non-ascii characters in source code #37

Closed
sqlalchemy-bot opened this Issue Apr 15, 2007 · 4 comments

Comments

Projects
None yet
1 participant
@sqlalchemy-bot

sqlalchemy-bot commented Apr 15, 2007

Migrated issue, originally created by Anonymous

Max Ischenko reported this here:

http://groups.google.com/group/pylons-discuss/browse_frm/thread/22e2af6ba7bbf778

I've attached a patch to convert the source code to unicode via the source code's magic encoding comment (like Myghty does -- in fact I pretty much ripped all of dairiki's code for this). In the rare case that the magic encoding comment doesn't describe the non-ascii characters correctly, the 'replace' error handler is used to add garbage in its place

But before we commit it, I think I would like to see an addition to the Template class.

Pylons currently does this in its exception handler to render the exception:

            return mako.exceptions.html_error_template().render()[610:-16]

With this patch, and Max's example, Pylons now causes a UnicodeEncodeError here:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 4-9: ordinal not in range(128)

What's needed for Pylons to render the html_error_template correctly (including the non-ascii characters showing up in the browser), it now has to do this:

            return mako.exceptions.html_error_template().render_unicode().encode(sys.getdefaultencoding(), 'htmlentityreplace')[610:-16]

Mike I may have asked you this before, but is there a reason why mako Templates don't have an 'encoding_errors' parameter to go along with 'output_encoding'?

If it did, we could have html_error_template set encoding_errors='htmlentitiyreplace'. Then anyone needing an html_error_template wouldn't have to be concerned with unicode at all -- they could simply render() (like Pylons is doing now) and always get back a sys.getdefaultencoding() encoded string with html entitys for non-defaultencoding characters. That is what Myghty did:

http://www.myghty.org/trac/changeset/2120

http://www.myghty.org/trac/changeset/2121

I remember arguing this implementation a little with dairiki when he did this, but it seemed to work out pretty well


Attachments: html_error_source_encoding.diff

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 15, 2007

Anonymous wrote:

Er, this is pjenvey btw

CC: pjenvey@groovie.org

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 16, 2007

Michael Bayer (@zzzeek) wrote:

with regards to the patch, why would we go through all that trouble to parse the magic encoding comment when we are receiving a CompileException or SyntaxException ? in both cases, we have already lexed the template and we have already converted the source to unicode. this encoding could be passed along to CompileException and SyntaxException. also "parser" is not defined, and the whole BOM thing if we want to support that (which im not thrilled about, but I can be swayed) should just be part of Lexer.

the second part of the question, putting encoding_errors as a parameter to Template/TemplateLookup, i would say its not a big deal, the only reason its not there is because there already is a way to convert to unicode explicitly (more than one way to do it...but not a big deal)

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 20, 2007

Michael Bayer (@zzzeek) wrote:

ok so in [changeset:259] im propigating the source of the original template as a unicode object to the LexerException/CompileException, also ModuleInfo can get at the template source and its encoding from the generated module so also returns source as a unicode object. added tests for every version of this issue i could think of (memory based, file based, etc).

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 20, 2007

Changes by Michael Bayer (@zzzeek):

  • changed status to closed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment