Inline parse now returns Unicode.
This has to do with an error in the blog, where the body and tease text are passed through the inline parser before being sent to Markdown (if the user chooses to use Markdown as their markup language). If the text contained characters like fancy curly quotes or emdashes, Markdown would throw an error, complaining that the input was neither ASCII nor Unicode. Strangely, this occurred only with the text after it had been put through the inline parser. At first I thought BeautifulSoup might be doing something odd to the string encoding, but eventually I tracked it down to Django's mark_safe function. The inline parser passes its return value through mark_safe right before returning it. I don't really understand why, but for some reason mark_safe returned non-Unicode content. When I pass the string through Python's unicode constructor before passing it to mark_safe, the string that mark_safe returns is proper Unicode. Markdown can process it fine, even with curly quotes.
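The effect of the change can be sketched without Django or BeautifulSoup. This is only a minimal model, assuming the cause is that the parser's `content` is a parsed-document object rather than a string, and that a helper which only special-cases strings coerces anything else to bytes. `FakeSoup`, `SafeBytes`, `SafeText` and `mark_safe_model` are hypothetical stand-ins, not real Django or BeautifulSoup APIs, and the sketch is written for Python 3, where `str` plays the role of Python 2's `unicode`.

```python
class SafeBytes(bytes):
    """Hypothetical stand-in for a 'safe' byte string."""


class SafeText(str):
    """Hypothetical stand-in for a 'safe' text (Unicode) string."""


class FakeSoup:
    """Hypothetical stand-in for a parsed document object (not a string)."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def __bytes__(self):
        return self.text.encode("utf-8")


def mark_safe_model(value):
    """Assumed behaviour: text stays text, anything else is coerced to bytes."""
    if isinstance(value, str):
        return SafeText(value)
    return SafeBytes(bytes(value))


# Curly quotes and an em dash, written as escapes.
content = FakeSoup("\u201cHello\u201d \u2014 world")

before = mark_safe_model(content)      # models `mark_safe(content)`
after = mark_safe_model(str(content))  # models `mark_safe(unicode(content))`
```

In this model `before` is a byte string and `after` is text, which matches the symptom described above: only the explicitly converted value is something a Unicode-only consumer can accept.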

I don't know if there's any disadvantage to adding the unicode constructor to the inline parser. It hasn't broken anything in my testing. But what if you pass Kanji or some other funky characters into it? I don't know. If you do know, please let me know!

My own belief is that blog posts should be written with plain old straight quotes, dashes (--), etc. Then, for the actual display, pass the post through SmartyPants and turn that stuff into curly quotes, emdashes, ellipses, or whatever. That way your output will look pretty and will be proper HTML, but the data stored in your database will be good old, portable, use-it-anywhere ASCII. But I know a lot of people like to write their blog posts in something like Microsoft Word, which turns straight quotes curly, and then copy/paste that into the blog. This commit should make their lives easier. (Same for folks using the WordPress importer.)
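The store-ASCII, prettify-on-display idea can be illustrated with a toy converter. This is a rough sketch only: `prettify` is a hypothetical helper, far simpler than SmartyPants and not its actual algorithm (it ignores apostrophes, nested quotes, and HTML tags).

```python
import re


def prettify(text):
    """Toy display-time converter: plain ASCII in, HTML entities out."""
    text = text.replace("...", "&#8230;")  # ellipsis
    text = text.replace("--", "&#8212;")   # em dash
    # Pair up straight double quotes into left/right curly quotes.
    text = re.sub(r'"([^"]*)"', r"&#8220;\1&#8221;", text)
    return text


stored = 'He said "wait" -- then left...'  # what lives in the database
displayed = prettify(stored)               # what gets rendered
```

The stored text never changes; only the rendered copy carries the typographic entities.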

Word.
pigmonkey committed Nov 29, 2011
1 parent c89e4c0 commit 1d5fb47
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions basic/inlines/parser.py
@@ -28,7 +28,7 @@ def inlines(value, return_list=False):
                 inline.replaceWith(render_to_string(rendered_inline['template'], rendered_inline['context']))
             else:
                 inline.replaceWith('')
-        return mark_safe(content)
+        return mark_safe(unicode(content))


 def render_inline(inline):
@@ -92,4 +92,4 @@ def render_inline(inline):
     template = ["inlines/%s_%s.html" % (app_label, model_name), "inlines/default.html"]
     rendered_inline = {'template':template, 'context':context}

-    return rendered_inline
+    return rendered_inline
