This repository has been archived by the owner on Aug 22, 2023. It is now read-only.

Unicode #24

Open
konsumer opened this issue Jun 24, 2018 · 2 comments

Comments

konsumer commented Jun 24, 2018

I am running rebrow in Docker, using the marian/rebrow:latest image.

I have an issue when saving records that contain emojis, kanji, and other extended characters. When I try to load a page that displays such a key or value, I get this:

[2018-06-24 04:20:39,039] ERROR in app: Exception on /redis:6379/0/keys/KyszTWwwcEhKdXRiZnFaaElZL2l4Z0NGNDBVPQ==/ [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/runserver.py", line 239, in key
    duration=time.time()-s)
  File "/usr/local/lib/python2.7/site-packages/flask/templating.py", line 134, in render_template
    context, ctx.app)
  File "/usr/local/lib/python2.7/site-packages/flask/templating.py", line 116, in _render
    rv = template.render(context)
  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/app/templates/key.html", line 1, in top-level template code
    {% extends "layout.html" %}
  File "/app/templates/layout.html", line 66, in top-level template code
    {% block body %}{% endblock %}
  File "/app/templates/key.html", line 86, in block "body"
    <td><code>{{ item[1] }}</code></td>
  File "/usr/local/lib/python2.7/site-packages/markupsafe/_native.py", line 22, in escape
    return Markup(text_type(s)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 17: ordinal not in range(128)

I switched to base64-encoding the keys, but it still fails on pages that have extended characters in values.
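
The last line of the traceback is the error Python raises when UTF-8 bytes are decoded with the ASCII codec, which is what Python 2 does implicitly when a byte string is converted to unicode. A minimal sketch that reproduces the same kind of error with an explicit decode; the sample bytes are made up for illustration and are not taken from the reporter's data:

```python
# Hypothetical sample: the UTF-8 encoding of "price: 5 £".
# The pound sign is encoded as the two bytes 0xc2 0xa3.
raw = b"price: 5 \xc2\xa3"

try:
    # Explicit version of the implicit conversion Python 2 performs
    # when a byte string is turned into unicode.
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The exception records the codec and the offset of the first bad byte.
print(error)

# Decoding with the encoding the data was written in succeeds.
text = raw.decode("utf-8")
print(text)
```

The offset reported in the exception is a byte position within the value, so it can help locate the first non-ASCII character in a stored record.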

earthgecko commented Jun 25, 2018

Hi @konsumer

You probably want to use something like the emoji module: check whether the key value contains an emoji and, if so, demojize it. See, for example, https://stackoverflow.com/a/51007945 and https://gist.github.com/jezdez/0185e35704dbdf3c880f

# You will need to install emoji with pip and then import it in runserver.py
import emoji

# Helper defined at module level. emoji.UNICODE_EMOJI is the lookup table in
# older emoji releases; newer releases may expose a different name.
def char_is_emoji(character):
    return character in emoji.UNICODE_EMOJI

# Then around line 219 https://github.com/marians/rebrow/blob/master/runserver.py#L219
    if t == "string":
        val = r.get(key).decode('utf-8', 'replace')
        # Let us say the val was
        # val = u'I \u2764 emoji'
        if any(char_is_emoji(c) for c in val):
            val = emoji.demojize(val)
        # Which will result in val
        # u'I :red_heart: emoji'

This is just an example, but it may work for you; at the very least it is somewhere to start.

konsumer (author) commented

I could just strip any UTF-8 characters outside the ASCII range, but they are part of my data and I want to keep the emojis. I am just reporting that rebrow breaks on those records.
