Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnicodeEncodeErrors with celery built-in console logging #427

Closed
kmike opened this issue Jul 6, 2011 · 35 comments
Closed

UnicodeEncodeErrors with celery built-in console logging #427

kmike opened this issue Jul 6, 2011 · 35 comments

Comments

@kmike
Copy link
Contributor

kmike commented Jul 6, 2011

Colored celery console logger fails to display unicode strings containing non-ascii letters (e.g. django's sql query log with non-ascii data):

UnicodeEncodeError: 'ascii' codec can't encode characters in position 91-95: ordinal not in range(128)
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/logging/__init__.py", line 776, in emit
    msg = self.format(record)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/logging/__init__.py", line 654, in format
    return fmt.format(record)
  File "/Users/kmike/envs/zina/lib/python2.6/site-packages/celery/log.py", line 56, in format
    t = logging.Formatter.format(self, record)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/logging/__init__.py", line 439, in format
    s = self._fmt % record.__dict__
  File "/Users/kmike/envs/zina/lib/python2.6/site-packages/celery/utils/term.py", line 62, in __str__
    return prefix + str(reduce(self._add, self.s)) + suffix

As a workaround I set CELERYD_HIJACK_ROOT_LOGGER = False and then add 'logging.StreamHandler' to 'celery' logger (it handles non-ascii data).

@ask
Copy link
Contributor

ask commented Jul 18, 2011

Are you saying logging.FileHandler doesn't handle unicode data? Wonder how we could fix that, and what makes it work with StreamHandler

@kmike
Copy link
Contributor Author

kmike commented Jul 18, 2011

Hi ask,

Please excuse me for not being clear: celery.log.ColorFormatter and celery.utils.term.colored wasn't able to handle unicode data, logging.FileHandler works fine. Only console output causes troubles and switching to a less fancy default StreamHandler helps in my case.

@ask ask closed this as completed in dde8868 Sep 2, 2011
@ask
Copy link
Contributor

ask commented Sep 2, 2011

Does this patch help?

@tkloc
Copy link

tkloc commented Dec 27, 2011

No, it doesn't.

DEBUG Tue, 27 Dec 2011 16:54:46 +0124 tasks 22474 140486263453440 [z3sr-01286.73] Rozpoczynam finalizację zakończonych aukcji
[2011-12-27 16:54:46,729: DEBUG/PoolWorker-2] <Unrepresentable <type 'unicode'>: UnicodeEncodeError('ascii', u'\x1b[1;34m[z3sr-01286.73] Rozpoczynam finalizacj\u0119 zako\u0144czonych aukcji\x1b[0m', 45, 46, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/home/tomek/environments/.../lib/python2.6/site-packages/celery-2.3.3-py2.6.egg/celery/log.py", line 47, in format
    record.msg = str(color(safe_str(record.msg)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0119' in position 45: ordinal not in range(128)

while my standard logger saves&displays message correctly, i'm still getting these exceptions in celery's console

@victorbstan
Copy link

Getting same error <Unrepresentable <type 'unicode'>: UnicodeDecodeError('ascii', '\x1b[1;34m(0.000)

@ask
Copy link
Contributor

ask commented Jan 13, 2012

Could you give me the source code for a task that is able to reproduce this issue?

@victorbstan
Copy link

I guess any task that outputs to the console any odd unicode characters? Say, if they are in the name of the object that is passed as a parameter?

# products.tasks.increment_view_count
@task(ignore_result=True)
def increment_view_count(product):
    p = Product.objects.get(pk=product.id)
    p.view_count = F('view_count')+1
    p.save()

@brockhaywood
Copy link

Has this issue been addressed? It's closed but seems to not be resolved in celery==3.0.11

@ask
Copy link
Contributor

ask commented Feb 13, 2013

Having print(unicode) in a task works for me on 3.0.15

@mtirsel
Copy link

mtirsel commented Feb 25, 2013

I think, it is still present. It is reproducible if you have for example a Django TextField that stores JSON string with unicode. A part of the SQL query from Postgres log:

UPDATE ...
SET
    ...,
    "invoice_name" = 'Faktúra č. 11300007',
    "app_data" = '{"address": "Lipovn\u00edk 123"}'
WHERE ...

And the same in celery log:

[2013-02-25 11:47:35,143: DEBUG/MainProcess] <Unrepresentable <type 'str'>: UnicodeDecodeError('ascii', '\x1b[1;34m(0.001) UPDATE ... SET ..., "invoice_name" = \'Fakt\xc3\xbara \xc4\x8d. 11300007\', "app_data" = \'{"address": "Lipovn\\u00edk 123"}\' WHERE ... ; args=(..., u\'Fakt\\xfara \\u010d. 11300007\', \'{"address": "Lipovn\\\\u00edk 123"}\', ...)>
Traceback (most recent call last):
  File "/home/bruce/.virtualenv/expreso/local/lib/python2.7/site-packages/celery/utils/log.py", line 90, in format
    record.msg = safe_str(str_t(color(record.msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 219: ordinal not in range(128)

You can see double escapes on the JSON string.

@buster
Copy link

buster commented Mar 13, 2013

Hi, on celery 3.0.16 i have the same problem.
When trying to log a unicode string i get:

 <Unrepresentable <type 'str'>: UnicodeDecodeError('ascii', "\x1b[1;33moption2:<type 'unicode'>Super krasser Full Name mit \xc3\xa4\xc3\xa4\xc3\xbc\xc3\x9f\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e\x1b[0m", 59, 60, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/home/buster/projects/nomadenv/local/lib/python2.7/site-packages/celery/utils/log.py", line 90, in format
    record.msg = safe_str(str_t(color(record.msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 59: ordinal not in range(128)

@ask
Copy link
Contributor

ask commented Mar 13, 2013

@buster Can you please show me an example task that reproduces this?

@ask ask reopened this Mar 13, 2013
@buster
Copy link

buster commented Mar 14, 2013

Hi,

the simplest test i have found to work:

@celery.task
def unicode_test_task():
    print(u'hiöäüß')

And in a celeryshell:

>>> from nomad.tasks import unicode_test_task
>>> unicode_test_task.delay()
<AsyncResult: a7d5a92e-1402-45cb-93c4-843bab6760b7>

And in the worker output:

[2013-03-14 09:42:24,668: INFO/MainProcess] Got task from broker: nomad.tasks.unicode_test_task[a7d5a92e-1402-45cb-93c4-843bab6760b7]
[2013-03-14 09:42:24,699: WARNING/PoolWorker-1] <Unrepresentable <type 'str'>: UnicodeDecodeError('ascii', '\x1b[1;33mhi\xc3\xb6\xc3\xa4\xc3\xbc\xc3\x9f\x1b[0m', 9, 10, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/home/buster/projects/nomadenv/local/lib/python2.7/site-packages/celery/utils/log.py", line 90, in format
    record.msg = safe_str(str_t(color(record.msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 9: ordinal not in range(128)
[2013-03-14 09:42:24,703: INFO/MainProcess] Task nomad.tasks.unicode_test_task[a7d5a92e-1402-45cb-93c4-843bab6760b7] succeeded in 0.00515413284302s: None

I did find that this works when i disable colored log output...

@ask
Copy link
Contributor

ask commented Apr 11, 2013

Thanks! Fixed

@ask ask closed this as completed Apr 11, 2013
ask added a commit that referenced this issue Apr 11, 2013
@karanlyons
Copy link

I'm actually seeing this issue with 3.0.21:

Traceback (most recent call last):
  File "/Users/karanlyons/.environments/btf/lib/python2.7/site-packages/celery/utils/log.py", line 96, in format
    record.msg = str_t(color(safe_str(msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 333: ordinal not in range(128)

I haven't looked into this that deeply, so perhaps it's an issue on my end, but maybe the bug is still there?

@ask
Copy link
Contributor

ask commented Jul 30, 2013

@karanlyons May be, this is one of those bugs that when you fix it for one person, it breaks for someone else ;)

@ask ask reopened this Jul 30, 2013
@ahollingsworth-marin
Copy link

I've been running into the same issue as @karanlyons on 3.0.21 in log.py, line 96.

@karanlyons
Copy link

@ask — Yeah, I hate these sorts of bugs (unicode is hard, guys!).

One "solution" would be to wrap that line in a try except UnicodeDecodeError block, and retry with msg = unicode(msg).encode('ascii', 'ignore').

That kinda sucks for obvious reasons, but depending on the code paths that hit these lines (is it just for writing stuff to stdout/stderr?), it might be okay.

@JoshuaChi
Copy link

I still get this kind of error with fix for #427. @karanlyons solution might works. But you will lost a lot information if your message is unicode.

@ask
Copy link
Contributor

ask commented Oct 7, 2013

There was a bug in safe_str where it used sys.getfilesystemencoding instead of sys.getdefaultencoding. This may have been the source of this problem, so could you test using 3.0.23?

You may lose information, but what can you do if the terminal does not support it :/

@JoshuaChi
Copy link

@ask I did tried 3.0.23 before. It worked.
I tried all these solutions in my terminal, but still has not got it work.

export LANG=zh_TW.UTF-8
export LANG=zh_CN.UTF-8

.bashrc

stty pass8
bind 'set convert-meta off'
bind 'set meta-flag on'
bind 'set output-meta on'

Will give more try later.

@ask
Copy link
Contributor

ask commented Oct 17, 2013

@JoshuaChi And what is the value of sys.getdefaultencoding() ? It must return non-ascii otherwise it would have already replaced the chars.

My terminal does not support unicode and the unicode_test_task provided by @buster above logs:

[2013-10-17 16:43:09,400: WARNING/Worker-1] hi????

@JoshuaChi
Copy link

print sys.getdefaultencoding()
ascii

@ask
Copy link
Contributor

ask commented Oct 19, 2013

Btw, people, when it happens are you logging to a console or to a file?
@JoshuaChi, that is strange, what is the traceback you have?

@JoshuaChi
Copy link

Here is piece of traceback:

[2013-10-20 20:24:19,845: DEBUG/MainProcess] : UnicodeDecodeError('ascii', "\x1b[1;34m(0.001) SELECT * FROM $TABLE_NAME WHERE $TABLE_NAME.`title` = '\xe8\xb1\xa1\xe5\x8e\x82\xe5\x96\x9c\xe5\x89\xa7' ; args=(u'\\u8c61\\u5382\\u559c\\u5267',)\x1b[0m", 722, 723, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/celery/utils/log.py", line 96, in format
    record.msg = str_t(color(safe_str(msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 722: ordinal not in range(128)

ask added a commit that referenced this issue Oct 23, 2013
@ask ask closed this as completed in 31b5717 Oct 23, 2013
@ask
Copy link
Contributor

ask commented Oct 23, 2013

Well, this is a mess, but I'm not sure how this can be solved in a pretty way. I used to dream of succinct and beautiful code, then reality kicked in with platform workarounds, posix and users with their special cases, so this shouldn't be news to me :)

You cannot really know the encoding until you actually write to a specific file, so now it will maintain a "default output file" that the workers set when it configures logging. If it logs to a stream it will take encoding information from that stream, if it writes to a file it takes encoding information from that file.

It seems that logging can handle unicode already though, so maybe this is not necessary anymore. If that is the case we must find out what version that was added in.

@JoshuaChi
Copy link

At least it works for me. :-)

@flyingfoxlee
Copy link
Contributor

In my case, I fix it by set export LANG=en_US.UTF-8, I already set it in .zshrc, but need to set it again, because the working environment does not respect settings in .zshrc.

@lukeaus
Copy link

lukeaus commented Oct 23, 2015

I am getting a UnicodeDecodeError from Celery

I have this set in my /etc/default/celeryd
export LANG=en_US.UTF-8

I have celery==3.1.18 and django-celery==3.1.17 installed.

However celery throwing a UnicodeDecodeError on this print statement
print '%s - Assessing %s-%s' % (user_name, foo.id, foo.name)

where user_name = foobar don\xe7a

As a side note, nginx handles unicode fine

Do I need to set:
export LANG=en_US.UTF-8
anywhere else?

@ask
Copy link
Contributor

ask commented Oct 27, 2015

@lukeaus do you have a traceback?

nginx is not written in Python 2 ;)

@lukeaus
Copy link

lukeaus commented Oct 27, 2015

@ask
Yeah - I had another issue a while back with unicode and nginx - thats all working now.

This issue just relates to celery.

Here is a stacktrace as requested.

Traceback (most recent call last):
  File "/home/webapps/.virtualenvs/crowdstaff/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 112, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/home/webapps/.virtualenvs/crowdstaff/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py", line 22, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/lib/smsverify/views.py", line 57, in verify_form
    return profile_complete_redir(request)
  File "/home/webapps/apps/crowdstaff/project/users/decorators.py", line 26, in func
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/project/workers/views/profile.py", line 445, in profile_complete_redir
    if request.user.signup_complete and profile_complete_and_user_verified(request):
  File "/home/webapps/apps/crowdstaff/project/users/decorators.py", line 26, in func
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/project/workers/views/profile.py", line 499, in profile_complete_and_user_verified
    return create_matches_for_signup_applications(request)
  File "/home/webapps/apps/crowdstaff/project/users/decorators.py", line 26, in func
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/project/workers/views/profile.py", line 523, in create_matches_for_signup_applications
    score = score_worker_for_job(worker, job, external_applicant=True)
  File "/home/webapps/apps/crowdstaff/project/jobs/process.py", line 490, in score_worker_for_job
    print '%s - Assessing %s-%s' % (worker, job.id, job.name)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

Any assistance would be greatly appreciated.
Thanks

@ask
Copy link
Contributor

ask commented Oct 28, 2015

But this traceback seems to be from a Django view? What are you running exactly when this error occurs?

@lukeaus
Copy link

lukeaus commented Oct 28, 2015

@ask so it is... wrong stacktrace. Sorry about that

I have uploaded the stacktrace as an image as Opbeat won't let me copy and paste the stacktrace as one might expect.

Opbeat also gives me these extra details:
args []
task <@task: jobs.process.process_jobs of default:0x7f2dd278d0d0 (v2 compatible)>
task_id 4b84ff8f-3ecc-4708-92b8-fa67e4a62390
kwargs {}

this is the same function as the previous incorrect stacktrace but this time is called as an @task

stacktrace

@ask
Copy link
Contributor

ask commented Oct 30, 2015

The worker redirects output to stdout to the logger, but having unicode in there works for me.
In that case the traceback would also be different (occurring somewhere in celery.log)

Your problem is in your code, and could beseveral things, job.id or job.name could be objects that are stringified with str instead of unicode on Python2, or you have disabled the redirection.

May be fixed by using safe_str:

from kombu.utils.encoding import safe_str
print '%s - Assessing %s-%s' % (safe_str(worker), safe_str(job.id), safe_str(job.name))

Or you will get a better traceback

@lukeaus
Copy link

lukeaus commented Nov 10, 2015

@ask You were right - good thinking.
I had a unicode method on the Worker model returning a string, rather than a unicode object.
Fixed. Thanks for your help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests