UnicodeEncodeErrors with celery built-in console logging #427

kmike · 2011-07-06T22:38:31Z

Colored celery console logger fails to display unicode strings containing non-ascii letters (e.g. django's sql query log with non-ascii data):

UnicodeEncodeError: 'ascii' codec can't encode characters in position 91-95: ordinal not in range(128)
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/logging/__init__.py", line 776, in emit
    msg = self.format(record)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/logging/__init__.py", line 654, in format
    return fmt.format(record)
  File "/Users/kmike/envs/zina/lib/python2.6/site-packages/celery/log.py", line 56, in format
    t = logging.Formatter.format(self, record)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/logging/__init__.py", line 439, in format
    s = self._fmt % record.__dict__
  File "/Users/kmike/envs/zina/lib/python2.6/site-packages/celery/utils/term.py", line 62, in __str__
    return prefix + str(reduce(self._add, self.s)) + suffix

As a workaround I set CELERYD_HIJACK_ROOT_LOGGER = False and then add 'logging.StreamHandler' to 'celery' logger (it handles non-ascii data).

The text was updated successfully, but these errors were encountered:

ask · 2011-07-18T11:38:16Z

Are you saying logging.FileHandler doesn't handle unicode data? Wonder how we could fix that, and what makes it work with StreamHandler

kmike · 2011-07-18T15:57:09Z

Hi ask,

Please excuse me for not being clear: celery.log.ColorFormatter and celery.utils.term.colored wasn't able to handle unicode data, logging.FileHandler works fine. Only console output causes troubles and switching to a less fancy default StreamHandler helps in my case.

ask · 2011-09-02T23:54:02Z

Does this patch help?

tkloc · 2011-12-27T15:59:54Z

No, it doesn't.

DEBUG Tue, 27 Dec 2011 16:54:46 +0124 tasks 22474 140486263453440 [z3sr-01286.73] Rozpoczynam finalizację zakończonych aukcji
[2011-12-27 16:54:46,729: DEBUG/PoolWorker-2] <Unrepresentable <type 'unicode'>: UnicodeEncodeError('ascii', u'\x1b[1;34m[z3sr-01286.73] Rozpoczynam finalizacj\u0119 zako\u0144czonych aukcji\x1b[0m', 45, 46, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/home/tomek/environments/.../lib/python2.6/site-packages/celery-2.3.3-py2.6.egg/celery/log.py", line 47, in format
    record.msg = str(color(safe_str(record.msg)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0119' in position 45: ordinal not in range(128)

while my standard logger saves&displays message correctly, i'm still getting these exceptions in celery's console

victorbstan · 2012-01-12T19:58:01Z

Getting same error <Unrepresentable <type 'unicode'>: UnicodeDecodeError('ascii', '\x1b[1;34m(0.000)

ask · 2012-01-13T12:15:47Z

Could you give me the source code for a task that is able to reproduce this issue?

victorbstan · 2012-01-13T14:54:05Z

I guess any task that outputs to the console any odd unicode characters? Say, if they are in the name of the object that is passed as a parameter?

# products.tasks.increment_view_count
@task(ignore_result=True)
def increment_view_count(product):
    p = Product.objects.get(pk=product.id)
    p.view_count = F('view_count')+1
    p.save()

brockhaywood · 2013-02-12T18:50:36Z

Has this issue been addressed? It's closed but seems to not be resolved in celery==3.0.11

ask · 2013-02-13T12:54:38Z

Having print(unicode) in a task works for me on 3.0.15

mtirsel · 2013-02-25T11:14:44Z

I think, it is still present. It is reproducible if you have for example a Django TextField that stores JSON string with unicode. A part of the SQL query from Postgres log:

UPDATE ...
SET
    ...,
    "invoice_name" = 'Faktúra č. 11300007',
    "app_data" = '{"address": "Lipovn\u00edk 123"}'
WHERE ...

And the same in celery log:

[2013-02-25 11:47:35,143: DEBUG/MainProcess] <Unrepresentable <type 'str'>: UnicodeDecodeError('ascii', '\x1b[1;34m(0.001) UPDATE ... SET ..., "invoice_name" = \'Fakt\xc3\xbara \xc4\x8d. 11300007\', "app_data" = \'{"address": "Lipovn\\u00edk 123"}\' WHERE ... ; args=(..., u\'Fakt\\xfara \\u010d. 11300007\', \'{"address": "Lipovn\\\\u00edk 123"}\', ...)>
Traceback (most recent call last):
  File "/home/bruce/.virtualenv/expreso/local/lib/python2.7/site-packages/celery/utils/log.py", line 90, in format
    record.msg = safe_str(str_t(color(record.msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 219: ordinal not in range(128)

You can see double escapes on the JSON string.

buster · 2013-03-13T13:07:37Z

Hi, on celery 3.0.16 i have the same problem.
When trying to log a unicode string i get:

 <Unrepresentable <type 'str'>: UnicodeDecodeError('ascii', "\x1b[1;33moption2:<type 'unicode'>Super krasser Full Name mit \xc3\xa4\xc3\xa4\xc3\xbc\xc3\x9f\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e\x1b[0m", 59, 60, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/home/buster/projects/nomadenv/local/lib/python2.7/site-packages/celery/utils/log.py", line 90, in format
    record.msg = safe_str(str_t(color(record.msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 59: ordinal not in range(128)

ask · 2013-03-13T14:34:39Z

@buster Can you please show me an example task that reproduces this?

buster · 2013-03-14T08:58:07Z

Hi,

the simplest test i have found to work:

@celery.task
def unicode_test_task():
    print(u'hiöäüß')

And in a celeryshell:

>>> from nomad.tasks import unicode_test_task
>>> unicode_test_task.delay()
<AsyncResult: a7d5a92e-1402-45cb-93c4-843bab6760b7>

And in the worker output:

[2013-03-14 09:42:24,668: INFO/MainProcess] Got task from broker: nomad.tasks.unicode_test_task[a7d5a92e-1402-45cb-93c4-843bab6760b7]
[2013-03-14 09:42:24,699: WARNING/PoolWorker-1] <Unrepresentable <type 'str'>: UnicodeDecodeError('ascii', '\x1b[1;33mhi\xc3\xb6\xc3\xa4\xc3\xbc\xc3\x9f\x1b[0m', 9, 10, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/home/buster/projects/nomadenv/local/lib/python2.7/site-packages/celery/utils/log.py", line 90, in format
    record.msg = safe_str(str_t(color(record.msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 9: ordinal not in range(128)
[2013-03-14 09:42:24,703: INFO/MainProcess] Task nomad.tasks.unicode_test_task[a7d5a92e-1402-45cb-93c4-843bab6760b7] succeeded in 0.00515413284302s: None

I did find that this works when i disable colored log output...

ask · 2013-04-11T11:33:43Z

Thanks! Fixed

karanlyons · 2013-07-30T11:47:13Z

I'm actually seeing this issue with 3.0.21:

Traceback (most recent call last):
  File "/Users/karanlyons/.environments/btf/lib/python2.7/site-packages/celery/utils/log.py", line 96, in format
    record.msg = str_t(color(safe_str(msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 333: ordinal not in range(128)

I haven't looked into this that deeply, so perhaps it's an issue on my end, but maybe the bug is still there?

ask · 2013-07-30T12:11:38Z

@karanlyons May be, this is one of those bugs that when you fix it for one person, it breaks for someone else ;)

ahollingsworth-marin · 2013-07-30T16:27:07Z

I've been running into the same issue as @karanlyons on 3.0.21 in log.py, line 96.

karanlyons · 2013-07-30T17:46:35Z

@ask — Yeah, I hate these sorts of bugs (unicode is hard, guys!).

One "solution" would be to wrap that line in a try except UnicodeDecodeError block, and retry with msg = unicode(msg).encode('ascii', 'ignore').

That kinda sucks for obvious reasons, but depending on the code paths that hit these lines (is it just for writing stuff to stdout/stderr?), it might be okay.

JoshuaChi · 2013-09-02T14:48:39Z

I still get this kind of error with fix for #427. @karanlyons solution might works. But you will lost a lot information if your message is unicode.

ask · 2013-10-07T11:06:35Z

There was a bug in safe_str where it used sys.getfilesystemencoding instead of sys.getdefaultencoding. This may have been the source of this problem, so could you test using 3.0.23?

You may lose information, but what can you do if the terminal does not support it :/

JoshuaChi · 2013-10-09T02:19:52Z

@ask I did tried 3.0.23 before. It worked.
I tried all these solutions in my terminal, but still has not got it work.

export LANG=zh_TW.UTF-8
export LANG=zh_CN.UTF-8

.bashrc

stty pass8
bind 'set convert-meta off'
bind 'set meta-flag on'
bind 'set output-meta on'

Will give more try later.

ask · 2013-10-17T15:46:08Z

@JoshuaChi And what is the value of sys.getdefaultencoding() ? It must return non-ascii otherwise it would have already replaced the chars.

My terminal does not support unicode and the unicode_test_task provided by @buster above logs:

[2013-10-17 16:43:09,400: WARNING/Worker-1] hi????

JoshuaChi · 2013-10-19T07:07:56Z

print sys.getdefaultencoding()
ascii

ask · 2013-10-19T12:55:02Z

Btw, people, when it happens are you logging to a console or to a file?
@JoshuaChi, that is strange, what is the traceback you have?

JoshuaChi · 2013-10-20T12:26:47Z

Here is piece of traceback:

[2013-10-20 20:24:19,845: DEBUG/MainProcess] : UnicodeDecodeError('ascii', "\x1b[1;34m(0.001) SELECT * FROM $TABLE_NAME WHERE $TABLE_NAME.`title` = '\xe8\xb1\xa1\xe5\x8e\x82\xe5\x96\x9c\xe5\x89\xa7' ; args=(u'\\u8c61\\u5382\\u559c\\u5267',)\x1b[0m", 722, 723, 'ordinal not in range(128)')>
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/celery/utils/log.py", line 96, in format
    record.msg = str_t(color(safe_str(msg)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 722: ordinal not in range(128)

ask · 2013-10-23T17:34:13Z

Well, this is a mess, but I'm not sure how this can be solved in a pretty way. I used to dream of succinct and beautiful code, then reality kicked in with platform workarounds, posix and users with their special cases, so this shouldn't be news to me :)

You cannot really know the encoding until you actually write to a specific file, so now it will maintain a "default output file" that the workers set when it configures logging. If it logs to a stream it will take encoding information from that stream, if it writes to a file it takes encoding information from that file.

It seems that logging can handle unicode already though, so maybe this is not necessary anymore. If that is the case we must find out what version that was added in.

JoshuaChi · 2013-10-29T05:47:05Z

At least it works for me. :-)

flyingfoxlee · 2014-07-07T00:29:38Z

In my case, I fix it by set export LANG=en_US.UTF-8, I already set it in .zshrc, but need to set it again, because the working environment does not respect settings in .zshrc.

lukeaus · 2015-10-23T07:22:18Z

I am getting a UnicodeDecodeError from Celery

I have this set in my /etc/default/celeryd
export LANG=en_US.UTF-8

I have celery==3.1.18 and django-celery==3.1.17 installed.

However celery throwing a UnicodeDecodeError on this print statement
print '%s - Assessing %s-%s' % (user_name, foo.id, foo.name)

where user_name = foobar don\xe7a

As a side note, nginx handles unicode fine

Do I need to set:
export LANG=en_US.UTF-8
anywhere else?

ask · 2015-10-27T18:09:29Z

@lukeaus do you have a traceback?

nginx is not written in Python 2 ;)

lukeaus · 2015-10-27T19:57:36Z

@ask
Yeah - I had another issue a while back with unicode and nginx - thats all working now.

This issue just relates to celery.

Here is a stacktrace as requested.

Traceback (most recent call last):
  File "/home/webapps/.virtualenvs/crowdstaff/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 112, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/home/webapps/.virtualenvs/crowdstaff/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py", line 22, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/lib/smsverify/views.py", line 57, in verify_form
    return profile_complete_redir(request)
  File "/home/webapps/apps/crowdstaff/project/users/decorators.py", line 26, in func
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/project/workers/views/profile.py", line 445, in profile_complete_redir
    if request.user.signup_complete and profile_complete_and_user_verified(request):
  File "/home/webapps/apps/crowdstaff/project/users/decorators.py", line 26, in func
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/project/workers/views/profile.py", line 499, in profile_complete_and_user_verified
    return create_matches_for_signup_applications(request)
  File "/home/webapps/apps/crowdstaff/project/users/decorators.py", line 26, in func
    return view_func(request, *args, **kwargs)
  File "/home/webapps/apps/crowdstaff/project/workers/views/profile.py", line 523, in create_matches_for_signup_applications
    score = score_worker_for_job(worker, job, external_applicant=True)
  File "/home/webapps/apps/crowdstaff/project/jobs/process.py", line 490, in score_worker_for_job
    print '%s - Assessing %s-%s' % (worker, job.id, job.name)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

Any assistance would be greatly appreciated.
Thanks

ask · 2015-10-28T19:18:17Z

But this traceback seems to be from a Django view? What are you running exactly when this error occurs?

lukeaus · 2015-10-28T21:11:12Z

@ask so it is... wrong stacktrace. Sorry about that

I have uploaded the stacktrace as an image as Opbeat won't let me copy and paste the stacktrace as one might expect.

Opbeat also gives me these extra details:
args []
task <@task: jobs.process.process_jobs of default:0x7f2dd278d0d0 (v2 compatible)>
task_id 4b84ff8f-3ecc-4708-92b8-fa67e4a62390
kwargs {}

this is the same function as the previous incorrect stacktrace but this time is called as an @task

ask · 2015-10-30T19:37:11Z

The worker redirects output to stdout to the logger, but having unicode in there works for me.
In that case the traceback would also be different (occurring somewhere in celery.log)

Your problem is in your code, and could beseveral things, job.id or job.name could be objects that are stringified with str instead of unicode on Python2, or you have disabled the redirection.

May be fixed by using safe_str:

from kombu.utils.encoding import safe_str
print '%s - Assessing %s-%s' % (safe_str(worker), safe_str(job.id), safe_str(job.name))

Or you will get a better traceback

lukeaus · 2015-11-10T03:39:55Z

@ask You were right - good thinking.
I had a unicode method on the Worker model returning a string, rather than a unicode object.
Fixed. Thanks for your help

ask closed this as completed in dde8868 Sep 2, 2011

ask reopened this Mar 13, 2013

ask added a commit that referenced this issue Apr 11, 2013

Fixes unicode output in colored logs. Closes #427

fcc432a

ask closed this as completed Apr 11, 2013

ask added a commit that referenced this issue Apr 11, 2013

Real fix for #427

9e9674f

ask mentioned this issue May 14, 2013

UnicodeDecodeError in celery/log.py #1343

Closed

ask reopened this Jul 30, 2013

ask added a commit that referenced this issue Oct 23, 2013

Workaround for Unicode errors (Issue #427)

7b69164

ask closed this as completed in 31b5717 Oct 23, 2013

UnicodeEncodeErrors with celery built-in console logging #427

UnicodeEncodeErrors with celery built-in console logging #427

Comments

kmike commented Jul 6, 2011

ask commented Jul 18, 2011

kmike commented Jul 18, 2011

ask commented Sep 2, 2011

tkloc commented Dec 27, 2011

victorbstan commented Jan 12, 2012

ask commented Jan 13, 2012

victorbstan commented Jan 13, 2012

brockhaywood commented Feb 12, 2013

ask commented Feb 13, 2013

mtirsel commented Feb 25, 2013

buster commented Mar 13, 2013

ask commented Mar 13, 2013

buster commented Mar 14, 2013

ask commented Apr 11, 2013

karanlyons commented Jul 30, 2013

ask commented Jul 30, 2013

ahollingsworth-marin commented Jul 30, 2013

karanlyons commented Jul 30, 2013

JoshuaChi commented Sep 2, 2013

ask commented Oct 7, 2013

JoshuaChi commented Oct 9, 2013

ask commented Oct 17, 2013

JoshuaChi commented Oct 19, 2013

ask commented Oct 19, 2013

JoshuaChi commented Oct 20, 2013

ask commented Oct 23, 2013

JoshuaChi commented Oct 29, 2013

flyingfoxlee commented Jul 7, 2014

lukeaus commented Oct 23, 2015

ask commented Oct 27, 2015

lukeaus commented Oct 27, 2015

ask commented Oct 28, 2015

lukeaus commented Oct 28, 2015

ask commented Oct 30, 2015

lukeaus commented Nov 10, 2015