UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte #286

Closed
HelloMukama opened this issue May 24, 2021 · 9 comments

@HelloMukama

I am trying to push my Django project to Heroku, but each time I run python manage.py collectstatic I get this error whenever I set STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'.
Yet when I run python manage.py collectstatic with the default STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.StaticFilesStorage', it runs without an issue.
I originally thought one or more of my static files had issues and went through every one of the tens of files, but still got the same error.
Then I decided to revert to the default STATICFILES_STORAGE setting and, to my surprise, python manage.py collectstatic worked.

I would like someone to help me troubleshoot this one, because the issue seems to be related to WhiteNoise.

I am using Python 3.9.4 and Django 3.2.

Below is the error I am getting.

Traceback (most recent call last):
  File "/home/nmj/PROJECTS/abc/blueMust/mysite/src/manage.py", line 22, in <module>
    main()
  File "/home/nmj/PROJECTS/abc/blueMust/mysite/src/manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 187, in handle
    collected = self.collect()
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 128, in collect
    for original_path, processed_path, processed in processor:
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/whitenoise/storage.py", line 148, in post_process_with_compression
    for name, hashed_name, processed in files:
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/whitenoise/storage.py", line 88, in post_process
    for name, hashed_name, processed in files:
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/contrib/staticfiles/storage.py", line 406, in post_process
    yield from super().post_process(*args, **kwargs)
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/contrib/staticfiles/storage.py", line 231, in post_process
    for name, hashed_name, processed, _ in self._post_process(paths, adjustable_paths, hashed_files):
  File "/home/nmj/PROJECTS/abc/SubmissionMgtSyst/mysite/venv/lib/python3.9/site-packages/django/contrib/staticfiles/storage.py", line 288, in _post_process
    content = original_file.read().decode("utf-8")   # original line
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
@tysonclugg

The 0xff byte at the very start of your file suggests it may be UTF-16 encoded with a byte-order-mark. @HelloMukama, can you confirm if this is correct?
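
One way to check this is to look at the first few bytes of the suspect file. Below is a minimal sketch using only the standard library; `detect_bom` is a hypothetical helper name (not part of Django or WhiteNoise), and the path is whatever file you suspect:

```python
import codecs

# Known byte-order marks. The UTF-32 entries come first so that UTF-32 LE
# (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(path):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None
```

A file that begins with the bytes 0xff 0xfe would normally come back as `utf-16-le`. A result of `None` only means there is no BOM; the file could still be in some other non-UTF-8 encoding.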

@adamchainz
Collaborator

Tyson seems to be right. You probably want to fix that file.

Also this error is inside Django - you should see the same problem if you use ManifestStaticFilesStorage, which Whitenoise extends from.

@oliwarner

For the little this is worth, I've just started getting these after upgrading a working site to Django 4.0. Was on Whitenoise, fell back to ManifestStaticFilesStorage but still get the same problem. I edited the Django script to get some filename feedback and it's choking on (perfectly valid) unicode characters inside some tinymce scripts that have worked forever.

Still not sure what to do next but.. yeah..

@evansd
Owner

evansd commented Mar 2, 2022

I assume this is because Django now tries to rewrite source map references inside JS files, so it now has to read JS files which it never did before:
https://docs.djangoproject.com/en/4.0/releases/4.0/#django-contrib-staticfiles

@adamchainz
Collaborator

That sounds like a legitimate explanation!

> it's choking on (perfectly valid) unicode characters

Are you sure they're valid? Django is just doing a plain read-and-decode. If you run that yourself for the same file, it should raise the same error.
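
To find out which files trip this up without editing Django, a sketch along these lines could be run against the static directory. It does the same plain read-and-decode rather than calling Django itself; `find_non_utf8` and the default suffix list are assumptions made for illustration:

```python
from pathlib import Path


def find_non_utf8(root, suffixes=(".js", ".css")):
    """Yield (path, error) for files under root that fail a plain UTF-8 decode."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the byte offset of the first undecodable byte
            yield path, exc
```

Looping over `find_non_utf8("path/to/static")` and printing each pair gives the filename together with the offending byte and its position, which is the same information as in the traceback.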

@oliwarner

$ python -c "open('...../static/tiny_mce/plugins/spellchecker/editor_plugin_src.js').read()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 4560: invalid start byte

Huh, okay. So this is a tiny_mce problem. I'll add upgrading a long- and deeply- embedded library to my blossoming todo list D:

Thanks for the pointers.

@evansd
Owner

evansd commented Mar 2, 2022

If you can't face doing the upgrade (know the feeling!) then an alternative would be to work out what encoding it's in (e.g. using chardet), and convert it to utf-8 so Django can read it correctly.
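
As a rough sketch of that conversion, assuming the source encoding has already been identified (for example with chardet): `reencode_to_utf8` is a hypothetical helper, and it overwrites the file in place, so run it on a copy or under version control.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding):
    """Rewrite the file at path as UTF-8, decoding it from source_encoding.

    Returns True if the file was rewritten, False if it already decoded
    cleanly as UTF-8 and was left untouched.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already readable by a plain UTF-8 decode
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True
```

Skipping files that already decode as UTF-8 is a deliberate choice: re-encoding a valid UTF-8 file as if it were a single-byte encoding would garble any non-ASCII characters in it.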

@adamchainz
Collaborator

You may also be able to just download the original version again - it may have switched encoding after you downloaded it.

@oliwarner

TinyMCE seems to completely reinvent itself every three years. They're having another run at making some money. I could build it myself but honestly given how much they've changed all the packaging, that looks like it'll need re-integrating into the project. I'd sooner use something else that isn't quite so aggressive with its users. There are lots of RTE options these days.

@evansd your idea worked :) chardet found ISO-8859-1 so I banged it through:

find ..../static/tiny_mce/ -type f -iname '*.js' -exec sh -c "iconv -f ISO-8859-1 -t UTF-8 {} | sponge {}" \;

Altered two files, all working. What a pain in the bum over nothing. Thanks again for all your help. You've both made my Wednesday immeasurably better.
