Specify charset=UTF-8 when serving non-base64 files #2402

Merged
merged 3 commits into from Apr 26, 2017

Contributor

@gnestor gnestor commented Apr 12, 2017

Closes #2397

Member

@takluyver takluyver commented Apr 12, 2017

This handles the fallback case where we don't find a mimetype for a file. Do we also want to do the same if guess_type returns text/plain as the mimetype of e.g. a .txt file?
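
To make the two cases above concrete, here is a minimal sketch of how a file handler might choose the Content-Type header. The helper name `content_type_for` and the `is_base64` flag are hypothetical and used only for illustration; this is not the code from the pull request itself.

```python
import mimetypes


def content_type_for(name, is_base64=False):
    """Hypothetical helper: choose a Content-Type value for a served file."""
    # guess_type returns (mimetype, content_encoding); only the type is used here.
    mime, _ = mimetypes.guess_type(name)
    if mime == "text/plain":
        # A known text file, e.g. a .txt file: state the charset explicitly.
        return "text/plain; charset=UTF-8"
    if mime is not None:
        # Any other recognised type is passed through unchanged.
        return mime
    # Fallback case: no mimetype was found for the file.
    if is_base64:
        return "application/octet-stream"
    return "text/plain; charset=UTF-8"


print(content_type_for("notes.txt"))
print(content_type_for("picture.png"))
```

With this shape, both a `.txt` file and a non-base64 file with an unrecognised extension end up with an explicit charset.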

rgbkrk approved these changes Apr 12, 2017

@bertjwregeer bertjwregeer commented Apr 12, 2017

@takluyver does mimetypes return the charset for a file? I wouldn't think it would, but for text/plain, sending charset=utf-8 wouldn't be a bad guess in most cases.

When you have a .txt file and click on it in the browser, an editor opens that correctly shows the contents as UTF-8. At the moment the problem only appears when you download the raw file, for example after renaming a file containing UTF-8 characters to somefile.log, because clicking that file does not open the editor.
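
As a rough illustration of why the missing charset matters for raw downloads: if a client falls back to a legacy single-byte encoding (Latin-1 is assumed here; the actual fallback depends on the browser and locale), UTF-8 bytes are shown as mojibake. The sample string is made up for the example.

```python
# Sample text with non-ASCII characters, as it would be stored in a UTF-8 file.
text = "naïve café"
raw = text.encode("utf-8")

# Decoded with the declared charset, the text round-trips intact.
as_utf8 = raw.decode("utf-8")

# Decoded with an assumed Latin-1 fallback, each multi-byte character
# turns into two unrelated characters.
as_latin1 = raw.decode("latin-1")

print(as_utf8)
print(as_latin1)
```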

Contributor Author

@gnestor gnestor commented Apr 12, 2017

@takluyver It sounds like a good idea to me. Does that look right?

Member

@takluyver takluyver commented Apr 13, 2017

Thanks, that looks good. There's a test failure, though, because we have a test that expects text/plain:

FAIL: make sure ContentsManager returns right files (ipynb, bin, txt).
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\projects\notebook\notebook\tests\test_files.py", line 98, in test_contents_manager
    self.assertEqual(r.headers['content-type'], 'text/plain')
AssertionError: 'text/plain; charset=UTF-8' != 'text/plain'
- text/plain; charset=UTF-8
+ text/plain

I don't think mimetypes, or anything else in the standard library, does encoding detection.
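
A short sketch of what `mimetypes.guess_type` does return may help here. The second element of its result is the content encoding used to wrap the file (such as gzip), not a character set. The file names below are arbitrary examples.

```python
import mimetypes

# For a plain text file there is no content encoding, and no charset either.
txt_type, txt_encoding = mimetypes.guess_type("notes.txt")
print(txt_type, txt_encoding)

# For a compressed archive the "encoding" is the compression wrapper.
tar_type, tar_encoding = mimetypes.guess_type("archive.tar.gz")
print(tar_type, tar_encoding)
```

Since nothing here inspects the file's bytes, the charset sent by the server remains a guess, which is why UTF-8 is simply assumed for text files.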

Contributor Author

@gnestor gnestor commented Apr 20, 2017

Ok, tests are passing! Let's wait to see if this resolves #2397 before merging...


@bertjwregeer bertjwregeer commented Apr 26, 2017

This resolves the issue for me and log files are now downloaded correctly.

@rgbkrk rgbkrk merged commit 469b1c8 into jupyter:master Apr 26, 2017
4 checks passed
@rgbkrk rgbkrk deleted the utf-8 branch Apr 26, 2017
@takluyver takluyver added this to the 5.1 milestone Jul 18, 2017
@gnestor gnestor mentioned this pull request Aug 3, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 6, 2021
4 participants