Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asciifying http header for csv download; fixes #3952 #3975

Merged

Conversation

rumbin
Copy link
Contributor

@rumbin rumbin commented Dec 1, 2017

I think this should be all that is needed to fix #3952.

This way, the rather pathologic tab label →ʬéıρδ Ñämë← will get converted to WWeird_Name, which is pretty much what I would expect it to be in non-unicode representation.

Copy link
Contributor

@xrmx xrmx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a test would be nice :)

@@ -2288,7 +2289,7 @@ def csv(self, client_id):
csv = df.to_csv(index=False, **config.get('CSV_EXPORT'))
response = Response(csv, mimetype='text/csv')
response.headers['Content-Disposition'] = (
'attachment; filename={}.csv'.format(query.name))
'attachment; filename={}.csv'.format(unidecode(query.name)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is enough?

query.name.encode('ascii', 'ignore')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a German there is a big difference, whether the Umlauts are entirely dropped or just replaced by their closest ascii counterpart.

I do understand that it may seem inappropriate to introduce yet another dependency for such a minor change, but I feel like this is affecting the user experience, especially of the less technically versed users.

You may safely skip the following pleadings…


There are languages, where this is even more relevant. To name an extreme example, in Turkish there exists this nice word düsündürttürücülügümüzün. Personally I think that dusundurtturuculugumuzun may still be readable, but dsndrttrclgmzn is not.

Also think of the Finnish people: A SQLLab tab on the unemployment in the town of Järvenpää työttömyys_Järvenpää could either be tyttmyys_Jrvenp or tyottomyys_Jarvenpaa.

And finally, also our Frensh speaking friends may benefit from not dropping the letters: élevé becomes lev or eleve.


@rumbin
Copy link
Contributor Author

rumbin commented Dec 2, 2017

@xrmx: Judging from your smiley, I assume that you expect me to still not having the slightest idea of how to write a test for the csv function that this PR is all about. And you are right.

If somebody (you?) could give me a hand, especially for getting #3705 through, I would be really glad...

@xrmx
Copy link
Contributor

xrmx commented Dec 2, 2017

The smile was me begging for tests. Anyway look at tests.core_tests.CoreTests.test_csv_endpoint copy tht test to something like test_csv_endpoint_works_returns_ascii_headers and test that you can decode('ascii') the header.

@mistercrunch
Copy link
Member

Unidecode will be useful in other contexts, I'm ok with the new dep.

@mistercrunch mistercrunch merged commit e98a1c3 into apache:master Dec 5, 2017
michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018
* asciifying http header for csv download; fixes apache#3952

* fixed order of imports and added unidecode to requirements in setup.py
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
* asciifying http header for csv download; fixes apache#3952

* fixed order of imports and added unidecode to requirements in setup.py
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SQLLab, CSV export] UnicodeEncodeError when tab/filename contains non-ascii characters
3 participants