Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exports need to include a BOM #430

Closed
marksweb opened this issue Jan 13, 2021 · 9 comments · Fixed by #441
Closed

Exports need to include a BOM #430

marksweb opened this issue Jan 13, 2021 · 9 comments · Fixed by #441

Comments

@marksweb
Copy link
Collaborator

Currently exporting a CSV and opening it in Excel ruins any unicode characters.

For example, Japanese characters display as 裕太

Exporter needs to do something like this to write the BOM to the beginning of the file;

    contents = StringIO()
    contents.write(codecs.BOM_UTF8.decode('utf-8'))
    writer = csv.writer(contents)
@bdista
Copy link
Contributor

bdista commented May 12, 2021

I'd like to work on that if that's ok?

@marksweb
Copy link
Collaborator Author

Hi @bdista

It would be great if you're able to. I've not had the time yet so would love a PR to fix this!

@bdista
Copy link
Contributor

bdista commented May 12, 2021

Ok great, I'll prepare a PR, thanks!

@bdista
Copy link
Contributor

bdista commented May 13, 2021

Hi @marksweb,

While testing this issue locally I noticed that the paths to the requirement files in test_project/start.sh are not in line with the actual requirements files.

Should I change that in the PR I'll make for the BOM or do you prefer a dedicated issue and PR for this ? (sorry newbie here)

Thanks!

@marksweb
Copy link
Collaborator Author

Hi @bdista, happy for you to include that fix.

I think I've broken those paths as I migrated from travis CI to github actions recently!

bdista added a commit to bdista/django-sql-explorer that referenced this issue May 13, 2021
@bdista
Copy link
Contributor

bdista commented May 14, 2021

Hi @marksweb,

I've created a draft PR for this issue.
Not sure how users use this, so I'm wondering if this can be a breaking change for some if these exported csv get then processed by other tools that don't play nice with an UTF8 BOM.
If that's a problem, perhaps a setting in app_settings.py could define whether a BOM is to be included or not.
Happy to try out other approaches if needed!

Thanks

@marksweb
Copy link
Collaborator Author

Hi @bdista

At the moment unicode characters don't export properly and nobody else has reported it, so I'm not concerned about this feature having heavy use and prefer to just get bugs fixed.

If this change lead to people raising issues because of the BOM, then we can consider an enhancement with settings I think.

@legios89
Copy link

Actually, we also stumbled into this problem in our place.
What we did is the following

We improved the built in exporter a little bit

from explorer.exporters import CSVExporter
from io import BytesIO


class CSVExporterBOM(CSVExporter):
    def _get_output(self, res, **kwargs):
        csv_data = super(CSVExporterBOM, self)._get_output(res, **kwargs)
        csv_data_io = BytesIO()
        csv_data_io.write(b'\xef\xbb\xbf')
        csv_data_io.write(csv_data.getvalue().encode('utf-8'))
        return csv_data_io

than we set this as the CSV exporter in the settings

EXPLORER_DATA_EXPORTERS = [
    ('csv', 'core.exporters.CSVExporterBOM'),
    ('excel', 'explorer.exporters.ExcelExporter'),
    ('json', 'explorer.exporters.JSONExporter')
]

So this is an existing problem I think, it was just easy to fix it like this.

@bdista
Copy link
Contributor

bdista commented May 14, 2021

Ok thanks for the feedback! I'll leave the PR as is then.

marksweb pushed a commit that referenced this issue May 14, 2021
Co-authored-by: bdista <bdista@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants