Add utf-8-sig when reading csv for import or do something else to remove BOM character #22151

Closed
tinodj opened this issue Aug 22, 2023 · 2 comments · Fixed by #26183

Comments

@tinodj

tinodj commented Aug 22, 2023

When importing a CSV file that starts with a BOM (the default when saving from Excel on macOS), the first column name is read together with the BOM characters. It is therefore never recognized and has to be mapped manually. A workaround is to add an empty column at the beginning, which will be ignored.
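To illustrate the symptom, here is a minimal sketch that uses only Python's standard library, not Frappe's import code. The sample bytes and column names are made up for illustration.

```python
import csv
import io

# Hypothetical CSV content saved as UTF-8 with a leading BOM (EF BB BF).
raw = b"\xef\xbb\xbfname,email\r\nAlice,alice@example.com\r\n"

# Decoding as plain UTF-8 keeps the BOM as the character U+FEFF.
text = raw.decode("utf-8")
header = next(csv.reader(io.StringIO(text)))

# The first column name now carries an invisible prefix,
# so a comparison against the expected name fails.
first_column = header[0]
matches_expected = first_column == "name"
```

The second column is unaffected, which is why only the first column needs manual mapping.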

The proposed solution is to use the utf-8-sig encoding on this line:

https://github.com/frappe/frappe/blob/8f7a4f6697bd732a712cf582c2f321dc67c73629/frappe/utils/csvutils.py#L42C3-L42C61

I am open to other solutions as well, such as removing the BOM characters at the beginning of the file if found.
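Both options can be sketched with the standard library. The function names below are hypothetical helpers for illustration and are not part of Frappe; the sketch assumes the content is still available as raw bytes.

```python
import codecs


def decode_utf8_sig(raw: bytes) -> str:
    # "utf-8-sig" drops a leading BOM if present and otherwise
    # behaves like plain UTF-8.
    return raw.decode("utf-8-sig")


def strip_bom(raw: bytes) -> bytes:
    # Alternative: remove the BOM bytes explicitly before any decoding.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# Made-up sample: a header row preceded by a UTF-8 BOM.
sample = b"\xef\xbb\xbfname,email\r\n"
```

Either approach leaves files without a BOM unchanged, so it should be safe to apply to every CSV file.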

@ankush
Member

ankush commented Aug 24, 2023

@tinodj adding utf-8-sig makes sense on the surface. Can you send a PR?

@vmatt
Contributor

vmatt commented Apr 27, 2024

Hey @tinodj, I included the fix for this in my pull request #26183.

One note on your original pull request: it's not enough to update only csvutils.py. CSV files are plain-text files, so they are first decoded in file.py, and at that point they have already been decoded incorrectly.
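The point can be illustrated with a small sketch. It does not reproduce the code in file.py or csvutils.py; `early_decode` is a hypothetical stand-in for an earlier step that decodes with plain UTF-8, and `strip_bom_from_text` is an assumed helper showing what a later step would have to do.

```python
def early_decode(raw: bytes) -> str:
    # Hypothetical earlier step: plain UTF-8 keeps the BOM as U+FEFF.
    return raw.decode("utf-8")


def strip_bom_from_text(text: str) -> str:
    # Once the content is a str, the BOM has to be removed as a character.
    if text.startswith("\ufeff"):
        return text[1:]
    return text


# Made-up sample: a header row preceded by a UTF-8 BOM.
sample = b"\xef\xbb\xbfname,email\r\n"
decoded = early_decode(sample)
cleaned = strip_bom_from_text(decoded)
```

Under these assumptions, the fix has to be applied either where the bytes are first decoded or as a character-level cleanup afterwards.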
