csv import error + workaround #46

Open
T3chbug opened this issue Aug 31, 2023 · 2 comments

Comments

@T3chbug

T3chbug commented Aug 31, 2023

Problem

When I tried to import a measurements .csv file, I got the following error from ~\mambaforge\envs\devbio-napari-env\lib\site-packages\pandas\_libs\parsers.pyx:

ParserError: Error tokenizing data. C error: Expected 20 fields in line 3, saw 23

Cause

By default, pandas cannot read .csv files that use a semicolon as the separator. Calling pd.read_csv() on the file directly raises the exact same error; the import succeeds when given the additional parameter sep=";" or sep=None (auto-detect). Note that semicolon-separated .csv files usually use a comma as the decimal separator, which requires decimal="," (no auto-detection available); otherwise decimal numbers will be interpreted as strings.

→ So it's not a problem of devbio-napari, but of pandas and the different .csv and number formats.
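To illustrate the parameters mentioned above, here is a minimal sketch using plain pandas, outside of napari. The sample data and column names are made up for illustration and are not taken from the file in this issue.

```python
import io

import pandas as pd

# Hypothetical sample in the format described above:
# semicolon as field separator, comma as decimal separator.
csv_text = "label;area;custom\n1;10,5;0,25\n2;20,75;1,5\n"

# Explicit separator and decimal mark: numeric columns are parsed as floats.
df = pd.read_csv(io.StringIO(csv_text), sep=";", decimal=",")

# sep=None lets the Python engine detect the separator;
# the decimal mark still has to be given explicitly.
df_auto = pd.read_csv(io.StringIO(csv_text), sep=None, engine="python", decimal=",")

# Without decimal=",", values such as "10,5" are kept as strings.
df_strings = pd.read_csv(io.StringIO(csv_text), sep=";")

print(df.dtypes)
print(df_strings.dtypes)
```

Comparing the two printed dtype listings shows whether the decimal columns were read as numbers or as text.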

Workaround

After replacing the separators to match the pandas defaults (comma as field separator, dot as decimal separator), importing the .csv file into napari worked.
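The replacement can also be done with pandas instead of by hand. The following is only a sketch: the helper name convert_to_pandas_default and the file names are hypothetical, and it assumes the input really uses semicolons and decimal commas throughout.

```python
import tempfile
from pathlib import Path

import pandas as pd


def convert_to_pandas_default(src, dst):
    """Rewrite a semicolon / decimal-comma CSV as comma / decimal-dot (hypothetical helper)."""
    df = pd.read_csv(src, sep=";", decimal=",")
    # to_csv writes comma-separated values with decimal dots by default.
    df.to_csv(dst, index=False)
    return df


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names and made-up sample data for the demonstration.
    src = Path(tmp) / "measurements_excel.csv"
    dst = Path(tmp) / "measurements_fixed.csv"
    src.write_text("label;area;custom\n1;10,5;0,25\n2;20,75;1,5\n", encoding="utf-8")

    convert_to_pandas_default(src, dst)

    # The converted file can now be read with the pandas defaults.
    converted_text = dst.read_text(encoding="utf-8")
    reloaded = pd.read_csv(dst)

print(reloaded)
```

The converted file should then be readable with the default settings of pd.read_csv().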

Also see

@haesleinhuepf
Owner

Hi @T3chbug,

can you share an example CSV file where this happened? Also how was it created?

Thanks!

Best,
Robert

@T3chbug
Author

T3chbug commented Sep 1, 2023

Hi @haesleinhuepf,
the file originates from a regionprops export to .csv and was then edited in Excel (additional column "custom"). The original, unchanged .csv could be imported, while the Excel-edited file caused the error.

Excel uses the system's standard number format (German in my case), which led to a semicolon-separated .csv file with decimal-comma numbers, which pandas cannot handle by default. After changing the system standard or the Excel setting to comma-separated .csv with decimal-dot numbers, the import into napari worked.

Excel-edited file causing the import error
