Error when reading file with non-utf8 bytes in verbose mode #628

Closed
st-pasha opened this issue Dec 4, 2017 · 0 comments · Fixed by #629

st-pasha commented Dec 4, 2017

>>> import datatable as dt
>>> src = b"A,\x80\n2,3\n"
>>> dt.fread(src, verbose=True)
  Character 3 in the input is '\n', treating input as raw text
[1] Check arguments
  Using 8 threads (omp_get_max_threads()=8, nth=0)
  NAstrings = ["NA"]
  None of the NAstrings look like numbers.
  showProgress = 1
[3] Detect and skip BOM
[4] Detect end-of-line character(s)
  Detected eol as \n only.
[6] Skipping initial rows if needed
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 36: invalid start byte

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/pasha/github/datatable2/datatable/fread.py", line 850, in debug
    print(_log_color(message), flush=True)
  File "/Users/pasha/py36/lib/python3.6/site-packages/blessed/formatters.py", line 239, in __call__
    for idx, ucs_part in enumerate(args):
SystemError: <class 'enumerate'> returned a result with an error set
...

Note: the same error does not appear when verbose=False.

See also #594
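
For illustration, one possible way to avoid this class of error is to decode log messages leniently before printing them. This is only a sketch under assumptions and not necessarily the change made in #629; `safe_log_text` is a hypothetical helper and is not part of datatable.

```python
def safe_log_text(raw: bytes) -> str:
    """Decode bytes for display in a log message.

    Hypothetical helper (not part of datatable): bytes that are not
    valid UTF-8 are rendered as \\xNN escapes instead of raising.
    """
    return raw.decode("utf-8", errors="backslashreplace")


src = b"A,\x80\n2,3\n"

# Strict decoding fails on the 0x80 byte, as in the report above.
try:
    src.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding always yields a printable str.
message = safe_log_text(src)
print(repr(message))
```

With lenient decoding the verbose output can still show the offending byte as an escape sequence, so the log stays useful for debugging the input.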

st-pasha added the bug and fread labels on Dec 4, 2017
st-pasha self-assigned this on Dec 4, 2017
st-pasha added this to Done in fread on Dec 4, 2017
st-pasha added this to the Release 0.3.0 milestone on Jan 29, 2020