Unable to parse attached dataset. #1045

Open
mmalohlava opened this issue May 17, 2018 · 1 comment
Labels: fread, improve, low priority
@mmalohlava
Member

mmalohlava commented May 17, 2018

Running DAI 1.1.3 (pydatatable version 0.4.0.dev114)

and loading attached dataset:

image

smsDataNew.txt

@mmalohlava mmalohlava added the bug label May 17, 2018
@st-pasha
Contributor

The file is in fixed-width format (formatted with tabs), which we don't currently support.
If you ignore the header, the rest of the file can be parsed using sep=\t (which fread autodetects). The one exception is a single line, 1234, which contains an unquoted \t character inside the message, so from fread's standpoint it looks like an extra field.
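
As a rough way to locate a line like that before handing the file to fread, the sketch below counts tab-separated fields per line and reports the lines that disagree with the most common count. It uses only the Python standard library, not datatable; the function name `find_irregular_rows` and the sample rows are made up for illustration and are not taken from the attached dataset.

```python
from collections import Counter


def find_irregular_rows(lines, sep="\t", skip_header=True):
    """Return (line_number, field_count) for lines whose field count
    differs from the most common field count in the input."""
    counts = []
    for number, line in enumerate(lines, start=1):
        if skip_header and number == 1:
            continue  # the header is not tab-delimited like the data rows
        text = line.rstrip("\r\n")
        if not text:
            continue  # ignore blank lines
        counts.append((number, text.count(sep) + 1))
    if not counts:
        return []
    # Treat the most frequent field count as the expected one.
    expected = Counter(c for _, c in counts).most_common(1)[0][0]
    return [(n, c) for n, c in counts if c != expected]


# Hypothetical sample: line 3 has a stray tab inside the message text.
sample = [
    "label   text\n",
    "ham\tsee you at five\n",
    "spam\tWIN a prize\tnow\n",
    "ham\tok\n",
]
print(find_irregular_rows(sample))
```

For a real file, the same function can be called on an open file handle, for example `find_irregular_rows(open("smsDataNew.txt", encoding="utf-8"))`; the encoding here is an assumption. Any line it reports would then need the stray tab quoted or replaced by hand.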

@st-pasha st-pasha added the improve and fread labels and removed the bug label May 18, 2018
@st-pasha st-pasha added this to To Do in fread May 31, 2018
@st-pasha st-pasha added the low priority label Jun 1, 2018
@st-pasha st-pasha mentioned this issue Jan 4, 2020
27 tasks