Weird convert to data frame behavior #7414
Comments
Zuzana Olajcová commented: Reproducible with h2o-3.32.1.4; not reproducible on top of the current rel-zipf branch.
Zuzana Olajcová commented: Resolved in #5589
JIRA Issue Details Jira Issue: PUBDEV-8239
Attachments From Jira Attachment Name: as_df_err.parquet
Linked PRs from JIRA
Importing the attached Parquet file and then converting it to a data frame produces unexpected results.
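{code:python}
import h2o

h2o.init()  # connect to (or start) a local H2O cluster
df = h2o.import_file('as_df_err.parquet', header=1)
display(df)  # display() is the notebook built-in
print(df.shape)
display(df.as_data_frame())
{code}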
In Python 3, rows are dropped and characters are converted.
In Python 2, it throws a {{ParserError: Error tokenizing data. C error: EOF inside string starting at row 194}} error.
This is a special case in which a column can contain any character, so escape characters, separators, and quotes can all appear in the data.
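To illustrate why such a column is fragile, here is a minimal sketch that uses only the Python standard library, not H2O itself. It assumes the conversion passes through delimited text at some point, which the {{ParserError}} above suggests but which this sketch does not verify. The sample rows and the helper names ({{naive_dump}}, {{quoted_dump}}, {{parse}}) are hypothetical and are not taken from the attached file.

```python
import csv
import io

# Hypothetical rows: a free-text column containing a separator,
# quote characters, and an embedded newline.
rows = [
    ["1", "plain text"],
    ["2", 'has, comma and "quote"'],
    ["3", "line one\nline two"],
]

def naive_dump(rows):
    # Join fields with commas, with no quoting or escaping at all.
    return "\n".join(",".join(r) for r in rows) + "\n"

def quoted_dump(rows):
    # Quote every field; embedded quotes are doubled by the csv module.
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL).writerows(rows)
    return buf.getvalue()

def parse(text):
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))

naive_rows = parse(naive_dump(rows))
quoted_rows = parse(quoted_dump(rows))

print(len(rows), len(naive_rows), len(quoted_rows))
print(quoted_rows == rows)
```

With unquoted output, the embedded comma and newline shift field and row boundaries, so the parsed shape no longer matches the original. With every field quoted, the same data round-trips unchanged.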