The issue is described here: [https://stackoverflow.com/questions/57309381/as-data-frame-h2oframe-deletes-rows-when-they-contain-a-quote|https://stackoverflow.com/questions/57309381/as-data-frame-h2oframe-deletes-rows-when-they-contain-a-quote]
It works fine when using {{as.h2o()}} to go from data.frame to H2OFrame, but going back from H2OFrame to data.frame via {{as.data.frame()}} merges or drops rows whose values contain a double quote.
{code:r}library(h2o)
h2o.init()
# The last two values contain an embedded double quote (escaped in the R source)
tmp <- data.frame(ngram = c("SIRET:417 653 698",
                            "SIRET:417 653 698 00031",
                            "Sans",
                            "Sans esc.",
                            "Sans esc. jusqu\"au",
                            "Sans esc. jusqu\"au 15.11.2018"))
tmp <- as.h2o(tmp)
tmp <- as.data.frame(tmp)
print(tmp)
ngram
1 SIRET:417 653 698
2 SIRET:417 653 698 00031
3 Sans
4 Sans esc.
5 Sans esc. jusquau\nSans esc. jusquau 15.11.2018{code}
I also tried to see if using data.table as the backend could resolve the issue, but it’s even worse:
{code:r}options("h2o.use.data.table"=TRUE)
tmp <- as.data.frame(tmp)
Warning messages:
1: In data.table::fread(ttt, blank.lines.skip = FALSE, na.strings = "", :
  Found and resolved improper quoting in first 100 rows. If the fields are not quoted (e.g. field separator does not appear within any field), try quote="" to avoid this warning.
2: In data.table::fread(ttt, blank.lines.skip = FALSE, na.strings = "", :
  Detected 3 column names but the data has 4 columns (i.e. invalid file). Added 1 extra default column name for the first column which is guessed to be row names or an index. Use setnames() afterwards if this guess is not correct, or fix the file write command that created the file to create a valid file.
dim(tmp)
[1] 1 4
tmp
V1 "Sans esc. jusqu"au"
1 "Sans esc. jusqu"au 15.11.2018"{code}
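For what it's worth, the row merging can be reproduced in base R without H2O at all. That suggests (this is an assumption on my part, not something confirmed from the H2O code) that the CSV handed from H2O to R carries the embedded double quotes unescaped. Below is a minimal sketch; `bad_csv` and `good_csv` are made-up payloads for illustration, not data captured from H2O.

```r
# Hypothetical CSV payloads; illustrative only, not captured from H2O.
# bad_csv: embedded double quotes are written as-is (invalid CSV quoting).
bad_csv <- paste(
  "\"ngram\"",
  "\"Sans esc.\"",
  "\"Sans esc. jusqu\"au\"",
  "\"Sans esc. jusqu\"au 15.11.2018\"",
  sep = "\n"
)

# good_csv: the same rows with every embedded quote doubled (RFC 4180 style).
good_csv <- gsub("jusqu\"au", "jusqu\"\"au", bad_csv, fixed = TRUE)

# Parse both payloads with the base R CSV reader.
bad  <- read.csv(text = bad_csv,  stringsAsFactors = FALSE)
good <- read.csv(text = good_csv, stringsAsFactors = FALSE)

# Compare how many rows survive parsing in each case.
print(nrow(bad))
print(nrow(good))

# With doubled quotes the embedded quote characters are preserved.
print(good$ngram)
```

If this is indeed the cause, the fix would belong on the export side (doubling embedded quotes when the frame is written out as CSV) rather than in the parser options on the R side.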