#FIXME Benni und Flo #2

pfistfl · 2015-08-24T14:55:23Z

```r
# FIXME: data sets with text features and special characters; they are not stored as UTF-8 on OML
dids = setdiff(dids, c(374, 376, 379, 380))

# FIXME: strings are split at ",", so "[1,2]" becomes "'[1" and "2]'"
dids = setdiff(dids, c(1047, 1057))

# FIXME: foreign cannot read these data sets; line breaks are \r\r\n instead of \r\n
# Might be due to conversion using R download.file()?
dids = setdiff(dids, c(579, 585, 581))

# FIXME: data sets with spaces in column names
dids = setdiff(dids, c(1058))

# FIXME: error in the data: numeric values are sometimes quoted, e.g. '1047' instead of 1047
# Weka simply removes the quotes
dids = setdiff(dids, c(1092, 1095))

# FIXME: data set where @Data lines sometimes begin with ","
# farff reads NA for the first entry and drops the last entry in the line
# RWeka removes the ","
dids = setdiff(dids, c(676))

# FIXME: data set of the form {0 entry1, 1 entry2, 2 entry3, 4 entry5},
# where 0, 1, 2, 4, ... is the column number.
# Each entry is preceded by its column number;
# if a column number does not appear inside the '{}',
# fill that column with 0 (that is what RWeka does)
dids = setdiff(dids, c(292))
```
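To illustrate the quoted-string problem reported for 1047 and 1057, here is a minimal base-R sketch of a quote-aware split. The helper name `split_arff_line` is hypothetical and this is not how farff parses data lines; it only assumes that fields are comma-separated and may be wrapped in single or double quotes. Backslash escapes inside quoted values are not handled.

```r
# Hypothetical helper, not part of farff: split one ARFF data line at commas
# while keeping commas that sit inside quoted fields.
split_arff_line = function(line) {
  scan(text = line, what = "character", sep = ",",
       quote = "'\"", strip.white = TRUE, quiet = TRUE)
}

# A naive split cuts the quoted value apart, as described in the FIXME.
naive = strsplit("'[1,2]',3,abc", ",", fixed = TRUE)[[1]]

# The quote-aware split keeps '[1,2]' together as one field.
fields = split_arff_line("'[1,2]',3,abc")
```

With the example line above, `naive` has four elements and `fields` has three.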

berndbischl · 2015-08-24T15:47:01Z

Thanks
Please only check against RWeka, not foreign.

berndbischl · 2015-08-24T15:47:21Z

Please update the list here; I am still working on new versions.

berndbischl · 2015-08-24T15:48:16Z

Data sets where I cannot do anything, usually because they are "invalid" on OML, will be excluded, with comments, in the OML unit test file in farff.

berndbischl · 2015-08-24T15:50:26Z

c(579,585, 581)

These work now

berndbischl · 2015-08-24T15:51:14Z

dids = setdiff(dids, c(1058))

This should work now

berndbischl · 2015-08-24T15:53:54Z

292 is in sparse format, I cannot handle this yet.
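For readers unfamiliar with the sparse layout, the following sketch shows how one sparse data line could be expanded into a dense row, filling omitted columns with 0 as described in the first comment. The function `expand_sparse_row` and its `n.cols` argument are hypothetical and not part of farff. The sketch assumes 0-based column indices and values that contain no commas.

```r
# Hypothetical helper: expand a sparse ARFF data line such as
# "{0 a, 1 b, 2 c, 4 e}" into a dense character vector of length n.cols.
expand_sparse_row = function(line, n.cols) {
  # Strip the surrounding braces.
  inner = sub("^[[:space:]]*[{](.*)[}][[:space:]]*$", "\\1", line)
  # Columns that are not listed default to "0".
  dense = rep("0", n.cols)
  if (!nzchar(trimws(inner))) return(dense)
  entries = trimws(strsplit(inner, ",", fixed = TRUE)[[1]])
  for (e in entries) {
    # Each entry is "<0-based column index> <value>".
    parts = regmatches(e, regexec("^([0-9]+)[[:space:]]+(.*)$", e))[[1]]
    dense[as.integer(parts[2]) + 1L] = parts[3]
  }
  dense
}

row = expand_sparse_row("{0 a, 1 b, 2 c, 4 e}", n.cols = 5)
```

Here column index 3 is missing from the input, so the fourth element of `row` is filled with "0".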

berndbischl · 2015-08-24T19:32:12Z

dids = setdiff(dids, c(1092, 1095))

I reported this on the server, faulty ARFF IMHO

openml/OpenML#204

EDIT: Fixed on OML server and can be parsed correctly now.
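For reference, the quote removal that Weka is said to perform on such values can be approximated in base R as below. The helper name `unquote_numeric` is hypothetical; this is a sketch of the idea, not code from farff or RWeka.

```r
# Hypothetical helper: strip one leading and one trailing quote character,
# then convert to numeric. Values that are not numbers become NA.
unquote_numeric = function(x) {
  stripped = gsub("^['\"]|['\"]$", "", trimws(x))
  suppressWarnings(as.numeric(stripped))
}

vals = unquote_numeric(c("'1047'", "1047", " 3.5 "))
```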

berndbischl · 2015-08-24T19:32:54Z

dids = setdiff(dids, c(374, 376, 379, 380))

I reported this on the server, faulty ARFF IMHO

openml/OpenML#201

EDIT: Fixed on OML server and can be parsed correctly now, but only with data.reader = 'readr'.

berndbischl · 2015-08-24T19:33:52Z

dids = setdiff(dids, c(676))

I reported this on the server, faulty ARFF IMHO

openml/OpenML#203

EDIT: Fixed on OML server and can be parsed correctly now.

berndbischl · 2015-08-28T11:37:50Z

Can you please give feedback on whether this is all done now?

berndbischl · 2015-08-28T11:38:15Z

Please retest from your side with the latest version.

pfistfl · 2015-08-28T13:01:32Z

Ok, checked back:

Not yet working for me:

```r
# FIXME: data sets with text features and special characters; they are not stored as UTF-8 on OML
dids = setdiff(dids, c(374, 376, 379, 380))

# Shouldn't this work by now? I even explicitly included
# d1 = readARFF(path, data.reader = "readr")

# and (not possible yet):

# FIXME: data set of the form {0 entry1, 1 entry2, 2 entry3, 4 entry5},
# where 0, 1, 2, 4, ... is the column number.
# Each entry is preceded by its column number;
# if a column number does not appear inside the '{}',
# fill that column with 0 (that is what RWeka does)
dids = setdiff(dids, c(292))
```

pfistfl · 2015-08-28T13:10:41Z

Additionally found new errors (extended search range):

```r
# If data lines do not end in \r\n, an extra line of NAs is added.
# Happens at the end of 1028 and 1030, and on every second line of 1059 and 1064.
did2s = setdiff(did2s, c(1028, 1030, 1059, 1064))
```
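One possible way to guard against such spurious NA rows, and against the \r\r\n line breaks mentioned earlier, is to normalize line endings and drop empty lines before parsing. This is a sketch under that assumption, not what farff does internally; `normalize_lines` is a hypothetical name.

```r
# Hypothetical pre-processing step: unify line endings and drop blank lines.
normalize_lines = function(text) {
  # Collapse \r\n and \r\r\n into a single \n.
  text = gsub("\r+\n", "\n", text)
  # Treat any remaining lone \r as a line break as well.
  text = gsub("\r", "\n", text, fixed = TRUE)
  lines = strsplit(text, "\n", fixed = TRUE)[[1]]
  # Blank lines would otherwise be parsed as rows of NAs.
  lines[nzchar(trimws(lines))]
}

cleaned = normalize_lines("1,2\r\r\n3,4\r\n\r\n5,6")
```

The mixed line endings in the example collapse to three data lines, and the blank line between the second and third is dropped.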

berndbischl · 2015-08-28T13:23:03Z

dids = setdiff(dids, c(374, 376, 379, 380))

This is already in my unit tests? Please paste what happens here, with readr.

berndbischl · 2015-08-28T13:24:16Z

292: Please don't refer to that again. As I said, it is in sparse format, which I cannot handle right now, and we have a separate issue for that.

berndbischl · 2015-08-28T13:25:29Z

Can you please also simply run the whole unit tests on your machine?
They all pass here, and all of your data sets are already included here.

pfistfl · 2015-08-28T13:28:00Z

dids = setdiff(dids, c(374, 376, 379, 380)) 
# Works now, no idea what the error was before. I'll reply when I can reproduce.

292: noted.

jakobbossek · 2016-09-07T16:26:50Z

All but 292 (sparse format) run fine. Closing 🎉

pfistfl mentioned this issue Aug 24, 2015

#FIXME Benni und Flo #1

Closed

jakobbossek closed this as completed Sep 7, 2016
