Examine Gravier (2010) data set #16

Closed
ramhiser opened this Issue Oct 10, 2013 · 1 comment

@ramhiser
Owner

Måns Thulin from Uppsala University sent the following email to me:

I am now planning to use the Gravier (2010) data to illustrate a new method in a paper, but was wondering if perhaps some of the patients in the study have been misclassified in your R package. According to the Gravier et al. paper and your description of the data on the wiki, there should be 111 patients labelled "good" and 57 labelled "poor". However, when I import the data into R, I get the following:

summary(gravier$y)
good poor
 106   62

The numbers of patients (168) and features (2,905) are correct, but there seems to be a problem with the class labels. Have 5 "good" patients been labelled as "poor" or is there in fact a misprint in the Gravier et al. paper? Any insights that you could provide regarding this would be deeply appreciated!
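
The discrepancy described in the email can be checked mechanically by comparing the observed class counts with the published ones. This is a minimal sketch, assuming a stand-in factor rebuilt from the counts quoted above rather than the real `gravier$y`; the names `expected`, `observed`, `delta`, and `labels_ok` are illustrative only.

```r
# Published class counts from Gravier et al. (2010), as quoted in the email.
expected <- c(good = 111, poor = 57)

# Stand-in for gravier$y, rebuilt from the summary() output in the email.
y <- factor(rep(c("good", "poor"), times = c(106, 62)))

# Tabulate the observed labels and subtract the published counts.
observed <- table(y)
delta <- as.integer(observed[names(expected)]) - expected
print(delta)

# TRUE only when every class count matches the paper.
labels_ok <- all(delta == 0)
print(labels_ok)
```

With the real data loaded, replacing the stand-in `y` with `gravier$y` should give the same comparison.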

@ramhiser ramhiser was assigned Oct 10, 2013
@ramhiser
Owner

As Måns noted, the labels were incorrect. The import script was reading the class labels from the wrong column of the additional_info.txt file.
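
One way to guard against this kind of bug is to select the label column by name rather than by position, and to validate the labels right after import. The following is a sketch only: the column names `sample_id`, `size`, and `outcome` and the inline text are hypothetical stand-ins, since the real layout of additional_info.txt is not shown in this issue, and this is not necessarily how the fix in a7374d8 was implemented.

```r
# Hypothetical stand-in for additional_info.txt; the real column names
# and contents are not shown in this issue.
info_text <- "sample_id\tsize\toutcome
S1\tsmall\tgood
S2\tlarge\tpoor
S3\tsmall\tgood"

info <- read.delim(text = info_text, stringsAsFactors = FALSE)

# Select the label column by name, so a reordered or extra column
# cannot silently change which values are read.
stopifnot("outcome" %in% names(info))
y <- factor(info[["outcome"]], levels = c("good", "poor"))

# Fail loudly if any value falls outside the expected label set.
stopifnot(!anyNA(y))
print(table(y))
```

A final comparison of `table(y)` against the published counts (111 "good", 57 "poor") would catch a wrong column even when the values it contains happen to be valid labels.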

@ramhiser ramhiser closed this in a7374d8 Oct 10, 2013