SRBCT microarray data #204

Open
szcf-weiya opened this issue Aug 18, 2019 · 4 comments

szcf-weiya commented Aug 18, 2019

SRBCT microarray data

SRBCT gene expression data: 2308 genes, 63 training samples,
25 test samples.

One gene per row, one sample per column

Cancer classes are labelled 1,2,3,4 for c("EWS","RMS","NB","BL")

Files

  • Training set gene expression: khan.xtrain.txt
  • Training set class labels: khan.ytrain.txt
  • Test set gene expression: khan.xtest.txt
  • Test set class labels: khan.ytest.txt
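
A minimal sketch of how files laid out this way could be read, assuming they are whitespace-delimited and that unlabeled test samples appear as `NA` in the label file (both are assumptions, not confirmed by the thread). The helper names `read_expression` and `read_labels` are hypothetical, and the toy strings stand in for the real files.

```python
import io

# Class codes as given in the data description.
CLASS_NAMES = {1: "EWS", 2: "RMS", 3: "NB", 4: "BL"}

def read_expression(handle):
    """Read a genes-by-samples table and return it as samples-by-genes."""
    rows = [[float(v) for v in line.split()] for line in handle if line.strip()]
    # zip(*rows) transposes: each output row is one sample across all genes.
    return [list(col) for col in zip(*rows)]

def read_labels(handle):
    """Read class labels; 'NA' entries (assumed format) become None."""
    return [None if t == "NA" else int(t) for t in handle.read().split()]

# Toy stand-ins for khan.xtest.txt / khan.ytest.txt: 3 genes x 4 samples.
x_text = "0.1 0.2 0.3 0.4\n1.0 1.1 1.2 1.3\n-0.5 -0.4 -0.3 -0.2\n"
y_text = "1 2 NA 4\n"

X = read_expression(io.StringIO(x_text))
y = read_labels(io.StringIO(y_text))

# Drop samples without a label before scoring a classifier.
keep = [i for i, label in enumerate(y) if label is not None]
X_labeled = [X[i] for i in keep]
names = [CLASS_NAMES[y[i]] for i in keep]
```

Transposing up front means each row is one sample, which is the layout most classifiers expect.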
szcf-weiya added a commit that referenced this issue Aug 18, 2019

diagonal LDA

See p. 652 of ESL, or ESL CN.
[image: frequency table]
The original text claims that

Here the diagonal LDA classifier yielded five misclassification errors for the 20 test samples.

As the frequency table above shows, there are 7 misclassification errors among the 20 test samples (NA samples excluded), which is roughly comparable to the five errors reported in the text.
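
For reference, a diagonal LDA classifier can be sketched as below. This is a toy illustration and not the code behind the table above: it assumes the standard form of the diagonal-covariance discriminant (squared distance to each class centroid, standardized by a pooled per-gene variance, plus a prior term), and the function names and data are hypothetical.

```python
import math
from collections import Counter

def fit_diagonal_lda(X, y):
    """Estimate class centroids, pooled per-gene variances, and class priors."""
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    centroids, priors = {}, {}
    for k in classes:
        rows = [x for x, label in zip(X, y) if label == k]
        centroids[k] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
        priors[k] = len(rows) / n
    # Pooled within-class variance for each gene (divisor n - K).
    var = [
        sum((x[j] - centroids[label][j]) ** 2 for x, label in zip(X, y))
        / (n - len(classes))
        for j in range(p)
    ]
    return centroids, var, priors

def predict(x, centroids, var, priors):
    """Return the class with the largest diagonal-covariance discriminant."""
    def score(k):
        dist = sum((x[j] - centroids[k][j]) ** 2 / var[j] for j in range(len(x)))
        return -dist + 2 * math.log(priors[k])
    return max(centroids, key=score)

# Toy data: 2 classes, 2 genes (not the SRBCT data).
X_train = [[0.0, 0.1], [0.2, -0.1], [2.0, 1.9], [2.2, 2.1]]
y_train = [1, 1, 2, 2]
X_test = [[0.1, 0.0], [1.9, 2.2], [0.3, 0.2]]
y_test = [1, 2, 2]

centroids, var, priors = fit_diagonal_lda(X_train, y_train)
preds = [predict(x, centroids, var, priors) for x in X_test]

# Frequency table of (true, predicted) pairs and the error count.
table = Counter(zip(y_test, preds))
errors = sum(t != q for t, q in zip(y_test, preds))
```

The `table` counter plays the role of the frequency table: off-diagonal keys such as `(2, 1)` are the misclassifications.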


Regularized Diagonal LDA (with Delta = 2.0)

[image]
Zero training errors and only 1 misclassification error among the 20 test samples (NA samples excluded), which roughly agrees with the top panel of Fig. 18.4.
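
The shrinkage step behind regularized diagonal LDA (nearest shrunken centroids) can be sketched as follows. It assumes the form given in ESL Section 18.2: standardized differences d_kj = (mean_kj - mean_j) / (m_k (s_j + s_0)) with m_k^2 = 1/N_k - 1/N and s_0 taken as the median of the s_j, soft-thresholded by Delta. The function names and toy data are hypothetical, and this is not the code that produced the result above.

```python
import math
from statistics import median

def soft_threshold(d, delta):
    """Move d toward zero by delta, stopping at zero."""
    return math.copysign(max(abs(d) - delta, 0.0), d)

def shrunken_centroids(X, y, delta):
    """Shrink each class centroid toward the overall centroid, gene by gene."""
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    overall = [sum(x[j] for x in X) / n for j in range(p)]
    groups = {k: [x for x, label in zip(X, y) if label == k] for k in classes}
    means = {k: [sum(r[j] for r in rows) / len(rows) for j in range(p)]
             for k, rows in groups.items()}
    # Pooled within-class standard deviation per gene.
    s = [math.sqrt(sum((x[j] - means[label][j]) ** 2 for x, label in zip(X, y))
                   / (n - len(classes))) for j in range(p)]
    s0 = median(s)
    shrunk = {}
    for k, rows in groups.items():
        m_k = math.sqrt(1 / len(rows) - 1 / n)
        shrunk[k] = []
        for j in range(p):
            scale = m_k * (s[j] + s0)
            d = (means[k][j] - overall[j]) / scale
            shrunk[k].append(overall[j] + scale * soft_threshold(d, delta))
    return shrunk, overall

# Toy data: gene 0 separates the classes strongly, gene 1 only weakly.
X = [[0.0, 1.0], [0.2, 1.2], [4.0, 1.2], [4.2, 1.4]]
y = [1, 1, 2, 2]
shrunk, overall = shrunken_centroids(X, y, delta=2.0)

# A gene stays in the classifier only if some class centroid still differs
# from the overall centroid after shrinkage.
active = [j for j in range(len(overall))
          if any(shrunk[k][j] != overall[j] for k in shrunk)]
```

Here only gene 0 survives the threshold, which is how increasing Delta reduces the number of genes used.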


remove NA

[image]


Error curves (Fig. 18.4 top)

[image: error_curves]
This roughly reproduces the original figure; the CV error may differ because the samples are divided into folds differently.
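
To illustrate why the fold division matters, here is a small sketch of random fold assignment. The fold count of 10 and the helper name `assign_folds` are assumptions for illustration; the thread does not say how the folds were actually drawn.

```python
import random

def assign_folds(n_samples, n_folds, seed):
    """Shuffle sample indices and deal them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every n_folds-th index goes to the same fold.
    return [sorted(indices[f::n_folds]) for f in range(n_folds)]

# 63 training samples, as in the data description.
folds_a = assign_folds(63, 10, seed=1)
folds_b = assign_folds(63, 10, seed=2)

# Same seed reproduces the same partition; a different seed gives another one,
# so the cross-validated error curve can shift slightly between runs.
same_seed_matches = folds_a == assign_folds(63, 10, seed=1)
different_seed_differs = folds_a != folds_b
```

Fixing the seed makes a CV curve reproducible, but it will still not match a curve computed with a different partition.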

A tip related to the plot: I could not find a twiny command in Plots.jl, although a twinx command does exist as a bonus feature that is not described in the docs (JuliaPlots/Plots.jl#337).
I therefore resorted to the PyPlot package.
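
A sketch of the twiny approach in matplotlib, the library that PyPlot wraps. It is written in Python rather than Julia, and the numbers are made-up illustration values, not results from the SRBCT data. The axis labels are assumptions modelled on a shrinkage-versus-error plot with the gene count on the top axis.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Made-up illustration values, NOT results from the SRBCT data.
deltas = [0.0, 1.0, 2.0, 3.0, 4.0]
test_error = [0.30, 0.20, 0.05, 0.10, 0.40]
genes_left = [2000, 900, 200, 40, 5]

fig, ax = plt.subplots()
ax.plot(deltas, test_error, marker="o", label="test error (illustrative)")
ax.set_xlabel("Amount of shrinkage (Delta)")
ax.set_ylabel("Error")
ax.legend()

# twiny() creates a second x-axis on top that shares the y-axis.
top = ax.twiny()
top.set_xlim(ax.get_xlim())  # align the top ticks with the bottom axis
top.set_xticks(deltas)
top.set_xticklabels([str(g) for g in genes_left])
top.set_xlabel("Number of genes")

# Write the figure to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Copying the bottom axis limits to the top axis is what keeps each gene count directly above its Delta value.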
