Commit fe3c015: one sentence per line
jjc2718 committed Jun 12, 2023 (1 parent: 131ae24)
1 changed file with 10 additions and 4 deletions: 01_stratified_classification/README.md

## Running experiments

To train classifiers and generate the primary results files for the optimization comparison (e.g. classification metrics, best model coefficients, loss function curves), run the `run_stratified_lasso_penalty.py` script.
By default, this will use the `liblinear` (coordinate descent) optimizer, unless the `--sgd` flag is included, in which case it will use SGD (stochastic gradient descent).
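A minimal sketch of the two invocations (assuming the script is run from this directory; the exact interface may differ, so check the script's own documentation):

```sh
# default optimizer: coordinate descent via liblinear
python run_stratified_lasso_penalty.py

# use SGD instead (assumed invocation using the flag described above)
python run_stratified_lasso_penalty.py --sgd
```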

If the `--sgd` flag is included, the `--sgd_lr_schedule` argument can be used to select the learning rate schedule.
The default is `optimal` (this is the scikit-learn default), but most experiments in the paper use the `constant_search` option.
Other options are described in the [scikit-learn `SGDClassifier` documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html), under the `learning_rate` parameter.
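For example, the schedule used for most experiments in the paper could be selected as follows (an assumed invocation built from the flags described above):

```sh
# SGD with the constant_search learning rate schedule
python run_stratified_lasso_penalty.py --sgd --sgd_lr_schedule constant_search
```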

The `--num_features` argument can be used for feature selection.
This will default to 8000 features selected by median absolute deviation.
For the experiments in the paper, we used 16042 features, which is all of the features in the preprocessed TCGA gene expression dataset.
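As a sketch, selecting all of the preprocessed features rather than the 8000-feature default might look like this (assumed invocation):

```sh
# use all 16042 features from the preprocessed TCGA gene expression data
python run_stratified_lasso_penalty.py --num_features 16042
```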

The script will write output to the `--results_dir` directory, which defaults to `01_stratified_classification/results`.
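To redirect output, the default directory can be overridden, for example (the path below is only a placeholder):

```sh
# write results to a custom location instead of 01_stratified_classification/results
python run_stratified_lasso_penalty.py --results_dir /path/to/custom_results
```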

## Analysis and visualization of results

The Jupyter notebooks in this directory can be used to visualize the results generated by `run_stratified_lasso_penalty.py` and ultimately to generate the figures in the paper, as described above in the "repository layout" section.
Each of these notebooks has a `results_dir` variable (or multiple variables) defined near the top, which can either be set manually or modified programmatically using [papermill command line arguments](https://papermill.readthedocs.io/en/latest/).
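As a hedged example of the papermill route (the notebook filenames below are placeholders, not actual files in this directory; `-p` sets a notebook parameter by name):

```sh
# execute a notebook with results_dir overridden on the command line
papermill some_analysis_notebook.ipynb some_analysis_notebook_output.ipynb \
  -p results_dir 01_stratified_classification/results
```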
