Streamline the results file for cross-validation task #372

desilinguist · 2017-10-18T18:20:06Z

The .results file is produced for the evaluate and the cross_validate tasks. Currently, for the cross_validate task, the .results file (a) prints out the entire folds dictionary if folds_file is specified and (b) includes details that aren't relevant to the experiment.

In this branch, I streamline the results file in the following way:

- Include the path to the folds file when specified. This required the modification of the return tuple from _parse_config_file to include the folds file path.
- Do not include stratified folds information when a folds file is specified since it's not used in that context.
- Modify number of CV folds to be <n> via folds file if a folds file is specified.
- Modify grid search folds to be <n> via folds file if the grid search ends up using the folds file.
- Show the value of use_folds_file_for_grid_search when appropriate.
- Show grid search related information (grid_search_folds and grid_objective) only when we are actually doing grid search.
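The conditions in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in skll/experiments.py: the function name `write_fold_info`, the dictionary keys other than `task`, `cv_folds`, `stratified_folds` and `grid_search`, the output labels other than `Number of Folds` and `Stratified Folds`, and the example values are all assumptions made for the example.

```python
import io


def write_fold_info(lrd, output_file):
    """Write fold-related lines of a results file (illustrative sketch only)."""
    if lrd['task'] == 'cross_validate':
        print('Number of Folds: {}'.format(lrd['cv_folds']), file=output_file)
        # When a folds file is used, the value reads like "3 via folds file".
        uses_folds_file = str(lrd['cv_folds']).endswith('folds file')
        if uses_folds_file:
            # Hypothetical label and key for the folds file path.
            print('Folds File: {}'.format(lrd['folds_file']), file=output_file)
            if lrd['grid_search']:
                print('Using Folds File for Grid Search: {}'.format(
                    lrd['use_folds_file_for_grid_search']), file=output_file)
        else:
            # Stratification only matters when no folds file is given.
            print('Stratified Folds: {}'.format(lrd['stratified_folds']),
                  file=output_file)
    # Grid search details appear only when grid search is actually done.
    if lrd['grid_search']:
        print('Grid Search Folds: {}'.format(lrd['grid_search_folds']),
              file=output_file)
        print('Grid Objective Function: {}'.format(lrd['grid_objective']),
              file=output_file)


# Hypothetical learner result dictionary: folds file given, no grid search.
example = {
    'task': 'cross_validate',
    'cv_folds': '3 via folds file',
    'folds_file': 'folds.csv',
    'stratified_folds': True,
    'grid_search': False,
    'use_folds_file_for_grid_search': True,
    'grid_search_folds': 3,
    'grid_objective': 'accuracy',
}
buffer = io.StringIO()
write_fold_info(example, buffer)
print(buffer.getvalue())
```

With this input, only the number of folds and the folds file path are written; the stratified folds line and the grid search lines are skipped.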

To test this, I also included an entirely new test in test_output.py. Additional changes in the branch include:

- Modify some additional tests to deal with the modifications to the .results and, hence, .results.json files.
- Minor changes to docstrings and documentation.

- Include folds file path as one of the return values so that it can be included in the results file.

- Include the path to the folds file when specified.
- Do not include stratified folds information when folds file is specified.
- Modify number of CV folds to be `K via folds file` if a folds file is specified.
- Modify grid search folds to be `K via folds file` if they also end up using the folds file.
- Show the value of `use_folds_file_for_grid_search` when appropriate.

- Modify tests.

coveralls · 2017-10-18T19:09:15Z

Coverage increased (+0.04%) to 92.026% when pulling 7c8e800 on fix/streamline-fancy-results-file into 0052077 on master.

mulhod

Looks good! I think I found a couple of typo bugs, but I'm not really sure.

mulhod · 2017-10-18T19:05:31Z

skll/experiments.py

    if lrd['task'] == 'cross_validate':
        print('Number of Folds: {}'.format(lrd['cv_folds']),
              file=output_file)
-        print('Stratified Folds: {}'.format(lrd['stratified_folds']),
-              file=output_file)
+        if not lrd['cv_folds'].endswith('folds file'):


Is "folds file" supposed to be "folds_file" here? Can we just make it check if the string is equal to "folds file"/"folds_file" rather than using endswith or is there some reason that might not work?

Nope, it's trying to match the string on the right-hand side of the colon in the results file, which should say `3 via folds file`, so it will always end with "folds file", not "folds_file".

Ok, I see. Misread that.

No worries. If you want to see some example output for various conditions, you can play around with cross_validate.cfg in the titanic example and look at the results files.
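To make the point of this thread concrete, here is a small sketch of why a suffix check works where an equality check would not. The helper name `uses_folds_file` and the sample values are hypothetical; the `str()` call is an extra guard added for the example and is not in the diff above.

```python
def uses_folds_file(cv_folds):
    """Return True if the displayed fold count came from a folds file."""
    # The displayed value looks like "<n> via folds file", so only the
    # suffix is fixed; the leading number varies with the folds file.
    return str(cv_folds).endswith('folds file')


from_file = '3 via folds file'   # value shown when a folds file is used
plain = '10'                     # value shown for an ordinary fold count

print(uses_folds_file(from_file))
print(uses_folds_file(plain))
print(from_file == 'folds file')
```

An equality test against "folds file" would fail because the string also carries the number of folds, which is why the check uses `endswith`.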

mulhod · 2017-10-18T19:05:52Z

skll/experiments.py

+              file=output_file)
+    if (lrd['task'] == 'cross_validate' and
+        lrd['grid_search'] and
+        lrd['cv_folds'].endswith('folds file')):


See comment above.

See reply above.

Matt didn't actually mean to request changes :)

aoifecahill · 2017-10-18T20:20:28Z

tests/test_output.py

@@ -170,7 +171,7 @@ def check_summary_score(use_feature_hashing=False):
            # the learner results dictionaries should have 29 rows,


I think the comment needs to be updated with the new total number of rows.

desilinguist added 12 commits October 17, 2017 13:19

Merge branch 'fix/cv-folds-file-is-slower' into fix/printing-folds-in…

cb1e1a9

…-results-file

Fix docstring.

744fb11

Modify return value for _parse_config_file

fd94b27

- Include folds file path as one of the return values so that it can be included in the results file.

Deal with addition of folds file path

104e55b

- Modify tests.

Modify tests to account for changes.

1f19ad9

Fix typo in documentation.

89ccea6

Only print grid search info if we are doing it.

87fb2d2

Fix typo.

927a36a

Add test to check results file for cross-validation

16d9f53

Forgot to add template for new test.

cca3800

Merge branch 'master' into fix/streamline-fancy-results-file

397b991

desilinguist self-assigned this Oct 18, 2017

desilinguist requested review from aoifecahill, mulhod and jbiggsets October 18, 2017 18:20

Test that the folds file path is in the results.

7c8e800

EducationalTestingService deleted a comment from coveralls Oct 18, 2017

mulhod previously requested changes Oct 18, 2017

mulhod approved these changes Oct 18, 2017

jbiggsets approved these changes Oct 18, 2017

aoifecahill reviewed Oct 18, 2017

aoifecahill approved these changes Oct 18, 2017

Fix typo in comment [ci sklp]

3722742

desilinguist merged commit 4a1cc23 into master Oct 18, 2017

desilinguist deleted the fix/streamline-fancy-results-file branch October 18, 2017 20:37

desilinguist mentioned this pull request Oct 19, 2017

Fix how folds are printed in results file #371

Closed
