Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Decoupled CV to master (#240) #271

Merged
merged 2 commits into from Apr 30, 2021
Merged

Conversation

rth
Copy link
Collaborator

@rth rth commented Apr 23, 2021

This is a backport of #243 to master.

I'll mention paris-saclay-cds/ramp-board#515 in the test protocol documentation which should allow running ramp-board CI on arbitrary ramp-workflow PRs using manual triggers

@codecov
Copy link

codecov bot commented Apr 23, 2021

Codecov Report

Merging #271 (6102a34) into master (fbedf04) will increase coverage by 0.24%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #271      +/-   ##
==========================================
+ Coverage   82.34%   82.58%   +0.24%     
==========================================
  Files         136      138       +2     
  Lines        4995     5059      +64     
==========================================
+ Hits         4113     4178      +65     
+ Misses        882      881       -1     
Impacted Files Coverage Δ
rampwf/prediction_types/base.py 87.09% <100.00%> (+0.88%) ⬆️
rampwf/prediction_types/clustering.py 95.23% <100.00%> (+6.34%) ⬆️
rampwf/prediction_types/combined.py 97.72% <100.00%> (+0.42%) ⬆️
rampwf/prediction_types/detection.py 85.18% <100.00%> (+0.56%) ⬆️
rampwf/prediction_types/multiclass.py 95.55% <100.00%> (+0.43%) ⬆️
rampwf/prediction_types/regression.py 95.45% <100.00%> (+1.01%) ⬆️
rampwf/score_types/base.py 69.23% <100.00%> (-4.11%) ⬇️
rampwf/score_types/brier_score.py 100.00% <100.00%> (ø)
rampwf/score_types/classifier_base.py 100.00% <100.00%> (ø)
rampwf/score_types/combined.py 91.66% <100.00%> (ø)
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fbedf04...6102a34. Read the comment docs.

@albertcthomas
Copy link
Collaborator

I think the first 2 commits (the one different to Decoupled CV) are already on master and not needed in this PR. I don't think they appear in the diff so maybe it's not a big deal.

I also think that we could use this opportunity to remove the co-authoring of the decoupled CV? @kegl used 2 usernames.

@albertcthomas
Copy link
Collaborator

I don't think they appear in the diff so maybe it's not a big deal.

yes they don't appear in the changes.

* passing fold to workflow.test_submission

* passing fold indices to Prediction init instead of indexing in util.submission.run_submission_on_cv_fold

* getting rid of passing valid_indexes to scores

* getting rid of valid_indexes in score_type.score_function

* flake

* functional data flow in doc

* training script in doc

* docstring minor

* Add test protocol to README

* Update README.rst
@rth
Copy link
Collaborator Author

rth commented Apr 23, 2021

I also think that we could use this opportunity to remove the co-authoring of the decoupled CV

Thanks for checking @albertcthomas ! I removed it and the 2 extra commits.


It is called in `ramp_test_submission` and at the [ramp-board][rboard] frontend when evaluating user submissions. Most of the time it uses the default implementation in `BaseScoreType` that optionally selects the cv fold and checks if the number of rows are identical in `y_true` and `y_pred`, then calls `__call__(self, y_true, y_pred)`. `ClassifierBaseScoreType` does the same except it calls `__call__` with `y_true_label_index` and `y_pred_label_index`. When __call__ raises a non implemented error, `score_function` has to be overridden.

It is called in `ramp_test_submission` and at the [ramp-board][rboard] frontend when evaluating user submissions. Most of the time it uses the default implementation in `BaseScoreType` that checks if the number of rows are identical in `y_true` and `y_pred`, then calls `__call__(self, y_true, y_pred)`. `ClassifierBaseScoreType` does the same except it calls `__call__` with `y_true_label_index` and `y_pred_label_index`. When __call__ raises a non implemented error, `score_function` has to be overridden.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should maybe use ramp-test instead of ramp_test_submission

trained_workflow = problem.workflow.train_submission(
'submissions/starting_kit', X_train, y_train)
y_pred_test = problem.workflow.test_submission(
trained_workflow, X_test)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice to have this documentation! I think few users are aware that it's possible.

Copy link
Collaborator Author

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I have changed slightly the procedure for testing with ramp-board. Now one can go to ramp-board github actions and copy paste a pip installable URL of ramp-workflow to run CI with,
image
It's still a bit manual and it requires write permissions to ramp-board to trigger CI, but at least it doesn't require to push commits to ramp-board on a separate branch and keep that branch in sync with master as the previous procedure did.

Code wise this LGTM. I can also confirm this passes ramp-board CI on master

@rth rth merged commit e12cf36 into paris-saclay-cds:master Apr 30, 2021
@rth rth deleted the decoupled-cv branch April 30, 2021 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants