Write out cross-validation results immediately #2483

Merged
maxnoe merged 7 commits into main from write_cv_earlier on Jan 9, 2024

Contributor

@LukasBeiske LukasBeiske commented Dec 14, 2023

Fixes #2480

codecov bot commented Dec 14, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (75a38ee) 92.44% compared to head (19791c9) 92.47%.
Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2483      +/-   ##
==========================================
+ Coverage   92.44%   92.47%   +0.02%     
==========================================
  Files         234      234              
  Lines       19917    19965      +48     
==========================================
+ Hits        18413    18462      +49     
+ Misses       1504     1503       -1     


ctapipe/reco/sklearn.py (outdated review thread, resolved)
Member

@maxnoe maxnoe left a comment

Good start, but you should

a) Check overwrite conditions in __init__: if the output file exists and overwrite is not set, raise an error; if the output file exists and overwrite=True, truncate it (tables.open_file(path, mode='w')).

b) Move the writing into the cv loop using write_table(..., append=True) to further reduce memory usage

c) Add writing of metadata to the output file

Fix tools after removing unused argument
@LukasBeiske
Contributor Author

> Good start, but you should
>
> a) Check overwrite conditions in __init__: if the output file exists and overwrite is not set, raise an error; if the output file exists and overwrite=True, truncate it (tables.open_file(path, mode='w')).
>
> b) Move the writing into the cv loop using write_table(..., append=True) to further reduce memory usage
>
> c) Add writing of metadata to the output file

a) The error if file exists and not overwrite is already raised by the train tools via Tool.check_output.

a)/b) Using write_table(..., append=True, mode='w') does not work, does it? Isn't this basically the same as write_table(..., append=True, overwrite=True)?

c) Is there any other metadata required besides some column descriptions?

@maxnoe
Member

maxnoe commented Dec 14, 2023

> a) The error if file exists and not overwrite is already raised by the train tools via Tool.check_output.

Also for the cv output file and not just the model?

@maxnoe
Member

maxnoe commented Dec 14, 2023

> a)/b) Using write_table(..., append=True, mode='w') does not work, does it? Isn't this basically the same as write_table(..., append=True, overwrite=True)?

You need to truncate the file once in __init__ and then only use append=True
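
As a rough illustration of the pattern suggested here (check overwrite and truncate once in `__init__`, then only append inside the cv loop), the following is a minimal, standard-library-only sketch. It uses a CSV file as a stand-in for the HDF5 output; `FoldResultWriter`, `write_fold`, and `run_cross_validation` are hypothetical names and not part of ctapipe, and the actual change would use `tables.open_file(path, mode='w')` and `write_table(..., append=True)` instead.

```python
import csv
from pathlib import Path


class FoldResultWriter:
    """Hypothetical stand-in for a cross validator that writes per-fold results."""

    def __init__(self, output_path, overwrite=False):
        self.output_path = Path(output_path)
        # Refuse to clobber an existing file unless overwrite was requested.
        if self.output_path.exists() and not overwrite:
            raise FileExistsError(
                f"{self.output_path} exists and overwrite is False"
            )
        # Truncate (or create) the file exactly once, up front.
        with self.output_path.open("w", newline="") as f:
            csv.writer(f).writerow(["fold", "truth", "prediction"])

    def write_fold(self, fold, truth, prediction):
        # Append only: results of earlier folds are already on disk
        # and no longer need to be kept in memory.
        with self.output_path.open("a", newline="") as f:
            writer = csv.writer(f)
            for t, p in zip(truth, prediction):
                writer.writerow([fold, t, p])


def run_cross_validation(path, folds, overwrite=False):
    """Write each fold as soon as it is available, then read everything back."""
    writer = FoldResultWriter(path, overwrite=overwrite)
    for fold, (truth, prediction) in enumerate(folds):
        writer.write_fold(fold, truth, prediction)
    with Path(path).open(newline="") as f:
        return list(csv.reader(f))
```

Truncating in the constructor means that a later `append` never needs an overwrite flag, so a rerun with `overwrite=True` cannot accidentally mix old and new folds in the same file.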

@LukasBeiske
Contributor Author

> > a) The error if file exists and not overwrite is already raised by the train tools via Tool.check_output.
>
> Also for the cv output file and not just the model?

Yes, both output paths are passed, and CrossValidator.output_path is ignored if it is None.

@maxnoe maxnoe merged commit 75d0ba4 into main Jan 9, 2024
13 of 14 checks passed
@maxnoe maxnoe deleted the write_cv_earlier branch January 9, 2024 14:49
Successfully merging this pull request may close these issues.

Write cross-validation results immediately after cross validation
3 participants