Benchmark GapEncoder divergence #593
Conversation
Hey @LilianBoulard, here are some quick comments. My main point is that benchmarks should be simpler to read and program.
    print(self.W_.shape, self.A_.shape, self.B_.shape)

    self.benchmark_results_.append(
I'd change 2 things here:
- Only take the last values for this run instead of accumulating data. Meaning, this `append()` should be placed after the `break`.
- It's safer and easier for debugging to write results to a txt/csv file than to accumulate things in RAM (if you cancel the run, the results are not lost).
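As a toy illustration of the first suggestion (the loop body and the convergence criterion here are made-up placeholders, not the real GapEncoder fit loop), appending after the `break` yields exactly one row per run:

```python
# Hedged sketch: record only the final state of a run instead of
# appending a row on every inner iteration.
results = []
state = 0.0
n_iter = 0
for n_iter in range(1, 101):
    state += 1.0  # placeholder for one inner optimization step
    if state >= 10.0:  # placeholder convergence criterion
        break
# Appending here, after the break, keeps a single row per run.
results.append({"n_iter": n_iter, "final_state": state})
```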
- Okay, I see your point
- True, I'll look into that
I've improved `monitor` to automatically save intermediary results, thanks for the suggestion :)
For the first point, I've left it before the `break` for now so that we can see in greater detail what's going on (and to be honest, the matrices aren't that large for `employee_salaries`: only ~3700 x 10 each). We might reconsider this when working with the other datasets!
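The real decorator lives in `benchmarks/utils/monitor.py`; the following is only a hypothetical sketch of the intermediary-saving idea (the `monitor`, `save_to`, and `n_repeat` names are illustrative, not the actual API): each repeat's result row is flushed to a CSV file as soon as it is produced, so a cancelled run loses nothing.

```python
import csv
import functools
import tempfile
from pathlib import Path


def monitor(save_to: Path):
    """Hypothetical sketch of a monitoring decorator that appends each
    repeat's result dict to a CSV file as soon as it is produced."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, n_repeat: int = 1, **kwargs):
            results = []
            for repeat in range(n_repeat):
                result = func(*args, **kwargs)
                result["repeat"] = repeat
                results.append(result)
                # Save after every repeat: interrupting the benchmark
                # still leaves every completed result on disk.
                write_header = not save_to.exists()
                with save_to.open("a", newline="") as f:
                    writer = csv.DictWriter(f, fieldnames=sorted(result))
                    if write_header:
                        writer.writeheader()
                    writer.writerow(result)
            return results
        return wrapper
    return decorator


out = Path(tempfile.mkdtemp()) / "monitor.csv"


@monitor(save_to=out)
def bench():
    # Stand-in for one benchmarked run.
    return {"score": 0.9}


results = bench(n_repeat=3)
```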
For the last part, it's just that we're going to have mostly duplicated points, so it will be more work for you to clean your data for plotting ;)
Unless I'm missing something, we want to benchmark the GapEncoder quickly, since it's currently the main bottleneck for the TableVectorizer. Improving the benchmark script brings value, but we should aim for simplicity and get a result first, then iterate on the benchmark. That way, we can parallelize the work and improve the GapEncoder's default hyper-parameters, for example. WDYT?
…_gap_divergence
# Conflicts:
#	benchmarks/utils/monitor.py
Looking at the plots above, I think we can see that the drug_directory dataset is particularly difficult. In particular, on the top-left plot, the score of the corresponding problem seems to go up as a function of iterations, although it really shouldn't. Maybe that would be an interesting specific case to investigate.
…_gap_divergence
# Conflicts:
#	benchmarks/bench_tablevectorizer_tuning.py
#	benchmarks/utils/__init__.py
#	benchmarks/utils/_various.py
#	benchmarks/utils/monitor.py
Nice improvements to the benchmarking suite, thanks! The doc is much clearer, and the hot-load is cool.
    _monitored[key].append(value)
    # To avoid repeating code, we move the result
    # mapping(s) to a list.
    result_mappings = []
Am I right that this part is a bit complicated in order to support the monitored function returning a list of dicts? Is this a possible situation?
The code is a bit complex, I agree, but I haven't found a better way of doing it. And yes, we do want to support the monitored function returning a list of dicts; the "why" is explained in the docstring :)
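One way to picture why the list-of-dicts support adds complexity: every value the monitored function returns has to be normalized to one shape before its keys can be accumulated. A minimal sketch (`normalize_results` is a made-up name for illustration, not the actual `monitor` implementation):

```python
def normalize_results(returned):
    """Hypothetical helper: accept either a single result dict or a
    list of dicts (e.g. one per inner iteration) from the monitored
    function, and always hand back a list of dicts."""
    if isinstance(returned, dict):
        return [dict(returned)]
    return [dict(mapping) for mapping in returned]


# A plain dict is wrapped; a list of dicts passes through unchanged.
print(normalize_results({"score": 0.5}))      # → [{'score': 0.5}]
print(normalize_results([{"i": 0}, {"i": 1}]))  # → [{'i': 0}, {'i': 1}]
```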
Oh yeah sorry, we're actually using this in this benchmark 🤦♂️
Co-authored-by: LeoGrin <45738728+LeoGrin@users.noreply.github.com>
Thanks!
* Improve framework
* Add Gap divergence benchmark
* Set initial iter values
* Add omitted value (score)
* Force keyword arguments and add progress saving
* Minor fixes
* Update to main
* Add pyarrow to benchmark requirements
* Implement cross-validation
* Update README
* Parallelize cross-validation
* Fix attribute access
* Fix attribute access
* Fix unpacking
* Fix results naming
* Fix results bug
* Multiple columns support and W_change plot v1
* Refactor dataset getters
* Adapt getters usage to new format
* Small fixes
* New plots
* Fix dataset categorization
* Update used datasets
* Add score per inner iteration plot
* Add benchmark results
* Compute the score after tuning
* Add issue link for `road_safety`
* Update results
* Update benchmarks/utils/monitor.py
  Co-authored-by: LeoGrin <45738728+LeoGrin@users.noreply.github.com>

---------

Co-authored-by: LeoGrin <45738728+LeoGrin@users.noreply.github.com>
Following a discussion with @Vincent-Maladiere, here is a benchmark meant to monitor some values during the internal iterations of the `GapEncoder`, which could help us understand what's going wrong. This PR also improves upon #574 by fixing a bug with `n_repeat` and adding support for multiple return dictionaries (which, here, is useful to save the values at each iteration). Please read the docstring of `bench_gap_divergence.py` for an in-depth explanation of the process, and give us your insight on what I could add before running it. @GaelVaroquaux

Part of the saga #342
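As a rough picture of the kind of per-iteration signal such a benchmark records (this is a self-contained NumPy toy with a placeholder update rule, not the GapEncoder's actual multiplicative NMF step; the matrix shape loosely echoes the ~3700 x 10 mentioned above, scaled down):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the GapEncoder's topic matrix W_: run a dummy
# update toward a fixed target and record the relative norm of the
# change at each inner iteration, i.e. the "W_change" style signal.
W = rng.random((50, 10))
target = rng.random((50, 10))
w_changes = []
for _ in range(20):
    W_prev = W.copy()
    W = 0.9 * W + 0.1 * target  # placeholder update, not the real NMF step
    w_changes.append(np.linalg.norm(W - W_prev) / np.linalg.norm(W_prev))

# A shrinking series indicates the update is converging; a flat or
# growing one would flag the kind of divergence this PR investigates.
```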