Benchmark GapEncoder divergence #593

Merged: 31 commits merged into skrub-data:main on Aug 7, 2023

Conversation

@LilianBoulard (Member) commented Jun 12, 2023

Following a discussion with @Vincent-Maladiere, here is a benchmark meant to monitor some values during the internal iterations of the GapEncoder, which could help us understand what's going wrong.

This PR also improves upon #574 by fixing a bug with n_repeat, and adding support for multiple return dictionaries (which, here, is useful to save the values at each iteration).
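
For illustration, here is a hypothetical sketch of what returning multiple dictionaries from a monitored function can look like; the function name, the fields, and the wiring below are assumptions, not the actual benchmarks/utils/monitor.py API:

```python
def bench_gap_iterations(dataset_name, n_iter=5):
    """Hypothetical monitored benchmark returning one dict per iteration."""
    results = []
    for gap_iter in range(n_iter):
        # ... one outer GapEncoder iteration would run here ...
        results.append({"dataset": dataset_name, "gap_iter": gap_iter, "score": 0.0})
    # Returning a list of dicts means one recorded row per iteration,
    # instead of a single row for the whole run.
    return results
```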

Please read the docstring of bench_gap_divergence.py for an in-depth explanation of the process, and give us your insight on what I could add before running it. @GaelVaroquaux

Part of the saga #342

@LilianBoulard LilianBoulard requested review from GaelVaroquaux and removed request for GaelVaroquaux June 12, 2023 18:47
@Vincent-Maladiere (Member) left a comment:

Hey @LilianBoulard, here are some quick comments. My main point is that benchmarks should be simpler to read and program.


print(self.W_.shape, self.A_.shape, self.B_.shape)

self.benchmark_results_.append(
Member:

I'd change 2 things here:

  1. Only take the last values for this run instead of accumulating data, meaning this append() should be placed after the break.
  2. It's safer and easier for debugging to write results to a txt/csv file than to accumulate things in RAM: if you cancel the run, the results are not lost (see the sketch below).
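
A minimal sketch of the second suggestion, assuming a hypothetical file path and row layout (this is not the benchmark's actual code):

```python
import csv
from pathlib import Path

RESULTS_FILE = Path("gap_divergence_results.csv")  # hypothetical path

def save_row(row: dict) -> None:
    """Append one result row to the CSV, writing the header on first use."""
    write_header = not RESULTS_FILE.exists()
    with RESULTS_FILE.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)

# e.g. inside the fitting loop:
# save_row({"dataset": "employee_salaries", "gap_iter": i, "score": score})
```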

Member Author:

  1. Okay, I see your point
  2. True, I'll look into that

@LilianBoulard (Member Author), Jun 14, 2023:

I've improved monitor to automatically save intermediary results, thanks for the suggestion :)
For the first point, I've left it before the break for now so that we can see in greater detail what's going on (and to be honest, the matrices aren't that large for employee_salaries, only ~3700 x 10 each); we might reconsider this when working with the other datasets!

Member:

For the last part, it's just that we're going to have mostly duplicated points, so it will be more work for you to clean your data for plotting ;)

benchmarks/bench_gap_divergence.py (resolved)
benchmarks/bench_gap_divergence.py (outdated, resolved)
@Vincent-Maladiere (Member):

Unless I'm missing something, we want to benchmark the GapEncoder quickly, since it's currently the main bottleneck for the TableVectorizer. Improving the benchmark script brings value, but we should aim for simplicity and getting results first, then iterate on the benchmark.

This way, we can parallelize work and improve the GapEncoder default hyper-parameters, for example.

WDYT?

@LilianBoulard added the benchmarks label and removed the enhancement label on Jun 26, 2023
@LilianBoulard (Member Author) commented Jul 17, 2023:

Here's the result of the --plot:
[Benchmark plots omitted: multi-panel figure comparing the datasets across GapEncoder iterations]
Note:

  • max_iter_e_step is the inner loop
  • gap_iter is the outer loop
  • The red dashed line in the center-top plot is the tolerance (parameter tol); a hedged sketch of the corresponding GapEncoder arguments follows below
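
A minimal sketch of how these quantities presumably map onto GapEncoder's constructor arguments; the values below are illustrative, not the grid actually used by the benchmark:

```python
from skrub import GapEncoder  # the encoder being benchmarked in this PR

enc = GapEncoder(
    n_components=10,
    max_iter=5,           # outer loop (what the plots call gap_iter)
    max_iter_e_step=20,   # inner loop: E-step iterations per batch
    tol=1e-4,             # convergence threshold, the red dashed line
    random_state=0,
)
```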

@GaelVaroquaux (Member):

Looking at the plots above, I think we can see that the drug_directory dataset is particularly difficult.

In particular, on the top left plot, the score of the corresponding problem seems to be going up as a function of iterations, although it really shouldn't be.

Maybe that would be an interesting specific case to investigate.
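
If useful, here is a hedged sketch of one way to probe that behaviour: compare GapEncoder's score for increasing max_iter on a text column. The toy strings below merely stand in for a drug_directory-like column, and the numbers are not claimed to reproduce the plot.

```python
import pandas as pd

from skrub import GapEncoder

# Toy strings standing in for a drug_directory-like column (illustrative only).
X = pd.DataFrame({"name": [
    "ibuprofen 200 mg tablet", "ibuprofen 400 mg tablet",
    "acetaminophen 500 mg caplet", "aspirin 81 mg tablet",
]})

for max_iter in (1, 2, 5, 10):
    enc = GapEncoder(n_components=3, max_iter=max_iter, random_state=0)
    enc.fit(X)
    # GapEncoder.score() returns the KL divergence between the n-gram counts
    # of X and their factorization; in principle this divergence should not
    # increase as max_iter grows.
    print(max_iter, enc.score(X))
```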

…_gap_divergence

# Conflicts:
#	benchmarks/bench_tablevectorizer_tuning.py
#	benchmarks/utils/__init__.py
#	benchmarks/utils/_various.py
#	benchmarks/utils/monitor.py
@LeoGrin (Contributor) left a comment:

Nice improvements to the benchmarking suite, thanks! The doc is much clearer, and the hot-load is cool

_monitored[key].append(value)
# To avoid repeating code, we move the result
# mapping(s) to a list.
result_mappings = []
Contributor:

Am I right that this part is a bit complicated in order to support the monitored function returning a list of dicts? Is this a possible situation?

@LilianBoulard (Member Author), Aug 5, 2023:

The code is a bit complex, I agree, but I haven't found a better way of doing it. And yes, we do want to support the monitored function returning a list of dicts; the "why" is explained in the docstring :)
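
For readers of the thread, a rough, simplified sketch of the idea (not the exact monitor.py code): a single returned dict and a returned list of dicts are handled the same way by first moving everything into one list before the values are recorded.

```python
def normalize_results(returned):
    """Wrap a single dict in a list so both return styles are handled alike."""
    if isinstance(returned, dict):
        return [returned]
    return list(returned)

# normalize_results({"score": 1.0})                    -> [{"score": 1.0}]
# normalize_results([{"score": 1.0}, {"score": 0.9}])  -> kept as two rows
```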

Contributor:

Oh yeah sorry, we're actually using this in this benchmark 🤦‍♂️

benchmarks/utils/monitor.py (outdated, resolved)
Co-authored-by: LeoGrin <45738728+LeoGrin@users.noreply.github.com>
@jovan-stojanovic (Member) left a comment:

Thanks!

@jovan-stojanovic merged commit 5e22f20 into skrub-data:main on Aug 7, 2023
24 checks passed
LeoGrin added a commit to LeoGrin/skrub that referenced this pull request Aug 24, 2023
* Improve framework

* Add Gap divergence benchmark

* Set initial iter values

* Add omitted value (score)

* Force keyword arguments and add progress saving

* Minor fixes

* Update to main

* Add pyarrow to benchmark requirements

* Implement cross-validation

* Update README

* Parallelize cross-validation

* Fix attribute access

* Fix attribute access

* Fix unpacking

* Fix results naming

* Fix results bug

* Multiple columns support and W_change plot v1

* Refactor dataset getters

* Adapt getters usage to new format

* Small fixes

* New plots

* Fix dataset categorization

* Update used datasets

* Add score per inner iteration plot

* Add benchmark results

* Compute the score after tuning

* Add issue link for `road_safety`

* Update results

* Update benchmarks/utils/monitor.py

Co-authored-by: LeoGrin <45738728+LeoGrin@users.noreply.github.com>

---------

Co-authored-by: LeoGrin <45738728+LeoGrin@users.noreply.github.com>
Labels: benchmarks, no changelog needed