
Add golden master with artificial data #101

Merged
merged 26 commits into master from mschmidt_more_testing on May 12, 2020

Conversation

MarcAntoineSchmidtQC
Member

To add new "simple" golden master tests, you need to add an entry to the gm_model_parameters dictionary. Then run the file directly with Python (python tests/sklearn_fork/test_golden_master.py). This creates the golden master data for the tests, stored in the golden_master folder.

Whenever pytest is called, it will compare the new result to the one stored.
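
As a rough illustration, an added entry might look like the sketch below; the key name and parameter values are made up, and the exact set of accepted keys depends on how test_golden_master.py builds its models.

```python
# tests/sklearn_fork/test_golden_master.py (illustrative sketch only)
# Hypothetical new entry; the real dictionary's schema may differ.
gm_model_parameters["custom-poisson-lasso"] = {
    "family": "poisson",
    "link": "log",
    "alpha": 0.1,      # regularization strength
    "l1_ratio": 1.0,   # pure L1 penalty
}
```

After adding the entry, rerunning the script regenerates the stored golden master files, and subsequent pytest runs are compared against them.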

@ElizabethSantorellaQC
Contributor

Could the golden master tests be based off our existing benchmark suite?

@ElizabethSantorellaQC
Contributor

Looks good to me!

@MarcAntoineSchmidtQC
Member Author

@ElizabethSantorellaQC, sure! Do you want the public data that the benchmarks are using, the wrapper around the results, or both?

@ElizabethSantorellaQC
Contributor

@MarcAntoineSchmidtQC I'm not sure. Including the benchmarks might not be a good idea at all, since we might want to change them and since they take a while to run. Just a thought.

@tbenthompson
Collaborator

From a development-efficiency standpoint, I really, really like having the golden master testing integrated into the benchmarking suite. That way, while I make optimizations or refactorings, I can run one command to profile/benchmark, and that same command will also make sure I didn't introduce a bug.

@MarcAntoineSchmidtQC changed the title from "[WIP] Add golden master with artificial data" to "Add golden master with artificial data" on May 6, 2020
random_state=random_seed,
start_params="zero",
Contributor

Why not the default of "guess"? That gives better performance in my experience.

Member Author

I got some of the problems diverging when I used "guess" but easily converging when starting at zero. I feel like it's not a robust feature. Maybe we can add **kwargs to the benchmark function so that you can specify this type of thing without changing the default like I did. Do you like that idea?
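
For context, a minimal sketch of the two initializations being compared here, assuming the fork's GeneralizedLinearRegressor and its start_params argument; the import path and the family used below are illustrative, not taken from this PR.

```python
# Illustrative only; the actual import path in sklearn_fork may differ.
from sklearn_fork import GeneralizedLinearRegressor

# Default: derive the starting coefficients from a cheap preliminary guess.
model_guess = GeneralizedLinearRegressor(family="gamma", start_params="guess")

# Alternative used in this PR: start the solver from an all-zero coefficient
# vector, which converged more reliably on some of these problems.
model_zero = GeneralizedLinearRegressor(family="gamma", start_params="zero")
```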

Contributor

How recently did you try this? I improved the _guess_start_params function in one of the more recent commits to the offset branch. I'd be curious to hear for which problems guess_start_params is not producing good estimates (for #111 and maybe #112).

Member Author

I've dropped this (now using "guess"), but the main issue is with gamma. See issue #113.

regularization_strength=0.1,
)

with open(git_root(f"golden_master/benchmarks_gm/{Pn}.pkl"), "wb") as fh:
Contributor

Maybe we could just store the intercept and coefficients, since that's all we're testing anyway? Then we could store them in a more human-readable format, which I have a mild preference for.

Member Author

That's a really good idea! JSON here we come.

(I started by also testing n_iter, but it is platform-dependent for some problems, so I dropped it.)
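
A sketch of what the JSON version could look like, assuming the fitted model exposes sklearn-style intercept_ and coef_ attributes, with git_root, Pn, and model coming from the surrounding benchmark code; the field names and layout are illustrative, not necessarily what the PR ended up with.

```python
import json

# Store only what the golden master test actually compares: the fitted
# intercept and coefficients. n_iter is deliberately excluded because it
# turned out to be platform-dependent for some problems.
golden = {
    "intercept": float(model.intercept_),
    "coef": [float(c) for c in model.coef_],
}

with open(git_root(f"golden_master/benchmarks_gm/{Pn}.json"), "w") as fh:
    json.dump(golden, fh, indent=2)
```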

Contributor

Woah, any guess why n_iter would be platform-dependent?

@MarcAntoineSchmidtQC marked this pull request as ready for review on May 12, 2020 02:37
@@ -94,6 +94,7 @@ def execute_problem_library(
    single_precision: bool = False,
    print_diagnostics: bool = True,
    regularization_strength: float = None,
    model_kwargs: dict = {},
Contributor

what is model_kwargs doing here? And should line 116 not be **model_kwargs?

Member Author

It lets you run a benchmark with arbitrary model arguments without changing the underlying bench_sklearn_fork() function. For instance, something I did was to set model_kwargs = {'start_params': 'zero', 'solver': 'cd'}.

The original reason I kept it a dictionary and did not use **kwargs is to allow us to have different kwargs in the future (e.g. a fit_kwargs). This is similar to https://seaborn.pydata.org/generated/seaborn.lmplot.html#seaborn.lmplot, where you can set kwargs for the lineplot and the scatterplot separately. But now I think this is not a good idea. Let me switch it to **kwargs.
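
A minimal, self-contained sketch of that switch; the function names and bodies are hypothetical, and only the kwargs plumbing is the point.

```python
# Before: an explicit dict argument. Besides the extra indirection, a mutable
# default like `model_kwargs: dict = {}` is shared across calls, which is
# another reason to avoid it.
def bench_with_dict(problem_name, model_kwargs=None):
    model_kwargs = model_kwargs or {}
    return {"problem": problem_name, **model_kwargs}

# After: arbitrary model arguments are captured with ** and forwarded as-is.
def bench_with_kwargs(problem_name, **model_kwargs):
    return {"problem": problem_name, **model_kwargs}

# Equivalent calls:
bench_with_dict("example", model_kwargs={"start_params": "zero", "solver": "cd"})
bench_with_kwargs("example", start_params="zero", solver="cd")
```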

Contributor

@ElizabethSantorellaQC left a comment

looks great

@MarcAntoineSchmidtQC merged commit f64e196 into master on May 12, 2020
@lbittarello deleted the mschmidt_more_testing branch on May 1, 2021 11:38
tbenthompson pushed a commit that referenced this pull request Oct 8, 2021
* adding a bunch of parametrization

* Moved Link and ExponentialDispersionModel classes to their own files

* refactoring

* removed util file

* added sparse test and OHE features

* removed old file

* overwrite golden master

* added gaussian

* getting ready to merge master

* golden master for benchmarks

* regularization strenght overwriting parameter

* changed test tolerance

* changed abs tolerance

* removed changes to distribution.py

* added offset

* switch to json and cleanup

* automating skipped problems

* switch from dict to **

* forgot to uncomment something

* removed regularization_strenght param because we can use **kwargs

* back to regularization_strenght. It was a bad idea.

Co-authored-by: Marc-Antoine Schmidt <mschmidt@wayfair.com>
Co-authored-by: Elizabeth Santorella <elizabeth.santorella@gmail.com>