quantile estimation v1 #2223
Conversation
looking good! some dataclass config feedback, but otherwise it looks close.
```python
@property
def variance_init(self) -> float:
    multiplier = scipy.stats.norm.ppf(0.975, loc=0, scale=1)
```
Doesn't 0.975 need to be computed from some `alpha` from the configuration? Maybe `QuantileStatistic` also needs to take `alpha` as an input, use it here, and use it when computing the `n` for the issue of requesting a quantile < 0 or > 1 (the other comment above)?
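A sketch of the change being suggested here, computing the critical value from a configured `alpha` instead of the hard-coded 0.975 (the function name is hypothetical; with `alpha = 0.05` this reproduces the original `ppf(0.975)` call):

```python
import scipy.stats


def variance_multiplier(alpha: float) -> float:
    """Two-sided normal critical value for a given significance level.

    alpha = 0.05 gives ppf(0.975) ~= 1.96, matching the hard-coded value.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    return scipy.stats.norm.ppf(1 - alpha / 2, loc=0, scale=1)
```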
Done.
Nit: Is there a reason you picked `ccr` over `alpha` for the name here? I feel like I'd like to start moving towards using `alpha` everywhere across engines, but lmk if you disagree.
Updated to `alpha`.
A few nits, a few slightly bigger questions. I think once there are a few unit tests in place we can land this in anticipation of merging it with the SQL PR.
```python
n: int        # number of events here
nu: float
ccr: float
q_hat: float  # sample estimate
```
Nit: Can we call these `quantile_*` instead? I'd rather be a touch more verbose, but up to you.
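Combining this rename with the `ccr` → `alpha` suggestion above, the fields might look like the following (a sketch only; the class name and exact field set are taken from the diff fragment, and the comments are assumptions):

```python
from dataclasses import dataclass


@dataclass
class QuantileStatistic:
    # Field names follow the review suggestions; sketch only.
    n: int                # number of events
    nu: float             # target quantile, e.g. 0.5 for the median
    alpha: float          # significance level (renamed from ccr)
    quantile_hat: float   # sample estimate (renamed from q_hat)
```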
done
I don't think you made this change?
Apologies, forgot to save my file; it will be in the latest push.
…ile, and renaming q_* to quantile_*
```python
mu_1 = mu_0 + self.prior_effect.mean
v = 0.5 * self.prior_effect.variance
self.prior_a = GaussianPrior(mean=mu_0, variance=v, pseudo_n=1)
self.prior_b = GaussianPrior(mean=mu_1, variance=v, pseudo_n=1)
```
Should the `pseudo_n` from the `self.prior_effect` `GaussianPrior` be passed through here?
I only ask because it seems like there isn't a way to pass through a flat, improper prior to this test class, right? Should we set up a way to do that?
You can kind of hack it into the existing `GaussianPrior`: if `prior_effect.pseudo_n == 0`, then we just pass this through like I suggest. Or we can be more explicit, where you can also specify no prior and that modifies this method as well.
Good idea, I implemented this. When we use non-flat priors in the future, we still want to use `pseudo_n = 1`.
…ior_effect in GaussianEffectABTest
```python
self.assertEqual(result.chance_to_win, 0.5)
self.assertEqual(result.expected, 0)

def test_inexact_log_approximation(self):
```
Why do we need to replicate this test here? Maybe copy paste?
Good catch, I removed this.
packages/stats/gbstats/gbstats.py
Outdated
```diff
@@ -211,20 +216,40 @@ def get_configured_test(
    ),
)
else:
    assert isinstance(
        stat_a, type(stat_b)
    ), "stat_a and stat_b must be of same type."
```
Sorry to nit, I should have pointed this out initially, but this will pass as long as stat_a is the same class as stat_b or a subclass of it. E.g., if stat_a is a `SampleMeanStatistic` and stat_b is a `Statistic`, this will pass.
So I think we should instead just do `assert type(stat_a) is type(stat_b)`.
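A minimal illustration of the difference between the two checks, using toy classes in place of the real statistic types:

```python
class Statistic:
    pass


class SampleMeanStatistic(Statistic):
    pass


stat_a = SampleMeanStatistic()
stat_b = Statistic()

# isinstance passes because SampleMeanStatistic subclasses Statistic...
assert isinstance(stat_a, type(stat_b))
# ...but an exact type comparison catches the mismatch.
assert type(stat_a) is not type(stat_b)
```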
accepted
…corporating n* approximation in stats engine
Looks good to me! There were two further nits that I just encountered when reading your PR again (sorry for missing these earlier), but you can accommodate them by first merging my PR into your branch: #2288
…s to clustered statistic (#2288)
This code has 3 primary classes:
- Quantile Statistic (iid): creates the effect estimate and CI for quantile nu when data are iid (e.g., event-level data).
- Quantile Statistic (clustered): creates the effect estimate and CI for quantile nu when data are clustered (e.g., user-level data).
- GaussianBayesianABTest: creates the posterior distribution using a prior on the treatment effect and the frequentist effect estimate.
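For the iid case, one standard way to get a nonparametric CI for quantile nu is the normal approximation to the binomial order-statistic index; a sketch follows (this is background on the technique, not necessarily the PR's implementation, and `quantile_ci` is a hypothetical name):

```python
import numpy as np
import scipy.stats


def quantile_ci(x, nu: float, alpha: float = 0.05):
    """Nonparametric CI for the nu-th quantile of iid data.

    Uses the normal approximation to the order-statistic index:
    the count of observations below the true quantile is
    Binomial(n, nu), so index bounds are n*nu +/- z*sqrt(n*nu*(1-nu)).
    """
    x = np.sort(np.asarray(x))
    n = len(x)
    z = scipy.stats.norm.ppf(1 - alpha / 2)
    half_width = z * np.sqrt(n * nu * (1 - nu))
    lo = int(np.clip(np.floor(n * nu - half_width), 0, n - 1))
    hi = int(np.clip(np.ceil(n * nu + half_width), 0, n - 1))
    return x[lo], np.quantile(x, nu), x[hi]
```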