I'm interested in using this library, so I tested whether the false positive rate is controlled at the alpha I set. However, the results show that the proportion of tests rejected far exceeds alpha. For 100 trials, where each trial compares two Bernoulli proportions (both with a 0.01 conversion rate) with alpha=0.05, beta=0.10, and the total number of visitors set to 100K, I get the following percentages of rejections and acceptances:
% rejected: 43
% accepted: 56.99999999999999
Here's the link to my code: https://gist.github.com/sjoelee/d4fed8b80e1af1d2e0cf7aac37d09a90. Once each visitor arrives and has been bucketed into treatment or control, I simulate a biased coin flip (based on the variation's conversion rate) for their conversion. I add each individual data point through `addData` and then check whether the test has finished, using the value returned from `addData` to determine the outcome: true (accept null) or false (reject null). Could you provide more documentation on how the thresholds are calculated? And have any tests been done to confirm that the false positive rate in A/A tests stays under alpha? Thanks!
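For reference, the simulation loop described above can be sketched roughly as follows. This is a minimal Python sketch, not the gist itself and not the library's API: `FixedHorizonZTest` is a hypothetical stand-in (a plain two-proportion z-test evaluated once at a fixed horizon), and the `add_data` method with its `None`/`True`/`False` return convention is an assumption modeled on how I use `addData`. The stand-in only serves to show the harness; the library's own test object would be plugged in through `make_test`. Smaller numbers than in my run are used to keep it fast.

```python
import math
import random


class FixedHorizonZTest:
    """Hypothetical stand-in test, NOT the library's algorithm.

    Collects observations and, once `horizon` visitors have been seen,
    runs a single two-sided two-proportion z-test.
    """

    def __init__(self, alpha, horizon):
        self.alpha = alpha
        self.horizon = horizon
        self.n = [0, 0]          # visitors per arm
        self.successes = [0, 0]  # conversions per arm

    def add_data(self, arm, converted):
        """Return None while running, then True (accept null) or False (reject null)."""
        self.n[arm] += 1
        self.successes[arm] += int(converted)
        if sum(self.n) < self.horizon:
            return None
        return self._accept_null()

    def _accept_null(self):
        n0, n1 = self.n
        if n0 == 0 or n1 == 0:
            return True  # nothing to compare
        pooled = sum(self.successes) / (n0 + n1)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n0 + 1 / n1))
        if se == 0:
            return True  # no variation observed in either arm
        z = (self.successes[0] / n0 - self.successes[1] / n1) / se
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        return p_value >= self.alpha


def run_aa_trials(make_test, n_trials, n_visitors, rate, seed=0):
    """Run A/A trials (same conversion rate in both arms); return the fraction rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_trials):
        test = make_test()
        result = None
        for _ in range(n_visitors):
            arm = rng.randrange(2)             # bucket visitor: 0 = control, 1 = treatment
            converted = rng.random() < rate    # biased coin flip for conversion
            result = test.add_data(arm, converted)
            if result is not None:             # test has finished
                break
        if result is False:
            rejected += 1
    return rejected / n_trials


if __name__ == "__main__":
    fpr = run_aa_trials(lambda: FixedHorizonZTest(0.05, 2000),
                        n_trials=200, n_visitors=2000, rate=0.1)
    print(f"% rejected: {100 * fpr:.1f}")
```

Since both arms share the same conversion rate, every rejection here is a false positive, so the printed percentage is the quantity I would expect to stay near alpha.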