Group Sequential - Percentile Issue #176

Closed
louisryan opened this issue Dec 20, 2017 · 10 comments

Comments

@louisryan

Hi there,

I have upgraded from Expan 0.6.2 -> 0.6.5 and upon re-running the group sequential method, the percentiles are showing incorrect values:

[Screenshot: 2017-12-20 at 16:58:01]

Upon downgrading, the issue has been resolved.

@shansfolder
Contributor

Hi @louisryan, by "incorrect" do you mean that the percentiles become 0 and 100? This can happen when the current data size (2065 in your example) is much smaller than the estimated sample size you provided.

@gbordyugov
Contributor

@shansfolder I'm wondering why this problem appeared only after the version upgrade.

@shansfolder
Contributor

@louisryan this might actually be a bug fix rather than a problem :)
Could you tell us what estimated sample size you are using, so that I can confirm it for you?

@louisryan
Author

@shansfolder, in my case the estimated sample size is 50000. After lowering it to 10000, the percentiles do change to 2.5/97.5. So is this the expected behaviour? And is this the fix you put in place previously?

@shansfolder
Contributor

Hi @louisryan, yes, this is a fix.
There was a bug fix on this line in 0.6.2, and it changed to this in 0.6.5.

@shansfolder
Contributor

@louisryan let me try to explain intuitively: when the current sample size is small (compared to the estimated sample size), we make the confidence interval very large in order to be conservative. A very large interval will almost certainly cover 0, so it is almost impossible to conclude significance while the sample size is still small.
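
To make the intuition concrete, here is a minimal sketch of how the interval percentiles could depend on the ratio of current to estimated sample size. It assumes an O'Brien-Fleming-type alpha-spending function, which is a common choice for group sequential designs; this thread does not state which spending function ExpAn actually uses. The function names `spent_alpha` and `interval_percentiles` are hypothetical and are not part of the ExpAn API.

```python
import math
from statistics import NormalDist


def spent_alpha(information_fraction, alpha=0.05):
    """Two-sided alpha spent so far under an O'Brien-Fleming-type
    spending function (an assumption, not confirmed by this thread)."""
    if information_fraction <= 0:
        raise ValueError("information_fraction must be positive")
    # Cap at 1: once the estimated sample size is reached, use the full alpha.
    t = min(1.0, information_fraction)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # 2 * (1 - Phi(x)) equals erfc(x / sqrt(2)); erfc stays accurate far in the tail.
    return math.erfc(z / math.sqrt(t) / math.sqrt(2))


def interval_percentiles(current_size, estimated_size, alpha=0.05):
    """Lower and upper percentiles of the confidence interval (hypothetical helper)."""
    a = spent_alpha(current_size / estimated_size, alpha)
    return 100 * a / 2, 100 * (1 - a / 2)


# Sizes taken from this thread: 2065 observations against an estimate of 50000.
low, high = interval_percentiles(2065, 50000)
print(f"{low:.2f} / {high:.2f}")

# Once the current size reaches the estimate, the usual bounds come back.
low, high = interval_percentiles(50000, 50000)
print(f"{low:.2f} / {high:.2f}")
```

Under this assumption, a small information fraction drives the spent alpha so close to zero that the percentiles round to 0 and 100, while a fraction of 1 gives the familiar 2.5 and 97.5.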

@louisryan
Author

louisryan commented Dec 22, 2017

@shansfolder that makes sense, but don't you think the confidence interval should be displayed regardless, even with stop set to False?

I say this based on experience running experiments with one core success metric plus other metrics of interest that we want to monitor, which might only achieve a fraction of the sample size. The fixed horizon approach would yield confidence intervals for those metrics, though they would not be statistically significant (lower bound not crossing the zero line). With the fix in place, we lose the confidence interval representation.

Would love to know your thoughts

@shansfolder
Contributor

Hi @louisryan, my understanding is that if the group sequential method gives you stop equal to False, the results are not statistically valid (because only a fraction of the sample size has been reached). Therefore, we shouldn't use the confidence interval values in this case anyway.

But I admit that our result format is confusing. Let me know if you have any suggestions. :)

@louisryan
Author

Good point! :)

@louisryan
Author

Thanks for the clarification

3 participants