
Incorrect p-values for nonparametric statistical tests of Generalized Pareto Distribution #305

Closed
Datseris opened this issue Jul 24, 2023 · 2 comments


@Datseris

Here is an MWE:

using Distributions, HypothesisTests
using CairoMakie  # any Makie backend works; needed for hist, lines!, axislegend below

sigma = 1 / 2.0
xi = -0.1

gpd = GeneralizedPareto(0.0, sigma, xi)

# sample directly from the distribution we test against
X = rand(gpd, 10000)

TestType = OneSampleADTest
test = TestType(X, gpd)

p = pvalue(test)

# histogram of the sample with the analytic pdf overlaid
fig, ax = hist(X; bins = 50, normalization = :pdf, label = "pvalue = $(round(p; digits=3))")
xrange = range(0, maximum(X); length = 100)
lines!(xrange, pdf.(gpd, xrange); color = :black, label = "analytic")
axislegend(ax)
fig

[Figure: histogram of X (50 bins, pdf-normalized) with the analytic GPD pdf overlaid; the legend shows the computed p-value]

Irrespective of the parameters sigma and xi and of the RNG realization, the result is always a very high p-value. I expected the correct result to be a very low p-value, because the data should be identified with very high confidence as coming from the prescribed distribution; in the MWE the data are literally sampled from that distribution.

I've tried OneSampleADTest, ApproximateOneSampleKSTest, and ExactOneSampleKSTest as hypothesis tests. They all "fail" in the sense of not giving low enough p-values.
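For reference, a compact way to reproduce the observation with all three test types in one loop (a sketch using the same data and distribution as in the MWE above):

using Distributions, HypothesisTests

gpd = GeneralizedPareto(0.0, 0.5, -0.1)
X = rand(gpd, 10_000)

# each of these reports a high p-value, i.e. no evidence against gpd
for T in (OneSampleADTest, ApproximateOneSampleKSTest, ExactOneSampleKSTest)
    println(T, ": p = ", pvalue(T(X, gpd)))
end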

@Datseris

Crosslinking https://discourse.julialang.org/t/testing-whether-data-come-from-a-generalized-pareto-distribution/102008, which shows that the tests also fail for a hand-coded Cramér-von Mises test, so this may not be an issue with HypothesisTests.jl...
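For context, a minimal hand-coded Cramér-von Mises check along those lines (this is a sketch, not the Discourse code; the function names are made up, and the p-value is approximated by Monte Carlo resampling from the reference distribution):

using Distributions, Random

# W^2 = 1/(12n) + sum_i ((2i - 1)/(2n) - F(x_(i)))^2, with F the reference cdf
function cvm_statistic(x, d)
    n = length(x)
    u = sort(cdf.(d, x))  # probability-integral transform, sorted
    return 1 / (12n) + sum(((2i - 1) / (2n) - u[i])^2 for i in 1:n)
end

# Monte Carlo p-value: resample from d and count how often the statistic is exceeded
function cvm_pvalue(x, d; nsim = 1000, rng = Random.default_rng())
    w2 = cvm_statistic(x, d)
    exceed = count(_ -> cvm_statistic(rand(rng, d, length(x)), d) >= w2, 1:nsim)
    return (exceed + 1) / (nsim + 1)
end

gpd = GeneralizedPareto(0.0, 0.5, -0.1)
cvm_pvalue(rand(gpd, 10_000), gpd)  # large, since the data really come from gpd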

@Datseris

There is nothing wrong (see the linked Discourse thread); I had simply misunderstood the test.
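To spell it out for future readers: the null hypothesis of these tests is that the data do come from the prescribed distribution, so a high p-value (no evidence against the null) is exactly the expected outcome here; low p-values appear when the data come from something else. A minimal sketch of the contrast:

using Distributions, HypothesisTests

gpd = GeneralizedPareto(0.0, 0.5, -0.1)

# data actually drawn from gpd: high p-value, the null is not rejected
pvalue(OneSampleADTest(rand(gpd, 10_000), gpd))

# data drawn from a different distribution: very low p-value, the null is rejected
pvalue(OneSampleADTest(rand(Exponential(1.0), 10_000), gpd))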
