-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add White's test (heteroskedasticity) #206
Conversation
Thanks this looks great! It might be nice to explicitly define BreuschPaganTest(X, e) = WhiteTest(X, e, :Linear) |
@azev77 Thanks for the feedback. One way of doing that is to (1) add the function as you suggest, export it Sounds OK? |
Sounds good |
added Breusch-Pagan test and more precise |
Is |
@azev77 Thanks. I just tried to follow what appears to be the default in this package. Another thing: it would be nice to make this package and |
src/white.jl
Outdated
|
||
if TestType == :Linear | ||
z = X | ||
TestDescription = string("Breusch-Pagan's test for heteroskedasticity (linear)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to call string
function here, string is any double-quoted text. Same below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. These are left-overs from another attempt. The fix will be in the next push.
@PaulSoderlind |
@PaulSoderlind Also, in theory these are just nested hypothesis tests, so in principal Wald/LR/LM can be used. |
@azev77 Yes, there are lots of versions of these tests. For instance, what I called the BP test is really the BP/Koenker (1981) test. The code above can be used to perform some of the other versions applied in other packages: just specify
Maybe we should add this information to the documentation of the function? As you point out, these are restricted versions, sometimes applied save dfs/increase power. I believe Stata has some interesting routines for this, and even through on a Bonferroni approach for multiple testing. I have Julia code for Bonferroni/Holm so that could be done here - at some point in time. |
You forgot to add the test file to |
@wildart I added white.jl to The test passes. I did notice, however, that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Disregard nightly builds fails, especially if it doesn't related to your PR. |
It seems the feedback so far is positive. Anything else I can do to make this move forward? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm late to the party, with small design comments.
src/white.jl
Outdated
end | ||
|
||
""" | ||
WhiteTest(X, e, TestType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestType
should use lowercase. Also, maybe better make it a keyword argument
WhiteTest(X, e, TestType) | |
WhiteTest(X, e; type=:White) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lowercase is fine
keyword seems trickier: it seems to interfere with the dispatch, probably because the struct which is also called WhiteTest
(as per the convention in this package). I'll send more information on this in a later email.
src/white.jl
Outdated
Compute White's (or Breusch-Pagan's) test for heteroskedasticity. | ||
|
||
`X` is a matrix of regressors and `e` is the vector of residuals from the original model. | ||
`TestType` is either `:Linear` for the Breusch-Pagan/Koenker test, `:LinearAndSquares` for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use lowercase :linear
and linearandsquares
(or linear_and_squares
), for consistency with e.g. pvalue
's tail
argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
src/white.jl
Outdated
constant. In some applications, `X` is a subset of the regressors in the original model, | ||
or just the fitted values. This saves degrees of freedom and may give a more powerful test. | ||
The test statistic is T*R2 where R2 is from the regression of `e^2` on the terms | ||
mentioned above. Under the null hypothesis it is distributed as `Chisquare(dof)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use the actual name of the type in Distributions.jl:
mentioned above. Under the null hypothesis it is distributed as `Chisquare(dof)` | |
mentioned above. Under the null hypothesis it is distributed as `Chisq(dof)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
src/white.jl
Outdated
n == length(e) || throw(DimensionMismatch("inputs must have the same length")) | ||
|
||
if TestType == :Linear | ||
z = X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
z = X | |
z = X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
src/white.jl
Outdated
for i = 1:K, j = i:K | ||
z[:,vv] = X[:,i].*X[:,j] #eg. x1*x1, x1*x2, x2*x2 | ||
vv = vv + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better avoid unnecessary copies:
for i = 1:K, j = i:K | |
z[:,vv] = X[:,i].*X[:,j] #eg. x1*x1, x1*x2, x2*x2 | |
vv = vv + 1 | |
@views for i = 1:K, j = i:K | |
z[:, vv] .= X[:, i] .* X[:, j] #eg. x1*x1, x1*x2, x2*x2 | |
vv += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, except that I dislike the X[:, j]
thing. I would prefer no space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only things that matters is consistency -- but I don't know which convention is used in this package, if any.
test/white.jl
Outdated
bp_test = BreuschPaganTest(X, e,) | ||
@test pvalue(bp_test) ≈ 0.1287 atol=atol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bp_test = BreuschPaganTest(X, e,) | |
@test pvalue(bp_test) ≈ 0.1287 atol=atol | |
bp_test = BreuschPaganTest(X, e,) | |
@test pvalue(bp_test) ≈ 0.1287 atol=atol |
Sorry, I don't understand the problem. You need to write |
src/white.jl
Outdated
dof::Int #degrees of freedom in test | ||
LM::Float64 #test statistic, distributed as Chisq(dof) | ||
TestDescription::String #short description of the test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dof::Int #degrees of freedom in test | |
LM::Float64 #test statistic, distributed as Chisq(dof) | |
TestDescription::String #short description of the test | |
dof::Int # degrees of freedom in test | |
lm::Float64 # test statistic, distributed as Chisq(dof) | |
descr::String # short description of the test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and what does LM
stand for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought LM = Lagrange Multiplier test statistic?
(the only other context is LM = Linear Model)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, Lagrange Multiplier test statistic. Perhaps we could use LMstat
or lmstat
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked around in the lit: and it's LM everywhere. I'll stick to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though better use lowercase lm
, in Julia uppercase is used for types.
@nalimilan I have made the changes you requested and make an update of the PR. And, yes, you are right about the keyword argument. |
@nalimilan and @azev77 I have some 2nd thoughts about one of latest design changes: To test the
This clearly allocates memory, but just one column at a time. |
I have now implemented (almost all of) the suggestions by @nalimilan limilan. In addition, I have added checks on @wildart good to go? |
src/white.jl
Outdated
end | ||
|
||
""" | ||
WhiteTest(X, e; testtype = :White) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to repeat "test"?
WhiteTest(X, e; testtype = :White) | |
WhiteTest(X, e; type = :White) |
|
||
Compute White's (or Breusch-Pagan's) test for heteroskedasticity. | ||
|
||
`X` is a matrix of regressors and `e` is the vector of residuals from the original model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you say that observations are in rows and/or that variables are in columns? There are different conventions and it can be confusing.
src/white.jl
Outdated
* [White's test on Wikipedia](https://en.wikipedia.org/wiki/White_test) | ||
""" | ||
function WhiteTest(X::AbstractMatrix{<:Real}, e::AbstractVector{<:Real}; testtype = :White) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/white.jl
Outdated
intercept_cols = mapslices(col->all(diff(col).==0) && (first(col) != 0),X,dims=1) | ||
any(intercept_cols) || throw(ArgumentError("one of the colums of X must be a non-zero constant")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will avoid lots of allocations (I hope I got it right but you see the idea):
intercept_cols = mapslices(col->all(diff(col).==0) && (first(col) != 0),X,dims=1) | |
any(intercept_cols) || throw(ArgumentError("one of the colums of X must be a non-zero constant")) | |
intercept_col = any(eachcol(X)) do col | |
first(col) != 0 && all(==(first(col)), col) | |
end | |
intercept_col || throw(ArgumentError("one of the colums of X must be a non-zero constant")) |
src/white.jl
Outdated
Compute Breusch-Pagan's test for heteroskedasticity. | ||
|
||
`X` is a matrix of regressors from the original model and `e` the vector of residuals. | ||
See `WhiteTest` for further details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See `WhiteTest` for further details. | |
This is equivalent to `WhiteTest(X, e, testtype=:linear)`. | |
See [`WhiteTest`](@ref) for further details. |
test/white.jl
Outdated
|
||
@test_throws ArgumentError WhiteTest(rand(4,2), rand(4)) | ||
|
||
bp_test = BreuschPaganTest(X, e,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bp_test = BreuschPaganTest(X, e,) | |
bp_test = BreuschPaganTest(X, e) |
test/white.jl
Outdated
|
||
bp_test = BreuschPaganTest(X, e,) | ||
@test pvalue(bp_test) ≈ 0.1287 atol=atol | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for show
?
src/white.jl
Outdated
dof::Int #degrees of freedom in test | ||
LM::Float64 #test statistic, distributed as Chisq(dof) | ||
TestDescription::String #short description of the test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though better use lowercase lm
, in Julia uppercase is used for types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. I just have a few suggestions.
@nalimilan Thanks for the suggestions. I believe I implemented all of them. Yes, your However, I now notice travis test failures on Julia 1.0 (passes the 1.3 test). Because of |
Ah, yes, you'll need to add a dependency on Compat 2.2.0. |
test/white.jl
Outdated
@test pvalue(w_test) ≈ 0.2774 atol=atol | ||
|
||
w_test = WhiteTest(X, e) | ||
@test pvalue(w_test) ≈ 0.3458 atol=atol | ||
|
||
show(IOBuffer(), w_test) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I should have been more specific: this should also check the contents of the IOBuffer
using e.g. String(take!(buf)) == """..."""
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nalimilan Regarding the eachcol
and Compat. I don't know whether the maintainers like adding another dep. Perhaps this old-fashioned approach could do the trick until support for < Julia 1.1 is dropped?
intercept_col = false
for i = 1:K
col = view(X,:,i)
intercept_col = first(col) != 0 && all(==(first(col)), col)
intercept_col && break
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nalimilan Regarding testing the IOBuffer, I could do something like
buffer = IOBuffer()
show(buffer, w_test)
str = String(take!(buffer))
@test str == refstr
where refstr
is a predfined string with the expected output.
But, I see a problem: the expected output contains point estimate: 5.612418635833716
. That's a lot of digits, and strictly requiring that they all match is somewhat in conflict with the other tests that use isapprox
. Is there a way around this (other than always rounding the printed results?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use $value
to interpolate the exact value in the string.
Regarding Compat, that's always a tradeoff, but I'd say it's a small and common enough dependency that it's OK. Adding version checks everywhere is precisely what Compat allows to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest push includes
- a test for show. (I did't find anything similar in this package, so I am not entirely sure what the expectation is.)
- a loop for testing if
X
has an intercept term. Not so pretty, but compatible with Julia 1.0.5 and actually pretty fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, just a few details and it should be ready.
test/white.jl
Outdated
@test w_pval ≈ 0.3458 atol=atol | ||
|
||
refstr = """ | ||
White's (or Breusch-Pagan's) test for heteroskedasticity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to add a four-space indentation here without breaking the test.
src/white.jl
Outdated
@@ -64,7 +64,13 @@ function WhiteTest(X::AbstractMatrix{<:Real}, e::AbstractVector{<:Real}; type = | |||
|
|||
K >= 2 || throw(ArgumentError("X must have >= 2 columns")) | |||
|
|||
intercept_col = any(col -> first(col) != 0 && all(==(first(col)), col),eachcol(X)) | |||
# intercept_col = any(col -> first(col) != 0 && all(==(first(col)), col),eachcol(X)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# intercept_col = any(col -> first(col) != 0 && all(==(first(col)), col),eachcol(X)) |
src/white.jl
Outdated
@@ -102,11 +108,11 @@ BreuschPaganTest(X, e) = WhiteTest(X, e, type = :linear) | |||
|
|||
testname(t::WhiteTest) = "White's (or Breusch-Pagan's) test for heteroskedasticity" | |||
|
|||
population_param_of_interest(t::WhiteTest) = ("T*R2", 0, t.lm) | |||
population_param_of_interest(t::WhiteTest) = ("T*R2", 0, round(t.lm,digits=4)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we don't round value in other tests, so better don't round here either. It would probably make sense to apply some rounding when printing, but that should be applied to all tests at the same time, in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I have skipped the rounding, except for testing pvalue()
which has a built-in round(z,digits=4)
. I also fixed the indentation of that """...""" string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This PR adds White's (and Breusch-Pagan's) test for heteroskedasticity in a regression. It is one of the most commonly applied diagnostic tests (along with various tests for autocorrelation which already are in the package). The test gives the same result as
bptest
in R.Closes issue #199