Skip to content

Commit

Permalink
Merge 2417da6 into 71521b4
Browse files Browse the repository at this point in the history
  • Loading branch information
PaulSoderlind committed May 13, 2020
2 parents 71521b4 + 2417da6 commit dec7262
Showing 1 changed file with 33 additions and 36 deletions.
69 changes: 33 additions & 36 deletions README.md
Expand Up @@ -39,77 +39,74 @@ Available kernel types are:
- `TukeyHanningKernel`
- `QuadraticSpectralKernel`

For example, `ParzenKernel(NeweyWest)` return an instance of `TruncatedKernel` parametrized by `NeweyWest`, the type that corresponds to the optimal bandwidth calculated following Newey and West (1994). Similarly, `ParzenKernel(Andrews)` corresponds to the optimal bandwidth obtained in Andrews (1991). If the bandwidth is known, it can be directly passed, i.e. `TruncatedKernel(2)`.
For example, `ParzenKernel{NeweyWest}()` return an instance of `TruncatedKernel` parametrized by `NeweyWest`, the type that corresponds to the optimal bandwidth calculated following Newey and West (1994). Similarly, `ParzenKernel{Andrews}()` corresponds to the optimal bandwidth obtained in Andrews (1991). If the bandwidth is known, it can be directly passed, i.e. `TruncatedKernel(2)`.

The examples below clarify the API, that is however relatively easy to use.

### Long run variance of the regression coefficient

In the regression context, the function `vcov` does all the work:
```julia
vcov(::DataFrameRegressionModel, ::HAC; prewhite = true)
vcov(::HAC, ::DataFrameRegressionModel; prewhite = true)
```

Consider the following artificial data (a regression with autoregressive error component):
```julia
using CovarianceMatrices
using DataFrames
using Random, DataFrames, GLM
Random.seed!(1)
n = 500
x = randn(n,5)
u = Array{Float64}(2*n)
u = zeros(2*n)
u[1] = rand()
for j in 2:2*n
u[j] = 0.78*u[j-1] + randn()
end
u = u[n+1:2*n]
y = 0.1 + x*[0.2, 0.3, 0.0, 0.0, 0.5] + u
u = u[n+1:2*n]
y = 0.1 .+ x*[0.2, 0.3, 0.0, 0.0, 0.5] + u

df = DataFrame()
df[:y] = y
for j in enumerate([:x1, :x2, :x3, :x4, :x5])
df[j[2]] = x[:,j[1]]
end
df = convert(DataFrame,x)
df[!,:y] = y
```
Using the data in `df`, the coefficient of the regression can be estimated using `GLM`

```julia
lm1 = glm(y~x1+x2+x3+x4+x5, df, Normal(), IdentityLink())
lm1 = glm(@formula(y~x1+x2+x3+x4+x5), df, Normal(), IdentityLink())
```

To get a consistent estimate of the long run variance of the estimated coefficients using a Quadratic Spectral kernel with automatic bandwidth selection _à la_ Andrews
```julia
vcov(lm1, QuadraticSpectralKernel(Andrews), prewhite = false)
vcov(QuadraticSpectralKernel{Andrews}(), lm1, prewhite = false)
```
If one wants to estimate the long-time variance using the same kernel, but with a bandwidth selected _à la_ Newey-West
```julia
vcov(lm1, QuadraticSpectralKernel(NeweyWest), prewhite = false)
vcov(QuadraticSpectralKernel{NeweyWest}(), lm1, prewhite = false)
```
The standard errors can be obtained by the `stderror` method
```julia
stderror(::DataFrameRegressionModel, ::HAC; prewhite::Bool)
stderror( ::HAC, ::DataFrameRegressionModel; prewhite::Bool)
```
Sometime is useful to access the bandwidth selected by the automatic procedures. This can be done using the `optimalbw` method
Sometime is useful to access the bandwidth selected by the automatic procedures. This can be done using the `optimalbandwidth` method
```julia
optimalbw(NeweyWest, QuadraticSpectralKernel, lm1; prewhite = false)
optimalbw(Andrews, QuadraticSpectralKernel, lm1; prewhite = false)
optimalbandwidth(QuadraticSpectralKernel{NeweyWest}(), lm1; prewhite = false)
optimalbandwidth(QuadraticSpectralKernel{Andrews}(), lm1; prewhite = false)
```

### Long run variance of the average of the process

Sometime interest lies in estimating the long-run variance of the average of the process. At the moment this can be done by carrying out a regression on a constant (the sample mean of the realization of the process) and using `vcov` or `stderror` to obtain a consistent variance estimate (or its diagonal elements).

```julia
lm2 = glm(u~1, df, Normal(), IdentityLink())
vcov(lm1, ParzenKernel(NeweyWest), prewhite = false)
stderr(lm1, ParzenKernel(NeweyWest), prewhite = false)
lm2 = glm(@formula(y~1), df, Normal(), IdentityLink())
vcov(ParzenKernel{NeweyWest}(), lm2, prewhite = false)
stderror(ParzenKernel{NeweyWest}(), lm2, prewhite = false)
```

## HC (Heteroskedastic consistent)

As in the HAC case, `vcov` and `stderr` are the main functions. They know get as argument the type of robust variance being sought
As in the HAC case, `vcov` and `stderror` are the main functions. They know get as argument the type of robust variance being sought
```julia
vcov(::DataFrameRegressionModel, ::HC)
vcov(::HC, ::DataFrameRegressionModel)
```
Where HC is an abstract type with the following concrete types:

Expand All @@ -131,20 +128,20 @@ using GLM
# The weights are added just to test the interface and are not part
# of the original data
clotting = DataFrame(
u = log([5,10,15,20,30,40,60,80,100]),
u = log.([5,10,15,20,30,40,60,80,100]),
lot1 = [118,58,42,35,27,25,21,19,18],
lot2 = [69,35,26,21,18,16,13,12,12],
w = 9*[1/8, 1/9, 1/25, 1/6, 1/14, 1/25, 1/15, 1/13, 0.3022039]
)
wOLS = fit(GeneralizedLinearModel, lot1~u, clotting, Normal(), wts = array(clotting[:w]))
vcov(wOLS, HC0
vcov(wOLS, HC1)
vcov(wOLS, HC2)
vcov(wOLS, HC3)
vcov(wOLS, HC4)
vcov(wOLS, HC4m)
vcov(wOLS, HC5)
wOLS = fit(GeneralizedLinearModel, @formula(lot1~u), clotting, Normal(), wts = clotting[!,:w])
vcov(HC0(),wOLS)
vcov(HC1(),wOLS)
vcov(HC2(),wOLS)
vcov(HC3(),wOLS)
vcov(HC4(),wOLS)
vcov(HC4m(),wOLS)
vcov(HC5(),wOLS)
```

## CRHC (Cluster robust heteroskedasticty consistent)
Expand All @@ -153,7 +150,7 @@ The API of this class of variance estimators is subject to change, so please use
```julia
using RDatasets
df = dataset("plm", "Grunfeld")
lm = glm(Inv~Value+Capital, df, Normal(), IdentityLink())
vcov(lm, CRHC1(convert(Array, df[:Firm])))
stderr(lm, CRHC1(convert(Array, df[:Firm])))
lm = glm(@formula(Inv~Value+Capital), df, Normal(), IdentityLink())
vcov(CRHC1(convert(Array, df[:Firm])), lm)
stderror(lm, CRHC1(convert(Array, df[:Firm])),lm)
```

0 comments on commit dec7262

Please sign in to comment.