
wildclusterboot

wildclusterboot is an R package that implements both clustered standard errors and the wild cluster bootstrap. It includes cluster-robust standard errors similar to the cluster option available in Stata and the standard wild cluster bootstrap of Cameron, Gelbach, and Miller (2008), along with "multi-way" versions of each.

The core of this package is implemented in C++, providing decent performance, especially when bootstrapping large datasets.

Installation

If you already have R and RStudio installed, the easiest way to install this package is via devtools. To install devtools, run the following code in R:

install.packages('devtools')

Once devtools is installed, run the following code to install the package:

devtools::install_github(repo = 'mattdwebb/wildclusterboot')

Usage

clustered_se

The library provides two main functions. The first is clustered_se, which can be used to get the clustered standard errors out of an lm object. This includes both the standard, "one-way" cluster-robust standard errors, as well as "multi-way" standard errors, due to Cameron, Gelbach and Miller (2011).

An example for one-way standard errors:

#Create dataframe of random normal variables
test_data <- data.frame(Y = rnorm(10), X = rnorm(10), G = c(1,1,1,2,2,2,3,3,3,3))

#Fit model for test data
test_model <- lm(data = test_data, formula = 'Y ~ X')

#Get a vector of clustered standard errors for model
SEs <- clustered_se(model = test_model, clusterby = 'G')
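To make clear what a one-way cluster-robust standard error measures, here is a minimal base R sketch of the usual sandwich estimator. It is not the package's C++ implementation: the variable names are illustrative, and the Stata-style small-sample factor G/(G-1) * (N-1)/(N-K) is an assumption about which correction is wanted.

```r
set.seed(1)
test_data <- data.frame(Y = rnorm(10), X = rnorm(10),
                        G = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3))
fit <- lm(Y ~ X, data = test_data)

X_mat <- model.matrix(fit)          # N x K design matrix
u     <- residuals(fit)             # OLS residuals
g     <- test_data$G                # cluster identifier
n     <- nrow(X_mat)
k     <- ncol(X_mat)
n_g   <- length(unique(g))          # number of clusters

# Sum the score contributions x_i * u_i within each cluster (G x K matrix)
scores <- rowsum(X_mat * u, group = g)

bread <- solve(crossprod(X_mat))    # (X'X)^-1
meat  <- crossprod(scores)          # sum over clusters of s_g s_g'

# Assumed Stata-style finite-sample adjustment
adj <- (n_g / (n_g - 1)) * ((n - 1) / (n - k))

V_cluster  <- adj * bread %*% meat %*% bread
se_cluster <- sqrt(diag(V_cluster))
se_cluster
```

Comparing se_cluster with the output of clustered_se on the same model is a quick way to check which small-sample correction the package applies.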

And an example for multi-way standard errors:

#Create dataframe of random normal variables
test_data <- data.frame(Y = rnorm(10),
                        X = rnorm(10),
                        G = c(1,1,1,2,2,2,3,3,3,3),
                        H = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1))

#Fit model for test data
test_model <- lm(data = test_data, formula = 'Y ~ X')

# In this case, supply a formula containing all dimension names to cluster by both G and H
SEs <- clustered_se(model = test_model, clusterby = ~ G + H)
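The two-way estimator of Cameron, Gelbach and Miller (2011) combines three one-way variance matrices: one clustered on G, plus one clustered on H, minus one clustered on the intersection of G and H. The base R sketch below illustrates that combination only. The helper cluster_vcov is hypothetical, no small-sample correction is applied, and the package may handle both details differently.

```r
set.seed(1)
test_data <- data.frame(Y = rnorm(10),
                        X = rnorm(10),
                        G = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                        H = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1))
fit   <- lm(Y ~ X, data = test_data)
X_mat <- model.matrix(fit)
u     <- residuals(fit)
bread <- solve(crossprod(X_mat))

# Hypothetical helper: uncorrected one-way cluster-robust variance matrix
cluster_vcov <- function(group) {
  scores <- rowsum(X_mat * u, group = group)
  bread %*% crossprod(scores) %*% bread
}

# Each observed (G, H) pair forms one cluster in the intersection term
GH <- interaction(test_data$G, test_data$H, drop = TRUE)

# Two-way variance: V_G + V_H - V_(G and H)
V_multi <- cluster_vcov(test_data$G) +
           cluster_vcov(test_data$H) -
           cluster_vcov(GH)
V_multi
```

Because one matrix is subtracted, V_multi is not guaranteed to be positive semi-definite in small samples, so a diagonal entry can occasionally come out negative.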

wild_cluster_boot

The second function is wild_cluster_boot, which performs a wild cluster bootstrap on an lm object and its data for a given independent variable of interest. This includes the standard wild cluster bootstrap, as below:

p_val <- wild_cluster_boot(data = test_data,
                           model = test_model,
                           x_interest = 'X',
                           clusterby = 'G',
                           boot_reps = 399)
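For readers who want to see the procedure behind the p-value, the following base R sketch follows the general recipe of Cameron, Gelbach, and Miller (2008): impose the null hypothesis, flip the restricted residuals cluster by cluster with Rademacher weights, and compare the observed cluster-robust t-statistic with its bootstrap distribution. It is an illustration under stated assumptions, not the package's code: the helper cluster_t is hypothetical, and the choice of Rademacher weights, the imposed null, and the small-sample factor are all assumptions.

```r
set.seed(42)
test_data <- data.frame(Y = rnorm(10), X = rnorm(10),
                        G = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3))

# Hypothetical helper: cluster-robust t-statistic for one coefficient
cluster_t <- function(y, X_mat, g, coef_name) {
  fit <- lm.fit(X_mat, y)
  j   <- match(coef_name, colnames(X_mat))
  n   <- nrow(X_mat); k <- ncol(X_mat); n_g <- length(unique(g))
  scores <- rowsum(X_mat * fit$residuals, group = g)
  bread  <- solve(crossprod(X_mat))
  adj    <- (n_g / (n_g - 1)) * ((n - 1) / (n - k))  # assumed correction
  V      <- adj * bread %*% crossprod(scores) %*% bread
  unname(fit$coefficients[j]) / sqrt(V[j, j])
}

y     <- test_data$Y
g     <- test_data$G
X_mat <- model.matrix(Y ~ X, data = test_data)
t_obs <- cluster_t(y, X_mat, g, "X")

# Restricted model: drop X so that the null (coefficient on X = 0) holds
X_r      <- X_mat[, colnames(X_mat) != "X", drop = FALSE]
fit_r    <- lm.fit(X_r, y)
fitted_r <- fit_r$fitted.values
resid_r  <- fit_r$residuals

boot_reps <- 399
clusters  <- unique(g)

t_boot <- replicate(boot_reps, {
  # One Rademacher draw (+1 or -1) per cluster, shared by all its rows
  w      <- sample(c(-1, 1), length(clusters), replace = TRUE)
  y_star <- fitted_r + resid_r * w[match(g, clusters)]
  cluster_t(y_star, X_mat, g, "X")
})

# Symmetric bootstrap p-value
p_val_sketch <- mean(abs(t_boot) >= abs(t_obs))
p_val_sketch
```

With only three clusters, Rademacher weights can produce just 2^3 = 8 distinct bootstrap samples, so the resulting p-value is coarse however many replications are drawn.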

It also provides a number of extensions to the original wild cluster bootstrap, including a multi-way version:

p_val <- wild_cluster_boot(data = test_data,
                           model = test_model,
                           x_interest = 'X',
                           clusterby = ~ G + H,
                           bootby = 'G',
                           boot_reps = 399)

Note that in the multi-way case the bootby variable must be specified explicitly, or an error will be thrown.