The Stata program scfses obtains accurate point estimates and standard errors of an arbitrary percentile (or the mean) of a variable in the Survey of Consumer Finances (SCF). For example, scfses can help you easily obtain the median and standard error on the median. It incorporates weights and accounts for both imputation variability and sampling variability.
Update March 8, 2018: Fixed an error in computing standard errors on means.
Update March 13, 2018: Option changed from nodofcorr to nodfcorr for naming consistency. Help file updated with more explanation.
Update November 6, 2018: Added more guidance regarding the degrees of freedom correction.
To install the program, run
net install scfses, from(https://raw.github.com/crafkin/scfses/master/) replace
in Stata.
Alternatively, download scfses.ado and scfses.sthlp and place them in your PLUS folder. (To find your PLUS folder, run sysdir list in Stata.)
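After either installation route, a quick sanity check is to ask Stata where it found the program. The sketch below uses only built-in Stata commands and assumes nothing about scfses beyond its name.

```stata
* Confirm that Stata can locate the ado-file on its search path
which scfses

* Open the help file that ships with the program
help scfses
```

If `which` reports that the command was not found, the files are probably not in a directory listed by sysdir list.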
To download the SCF dataset, go to the SCF website. Download both the "Main Survey Data" and the "Replicate Weight File."
For example, you can download the 2016 SCF data, the 2016 replicate weights file, and the 2016 summary extract (a version of useful SCF variables cleaned from the microdata). Once you merge the replicate weights file with the main dataset and generate a variable indexing the implicates, you can use scfses to analyze variables' distributions.
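The merge and implicate-index steps might look roughly like the sketch below. Everything specific in it is an assumption to check against your own downloads and the SCF codebook: the file names (p16_rw1.dta, rscfp2016.dta), the lowercase identifiers yy1 (household) and y1 (household-by-implicate), the relation y1 = 10*yy1 + implicate number, and the replicate weight file having one row per household. The variable name implicate is a placeholder chosen here.

```stata
* Sketch only: file names, variable names, and the merge key are assumptions.

* 1. Replicate weight file (assumed: one row per household, keyed by yy1)
use "p16_rw1.dta", clear
rename *, lower                  // harmonize variable-name case
isid yy1                         // stops with an error if yy1 is not unique
capture drop y1                  // avoid a name clash with the main file
tempfile rw
save `rw'

* 2. Summary extract (assumed: five rows per household, one per implicate)
use "rscfp2016.dta", clear
rename *, lower
merge m:1 yy1 using `rw', keep(match) nogenerate

* 3. Implicate index, assuming y1 = 10*yy1 + implicate number
generate byte implicate = y1 - 10*yy1
assert inrange(implicate, 1, 5)  // fails loudly if the assumption is wrong
```

The isid and assert lines are there so that a wrong assumption about the file layout produces an error rather than a silently incorrect merge.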
Usage notes are documented in detail in the Stata help file.
- It is not straightforward to generate standard errors on the mean or a specified percentile of the unconditional distribution of an SCF variable. If you are not careful, your standard errors and confidence intervals may be too small.
- scfses follows SCF guidance on combining imputation variability and sampling variability to obtain standard errors.
- scfses estimates imputation variability by computing the sample variance of within-implicate point estimates. scfses estimates sampling variability by using the information contained in the replicate draw variables to construct the distribution of the variable from the first implicate; the sample variance of that distribution represents sampling variability. The program combines imputation and sampling variability following the SCF guidance.
- Confidence intervals incorporate a degrees-of-freedom correction (Barnard and Rubin 1999) to account for imputation variance.
- scfses stores point estimates and standard errors for post-estimation analysis.
- scfses requires a vector of replicate sampling variables and replicate weight variables, one for each replicate used to compute sampling variance.
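For intuition, the combination of imputation and sampling variability can be written out as follows. This is a sketch of the usual SCF recipe, not a transcription of the scfses source. The notation is introduced here: M is the number of implicates (5 in the SCF), R is the number of replicates, \hat{\theta}_m is the point estimate from implicate m, and \hat{\theta}_1^{(r)} is the estimate from replicate r of the first implicate.

```latex
\begin{aligned}
B &= \frac{1}{M-1}\sum_{m=1}^{M}\bigl(\hat{\theta}_m-\bar{\theta}\bigr)^2,
  &\bar{\theta} &= \frac{1}{M}\sum_{m=1}^{M}\hat{\theta}_m
  &&\text{(imputation variance)}\\[4pt]
U &= \frac{1}{R-1}\sum_{r=1}^{R}\bigl(\hat{\theta}_1^{(r)}-\bar{\theta}^{*}\bigr)^2,
  &\bar{\theta}^{*} &= \frac{1}{R}\sum_{r=1}^{R}\hat{\theta}_1^{(r)}
  &&\text{(sampling variance)}\\[4pt]
\widehat{\mathrm{SE}} &= \sqrt{\,U+\Bigl(1+\tfrac{1}{M}\Bigr)B\,}
\end{aligned}
```

With M = 5, the factor on the imputation variance is 6/5. Ignoring either term understates the standard error, which is why naive estimates tend to be too small.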
SCF recommends the command scfcombo (written by Jane Brittingham) for generating means and their standard errors. scfcombo may be useful for other applications (and some ideas in scfses were inspired by scfcombo). But scfses has the following advantages for summarizing the data:
- scfses makes it easy to generate point estimates and standard errors on an arbitrary percentile (which, to my knowledge, scfcombo cannot do without some modification).
- scfses incorporates a degrees-of-freedom correction for confidence intervals.
- scfses incorporates, by default, a degrees-of-freedom correction when constructing the confidence interval around your estimate. Stata's convention is to report a 95% confidence interval (e.g., for regression coefficients) based on the t distribution. That interval is exact only if the variable is normally distributed, but it is conservative otherwise. By default, scfses constructs the 95% confidence interval using the t distribution, but the user has the option to use the normal distribution instead.
- Many variables in the SCF may not be normally distributed, and hence the user may wish to turn off the degrees-of-freedom correction using the option nodfcorr.
- In general, the degrees-of-freedom correction is likely to make very little difference, given how quickly the t distribution with sufficient degrees of freedom approaches the normal distribution.
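To see how quickly the t distribution approaches the normal, you can compare two-sided 95% critical values directly in Stata. The degrees of freedom below are illustrative values chosen here, not the ones scfses computes.

```stata
* Two-sided 95% critical values: t distribution vs. standard normal
foreach df of numlist 5 30 100 1000 {
    display "t, df = `df': " %6.3f invttail(`df', 0.025)
}
display "normal: " %6.3f invnormal(0.975)
```

The critical value falls from about 2.57 at 5 degrees of freedom to about 1.98 at 100, against 1.96 for the normal, so the correction only widens the interval noticeably when the degrees of freedom are small.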
Charlie Rafkin
National Bureau of Economic Research
crafkin@nber.org
Program developed to obtain estimates in:
Beshears, John, James Choi, David Laibson, and Brigitte C. Madrian. "Household Finance." In Handbook of Behavioral Economics, edited by B. Douglas Bernheim, Stefano DellaVigna, and David Laibson. Elsevier: 2018.