mdroste/stata-binscatter2

binscatter2

Overview | Motivation | Installation | Usage | Benchmarks | To-Do | Acknowledgements | License

Faster binned scatterplots in Stata with a few new bells and whistles

version 0.91 19jan2023

Overview

binscatter2 is a program for producing binned scatterplots in Stata. It inherits the syntax and functionality of the excellent binscatter package, but runs substantially faster for big datasets (see benchmarks). In practice, binscatter2 runs approximately 3 to 4 times faster than binscatter, and 2 to 3 times faster than binsreg.

In addition, binscatter2 offers a handful of new features relative to binscatter. It lets users plot additional information about the conditional distribution of y given x (e.g. quantile intervals), and it adds an alternative procedure to adjust for covariates suggested by Cattaneo et al. (2022), additional options for fit lines and saving, and multi-way fixed effects.

New Features

In addition to substantial performance improvements for large datasets (see benchmarks), binscatter2 adds a few new features to binscatter. In particular:

  • Multi-way fixed effects. If reghdfe is installed, multi-way fixed effects can be specified in the absorb() option.
  • Visualize conditional variance and quantiles. Overlay quantiles of the sample distribution on top of the means/medians within each bin, providing more information on the shape of the conditional distribution of y given x.
  • Flexible save commands. Save the binned scatter points to .dta files, and use the nodofile option to omit the do-file that savedata() otherwise creates.
  • More fit line options. Exponential and logarithmic fits, with higher-order polynomials coming soon.
  • Alternative covariate adjustment procedure. The new altcontrols option implements the procedure described in Cattaneo et al. (2019), which controls for covariates without first residualizing y and x on the vector of controls/fixed effects.
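
As a rough illustration of the options named in the list above, the following sketch uses Stata's bundled auto dataset. It assumes that controls() carries over unchanged from binscatter and that savedata() takes a file stub as it does there; the stub name price_mpg_bins is made up for this example. The argument syntax for quantiles() and for the new fit lines is not shown here; see help binscatter2 for those.

```stata
* Hypothetical example data: Stata's bundled auto dataset
sysuse auto, clear

* Fixed effects via absorb(); per the list above, multi-way
* fixed effects additionally require reghdfe to be installed
binscatter2 price mpg, absorb(foreign)

* Alternative covariate adjustment (altcontrols) instead of
* residualizing y and x on the controls
binscatter2 price mpg, controls(weight) altcontrols

* Save the binned points, skipping the do-file that savedata()
* would otherwise create (file stub name is an assumption)
binscatter2 price mpg, savedata(price_mpg_bins) nodofile
```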

[Figure: binscatter2 demo]

Installation

There are two options for installing binscatter2. The only prerequisite is the gtools command, which can be installed from GitHub or the SSC repository.

  1. The most recent version can be installed from GitHub with the following Stata commands:
ssc install gtools
net install binscatter2, from("https://raw.githubusercontent.com/mdroste/stata-binscatter2/master/")
  2. A ZIP containing the program can be downloaded from GitHub and manually placed on the user's adopath.
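
After either route, a quick way to confirm the installation is to ask Stata where it finds each command. This is only a sketch: which is a built-in Stata command that reports the location of an ado-file, and the two optional installs at the end are an assumption about what a user wanting multi-way fixed effects would need (reghdfe depends on ftools).

```stata
* Confirm that both commands are found on the adopath
which gtools
which binscatter2

* Optional: only needed for multi-way fixed effects in absorb()
ssc install ftools
ssc install reghdfe
```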

This project will be submitted to the SSC repository very soon.

Usage

Complete internal documentation is provided with the installation and can be accessed by typing:

help binscatter2

The basic syntax and usage of binscatter2 are inherited from binscatter and should be familiar to existing users of that program.
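
For readers new to binscatter, a minimal sketch of the inherited syntax might look like the following. It assumes the standard binscatter options nquantiles(), controls(), and by() behave the same way in binscatter2, and it uses Stata's bundled auto dataset purely for illustration.

```stata
* Hypothetical example data: Stata's bundled auto dataset
sysuse auto, clear

* Binned scatterplot of price (y) against mpg (x)
binscatter2 price mpg

* Use 10 equal-sized bins instead of the default
binscatter2 price mpg, nquantiles(10)

* Adjust for covariates before binning
binscatter2 price mpg, controls(weight length)

* Plot a separate series for each group
binscatter2 price mpg, by(foreign) nquantiles(10)
```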

This repository includes a do-file, check.do, that provides a number of checks to verify the functionality of each option within binscatter2 and demonstrates equivalence to binscatter for options shared by both programs. The file check_speed.do runs Monte Carlo simulations that were used in the benchmark section of this readme.

Benchmarks

[Figure: binscatter2 benchmark]

To-Do

The following items will be addressed soon:

  • Save out quantile intervals when using the savedata() option
  • More aesthetic options for the quantiles() option
  • Comparison against binsreg

Acknowledgements

binscatter2 builds extensively on binscatter, developed by the illustrious Michael Stepner and Jessica Laird.

In addition, binscatter2 would certainly not have been possible without gtools by Mauricio Caceres Bravo, which in turn would not have happened without ftools, developed by Sergio Correia.

The alternative covariate adjustment procedure (enabled with the option altcontrols) was formalized by Cattaneo et al. (2022).

License

binscatter2 is MIT-licensed.