GitHub - zmjones/rfss: "Exploratory Data Analysis using Random Forests"

This repository contains the code, data, and manuscript for "Exploratory Data Analysis Using Random Forests" by Zachary M. Jones and Fridolin Linder.

The rise of "big data" has made machine learning algorithms more visible and relevant for social scientists. However, they are still widely considered to be "black box" models that are suited only to prediction, not to substantive research. We argue that this need not be the case, and present one method, Random Forests, with an emphasis on practical application for exploratory analysis and substantive interpretation. Random Forests detect interaction and nonlinearity without prespecification, have low generalization error in simulations and in many real-world problems, and can be used with many correlated predictors, even when there are more predictors than observations. Importantly, Random Forests can be interpreted in a substantively relevant way with variable importance measures, bivariate and multivariate partial dependence, proximity matrices, and methods for interaction detection. We provide intuition as well as technical detail about how Random Forests work, in theory and in practice, as well as empirical examples from the literature on American and comparative politics. Furthermore, we provide software implementing the methods we discuss to facilitate their use.

The associated software package, edarf, is functional but still under development. Feel free to open issues or submit pull requests. We welcome corrections and suggestions, large or small.

To run all of the code, use the Makefile (this assumes you are on a Unix system or otherwise have Make installed). If you pass the argument CORES, the code will be run in parallel.

> make setup
> make code CORES=8
## requires pandoc, pandoc-citeproc and pdflatex to be installed and available in the environment
> make rfss_manuscript.pdf
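
Since the manuscript target depends on external tools, it can help to check that they are on your PATH before building. The following is a minimal sketch and not part of this repository; the helper name `missing_tools` is hypothetical, and only the three tool names come from the note above.

```shell
# Hypothetical helper (not part of the repository): print the names of any
# of the given commands that cannot be found on PATH, separated by spaces.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v is the POSIX way to test whether a command is available
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Strip the single leading space left by the concatenation above
  printf '%s\n' "${missing# }"
}

# Tools the README lists as required for building the manuscript
missing=$(missing_tools pandoc pandoc-citeproc pdflatex)
if [ -n "$missing" ]; then
  echo "Missing tools: $missing" >&2
else
  echo "All manuscript dependencies found."
fi
```

If anything is reported missing, install it before running `make rfss_manuscript.pdf`.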

Folders and files:

R
data
figures
state_bills
.gitignore
.gitmodules
README.md
makefile
options.sty
rfss.bib
rfss_manuscript.md
rfss_manuscript.pdf
rfss_slides.md
