Skip to content

The codes reproduce the figures and statistics in the paper, "Controlling for multiple covariates," by Mark Tygert. The repo also provides the LaTeX and BibTex sources required for replicating the paper.

License

Notifications You must be signed in to change notification settings

facebookresearch/metamulti

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

The accompanying codes reproduce all figures and statistics presented in "Controlling for multiple covariates" by Mark Tygert. This repository also provides the LaTeX and BibTeX sources required for replicating the paper.

Be sure to pip install hilbertcurve prior to running any of this software (the codes depend on HilbertCurve). Also be sure to gunzip codes/cup98lrn.txt.gz prior to running codes/kddcup98.py.

The main files in the repository are the following:

tex/multidim.pdf PDF version of the paper

tex/multidim.tex LaTeX source for the paper

tex/multidim.bib BibTeX source for the paper

tex/diffs0.pdf tex/diffs1.pdf tex/sums0.pdf tex/sums1.pdf tex/partition.pdf Graphics for Subsection 2.3 of the paper

codes/acs.py Python script for processing the American Community Survey

codes/psam_h06.csv Microdata from the 2019 American Community Survey of the U.S. Census Bureau

codes/kddcup98.py Python script for processing the KDD Cup 1998 data

codes/cup98lrn.txt.gz Data from the 1998 KDD Cup

codes/synthetic.py Python script for generating and processing synthetic examples

codes/hilbert.pdf Plot of an approximation with 255 line segments to the Hilbert curve in 2D

codes/disjoint.py Functions for plotting differences between two subpops. with disjoint scores (redistributed from the GitHub repo fbcddisgraph)

codes/subpop.py Functions for plotting differences of a subpop. from the full population (redistributed from the GitHub repo fbcdgraph)

codes/subpop_weighted.py Functions for plotting differences of a subpop. from the full pop. with weights (redistributed from the GitHub repo fbcdgraph)

Regenerating all the figures requires running in the directory codes acs.py, kddcup98.py, and synthetic.py; issue the commands

cd codes
pip install hilbertcurve
gunzip cup98lrn.txt.gz
python acs.py --non-interactive --var 'MV'
python acs.py --non-interactive --var 'NOC'
python acs.py --non-interactive --var 'MV+NOC'
python acs.py --non-interactive --var 'NOC+MV'
python kddcup98.py --non-interactive
python synthetic.py

acs.py and kddcup98.py include interactive modes to facilitate browsing through the associated parameterizations of the covariates used as controls (where the Hilbert curve specifies the parameterization). The default setting is interactive, not saving plots to disk; to save plots to disk, just specify --no-interactive or --non-interactive during invocation as a script on the command-line.


Copyright license

This metamulti software is licensed under the (MIT-type) copyright LICENSE file in the root directory of this source tree.

About

The codes reproduce the figures and statistics in the paper, "Controlling for multiple covariates," by Mark Tygert. The repo also provides the LaTeX and BibTex sources required for replicating the paper.

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published