You can install directly from PyPI:
$ pip install econtools
Or you can clone from GitHub and install directly:
$ git clone http://github.com/dmsul/econtools
$ cd econtools
$ python setup.py install
- OLS, 2SLS, LIML
- Option to absorb any variable via within-transformation (a la
- Robust standard errors
- HAC (
- Clustered standard errors
- Spatial HAC (SHAC, aka Conley standard errors) with uniform and triangle kernels
- F-tests by variable name or
- Local linear regression.
- WARNING [31 Oct 2019]: Predicted values (yhat and residuals) may not be as expected in transformed regressions (when using fixed effects or using weights). That is, the current behavior is different from Stata. I am looking into this and will post either a fix or a justification of current behavior in the near future.
import econtools
import econtools.metrics as mt

# Read Stata DTA file
df = econtools.read('my_data.dta')

# Estimate OLS regression with fixed-effects and clustered s.e.'s
result = mt.reg(df,                     # DataFrame to use
                'y',                    # Outcome
                ['x1', 'x2'],           # Indep. Variables
                fe_name='person_id',    # Fixed-effects using variable 'person_id'
                cluster='state'         # Cluster by state
)

# Results
print(result.summary)                                # Print regression results
beta_x1 = result.beta['x1']                          # Get coefficient by variable name
r_squared = result.r2a                               # Get adjusted R-squared
joint_F = result.Ftest(['x1', 'x2'])                 # Test for joint significance
equality_F = result.Ftest(['x1', 'x2'], equal=True)  # Test for coeff. equality
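To give a sense of what absorbing a variable via the within-transformation does, here is a minimal pure-Python sketch. It is not econtools' implementation: the helper names `demean_by_group` and `within_slope` and the toy data are made up for illustration, and it handles only a single regressor. It demeans the outcome and the regressor within each group, then computes the OLS slope on the demeaned data.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (within-transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def within_slope(y, x, groups):
    """OLS slope of y on a single x after absorbing group fixed effects."""
    y_dm = demean_by_group(y, groups)
    x_dm = demean_by_group(x, groups)
    sxy = sum(a * b for a, b in zip(x_dm, y_dm))
    sxx = sum(a * a for a in x_dm)
    return sxy / sxx


# Hypothetical toy panel: y = 2*x plus a person-specific intercept (10 and 50)
person_id = ['a', 'a', 'a', 'b', 'b', 'b']
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [12.0, 14.0, 16.0, 52.0, 54.0, 56.0]

slope = within_slope(y, x, person_id)  # the person intercepts drop out
```

Because the person-specific intercepts are constant within each group, demeaning removes them, and the slope is identified from within-person variation only.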
Regression and Summary Stat Tables
outreg takes regression results and creates a LaTeX-formatted tabular fragment.
table_statrow can be used to add arbitrary statistics, notes, etc. to a table. Can also be used to create a table of summary statistics.
write_notes makes it easy to save table notes that depend on your data.
Misc. Data Manipulation Tools
pandas.merge and adds a lot of Stata's merge niceties, like a '_m' flag for successfully merged observations.
group_id generates an ID based on the variables passed (compare
- Crosswalks of commonly used U.S. state labels.
- State abbreviation to state name (and reverse).
- State FIPS to state name (and reverse).
write: Use the passed file path's extension to determine which pandas I/O method to use. Useful for writing functions that programmatically read DataFrames from disk which are saved in different formats. See examples above and below.
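The extension-based dispatch described here can be sketched roughly as follows. This is an illustration only, not econtools' code: the `READERS` table and the helpers `pick_reader` and `read_any` are hypothetical, and the set of extensions econtools actually supports may differ. The pandas readers named in the table (`read_stata`, `read_csv`, `read_pickle`, `read_hdf`) are real pandas functions.

```python
import os

# Assumed mapping from file extension to the name of a pandas reader.
# econtools' real table may differ; this one is for illustration only.
READERS = {
    '.dta': 'read_stata',
    '.csv': 'read_csv',
    '.pkl': 'read_pickle',
    '.h5': 'read_hdf',
}


def pick_reader(path):
    """Return the name of the pandas reader matching the file extension."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return READERS[ext]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {ext!r}")


def read_any(path, **kwargs):
    """Read `path` into a DataFrame with the reader chosen by extension."""
    import pandas as pd  # imported lazily so `pick_reader` works without pandas
    return getattr(pd, pick_reader(path))(path, **kwargs)


reader_name = pick_reader('my_data.dta')
```

Keeping the lookup in one table means a script can switch a dataset from CSV to DTA by changing only the filename.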
load_or_build: A function decorator that caches datasets to disk. This function builds the requested dataset and saves it to disk if it doesn't already exist on disk. If the dataset is already saved, it simply loads it, saving computational time and allowing the use of a single function to both load and build data.
from econtools import load_or_build, read

@load_or_build('my_data_file.dta')
def build_my_data_file():
    """
    Cleans raw data from CSV format and saves as Stata DTA.
    """
    df = read('raw_data.csv')
    # Clean the DataFrame
    return df
File type is automatically detected from the passed filename. In this case, Stata DTA from
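For readers curious about how a load-or-build decorator can work, here is a minimal sketch of the pattern. It is not econtools' implementation: the name `cache_to_disk` is hypothetical, it always uses pickle instead of detecting the file type from the extension, and it keys the cache on the path alone, ignoring function arguments.

```python
import functools
import os
import pickle
import tempfile


def cache_to_disk(path):
    """Decorator: load the result from `path` if present, else build and save it."""
    def decorator(build_func):
        @functools.wraps(build_func)
        def wrapper(*args, **kwargs):
            if os.path.isfile(path):
                # Cached copy exists: skip the build entirely
                with open(path, 'rb') as f:
                    return pickle.load(f)
            result = build_func(*args, **kwargs)
            with open(path, 'wb') as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator


# Usage with a throwaway directory so the example leaves no files behind
calls = []
cache_path = os.path.join(tempfile.mkdtemp(), 'my_data_file.pkl')


@cache_to_disk(cache_path)
def build_my_data():
    calls.append(1)  # track how often the expensive build actually runs
    return {'x': [1, 2, 3]}


first = build_my_data()   # builds the data and writes the cache file
second = build_my_data()  # loads from disk; the build function is not re-run
```

Deleting the cached file is enough to force a rebuild on the next call.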
save_cli: Simple wrapper for argparse that lets you use a --save flag on the command line. This lets you run a regression without overwriting the previous results and without modifying the code in any way (i.e., commenting out the "save" lines).
In your regression script:
from econtools import save_cli

def regression_table(save=False):
    """ Run a regression and save output if `save == True`. """
    # Regression guts

if __name__ == '__main__':
    save = save_cli()
    regression_table(save=save)
In the command line/bash script:
python run_regression.py         # Runs regression without saving output
python run_regression.py --save  # Runs regression and saves output
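A wrapper of this kind can be written in a few lines of argparse. The sketch below shows the general idea under stated assumptions: `parse_save_flag` is a hypothetical name, and the real save_cli may accept other options or behave differently.

```python
import argparse


def parse_save_flag(argv=None):
    """Return True if `--save` was passed, else False.

    With `argv=None`, argparse reads the real command line (sys.argv[1:]).
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--save', action='store_true',
                        help='Save output to disk')
    args = parser.parse_args(argv)
    return args.save


# Explicit argument lists stand in for the two command lines above
save_off = parse_save_flag([])
save_on = parse_save_flag(['--save'])
```

Using `action='store_true'` makes the flag default to False, so a plain run never overwrites saved results by accident.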
- Python 3.6+
- pandas and its dependencies (NumPy, etc.)
- SciPy and its dependencies
- PyTables (optional, if you use HDF5 files)
- pytest (optional, if you want to run the tests)