# Libraries for the models from Grimmer et al., 2017

## LASSO:

Lasso regression, like ridge regression, fights overfitting by accepting a slightly higher mean squared error on the training data in exchange for better performance on test data. It does so by adding an L1 regularization penalty, whose strength is set by `alpha` in the library. [Library documentation available here.](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html#examples-using-sklearn-linear-model-lasso)

```class sklearn.linear_model.Lasso(alpha=1.0, *, fit_intercept=True, normalize=False, precompute=False, copy_X=True, max_iter=1000, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')```
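A minimal sketch of fitting `Lasso` follows. The synthetic data, the coefficient values, and `alpha=0.1` are illustrative assumptions, not settings from Grimmer et al. It shows how the L1 penalty drives the coefficients of irrelevant features to exactly zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (assumption): only the first two of ten features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + rng.normal(scale=0.1, size=200)

# alpha controls the strength of the L1 penalty; 0.1 is an arbitrary choice.
model = Lasso(alpha=0.1)
model.fit(X, y)

# Count how many coefficients survived the penalty.
n_nonzero = int(np.count_nonzero(model.coef_))
print("non-zero coefficients:", n_nonzero)
print("coefficients:", np.round(model.coef_, 2))
```

Larger values of `alpha` shrink the surviving coefficients further and zero out more of them; in practice `alpha` is usually chosen by cross-validation (for example with `sklearn.linear_model.LassoCV`).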



## Elastic Net:

Elastic net regression combines the regularization penalties of the Lasso (L1) and Ridge (L2) regression models; in the library, `l1_ratio` sets the mix between the two. [Library documentation available here.](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html#)

```class sklearn.linear_model.ElasticNet(alpha=1.0, *, l1_ratio=0.5, fit_intercept=True, normalize=False, precompute=False, max_iter=1000, copy_X=True, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')```

An example comparing the Lasso and Elastic Net regression models can be found [here.](https://scikit-learn.org/stable/auto_examples/linear_model/plot_lasso_and_elasticnet.html#sphx-glr-auto-examples-linear-model-plot-lasso-and-elasticnet-py)
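As a rough sketch of how `ElasticNet` is used, the snippet below fits it on synthetic data. The data, `alpha=0.1`, and `l1_ratio=0.5` are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Synthetic data (assumption): two informative features out of ten.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# l1_ratio=0.5 weights the L1 and L2 penalties equally;
# l1_ratio=1.0 would reduce the penalty to the Lasso penalty.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5)
enet.fit(X, y)

# In-sample R^2, just to confirm the fit is sensible.
r2 = enet.score(X, y)
print("R^2 on training data:", round(r2, 3))
print("coefficients:", np.round(enet.coef_, 2))
```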

## FindIt: Finding Heterogeneous Treatment Effects

The method adapts the Support Vector Machine classifier by placing separate LASSO constraints over the pre-treatment parameters and the causal heterogeneity parameters of interest. This package is only available in R. [Package can be found here.](https://cran.r-project.org/web/packages/FindIt/FindIt.pdf)

```FindIt( model.treat, model.main, model.int, data = NULL, type = "binary", treat.type = "multiple", nway, search.lambdas = TRUE, lambdas = NULL, make.twoway = TRUE, make.allway = TRUE, wts = 1, scale.c = 1, scale.int = 1, fit.glmnet = TRUE, make.reference = TRUE, reference.main = NULL, threshold = 0.999999 )```

## Bayesian GLM:

A modified `glm` function that finds an approximate posterior mode and variance using extensions of the classical generalized linear model computations. It is officially available only in R [here](https://www.rdocumentation.org/packages/arm/versions/1.11-1/topics/bayesglm), but an open-source (unlicensed) GitHub project for use in Python is available [here.](https://github.com/dchudz/BayesGLM)

```bayesglm (formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, offset, control = list(...), model = TRUE, method = "glm.fit", x = FALSE, y = TRUE, contrasts = NULL, drop.unused.levels = TRUE, prior.mean = 0, prior.scale = NULL, prior.df = 1, prior.mean.for.intercept = 0, prior.scale.for.intercept = NULL, prior.df.for.intercept = 1, min.prior.scale=1e-12, scaled = TRUE, keep.order=TRUE, drop.baseline=TRUE, maxit=100, print.unnormalized.log.posterior=FALSE, Warning=TRUE,...)```


## BART: Bayesian Additive Regression Trees:

BART is a nonparametric Bayesian regression approach which uses dimensionally adaptive random basis elements. Motivated by ensemble methods in general, and boosting algorithms in particular, BART is defined by a statistical model: a prior and a likelihood.

The package is available in R [here.](https://cran.r-project.org/web/packages/BART/BART.pdf)

A licensed GitHub project for use in Python is available [here.](https://github.com/JakeColtman/bartpy)

## Random Forest Regressor:

A random forest regressor is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting. [Library documentation available here.](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html)

```class sklearn.ensemble.RandomForestRegressor(n_estimators=100, *, criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, ccp_alpha=0.0, max_samples=None)```
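The sketch below fits a `RandomForestRegressor` on a synthetic nonlinear target and scores it on held-out data. The data-generating function and the hyperparameter values are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic nonlinear regression problem (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 100 trees is fit on a bootstrap sample of the training data;
# predictions are the average over all trees.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

test_r2 = forest.score(X_test, y_test)
print("R^2 on held-out data:", round(test_r2, 3))
print("feature importances:", np.round(forest.feature_importances_, 2))
```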

## KRLS: Kernel Regularized Least Squares:


The `krls` function implements Kernel-Based Regularized Least Squares (KRLS), a machine learning method described in Hainmueller and Hazlett (2014) that allows users to solve regression and classification problems without a manual specification search or strong functional form assumptions.

This package is available in R; a Python version has yet to be found or written. [Available here.](https://www.rdocumentation.org/packages/KRLS/versions/1.0-0/topics/krls)

```krls(X = NULL, y = NULL, whichkernel = "gaussian", lambda = NULL, sigma = NULL, derivative = TRUE, binary= TRUE, vcov=TRUE, print.level = 1,L=NULL,U=NULL,tol=NULL,eigtrunc=NULL)```
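Until a Python port exists, a closely related estimator can be sketched with scikit-learn's `KernelRidge` and a Gaussian (RBF) kernel. This is a stand-in under stated assumptions, not a port of `krls`: it does not compute the pointwise derivatives (marginal effects) or choose `lambda` automatically, and the `gamma` and `alpha` values below are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Synthetic one-dimensional regression problem (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

# scikit-learn's RBF kernel is exp(-gamma * ||xi - xj||^2), so gamma plays
# the role of 1 / sigma in a kernel written as exp(-||xi - xj||^2 / sigma).
# alpha is the ridge penalty, analogous to lambda in krls.
model = KernelRidge(kernel="rbf", gamma=1.0, alpha=0.1)
model.fit(X, y)

train_r2 = model.score(X, y)
print("R^2 on training data:", round(train_r2, 3))
```

Both hyperparameters would normally be tuned, for example with `sklearn.model_selection.GridSearchCV`.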


## Support Vector Machine (SVM):

Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection.

The advantages of support vector machines are:

- Effective in high dimensional spaces.

- Still effective in cases where number of dimensions is greater than the number of samples.

- Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.

- Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.

The disadvantages of support vector machines include:

- If the number of features is much greater than the number of samples, avoiding over-fitting when choosing the kernel function and regularization term is crucial.

- SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.

[Library documentation available here.](https://scikit-learn.org/stable/modules/svm.html)
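A minimal classification sketch with `sklearn.svm.SVC` is shown below. The synthetic dataset and the choice of an RBF kernel with `C=1.0` are assumptions. It illustrates that only a subset of the training points (the support vectors) is kept for the decision function, and that the model returns signed decision scores rather than probabilities by default.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic binary classification problem (assumption).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# kernel and C are the main knobs: the kernel sets the shape of the
# decision boundary, C sets the regularization strength.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

# Only the support vectors are stored and used at prediction time.
n_support = clf.support_vectors_.shape[0]
print("support vectors:", n_support, "of", X.shape[0], "training points")

# Signed distances to the decision boundary, not probabilities.
scores = clf.decision_function(X[:5])
print("decision scores:", scores)
```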

