High-Dimensional LASSO (Hi-LASSO) can theoretically improve a LASSO model, providing better performance in both prediction and feature selection on extremely high-dimensional data. Hi-LASSO alleviates the bias introduced by bootstrapping, refines importance scores, improves performance by taking advantage of the global oracle property, provides a statistical strategy to determine the number of bootstraps, and allows tests of significance for feature selection with an appropriate distribution. Hi-LASSO uses Python's multiprocessing pool to run in parallel, which reduces the time required to fit the model.
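To illustrate the idea of spreading bootstrap runs over a pool, here is a minimal sketch. It is not Hi-LASSO's implementation: the function `bootstrap_slope`, the toy data, and the number of runs `B` are all hypothetical, and the model is a one-variable least-squares fit rather than a LASSO. It also swaps in `multiprocessing.pool.ThreadPool`, which exposes the same `map` interface as `multiprocessing.Pool`, so that the snippet runs unchanged on any platform.

```python
import random
from multiprocessing.pool import ThreadPool  # same map() interface as multiprocessing.Pool


def bootstrap_slope(args):
    """Fit y = a * x (no intercept) on one bootstrap resample and return a."""
    xs, ys, seed = args
    rng = random.Random(seed)  # one independent, reproducible RNG per run
    n = len(xs)
    idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
    sxy = sum(xs[i] * ys[i] for i in idx)
    sxx = sum(xs[i] * xs[i] for i in idx)
    return sxy / sxx


# Toy data (hypothetical): y is roughly 2 * x plus small Gaussian noise.
xs = [float(i) for i in range(1, 21)]
noise = random.Random(0)
ys = [2.0 * x + noise.gauss(0, 0.1) for x in xs]

B = 30  # number of bootstrap runs, chosen arbitrarily for this sketch
tasks = [(xs, ys, seed) for seed in range(B)]

# Each bootstrap run is independent, so the runs can be mapped over a pool.
with ThreadPool(4) as pool:
    slopes = pool.map(bootstrap_slope, tasks)

mean_slope = sum(slopes) / len(slopes)
print(f"{B} bootstrap slopes, mean = {mean_slope:.3f}")
```

Because the runs do not depend on each other, the wall-clock time drops roughly with the number of workers when a process pool is used for CPU-bound fits.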
Hi-LASSO supports Python 3.6+. It also depends on several additional packages;
however, these packages should be installed automatically when installing this codebase.
Hi-LASSO is available through PyPI and can easily be installed with pip:
pip install hi_lasso
Read the documentation on readthedocs
# Data load
import pandas as pd
X = pd.read_csv('https://raw.githubusercontent.com/datax-lab/Hi-LASSO/master/simulation_data/X.csv')
y = pd.read_csv('https://raw.githubusercontent.com/datax-lab/Hi-LASSO/master/simulation_data/y.csv')

# General usage
from hi_lasso.hi_lasso import HiLasso

# Create a HiLasso model
hilasso = HiLasso(q1='auto', q2='auto', L=30, alpha=0.05, logistic=False, random_state=None, n_jobs=None)

# Fit the model
hilasso.fit(X, y, sample_weight=None)

# Show the coefficients
hilasso.coef_

# Show the p-values
hilasso.p_values_

# Show the intercept
hilasso.intercept_
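After fitting, the coefficients and p-values can be combined to list the features that pass a significance threshold. The following is a minimal sketch: the helper `select_significant` is hypothetical and not part of `hi_lasso`, and it assumes that the coefficients and p-values are array-like with one entry per feature, in the same order as the columns of `X`. The demo uses made-up numbers so that it runs without the package installed.

```python
def select_significant(feature_names, coefs, p_values, alpha=0.05):
    """Return (name, coefficient, p-value) tuples with p-value < alpha,
    sorted from most to least significant."""
    names = list(feature_names)
    coefs = list(coefs)
    p_values = list(p_values)
    if not (len(names) == len(coefs) == len(p_values)):
        raise ValueError("feature_names, coefs and p_values must have equal length")
    rows = [
        (name, float(c), float(p))
        for name, c, p in zip(names, coefs, p_values)
        if p < alpha
    ]
    return sorted(rows, key=lambda row: row[2])


# Made-up values standing in for a fitted model's output.
names = ["f1", "f2", "f3", "f4"]
coefs = [1.2, 0.0, -0.7, 0.05]
pvals = [0.001, 0.80, 0.03, 0.20]

selected = select_significant(names, coefs, pvals, alpha=0.05)
for name, coef, p in selected:
    print(f"{name}: coef={coef:+.2f}, p={p:.3f}")
```

With a fitted model, and if the assumption about shapes holds, the call would look like `select_significant(X.columns, hilasso.coef_, hilasso.p_values_, alpha=0.05)`.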