Implemented block HSIC lasso #9

Merged: 12 commits, Jul 18, 2018


hclimente
Collaborator

This PR implements block HSIC lasso, developed by @myamada0321. It is a variant of HSIC lasso that reduces memory usage by estimating HSIC on samples (blocks) of the data. If no blocks are specified, i.e. B = 0 in either HSICLasso.classification or HSICLasso.regression, conventional HSIC lasso is used. I added unit tests for the new block estimator and checked that the results do not change drastically. They do differ, though, most noticeably when only 5 features are selected.
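For readers unfamiliar with the idea, here is a minimal sketch of estimating HSIC on blocks of samples. It is not the code in this PR and it leaves out the lasso step entirely: the helper names `gaussian_kernel`, `hsic` and `block_hsic`, the Gaussian kernel, the `sigma` parameter and the use of consecutive (unshuffled) blocks are all assumptions made for illustration. Only the parameter `B`, with `B = 0` meaning "no blocks", comes from the description above.

```python
import numpy as np


def gaussian_kernel(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix of the samples in x (hypothetical helper)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC, tr(K H L H) / (n - 1)^2, on all n samples."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = H @ gaussian_kernel(x, sigma) @ H
    L = H @ gaussian_kernel(y, sigma) @ H
    return np.trace(K @ L) / (n - 1) ** 2


def block_hsic(x, y, B=0, sigma=1.0):
    """Average of HSIC computed on consecutive blocks of B samples.

    B = 0 falls back to the estimate on the full data. Only B x B Gram
    matrices are ever built, instead of one n x n matrix.
    """
    x, y = np.asarray(x), np.asarray(y)
    if B == 0:
        return hsic(x, y, sigma)
    if B < 2:
        raise ValueError("B must be 0 or at least 2")
    n_blocks = len(x) // B  # leftover samples are dropped in this sketch
    values = [
        hsic(x[i * B:(i + 1) * B], y[i * B:(i + 1) * B], sigma)
        for i in range(n_blocks)
    ]
    return float(np.mean(values))
```

The block estimate and the full estimate generally differ, which is in line with the small changes in the selected features mentioned above.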

Additionally, I removed some unnecessary files from the repository (.egg, dist, build) and updated .gitignore so that they are ignored in the future.

@hclimente
Collaborator Author

As agreed with @myamada0321, I fixed a crucial bug in pyHSIC lasso, where normalisation was dividing by np.sqrt(n - (1 / n)) instead of np.sqrt(float(n - 1) / n).
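To make the size of the difference concrete, the snippet below evaluates both expressions for a small n. It is only an illustration: the value n = 10 and the variable names are made up, and the link to `np.std` is shown because the factor sqrt((n - 1) / n) is the standard conversion between the sample (ddof=1) and population (ddof=0) standard deviation, not because the library is known to compute it this way.

```python
import numpy as np

n = 10  # illustrative sample size

# Expression that was used before the fix: sqrt(n - 1/n)
old_factor = np.sqrt(n - (1 / n))

# Expression after the fix: sqrt((n - 1) / n).
# The float() cast guards against integer division under Python 2.
new_factor = np.sqrt(float(n - 1) / n)

print(old_factor)  # about 3.146
print(new_factor)  # about 0.949

# The fixed factor converts a sample standard deviation (ddof=1)
# into a population standard deviation (ddof=0).
x = np.arange(n, dtype=float)
assert np.isclose(np.std(x, ddof=0), np.std(x, ddof=1) * new_factor)
```

The two factors differ by more than a factor of three already at n = 10, and the gap widens as n grows, since the old expression grows like sqrt(n) while the fixed one approaches 1.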

Additionally, I implemented a multiprocess version that computes the blocks in parallel. However, I ended up not using it, because the block lasso computation itself is faster than initialising a whole new process plus Python environment.

@suecharo suecharo mentioned this pull request Jul 18, 2018
@suecharo suecharo changed the base branch from master to feature/feature_V1.1.0 July 18, 2018 06:21
@suecharo suecharo merged commit 9d404a9 into riken-aip:feature/feature_V1.1.0 Jul 18, 2018