EvgeniDubov/hellinger-distance-criterion

Hellinger Distance Criterion

Hellinger Distance criterion for sklearn Random Forest and Decision Tree classifiers
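
For readers unfamiliar with the measure, here is a minimal pure-Python sketch of the Hellinger distance commonly used to score a binary split on a two-class problem. It is not this repository's Cython implementation; the function name `hellinger_split_distance` and its parameters are hypothetical and chosen only for illustration. For each child node it compares the fraction of all positives that land there with the fraction of all negatives that land there.

```python
import math


def hellinger_split_distance(labels, goes_left, positive=1):
    """Hellinger distance between how the positive and the negative class
    are distributed over the two children of a binary split."""
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0  # only one class present, so there is nothing to separate

    total = 0.0
    for side in (True, False):  # left child, then right child
        pos_side = sum(
            1 for y, left in zip(labels, goes_left) if left == side and y == positive
        )
        neg_side = sum(
            1 for y, left in zip(labels, goes_left) if left == side and y != positive
        )
        # Compare class-conditional fractions, not raw counts
        total += (math.sqrt(pos_side / n_pos) - math.sqrt(neg_side / n_neg)) ** 2
    return math.sqrt(total)


# A split that isolates the single positive sample among nine negatives
labels = [1] + [0] * 9
goes_left = [True] + [False] * 9
print(hellinger_split_distance(labels, goes_left))
```

Because the value depends only on per-class fractions, a split that separates the classes perfectly reaches the maximum of sqrt(2) no matter how rare the positive class is, which is why this criterion is often suggested for imbalanced data.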

I'm working on adding this to scikit-learn-contrib/imbalanced-learn (PR #437).

Build

Building the extension requires a Cython "header" file (.pxd) from sklearn.

If you installed sklearn from a source code package, you already have it.

If you installed sklearn with pip (pip install scikit-learn), you need to obtain the header separately.

python setup.py build_ext --inplace
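
To check whether the header is already present before running the build, a small script like the following may help. It is only a sketch: it assumes the header in question is `tree/_criterion.pxd` inside the installed `sklearn` package, which this README does not state explicitly, and the helper name `find_package_file` is hypothetical.

```python
import importlib.util
from pathlib import Path


def find_package_file(package, relative_path):
    """Return the path of a file shipped inside an installed package,
    or None if the package or the file cannot be found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package is not installed, or is not a package directory
    for location in spec.submodule_search_locations:
        candidate = Path(location) / relative_path
        if candidate.is_file():
            return candidate
    return None


# Assumed header location; adjust if your sklearn version lays files out differently
pxd = find_package_file("sklearn", "tree/_criterion.pxd")
print(pxd if pxd else "header not found in the installed sklearn package")
```

If the script prints a path, the build command above should be able to find the header in the installed package.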

Example

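>>> import numpy as np
>>> from hellinger_distance_criterion import HellingerDistanceCriterion
>>> from sklearn.datasets import make_classification
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.model_selection import train_test_split
>>>
>>> # Illustrative imbalanced binary dataset; substitute your own data
>>> X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
>>>
>>> # Arguments: number of outputs (1) and number of classes per output (2 for binary)
>>> hdc = HellingerDistanceCriterion(1, np.array([2], dtype='int64'))
>>> clf = RandomForestClassifier(criterion=hdc, max_depth=4, n_estimators=100)
>>> clf.fit(X_train, y_train)
>>> print('accuracy with hellinger distance criterion: ', clf.score(X_test, y_test))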
