Pandas-Bin-Continuous

A Pandas extension for encoding binary features from binned continuous values.

Keeping continuous variables as floating-point or integer values can sometimes throw off your model weights. It may be more useful to encode each continuous variable as separate binary features, one per bin, indicating which bin the value falls into. That is what this extension is meant to do.

Installation

git clone https://github.com/jtloong/pandas-bin-continuous
cd pandas-bin-continuous
pip install .

Usage

import pandas_bin_continuous
dataframe = pandas_bin_continuous.create_features(dataframe, bin_edges, feature_name)

Here bin_edges is a list of the bin boundaries, and feature_name is the name of the column to bin. bin_edges follows the same conventions as the bins argument of pandas.cut().
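To make the bin_edges convention concrete, here is a minimal sketch of the general technique using only pandas.cut() and pandas.get_dummies(). It is not this extension's actual code: the helper name bin_to_binary_features, the "age" column, and the sample values are all hypothetical, chosen only for illustration.

```python
import pandas as pd


def bin_to_binary_features(df, bin_edges, feature_name):
    """Hypothetical helper (not part of pandas_bin_continuous).

    Bins one continuous column and appends one binary column per bin.
    """
    # pd.cut assigns each value to an interval; by default intervals are
    # closed on the right, e.g. edges [0, 18, 65] give (0, 18] and (18, 65].
    binned = pd.cut(df[feature_name], bins=bin_edges)
    # One indicator column per interval, prefixed with the feature name.
    dummies = pd.get_dummies(binned, prefix=feature_name)
    return pd.concat([df, dummies], axis=1)


# Made-up sample data for illustration only.
df = pd.DataFrame({"age": [5, 17, 30, 64, 80]})
bin_edges = [0, 18, 65, 100]

result = bin_to_binary_features(df, bin_edges, "age")
print(result)
```

With these edges, each row ends up with exactly one indicator set, since every sample value falls inside one of the three intervals. Values outside the outermost edges would get no indicator set at all, so the edges should span the full range of the column.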

See example.ipynb for a fuller usage example.

Notes

I realized after making this that sklearn has a Binarizer class in its preprocessing module. However, I think this extension is slightly more useful for features with a wide range of continuous values that you need to separate into many bins.
