A small suite of functions for calculating information-entropy-based measures.
Information entropy is a core concept in machine learning and data science, and it is central to decision-tree algorithms such as ID3 and C4.5. The functions have been carefully implemented in a test-driven (TDD) style, and auto-generated documentation is available in the documentation folder.
The definitions given below can be found in Han, Kamber, and Pei's fantastic book, *Data Mining: Concepts and Techniques*.
The expected information needed to classify a data point in D, also called the entropy of D, is given by Info(D) = −Σ_{i=1}^{m} p_i log2(p_i), where p_i is the proportion of points in D belonging to class i and m is the number of classes.
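As a minimal sketch of this definition, entropy can be computed from a list of class labels; the function name `entropy` here is illustrative and not necessarily the name used by this library:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Info(D): expected information (in bits) needed to classify
    a data point, given the class labels of the data set D."""
    total = len(labels)
    # Each n / total is p_i, the proportion of D belonging to class i.
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

# Two equally likely classes carry exactly one bit of information.
print(entropy(["yes", "yes", "no", "no"]))  # → 1.0
```

A pure data set (all points in one class) has entropy 0, the minimum; the more evenly the classes are mixed, the higher the entropy.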
Information gain is defined as the difference between the original information requirement (i.e., based only on the proportions of the classes in D) and the new requirement (i.e., obtained after partitioning D on attribute A): Gain(A) = Info(D) − Info_A(D), where Info_A(D) is the weighted average entropy of the partitions.
The split information represents the potential information generated by splitting the training data set, D, into v partitions corresponding to the v outcomes of a test on attribute A: SplitInfo_A(D) = −Σ_{j=1}^{v} (|D_j|/|D|) log2(|D_j|/|D|).
The gain ratio is the result of dividing the information gain by the split information, GainRatio(A) = Gain(A) / SplitInfo_A(D), which corrects for the bias of information gain towards tests with many outcomes.
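Putting the three definitions together, here is a self-contained sketch; the function names are illustrative assumptions, not this library's actual API, and each partition is represented simply as a list of class labels:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Info(D): expected information needed to classify a point in D."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(partitions):
    """Gain(A) = Info(D) - Info_A(D), where the partitions are the
    subsets of D produced by the v outcomes of a test on attribute A."""
    d = [label for part in partitions for label in part]
    info_a = sum(len(part) / len(d) * entropy(part) for part in partitions)
    return entropy(d) - info_a

def split_information(partitions):
    """SplitInfo_A(D): the entropy of the partition sizes themselves."""
    total = sum(len(part) for part in partitions)
    return -sum((len(part) / total) * log2(len(part) / total)
                for part in partitions)

def gain_ratio(partitions):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D)."""
    return information_gain(partitions) / split_information(partitions)

# A perfect split: each partition is pure, so the full bit of
# class entropy is gained, over two equal-sized partitions.
parts = [["yes", "yes"], ["no", "no"]]
print(information_gain(parts))   # → 1.0
print(split_information(parts))  # → 1.0
print(gain_ratio(parts))         # → 1.0
```

Note how the split information acts as a normalizer: a test that shatters D into many tiny partitions has a large SplitInfo_A(D), which shrinks its gain ratio relative to its raw information gain.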