Random Forest Methods In Python

Random Forests In Python

Introduction

I started this project to better understand how decision trees and random forests work. At this point the classifiers use only the Gini index as their split criterion, and the regression models use only the mean squared error. Both the classifiers and the regression models work with datasets that are lists of lists, where the target variable values are in the rightmost column. They also work with datasets stored as Pandas DataFrames and Pandas Series.
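
To make the two split criteria concrete, here is a minimal sketch of how each can be computed for a candidate split of a list-of-lists dataset. The function names `gini_index` and `mean_squared_error` and their signatures are illustrative assumptions, not the API of this project, and the project's own implementation may differ in detail.

```python
def gini_index(groups, classes):
    """Size-weighted Gini impurity of a candidate split (hypothetical helper).

    groups  -- list of groups; each group is a list of rows whose
               last element is the target value
    classes -- the distinct target values
    """
    n_total = float(sum(len(group) for group in groups))
    gini = 0.0
    for group in groups:
        size = float(len(group))
        if size == 0:
            continue  # an empty group contributes nothing
        labels = [row[-1] for row in group]
        # Sum of squared class proportions within this group
        score = sum((labels.count(c) / size) ** 2 for c in classes)
        gini += (1.0 - score) * (size / n_total)
    return gini


def mean_squared_error(groups):
    """Squared error of each target around its group's mean,
    averaged over all rows (hypothetical helper)."""
    n_total = float(sum(len(group) for group in groups))
    mse = 0.0
    for group in groups:
        if not group:
            continue
        targets = [row[-1] for row in group]
        mean = sum(targets) / float(len(targets))
        mse += sum((t - mean) ** 2 for t in targets) / n_total
    return mse


# A split that separates the classes perfectly has impurity 0.0 ...
pure = [[[1.0, 0], [2.0, 0]], [[3.0, 1], [4.0, 1]]]
# ... while a 50/50 mix in both groups gives 0.5 for two classes.
mixed = [[[1.0, 0], [2.0, 1]], [[3.0, 0], [4.0, 1]]]
print(gini_index(pure, [0, 1]))   # 0.0
print(gini_index(mixed, [0, 1]))  # 0.5
```

In both cases a lower value means a better split, so a tree builder would choose the feature and threshold that minimize the criterion.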

Dependencies

The dependencies for this project are minimal:

  1. Python 2.7
  2. Pandas
  3. NumPy
  4. Sphinx (for documentation only)

You can install all the dependencies except Python and Sphinx with pip by entering the following on the command line:

pip install -r requirements.txt

Example

>>> dataset = [[2.771244718, 1.784783929, 0],
...            [1.728571309, 1.169761413, 0],
...            [3.678319846, 2.81281357, 1],
...            [3.961043357, 2.61995032, 1],
...            [2.999208922, 2.209014212, 0],
...            [7.497545867, 3.162953546, 0],
...            [9.00220326, 3.339047188, 1],
...            [7.444542326, 0.476683375, 1],
...            [10.12493903, 3.234550982, 0],
...            [6.642287351, 3.319983761, 1]]
>>>
>>> import pandas as pd
>>> data_point = pd.Series([2.0, 23.0], index=['feature_1', 'feature_2'])
>>> df = pd.DataFrame(data=dataset, columns=['feature_1', 'feature_2', 'target'])
>>>
>>> from TreeMethods.DecisionTreeClassifier import DecisionTreeClassifier
>>> tree = DecisionTreeClassifier(max_depth=2,min_size=1)
>>> tree.fit(df,target='target')
>>>
>>> tree.predict(data_point)
0
>>>
>>> from TreeMethods.RandomForestClassifier import RandomForestClassifier
>>> forest = RandomForestClassifier(n_trees=10,
...                                 max_depth=5,
...                                 min_size=1)
>>> forest.fit(df, target='target')
>>> forest.predict(data_point)
0
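
The forest example above combines several trees into one prediction. A common way to do this, sketched below, is to train each tree on a bootstrap sample of the rows and take a majority vote over the trees' predictions. This describes the general random forest technique; the helper names `bootstrap_sample` and `majority_vote` are hypothetical, and whether TreeMethods follows exactly this scheme is an assumption.

```python
import random
from collections import Counter


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) rows with replacement (hypothetical helper)."""
    return [rng.choice(dataset) for _ in dataset]


def majority_vote(predictions):
    """Return the most frequent class label among the trees' predictions."""
    return Counter(predictions).most_common(1)[0][0]


rows = [[2.77, 1.78, 0], [3.67, 2.81, 1], [7.49, 3.16, 0], [9.00, 3.33, 1]]
rng = random.Random(0)  # seeded so the sample is reproducible

# Each tree would be fit on its own resampled copy of the data
sample = bootstrap_sample(rows, rng)

# Suppose three trees voted 1, 0 and 1 for a data point
print(majority_vote([1, 0, 1]))  # 1
```

Because every tree sees a different resampled dataset, the individual trees disagree on some points, and the vote tends to smooth out their individual errors.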

Testing

To test the code, run the following command from the terminal in the RandomForest directory:

py.test tests

More tests will be added in the near future.

Documentation

To build the documentation on your local machine, run the following command from the RandomForest directory:

sphinx-apidoc -F -o doc/ TreeMethods/

Then cd into the doc/ directory and run:

make html

The HTML documentation will be written to the directory _build/html/. Open the file index.html to view it.

References