This repository provides Python scripts demonstrating various machine learning tools.

The Round Pegs and Square Holes of Big Data - Code Scripts

The following scripts are written in Python and correspond to various examples on our website on big data. The examples use Matplotlib and scikit-learn (referred to below as SciKit), along with the sample datasets that ship with scikit-learn.

Basic Regression Examples

Referred to here. Shows the advantage, in some instances, of using random forests when working with noisy data.

Demonstrates the use of SciKit's DecisionTreeRegressor to predict sine values. Also uses Graphviz to export the decision tree and pyplot to plot the predictions.
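
A minimal sketch of this kind of script, not the repository's actual code. The sample size, noise pattern, `max_depth=4` and all variable names are assumptions chosen for illustration. It fits a tree to noisy sine samples, exports the tree as Graphviz DOT text, and builds a plot of the predictions.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_graphviz

# Assumed data: 80 points on [0, 5) with noise added to every 5th sample.
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 0.5 - rng.rand(16)

# Fit a depth-limited tree (the depth is an illustrative choice).
tree = DecisionTreeRegressor(max_depth=4, random_state=0)
tree.fit(X, y)

# Predict on a dense grid so the step shape of the tree is visible.
X_test = np.linspace(0, 5, 500).reshape(-1, 1)
y_pred = tree.predict(X_test)

# With out_file=None, export_graphviz returns the DOT source as a string.
dot_source = export_graphviz(tree, out_file=None, feature_names=["x"])

# Build the plot; call plt.show() to display it interactively.
fig, ax = plt.subplots()
ax.scatter(X, y, s=15, label="noisy samples")
ax.plot(X_test, y_pred, color="red", label="tree prediction")
ax.legend()
```

The DOT text in `dot_source` can be written to a file and rendered with the Graphviz `dot` command-line tool, if it is installed.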

Demonstrates the use of SciKit's RandomForestRegressor to predict sine values more accurately than a single decision tree. Uses pyplot to plot the predictions.
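
One way to check the comparison numerically is sketched here, under assumed settings (200 samples, Gaussian noise with standard deviation 0.3, 100 trees). It measures each model's mean squared error against the noise-free sine curve. The numbers depend on these choices, so treat it as an illustration rather than the repository's own experiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Assumed noisy training data drawn around sin(x).
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(200, 1), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)

# Evaluate against the clean sine curve on a dense grid.
X_test = np.linspace(0, 5, 500).reshape(-1, 1)
y_true = np.sin(X_test).ravel()

# A fully grown single tree versus an ensemble of 100 trees.
tree = DecisionTreeRegressor(random_state=0).fit(X, y)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

tree_mse = mean_squared_error(y_true, tree.predict(X_test))
forest_mse = mean_squared_error(y_true, forest.predict(X_test))
print(f"tree MSE:   {tree_mse:.4f}")
print(f"forest MSE: {forest_mse:.4f}")
```

Averaging many trees trained on bootstrap samples tends to smooth out the noise that a single fully grown tree fits exactly.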

Case Study - Digits Dataset

Referred to here. A more interesting example: improving on the performance of a standard decision tree classifier when the data is noisy and higher-dimensional.

Uses a standard DecisionTreeClassifier from SciKit to build a handwriting recognition model.
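
A minimal sketch of such a model, assuming the digits sample dataset bundled with scikit-learn and a 75/25 train/test split. The split and variable names are illustrative choices, not taken from the repository's script.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1797 samples, each an 8x8 image flattened to 64 features.
digits = load_digits()

# Hold out a quarter of the data for testing (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# Fit a single decision tree with default settings.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Fraction of held-out digits classified correctly.
tree_accuracy = clf.score(X_test, y_test)
print(f"decision tree accuracy: {tree_accuracy:.3f}")
```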

Uses a RandomForestClassifier from SciKit, which improves on the accuracy of the decision tree model above.
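
The two classifiers can be compared side by side with cross-validation, as sketched below. The 5-fold setup and `n_estimators=100` are assumptions for illustration, and the exact scores will vary with these settings.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

digits = load_digits()
X, y = digits.data, digits.target

# Mean accuracy over 5 folds for each model (assumed evaluation setup).
tree_cv = cross_val_score(
    DecisionTreeClassifier(random_state=0), X, y, cv=5
).mean()
forest_cv = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0), X, y, cv=5
).mean()

print(f"decision tree: {tree_cv:.3f}")
print(f"random forest: {forest_cv:.3f}")
```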

Lets the user view a data point from the 64-dimensional data as an image (via pyplot), so that it can be matched to a handwritten digit. This is useful for inspecting the quality of the data manually.
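
A sketch of how such a viewer might look. The helper name `digit_figure` is hypothetical, not the repository's. Each 64-value row is reshaped back into its 8x8 pixel grid and drawn with pyplot.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits

digits = load_digits()


def digit_figure(index):
    """Return a figure showing sample `index` as an 8x8 grayscale image."""
    # Each row of digits.data holds the 64 pixel intensities of one image.
    image = digits.data[index].reshape(8, 8)
    fig, ax = plt.subplots(figsize=(3, 3))
    ax.imshow(image, cmap="gray_r", interpolation="nearest")
    ax.set_title(f"sample {index}, label {digits.target[index]}")
    ax.axis("off")
    return fig


# Build the figure for the first sample; call plt.show() to display it.
fig = digit_figure(0)
```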
