Rapid Machine Learning Prototyping in Python
Latest commit 8618ce6 by Ken Van Haren on Nov 11, 2015: Outline for new docs

Ramp - Rapid Machine Learning Prototyping

Ramp is a Python library for rapid prototyping of machine learning solutions. It is a lightweight, pandas-based machine learning framework that plugs into existing Python machine learning and statistics tools (scikit-learn, rpy2, etc.). Ramp provides a simple, declarative syntax for exploring features, algorithms, and transformations quickly and efficiently.


Why Ramp?

  • Clean, declarative syntax

  • Complex feature transformations

    Chain and combine features:

        Interactions([Log('x1'), (F('x2') + F('x3')) / 2])

    Reduce feature dimension:

        DimensionReduction([F('x%d' % i) for i in range(100)], decomposer=PCA(n_components=3))

    Incorporate residuals or predictions to blend with other models:

        Residuals(simple_model_def) + Predictions(complex_model_def)

  • Data context awareness

    Any feature that uses the target ("y") variable automatically respects the current training and test sets. Similarly, preparation data (a feature's mean and standard deviation, for example) is stored and tracked between data contexts.

  • Composability

    All features, estimators, and their fits are composable, pluggable and storable.

  • Easy extensibility

    Ramp has a simple API, allowing you to plug in estimators from scikit-learn, rpy2 and elsewhere, or easily build your own feature transformations, metrics, feature selectors, reporters, or estimators.
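
The data context idea described in the list above can be illustrated without Ramp itself. The following is a minimal sketch in plain Python 3, not Ramp's implementation: the helper names `fit_normalizer` and `apply_normalizer` are hypothetical. Preparation data (a mean and standard deviation) is computed from the training rows only, stored, and then reused on the test rows, so no test-set information is used during preparation.

```python
import statistics


def fit_normalizer(train_values):
    """Compute preparation data (mean, stdev) from the training rows only."""
    return {
        "mean": statistics.mean(train_values),
        "stdev": statistics.pstdev(train_values),
    }


def apply_normalizer(values, prep):
    """Normalize values using previously stored preparation data."""
    # Fall back to a scale of 1.0 for a constant feature (zero stdev).
    scale = prep["stdev"] or 1.0
    return [(v - prep["mean"]) / scale for v in values]


# Hypothetical feature values split into a training and a test context.
train = [1.0, 2.0, 3.0, 4.0]
test = [5.0, 6.0]

prep = fit_normalizer(train)               # stored once per training context
train_norm = apply_normalizer(train, prep)
test_norm = apply_normalizer(test, prep)   # reuses the training statistics
```

Refitting `fit_normalizer` on a different training fold would give different stored statistics, which is why preparation data has to be tracked per data context.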

Quick start

Getting started with Ramp: Classifying insults

Or, the quintessential Iris example:

import pandas
from ramp import *
import urllib2
import sklearn
from sklearn import decomposition

# fetch and clean iris data from UCI
# iris_url is a placeholder name for the UCI iris data URL
data = pandas.read_csv(urllib2.urlopen(iris_url))
data = data.drop([149]) # bad line
columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']
data.columns = columns

# all features
features = [FillMissing(f, 0) for f in columns[:-1]]

# features, log transformed features, and interaction terms
expanded_features = (
    features +
    [Log(F(f) + 1) for f in features] +
    [
        F('sepal_width') ** 2,
    ]
)

# Define several models and feature sets to explore,
# run 5 fold cross-validation on each and print the results.
# We define 2 models and 4 feature sets, so this will be
# 4 * 2 = 8 models tested.

    # report feature importance scores from Random Forest

    # Try out two algorithms

    # and 4 feature sets

        # Feature selection
            # use random forest's importance to trim
            target=AsFactor('class'), # target to use
            n_keep=5, # keep top 5 features

        # Reduce feature dimension (pointless on this dataset)

        # Normalized features
        [Normalize(f) for f in expanded_features],
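
The model count in the comments above comes from pairing every estimator with every feature set. As a rough illustration using only the standard library, the sketch below expands such a grid; the string labels are placeholders standing in for real estimators and feature lists, and this is not necessarily how Ramp builds its configurations internally.

```python
from itertools import product

# Placeholder labels standing in for real estimators and feature lists.
estimators = ["random_forest", "logistic_regression"]
feature_sets = ["expanded", "top_5_selected", "pca_reduced", "normalized"]
n_folds = 5

# Every estimator is paired with every feature set.
configs = list(product(estimators, feature_sets))

for estimator, feature_set in configs:
    print("%s + %s" % (estimator, feature_set))

# With k-fold cross-validation, each configuration is fit once per fold.
total_fits = len(configs) * n_folds
print("%d configurations, %d fits" % (len(configs), total_fits))
```

Adding one more estimator or feature set grows the grid multiplicatively, so the number of fits is worth estimating before launching a large exploration.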


Status

Ramp is currently in alpha, so expect bugs, bug fixes, and API changes.


Requirements

  • NumPy
  • SciPy
  • pandas
  • PyTables
  • scikit-learn
  • gensim
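
Before trying the examples, it can help to confirm that the libraries listed above are importable. The following is a small sketch for Python 3, not part of Ramp: the `check_requirements` helper and the name-to-module mapping are assumptions made for illustration (for example, PyTables is imported as `tables` and scikit-learn as `sklearn`).

```python
import importlib.util

# Assumed mapping from the listed library names to their import names.
REQUIRED_MODULES = {
    "NumPy": "numpy",
    "SciPy": "scipy",
    "pandas": "pandas",
    "PyTables": "tables",
    "scikit-learn": "sklearn",
    "gensim": "gensim",
}


def check_requirements(modules=REQUIRED_MODULES):
    """Return {library name: bool} telling whether each module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for name, available in check_requirements().items():
        print("%-12s %s" % (name, "ok" if available else "MISSING"))
```

The check only looks modules up without importing them, so it says nothing about whether the installed versions are compatible with Ramp.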


Author

Ken Van Haren. Email with feedback/questions: @squaredloss

