

Launching iPython Examples

Prerequisites:

  • Python 2.7

Install iPython Notebook

  1. Download pip, a Python package manager (if it's not already installed):

    $ sudo easy_install pip

  2. Install iPython using pip install:

    $ sudo pip install "ipython[notebook]"


Install dependencies

This module uses the requests and tabulate modules, both of which are available on PyPI, the Python Package Index.

$ sudo pip install requests
$ sudo pip install tabulate
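
Before opening a notebook, it can help to confirm that both modules are importable from the interpreter you intend to use. The following is a minimal sketch using only the standard library (it assumes Python 3.4 or later for `importlib.util.find_spec`); the helper name `missing_modules` is hypothetical and not part of H2O.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two dependencies named above.
REQUIRED = ["requests", "tabulate"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

If a module is reported as missing, rerun the matching pip install command for the same Python interpreter.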

Install and Launch H2O

To use H2O in Python, follow the instructions on the Install in Python tab after selecting the H2O version on the H2O Downloads page.

Launch H2O outside of the iPython notebook. You can do this in the top directory of your H2O build download. The running version of H2O must match the version of the H2O Python module; otherwise Python cannot connect to H2O. To access the H2O Web UI, go to http://localhost:54321 in your web browser.
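
One way to check the version requirement before opening a notebook is to ask the running H2O instance for its version and compare it with the installed Python module. The sketch below is an illustration under stated assumptions: it assumes H2O is served over plain HTTP on the default port 54321, that the REST endpoint `/3/Cloud` returns JSON with a `version` field, and it uses Python 3's standard library. The helper names are hypothetical.

```python
import json
from http.client import HTTPException
from urllib.request import urlopen


def cloud_status_url(host="localhost", port=54321):
    """Build the URL of the cluster status endpoint (assumed to be /3/Cloud)."""
    return "http://{}:{}/3/Cloud".format(host, port)


def running_h2o_version(host="localhost", port=54321, timeout=5):
    """Return the version string reported by a running H2O, or None if unreachable."""
    try:
        with urlopen(cloud_status_url(host, port), timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8")).get("version")
    except (OSError, ValueError, HTTPException):
        # Connection refused, timeout, or a reply that is not valid JSON.
        return None


def versions_match(module_version, server_version):
    """True only if the server answered and both version strings are identical."""
    return server_version is not None and module_version == server_version
```

With the H2O Python module installed, a call such as `versions_match(h2o.__version__, running_h2o_version())` returns False when H2O is not running or when the two versions differ.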


Open Demos Notebook

Open the prostate_gbm.ipynb file. The notebook contains a demo that starts H2O, imports a prostate dataset into H2O, builds a GBM model, and predicts on the training set with the newly built model. Use Shift+Return to execute each cell and proceed to the next cell in the notebook.

$ ipython notebook prostate_gbm.ipynb
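
For orientation, the steps the notebook walks through can be sketched as a single function. This is not the notebook's code: it assumes a recent H2O Python module that provides `H2OGradientBoostingEstimator`, it assumes the prostate CSV uses the column names shown, and `csv_path` and `ntrees=50` are illustrative placeholders. The notebook itself may use different calls or settings.

```python
# Response and predictor columns of the prostate dataset (assumed column names).
RESPONSE = "CAPSULE"
PREDICTORS = ["AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"]


def run_prostate_gbm(csv_path, ntrees=50):
    """Start or attach to H2O, train a GBM on the prostate data, predict on the training frame."""
    # Imported inside the function so this file loads even where h2o is not installed.
    import h2o
    from h2o.estimators.gbm import H2OGradientBoostingEstimator

    h2o.init()  # connects to a running H2O on localhost:54321, or starts one
    frame = h2o.import_file(csv_path)

    # Treat the response as categorical so the GBM is trained as a classifier.
    frame[RESPONSE] = frame[RESPONSE].asfactor()

    model = H2OGradientBoostingEstimator(ntrees=ntrees)
    model.train(x=PREDICTORS, y=RESPONSE, training_frame=frame)

    # Score the same frame the model was trained on, as the demo describes.
    return model.predict(frame)
```

Calling `run_prostate_gbm` with the path to a local copy of the prostate data requires a working H2O installation.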

All demos are available here:


Running Python Examples

To set up your Python environment to run these examples, download and install H2O for Python using the instructions above.

Available Demos

  • Predict Airline Delays - Uses historical airlines flight data to build multiple classification models to label any flight as either delayed or not delayed.
  • Chicago Crime Rate - Uses weather and city statistics to compare arrest rates with the total crimes for each category.
  • NYC Citibike Demand with Weather - Takes monthly bike ride data (~10 million rows) for the past two years to predict bike demand at each bike share station. Weather data is also incorporated to better predict bike usage.
  • NYC Citibike Demand with Weather - smaller dataset - Takes monthly bike ride data (~1 million rows) for the past two years to predict bike demand at each bike share station. Weather data is also incorporated to better predict bike usage.
  • Confusion Matrix & ROC - Creates a GBM and GLM model using the airlines dataset, including confusion matrices, ROCs, and scoring histories.
  • Imputation - Imputes (substitutes values for) missing data in the airlines dataset.
  • Not Equal Factor - Slices the airlines dataset using != factor_level.
  • Airline Confusion Matrices - Uses the airlines dataset to generate confusion matrices for algorithm performance analysis.
  • Deep Learning for Prostate Cancer Analysis - Uses the prostate dataset to build a Deep Learning model.
  • Airlines Prep - Conditions the airlines dataset by filtering out rows where the departure delay is unknown (NA). Any delay longer than minutesOfDelayWeTolerate is treated as delayed.
  • GBM model using prostate dataset - Creates a GBM model using the prostate dataset.
  • Balance Classes - Imports the airlines dataset, parses it, displays a summary, and runs GLM with a binomial link function.
  • Clustering with KMeans - Demonstrates kmeans clusters and different diagnostics for selecting the number of clusters. Link to data is provided in the notebook.
  • EEG Eye State - Uses EEG data collected from an Emotiv Neuroheadset and classifies eye state (open vs closed) with a GBM.
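
The labeling rule described for the Airlines Prep demo can be illustrated in plain Python. This is a minimal sketch on a list of delays rather than an H2O frame; the function name is hypothetical, the default threshold of 15 minutes is an assumption chosen for illustration, and unknown delays are represented as None.

```python
def label_delays(dep_delays, minutes_of_delay_we_tolerate=15):
    """Drop unknown delays (None), then label each remaining delay.

    A flight counts as delayed (True) only if its departure delay is
    strictly longer than the tolerated number of minutes.
    """
    known = [d for d in dep_delays if d is not None]
    return [d > minutes_of_delay_we_tolerate for d in known]


print(label_delays([0, None, 15, 16, 120]))  # [False, False, True, True]
```

The unknown value is removed before labeling, so the output has one entry fewer than the input.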

Corresponding Datasets

Airlines Datasets

Chicago Crime

Citibike Data

Prostate Data