Data sets for machine learning in Python

skdata (scikit-data)

Skdata is a library of data sets for machine learning and statistics. This
module provides standardized Python access to toy problems as well
as popular computer vision and natural language processing data sets.

The project is hosted on GitHub:


There are several options for installation:

  * From scratch:

      * pip install --user

  * From a fresh git checkout:

      * python develop

      * python install
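
After any of the installation options above, a quick way to confirm that the package landed on the import path is to ask Python whether it can locate it. The sketch below is only an illustration: it assumes the import name is `skdata` (inferred from the project name, not confirmed here), and `is_installed` is a hypothetical helper, not part of the library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package is importable under the name "skdata".
    name = "skdata"
    status = "found" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```

The check only locates the module without importing it, so it does not trigger any side effects the package might have at import time.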




Join the mailing list:!forum/skdata


GitHub maintains an up-to-date list of direct contributors:

Special thanks go to David Cox, who provided inspiration and design
guidance, and generally got this project started.

This work was supported in part by the National Science Foundation (IIS-0963668).
