SciKit-Learn Laboratory


This Python package provides utilities to make it easier to run machine learning experiments with scikit-learn.

Command-line Interface

run_experiment is a command-line utility for running a series of learners on datasets specified in a configuration file. For more information about using run_experiment (including a quick example), see the run_experiment documentation.
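Configuration files use standard INI syntax. The sketch below is illustrative only: experiment_name and task are required fields under the General heading (see the v0.9.15 notes in the changelog), while the other section and field names shown are assumptions; consult the run_experiment documentation for the authoritative list.

```ini
; Hypothetical example config. Every name other than experiment_name and
; task is an assumption; see the run_experiment documentation for details.
[General]
experiment_name = Titanic_Evaluate
task = evaluate

[Input]
train_location = train
test_location = dev
featuresets = [["family", "misc"]]
learners = ["LogisticRegression", "RandomForestClassifier"]

[Output]
results = output
```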

Python API

If you just want to avoid writing a lot of boilerplate learning code, you can use our simple Python API. The main way you'll want to use the API is through the load_examples function and the Learner class. For more details on how to train, test, cross-validate, and run grid search on a variety of scikit-learn models, see the documentation.
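For illustration, here is a minimal sketch of that workflow, assuming .jsonlines feature files on disk. Only load_examples and Learner come from the text above; the method names and keyword arguments are assumptions, so check the API documentation for exact signatures.

```python
# Illustrative sketch only: method names and keyword arguments beyond
# load_examples and Learner are assumptions; consult the API docs.
from skll import Learner, load_examples

# Load feature files (e.g., .jsonlines, .megam, or .tsv)
train_examples = load_examples('train.jsonlines')
test_examples = load_examples('test.jsonlines')

# model_type is a required argument to the Learner constructor
learner = Learner('LogisticRegression')

# Train (optionally tuning via grid search), then evaluate on held-out data
learner.train(train_examples)
results = learner.evaluate(test_examples)
```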

A Note on Pronunciation

SciKit-Learn Laboratory (SKLL) is pronounced "skull": that's where the learning happens.



You can view the slides for the talk Dan Blanchard gave at PyData NYC 2013 here.


Changelog

  • v0.21.0
    • Added support for ElasticNet, Lasso, and LinearRegression learners.
    • Reorganized examples, and created new example based on the Kaggle Titanic data set.
    • Added ability to easily create multiple files at once when using write_feature_file.
    • Added support for the .ndj file extension for new-line delimited JSON files. It's the same format as .jsonlines, just with a different name.
    • Added support for comments and skipping blank lines in .jsonlines files.
    • Made some efficiency tweaks when creating logging messages.
    • Made labels in .results files a little clearer for objective function scores.
    • Fixed some misleading error messages.
    • Fixed issue with backward-compatibility unit test in Python 2.7.
    • Fixed issue where predict mode required data to already be labelled.
  • v0.20.0
    • Refactored experiments module to remove unnecessary child processes, and greatly simplify ablation code. This should fix issues #73 and #49.
    • Deprecated run_ablation function, as its functionality has been folded into run_configuration.
    • Removed ability to run multiple configuration files in parallel, since this led to too many processes being created most of the time.
    • Added ability to run multiple ablation experiments from the same configuration file by adding support for multiple featuresets.
    • Added min_feature_count value to results files, which fixes #62.
    • Added more informative error messages when we run out of memory while converting things to dense. They now say why something was converted to dense in the first place.
    • Added option to skll_convert for creating ARFF files that can be used for regression in Weka. Previously, files would always contain non-numeric labels, which would not work with Weka.
    • Added ability to name relation in output ARFF files with skll_convert.
    • Added class_map setting for collapsing multiple classes into one (or just renaming them). See the run_experiment documentation for details.
    • Added warning when using SVC with probability flag set (#2).
    • Made logging much less verbose by default and switched to using QueueHandler and QueueListener instances when dealing with multiple processes/threads to prevent deadlocks (#75).
    • Added simple no-crash unit test for all learners. We check results with some, but not all. (#63)
  • v0.19.0
    • Added support for running ablation experiments with all combinations of features (instead of just holding out one feature at a time) via run_experiment --ablation_all. As a result, we've also changed the names of the ablated_feature column in result summary files to ablated_features.
    • Added ARFF and CSV file support across the board. As a result, all instances of the parameter tsv_label have now been replaced with label_col.
    • Fixed issue #71.
    • Fixed process leak that was causing sporadic issues.
    • Removed arff_to_megam, csv_to_megam, megam_to_arff, and megam_to_csv because they are all superseded by ARFF and CSV support in skll_convert.
    • Switched to using Anaconda for installing Atlas.
    • Switched back to URLs for documentation, now that rtfd/ has been fixed.
  • v0.18.1
    • Updated generate_predictions to use latest API.
    • Switched to using multiprocessing-compatible logging. This should fix some intermittent deadlocks.
    • Switched to using miniconda for installing Python on Travis-CI.
  • v0.18.0
    • Fixed crash when modelpath is blank and task is not cross_validate.
    • Fixed crash with convert_examples when given a generator.
    • Refactored the private _*_dict_iter functions to be classes to reduce code duplication.
  • v0.17.1
    • Fixed crash with SVR on Python 3 from kernel type being a byte string.
    • Fixed crash with DecisionTreeRegressor due to an invalid criterion being set.
  • v0.17.0
    • Fixed issue where requirements weren't being installed via pip.
    • Added SKLL version number and Pearson correlation to result summary files.
    • No longer crash if a result summary file doesn't exist, and instead just print an error message.
    • Tweak handling of logging under the hood to make sure logging settings are applied to all loggers.
  • v0.16.1
    • Fixed crash with GradientBoostingRegressor and MultinomialNB from typo in previous release.
    • Fixed crash related to loading feature sets that contain files that are unlabelled.
    • Fixed crash related to loading .megam files with unlabelled examples.
  • v0.16.0
    • Added new versions of kappa metrics that make it so adjacent ratings are not penalized. For example, 1 and 2 will be considered to be equal, whereas 1 and 3 will have a difference of 1 when building the weights matrix.
    • Cleaned up a bit of the Sphinx documentation.
    • Each module now has its own separate logger, which should make logging messages more informative.
    • Made handling of non-convertible IDs when using ids_to_floats uniform across data file types. All will now raise a ValueError when faced with a string that does not look like a float.
    • Now raise an error when duplicate feature names are encountered in .megam files.
    • No longer set compute_importances for learners based on decision trees, since that is no longer necessary as of scikit-learn 0.14.
  • v0.15.0
    • Added support for DecisionTreeRegressor and RandomForestRegressor.
    • Fixed issue #60 with filtering examples via cv_folds_location file.
    • Added unit tests for Grid Map mode on Travis using the scripts described by this gist.
    • Switched from using mix-in to decorator for handling rescaled versions of regressors. Code's a lot simpler now.
    • Added support for suppressing "Loading..." messages to most functions in the experiments module.
    • Refactored convert_examples to simply call new version of load_examples that can take a list of example dictionaries in addition to filenames.
    • Made all unit tests much less verbose.
    • Fixed an obscure issue related to loading examples when SKLL_MAX_CONCURRENT_PROCESSES is set to 1.
  • v0.14.0
    • Added warning when configuration files contain settings that are invalid.
    • Fixed a crash because job_results was not defined in grid-mode.
    • Cleaned up a lot of things related to unit tests and their discovery.
    • Added unit tests to manifest so that people who install this through pip could run the tests themselves if they wanted.
  • v0.13.2
    • Now raise an exception when using ids_to_floats with non-numeric IDs.
    • Fixed a number of inconsistencies with cv_folds_location and ids_to_floats (including GH issue #57).
    • Fixed unit tests for cv_folds_location and ids_to_floats so that they actually test the right things now.
  • v0.13.1
    • Fixed crash when using cv_folds_location with ids_to_floats.
  • v0.13.0
    • Will now skip IDs that are missing from cv_folds/grid_search_folds dicts and print a warning instead of crashing.
    • Added additional kappa unit tests to help detect/prevent future issues.
    • API change: model_type is no longer a keyword argument to Learner constructor, and is now required. This was done to help prevent unexpected issues from defaulting to LogisticRegression.
    • No longer keep extra temporary config files around when running run_experiment in ablation mode.
  • v0.12.0
    • Fixed crash with kappa when given two sets of ratings that are both missing an intermediate value (e.g., [1, 2, 4]).
    • Added summarize_results script for creating a nice summary TSV file from a list of JSON results files.
    • Summary files for ablation studies now have an extra column that says which feature was removed.
  • v0.11.0
    • Added initial version of skll_convert script for converting between .jsonlines, .megam, and .tsv data file formats.
    • Fixed bug in _megam_dict_iter where labels for instances with all zero features were being incorrectly set to None.
    • Fixed bug in _tsv_dict_iter where features with zero values were being retained with values set as '0' instead of being removed completely. This caused DictVectorizer to create extra features, so results may change a little bit if you were using .tsv files.
    • Fixed crash with predict and train_only modes when running on the grid.
    • No longer use process pools to load files if SKLL_MAX_CONCURRENT_PROCESSES is 1.
    • Added more informative error message when trying to load a file without any features.
  • v0.10.1
    • Made processes non-daemonic to fix issue with running multiple configuration files at the same time with run_experiment.
  • v0.10.0
    • run_experiment can now take multiple configuration files.
    • Fixed issue where model parameters and scores were missing in evaluate mode.
  • v0.9.17
    • Added function to convert a list of dictionaries to an ExamplesTuple.
    • Added a new optional field to configuration file, ids_to_floats, to help save memory if you have a massive number of instances with numeric IDs.
    • Replaced use_dense_features and scale_features options with feature_scaling. See the run_experiment documentation for details.
  • v0.9.16
    • Fixed summary output for ablation experiments. Previously summary files would not include all results.
    • Added ablation unit tests.
    • Fixed issue with generating PDF documentation.
  • v0.9.15
    • Added two new required fields to the configuration file format under the General heading: experiment_name and task. See the run_experiment documentation for details.
    • Fixed an issue where the "loading..." message was never being printed when loading data files.
    • Fixed a bug where keyword arguments were being ignored for metrics when calculating final scores for a tuned model. This means that previously reported results may be wrong for tuning metrics that use keyword arguments: f1_score_micro, f1_score_macro, linear_weighted_kappa, and quadratic_weighted_kappa.
    • Now try to convert IDs to floats if they look like floats, to save memory for very large files.
    • kappa now supports negative ratings.
    • Fixed a crash when specifying grid_search_jobs and pre-specified folds.
  • v0.9.14
    • Hotfix to fix issue where grid_search_jobs setting was being overridden by grid_search_folds.
  • v0.9.13
    • Added a write_feature_file function (also available as skll.write_feature_file) to simplify outputting .jsonlines, .megam, and .tsv files.
    • Added more unit tests for handling .megam and .tsv files.
    • Fixed a bug that caused a crash when using gridmap.
    • grid_search_jobs now sets both n_jobs and pre_dispatch for GridSearchCV under the hood. This prevents a potential memory issue when dealing with large datasets and learners that cannot handle sparse data.
    • Changed logging format when using run_experiment to be a little more readable.
  • v0.9.12
    • Fixed serious issue where merging feature sets was not working correctly. All experiments conducted using feature set merging (i.e., where you specified a list of feature files and had them merged into one set for training/testing) should be considered invalid. In general, your results should previously have been poor and now should be much better.
    • Added more verbose regression output including descriptive statistics and Pearson correlation.
  • v0.9.11
    • Fixed all known remaining compatibility issues with Python 3.
    • Fixed bug in skll.metrics.kappa which would raise an exception if full range of ratings was not seen in both y_true and y_pred. Also added a unit test to prevent future regressions.
    • Added missing configuration file that would cause a unit test to fail.
    • Slightly refactored skll.Learner._create_estimator to make it a lot simpler to add new learners/estimators in the future.
    • Fixed a bug in handling of sparse matrices that would cause a crash if the number of features in the training and the test set were not the same. Also added a corresponding unit test to prevent future regressions.
    • We now require the backported configparser module for Python 2.7 to make maintaining compatibility with both 2.x and 3.x a lot easier.
  • v0.9.10
    • Fixed bug introduced in v0.9.9 that broke predict mode.
  • v0.9.9
    • Automatically generate a result summary file with all results for experiment in one TSV.
    • Fixed bug where printing predictions to file would cause a crash with some learners.
    • Run unit tests for Python 3.3 as well as 2.7.
    • More unit tests for increased coverage.
  • v0.9.8
    • Fixed crash due to trying to print name of grid objective which is now a str and not a function.
    • Added --version option to shell scripts.
  • v0.9.7
    • Can now use any objective function scikit-learn supports for tuning (i.e., any valid argument for scorer when instantiating GridSearchCV) in addition to those we define.
    • Removed ml_metrics dependency and we now support custom weights for kappa (through the API only so far).
    • Requires scikit-learn 0.14+.
    • accuracy, quadratic_weighted_kappa, unweighted_kappa, f1_score_micro, and f1_score_macro functions are no longer available under skll.metrics. The accuracy and f1 score ones are no longer needed because we just use the built-in ones. As for quadratic_weighted_kappa and unweighted_kappa, they've been superseded by the kappa function that takes a weights argument.
    • Fixed issue where you couldn't write prediction files if you were classifying using numeric classes.
  • v0.9.6
    • Fixed issue with importing from package when trying to install it (for real this time).
  • v0.9.5
    • You can now include feature files that don't have class labels in your featuresets. At least one feature file has to have a label though, because we only support supervised learning so far.
    • Important: If you're using TSV files in your experiments, you should either name the class label column 'y' or use the new tsv_label option in your configuration file to specify the name of the label column. This was necessary to support feature files without labels.
    • Fixed an issue with how the version number was being imported that would prevent installation if you didn't already have the prereqs installed on your machine.
    • Made random seeds smaller to fix crash on 32-bit machines. This means that experiments run with previous versions of skll will yield slightly different results if you re-run them with v0.9.5+.
    • Added megam_to_csv for converting .megam files to CSV/TSV files.
    • Fixed a potential rounding problem with csv_to_megam that could slightly change feature values in conversion process.
    • Cleaned up a little bit.
    • Updated documentation to include missing fields that can be specified in config files.
  • v0.9.4
    • Documentation fixes.
    • Added requirements.txt to manifest to fix broken PyPI release tarball.
  • v0.9.3
    • Fixed bug with merging feature sets that used to cause a crash.
    • If you're running scikit-learn 0.14+, we use their StandardScaler, since the bug fix we include in FixedStandardScaler is in there.
    • Unit tests all pass again.
    • Lots of little things related to using Travis (which do not affect users).
  • v0.9.2
    • Fixed example.cfg path issue. Updated some documentation.
    • Made path consistent with the updated one in example.cfg.
  • v0.9.1
    • Fixed bug where classification experiments would raise an error about class labels not being floats.
    • Updated documentation to include quick example for run_experiment.