DOC documentation improvements: readme, overview, contributing docs
kmike committed Nov 14, 2016
1 parent 497e827 commit ea0c36d
Showing 8 changed files with 109 additions and 45 deletions.
65 changes: 30 additions & 35 deletions README.rst
@@ -20,41 +20,36 @@ ELI5


ELI5 is a Python package which helps to debug machine learning
classifiers and explain their predictions.

Currently it allows to:

* explain weights and predictions of scikit-learn linear classifiers
and regressors;
* explain weights of scikit-learn decision trees and tree-based ensemble
classifiers (via feature importances);
* text-based and svg-based scikit-learn decision tree visualization;
* debug scikit-learn pipelines which contain HashingVectorizer,
by undoing hashing;
* explain weights and predictions of
https://github.com/scikit-learn-contrib/lightning models;
* explain weights of https://github.com/TeamHG-Memex/sklearn-crfsuite
CRF models;
* explain predictions of any black-box classifier using LIME
( http://arxiv.org/abs/1602.04938 ) algorithm (experimental).

TODO:

* IPython interactive widgets
* eli5.lime improvements
* explain predictions of https://github.com/TeamHG-Memex/sklearn-crfsuite
models
* https://github.com/scikit-learn-contrib/polylearn
* fasttext (?)
* xgboost (?)
* image input
* built-in support for non-text data in eli5.lime
* tensorflow, theano, lasagne, keras
* Naive Bayes from scikit-learn
(see https://github.com/scikit-learn/scikit-learn/issues/2237)
* Reinforcement Learning support
* explain predictions of decision trees and tree-based ensembles
classifiers and explain their predictions. It provides support for the
following machine learning frameworks and packages:

* scikit-learn_. Currently ELI5 can explain weights and predictions
of scikit-learn linear classifiers and regressors, print decision trees
as text or as SVG, and show feature importances of random forests. ELI5
understands text processing utilities from scikit-learn and can highlight
text data accordingly. It can also debug scikit-learn pipelines which
contain HashingVectorizer, by undoing hashing.

* lightning_ - explain weights and predictions of lightning classifiers and
regressors.

* sklearn-crfsuite_. ELI5 lets you check weights of sklearn_crfsuite.CRF
models.

ELI5 also provides an alternative implementation of the LIME_ algorithm,
which can explain predictions of any black-box classifier. This feature
is currently experimental.

Explanation and formatting are separated; you can get a text-based explanation
to display in a console, an HTML version embeddable in an IPython notebook
or web dashboards, or a JSON version which lets you implement custom
rendering and formatting on a client.

.. _lightning: https://github.com/scikit-learn-contrib/lightning
.. _scikit-learn: https://github.com/scikit-learn/scikit-learn
.. _sklearn-crfsuite: https://github.com/TeamHG-Memex/sklearn-crfsuite
.. _LIME: http://arxiv.org/abs/1602.04938

License is MIT.

Check `docs <http://eli5.readthedocs.io/>`_ for more (sorry, also TODO).
Check `docs <http://eli5.readthedocs.io/>`_ for more.
1 change: 1 addition & 0 deletions docs/source/changes.rst
@@ -0,0 +1 @@
.. include:: ../../CHANGES.rst
21 changes: 21 additions & 0 deletions docs/source/contribute.rst
@@ -0,0 +1,21 @@
Contributing
============

ELI5 uses the MIT license; contributions are welcome!

* Source code: https://github.com/TeamHG-Memex/eli5
* Issue tracker: https://github.com/TeamHG-Memex/eli5/issues

ELI5 supports Python 2.7 and Python 3.4+.
To run tests, make sure the tox_ Python package is installed, then run

::

tox

from a source checkout.

We like high test coverage and mypy_ type annotations.

.. _tox: https://tox.readthedocs.io/en/latest/
.. _mypy: https://github.com/python/mypy
18 changes: 16 additions & 2 deletions docs/source/index.rst
@@ -1,17 +1,31 @@
Welcome to ELI5's documentation!
================================

.. image:: https://img.shields.io/pypi/v/eli5.svg
:target: https://pypi.python.org/pypi/eli5
:alt: PyPI Version

.. image:: https://travis-ci.org/TeamHG-Memex/eli5.svg?branch=master
:target: http://travis-ci.org/TeamHG-Memex/eli5
:alt: Build Status

.. image:: http://codecov.io/github/TeamHG-Memex/eli5/coverage.svg?branch=master
:target: http://codecov.io/github/TeamHG-Memex/eli5?branch=master
:alt: Code Coverage

ELI5 is a Python library for visualizing and debugging
various machine learning models using a unified API. It has
built-in support for several ML frameworks and provides a way to
explain black-box models.

.. toctree::
:maxdepth: 2
:maxdepth: 1

install
usage
overview
contribute
api/index
changes

Indices and tables
==================
6 changes: 4 additions & 2 deletions docs/source/install.rst
@@ -1,8 +1,10 @@
Installation
============

ELI5 works in Python 2.7 and Python 3.3+. Currently it requires scikit-learn,
so make sure scikit-learn is installed first, then install eli5 using pip::
ELI5 works in Python 2.7 and Python 3.4+. Currently it requires
scikit-learn 0.18+, so make sure scikit-learn is installed first,
then install eli5 using pip::

pip install 'scikit-learn >= 0.18'
pip install eli5
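
Before installing, it can be useful to confirm that an already installed scikit-learn meets the 0.18 minimum. The following is a minimal sketch and not part of ELI5: ``meets_minimum`` is a hypothetical helper that compares only the leading numeric components of a version string, so suffixes such as ``rc1`` are ignored.

```python
import re


def meets_minimum(version, minimum="0.18"):
    """Return True if `version` is at least `minimum`.

    Hypothetical helper (not an ELI5 or scikit-learn function); only the
    leading dotted numeric part is compared, e.g. '0.18.1' -> (0, 18, 1).
    """
    def numeric_parts(text):
        match = re.match(r"\d+(?:\.\d+)*", text)
        if match is None:
            raise ValueError("not a version string: %r" % (text,))
        return tuple(int(part) for part in match.group(0).split("."))

    have, need = numeric_parts(version), numeric_parts(minimum)
    # Pad the shorter tuple with zeros so that '0.18' equals '0.18.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

With scikit-learn importable, the check could be run as ``meets_minimum(sklearn.__version__)`` after ``import sklearn``.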

35 changes: 35 additions & 0 deletions docs/source/overview.rst
@@ -0,0 +1,35 @@
Overview
========

ELI5 is a Python package which helps to debug machine learning
classifiers and explain their predictions. It provides support for the
following machine learning frameworks and packages:

* scikit-learn_. Currently ELI5 can explain weights and predictions
of scikit-learn linear classifiers and regressors, print decision trees
as text or as SVG, and show feature importances of random forests. ELI5
understands text processing utilities from scikit-learn and can highlight
text data accordingly. It can also debug scikit-learn pipelines which
contain HashingVectorizer, by undoing hashing.

* lightning_ - explain weights and predictions of lightning classifiers and
regressors.

* sklearn-crfsuite_. ELI5 lets you check weights of sklearn_crfsuite.CRF
models.

ELI5 also provides an alternative implementation of the LIME_ algorithm,
which can explain predictions of any black-box classifier. This feature
is currently experimental.

Explanation and formatting are separated; you can get a text-based explanation
to display in a console, an HTML version embeddable in an IPython notebook
or web dashboards, or a JSON version which lets you implement custom
rendering and formatting on a client.
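
The idea of keeping explanation and formatting separate can be sketched in plain Python. This illustrates the design only and is not ELI5's API: ``Explanation``, ``render_text`` and ``render_json`` are hypothetical stand-ins, and the feature weights are illustrative values that do not come from a real model.

```python
import json
from collections import namedtuple

# Hypothetical container for an explanation: a target name plus
# (feature, weight) pairs. Not an ELI5 class.
Explanation = namedtuple("Explanation", ["target", "feature_weights"])


def render_text(explanation):
    """Render an explanation as plain text suitable for a console."""
    lines = ["target: %s" % explanation.target]
    for feature, weight in explanation.feature_weights:
        lines.append("%+.2f  %s" % (weight, feature))
    return "\n".join(lines)


def render_json(explanation):
    """Render the same explanation as JSON for client-side rendering."""
    return json.dumps({
        "target": explanation.target,
        "feature_weights": [
            {"feature": feature, "weight": weight}
            for feature, weight in explanation.feature_weights
        ],
    }, sort_keys=True)


# Illustrative values only; they do not come from a real model.
expl = Explanation("spam", [("free", 1.5), ("meeting", -0.75)])
text_output = render_text(expl)
json_output = render_json(expl)
```

Because both renderers consume the same explanation object, a further output format could be added without touching the code that computes the explanation.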

.. _lightning: https://github.com/scikit-learn-contrib/lightning
.. _scikit-learn: https://github.com/scikit-learn/scikit-learn
.. _sklearn-crfsuite: https://github.com/TeamHG-Memex/sklearn-crfsuite
.. _LIME: http://arxiv.org/abs/1602.04938

License is MIT.
5 changes: 0 additions & 5 deletions docs/source/usage.rst

This file was deleted.

3 changes: 2 additions & 1 deletion eli5/sklearn_crfsuite/explain_weights.py
@@ -107,7 +107,8 @@ def filter_transition_coefs(transition_coef, indices):

def ner_default_target_order(crf_classes):
"""
Return default order of labels for NER tasks
Return default order of labels for NER tasks:
>>> ner_default_target_order(['B-ORG', 'B-PER', 'O', 'I-PER'])
['O', 'B-ORG', 'B-PER', 'I-PER']
"""
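
One way to obtain the ordering shown in the doctest is a sort key that puts ``'O'`` first and groups the remaining labels by entity name. The function body is not shown in this diff, so the following is a sketch under that assumption rather than the actual implementation; ``ner_target_order_sketch`` is a hypothetical name.

```python
def ner_target_order_sketch(crf_classes):
    """Order NER labels: 'O' first, then labels grouped by entity name.

    Hypothetical re-implementation for illustration; the real function
    body is outside the lines shown in the diff.
    """
    def key(label):
        if label == 'O':
            # The "outside" label sorts before every entity label.
            return (0, '', label)
        # 'B-PER' and 'I-PER' share the entity name 'PER', so they stay adjacent.
        entity = label.split('-', 1)[-1]
        return (1, entity, label)

    return sorted(crf_classes, key=key)
```

Sorting on the entity name before the full label keeps the ``B-`` and ``I-`` tags of each entity next to each other, which matches the doctest input and output above.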