remove ipython ext
TomAugspurger committed Sep 10, 2018
1 parent 1c77aed commit 5218de9
Showing 7 changed files with 21 additions and 26 deletions.
2 changes: 1 addition & 1 deletion docs/source/compose.rst
@@ -8,7 +8,7 @@ Dask-ML estimators follow the scikit-learn API. This means Dask-ML estimators li

See http://scikit-learn.org/dev/modules/compose.html for more on using pipelines in general.

-.. ipython:: python
+.. code-block:: python
from sklearn.pipeline import Pipeline # regular scikit-learn pipeline
from dask_ml.cluster import KMeans
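
For reference, the compose.rst page being edited here pipelines a Dask-ML estimator through a regular scikit-learn ``Pipeline``. A minimal sketch of that pattern (the ``StandardScaler`` step, dataset size, and chunking are illustrative assumptions, not part of the diff):

.. code-block:: python

    from sklearn.pipeline import Pipeline          # regular scikit-learn pipeline
    from dask_ml.cluster import KMeans
    from dask_ml.preprocessing import StandardScaler
    from dask_ml.datasets import make_blobs

    # an illustrative Dask array: 10,000 samples in 10 blocks
    X, _ = make_blobs(n_samples=10_000, chunks=1_000, random_state=0)

    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("kmeans", KMeans(n_clusters=3, random_state=0)),
    ])
    pipe.fit(X)    # both steps accept the Dask array directly
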
5 changes: 0 additions & 5 deletions docs/source/conf.py
@@ -43,11 +43,6 @@
"sphinx.ext.autodoc",
"sphinx.ext.autosummary",
"sphinx.ext.extlinks",
"IPython.sphinxext.ipython_console_highlighting",
"IPython.sphinxext.ipython_directive",
"nbsphinx",
"numpydoc",
# 'sphinx_gallery.gen_gallery',
]

intersphinx_mapping = {
4 changes: 2 additions & 2 deletions docs/source/cross_validation.rst
@@ -6,7 +6,7 @@ This document only describes the extensions made to support Dask arrays.

The simplest way to split one or more Dask arrays is with :func:`dask_ml.model_selection.train_test_split`.

-.. ipython:: python
+.. code-block:: python
import dask.array as da
from dask_ml.datasets import make_regression
@@ -17,7 +17,7 @@ The simplest way to split one or more Dask arrays is with :func:`dask_ml.model_s
The interface for splitting Dask arrays is the same as scikit-learn's version.

-.. ipython:: python
+.. code-block:: python
X_train, X_test, y_train, y_test = train_test_split(X, y)
X_train # A dask Array
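
For reference, a fuller sketch of the ``train_test_split`` usage shown in these hunks (sample count and chunk size are illustrative assumptions):

.. code-block:: python

    from dask_ml.datasets import make_regression
    from dask_ml.model_selection import train_test_split

    # an illustrative blocked Dask array: 10,000 samples in 10 blocks
    X, y = make_regression(n_samples=10_000, chunks=1_000, random_state=0)

    # same interface as scikit-learn's train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    X_train                    # still a dask array
    X_train.compute().shape    # materialize to inspect the split size
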
2 changes: 1 addition & 1 deletion docs/source/glm.rst
@@ -22,7 +22,7 @@ and dataframes.
Example
-------

-.. ipython:: python
+.. code-block:: python
from dask_ml.linear_model import LogisticRegression
from dask_ml.datasets import make_classification
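
For reference, a minimal sketch of the glm.rst example (dataset size and chunking are illustrative assumptions):

.. code-block:: python

    from dask_ml.linear_model import LogisticRegression
    from dask_ml.datasets import make_classification

    # an illustrative classification problem stored as a blocked Dask array
    X, y = make_classification(n_samples=10_000, chunks=1_000, random_state=0)

    lr = LogisticRegression()
    lr.fit(X, y)     # training runs on the Dask array
    lr.predict(X)    # predictions, returned lazily as a Dask array
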
10 changes: 5 additions & 5 deletions docs/source/incremental.rst
@@ -39,7 +39,7 @@ fact that models are typically much smaller than data, and so faster to move
between machines.


-.. ipython:: python
+.. code-block:: python
from dask_ml.datasets import make_classification
from dask_ml.wrappers import Incremental
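
A minimal sketch of the ``Incremental`` pattern this page documents: wrap an estimator that implements ``partial_fit`` (an ``SGDClassifier`` is assumed here; dataset size and chunking are also assumptions) and fit it block by block on a Dask array:

.. code-block:: python

    from sklearn.linear_model import SGDClassifier
    from dask_ml.datasets import make_classification
    from dask_ml.wrappers import Incremental

    X, y = make_classification(n_samples=10_000, chunks=1_000, random_state=0)

    # Incremental calls estimator.partial_fit once per block of X and y
    clf = Incremental(SGDClassifier(random_state=0))
    clf.fit(X, y, classes=[0, 1])
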
@@ -83,20 +83,20 @@ the wrapped ``fit``.

We can get the accuracy score on our dataset.

-.. ipython:: python
+.. code-block:: python
clf.score(X, y)
All of the attributes learned during training, like ``coef_``, are available
on the ``Incremental`` instance.

-.. ipython:: python
+.. code-block:: python
clf.coef_
If necessary, the actual estimator trained is available as ``Incremental.estimator_``

-.. ipython:: python
+.. code-block:: python
clf.estimator_
@@ -110,7 +110,7 @@ To search over the hyper-parameters of the underlying estimator, use the usual s
prefixing the parameter name with ``<name>__``. For ``Incremental``, ``name`` is always ``estimator``.


-.. ipython:: python
+.. code-block:: python
from sklearn.model_selection import GridSearchCV
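
A sketch of the hyper-parameter search described above, using the ``estimator__`` prefix (the parameter grid, data sizes, and wrapped ``SGDClassifier`` are illustrative assumptions; note that cross-validation will index into the Dask array to build each fold):

.. code-block:: python

    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import GridSearchCV
    from dask_ml.datasets import make_classification
    from dask_ml.wrappers import Incremental

    X, y = make_classification(n_samples=10_000, chunks=1_000, random_state=0)
    clf = Incremental(SGDClassifier(random_state=0))

    # parameters of the wrapped estimator are addressed as estimator__<param>
    param_grid = {"estimator__alpha": [1e-4, 1e-3, 1e-2]}
    search = GridSearchCV(clf, param_grid, cv=3)
    search.fit(X, y, classes=[0, 1])
    search.best_params_
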
10 changes: 5 additions & 5 deletions docs/source/meta-estimators.rst
@@ -30,7 +30,7 @@ your training dataset is relatively small (fits in a single machine's memory),
and prediction or transformation must be done on a much larger dataset (perhaps
larger than a single machine's memory).

-.. ipython:: python
+.. code-block:: python
from sklearn.ensemble import GradientBoostingClassifier
import sklearn.datasets
@@ -39,23 +39,23 @@ larger than a single machine's memory).
In this example, we'll make a small 1,000-sample training dataset.

-.. ipython:: python
+.. code-block:: python
X, y = sklearn.datasets.make_classification(n_samples=1000,
random_state=0)
Training is identical to just calling ``estimator.fit(X, y)``. Aside from
copying over learned attributes, that's all that ``ParallelPostFit`` does.

-.. ipython:: python
+.. code-block:: python
clf = ParallelPostFit(estimator=GradientBoostingClassifier())
clf.fit(X, y)
This class is useful for predicting on or transforming large datasets.
We'll make a larger dask array ``X_big`` with 10,000 samples per block.

-.. ipython:: python
+.. code-block:: python
X_big, _ = dask_ml.datasets.make_classification(n_samples=100000,
chunks=10000,
@@ -67,7 +67,7 @@ cause the scheduler to compute tasks in parallel. If you've connected to a
``dask.distributed.Client``, the computation will be parallelized across your
cluster of machines.

-.. ipython:: python
+.. code-block:: python
clf.predict_proba(X_big).compute()[:10]
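
Putting the pieces from these hunks together, a sketch of the full ``ParallelPostFit`` workflow: train in memory on scikit-learn data, then predict on a much larger Dask array (the ``random_state`` on the large array is an assumed keyword for reproducibility):

.. code-block:: python

    import sklearn.datasets
    from sklearn.ensemble import GradientBoostingClassifier

    import dask_ml.datasets
    from dask_ml.wrappers import ParallelPostFit

    # small, in-memory training data
    X, y = sklearn.datasets.make_classification(n_samples=1000, random_state=0)

    clf = ParallelPostFit(estimator=GradientBoostingClassifier())
    clf.fit(X, y)    # plain scikit-learn fit on the small dataset

    # much larger Dask array with 10,000 samples per block
    X_big, _ = dask_ml.datasets.make_classification(n_samples=100000,
                                                    chunks=10000,
                                                    random_state=0)

    # predict_proba returns a lazy Dask array; compute() runs the blocks in parallel
    clf.predict_proba(X_big).compute()[:10]
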
14 changes: 7 additions & 7 deletions docs/source/preprocessing.rst
@@ -37,7 +37,7 @@ good alternative to `DictVectorizer`_ and `CountVectorizer`_) and `HashingVector
(best suited for use in text over `CountVectorizer`_). They are not
stateful, which allows easy use with Dask via ``map_partitions``:

-.. ipython:: python
+.. code-block:: python
import dask.bag as db
from sklearn.feature_extraction import FeatureHasher
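
A sketch of the stateless-transformer pattern referenced here, hashing a Dask bag of feature dictionaries partition by partition (the toy records are an assumption):

.. code-block:: python

    import dask.bag as db
    from sklearn.feature_extraction import FeatureHasher

    # a bag of feature dictionaries, e.g. parsed from JSON records
    records = [{"x": float(i), "flag": i % 2} for i in range(1000)]
    b = db.from_sequence(records, npartitions=4)

    hasher = FeatureHasher(n_features=32)

    # FeatureHasher keeps no state, so each partition can be hashed independently;
    # wrapping in a list makes each output partition hold one sparse matrix
    hashed = b.map_partitions(lambda part: [hasher.transform(part)])
    hashed.compute()    # a list of scipy.sparse matrices, one per partition
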
@@ -100,7 +100,7 @@ support the same API as the NumPy ndarray, so most methods won't work on the
result. Even basic things like ``compute`` will fail. To work around this,
we currently recommend converting the sparse matrices to dense.

-.. ipython:: python
+.. code-block:: python
from dask_ml.preprocessing import OneHotEncoder
import dask.array as da
@@ -114,7 +114,7 @@ we currently recommend converting the sparse matrices to dense.
Each block of ``result`` is a scipy sparse matrix

-.. ipython:: python
+.. code-block:: python
result.blocks[0].compute()
# This would fail!
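
A hedged sketch of the recommended workaround. ``result`` below is a stand-in for the encoder output: a Dask array whose blocks are scipy sparse matrices. Converting each block to a dense ndarray with ``map_blocks`` restores the usual array behaviour:

.. code-block:: python

    import dask.array as da
    from scipy import sparse

    # stand-in for the OneHotEncoder output: blocks are scipy sparse matrices
    x = da.ones((6, 4), chunks=(3, 4))
    result = x.map_blocks(sparse.csr_matrix)

    # result.compute() would fail, since NumPy cannot concatenate sparse blocks
    dense = result.map_blocks(lambda block: block.toarray(), dtype=float)
    dense.compute()    # a regular NumPy ndarray
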
@@ -152,7 +152,7 @@ In this toy example, we use a dataset with two columns. ``'A'`` is numeric and
2. Dummy encode the categorical data
3. Fit a linear regression

-.. ipython:: python
+.. code-block:: python
from dask_ml.preprocessing import Categorizer, DummyEncoder
from sklearn.linear_model import LogisticRegression
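
A sketch of the three-step pipeline this section builds, on an assumed toy frame with a numeric column ``'A'`` and a categorical column ``'B'``:

.. code-block:: python

    import pandas as pd
    import dask.dataframe as dd
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from dask_ml.preprocessing import Categorizer, DummyEncoder

    df = pd.DataFrame({"A": [1, 2, 1, 2], "B": ["a", "b", "c", "c"]})
    X = dd.from_pandas(df, npartitions=2)
    y = dd.from_pandas(pd.Series([0, 1, 1, 0]), npartitions=2)

    pipe = make_pipeline(
        Categorizer(),         # 1. convert object columns to pandas Categorical
        DummyEncoder(),        # 2. dummy (one-hot) encode the categorical column
        LogisticRegression(),  # 3. fit the classifier on the encoded data
    )
    pipe.fit(X, y)
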
@@ -180,7 +180,7 @@ the ``object`` dtype columns.
categorical column with multiple columns, where the values are either 0 or 1,
depending on the value in the original column.

-.. ipython:: python
+.. code-block:: python
df['B']
pd.get_dummies(df['B'])
@@ -195,13 +195,13 @@ depend on the values present*. For example, suppose that we just saw the first
two rows in the training dataset, and the last two rows in the test dataset. Then,
when training, our transformed columns would be:

-.. ipython:: python
+.. code-block:: python
pd.get_dummies(df.loc[[0, 1], 'B'])
while on the test dataset, they would be:

-.. ipython:: python
+.. code-block:: python
pd.get_dummies(df.loc[[2, 3], 'B'])
