DOC Remove mentions of milk
This was very old and unmaintained.

closes #101
luispedro committed Jun 11, 2020
1 parent 90e85fe commit 44debfc
Showing 2 changed files with 14 additions and 80 deletions.
27 changes: 14 additions & 13 deletions docs/source/classification.rst
@@ -1,14 +1,13 @@
 ======================================
 Tutorial: Classification Using Mahotas
 ======================================
-.. versionadded:: 0.8
-   Before version 0.8, texture was under mahotas, not under mahotas.features
 
-Here is an example of using mahotas and `milk <http://luispedro.org/software/milk>`_
-for image classification (but most of the code can easily be adapted to use
-another machine learning package). I assume that there are three important
-directories: ``positives/`` and ``negatives/`` contain the manually labeled
-examples, and the rest of the data is in an ``unlabeled/`` directory.
+Here is an example of using mahotas and `scikit-learn
+<https://scikit-learn.org>`__ for image classification (but most of the code
+can easily be adapted to use another machine learning package). I assume that
+there are three important directories: ``positives/`` and ``negatives/``
+contain the manually labeled examples, and the rest of the data is in an
+``unlabeled/`` directory.
 
 Here is the simple algorithm:

@@ -25,7 +24,6 @@ We start with a bunch of imports::
     from glob import glob
     import mahotas
     import mahotas.features
-    import milk
     from jug import TaskGenerator
 
 Now, we define a function which computes features. In general, texture features
@@ -40,16 +38,18 @@ are very fast and give very decent results::
 the mean (sometimes you use the spread ``ptp()`` too).
 
 Now a pair of functions to learn a classifier and apply it. These are just
-``milk`` functions::
+``scikit-learn`` functions::
 
     @TaskGenerator
     def learn_model(features, labels):
-        learner = milk.defaultclassifier()
-        return learner.train(features, labels)
+        from sklearn.ensemble import RandomForestClassifier
+        clf = RandomForestClassifier()
+        clf.fit(features, labels)
+        return clf
 
     @TaskGenerator
     def classify(model, features):
-        return model.apply(features)
+        return model.predict(features)
 
 We assume we have three pre-prepared directories with the images in jpeg
 format. This bit you will have to adapt for your own settings::
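The feature-computation and directory-globbing code itself is collapsed in this view. As a rough sketch of how those steps could fit together with the scikit-learn based ``learn_model`` and ``classify`` above (the name ``features_for`` and the glob patterns are illustrative assumptions, not necessarily the file's actual contents)::

    from glob import glob

    import numpy as np
    import mahotas
    import mahotas.features
    from jug import TaskGenerator

    @TaskGenerator
    def features_for(imname):
        # Haralick texture features, averaged over the four directions.
        img = mahotas.imread(imname, as_grey=True).astype(np.uint8)
        return mahotas.features.haralick(img).mean(axis=0)

    # Manually labeled examples; everything else sits in unlabeled/.
    positives = glob('positives/*.jpg')
    negatives = glob('negatives/*.jpg')
    unlabeled = glob('unlabeled/*.jpg')

    features = [features_for(im) for im in positives + negatives]
    labels = [1] * len(positives) + [0] * len(negatives)

    model = learn_model(features, labels)
    predictions = classify(model, [features_for(im) for im in unlabeled])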
@@ -73,6 +73,7 @@ This uses texture features, which is probably good enough, but you can play
 with other features in ``mahotas.features`` if you'd like (or try
 ``mahotas.surf``, but that gets more complicated).
 
-(This was motivated by `a question on Stackoverflow <http://stackoverflow.com/questions/5426482/using-pil-to-detect-a-scan-of-a-blank-page/5505754>`__).
+(This was motivated by `a question on Stackoverflow
+<http://stackoverflow.com/questions/5426482/using-pil-to-detect-a-scan-of-a-blank-page/5505754>`__).
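If you do want to go beyond texture, other descriptors in ``mahotas.features`` can simply be concatenated into one vector. A minimal sketch, assuming greyscale jpeg inputs (the parameter values below are arbitrary examples, not recommendations from the tutorial)::

    import numpy as np
    import mahotas
    import mahotas.features

    def combined_features(imname):
        img = mahotas.imread(imname, as_grey=True).astype(np.uint8)
        haralick = mahotas.features.haralick(img).mean(axis=0)  # texture
        lbp = mahotas.features.lbp(img, radius=8, points=6)     # local binary patterns
        pftas = mahotas.features.pftas(img)                     # threshold adjacency statistics
        return np.concatenate([haralick, lbp, pftas])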


67 changes: 0 additions & 67 deletions docs/source/surfref.rst
@@ -67,70 +67,3 @@ We now compute all features for all images in widefield dataset::
         features.append(surf_ref(f, ref))
         labels.append(dir)
         origins.append(origin_counter)

-Classification
---------------
-
-With all the precomputed features, we can now run 10-fold cross-validation on
-these features.
-
-We will be using milk for machine learning::
-
-    import milk
-
-Milk's interface is around learner objects. We are going to define a function::
-
-    def train_model(features, labels):
-
-The first step is to find centroids::
-
-        # concatenate all the features:
-        concatenated = np.concatenate(features)
-
-We could use the whole array concatenated for kmeans. However, that would take
-a long time, so we will use just 1/16th of it::
-
-        concatenated = concatenated[::16]
-        _,centroids = milk.kmeans(concatenated, k=len(labels)//4, R=123)
-
-The R argument is the random seed. We set it to a constant to get reproducible
-results, but feel free to vary it.
-
-Based on these centroids, we project the features to histograms. Now, we are
-using all of the features::
-
-        features = np.array([
-                project_centroids(centroids, fs, histogram=True)
-                for fs in features])
-
-Finally, we can use a traditional milk learner (which will perform feature
-selection, normalization, and SVM training)::
-
-        learner = milk.defaultlearner()
-        model = learner.train(features, labels)
-
-We must return both the centroids that were used and the classification model::
-
-        return centroids, model
-
-To classify an instance, we define another function, which uses the centroids
-and the model::
-
-    def apply_many(centroids, model, features):
-        features = np.array([
-                project_centroids(centroids, fs, histogram=True)
-                for fs in features])
-        return model.apply_many(features)
-
-In fact, while the above will work well, milk already provides a learner object
-which will perform all of those tasks!
-
-::
-
-    import milk
-    from milk.supervised.precluster import frac_precluster_learner
-
-    learner = frac_precluster_learner(kfrac=4, sample=16)
-    cmatrix,names = milk.nfoldcrossvalidation(features, labels, origins=origins, learner=learner)
-    acc = cmatrix.astype(float).trace()/cmatrix.sum()
-    print('Accuracy: {:.1f}%'.format(100.*acc))
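For readers who relied on the section removed above, the milk-based workflow it described has rough equivalents in scikit-learn. This is a sketch only, assuming ``features``, ``labels`` and ``origins`` are the lists built earlier in this file (the helper ``project_centroids`` and the choice of classifier are illustrative assumptions, not code from the repository)::

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GroupKFold, cross_val_score

    def project_centroids(kmeans, descriptors):
        # Histogram of which centroid each SURF descriptor falls closest to.
        assignments = kmeans.predict(descriptors)
        return np.bincount(assignments, minlength=kmeans.n_clusters)

    # Cluster a 1/16th sample of all descriptors, as the milk version did.
    concatenated = np.concatenate(features)[::16]
    kmeans = KMeans(n_clusters=len(labels) // 4, random_state=123).fit(concatenated)

    # Project every image to a fixed-length histogram and cross-validate,
    # holding out whole images together (the role `origins` played in
    # milk.nfoldcrossvalidation).
    histograms = np.array([project_centroids(kmeans, fs) for fs in features])
    clf = RandomForestClassifier()
    scores = cross_val_score(clf, histograms, labels, groups=origins,
                             cv=GroupKFold(n_splits=10))
    print('Accuracy: {:.1f}%'.format(100. * scores.mean()))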
