## Label image data in plate 180528_Plate3
*Vladislav Kim*


* [Introduction](#1)
* [Initial training set](#2)

<a id="1"></a> 
## Introduction
The idea of this notebook series is to train a pseudo-online random forest (RF) classifier for AML vs. stroma cell classification. From selected plates we (for now) sample the 6 DMSO wells with the highest Calcein cell count, generate predictions, correct misclassified instances, and check in live ("online") mode how the predictions improve as we add more data. Note that this is not a true online classifier: we do not update the model incrementally, but completely retrain the RF classifier in multicore mode each time.

More generally, we can implement a targeted online learning strategy: we select a number of wells that are of particular interest (target wells), e.g. DMSO control wells or wells with certain high-priority drugs, whose classification accuracy we want to improve first. We then sample from these target wells on selected plates and evaluate the classification accuracy as we go (pseudo-online learning).
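The retraining loop described above can be sketched with scikit-learn. This is only an illustration under stated assumptions: the data are synthetic (`make_classification` stands in for per-cell features), the batch size and forest size are arbitrary, and none of the names below come from the project code. It shows the "pseudo-online" idea: after each new batch of labels the forest is refit from scratch on all labels collected so far, using all cores via `n_jobs=-1`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic two-class data standing in for per-cell features (assumption)
X, y = make_classification(n_samples=600, n_features=10,
                           n_informative=5, random_state=0)
X_pool, y_pool = X[:400], y[:400]   # cells that get labelled over time
X_test, y_test = X[400:], y[400:]   # held-out cells for evaluation

batch_size = 100                    # arbitrary batch size for this sketch
accuracies = []
for n_labelled in range(batch_size, len(X_pool) + 1, batch_size):
    # Pseudo-online step: discard the old model and refit on all labels so far
    clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
    clf.fit(X_pool[:n_labelled], y_pool[:n_labelled])
    # Track held-out accuracy as the labelled set grows
    accuracies.append(clf.score(X_test, y_test))

print(accuracies)
```

A truly online learner would update an existing model with each batch instead; full retraining is simpler and, for a random forest, parallelizes well across trees.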


<a id="2"></a>
## Initial Training Set: 180528_Plate3
First we will re-train the classifier on the plate `180528_Plate3`, as it shows a very striking contrast between mono- and co-cultures. We want to rule out the possibility that this contrast is a segmentation (in this case, classification) artefact.

In [None]:
# load third-party Python modules
import javabridge
import bioformats as bf
import skimage
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sn
import pandas as pd

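import sys
# make project-level packages importable from this notebook's location
sys.path.append('../../..')

# start the Java VM that python-bioformats needs for reading images
javabridge.start_vm(class_path=bf.JARS)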

In [None]:
from base.utils import load_imgstack
imgstack = load_imgstack(fname="../../data/AML_trainset/180528_Plate3/r02c14.tiff")

# remove a 'dummy' z-axis
img = np.squeeze(imgstack)

# nuclei (Hoechst) channel, power-transformed (exponent 0.3) to boost dim signal
hoechst = img[:, :, 0]**0.3
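
As a small aside on the power transform used for the nuclei channel: raising intensities to an exponent below 1 compresses the dynamic range, so dim pixels become relatively brighter. The sketch below uses a made-up 16-bit array and, unlike the cell above, first rescales to [0, 1] so the effect is easy to read off; both the toy values and the rescaling are assumptions for illustration only.

```python
import numpy as np

# Toy 16-bit intensities spanning several orders of magnitude (made-up values)
raw = np.array([[0, 10, 100],
                [1000, 10000, 65535]], dtype=np.uint16)

# Rescale to [0, 1]; integer input becomes float64 here
scaled = raw / raw.max()

# Power transform with exponent 0.3: dim values are lifted, 0 and 1 stay fixed
gamma = scaled ** 0.3

print(np.round(scaled, 4))
print(np.round(gamma, 4))
```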

In [None]:
df = pd.read_csv('../../data/AML_trainset/180528_Plate3/r02c14.csv')

In [None]:
from segment.tools import read_bbox
rmax, cmax = hoechst.shape

bbox = read_bbox(df=df, rmax=rmax, cmax=cmax, pad=0)

In [None]:
from base.plot import show_bbox
#show_bbox(hoechst, bbox)

**Plotly visualization works!**

In [None]:
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go
init_notebook_mode(connected=True)

In [None]:
from extra.viz import plotly_viz

In [None]:
layout, feats = plotly_viz(hoechst, bb=bbox)

In [None]:
#iplot(dict(data=feats, layout=layout))

**Modify `IncrementalClassifier` class to adapt to our use**

In [None]:
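# incremental ("online") classifier
# NOTE: `OT`, `path`, `featdir`, `select_inst`, `target_names` and the
# training arrays are not defined in this notebook and must be set up first
clf_incr = OT.IncrementalClassifier(path=path, featdir=featdir,
                                    select_well=select_inst[0],
                                    target_names=target_names,
                                    X_train_norm=X_train_norm,
                                    X_train_prop=X_train_prop,
                                    y_train=y_train)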

In [None]:
clf_incr = (clf_incr
            .load_img()
            .train_classifier()
            .generate_predictions()
            .set_scene())

In [None]:
#clf_incr.plot()

In [None]:
newlabels = np.array([[45,2], [91,5], [85,0], [2,2]])

In [None]:
clf_incr = (clf_incr
            .add_instances(newlabels=newlabels)
            .train_classifier()
            .generate_predictions()
            .update_scene())
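
To make the correction step more concrete, here is a minimal sketch of how an array like `newlabels` could be merged into a label vector. It assumes, without confirmation from the `IncrementalClassifier` source, that each row is `(instance index, corrected class label)`; the helper `apply_corrections` and the placeholder predictions are hypothetical and are not part of the project code.

```python
import numpy as np

def apply_corrections(y_pred, newlabels):
    """Return a copy of y_pred with corrected labels written in.

    Assumption: each row of newlabels is (instance index, corrected label).
    """
    newlabels = np.asarray(newlabels)
    y_corrected = np.array(y_pred, copy=True)
    idx, labels = newlabels[:, 0], newlabels[:, 1]
    y_corrected[idx] = labels   # overwrite only the corrected instances
    return y_corrected

# Placeholder predictions for 100 instances (all class 0), for illustration
y_pred = np.zeros(100, dtype=int)
newlabels = np.array([[45, 2], [91, 5], [85, 0], [2, 2]])

y_corrected = apply_corrections(y_pred, newlabels)
print(y_corrected[[45, 91, 85, 2]])
```

The corrected instances and their labels would then be appended to the training set before the classifier is retrained, which is what the `add_instances(...).train_classifier()` chain above appears to do.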