# Tutorial notebook of rhsegmentor

This tutorial guides you through the main use cases of the `rhsegmentor` package.

### Step 0: imports

Import the required modules

In [1]:
import os
import sys
sys.path.append("..")

# import rhsegmentor (the most important functions are available at the top level of the package)
import rhsegmentor as rh
from rhsegmentor import utils

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from skimage import io

%matplotlib tk


### Step 1: look at some sample data

The function `load_training_image` reads an image together with the tracings used for training. The `auto_transform` option automatically transforms the tracing coordinates into the coordinate system of the image.

In [2]:
im, names, vertices_s, vertices_e = rh.load_training_image(img_file = "../sample_newMachine/AG00DA6Z_000002.jpg",
                                                        root_traces_file = "../sample_newMachine/AG00DA6Z_000002 vertices.csv",
                                                        auto_transform=False)

#transform into row-column coordinates
vertices_s_RC = utils.flip_XY_RC(vertices_s)
vertices_e_RC = utils.flip_XY_RC(vertices_e)
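
Image arrays are indexed as (row, column), whereas tracings are typically stored as (x, y), so the two coordinates have to be swapped. The sketch below illustrates the idea with NumPy. `flip_xy_to_rc` is a hypothetical stand-in, and it assumes the vertices are held in an `(n, 2)` array, which may differ from the data structure that `utils.flip_XY_RC` actually works on.

```python
import numpy as np


def flip_xy_to_rc(vertices_xy):
    """Swap (x, y) pairs into (row, column) order (hypothetical helper)."""
    vertices_xy = np.asarray(vertices_xy)
    # x is the column index and y is the row index, so reverse the last axis
    return vertices_xy[:, ::-1]


example_xy = np.array([[10, 200], [35, 180]])  # made-up points given as (x, y)
example_rc = flip_xy_to_rc(example_xy)         # same points given as (row, column)
```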

To create training data from the loaded images, the tracings are first transformed into a root-segmentation mask with `root_segmentation_mask`. This function creates an `np.ndarray` mask image containing root pixels (1), background pixels that are relevant for training the classifier (2) and unclassified pixels (3). To do that, buffer zones are used around the tracings.


In [3]:
# create segmentation mask
mask = rh.root_segmentation_mask(im = im,
                          vertices_s_RC = vertices_s_RC,
                          vertices_e_RC = vertices_e_RC,
                          dilatation_radius= 2,
                          buffer_radius = 5,
                          no_root_radius = 30)
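
To make the three classes more concrete, here is a minimal NumPy sketch of one way such a mask could be built from traced segments. It is not the implementation of `root_segmentation_mask`. The interpretation of the radii is an assumption: pixels within `dilatation_radius` of a tracing become root (1), pixels between `buffer_radius` and `no_root_radius` become background (2), and everything else (the narrow buffer ring and the far-away area) stays unclassified (3). The names `segment_distance` and `sketch_mask` are hypothetical.

```python
import numpy as np


def segment_distance(shape, start_rc, end_rc):
    """Distance of every pixel to the segment start_rc -> end_rc."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    p = np.stack([rows, cols], axis=-1).astype(float)
    a = np.asarray(start_rc, dtype=float)
    b = np.asarray(end_rc, dtype=float)
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:
        # degenerate segment: start and end coincide
        t = np.zeros(shape)
    else:
        # position of the closest point along the segment, clipped to [0, 1]
        t = np.clip(((p - a) @ ab) / denom, 0.0, 1.0)
    closest = a + t[..., None] * ab
    return np.linalg.norm(p - closest, axis=-1)


def sketch_mask(shape, starts_rc, ends_rc,
                dilatation_radius=2, buffer_radius=5, no_root_radius=30):
    """Three-class mask; the class layout is an assumption, see the text."""
    dist = np.full(shape, np.inf)
    for start, end in zip(starts_rc, ends_rc):
        dist = np.minimum(dist, segment_distance(shape, start, end))
    mask = np.full(shape, 3, dtype=np.uint8)                      # unclassified
    mask[(dist > buffer_radius) & (dist <= no_root_radius)] = 2   # background
    mask[dist <= dilatation_radius] = 1                           # root
    return mask
```

Leaving a ring of unclassified pixels around each root keeps ambiguous edge pixels out of the training data, which is the usual motivation for a buffer zone.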

The function `show_traces` plots an image with the tracings drawn on top (similar to `imshow`). Use `%matplotlib tk` for a pop-up viewer.

In [4]:
plt.subplot(1, 2, 1)
rh.show_traces(vertices_s, vertices_e, im)
plt.subplot(1, 2, 2)
rh.show_traces(vertices_s, vertices_e, mask)

### Step 2: Compile a dataset for training

The tracings of multiple images are combined to train a pixel classifier. To achieve this, the following steps are taken:
* All images and tracings in `some_folder` are listed
* Per image, pixel-level features are computed (texture, gradient image etc.)
* Per image, the function `create_root_buffer_background_image` is used to compute the label of every pixel
* A fraction of the points is sampled (reducing the size of the training dataset and rebalancing it somewhat)
* The previous steps are applied to all images in `some_folder` and the results are combined into a features dataset `X` and a labels dataset `Y`
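
The sampling and combining steps above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the package code: `sample_labelled_pixels` and `combine_images` are hypothetical names, the labels follow the 1/2/3 convention of the mask from Step 1, and the two entries of `sample_fraction` are assumed to apply to root and background pixels, in that order.

```python
import numpy as np


def sample_labelled_pixels(features, mask, sample_fraction=(1.0, 1.0), seed=0):
    """Flatten one image and keep a fraction of its root/background pixels.

    features: array of shape (H, W, F); mask: array of shape (H, W)
    with 1 = root, 2 = background, 3 = unclassified (dropped).
    """
    rng = np.random.default_rng(seed)
    x_flat = features.reshape(-1, features.shape[-1])
    y_flat = mask.ravel()
    keep = []
    for label, fraction in zip((1, 2), sample_fraction):
        idx = np.flatnonzero(y_flat == label)
        n_keep = int(round(fraction * idx.size))
        # sample without replacement so no pixel is used twice
        keep.append(rng.choice(idx, size=n_keep, replace=False))
    keep = np.concatenate(keep)
    return x_flat[keep], y_flat[keep]


def combine_images(per_image_data):
    """Stack the (X, Y) pairs of several images into one dataset."""
    x_parts, y_parts = zip(*per_image_data)
    return np.vstack(x_parts), np.concatenate(y_parts)
```

Because background pixels usually far outnumber root pixels, a smaller fraction for the background class is one simple way to rebalance the dataset.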

The first step only computes labels and features per image and stores them as `npy` files.

In [7]:

# compute FEATURES and LABELS for each image in a given folder
files_list = utils.listdir_with_path('../sample_newMachine/small/', suffix = ".jpg")
rh.imgs_to_XY_data(img_file_list = files_list,
                    root_traces_file_list = None,
                    auto_transform = False,
                    dilatation_radius = 2,
                    buffer_radius = 5,
                    no_root_radius = 30,
                    sigma_max = 10,
                    save_masks_as_im = True,
                    save_dir = '../sample_newMachine/small/features')
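
The exact feature set is not described in this tutorial. A common approach to pixel-level features, and one that the `sigma_max` argument hints at, is to filter the image at several Gaussian scales. The sketch below uses `scipy.ndimage` for this; `multiscale_features`, the choice of filters and the list of scales are all assumptions, and the function expects a single-channel (grayscale) image.

```python
import numpy as np
from scipy import ndimage as ndi


def multiscale_features(gray_image, sigma_max=10):
    """Stack smoothed, gradient and Laplacian images at several scales.

    Hypothetical helper; returns an array of shape (H, W, n_features).
    """
    gray = np.asarray(gray_image, dtype=float)
    # scales double each time and stop at sigma_max
    sigmas = [s for s in (1, 2, 4, 8, 16) if s <= sigma_max]
    layers = [gray]
    for sigma in sigmas:
        layers.append(ndi.gaussian_filter(gray, sigma))              # intensity
        layers.append(ndi.gaussian_gradient_magnitude(gray, sigma))  # edges
        layers.append(ndi.gaussian_laplace(gray, sigma))             # ridges/blobs
    return np.stack(layers, axis=-1)
```

With `sigma_max = 10` this yields the raw image plus three filtered images at each of the scales 1, 2, 4 and 8.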

The second step combines the generated files to create `X` and `Y`

In [2]:
# create training datasets
features_file_list = ['../sample_newMachine/small/features/'+f for f in os.listdir('../sample_newMachine/small/features/') if f[-3:] == "npy" and "FEATURES" in f]
X, Y = rh.compile_training_dataset_from_precomputed_features(features_file_list, sample_fraction=(1.0, 1.0))

### Step 3: Train a model and save it

The compiled dataset is used to train a random forest classifier

In [3]:
# fit a random forest classifier (any other scikit-learn classifier could be used instead)
clf = RandomForestClassifier(n_estimators=100, n_jobs=-1,
                            max_depth=10, max_samples=0.05)
clf.fit(X, Y)
# dump the model to a file
rh.dump_model(clf, '../models/RF_AGO_000and002_NTrees-100_TEST.joblib')

['../models/RF_AGO_000and002_NTrees-100_TEST.joblib']

### Step 4: Load a saved model

Select a saved model and load it

In [2]:
clf = rh.load_model('../models/RF_AGO_000and002_NTrees-100_TEST.joblib')
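
The output of the dump cell in Step 3 (a list containing the file name) matches what `joblib.dump` returns, so `dump_model` and `load_model` presumably wrap joblib; this is an assumption. The sketch below shows the equivalent round trip with joblib directly, using a small classifier trained on synthetic data and a temporary folder.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# synthetic data, only used to have a fitted model to save
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = (X_demo[:, 0] > 0).astype(int) + 1  # labels 1 and 2

clf_demo = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "demo_model.joblib")
    written = joblib.dump(clf_demo, model_path)  # returns the list of written files
    clf_loaded = joblib.load(model_path)

# the reloaded model should reproduce the predictions of the original
same_predictions = np.array_equal(clf_demo.predict(X_demo), clf_loaded.predict(X_demo))
```

Saved scikit-learn models are best loaded with the same scikit-learn version that was used to train them.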

### Step 5: Load new image to make predictions (and compare with tracings)

In [3]:
im = io.imread("../sample_newMachine/test_cases/AG00IHWX_000000.jpg")
# compute features
features = rh.im2features(im, sigma_max = 10)
# predict
predicted_segmentation = rh.predict_segmentor(clf, features)
# clean detected roots
roots = rh.clean_predicted_roots(predicted_segmentation, small_objects_threshold=150, closing_diameter = 4)
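
The arguments `small_objects_threshold` and `closing_diameter` suggest that the cleaning step closes small gaps and discards small connected components. The sketch below shows that idea with `scipy.ndimage`. It is an assumption about what the cleaning involves, not the code of `clean_predicted_roots`, and `clean_binary_mask` is a hypothetical name.

```python
import numpy as np
from scipy import ndimage as ndi


def clean_binary_mask(mask, min_size=150, closing_diameter=4):
    """Close small gaps, then drop connected components below min_size pixels."""
    mask = np.asarray(mask, dtype=bool)
    # disk-shaped structuring element with radius closing_diameter // 2
    radius = max(1, closing_diameter // 2)
    yy, xx = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = (yy ** 2 + xx ** 2) <= radius ** 2
    closed = ndi.binary_closing(mask, structure=disk)
    # label connected components and count the pixels of each one
    labels, _ = ndi.label(closed)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_size
    keep[0] = False  # label 0 is the background
    return keep[labels]
```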

Visualize the results

In [None]:
# draw detected roots
im_out = rh.draw_detected_roots(roots, im, root_thickness = 7, minimalBranchLength = 10)
# measure root properties and show as table
rh.measure_roots(roots, root_thickness = 7, minimalBranchLength = 10)

### Step 6: Export the results to a file

The lengths, orientations, positions etc. of the roots can be exported to a file.

In [6]:
results_df = rh.measure_roots(roots)
results_df.to_excel("../sample_newMachine/out/measurements.xlsx")
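
Writing xlsx files with `to_excel` requires an Excel writer engine such as `openpyxl` to be installed. If that is not available, a CSV export is a simple alternative. The sketch below uses a small table with made-up placeholder values; the column names are taken from the results table shown in Step 7.

```python
import os
import tempfile

import pandas as pd

# placeholder values, only for illustration
demo_results = pd.DataFrame({
    "X": [100.0, 250.5],
    "Y": [40.0, 80.25],
    "orientation": [0.1, -0.3],
    "length": [120.0, 64.5],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "measurements.csv")
    demo_results.to_csv(csv_path, index=False)  # index=False avoids an extra unnamed column
    reloaded = pd.read_csv(csv_path)
```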

### Step 7: Automate classification per folder

This step lists all files in `some_dir`, detects the roots in each image and saves the results in an xlsx file. An image of the detected roots is saved for each file for quality checking.

In [3]:
im = io.imread("../sample_newMachine/test_cases/AG00IHWX_000000.jpg")

In [4]:
res = rh.extract_rh_props(im, clf = clf)

In [8]:
# list the images to process (here: the images in the small sample folder)
features_file_list = utils.listdir_with_path('../sample_newMachine/small/', suffix = ".jpg")
save_dir = "../sample_newMachine/out/"
res = rh.batch_extract_rh_props(file_list=features_file_list, clf = clf, save_dir=save_dir)

In [9]:
res

| Unnamed: 0 | X | Y | orientation | length | approx_check | adjusted_length | adjusted_length_with_sqrt2 | fname |
|---|---|---|---|---|---|---|---|---|
| 0 | 1477.160000 | 183.842921 | 0.055571 | 252.539235 | Bad approx | 561.000000 | 594.379726 | ../sample_newMachine/small/AG003B8X000001.jpg |
| 1 | 1046.026606 | 474.975057 | -0.352643 | 124.017761 | Bad approx | 312.000000 | 364.190909 | ../sample_newMachine/small/AG003B8X000001.jpg |
| 2 | 1299.638649 | 730.645115 | 0.023735 | 106.438163 | OK | 106.438163 | 106.438163 | ../sample_newMachine/small/AG003B8X000001.jpg |
| 3 | 1461.243243 | 763.148148 | -0.244384 | 84.571346 | OK | 84.571346 | 84.571346 | ../sample_newMachine/small/AG003B8X000001.jpg |
| 4 | 1317.775348 | 802.182903 | -0.537929 | 54.820845 | OK | 54.820845 | 54.820845 | ../sample_newMachine/small/AG003B8X000001.jpg |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 42 | 317.220837 | 1509.621590 | 0.149442 | 353.749693 | Bad approx | 442.000000 | 493.948268 | ../sample_newMachine/small/AG00IHWX_000007.jpg |
| 43 | 191.334179 | 1486.811337 | 0.395039 | 224.284518 | OK | 224.284518 | 224.284518 | ../sample_newMachine/small/AG00IHWX_000007.jpg |
| 44 | 100.286811 | 1514.358175 | 0.423810 | 227.724730 | OK | 227.724730 | 227.724730 | ../sample_newMachine/small/AG00IHWX_000007.jpg |
| 45 | 324.590164 | 1467.551913 | 0.313729 | 64.045784 | OK | 64.045784 | 64.045784 | ../sample_newMachine/small/AG00IHWX_000007.jpg |
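
A table like this can be summarised per image with a pandas `groupby`. The sketch below counts the detected roots, averages their length and counts the rows flagged `Bad approx` for each file. It uses the column names from the table above on a small table with made-up placeholder values, so it runs without the sample data.

```python
import pandas as pd

# placeholder values, only for illustration
demo_res = pd.DataFrame({
    "fname": ["a.jpg", "a.jpg", "a.jpg", "b.jpg"],
    "length": [100.0, 50.0, 30.0, 80.0],
    "approx_check": ["OK", "Bad approx", "OK", "OK"],
})

# one row per image: number of roots, mean length, number of flagged roots
summary = demo_res.groupby("fname").agg(
    n_roots=("length", "size"),
    mean_length=("length", "mean"),
    n_bad_approx=("approx_check", lambda s: int((s == "Bad approx").sum())),
)
```

Images with many flagged roots are natural candidates for a manual look at the saved quality-check images.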


In [5]:
# list all test images in folder
features_file_list = utils.listdir_with_path('../sample_newMachine/test_cases/', suffix = ".jpg")
save_dir = "../sample_newMachine/out/"

all_results = []

for fname in features_file_list:
    # read image
    im = io.imread(fname)
    # compute features
    features = rh.im2features(im, sigma_max = 10)
    # predict
    predicted_segmentation = rh.predict_segmentor(clf, features)
    # clean detected roots
    roots = rh.clean_predicted_roots(predicted_segmentation, small_objects_threshold=150, closing_diameter = 4)
    # compute root properties
    results_df = rh.measure_roots(roots, root_thickness = 7, minimalBranchLength = 10)
    results_df["fname"] = fname
    # append to results list
    all_results.append(results_df)
    # save image for quality check
    fname_save = utils.get_save_fname(fname = fname,
                                      save_dir = save_dir,
                                      suffix = "result.png")
    rh.save_detected_roots_im(clean_root_image = roots,
                              original_im = im,
                              fname = fname_save,
                              root_thickness = 7,
                              minimalBranchLength = 10)

# concatenate and save in excel-format
pd.concat(all_results).to_excel(os.path.join(save_dir,"measurements.xlsx"))
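
When a folder contains many images, a single unreadable file would stop the loop above. One possible design, sketched below, is to wrap the per-image work in a function and record failures instead of aborting. `run_batch` and `process_one` are hypothetical names; `process_one` stands for whatever is done per image, for example the body of the loop above.

```python
def run_batch(file_names, process_one):
    """Apply process_one to every file and collect results and failures."""
    results, failures = [], []
    for fname in file_names:
        try:
            results.append(process_one(fname))
        except Exception as exc:
            # remember which file failed and why, then continue with the next one
            failures.append((fname, repr(exc)))
    return results, failures
```

The list of failures can then be inspected or written to a log once the batch has finished.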
