## Classify Embryo Orientation 

This notebook uses a Support Vector Classifier (SVC) to estimate whether the embryo is in ventral or lateral orientation.

The classification is important because the VNC length calculation depends on the embryo orientation. 
This is therefore the first step for calculating VNC length.

We need annotated data to train the SVC.
Classification is done by inspecting the first frames of a movie, so the annotated frames should also come from the beginning of the movie.
To speed up processing, the images were downsampled and saved as NumPy arrays.
I used the first 300 frames of each movie to collect the annotated data.
Annotating data is not very time consuming: you can visualize one embryo with `pre_process.display` and save its features with `feature_extraction.extract`.
This means that a single movie already yields 300 data points, since the embryo usually holds the same orientation, especially at the beginning of the episodes.

The features will be saved at `./data/downsampled/features/`.
We can then feed the data to the SVC and check the cross-validation score.
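
As a rough illustration of the cross-validation step, the sketch below scores an SVC with scikit-learn. It does not use `classifier.fit_SVC`; the synthetic feature matrix, the `'l'` label for lateral, the scaling step, and the 5-fold split are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for five features per frame; this is NOT real embryo data.
n_per_class = 100
ventral = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 5))
lateral = rng.normal(loc=3.0, scale=1.0, size=(n_per_class, 5))
X = np.vstack([ventral, lateral])
y = np.array(['v'] * n_per_class + ['l'] * n_per_class)  # 'l' is an assumed label

# Scale the features before the SVC, since they live on very different ranges
# (pixel coordinates, areas, and Hu moments).
model = make_pipeline(StandardScaler(), SVC(kernel='rbf'))

# With an integer cv and a classifier, scikit-learn uses stratified folds.
scores = cross_val_score(model, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.3f}")
```

Frames taken from the same movie are highly correlated, so a split that keeps each movie entirely in one fold (for example with scikit-learn's `GroupKFold`) may give a more honest estimate than a per-frame split.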

Once the model is built, we can predict new data.
In practice, we first check whether a trained model is available; if not, we fit a new SVC and save it for future analyses.
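
The check-then-fit logic could look roughly like the sketch below. The helper `load_or_fit`, the pickle file format, and the toy data are hypothetical and chosen only for illustration; the actual behaviour of the `classifier` module may differ.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.svm import SVC


def load_or_fit(model_path, X, y):
    """Return a saved classifier if one exists, else fit a new one and save it."""
    model_path = Path(model_path)
    if model_path.exists():
        with model_path.open('rb') as f:
            return pickle.load(f)
    model = SVC().fit(X, y)
    model_path.parent.mkdir(parents=True, exist_ok=True)
    with model_path.open('wb') as f:
        pickle.dump(model, f)
    return model


# Toy data standing in for annotated features (hypothetical values).
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array(['v', 'v', 'l', 'l'])

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'models' / 'svc.pkl'
    first = load_or_fit(path, X, y)   # no file yet: fits and saves
    second = load_or_fit(path, X, y)  # file exists: loads from disk
    print(second.predict(X))
```

Pickled models should only be loaded from trusted locations, and they are generally tied to the scikit-learn version that produced them.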

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib widget
import os
from pathlib import Path
import matplotlib.pyplot as plt

from pasnascope import pre_process, feature_extraction, classifier


Movies are first downsampled to speed up feature calculation.

To downsample, call `pre_process.downsample_all`.
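
How `pre_process.downsample_all` reduces the images is not shown here. As one possible approach, the sketch below downsamples a movie stored as a `(frames, height, width)` NumPy array by averaging non-overlapping blocks; the function name `downsample` and the factor are assumptions.

```python
import numpy as np


def downsample(movie, factor):
    """Average non-overlapping factor x factor blocks in every frame."""
    n_frames, height, width = movie.shape
    # Crop so that both spatial dimensions are multiples of the factor.
    h = height - height % factor
    w = width - width % factor
    cropped = movie[:, :h, :w]
    # Split each spatial axis into (blocks, pixels per block), then average
    # over the two "pixels per block" axes.
    blocks = cropped.reshape(n_frames, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(2, 4))


# Tiny synthetic movie: 2 frames of 5 x 6 pixels.
movie = np.arange(2 * 5 * 6, dtype=float).reshape(2, 5, 6)
small = downsample(movie, 2)
print(small.shape)
```

The result can then be written to disk with `np.save`, which matches the idea of keeping the downsampled movies as NumPy arrays.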


In [None]:
project_dir = Path(os.getcwd()).parent
downsampled_dir = os.path.join(project_dir, 'data', 'downsampled')
file_names = sorted([f for f in os.listdir(downsampled_dir) if f.startswith('ds')])
print(file_names)

An example of how to inspect a movie to annotate it.

To display the movie, change `show` to `True` and select the file index.

In [None]:
show = False
i = 0
if show:
    pre_process.display(os.path.join(downsampled_dir, file_names[i]))

Extracting features

The features calculated are the centroid x and y positions, the first and second Hu moments, and the area.
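
To make these five features concrete, the sketch below computes them with plain NumPy from a binary mask. It is not the code in `feature_extraction`; the function `shape_features` and the assumption that the features come from a binary embryo mask are illustrative only.

```python
import numpy as np


def shape_features(mask):
    """Centroid, first two Hu moments, and area of a binary mask."""
    mask = np.asarray(mask, dtype=float)
    ys, xs = np.indices(mask.shape)  # row (y) and column (x) coordinates
    area = mask.sum()                # zeroth raw moment
    cx = (xs * mask).sum() / area
    cy = (ys * mask).sum() / area

    def eta(p, q):
        # Normalized central moment: translation and scale invariant.
        mu = ((xs - cx) ** p * (ys - cy) ** q * mask).sum()
        return mu / area ** (1 + (p + q) / 2)

    hu1 = eta(2, 0) + eta(0, 2)
    hu2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return {'cx': cx, 'cy': cy, 'hu1': hu1, 'hu2': hu2, 'area': area}


# Synthetic elongated rectangle standing in for an embryo mask.
mask = np.zeros((20, 30))
mask[5:15, 5:25] = 1
features = shape_features(mask)
print(features)
```

The Hu moments do not change when the shape is translated, scaled, or rotated, so they describe the outline of the embryo rather than where it sits in the frame; the centroid and area carry the position and size information.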

In [None]:
feature_extraction.extract_all(downsampled_dir, save=False)

Training the SVC

We are ready to pass the features to the SVC.
The data still needs to be annotated: each feature file must be associated with a ventral or lateral orientation.
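
One way to attach annotations is to keep a mapping from feature file to orientation and repeat the label for every frame in that file. The sketch below assumes this layout; the file names, the `'v'`/`'l'` labels, and the helper `build_dataset` are hypothetical and may not match how `classifier.fit_SVC` reads its input.

```python
import numpy as np

# Hypothetical per-movie feature arrays with shape (n_frames, n_features).
features_by_file = {
    'ds_emb1.npy': np.zeros((300, 5)),
    'ds_emb2.npy': np.ones((300, 5)),
}
# Hypothetical annotations: one orientation per feature file.
orientation_by_file = {'ds_emb1.npy': 'v', 'ds_emb2.npy': 'l'}


def build_dataset(features_by_file, orientation_by_file):
    """Stack all frames into X and repeat each file's label once per frame."""
    X_parts, y_parts = [], []
    for name, feats in sorted(features_by_file.items()):
        label = orientation_by_file[name]  # a KeyError flags an unannotated file
        X_parts.append(feats)
        y_parts.append(np.full(len(feats), label))
    return np.vstack(X_parts), np.concatenate(y_parts)


X, y = build_dataset(features_by_file, orientation_by_file)
print(X.shape, y.shape)
```

Labelling per file rather than per frame relies on the observation that an embryo usually keeps the same orientation during the first frames of a movie.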

In [None]:
output_dir = os.path.join(project_dir, 'data', 'models')
samples_dir = os.path.join(project_dir, 'data', 'downsampled', 'features')

# Flip save to True to actually save the model
classifier.fit_SVC(800, samples_dir, save=False, output_dir=output_dir)

In [None]:
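img_path = os.path.join(project_dir, 'data', 'embryos')
# Sort the listing so that index `i` always refers to the same file
imgs = sorted(f for f in os.listdir(img_path) if f.endswith('ch2.tif'))

i = 3
img_file = os.path.join(img_path, imgs[i])

# Pre-processed image, used here only for display
img_to_classify = classifier.pre_process_tiff(img_file)

# classify_image takes the path to the image file
orientation = classifier.classify_image(img_file)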

if orientation == 'v':
    print(f"{imgs[i]} classified as ventral orientation.")
else:
    print(f"{imgs[i]} classified as lateral orientation.")

fig, ax = plt.subplots()
ax.imshow(img_to_classify)
plt.show()

