# Plotting and Analysis

This notebook plots and analyzes the logged results of one or more simulator runs that share a fixed timing configuration.
These logs (bboxes.csv) are produced by running the simulator on experiments. The plots are used to analyze the worm's behavior,
and to analyze the system's error and how it is affected by the different behaviors the worm exhibits.

Note that, for the analysis to be valid, all experiments analyzed by this notebook *at once* must share the same timing configuration (TimingConfig) parameters.
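As a rough illustration of this requirement, the sketch below compares several timing configurations and reports which parameters differ. It works on plain dictionaries (as they might be parsed from per-experiment JSON files); the helper `differing_keys` and the sample values are hypothetical and not part of `wtracker`.

```python
import json


def differing_keys(configs):
    """Return the sorted keys whose values are not identical across all configs."""
    all_keys = set().union(*(cfg.keys() for cfg in configs))
    reference = configs[0]
    return sorted(
        key
        for key in all_keys
        if any(cfg.get(key) != reference.get(key) for cfg in configs[1:])
    )


# Hypothetical configs, as they might look after json.load on each experiment's file.
config_a = json.loads('{"frames_per_sec": 60, "imaging_time_ms": 200, "pred_time_ms": 40}')
config_b = json.loads('{"frames_per_sec": 60, "imaging_time_ms": 200, "pred_time_ms": 40}')
config_c = json.loads('{"frames_per_sec": 60, "imaging_time_ms": 250, "pred_time_ms": 40}')

print(differing_keys([config_a, config_b]))  # identical configs: nothing to report
print(differing_keys([config_a, config_c]))  # reports the mismatching parameter
```

An empty result means the logs can reasonably be analyzed together; any reported key indicates that the runs were produced under different timing settings.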

In [1]:
# fix imports
import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

In [2]:
import matplotlib.pyplot as plt
from wtracker.eval import *
from wtracker.sim.config import TimingConfig
from wtracker.utils.gui_utils import UserPrompt

### Timing configuration and log files selection

In [4]:
from pprint import pprint

base_path = "D:\\Guy_Gilad\\FinalEvaluations\\Exp0_config4_CSV\\"

################################ User Input ################################

# path to the timing config file. 
# If None, a file dialog will open to select a file
time_config_path = base_path + "time_config.json"

# list containing paths to simulation log files.
# All of these simulations must have been run with the above timing config.
# If empty, a file dialog will open to select files.
log_files = [base_path + "bboxes.csv"]

data_save_path = base_path + "data.pkl"

############################################################################

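# open a file dialog if no timing config path was provided
if time_config_path is None:
    time_config_path = UserPrompt.open_file(title="Select timing config file", filetypes=[("JSON files", "*.json")])

timing_config = TimingConfig.load_json(time_config_path)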

if len(log_files) == 0:
    log_files = UserPrompt.open_file(title="Select log files", filetypes=[("Log files", "*.csv")], multiple=True)


pprint(timing_config)
pprint(log_files)

TimingConfig(px_per_mm=90,
             mm_per_px=0.011111111111111112,
             frames_per_sec=60,
             ms_per_frame=16.666666666666668,
             imaging_time_ms=200,
             imaging_frame_num=12,
             pred_time_ms=40,
             pred_frame_num=3,
             moving_time_ms=50,
             moving_frame_num=3,
             camera_size_mm=[4, 4],
             camera_size_px=[360, 360],
             micro_size_mm=[0.22, 0.22],
             micro_size_px=[20, 20])
['D:\\Guy_Gilad\\FinalEvaluations\\Exp0_config4_CSV\\bboxes.csv']
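The derived fields in the printed configuration (`mm_per_px`, `ms_per_frame`, the `*_frame_num` values, and the pixel sizes) are consistent with simple arithmetic on the base parameters. The sketch below reproduces them under two assumptions: durations are rounded up to whole frames, and sizes are rounded to the nearest pixel. The function `derive_timing_fields` is hypothetical and is not how `TimingConfig` is necessarily implemented.

```python
import math


def derive_timing_fields(px_per_mm, frames_per_sec, imaging_time_ms, pred_time_ms,
                         moving_time_ms, camera_size_mm, micro_size_mm):
    """Recompute derived timing fields from the base parameters (illustrative only)."""

    def frames(duration_ms):
        # Assumption: a duration occupies a whole number of frames, rounded up.
        return math.ceil(duration_ms * frames_per_sec / 1000)

    def pixels(size_mm):
        # Assumption: sizes are rounded to the nearest pixel.
        return [round(side * px_per_mm) for side in size_mm]

    return {
        "mm_per_px": 1 / px_per_mm,
        "ms_per_frame": 1000 / frames_per_sec,
        "imaging_frame_num": frames(imaging_time_ms),
        "pred_frame_num": frames(pred_time_ms),
        "moving_frame_num": frames(moving_time_ms),
        "camera_size_px": pixels(camera_size_mm),
        "micro_size_px": pixels(micro_size_mm),
    }


# Base parameters taken from the printed configuration above.
fields = derive_timing_fields(
    px_per_mm=90,
    frames_per_sec=60,
    imaging_time_ms=200,
    pred_time_ms=40,
    moving_time_ms=50,
    camera_size_mm=[4, 4],
    micro_size_mm=[0.22, 0.22],
)
print(fields)
```

Note that rounding up matters for `pred_frame_num`: 40 ms at 60 frames per second is 2.4 frames, which only matches the printed value of 3 when rounded up.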


In [6]:
from wtracker.eval.plotter import Plotter
from wtracker.eval.data_analyzer import DataAnalyzer

log1 = "D:\\Guy_Gilad\\FinalEvaluations\\Exp0_config4_CSV\\bboxes.csv"
log2 = "D:\\Guy_Gilad\\FinalEvaluations\\Exp0_config1_Optimal\\analysis.csv"

analyzer = DataAnalyzer.load(timing_config, log2)
# this log was saved in the "sec" unit, so the unit is set manually after loading
analyzer._unit = "sec"
analyzer.change_unit("frame")
analyzer.clean(imaging_only=False, bounds=(73, 38, 1551, 1359), trim_cycles=True)
analyzer.change_unit("sec")
display(analyzer.describe(["time", "wrm_x", "wrm_speed_x", "wrm_speed"]))

analyzer = DataAnalyzer.load(timing_config, log1)
analyzer.initialize(period=10)
analyzer.clean(imaging_only=False, bounds=(73, 38, 1551, 1359), trim_cycles=True)
analyzer.change_unit("sec")
display(analyzer.describe(["time", "wrm_x", "wrm_speed_x", "wrm_speed"]))


| | time | wrm_x | wrm_speed_x | wrm_speed |
|---|---|---|---|---|
| count | 53972.0 | 53972.0 | 53959.0 | 53959.0 |
| mean | 497.369415 | 14017.01486 | 2.596402 | 130.055597 |
| std | 308.975199 | 2149.834057 | 112.312627 | 106.339422 |
| min | 0.25 | 9719.2315 | -582.06278 | 0.2847 |
| 25% | 225.129165 | 12730.693267 | -53.60286 | 43.249235 |
| 50% | 464.408335 | 12889.71761 | 0.17752 | 110.74474 |
| 75% | 788.520835 | 16566.453635 | 53.558105 | 191.05653 |
| max | 1019.23333 | 17153.1879 | 609.42741 | 860.66298 |


| | time | wrm_x | wrm_speed_x | wrm_speed |
|---|---|---|---|---|
| count | 53987.0 | 53987.0 | 53974.0 | 53974.0 |
| mean | 497.514449 | 14017.871312 | 2.611503 | 130.043656 |
| std | 309.054746 | 2150.149213 | 112.303388 | 106.329085 |
| min | 0.25 | 9719.231444 | -582.06 | 0.286667 |
| 25% | 225.191667 | 12730.781111 | -53.586667 | 43.26 |
| 50% | 464.533333 | 12890.839667 | 0.186667 | 110.736667 |
| 75% | 788.708333 | 16566.903944 | 53.586667 | 191.018333 |
| max | 1019.483333 | 17153.187889 | 609.426667 | 860.66 |


### Plotting configuration

Note that all of the plots below accept `condition` as a parameter.
`condition` is expected to be a function of the following signature:

```python
import pandas as pd

# returns a boolean mask (one value per row), not a DataFrame
def cond_func1(input_df: pd.DataFrame) -> pd.Series:
    return (input_df["wrm_speed"] > 5) & (input_df["wrm_speed"] <= 30)
```

In Python, such functions can also be written inline, without a name or a `def` statement, using the following syntax:
(for more information, read about lambda functions)

```python
cond_func1 = lambda input_df: (input_df["wrm_speed"] > 5) & (input_df["wrm_speed"] <= 30)
cond_func2 = lambda input_df: input_df["phase"] == "imaging"
```
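To make the effect of a condition concrete, the sketch below applies two such functions to a small hand-made DataFrame. The data values are invented for illustration, and the assumption that the plotting functions use the returned boolean mask to select rows (as `df[mask]` does here) is not confirmed by this notebook.

```python
import pandas as pd

# Tiny invented log with the two columns used by the conditions above.
df = pd.DataFrame(
    {
        "wrm_speed": [2.0, 10.0, 25.0, 40.0],
        "phase": ["imaging", "moving", "imaging", "imaging"],
    }
)

speed_cond = lambda input_df: (input_df["wrm_speed"] > 5) & (input_df["wrm_speed"] <= 30)
phase_cond = lambda input_df: input_df["phase"] == "imaging"

# Conditions are boolean Series, so they can be combined with & (and), | (or), ~ (not).
combined_cond = lambda input_df: speed_cond(input_df) & phase_cond(input_df)

mask = combined_cond(df)
filtered = df[mask]  # keep only the rows where the mask is True
print(filtered)
```

Each comparison must be wrapped in parentheses when combined with `&` or `|`, because these operators bind more tightly than comparisons in Python.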

##### Optionally, Calculate precise error

To calculate the precise error of the system, run the following cells; otherwise, skip them.
Note that this calculation might take a while.

For each frame, the exact pixels occupied by the worm's head are computed. This requires access to the worm images that were extracted during the experiment initialization process.
The error is then calculated as the proportion of worm pixels that lie outside the microscope view.
Because the images must be loaded from disk, the calculation is relatively slow.
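The error definition above can be illustrated with a minimal NumPy sketch. It assumes grayscale image crops, bounding boxes given as `(x, y, w, h)` in frame coordinates, and a segmentation that marks a pixel as "worm" when it differs from the background by more than `diff_thresh`. The function `precise_error_sketch` is hypothetical; it is not the implementation used by `ErrorCalculator`.

```python
import numpy as np


def precise_error_sketch(worm_view, background_view, worm_bbox, mic_bbox, diff_thresh=20):
    """Fraction of segmented worm pixels that fall outside the microscope box.

    worm_view and background_view are 2D crops of identical shape whose top-left
    corner sits at (x, y) of worm_bbox. Returns NaN if no worm pixel is found.
    """
    # Cast to a signed type so that subtracting uint8 images cannot wrap around.
    diff = np.abs(worm_view.astype(np.int32) - background_view.astype(np.int32))
    worm_mask = diff > diff_thresh

    total = worm_mask.sum()
    if total == 0:
        return float("nan")

    wx, wy, ww, wh = worm_bbox
    mx, my, mw, mh = mic_bbox

    # Frame coordinates of every pixel in the crop.
    ys, xs = np.mgrid[wy:wy + wh, wx:wx + ww]
    inside_mic = (xs >= mx) & (xs < mx + mw) & (ys >= my) & (ys < my + mh)

    outside_count = (worm_mask & ~inside_mic).sum()
    return float(outside_count / total)


# Invented example: a 6-pixel-wide "worm" of which only the left half is in view.
background_crop = np.zeros((4, 6), dtype=np.uint8)
worm_crop = background_crop.copy()
worm_crop[1, :] = 100

error = precise_error_sketch(
    worm_crop, background_crop, worm_bbox=(10, 20, 6, 4), mic_bbox=(10, 20, 3, 4)
)
print(error)
```

In this example, three of the six worm pixels lie outside the microscope box, so the error is one half.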

In [None]:
import numpy as np
from wtracker.utils.frame_reader import FrameReader

# TODO: TEST
# TODO: ADD DOCS FOR THIS SECTION

################################ User Input ################################

background_path = "data\\Exp2_GuyGilad_logs_yolo\\background.npy"

worm_folder_path = "D:\\Guy_Gilad\\Exp2_GuyGilad\\logs_yolo\\worms"

diff_thresh = 20

############################################################################

if background_path is None:
    background_path = UserPrompt.open_file(title="Select background images", filetypes=[("Numpy files", "*.npy")])

if worm_folder_path is None:
    worm_folder_path = UserPrompt.open_directory(title="Select worm image folders")

print("Background Files: ", background_path)
print("Worm Image Folders: ", worm_folder_path)

background = np.load(background_path, allow_pickle=True)

worm_reader = FrameReader.create_from_directory(worm_folder_path)

##### Calibrate Threshold [Optional]

In [None]:
from wtracker.eval.vlc import StreamViewer
from wtracker.eval.error_calculator import ErrorCalculator
from wtracker.utils.frame_reader import FrameReader
import pandas as pd
import numpy as np

viewer = StreamViewer(window_name="Threshold Calibration")

In [None]:

################################ User Input ################################
threshold = 20
exp_number = 0 # the number of the experiment in the list
delay = 0
############################################################################
def show_segmentation(wrm_view: np.ndarray, wrm_mask: np.ndarray) -> None:
    # black out every pixel that was not segmented as worm, then display the result
    wrm_view[~wrm_mask] = 0
    viewer.imshow(wrm_view)
    viewer.waitKey(delay)


ErrorCalculator.probe_hook = show_segmentation

reader = FrameReader.create_from_directory(worm_folder_path)
log = pd.read_csv(log_files[exp_number])

viewer.open()
shape = [*reader.frame_shape]
shape[:2] = background.shape[:2]
# reshape returns a new array, so the result must be assigned to take effect
background = background.reshape(shape)

ErrorCalculator.calculate_precise(
    background=background,
    worm_bboxes=log[["wrm_x", "wrm_y", "wrm_w", "wrm_h"]].to_numpy(),
    mic_bboxes=log[["mic_x", "mic_y", "mic_w", "mic_h"]].to_numpy(),
    frame_nums=log['frame'].astype(int).to_list(),
    worm_reader=reader,
    diff_thresh=threshold
)

##### Calculate Precise error

In [None]:
analyzer.calc_precise_error(
    worm_reader=worm_reader,
    background=background,
    diff_thresh=diff_thresh,
)

In [None]:
analyzer.describe(["precise_error"], percentiles=[0.25, 0.5, 0.75, 0.9, 0.95, 0.99])

##### Save Data

In [None]:
data_save_path = base_path + "data.pkl"

if data_save_path is None:
    data_save_path = UserPrompt.save_file(title="Save data", filetypes=[("Pickle files", "*.pkl")])

analyzer.save(data_save_path)

In [None]:
raise KeyError()

### Plotting and analysis

In [None]:
# TODO: FIX
# Make sure that the DataAnalyzers are loaded and that their unit is manually changed to "sec" instead of "frame", since the current logs were saved in the "sec" unit

data_list = [
    "/mnt/c/Users/freid/Desktop/FinalEvaluations/Exp0_config1_Optimal/data.pkl",
    "/mnt/c/Users/freid/Desktop/FinalEvaluations/Exp1_config1_Optimal/data.pkl",
    "/mnt/c/Users/freid/Desktop/FinalEvaluations/Exp2_config1_Optimal/data.pkl",
    "/mnt/c/Users/freid/Desktop/FinalEvaluations/Exp3_config1_Optimal/data.pkl",
    "/mnt/c/Users/freid/Desktop/FinalEvaluations/Exp4_config1_Optimal/data.pkl",
]

data_list = [DataAnalyzer.load(path) for path in data_list]

if len(data_list) == 0:
    file_paths = UserPrompt.open_file(title="Select data files", filetypes=[("Pickle files", "*.pkl")], multiple=True)
    data_list = [DataAnalyzer.load(path) for path in file_paths]

In [None]:
# create the plotter
pltr = Plotter(data_list, plot_height=7, palette="bright")

In [None]:
# print column names of the data
pprint([f"{i}: {col}" for i, col in enumerate(analyzer.column_names())])

In [None]:
analyzer.print_stats()

In [None]:
pltr.plot_trajectory(hue_col="log_num", condition=lambda x: x["wrm_y"] >= 0)
plt.show()

In [None]:
pltr.plot_speed(log_wise=True, condition=lambda x: x["wrm_speed"] <= 800, hue_col="log_num")
plt.show()

In [None]:
pltr.plot_error(log_wise=True, error_kind="bbox", hue_col="log_num")
plt.show()

In [None]:
pltr.plot_speed_vs_error(error_kind="bbox", condition=lambda x: (x["wrm_speed"] < 2000) & (x["bbox_error"] > 1e-5))
plt.show()

In [None]:
pltr.plot_deviation(percentile=0.99, log_wise=False)
plt.show()

In [None]:
pltr.plot_head_size(hue_col="log_num", alpha=0.5)
plt.show()

In [None]:
for analyzer in data_list:
    display(analyzer.describe(columns=["wrm_speed", "bbox_error", "worm_deviation"], num=19))

In [None]:
import numpy as np

# find anomalies in the data
analyzer.calc_anomalies(
    no_preds=True,
    min_bbox_error=1.0,
    min_dist_error=np.inf,
    min_speed=np.inf,
    min_size=300,
)
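How `calc_anomalies` combines its thresholds is not documented in this notebook. One plausible reading of the call above is that a row counts as an anomaly when it meets at least one threshold, and that passing `np.inf` effectively disables a criterion because no finite value can reach it. The sketch below illustrates that reading on invented data; `flag_anomalies_sketch` is hypothetical and covers only two of the criteria.

```python
import numpy as np
import pandas as pd


def flag_anomalies_sketch(df, min_bbox_error=np.inf, min_speed=np.inf):
    """Return the rows that meet at least one enabled threshold (illustrative only)."""
    mask = (df["bbox_error"] >= min_bbox_error) | (df["wrm_speed"] >= min_speed)
    return df[mask]


# Invented rows: only the last one has the worm entirely out of view (bbox_error of 1.0).
df = pd.DataFrame(
    {
        "wrm_speed": [50.0, 900.0, 120.0],
        "bbox_error": [0.0, 0.2, 1.0],
    }
)

# min_speed=np.inf: no finite speed is ever >= inf, so speed never triggers a flag.
anomalies = flag_anomalies_sketch(df, min_bbox_error=1.0, min_speed=np.inf)
print(anomalies)
```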