# Development 

## Image to Value(s)

### Primary Focus: Explainable Object Counting in Microscopy Images
### Application: Explainable Virus Capsid Quantification
#### Challenge: Deep Learning as Black Box
#### Required Labels: Location Labels


TL;DR 🧬✨ We developed a regression model to quantify the maturation states ("naked", "budding", "enveloped") of human cytomegalovirus (HCMV) during its final envelopment step, known as secondary envelopment. Researchers can adapt the provided notebook to analyze their own EM data.

![Teaser](./images/Teaser.png)




# Setup and Imports

*By executing the cell below, we import external libraries, which simplify the implementation of the notebook.*

In [1]:
# auto reload imports
%load_ext autoreload
%autoreload 2

# imports from the template 
from deepEM.Utils import create_text_widget, print_info, print_error, print_warning, find_file, load_json, extract_defaults
from deepEM.Logger import Logger
from deepEM.ModelTuner import ModelTuner

# custom implementation
from src.ModelTrainer import ModelTrainer


# import all required libraries
from pathlib import Path 
import os



In [2]:
# defaults should be defined by DL expert

default_datapath = "data/tem-herpes/"
data_link="https://viscom-ulm.github.io/DeepEM/your-contribution.html"

Execute the cell below to display a form.
In this form, you choose which steps of model development to run. You can select any of the three options below, or all three:

### 1. Hyperparameter Search
**Hyperparameters** in deep learning are settings that determine how the model is trained. Unlike model parameters, which are learned from data, hyperparameters are set before training. Hyperparameter tuning involves exploring a range of hyperparameters by training the model multiple times and evaluating performance on a validation set. The best-performing hyperparameters are selected for further training.

- **Hyperparameter search** is time- and resource-intensive. To reduce this, training runs are often limited to fewer epochs or smaller subsets of the data.
- Our playground offers an automated **grid search** for hyperparameter tuning. The model is trained multiple times using all possible combinations of selected hyperparameters. 
- Deep learning (DL) experts define the search space and provide explanations for each parameter, enabling electron microscopy (EM) specialists to adjust it as needed.
- The logging system provides estimates of the remaining time for a sweep, though these estimates can be inaccurate, especially at the beginning.

While a hyperparameter search isn't strictly required, it can significantly impact a model's performance. We strongly recommend performing it, especially when training with your own annotated data. Interrupting the search before it finishes may lead to suboptimal results.
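
To make the grid search idea concrete, here is a minimal sketch in plain Python. It is not the `ModelTuner` implementation: the search space, the `grid` helper, and the `validation_loss` stand-in are all hypothetical and only illustrate how every combination of the listed values is visited and the best one is kept.

```python
from itertools import product

# Hypothetical search space; the values here are made up for illustration.
search_space = {
    "learning_rate": [1e-3, 1e-4],
    "batch_size": [8, 16],
}

def grid(space):
    """Yield one config dict per combination of the listed values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))

def validation_loss(cfg):
    """Stand-in for a short training run; returns a made-up score."""
    return abs(cfg["learning_rate"] - 1e-4) * 1000 + abs(cfg["batch_size"] - 16) / 16

configs = list(grid(search_space))
# The config with the lowest validation loss wins the sweep.
best_config = min(configs, key=validation_loss)
print(len(configs), best_config)
```

The number of runs is the product of the list lengths (here 2 × 2 = 4), which is why adding values to a sweep quickly becomes expensive.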


### 2. Model Training and Validation
**Model Training and Validation** are key steps in developing a deep learning model. In this phase, the model is trained using the data you provide, and its performance is validated using a separate validation set. 

- **Training** involves feeding data to the model, adjusting weights based on the loss function, and iterating through multiple epochs to learn the optimal parameters.
- **Validation** helps in assessing the model's performance on unseen data, which provides insights into how well it will generalize to new data. The validation set is crucial in preventing overfitting, ensuring the model does not memorize the training data but instead learns the underlying patterns.

Choosing appropriate hyperparameters and training data is essential for successful training and validation. After training, you can evaluate the model's performance and decide whether further fine-tuning is needed.
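
The interplay of training and validation can be sketched with a toy example. The snippet below is an illustration only, not the notebook's `ModelTrainer`: it fits a single weight to synthetic data by gradient descent, scores each epoch on a held-out validation split, and keeps the weight with the lowest validation loss. All names and numbers are assumptions chosen for the example.

```python
import random

random.seed(0)

# Toy data: y = 3x plus small noise (purely illustrative, not EM data).
xs = [i / 20 for i in range(40)]
data = [(x, 3.0 * x + random.gauss(0.0, 0.1)) for x in xs]
random.shuffle(data)
train, val = data[:30], data[30:]

def mse(w, samples):
    """Mean squared error of the model y = w * x on the given samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

w, lr = 0.0, 0.1
best_w, best_val = w, mse(w, val)
for epoch in range(50):
    # Training step: move w against the gradient of the training loss.
    grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
    w -= lr * grad
    # Validation step: score on held-out data and keep the best weight.
    val_loss = mse(w, val)
    if val_loss < best_val:
        best_w, best_val = w, val_loss

print(round(best_w, 2), round(best_val, 4))
```

Keeping the weight with the best validation loss, rather than simply the last one, is the same idea as saving a "best model" checkpoint during training.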

### 3. Evaluation
**Evaluation** is the final phase, where you test the model's performance after training. The evaluation phase involves assessing the model on a separate test set to determine its accuracy, precision, recall, or any other relevant metrics.

- It provides an objective measurement of how well the model generalizes to new, unseen data.
- During this phase, you may also perform error analysis to understand where the model performs well and where it fails.
- Evaluation metrics are often used to compare the performance of different models or configurations, helping you select the best model for your needs.
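
As a rough illustration of such metrics, the sketch below computes precision and recall for binary labels and the mean absolute error for predicted counts. The helper names and the example numbers are hypothetical; the metrics actually reported by this notebook are defined by its `ModelTrainer`.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def mean_absolute_error(true_counts, predicted_counts):
    """Average absolute difference between true and predicted counts."""
    errors = [abs(t - p) for t, p in zip(true_counts, predicted_counts)]
    return sum(errors) / len(errors)

# Made-up labels and counts for illustration.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
precision, recall = precision_recall(y_true, y_pred)
mae = mean_absolute_error([4, 7, 2], [5, 7, 0])
print(precision, recall, mae)
```

For a counting task, an error measure on the counts such as the mean absolute error is often easier to interpret than classification metrics.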

### Annotated Data Path
Additionally, you need to define a **path to the annotated data** for model development. This data is essential for training, validating, and evaluating the model effectively. The data should be annotated according to the task you are working on (e.g., segmentation, classification).
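
Before starting a long run, it can help to check that the data path exists and is not empty. The following is a small sketch of such a check using only the standard library; `check_data_path` and the file name are hypothetical and not part of the deepEM package.

```python
from pathlib import Path
import tempfile

def check_data_path(path):
    """Return a list of problems; an empty list means the path looks usable."""
    p = Path(path)
    if not p.is_dir():
        return [f"{p} is not a directory"]
    if not any(p.iterdir()):
        return [f"{p} is empty"]
    return []

# Demonstrate the three cases with a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    empty_problems = check_data_path(tmp)
    (Path(tmp) / "image_001.tif").touch()  # hypothetical file name
    filled_problems = check_data_path(tmp)
missing_problems = check_data_path(Path(tmp) / "does-not-exist")

print(empty_problems, filled_problems, missing_problems)
```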


In [7]:
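from deepEM.EMDL_API import *

# explicit imports for names used below (the star import may already provide them)
import json
import ipywidgets as widgets
from IPython.display import clear_output, display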

with open(os.path.join("configs","parameters.json"), "r") as f:
    config = json.load(f)

# The shared data path widget (to be displayed once)
input_data_path_shared = widgets.Text(description="Data Path:", placeholder="Enter path...", value=default_datapath, continuous_update=False)
datapath_explanation = widgets.HTML(f"<p>If you want to use your own data for model development please define the path to this folder here. For details about how to collect, annotate and preprocess your own dataset for model development please see <a href={data_link}>here.</a></p><hr>")


# ---------- Hyperparameter Search ----------
checkbox_hyperparam = widgets.Checkbox(description="⚙️ Hyperparameter Search", style={"description_width": "0px"})
html_hyperparam_explanation = widgets.HTML("<p>Select if you want to do a hyperparameter search. This is recommended before training a model, but requires lots of compute and resources, as multiple training runs are done to select the best set of hyperparameters.</p><hr>")

radio_search_type = widgets.RadioButtons(
    options=['Default Search', 'Custom Search'], 
    description='Search Type:', 
    disabled=False
)
html_radio_search_type_explanation = widgets.HTML("<i>Select whether to use the default search space, defined by the DL expert, or a custom search space for hyperparameter tuning.</i>")

form_custom_hyperparam, input_widgets = create_hyperparameter_widgets(config)
#widgets.Textarea(description="Custom Hyperparams:", placeholder="Enter values...")
html_custom_hyperparam_explanation = widgets.HTML("<p>To define the values to try for each parameter during the hyperparameter sweep, separate them with commas. Write floating-point values with a decimal point, e.g. <code>0.1</code>.</p>")

# Group hyperparameter widgets with layout and background color
hyperparam_container = widgets.VBox([
    html_radio_search_type_explanation,
    radio_search_type, 
], layout=widgets.Layout(
    border="1px solid black", 
    padding="10px", 
    margin="10px", 
    background_color="lightblue"
))

# ---------- Training + Validation ----------
checkbox_training = widgets.Checkbox(description="📊 Training + Validation", style={"description_width": "0px"})
html_training_explanation = widgets.HTML("<p>Select if you'd like to train and validate a model.</p><hr>")

# html_selected_hyperparams = widgets.HTML("<b>Currently selected hyperparameters: Default</b>")
html_selected_hyperparams = edit_hyperparameters(input_data_path_shared.value)
checkbox_resume_training = widgets.Checkbox(description="Resume Training")
html_resume_training_explanation = widgets.HTML("<i>Select 'Resume Training' to load a previously saved model and continue training.</i>")

input_resume_model_path = widgets.Text(description="Model Path:", placeholder="Enter model path...")
html_resume_model_path_explanation = widgets.HTML("<i>Provide the path to the model checkpoint to resume training.</i>")

# Group training widgets with layout and background color
training_container = widgets.VBox([
    html_resume_training_explanation,
    checkbox_resume_training,
], layout=widgets.Layout(
    border="1px solid black", 
    padding="10px", 
    margin="10px", 
    background_color="lightgreen"
))

# ---------- Evaluation ----------
checkbox_evaluation = widgets.Checkbox(description="🧐 Evaluation", style={"description_width": "0px"})
html_checkbox_evaluation_explanation = widgets.HTML("<i>Select 'Evaluation' to evaluate your model on a test dataset.</i><hr>")

input_model_path_eval = widgets.Text(description="Model Path:", placeholder="Enter path...")
html_model_path_eval_explanation = widgets.HTML("<i>Provide the path to the model checkpoint you want to evaluate.</i>")

slider_data_split = widgets.FloatSlider(description="Data Split", min=0.1, max=1.0, step=0.1, value=0.2)
html_slider_data_split_explanation = widgets.HTML("<i>Select the proportion of the data to use for evaluation (default is 0.2).</i>")

# Group evaluation widgets with layout and background color
evaluation_container = widgets.VBox([
    checkbox_evaluation,
    input_model_path_eval, 
    html_model_path_eval_explanation,
    slider_data_split,
    html_slider_data_split_explanation
], layout=widgets.Layout(
    border="1px solid black", 
    padding="10px", 
    margin="10px", 
    background_color="lightyellow"
))

# ---------- Dynamic Display Update ----------
output = widgets.Output()
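import threading

# Timer handle shared across calls; None until the first change arrives
debounce_timer = None

# Debounce function: restart a 0.5s timer on every change so that
# update_hyperparams runs only once the input has settled
def debounce_update(change):
    global debounce_timer
    if debounce_timer:
        debounce_timer.cancel()  # Cancel previous timer

    # Start a new timer with 0.5s delay
    debounce_timer = threading.Timer(0.5, update_hyperparams)
    debounce_timer.start()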



def update_resume(change):
    # Check if Resume Training is selected and update the container dynamically
    if checkbox_resume_training.value:
        # Add Model Path input to training container dynamically if not already added
        if input_resume_model_path not in training_container.children:
            training_container.children = list(training_container.children) + [input_resume_model_path]
    else:
        # Remove Model Path input if Resume Training is deselected
        if input_resume_model_path in training_container.children:
            training_container.children = [child for child in training_container.children if child != input_resume_model_path]



                    
def update_ui(change):
    with output:
        clear_output()

        # Display shared Data Path widget only once if Hyperparameter Search, Training, or Evaluation is selected
        display(input_data_path_shared, datapath_explanation)
        
        # Hyperparameter Search
        if checkbox_hyperparam.value:
            display(checkbox_hyperparam, hyperparam_container, html_hyperparam_explanation)
            
            
            # Check if Custom Search is selected and update the container dynamically
            if radio_search_type.value == 'Custom Search':
                # Add custom search form to hyperparameter section dynamically if not already added
                if form_custom_hyperparam not in hyperparam_container.children:
                    hyperparam_container.children = list(hyperparam_container.children) + [html_custom_hyperparam_explanation, form_custom_hyperparam]
            else:
                # Remove custom search form if Default Search is selected
                if form_custom_hyperparam in hyperparam_container.children:
                    hyperparam_container.children = [child for child in hyperparam_container.children if child != form_custom_hyperparam]
                    hyperparam_container.children = [child for child in hyperparam_container.children if child != html_custom_hyperparam_explanation]            
        else: 
            display(checkbox_hyperparam, html_hyperparam_explanation)

        # Training + Validation
        if checkbox_training.value:
            display(checkbox_training, training_container, html_training_explanation)
            
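            if(not checkbox_hyperparam.value):
                # Prepend the hyperparameter editor only once, so repeated refreshes do not duplicate it
                if html_selected_hyperparams not in training_container.children:
                    training_container.children = [html_selected_hyperparams] + list(training_container.children) 
            else: 
                training_container.children = [child for child in training_container.children if child != html_selected_hyperparams]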

            
            
        else: 
            display(checkbox_training, html_training_explanation)
            training_container.children = [child for child in training_container.children if child != html_selected_hyperparams]

        # Evaluation
        if checkbox_evaluation.value:
            # If Training is selected, hide Model Path and Data Split input for Evaluation
            if checkbox_training.value:
                evaluation_container.children = []  # Remove all elements from evaluation container
            else:
                # Show both Model Path and Data Split slider if Training is not selected
                evaluation_container.children = [input_model_path_eval, html_model_path_eval_explanation, slider_data_split, html_slider_data_split_explanation]
            display(checkbox_evaluation, evaluation_container, html_checkbox_evaluation_explanation)
        else: 
            display(checkbox_evaluation, html_checkbox_evaluation_explanation)
            
def update_hyperparams(change=None):
    global html_selected_hyperparams
    # Recompute html_selected_hyperparams whenever input_data_path_shared changes
    training_container.children = [child for child in training_container.children if child != html_selected_hyperparams]
    html_selected_hyperparams = edit_hyperparameters(input_data_path_shared.value)
    update_ui(None)  # Refresh UI to reflect changes

# Initialize the selected hyperparameters
update_hyperparams(None)    
    

# Attach event listeners
checkbox_hyperparam.observe(update_ui, names="value")
radio_search_type.observe(update_ui, names="value")
checkbox_training.observe(update_ui, names="value")
checkbox_resume_training.observe(update_resume, names="value")
checkbox_evaluation.observe(update_ui, names="value")
input_data_path_shared.observe(update_hyperparams, names="value")

# Initial Display
update_ui(None)
display(output)



Output()

In [9]:
# Set data path
data_path = input_data_path_shared.value
if(not os.path.isdir(data_path)):
    print_error(f" Please check the provided Data Path for model development. Data Path {data_path} is not a directory.")
# Only list the directory if it exists; os.listdir fails on a missing path
elif(not os.listdir(data_path)):
    print_error(f" Please check the provided Data Path for model development. Data Path {data_path} is empty.")
print(f"[INFO]::Data path was set to: {data_path}")
logger = Logger(data_path)
model_trainer = ModelTrainer(data_path, logger)


# hyperparameter search
do_sweep = checkbox_hyperparam.value
hyperparameter_tuner = ModelTuner(model_trainer, data_path, logger)
best_config = None
if(do_sweep):
    sweep_config = update_config(config, input_widgets)
    hyperparameter_tuner.update_config(sweep_config)
    best_config = hyperparameter_tuner.tune()


# train model
if(not best_config):
    best_config = hyperparameter_tuner.update_hyperparameters(html_selected_hyperparams)
    print_info(f"Will use following hyperparameters for future training: {best_config}")    
resume_training = None
if checkbox_resume_training.value:
    resume_training = input_resume_model_path.value
if(resume_training):
    resume_training = Path(resume_training)
    if(resume_training.is_dir()):
        resume_training = Path(find_file(resume_training, "latest_model.pth"))
    if(not resume_training.is_file()):
        logger.log_error(f"Could not find resume path at {resume_training}. Will start training from scratch.")
        resume_training = None
logger.init("TrainingRun")
model_trainer.resume_from_checkpoint = resume_training
model_trainer.prepare(best_config)
if(checkbox_training.value):
    model_trainer.fit()

# Evaluation
if(checkbox_evaluation.value):
    start_evaluation = False
    eval_model = input_model_path_eval.value
    if(eval_model):
        eval_model = Path(eval_model)
        if(eval_model.is_dir()):
            eval_model = Path(find_file(eval_model, "best_model.pth")) 
        if(not eval_model.is_file()):
            logger.log_error(f"Could not find model at {eval_model}. Make sure to train a model before evaluation.")
            eval_model = None
        else: 
            start_evaluation = True
    else:
        recent_logs = logger.get_most_recent_logs()
        eval_model = ""
        for dataname, log_path in recent_logs.items():
            if(dataname == Path(data_path).stem):
                eval_model = Path(log_path+"/TrainingRun/checkpoints/best_model.pth")
                if(not eval_model.is_file()):
                    logger.log_error(f"Could not find a trained model at {eval_model}. Make sure you fully train a model first before evaluating.")
                else:
                    logger.log_info(f"Found most recent log at {eval_model}")
                    start_evaluation = True
            else: 
                continue
        if(not start_evaluation):
            logger.log_error(f"Could not find a trained model for dataset '{data_path}' at '{eval_model}'. Make sure you train a model first before evaluating.")
        
    if(start_evaluation):
        model_trainer.load_checkpoint(eval_model)
        model_trainer.test()      

 




2025-03-26 15:08:44,640 - INFO - Hyperparameters saved to logs/tem-herpes_2025-03-26_15-08-44/TrainingRun/hyperparameters.json


[INFO]::Data path was set to: data/tem-herpes/
Logger initialized. Logs will be saved to: logs/tem-herpes_2025-03-26_15-08-44
[INFO]::Will use following hyperparameters for future training: {'learning_rate': 0.0001, 'batch_size': 16}
Logger initialized. Logs will be saved to: logs/tem-herpes_2025-03-26_15-08-44/TrainingRun


  state = torch.load(
2025-03-26 15:08:47,663 - ERROR - Cound not find a trained model at logs/tem-herpes_2025-03-26_15-08-44/TrainingRun/checkpoints/best_model.pth. Make sure you fully train a model first before evaluating.
2025-03-26 15:08:47,664 - ERROR - Cound not find a trained model for dataset 'data/tem-herpes/' at 'logs/tem-herpes_2025-03-26_15-08-44/TrainingRun/checkpoints/best_model.pth'. Make sure you train a model first before evaluating.
