# Starting a training round for a task
In this notebook, we'll trigger training for a task in a project and monitor the progress of the training job. We'll use the project created in notebook [004](004_create_pipeline_project_from_dataset.ipynb), so if you haven't run that notebook yet, run it first to make sure the project exists and is ready for training.

In [None]:
# As usual, we'll connect to the platform first, using the credentials from the .env file.

from dotenv import dotenv_values
from sc_api_tools import SCRESTClient

env_variables = dotenv_values(dotenv_path=".env")

if not env_variables:
    print("Unable to load login details from .env file, please make sure the file exists at the root of the notebooks directory.")

client = SCRESTClient(
    host=env_variables.get('HOST'),
    username=env_variables.get('USERNAME'),
    password=env_variables.get('PASSWORD')
)
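
The check above only prints a warning when the `.env` file is empty or missing, so a single missing key would go unnoticed until login fails. Below is a minimal sketch of a stricter check, assuming that the three keys read above (`HOST`, `USERNAME`, `PASSWORD`) are all required. The helper name `find_missing_credentials` is hypothetical and not part of `sc_api_tools`, and an in-memory mapping stands in for the result of `dotenv_values`.

```python
from typing import List, Mapping, Optional

# Keys this notebook reads from the .env file when creating the client
REQUIRED_KEYS = ("HOST", "USERNAME", "PASSWORD")


def find_missing_credentials(env: Mapping[str, Optional[str]]) -> List[str]:
    """Return the required keys that are absent or empty in `env`.

    Hypothetical helper for illustration; not part of sc_api_tools.
    """
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example mapping standing in for dotenv_values(dotenv_path=".env")
example_env = {"HOST": "https://example.invalid", "USERNAME": "user"}
missing = find_missing_credentials(example_env)
if missing:
    print(f"Missing login details in .env file: {', '.join(missing)}")
```

In the notebook itself, you could pass `env_variables` to the helper before creating the client.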

## Selecting a project for training
As before, let's list all projects in the workspace and select the one that we want to train.

In [None]:
from sc_api_tools.rest_managers import ProjectManager

project_manager = ProjectManager(session=client.session, workspace_id=client.workspace_id)
projects = project_manager.list_projects()

We'll use the `COCO multitask animal demo` project that we created in notebook [004](004_create_pipeline_project_from_dataset.ipynb).

In [None]:
PROJECT_NAME = "COCO multitask animal demo"

project = project_manager.get_project_by_name(project_name=PROJECT_NAME)

## Preparing to start training

#### Setting up the TrainingManager
To start and monitor training jobs on the platform, a `TrainingManager` needs to be created for the project:

In [None]:
from sc_api_tools.rest_managers import TrainingManager

training_manager = TrainingManager(session=client.session, workspace_id=client.workspace_id, project=project)

#### Selecting a task to train
The first thing to do is select the task that we want to train. Let's go with the `detection` task in our project, which is the first trainable task in the pipeline. We'll print a summary of the task to make sure we pick the right one.

In [None]:
task = project.get_trainable_tasks()[0]
print(task.summary)

#### Listing the available algorithms
Now, let's list the available algorithms for this task. The `training_manager` can be used for this:

In [None]:
available_algorithms = training_manager.get_algorithms_for_task(task=task)
print(available_algorithms.summary)

Let's go with the `ATSS` algorithm, which is a larger and more accurate model than the `SSD` one. Because of its size it is also slower, but let's say we care most about accuracy for now.

In [None]:
algorithm = available_algorithms.get_by_name(name='ATSS')

## Checking platform status
Before we start a new training round, it is a good idea to check the platform status to make sure the project isn't already running another job. If it is, submitting a new job might not start training as expected, depending on which job is already running. The `training_manager` can also be used to check the project status:

In [None]:
status = training_manager.get_status()
print(status.summary)

## Starting the training
At this point we can start the training, using the `training_manager.train_task()` method. The method takes additional optional parameters such as `train_from_scratch` and `enable_pot_optimization`, but we'll leave these at their default value (`False`) for now. The `train_task()` method returns a `Job` object that we can use to monitor the training progress.

In [None]:
job = training_manager.train_task(
    algorithm=algorithm, 
    task=task,
)

### Monitoring the training process
Using the `training_manager` and the training `job` we just started, we can monitor the training progress on the platform. The `training_manager.monitor_jobs()` method can monitor the status of one or several jobs, and will update the job status every 15 seconds. Program execution is halted until all jobs are completed (either successfully or cancelled/failed). Even if you only want to monitor a single job, be sure to pass it to the `monitor_jobs()` method in a list, as shown in the cell below.

> **NOTE**: Because training the task will take quite a bit of time, you may want to interrupt the monitoring at some point. This can be done by selecting the cell in which the monitoring is running and pressing the 'Interrupt the kernel' (solid square) button at the top of the page, or by navigating to the 'Kernel' menu in the top menu bar and selecting 'Interrupt the kernel' there. This will not cancel the job on the platform; it will only stop the job progress monitoring in the notebook.

In [None]:
training_manager.monitor_jobs([job])

## Getting the model resulting from the training job
Once the training has finished successfully, we can set up a `ModelManager` for the project and use it to get the model that was trained in this particular job.

In [None]:
from sc_api_tools.rest_managers import ModelManager

model_manager = ModelManager(session=client.session, workspace_id=client.workspace_id, project=project)

To get the model information, simply pass the job to the `model_manager.get_model_for_job()` method. Note that this will not download the actual model weights themselves. Instead, it will return a `Model` object that holds all metadata for the model, such as the score it achieved on the test dataset, its creation date, the algorithm that it implements, etc.

Trying to request the model while the training job is still running will result in a `ValueError`. In that case, please be patient and try again when the job is completed.

In [None]:
model = model_manager.get_model_for_job(job)
print(model.overview)