# End-to-End Model Workflow Tutorial

This notebook provides a comprehensive, step-by-step guide for a complete model training workflow on the Dataloop platform. We will start with dataset preparation, proceed to a labeling task, annotate the data, and finally train and deploy a model.

### Overview

This guide will walk you through the following key stages:

1. **[Dependencies & Setup](#dependencies-setup):** Installing necessary libraries and connecting to your Dataloop environment.
2. **[Project & Dataset Preparation](#dataset-preparation):** Setting up a Dataloop project, installing a dataset from the Marketplace, and splitting it for machine learning tasks.
3. **[Labeling Task Creation & Annotation](#labeling-task):** Creating a task with a subset of data, assigning it for annotation, and programmatically completing the annotations.
4. **[Model Training & Deployment](#model-training):** Configuring a pre-trained model, fine-tuning it on our annotated data, and deploying it as a live service.
5. **[Conclusion & Next Steps](#conclusion):** Summarizing the process and suggesting further actions.

Let's get started!

## <a id='dependencies-setup'></a>1. Dependencies & Setup

### Install Dependencies

First, let's ensure that the necessary Python libraries are installed. This notebook requires `dtlpy` for interacting with the Dataloop platform. The following cell will install or upgrade the library quietly.

In [None]:
!pip install dtlpy --upgrade --quiet

### Import Required Libraries

Now, we import all the Python libraries that will be used throughout this tutorial.

In [None]:
import dtlpy as dl
from tqdm import tqdm
import pathlib
import time

### Set Up Dataloop Environment

To begin, we need to connect to the Dataloop platform. If you're not already logged in, running the cell below will prompt you to do so. Then, we'll either create a new project or get an existing one to work with.

In [None]:
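# Log in only when the cached token is no longer valid
if dl.token_expired():
    dl.login()

PROJECT_NAME = "<your project name here>"
try:
    # Reuse the project if it already exists
    project = dl.projects.get(project_name=PROJECT_NAME)
except dl.exceptions.NotFound:
    # Otherwise create a new project with this name
    project = dl.projects.create(project_name=PROJECT_NAME)
print(f"Project ready: [Name: {project.name}, ID: {project.id}]")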

> **Action Required:** In the cell above, replace `"<your project name here>"` with the desired name for your Dataloop project. If a project with this name already exists, the SDK will retrieve it; otherwise, a new project with this name will be created.

## <a id='dataset-preparation'></a>2. Project & Dataset Preparation

### Install Dataset from Marketplace

To train a classification model such as ResNet, we need a dataset with multiple classes. Here we will install the "Agricultural Seedlings Dataset" from the Dataloop App Marketplace. This Dataloop App (DPK) provides the dataset and also pre-installs the ResNet model package, which we will use later.

The code below will install the app, retrieve the created dataset, and wait for the app's setup services to complete.

In [None]:
dpk = dl.dpks.get(dpk_name="ml-compare-solution")
app = project.apps.install(dpk=dpk)
print(f"Seedlings dataset app installed: [Name: {app.name}, ID: {app.id}]")

dataset = project.datasets.get(dataset_name="V2 Plant Seedlings - Annotated")
print(f"Got Annotated Dataset: [Name: {dataset.name}, ID: {dataset.id}]")

filters = dl.Filters(resource=dl.FiltersResource.SERVICE, field="appId", values=app.id)
services = project.services.list(filters=filters)
if isinstance(services, dl.entities.PagedEntities):
    services = services.all()

service: dl.Service
print("Waiting for app services to finish, please wait...")
for service in services:
    service_executions = service.executions.list()
    if isinstance(service_executions, dl.entities.PagedEntities):
        service_executions = service_executions.all()
    for service_execution in tqdm(service_executions):
        service_execution.wait()
print("App services finished")

### Split Dataset for Training

For effective model training, it's crucial to split your dataset into training, validation, and test subsets. The Dataloop SDK provides a convenient method to do this by automatically tagging items. We will use an 80/10/10 split.

In [None]:
SUBSET_PERCENTAGES = {'train': 80, 'validation': 10, 'test': 10}
dataset.split_ml_subsets(percentages=SUBSET_PERCENTAGES)
print(f"Dataset split completed with {SUBSET_PERCENTAGES}")
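
To get a feel for what an 80/10/10 split means in item counts, here is a minimal, platform-independent sketch. It does not reproduce how Dataloop assigns items internally. The helper name `subset_counts`, the item count of 1005, and the rounding rule (leftover items go to the first subset) are assumptions made purely for illustration.

```python
def subset_counts(n_items: int, percentages: dict) -> dict:
    """Return how many items each subset receives for integer percentages.

    Hypothetical helper: items left over after integer division are given
    to the first subset in the mapping (an assumption, not the SDK's rule).
    """
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must sum to 100")
    counts = {name: n_items * pct // 100 for name, pct in percentages.items()}
    remainder = n_items - sum(counts.values())
    first = next(iter(counts))
    counts[first] += remainder
    return counts


# 1005 is a made-up dataset size.
example = subset_counts(1005, {"train": 80, "validation": 10, "test": 10})
print(example)
```

With these assumptions, 1005 items would yield 805 training, 100 validation, and 100 test items.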

## <a id='labeling-task'></a>3. Labeling Task Creation & Annotation

### Select Items and Create Task

In a real-world scenario, you might have a large dataset that needs annotation. Here, we'll simulate this by selecting a few items from each of our newly created subsets (train, validation, and test) and placing them into a labeling task.

In [None]:
train_filters = dl.Filters(field="metadata.system.tags.train", values=True)
valid_filters = dl.Filters(field="metadata.system.tags.validation", values=True)
test_filters = dl.Filters(field="metadata.system.tags.test", values=True)

train_items = list(dataset.items.list(filters=train_filters).all())[:2]
valid_items = list(dataset.items.list(filters=valid_filters).all())[:2]
test_items = list(dataset.items.list(filters=test_filters).all())[:2]

print(f"Train items: {[item.filename for item in train_items]}")
print(f"Valid items: {[item.filename for item in valid_items]}")
print(f"Test items: {[item.filename for item in test_items]}")
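
When only a handful of items are needed, it can be cheaper to stop iterating early than to build the full list and slice it. The sketch below shows the idea with `itertools.islice`. The `fake_items` generator is a hypothetical stand-in for a lazily paged listing, and `take_first` is an illustrative helper, not part of the SDK.

```python
from itertools import islice


def fake_items(total: int):
    """Hypothetical stand-in for a lazily paged item listing."""
    for index in range(total):
        yield f"item-{index}"


def take_first(iterable, count: int) -> list:
    """Return at most `count` elements without consuming the rest."""
    return list(islice(iterable, count))


sample = take_first(fake_items(10_000), 2)
print(sample)
```

Assuming `.all()` yields items lazily, the same helper could wrap `dataset.items.list(filters=train_filters).all()` so that only the first pages are fetched.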

Now we will create a new annotation task for these selected items. We will assign the task to ourselves for this tutorial. The task uses the dataset's `recipe`, which defines the set of possible labels (the ontology) for annotation.

In [None]:
your_user_email = dl.info()["user_email"]

task_name = "Model Training Task"
assignee_ids = [your_user_email]
workload = dl.Workload.generate(assignee_ids=assignee_ids)
task_owner = your_user_email
recipe = dataset.recipes.list()[0]
items = train_items + valid_items + test_items

task: dl.Task = project.tasks.create(
    task_name=task_name, 
    assignee_ids=assignee_ids,
    workload=workload,
    dataset=dataset,
    task_owner=task_owner,
    recipe_id=recipe.id,
    items=items
)
print(f"Created Task: [Name: {task.name}, ID: {task.id}, Items Count: {task.total_items}]")

When a task is created, the system generates an `assignment` for each assignee. Let's retrieve the one created for our user.

In [None]:
assignment: dl.Assignment = task.assignments.list()[0]
print(f"Task assignment: [Name: {assignment.name}, ID: {assignment.id}]")

### Annotate and Complete the Task Items

The next stage involves creating annotations for the dataset items and marking them as completed. For this tutorial, we'll programmatically add classification annotations. The label for each annotation will be derived from the item's parent folder name.
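
As a small, platform-independent illustration of deriving a label from the parent folder name, consider the sketch below. The paths and folder names are hypothetical, and `label_from_filename` is an illustrative helper. It assumes item filenames are POSIX-style remote paths such as `/<folder>/<file>`.

```python
from pathlib import PurePosixPath


def label_from_filename(filename: str) -> str:
    """Return the name of the folder that directly contains the file."""
    return PurePosixPath(filename).parent.name


# Hypothetical remote paths; the folder names are illustrative only.
print(label_from_filename("/Maize/img_001.png"))
print(label_from_filename("/train/Sugar beet/img_042.png"))
print(repr(label_from_filename("/img_at_root.png")))
```

Note that a file sitting directly at the root has no parent folder name, so the derived label is an empty string. In a real workflow you would probably want to skip or flag such items rather than upload an empty label.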

We'll use the script below to add annotations that are properly linked to both the task and assignment. Once annotated, we'll update the status of each item to `completed` to indicate it is ready for the training phase.

In [None]:
# We will use the following labels for our annotations
print(list(dataset.labels_flat_dict.keys()))

Now we will iterate through all items in the task, add annotations, and mark them as complete.

In [None]:
items = task.get_items()
if isinstance(items, dl.entities.PagedEntities):
    items = items.all()
items = list(items)

item: dl.Item
for item in items:
    # Delete the item annotations
    filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)  # Filter is required to delete all annotations
    item.annotations.delete(filters=filters)

    # Creating new classification annotation based on the item folder name
    builder = item.annotations.builder()
    label = pathlib.Path(item.filename).parent.name
    classification = dl.Classification(label=label, description=f"Created by assignment {assignment.id}")

    # Linking the annotations to the task
    metadata = {
        "system": {
            "recipeId": recipe.id,
            "taskId": task.id,
            "assignmentId": assignment.id
        }
    }

    # Adding the annotations to the item
    builder.add(
        annotation_definition=classification,
        metadata=metadata
    )
    item.annotations.upload(builder)
    item.update_status(
        status="completed", 
        assignment_id=assignment.id, 
        task_id=task.id
    )
    print(f"Uploaded annotations to Item: [ID: {item.id}, Filename: {item.filename}]")

## <a id='model-training'></a>4. Model Training & Deployment

### Configure Base Model

With our data annotated, we can now begin the training process. We will retrieve the pre-trained ResNet model that was installed with the dataset DPK. We then configure it by specifying which data subsets to use for training and validation, and by setting key hyperparameters like the number of epochs and batch size.

In [None]:
base_model = project.models.get(model_name="pretrained-resnet")

# Configure model metadata and subsets
train_filters = dl.Filters(field="metadata.system.tags.train", values=True)
val_filters = dl.Filters(field="metadata.system.tags.validation", values=True)

base_model.add_subset("train", train_filters)
base_model.add_subset("validation", val_filters)

# Set model configuration for ResNet training
base_model.configuration = {
    "num_epochs": 5,
    "batch_size": 32,
}

print(f"Base model configured: {base_model.name}")
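
To see how `num_epochs` and `batch_size` interact, the following sketch estimates the number of batches a training run would process. The figure of 1,000 training items is made up, and the sketch assumes the final partial batch of each epoch is kept. Whether the ResNet adapter behaves that way is not confirmed here.

```python
import math


def training_steps(n_train_items: int, batch_size: int, num_epochs: int) -> int:
    """Total batches processed, assuming the last partial batch is kept."""
    steps_per_epoch = math.ceil(n_train_items / batch_size)
    return steps_per_epoch * num_epochs


# 1,000 training items is a made-up figure for illustration.
total_steps = training_steps(n_train_items=1000, batch_size=32, num_epochs=5)
print(total_steps)
```

Under these assumptions, each epoch takes 32 batches (31 full ones plus a partial one), for 160 batches over 5 epochs.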

### Clone and Train Model

To fine-tune a model, we first `clone` the pre-trained model. This creates a new, trainable model entity in your project. We associate our dataset and its labels with this new model and then start the training process.

> **NOTE**: The training process can take a significant amount of time, depending on your dataset size, model configuration, and the available compute resources (GPU type).

In [None]:
finetuned_model_name = base_model.name + "-finetuned"
labels = [label.tag for label in dataset.labels]
finetuned_model = base_model.clone(model_name=finetuned_model_name, dataset=dataset, labels=labels)
print(f"Created new model for finetuning: [Name: {finetuned_model.name}, ID: {finetuned_model.id}]")

In [None]:
print(f"Training model: {finetuned_model.name}")
execution = finetuned_model.train()
print(f"Training execution: [ID: {execution.id}, status: {execution.status}]")

The cell below will periodically check the status of the training execution and wait for it to complete. You can also monitor the training progress, view logs, and see performance metrics directly in the Dataloop platform by navigating to your model's page.

In [None]:
# Wait for training to complete
print("Waiting for training to complete...")

while execution.in_progress():
    print("Training in progress... checking again in 2 minutes")
    time.sleep(120)  # Sleep for 2 minutes
    execution = dl.executions.get(execution_id=execution.id)

# Fetch the final status once and reuse it
final_status = execution.get_latest_status()['status']
if final_status == "success":
    print("Training completed successfully!")
else:
    print(f"Training did not succeed. Final status: {final_status}")
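
The polling loop above waits for as long as the execution stays in progress. If you prefer an upper bound on the wait, one option is a generic poll-with-timeout helper, sketched below without any SDK calls. The `wait_for_status` helper, the `TERMINAL_STATUSES` set, and the simulated status source are all assumptions for illustration. Check the platform's own list of terminal execution states before relying on it.

```python
import time

# Assumed set of terminal states for illustration only.
TERMINAL_STATUSES = {"success", "failed"}


def wait_for_status(get_status, interval_s: float, timeout_s: float) -> str:
    """Poll `get_status()` until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if another sleep would take us past the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"still '{status}' after {timeout_s} seconds")
        time.sleep(interval_s)


# Simulated status source: two in-progress polls, then success.
_fake_statuses = iter(["in-progress", "in-progress", "success"])
final = wait_for_status(lambda: next(_fake_statuses), interval_s=0.01, timeout_s=1.0)
print(final)
```

In this notebook, `get_status` could be a small function that re-fetches the execution and returns its latest status string, with `interval_s` and `timeout_s` chosen to match the expected training time.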

### Deploy the Trained Model

Once training is complete, the final step is to deploy the fine-tuned model. Deployment creates a live service endpoint that can be used for inference on new, unseen data.

In [None]:
# Get the updated model entity
finetuned_model = project.models.get(model_id=finetuned_model.id)

# Deploy the model
service = finetuned_model.deploy()
print(f"Model deployed successfully: [Name: {service.name}, ID: {service.id}]")

## <a id='conclusion'></a>5. Conclusion & Next Steps

Congratulations! You have completed an end-to-end model workflow on Dataloop. Along the way, you:

1. Set up your Dataloop environment and project.
2. Installed a dataset from the Marketplace and split it into ML subsets for training.
3. Created a labeling task for a subset of the data.
4. Programmatically annotated the task items and marked them as completed.
5. Configured and fine-tuned a pre-trained model on your annotated data.
6. Deployed the trained model as a live service for inference.

### Next Steps

From here, you can explore more advanced concepts:
- Use your deployed model to make predictions on new items.
- Create a full pipeline that automates this entire workflow.
- Explore active learning loops to intelligently select which items to annotate next.
- Check out our [Active Learning Pipeline Tutorial](https://docs.dataloop.ai/docs/active-learning-pipeline) for an example.