# PyTorch DDP Fashion MNIST Training Example

This example shows how to train a neural network that classifies images from the [Fashion MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset, using [PyTorch Distributed Data Parallel (DDP)](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) to spread training across multiple nodes.

## Install the Kubeflow Training Python SDK

Install the Kubeflow Training Python SDK before running this notebook.
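Before running the remaining cells, it can help to confirm that the SDK is importable in the current kernel. The following is a minimal sketch using only the standard library. The helper name `sdk_available` is hypothetical, and the module path `kubeflow.training` is taken from the import cell in this notebook.

```python
import importlib.util


def sdk_available(module_name: str = "kubeflow.training") -> bool:
    """Return True if `module_name` can be located in this environment."""
    try:
        # find_spec imports parent packages, so a missing top-level
        # package raises ModuleNotFoundError instead of returning None.
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        return False


if __name__ == "__main__":
    if sdk_available():
        print("Kubeflow Training SDK found.")
    else:
        print("Kubeflow Training SDK not found; install it before continuing.")
```

If the check fails, install the SDK into the same environment that backs the notebook kernel and restart the kernel.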

## Create the Kubeflow Training Client

In [None]:
from kubeflow.training import Trainer, TrainingClient
from mnist import train_fashion_mnist

In [None]:
client = TrainingClient()

## Start the Train Job

In [None]:
job_name = client.train(
    runtime_ref="torch-distributed",
    trainer=Trainer(
        func=train_fashion_mnist,
        func_args={
            "backend": "nccl",  # distributed backend; NCCL targets NVIDIA GPUs
            "batch_size": 100,
            "test_batch_size": 100,
            "epochs": 100,
            "lr": 1e-1,
            "lr_gamma": 0.7,
            "lr_period": 25,
            "seed": 0,
            "log_interval": 10,
            "save_model": False,
        },
        num_nodes=4,  # number of training nodes
        resources_per_node={
            "nvidia.com/gpu": 1,  # request one NVIDIA GPU on each node
        },
    ),
)
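The hyperparameters above interact with the node count in ways that are easy to overlook. The sketch below works through the arithmetic under two assumptions that this notebook does not confirm, because the body of `train_fashion_mnist` is not shown: that `lr`, `lr_gamma`, and `lr_period` describe a step-decay schedule (the rate is multiplied by `lr_gamma` every `lr_period` epochs), and that `batch_size` is applied per DDP process with one process per GPU. The function names `step_decay_lr` and `global_batch_size` are hypothetical.

```python
import math

# Values copied from func_args and the Trainer spec in the cell above.
LR = 1e-1
LR_GAMMA = 0.7
LR_PERIOD = 25
EPOCHS = 100
BATCH_SIZE = 100
NUM_NODES = 4
GPUS_PER_NODE = 1


def step_decay_lr(epoch: int, base_lr: float = LR,
                  gamma: float = LR_GAMMA, period: int = LR_PERIOD) -> float:
    """Learning rate at a 0-indexed epoch under the ASSUMED step-decay schedule."""
    return base_lr * gamma ** (epoch // period)


def global_batch_size(per_process: int = BATCH_SIZE, nodes: int = NUM_NODES,
                      gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Samples per optimizer step, ASSUMING one DDP process per GPU."""
    return per_process * nodes * gpus_per_node


if __name__ == "__main__":
    # Print the rate at the start of each decay period.
    for epoch in range(0, EPOCHS, LR_PERIOD):
        print(f"epoch {epoch:3d}: lr = {step_decay_lr(epoch):.4f}")
    print("global batch size:", global_batch_size())
```

If these assumptions hold, the learning rate decays three times over the 100 epochs (at epochs 25, 50, and 75), and each optimizer step averages gradients over 400 samples rather than 100. This is worth keeping in mind when comparing results against a single-GPU run.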

## Watch the Train Job Logs

In [None]:
client.get_job_logs(job_name, follow=True)