<a href="https://colab.research.google.com/github/Irenekayla/image-classification-flowers/blob/main/Part_2_image_classifier_project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## AI Programming with Python Nanodegree: Image Classifier Project
  - Do not change the first 2 code cells; they set up the `flowers` dataset and `cat_to_name.json`. Start writing code from the third code cell onwards.
  - To use this notebook: `File > Save a copy in Drive`
  

### Code Explanation:

- **Setting Up Flower Dataset:**
  - `data_dir = './flowers'`: Defines the directory path for the flower dataset.
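  - `FLOWERS_DIR = Path(data_dir)`: Wraps the path in a `pathlib.Path` object (a `PosixPath` on Colab), which provides methods such as `is_dir()` and `mkdir()` and the `/` join operator.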

- **Downloading and Extracting Dataset:**
  - `if not FLOWERS_DIR.is_dir()`: Checks if the dataset directory exists.
    - `FLOWERS_DIR.mkdir(parents=True, exist_ok=True)`: Creates the directory if not present.
  - `TARBALL = FLOWERS_DIR / "flower_data.tar.gz"`: Defines the tarball path.
  - Downloads and extracts the dataset if not already present:
    - `request = requests.get(...)`: Downloads the 'flower_data.tar.gz' file.
    - `with open(TARBALL, "wb") as file_ref`: Writes the downloaded content to the tarball.
    - `with tarfile.open(TARBALL, "r") as tar_ref`: Opens the tarball; `tar_ref.extractall(FLOWERS_DIR)` then extracts its contents into the dataset directory.

- **Cleaning Up:**
  - `os.remove(TARBALL)`: Deletes the downloaded tarball to save space.

- **Status Messages:**
  - Prints informative messages about the directory creation, download, extraction, and cleanup.
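
The extract-and-clean-up steps above can be sketched in isolation. This is a minimal sketch, not the notebook's own cell: the helper name `extract_and_remove` and the tiny stand-in archive are assumptions made for illustration, so the example runs offline instead of downloading `flower_data.tar.gz`.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def extract_and_remove(tarball: Path, dest: Path) -> list:
    """Extract `tarball` into `dest`, delete the tarball, return member names."""
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r" lets tarfile detect the compression (gzip here) automatically
    with tarfile.open(tarball, "r") as tar_ref:
        names = tar_ref.getnames()
        tar_ref.extractall(dest)
    # Remove the archive once its contents are on disk
    os.remove(tarball)
    return names


# Build a tiny stand-in archive (hypothetical contents) so nothing is downloaded
workdir = Path(tempfile.mkdtemp())
sample = workdir / "sample.txt"
sample.write_text("rose")
demo_tarball = workdir / "demo.tar.gz"
with tarfile.open(demo_tarball, "w:gz") as tar_ref:
    tar_ref.add(sample, arcname="train/1/sample.txt")

demo_dir = workdir / "flowers"
members = extract_and_remove(demo_tarball, demo_dir)
print(members)
print((demo_dir / "train" / "1" / "sample.txt").read_text())
```

Keeping extraction and deletion in one function means the tarball is only removed after `extractall` has finished without raising.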


In [None]:
!pip install colab-xterm


Collecting colab-xterm
  Downloading colab_xterm-0.2.0-py3-none-any.whl.metadata (1.2 kB)
Downloading colab_xterm-0.2.0-py3-none-any.whl (115 kB)
Installing collected packages: colab-xterm
Successfully installed colab-xterm-0.2.0


In [None]:
%load_ext colabxterm


In [None]:
# imports
import os
import requests
from pathlib import Path
import tarfile

# defining dataset directory
data_dir = './flowers'

# using pathlib.Path for handling PosixPath
FLOWERS_DIR = Path(data_dir)

# downloading and setting up data if not already present
if not FLOWERS_DIR.is_dir():
    # creating directory
    FLOWERS_DIR.mkdir(parents=True, exist_ok=True)
    print(f"[INFO] Directory created: ./{FLOWERS_DIR}")

    print() # for readability

    # tarball path
    TARBALL = FLOWERS_DIR / "flower_data.tar.gz"

    # downloading and writing the tarball to './flowers' directory
    print(f"[INFO] Downloading the file 'flower_data.tar.gz' to ./{FLOWERS_DIR}")
    request = requests.get('https://s3.amazonaws.com/content.udacity-data.com/nd089/flower_data.tar.gz')
    request.raise_for_status()  # stop early if the download failed
    with open(TARBALL, "wb") as file_ref:
        file_ref.write(request.content)
        print(f"[INFO] 'flower_data.tar.gz' saved to ./{FLOWERS_DIR}")

    print() # for readability

    # extracting the downloaded tarball
    print(f"[INFO] Extracting the downloaded tarball to ./{FLOWERS_DIR}")
    with tarfile.open(TARBALL, "r") as tar_ref:
        tar_ref.extractall(FLOWERS_DIR)
        print(f"[INFO] 'flower_data.tar.gz' extracted successfully to ./{FLOWERS_DIR}")

    print() # for readability

    # using os.remove to delete the downloaded tarball
    print("[INFO] Deleting the tarball to save space.")
    os.remove(TARBALL)
else:
    print(f"[INFO] Dataset already setup at ./{FLOWERS_DIR}")

[INFO] Directory created: ./flowers

[INFO] Downloading the file 'flower_data.tar.gz' to ./flowers
[INFO] 'flower_data.tar.gz' saved to ./flowers

[INFO] Extracting the downloaded tarball to ./flowers
[INFO] 'flower_data.tar.gz' extracted successfully to ./flowers

[INFO] Deleting the tarball to save space.


### Code Explanation:

- **Creating a JSON File for Flower Categories:**
  - `data`: Defines a dictionary whose keys are category numbers stored as strings (`"1"` to `"102"`) and whose values are the corresponding flower names.
  - `with open('cat_to_name.json', 'w') as file`: Opens the file 'cat_to_name.json' for writing.
  - `json.dump(data, file)`: Writes the dictionary data to the JSON file.

- **Interpreting the Output:**
  - The code creates a JSON file named 'cat_to_name.json' that maps category labels to flower names. The prediction code looks up each predicted class label in this mapping to print a human-readable flower name.
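
Reading the mapping back can be sketched as follows. This is a minimal sketch under stated assumptions: it writes a three-entry excerpt to a temporary file rather than the full 102-entry `cat_to_name.json`, and the helper `label_to_name` is a hypothetical name introduced for illustration.

```python
import json
import tempfile
from pathlib import Path

# A three-entry excerpt of the mapping; the notebook writes all 102 entries
excerpt = {"1": "pink primrose", "21": "fire lily", "74": "rose"}

mapping_path = Path(tempfile.mkdtemp()) / "cat_to_name.json"
with open(mapping_path, "w") as file:
    json.dump(excerpt, file)

with open(mapping_path, "r") as file:
    cat_to_name = json.load(file)


def label_to_name(label, names):
    """Return the flower name for a category label, or the label itself if unknown."""
    # JSON object keys are always strings, so integer labels are converted first
    return names.get(str(label), str(label))


print(label_to_name(21, cat_to_name))
print(label_to_name("999", cat_to_name))
```

Because JSON keys are strings, a lookup with the integer `21` would miss without the `str()` conversion.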


In [None]:
import json

data = {
    "21": "fire lily", "3": "canterbury bells", "45": "bolero deep blue", "1": "pink primrose", "34": "mexican aster",
    "27": "prince of wales feathers", "7": "moon orchid", "16": "globe-flower", "25": "grape hyacinth", "26": "corn poppy",
    "79": "toad lily", "39": "siam tulip", "24": "red ginger", "67": "spring crocus", "35": "alpine sea holly",
    "32": "garden phlox", "10": "globe thistle", "6": "tiger lily", "93": "ball moss", "33": "love in the mist",
    "9": "monkshood", "102": "blackberry lily", "14": "spear thistle", "19": "balloon flower", "100": "blanket flower",
    "13": "king protea", "49": "oxeye daisy", "15": "yellow iris", "61": "cautleya spicata", "31": "carnation",
    "64": "silverbush", "68": "bearded iris", "63": "black-eyed susan", "69": "windflower", "62": "japanese anemone",
    "20": "giant white arum lily", "38": "great masterwort", "4": "sweet pea", "86": "tree mallow",
    "101": "trumpet creeper", "42": "daffodil", "22": "pincushion flower", "2": "hard-leaved pocket orchid",
    "54": "sunflower", "66": "osteospermum", "70": "tree poppy", "85": "desert-rose", "99": "bromelia", "87": "magnolia",
    "5": "english marigold", "92": "bee balm", "28": "stemless gentian", "97": "mallow", "57": "gaura",
    "40": "lenten rose", "47": "marigold", "59": "orange dahlia", "48": "buttercup", "55": "pelargonium",
    "36": "ruby-lipped cattleya", "91": "hippeastrum", "29": "artichoke", "71": "gazania", "90": "canna lily",
    "18": "peruvian lily", "98": "mexican petunia", "8": "bird of paradise", "30": "sweet william",
    "17": "purple coneflower", "52": "wild pansy", "84": "columbine", "12": "colt's foot", "11": "snapdragon",
    "96": "camellia", "23": "fritillary", "50": "common dandelion", "44": "poinsettia", "53": "primula",
    "72": "azalea", "65": "californian poppy", "80": "anthurium", "76": "morning glory", "37": "cape flower",
    "56": "bishop of llandaff", "60": "pink-yellow dahlia", "82": "clematis", "58": "geranium", "75": "thorn apple",
    "41": "barbeton daisy", "95": "bougainvillea", "43": "sword lily", "83": "hibiscus", "78": "lotus lotus",
    "88": "cyclamen", "94": "foxglove", "81": "frangipani", "74": "rose", "89": "watercress", "73": "water lily",
    "46": "wallflower", "77": "passion flower", "51": "petunia"
}

with open('cat_to_name.json', 'w') as file:
    json.dump(data, file)

In [None]:
%%writefile utils.py

import argparse
import json
import torch
from torchvision import datasets, transforms, models
from torch import nn, optim
from collections import OrderedDict
import torch.nn.functional as F
from PIL import Image
import numpy as np
import time
import copy
import os

def load_data(data_dir):
  """Loads the image dataset.

  Args:
    data_dir (str): Path to the directory containing the image dataset.

  Returns:
    dataloaders (dict): A dictionary containing the dataloaders for training, validation, and testing.
    image_datasets (dict): A dictionary containing the image datasets for training, validation, and testing.
    class_to_idx (dict): A dictionary mapping class names (folder names) to class indices.
  """

  # Define data transformations
  data_transforms = {
      'train': transforms.Compose([
          transforms.RandomRotation(30),
          transforms.RandomResizedCrop(224),
          transforms.RandomHorizontalFlip(),
          transforms.ToTensor(),
          transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
      ]),
      'test': transforms.Compose([
          transforms.Resize(256),
          transforms.CenterCrop(224),
          transforms.ToTensor(),
          transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
      ]),
      'valid': transforms.Compose([
          transforms.Resize(256),
          transforms.CenterCrop(224),
          transforms.ToTensor(),
          transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
      ]),
  }

  # Create image datasets
  image_datasets = {
      'train': datasets.ImageFolder(os.path.join(data_dir, 'train'), transform=data_transforms['train']),
      'test': datasets.ImageFolder(os.path.join(data_dir, 'test'), transform=data_transforms['test']),
      'valid': datasets.ImageFolder(os.path.join(data_dir, 'valid'), transform=data_transforms['valid'])
  }

  # Create dataloaders
  dataloaders = {
      'train': torch.utils.data.DataLoader(image_datasets['train'], batch_size=64, shuffle=True),
      'test': torch.utils.data.DataLoader(image_datasets['test'], batch_size=32, shuffle=True),
      'valid': torch.utils.data.DataLoader(image_datasets['valid'], batch_size=32, shuffle=True)
  }

  # Get class to index mapping
  class_to_idx = image_datasets['train'].class_to_idx

  return dataloaders, image_datasets, class_to_idx


def build_network(arch, hidden_units, drop_prob):
  """Builds the neural network.

  Args:
    arch (str): The name of the architecture to use.
    hidden_units (int): The number of hidden units in the classifier.
    drop_prob (float): The dropout probability.

  Returns:
    model (nn.Module): The built neural network.
  """

  # Load the pretrained model
  if arch == 'vgg13':
    model = models.vgg13(pretrained=True)
  elif arch == 'resnet50':
    model = models.resnet50(pretrained=True)
  else:
    raise ValueError('Invalid architecture. Choose from vgg13 or resnet50.')

  # Freeze the pretrained model's parameters
  for param in model.parameters():
    param.requires_grad = False

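  # Replace the classification head with a custom one.
  # VGG exposes its head as `classifier`; ResNet exposes it as `fc`.
  in_features = model.classifier[0].in_features if arch == 'vgg13' else model.fc.in_features
  classifier = nn.Sequential(OrderedDict([
      ('fc1', nn.Linear(in_features, hidden_units)),
      ('relu1', nn.ReLU()),
      ('dropout1', nn.Dropout(p=drop_prob)),
      ('fc2', nn.Linear(hidden_units, 102)),
      ('output', nn.LogSoftmax(dim=1))
  ]))

  if arch == 'vgg13':
    model.classifier = classifier
  else:
    # ResNet's forward pass uses `fc`; the alias keeps
    # `model.classifier.parameters()` valid for the optimizer
    model.fc = classifier
    model.classifier = model.fc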

  # Move the model to the GPU if available
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  model.to(device)

  return model

def train_model(model, dataloaders, criterion, optimizer, epochs=5):
  """Trains the model.

  Args:
    model (nn.Module): The neural network model.
    dataloaders (dict): A dictionary containing the dataloaders for training, validation, and testing.
    criterion (nn.Module): The loss function.
    optimizer (torch.optim): The optimizer.
    epochs (int): The number of epochs to train for.

  Returns:
    model (nn.Module): The trained model.
  """

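  # Run on the same device that build_network moved the model to
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

  # Track the best validation accuracy and the weights that produced it
  best_acc = 0.0
  best_model_wts = copy.deepcopy(model.state_dict())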

  # Iterate over epochs
  for epoch in range(epochs):
    print('Epoch {}/{}'.format(epoch+1, epochs))
    print('-' * 10)

    # Iterate over training and validation phases
    for phase in ['train', 'valid']:
      if phase == 'train':
        model.train()
      else:
        model.eval()

      # Track running loss and accuracy
      running_loss = 0.0
      running_corrects = 0

      # Iterate over data batches
      for inputs, labels in dataloaders[phase]:
        inputs = inputs.to(device)
        labels = labels.to(device)

        # Zero out the gradients
        optimizer.zero_grad()

        # Forward pass
        with torch.set_grad_enabled(phase == 'train'):
          outputs = model(inputs)
          loss = criterion(outputs, labels)

          _, preds = torch.max(outputs, 1)

          # Backward pass and optimization
          if phase == 'train':
            loss.backward()
            optimizer.step()

        # Update running loss and accuracy
        running_loss += loss.item() * inputs.size(0)
        running_corrects += torch.sum(preds == labels.data)

      # Calculate epoch loss and accuracy
      epoch_loss = running_loss / len(dataloaders[phase].dataset)
      epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

      print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

      # Deep copy the model if better validation accuracy is found
      if phase == 'valid' and epoch_acc > best_acc:
        best_acc = epoch_acc
        best_model_wts = copy.deepcopy(model.state_dict())

    print()

  # Load best model weights
  model.load_state_dict(best_model_wts)
  return model


def save_checkpoint(model, save_dir, arch, hidden_units, drop_prob, class_to_idx):
  """Saves the checkpoint.

  Args:
    model (nn.Module): The trained model.
    save_dir (str): The directory to save the checkpoint in.
    arch (str): The name of the architecture.
    hidden_units (int): The number of hidden units in the classifier.
    drop_prob (float): The dropout probability.
    class_to_idx (dict): A dictionary mapping class names (folder names) to class indices.
  """

  # Create the checkpoint dictionary
  checkpoint = {
      'arch': arch,
      'hidden_units': hidden_units,
      'drop_prob': drop_prob,
      'class_to_idx': class_to_idx,
      'state_dict': model.state_dict()
  }

  # Save the checkpoint
  torch.save(checkpoint, os.path.join(save_dir, 'checkpoint.pth'))

def load_checkpoint(filepath):
  """Loads the checkpoint.

  Args:
    filepath (str): The path to the checkpoint file.

  Returns:
    model (nn.Module): The loaded model.
    class_to_idx (dict): A dictionary mapping class names (folder names) to class indices.
  """

  # Load the checkpoint
  checkpoint = torch.load(filepath, map_location=torch.device('cpu'))

  # Build the network
  model = build_network(checkpoint['arch'], checkpoint['hidden_units'], checkpoint['drop_prob'])

  # Load the state dictionary
  model.load_state_dict(checkpoint['state_dict'])

  # Set the class to index mapping
  model.class_to_idx = checkpoint['class_to_idx']

  return model, checkpoint['class_to_idx']


def process_image(image_path):
  """Processes an image for prediction.

  Args:
    image_path (str): The path to the image file.

  Returns:
    img (torch.Tensor): The processed image as a PyTorch tensor.
  """

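  # Load the image (converted to RGB so grayscale or RGBA files also work)
  img = Image.open(image_path).convert('RGB')

  # Preprocess with the same pipeline used for the test and validation sets.
  # ToTensor already returns a float tensor, so no NumPy conversion is needed.
  preprocess = transforms.Compose([
      transforms.Resize(256),
      transforms.CenterCrop(224),
      transforms.ToTensor(),
      transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
  ])
  img = preprocess(img)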

  # Add a batch dimension
  img = img.unsqueeze(0)

  # Move the image to the GPU if available
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  img = img.to(device)

  return img

def predict(image_path, model, topk=5):
  """Predicts the class of an image.

  Args:
    image_path (str): The path to the image file.
    model (nn.Module): The trained model.
    topk (int): The number of top predictions to return.

  Returns:
    probs (list): A list of probabilities for the top predictions.
    classes (list): A list of class indices for the top predictions.
  """

  # Process the image
  img = process_image(image_path)

  # Set the model to evaluation mode
  model.eval()

  # Make the prediction
  with torch.no_grad():
    output = model(img)

  # Calculate the probabilities and class indices
  probs, classes = torch.exp(output).topk(topk)
  probs = probs.cpu().numpy()[0]
  classes = classes.cpu().numpy()[0]

  return probs, classes

def predict_image(image_path, model, class_to_idx, topk=5, category_names=None):
  """Predicts the class of an image and prints the results.

  Args:
    image_path (str): The path to the image file.
    model (nn.Module): The trained model.
    class_to_idx (dict): A dictionary mapping class names (folder names) to class indices.
    topk (int): The number of top predictions to return.
    category_names (dict): A dictionary mapping class names (folder names) to flower names.
  """

  # Predict the class
  probs, classes = predict(image_path, model, topk)

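  # Invert class_to_idx so predicted indices map back to class names
  idx_to_class = {val: key for key, val in class_to_idx.items()}

  # Get the class names
  class_names = [idx_to_class[class_] for class_ in classes]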

  # Print the results
  print('Top {} predictions:'.format(topk))
  for prob, class_name in zip(probs, class_names):
    print('  {}: {:.4f}'.format(class_name, prob))

  # If category_names is provided, print the category names
  if category_names:
    print('\nCategory names:')
    for class_name in class_names:
      print('  {}'.format(category_names[class_name]))

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Train or predict flower names')
    subparsers = parser.add_subparsers(dest='command')

    # Train command
    train_parser = subparsers.add_parser('train', help='Train a new model')
    train_parser.add_argument('data_dir', help='Path to the image data directory')
    train_parser.add_argument('--save_dir', default='.', help='Directory to save the checkpoint')
    train_parser.add_argument('--arch', default='resnet50', choices=['vgg13', 'resnet50'], help='Model architecture')
    train_parser.add_argument('--learning_rate', default=0.001, type=float, help='Learning rate')
    train_parser.add_argument('--hidden_units', default=512, type=int, help='Number of hidden units')
    train_parser.add_argument('--epochs', default=5, type=int, help='Number of epochs')
    train_parser.add_argument('--gpu', action='store_true', help='Use GPU for training')

    # Predict command
    predict_parser = subparsers.add_parser('predict', help='Predict flower name from an image')
    predict_parser.add_argument('image_path', help='Path to the image file')
    predict_parser.add_argument('checkpoint', help='Path to the checkpoint file')
    predict_parser.add_argument('--top_k', default=5, type=int, help='Number of top predictions to return')
    predict_parser.add_argument('--category_names', default=None, help='Path to a JSON file containing category names')
    predict_parser.add_argument('--gpu', action='store_true', help='Use GPU for inference')

    # Parse the arguments
    args = parser.parse_args()

    # Set the device
    device = torch.device("cuda" if torch.cuda.is_available() and args.gpu else "cpu")
    print(f"Using device: {device}")

    if args.command == 'train':
        # Load the data
        dataloaders, image_datasets, class_to_idx = load_data(args.data_dir)

        # Build the network
        model = build_network(args.arch, args.hidden_units, 0.1)

        # Define the loss function and optimizer
        criterion = nn.NLLLoss()
        optimizer = optim.Adam(model.classifier.parameters(), lr=args.learning_rate)

        # Train the model
        model = train_model(model, dataloaders, criterion, optimizer, epochs=args.epochs)

        # Save the checkpoint
        save_checkpoint(model, args.save_dir, args.arch, args.hidden_units, 0.1, class_to_idx)

        print('Model trained and saved.')

    elif args.command == 'predict':
        # Load the checkpoint
        model, class_to_idx = load_checkpoint(args.checkpoint)

        # Load the category names if provided
        category_names = None
        if args.category_names:
            with open(args.category_names, 'r') as f:
                category_names = json.load(f)

        # Predict the image
        predict_image(args.image_path, model, class_to_idx, topk=args.top_k, category_names=category_names)

    else:
        print('Invalid command. Use "train" or "predict".')

Writing utils.py
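
### Code Explanation:

`predict_image` has to turn a model output index back into a class folder name before it can look up a flower name. The sketch below illustrates why the reverse mapping is needed. It assumes the dataset's class folders are named `"1"` to `"102"` and emulates how torchvision's `ImageFolder` numbers classes (it sorts folder names as strings), without importing torch. The three-entry `cat_to_name` excerpt is copied from the mapping above.

```python
# Folder names in the flowers dataset are assumed to be the labels "1" .. "102"
folder_names = [str(label) for label in range(1, 103)]

# Emulate ImageFolder: folder names are sorted as strings, then numbered from 0
class_to_idx = {name: idx for idx, name in enumerate(sorted(folder_names))}

# Reverse mapping: model output index -> class folder name
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

# Excerpt of cat_to_name.json (class folder name -> flower name)
cat_to_name = {"1": "pink primrose", "10": "globe thistle", "100": "blanket flower"}

for idx in (0, 1, 2):
    label = idx_to_class[idx]
    print(idx, label, cat_to_name[label])
```

String sorting places `"10"` and `"100"` before `"2"`, so index 1 corresponds to folder `"10"`, not `"2"`. Using the output index directly as a key into `cat_to_name` would therefore name the wrong flower.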


In [None]:
%%writefile train.py
import argparse
import json
import torch
from torchvision import datasets, transforms, models
from torch import nn, optim
from collections import OrderedDict
import torch.nn.functional as F
from PIL import Image
import numpy as np
import time
import copy
import os

# Define data transformations
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(30),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

# Define function to load data
def load_data(data_dir):
    """Loads the image dataset.

    Args:
        data_dir (str): Path to the directory containing the image dataset.

    Returns:
        dataloaders (dict): A dictionary containing the dataloaders for training, validation, and testing.
        image_datasets (dict): A dictionary containing the image datasets for training, validation, and testing.
        class_to_idx (dict): A dictionary mapping class names (folder names) to class indices.
    """

    # Create image datasets
    image_datasets = {
        'train': datasets.ImageFolder(os.path.join(data_dir, 'train'), transform=data_transforms['train']),
        'test': datasets.ImageFolder(os.path.join(data_dir, 'test'), transform=data_transforms['test']),
        'valid': datasets.ImageFolder(os.path.join(data_dir, 'valid'), transform=data_transforms['valid'])
    }

    # Create dataloaders
    dataloaders = {
        'train': torch.utils.data.DataLoader(image_datasets['train'], batch_size=64, shuffle=True),
        'test': torch.utils.data.DataLoader(image_datasets['test'], batch_size=32, shuffle=True),
        'valid': torch.utils.data.DataLoader(image_datasets['valid'], batch_size=32, shuffle=True)
    }

    # Get class to index mapping
    class_to_idx = image_datasets['train'].class_to_idx

    return dataloaders, image_datasets, class_to_idx

# Define function to build the network
def build_network(arch, hidden_units, drop_prob):
    """Builds the neural network.

    Args:
        arch (str): The name of the architecture to use.
        hidden_units (int): The number of hidden units in the classifier.
        drop_prob (float): The dropout probability.

    Returns:
        model (nn.Module): The built neural network.
    """

    # Load the pretrained model
    if arch == 'vgg13':
        model = models.vgg13(pretrained=True)
    elif arch == 'resnet50':
        model = models.resnet50(pretrained=True)
    else:
        raise ValueError('Invalid architecture. Choose from vgg13 or resnet50.')

    # Freeze the pretrained model's parameters
    for param in model.parameters():
        param.requires_grad = False

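    # Replace the classification head with a custom one.
    # VGG exposes its head as `classifier`; ResNet exposes it as `fc`.
    in_features = model.classifier[0].in_features if arch == 'vgg13' else model.fc.in_features
    classifier = nn.Sequential(OrderedDict([
        ('fc1', nn.Linear(in_features, hidden_units)),
        ('relu1', nn.ReLU()),
        ('dropout1', nn.Dropout(p=drop_prob)),
        ('fc2', nn.Linear(hidden_units, 102)),
        ('output', nn.LogSoftmax(dim=1))
    ]))

    if arch == 'vgg13':
        model.classifier = classifier
    else:
        # ResNet's forward pass uses `fc`; the alias keeps
        # `model.classifier.parameters()` valid for the optimizer
        model.fc = classifier
        model.classifier = model.fc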

    # Move the model to the GPU if available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    return model

# Define function to train the model
def train_model(model, dataloaders, criterion, optimizer, epochs=5):
    """Trains the model.

    Args:
        model (nn.Module): The neural network model.
        dataloaders (dict): A dictionary containing the dataloaders for training, validation, and testing.
        criterion (nn.Module): The loss function.
        optimizer (torch.optim): The optimizer.
        epochs (int): The number of epochs to train for.

    Returns:
        model (nn.Module): The trained model.
    """
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Track the best validation accuracy and the weights that produced it
    best_acc = 0.0
    best_model_wts = copy.deepcopy(model.state_dict())

    # Iterate over epochs
    for epoch in range(epochs):
        print('Epoch {}/{}'.format(epoch+1, epochs))
        print('-' * 10)

        # Iterate over training and validation phases
        for phase in ['train', 'valid']:
            if phase == 'train':
                model.train()
            else:
                model.eval()

            # Track running loss and accuracy
            running_loss = 0.0
            running_corrects = 0

            # Iterate over data batches
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Zero out the gradients
                optimizer.zero_grad()

                # Forward pass
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)

                    _, preds = torch.max(outputs, 1)

                    # Backward pass and optimization
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # Update running loss and accuracy
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            # Calculate epoch loss and accuracy
            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # Deep copy the model if better validation accuracy is found
            if phase == 'valid' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    # Load best model weights
    model.load_state_dict(best_model_wts)
    return model

# Define function to save the checkpoint
def save_checkpoint(model, save_dir, arch, hidden_units, drop_prob, class_to_idx):
    """Saves the checkpoint.

    Args:
        model (nn.Module): The trained model.
        save_dir (str): The directory to save the checkpoint in.
        arch (str): The name of the architecture.
        hidden_units (int): The number of hidden units in the classifier.
        drop_prob (float): The dropout probability.
        class_to_idx (dict): A dictionary mapping class names (folder names) to class indices.
    """

    # Create the checkpoint dictionary
    checkpoint = {
        'arch': arch,
        'hidden_units': hidden_units,
        'drop_prob': drop_prob,
        'class_to_idx': class_to_idx,
        'state_dict': model.state_dict()
    }

    # Save the checkpoint
    torch.save(checkpoint, os.path.join(save_dir, 'checkpoint.pth'))

# Define function to parse command line arguments
def parse_args():
    parser = argparse.ArgumentParser(description='Train a flower image classifier')
    parser.add_argument('data_dir', help='Path to the image data directory')
    parser.add_argument('--save_dir', default='.', help='Directory to save the checkpoint')
    parser.add_argument('--arch', default='resnet50', choices=['vgg13', 'resnet50'], help='Model architecture')
    parser.add_argument('--learning_rate', default=0.001, type=float, help='Learning rate')
    parser.add_argument('--hidden_units', default=512, type=int, help='Number of hidden units')
    parser.add_argument('--epochs', default=5, type=int, help='Number of epochs')
    parser.add_argument('--drop_prob', default=0.1, type=float, help='Dropout probability')
    parser.add_argument('--gpu', action='store_true', help='Use GPU for training')
    args = parser.parse_args()
    return args

# Define main function
def main():
    # Parse command line arguments
    args = parse_args()

    # Set the device
    device = torch.device("cuda" if torch.cuda.is_available() and args.gpu else "cpu")
    print(f"Using device: {device}")

    # Load the data
    dataloaders, image_datasets, class_to_idx = load_data(args.data_dir)

    # Build the network
    model = build_network(args.arch, args.hidden_units, args.drop_prob)

    # Define the loss function and optimizer
    criterion = nn.NLLLoss()
    optimizer = optim.Adam(model.classifier.parameters(), lr=args.learning_rate)

    # Train the model
    model = train_model(model, dataloaders, criterion, optimizer, epochs=args.epochs)

    # Save the checkpoint
    save_checkpoint(model, args.save_dir, args.arch, args.hidden_units, args.drop_prob, class_to_idx)

    print('Model trained and saved.')

# Run the main function
if __name__ == '__main__':
    main()

Writing train.py
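
### Code Explanation:

The command-line interface of `train.py` can be explored without starting a training run. The sketch below rebuilds a parser with the same flags as `parse_args` and feeds it an explicit argument list. The function name `build_train_parser` and the sample arguments are assumptions for illustration; `--save_dir` is omitted for brevity.

```python
import argparse


def build_train_parser():
    """Rebuild a parser with the same training flags as train.py (illustrative copy)."""
    parser = argparse.ArgumentParser(description='Train a flower image classifier')
    parser.add_argument('data_dir', help='Path to the image data directory')
    parser.add_argument('--arch', default='resnet50', choices=['vgg13', 'resnet50'])
    parser.add_argument('--learning_rate', default=0.001, type=float)
    parser.add_argument('--hidden_units', default=512, type=int)
    parser.add_argument('--epochs', default=5, type=int)
    parser.add_argument('--drop_prob', default=0.1, type=float)
    parser.add_argument('--gpu', action='store_true')
    return parser


# Passing a list to parse_args stands in for the real command line
args = build_train_parser().parse_args(
    ['flowers', '--arch', 'vgg13', '--epochs', '3', '--gpu']
)
print(vars(args))
```

Flags that are not passed fall back to their defaults, so `hidden_units` stays at 512 and `drop_prob` at 0.1 here. In the notebook, the equivalent invocation would be along the lines of `!python train.py flowers --arch vgg13 --epochs 3 --gpu`.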


In [18]:
%%writefile predict.py
import argparse
import json
import torch
from torchvision import datasets, transforms, models
from torch import nn, optim
from collections import OrderedDict
import torch.nn.functional as F
from PIL import Image
import numpy as np
import time
import copy

# Define data transformations
data_transforms = {
    'test': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

# Define function to load checkpoint
def load_checkpoint(filepath):
    """Loads the checkpoint.

    Args:
        filepath (str): The path to the checkpoint file.

    Returns:
        model (nn.Module): The loaded model.
        class_to_idx (dict): A dictionary mapping class names (folder names) to class indices.
    """

    checkpoint = torch.load(filepath, map_location=torch.device('cpu'), weights_only=True)

    # Build the network
    model = build_network(checkpoint['arch'], checkpoint['hidden_units'], checkpoint['drop_prob'])

    # Load the state dictionary
    model.load_state_dict(checkpoint['state_dict'])

    # Set the class to index mapping
    model.class_to_idx = checkpoint['class_to_idx']

    return model, checkpoint['class_to_idx']

# Define function to build the network
def build_network(arch, hidden_units, drop_prob):
    """Builds the neural network.

    Args:
        arch (str): The name of the architecture to use.
        hidden_units (int): The number of hidden units in the classifier.
        drop_prob (float): The dropout probability.

    Returns:
        model (nn.Module): The built neural network.
    """

    # Load the pretrained model
    if arch == 'vgg13':
        model = models.vgg13(weights='IMAGENET1K_V1')  # Use weights argument
    elif arch == 'resnet50':
        model = models.resnet50(weights='IMAGENET1K_V2')  # Use weights argument
    else:
        raise ValueError('Invalid architecture. Choose from vgg13 or resnet50.')

    # Freeze the pretrained model's parameters
    for param in model.parameters():
        param.requires_grad = False

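    # Replace the classification head with a custom one.
    # VGG exposes its head as `classifier`; ResNet exposes it as `fc`.
    in_features = model.classifier[0].in_features if arch == 'vgg13' else model.fc.in_features
    classifier = nn.Sequential(OrderedDict([
        ('fc1', nn.Linear(in_features, hidden_units)),
        ('relu1', nn.ReLU()),
        ('dropout1', nn.Dropout(p=drop_prob)),
        ('fc2', nn.Linear(hidden_units, 102)),
        ('output', nn.LogSoftmax(dim=1))
    ]))

    if arch == 'vgg13':
        model.classifier = classifier
    else:
        # ResNet's forward pass uses `fc`; the alias matches the
        # `fc.*` and `classifier.*` keys written by the training script
        model.fc = classifier
        model.classifier = model.fc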

    # Move the model to the GPU if available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    return model

# Define function to process an image
def process_image(image_path):
    """Processes an image for prediction.

    Args:
        image_path (str): The path to the image file.

    Returns:
        img (torch.Tensor): The processed image as a PyTorch tensor.
    """

    # Load the image
    img = Image.open(image_path)

    # Preprocess the image
    img = data_transforms['test'](img)

    # Add a batch dimension
    img = img.unsqueeze(0)

    # Move the image to the GPU if available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    img = img.to(device)

    return img

# Define function to predict the class of an image
def predict(image_path, model, topk=5):
    """Predicts the class of an image.

    Args:
        image_path (str): The path to the image file.
        model (nn.Module): The trained model.
        topk (int): The number of top predictions to return.

    Returns:
        probs (numpy.ndarray): Probabilities of the top predictions, highest first.
        classes (numpy.ndarray): Model output indices of the top predictions.
    """

    # Process the image
    img = process_image(image_path)

    # Set the model to evaluation mode
    model.eval()

    # Make the prediction
    with torch.no_grad():
        output = model(img)

    # Calculate the probabilities and class indices
    probs, classes = torch.exp(output).topk(topk)
    probs = probs.cpu().numpy()[0]
    classes = classes.cpu().numpy()[0]

    return probs, classes

# Define function to predict an image and print the results
def predict_image(image_path, model, class_to_idx, topk=5, category_names=None):
    """Predicts the class of an image and prints the results.

    Args:
        image_path (str): The path to the image file.
        model (nn.Module): The trained model.
        class_to_idx (dict): A dictionary mapping class labels (dataset folder names) to model output indices.
        topk (int): The number of top predictions to return.
        category_names (dict): A dictionary mapping class labels to flower names.
    """

    # Predict the class
    probs, classes = predict(image_path, model, topk)

    # Invert class_to_idx so model output indices map back to class labels
    idx_to_class = {val: key for key, val in class_to_idx.items()}

    # Look up the class label (dataset folder name) for each predicted index
    class_names = [idx_to_class[class_] for class_ in classes]

    # Print the results
    print('Top {} predictions:'.format(topk))
    for prob, class_name in zip(probs, class_names):
        print('  {}: {:.4f}'.format(class_name, prob))

    # If category_names is provided, print the category names
    if category_names:
        print('\nCategory names:')
        for class_name in class_names:
            print('  {}'.format(category_names[class_name]))

# Define function to parse command line arguments
def parse_args():
    parser = argparse.ArgumentParser(description='Predict flower name from an image')
    parser.add_argument('image_path', help='Path to the image file')
    parser.add_argument('checkpoint', help='Path to the checkpoint file')
    parser.add_argument('--top_k', default=5, type=int, help='Number of top predictions to return')
    parser.add_argument('--category_names', default=None, help='Path to a JSON file containing category names')
    parser.add_argument('--gpu', action='store_true', help='Use GPU for inference')
    args = parser.parse_args()
    return args

# Define main function
def main():
    # Parse command line arguments
    args = parse_args()

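    # Set the device. Note: this value is only reported here; the helper
    # functions above select CUDA whenever it is available, regardless of --gpu.
    device = torch.device("cuda" if torch.cuda.is_available() and args.gpu else "cpu")
    print(f"Using device: {device}")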

    # Load the checkpoint
    model, class_to_idx = load_checkpoint(args.checkpoint)

    # Load the category names if provided
    category_names = None
    if args.category_names:
        with open(args.category_names, 'r') as f:
            category_names = json.load(f)

    # Predict the image
    predict_image(args.image_path, model, class_to_idx, topk=args.top_k, category_names=category_names)

# Run the main function
if __name__ == '__main__':
    main()

Overwriting predict.py
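
The core of `predict` and `predict_image` is: exponentiate the log-softmax outputs, keep the top-k entries, then invert `class_to_idx` to recover the folder labels. The inversion matters because `ImageFolder` assigns indices by sorting folder names as strings, so folder `10` does not end up at index 10. Below is a minimal pure-Python sketch of that step. The helper `top_k_labels` and the four-class mapping are hypothetical and used only for illustration; they are not part of `predict.py`.

```python
import math

def top_k_labels(log_probs, class_to_idx, k=3):
    """Return the k most likely (label, probability) pairs, highest first."""
    # Invert the mapping: model output index -> class label (folder name)
    idx_to_class = {idx: label for label, idx in class_to_idx.items()}
    # LogSoftmax outputs are log-probabilities, so exp() recovers probabilities
    probs = [math.exp(lp) for lp in log_probs]
    # Rank the output indices by probability and keep the first k
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(idx_to_class[i], probs[i]) for i in ranked]

# Hypothetical values: string-sorted folder names and made-up log-probabilities
class_to_idx = {"1": 0, "10": 1, "100": 2, "101": 3}
log_probs = [math.log(0.6), math.log(0.1), math.log(0.25), math.log(0.05)]

result = top_k_labels(log_probs, class_to_idx, k=2)
for label, prob in result:
    print(f"{label}: {prob:.4f}")
```

Here index 2 maps back to label `100`, not `2`, which is why the raw indices returned by `topk` should not be printed directly as class labels.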


In [15]:
!python train.py ./flowers --arch vgg13 --learning_rate 0.003 --epochs 5 --gpu --drop_prob 0.1

Using device: cuda
Epoch 1/5
----------
train Loss: 3.2050 Acc: 0.3658
valid Loss: 1.2407 Acc: 0.6699

Epoch 2/5
----------
train Loss: 1.5508 Acc: 0.5894
valid Loss: 0.9215 Acc: 0.7543

Epoch 3/5
----------
train Loss: 1.4000 Acc: 0.6392
valid Loss: 0.8032 Acc: 0.8044

Epoch 4/5
----------
train Loss: 1.2699 Acc: 0.6685
valid Loss: 0.8978 Acc: 0.7848

Epoch 5/5
----------
train Loss: 1.2507 Acc: 0.6806
valid Loss: 0.8320 Acc: 0.7885

Model trained and saved.
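
In the log above, validation accuracy peaks at epoch 3 (0.8044) and is lower in epochs 4 and 5, so the weights saved after the final epoch may not be the best ones. Whether `train.py` keeps the best epoch is not shown here. As a rough sketch, the helper below (a hypothetical `best_valid_epoch`, not part of the project scripts) parses log text in this format and reports the epoch with the highest validation accuracy.

```python
import re

def best_valid_epoch(log_text):
    """Return (epoch, accuracy) for the highest validation accuracy in the log."""
    epoch, best = None, None
    for line in log_text.splitlines():
        header = re.match(r"Epoch (\d+)/\d+", line)
        if header:
            epoch = int(header.group(1))
            continue
        valid = re.match(r"valid Loss: ([\d.]+) Acc: ([\d.]+)", line)
        if valid and epoch is not None:
            acc = float(valid.group(2))
            if best is None or acc > best[1]:
                best = (epoch, acc)
    return best

# Validation lines copied from the training output above
log = """\
Epoch 1/5
valid Loss: 1.2407 Acc: 0.6699
Epoch 2/5
valid Loss: 0.9215 Acc: 0.7543
Epoch 3/5
valid Loss: 0.8032 Acc: 0.8044
Epoch 4/5
valid Loss: 0.8978 Acc: 0.7848
Epoch 5/5
valid Loss: 0.8320 Acc: 0.7885
"""

print(best_valid_epoch(log))  # (3, 0.8044)
```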


In [19]:
!python predict.py ./flowers/test/1/image_06752.jpg ./checkpoint.pth --top_k 3 --category_names cat_to_name.json

Using device: cpu
Top 3 predictions:
  1: 0.5319
  86: 0.3681
  34: 0.0666

Category names:
  pink primrose
  tree mallow
  mexican aster
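
The output above lists class labels with probabilities first and flower names second, so the reader has to match the two lists by position. One possible alternative, sketched below, prints each name next to its label and probability. The helper `format_predictions` is hypothetical; the values are copied from the output above, and the dictionary holds only the three `cat_to_name.json` entries that appear there.

```python
def format_predictions(probs, labels, category_names=None):
    """Build one line per prediction: name, class label, probability."""
    lines = []
    for prob, label in zip(probs, labels):
        # Fall back to the raw label when no name mapping is available
        name = category_names.get(label, label) if category_names else label
        lines.append(f"{name} (class {label}): {prob:.4f}")
    return lines

# Values taken from the prediction output above
cat_to_name = {"1": "pink primrose", "86": "tree mallow", "34": "mexican aster"}
lines = format_predictions([0.5319, 0.3681, 0.0666], ["1", "86", "34"], cat_to_name)
print("\n".join(lines))
```

The image comes from `./flowers/test/1/`, so the top prediction (class 1, pink primrose) matches the folder label.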
