# How to Train an Object Detector with Your Own COCO (Common Objects in Context) Dataset in PyTorch

## Understanding the Dataset & DataLoader in PyTorch

- [Link to Medium post by Takashi Nakamura, PhD](https://medium.com/fullstackai/how-to-train-an-object-detector-with-your-own-coco-dataset-in-pytorch-319e7090da5)
- [Create COCO Annotations From Scratch](https://www.immersivelimit.com/tutorials/create-coco-annotations-from-scratch)

The dataset was annotated using [AnyLabeling](https://anylabeling.nrl.ai).

COCO JSON annotations can be converted to the YOLO txt format with this [format converter](https://github.com/enekuie/COCO-json-annotations-to-YOLO-txt-format-converter).
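To make the two annotation formats concrete, here is a minimal sketch of a COCO-style annotation and the usual bounding-box conversion to YOLO. It is not taken from the linked converter; the helper name `coco_bbox_to_yolo` and the sample values are made up for illustration. It assumes the standard conventions: COCO stores boxes as `[x_min, y_min, width, height]` in pixels, while YOLO expects `x_center y_center width height` normalized by the image size.

```python
# A tiny COCO-style annotation file, written as a Python dict.
# The file name, sizes, and ids are illustrative placeholders.
coco = {
    "images": [
        {"id": 1, "file_name": "example.jpg", "width": 400, "height": 200}
    ],
    "annotations": [
        # bbox is [x_min, y_min, width, height] in pixels
        {"id": 1, "image_id": 1, "category_id": 0, "bbox": [100, 50, 200, 100]}
    ],
    "categories": [{"id": 0, "name": "dog"}],
}


def coco_bbox_to_yolo(bbox, image_width, image_height):
    """Convert a COCO pixel box to a normalized YOLO (cx, cy, w, h) tuple."""
    x_min, y_min, box_width, box_height = bbox
    x_center = (x_min + box_width / 2) / image_width
    y_center = (y_min + box_height / 2) / image_height
    return (
        x_center,
        y_center,
        box_width / image_width,
        box_height / image_height,
    )


# Look up the image size for the annotation, then convert its box.
image = coco["images"][0]
annotation = coco["annotations"][0]
yolo_box = coco_bbox_to_yolo(annotation["bbox"], image["width"], image["height"])
print(annotation["category_id"], *yolo_box)  # 0 0.5 0.5 0.5 0.5
```

Each line of a YOLO txt file would then hold the class index followed by these four normalized values.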


In [1]:
import os
import torch
from torch.utils import data
from PIL import Image
from torchvision import transforms

In [3]:
print(f"PyTorch Version: {torch.__version__}")
# Check whether PyTorch can use MPS (Metal Performance Shaders, Apple's GPU backend)
print(f"Is MPS (Metal Performance Shader) built? {torch.backends.mps.is_built()}")
print(f"Is MPS available? {torch.backends.mps.is_available()}")

# Set the device
# Prefer MPS, then CUDA, and fall back to the CPU
device = (
    "mps"
    if torch.backends.mps.is_available()
    else "cuda"
    if torch.cuda.is_available()
    else "cpu"
)
print(f"Using device: {device}")

PyTorch Version: 2.5.1
Is MPS (Metal Performance Shader) built? True
Is MPS available? True
Using device: mps


The `dogBreedDataset` class below is a custom `Dataset` that returns an image tensor and its integer label for a given index.

In [2]:
class dogBreedDataset(data.Dataset):
    # constructor: store the data directory, filenames, and labels
    def __init__(self, root, filenames, labels):
        # the data directory
        self.root = root
        # list of filenames
        self.filenames = filenames
        # list of labels
        self.labels = labels

    # obtain sample from index
    def __getitem__(self, index):

        # get the image filename from the filenames list
        image_filename = self.filenames[index]
        # load the image (forced to 3-channel RGB) and look up its label
        image = Image.open(os.path.join(self.root, image_filename)).convert("RGB")
        label = self.labels[index]

        # convert to tensors so the default DataLoader collation can batch them
        image = transforms.ToTensor()(image)
        label = torch.as_tensor(label, dtype=torch.int64)
        return image, label

    # return the length of the dataset
    def __len__(self):
        return len(self.filenames)

In [None]:
# data directory
root = "../data/renamed_dub_removed"

# assume we have 3 jpg images
filenames = [
    "american_pit_bull_terrier_0001.jpg",
    "american_pit_bull_terrier_0002.jpg",
    "american_pit_bull_terrier_0003.jpg",
]

# integer class indices, one per image (placeholder values for illustration)
labels = [0, 1, 1]

# instantiate the custom Dataset
my_dataset = dogBreedDataset(root=root, filenames=filenames, labels=labels)
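The section title also mentions the `DataLoader`, so here is a minimal sketch of how a `Dataset` is typically wrapped in one to get batches. To keep it runnable without the image files, it uses a hypothetical stand-in class, `RandomImageDataset`, that returns random tensors of a fixed size; the class name, image size, and batch size are assumptions made for illustration, not part of the notebook above.

```python
import torch
from torch.utils import data


class RandomImageDataset(data.Dataset):
    """Hypothetical stand-in that yields random image tensors instead of files."""

    def __init__(self, labels, image_size=(3, 32, 32)):
        self.labels = labels
        self.image_size = image_size

    def __getitem__(self, index):
        # random values in [0, 1), the same range ToTensor() produces
        image = torch.rand(self.image_size)
        label = torch.as_tensor(self.labels[index], dtype=torch.int64)
        return image, label

    def __len__(self):
        return len(self.labels)


stand_in_dataset = RandomImageDataset(labels=[0, 1, 1])

# batch_size=2 over 3 samples gives one full batch and one batch of size 1
loader = data.DataLoader(stand_in_dataset, batch_size=2, shuffle=False)

batch_shapes = []
for images, labels in loader:
    batch_shapes.append((tuple(images.shape), tuple(labels.shape)))
    print(images.shape, labels.shape)
```

The same `DataLoader` call should work with `my_dataset`, with one caveat: the default collation stacks the image tensors, so all images in a batch need the same height and width. Photos of varying sizes would need a resize transform or a custom `collate_fn` first.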
