<a href="https://colab.research.google.com/github/weedge/doraemon-nb/blob/main/corenet_train_a_pic_classification_model_on_a_new_dataset_from_scratch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!git clone https://github.com/apple/corenet.git

Cloning into 'corenet'...
remote: Enumerating objects: 837, done.
remote: Counting objects: 100% (837/837), done.
remote: Compressing objects: 100% (640/640), done.
remote: Total 837 (delta 188), reused 834 (delta 185), pack-reused 0
Receiving objects: 100% (837/837), 795.35 KiB | 6.17 MiB/s, done.
Resolving deltas: 100% (188/188), done.


In [24]:
%cd /content/corenet

/content/corenet


In [10]:
!git lfs install && git lfs pull

Updated git hooks.
Git LFS initialized.


In [11]:
!python -m pip install --editable .

Obtaining file:///content/corenet
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting psutil==5.9.8 (from corenet==0.1.0)
  Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB)
Collecting ujson==5.9.0 (from corenet==0.1.0)
  Downloading ujson-5.9.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (53 kB)
Collecting scikit-learn==1.4.1.post1 (from corenet==0.1.0)
  Downloading scikit_learn-1.4.1.post1-cp310-cp310-manylinux_2_17_x86_64.ma

In [5]:
import os

if os.getcwd().endswith("tutorials"):
    os.chdir("..")

assert os.path.exists(
    "corenet"
), f"We should be in the root repository folder, but we are in {os.getcwd()}"

! mkdir -p projects/playground_cifar10/classification


In [6]:
%%file projects/playground_cifar10/classification/cifar10.yaml
common:
    log_freq: 2000                 # Log the training metrics every 2000 iterations.

dataset:
    category: classification
    name: "cifar10"                # We'll register the "cifar10" name in DATASET_REGISTRY later in this tutorial.

    # The `corenet-train` entrypoint uses the train_batch_size0 and val_batch_size0 values to construct
    # training/validation batches during training. The `corenet-eval` entrypoint uses eval_batch_size0 to
    # construct batches during evaluation (i.e., testing).
    #
    # The effective batch size is: num_nodes x num_gpus x train_batch_size0
    train_batch_size0: 4
    val_batch_size0: 4
    eval_batch_size0: 1

    workers: 2
    persistent_workers: true
    pin_memory: true

model:
    classification:
        name: "two_layer"          # We'll register the "two_layer" name in MODEL_REGISTRY later in this tutorial.
        n_classes: 10

    layer:
        # Weight initialization parameters:
        conv_init: "kaiming_normal"
        linear_init: "trunc_normal"
        linear_init_std_dev: 0.02


sampler:
    name: batch_sampler

    # The following dimensions are passed to the dataset's __getitem__ method, and the dataset produces
    # samples cropped and resized to the requested dimensions.
    bs:
        crop_size_width: 32
        crop_size_height: 32

loss:
    category: classification
    classification:
        name: cross_entropy       # The implementation is available in the "corenet/loss_fn/" folder.

optim:
    name: sgd
    sgd:
        momentum: 0.9

scheduler:
    name: fixed                    # The implementation is available in the "corenet/optims/scheduler/" folder.
    max_epochs: 2
    fixed:
        lr: 0.001                  # Fixed Learning Rate

stats:
  val: ["loss", "top1"]            # Metrics to log
  train: ["loss", "top1"]
  checkpoint_metric: top1          # Metric used to select the checkpoint saved as checkpoint_best.pt
  checkpoint_metric_max: true      # Higher values of this metric are better

Writing projects/playground_cifar10/classification/cifar10.yaml
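
As a rough check on the configuration above, the following sketch works out the effective batch size and the number of training iterations per epoch. It assumes a single node with one GPU and CIFAR-10's 50,000 training images; `num_nodes`, `num_gpus`, and both helper functions are illustrative names, not CoreNet options or APIs.

```python
import math


def effective_batch_size(num_nodes: int, num_gpus: int, batch_size0: int) -> int:
    """Effective batch size as described in the YAML comment."""
    return num_nodes * num_gpus * batch_size0


def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


num_nodes, num_gpus = 1, 1      # assumption: one machine with one GPU
train_batch_size0 = 4           # from cifar10.yaml
num_train_samples = 50_000      # size of the CIFAR-10 training split
log_freq = 2000                 # from cifar10.yaml

batch = effective_batch_size(num_nodes, num_gpus, train_batch_size0)
iters = iterations_per_epoch(num_train_samples, batch)

print(batch)              # 4
print(iters)              # 12500
print(iters // log_freq)  # 6
```

Under these assumptions, `log_freq: 2000` would give roughly six metric log lines per epoch, if the iteration counter is taken per epoch.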


In [7]:
%%file corenet/data/datasets/classification/playground_dataset.py

from argparse import Namespace
from typing import Any, Dict, Tuple

import torchvision
import torchvision.transforms as transforms

from corenet.data.datasets import DATASET_REGISTRY
from corenet.data.datasets.dataset_base import BaseDataset


@DATASET_REGISTRY.register(name="cifar10", type="classification")
class Cifar10(BaseDataset):
    CLASS_NAMES = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

    def __init__(self, opts: Namespace, **kwargs) -> None:
        super().__init__(opts, **kwargs)
        self._torchvision_dataset = torchvision.datasets.CIFAR10(
            "/tmp/cifar10_cache",
            train=self.is_training,
            download=True,
        )

    def __len__(self) -> int:
        return len(self._torchvision_dataset)

    def __getitem__(self, sample_size_and_index: Tuple[int, int, int]) -> Dict[str, Any]:
        # In CoreNet, the sampler determines not only the index of each sample but
        # also the crop size, which it can choose dynamically for each batch. This
        # allows samplers to train multi-scale models more efficiently.
        # See: corenet/data/sampler/variable_batch_sampler.py
        (crop_size_h, crop_size_w, sample_index) = sample_size_and_index

        img, target = self._torchvision_dataset[sample_index]

        transform = transforms.Compose(
            [
                transforms.ToTensor(),
                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                transforms.Resize(size=(crop_size_h, crop_size_w)),
            ]
        )
        img = transform(img)
        return {
            "samples": img,
            "targets": target,
        }


Writing corenet/data/datasets/classification/playground_dataset.py
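
To make the `(crop_size_h, crop_size_w, sample_index)` argument of `__getitem__` concrete, here is a minimal pure-Python sketch of a fixed-size batch sampler that yields such tuples. `fixed_size_batches` is a hypothetical stand-in written for illustration; it is not CoreNet's `batch_sampler` and it omits shuffling and distributed sharding.

```python
from typing import Iterator, List, Tuple


def fixed_size_batches(
    num_samples: int, batch_size: int, crop_h: int, crop_w: int
) -> Iterator[List[Tuple[int, int, int]]]:
    """Yield batches of (crop_h, crop_w, sample_index) tuples.

    Hypothetical illustration only: every batch uses the same crop size,
    and indices are visited in order.
    """
    for start in range(0, num_samples, batch_size):
        stop = min(start + batch_size, num_samples)
        yield [(crop_h, crop_w, index) for index in range(start, stop)]


batches = list(fixed_size_batches(num_samples=10, batch_size=4, crop_h=32, crop_w=32))
first_batch = batches[0]
print(first_batch)
# [(32, 32, 0), (32, 32, 1), (32, 32, 2), (32, 32, 3)]
```

Each tuple in a batch is what the dataset would receive as `sample_size_and_index`. A variable-size sampler could change `crop_h` and `crop_w` from one batch to the next.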


In [8]:
%%file corenet/modeling/models/classification/playground_model.py

import argparse

import torch
import torch.nn.functional as F
from torch import nn

from corenet.modeling.models import MODEL_REGISTRY
from corenet.modeling.models.base_model import BaseAnyNNModel


@MODEL_REGISTRY.register("two_layer", type="classification")
class Net(BaseAnyNNModel):
    """A simple 2-layer CNN, inspired by https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html"""

    def __init__(self, opts: argparse.Namespace) -> None:
        super().__init__(opts)
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.reset_parameters(opts)  # Initialize the weights

    def forward(self, x: torch.Tensor):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


Writing corenet/modeling/models/classification/playground_model.py
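
The input size of `fc1` (`16 * 5 * 5`) follows from the spatial size of the feature map after the two convolution and pooling stages. The sketch below traces that arithmetic for a 32x32 input using the standard output-size formula for unpadded convolutions and pooling; `conv_out` and `pool_out` are helper names introduced here for illustration.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, kernel: int = 2, stride: int = 2) -> int:
    """Spatial output size of max pooling along one dimension."""
    return (size - kernel) // stride + 1


size = 32                  # crop_size_height / crop_size_width in cifar10.yaml
size = conv_out(size, 5)   # conv1: 32 -> 28
size = pool_out(size)      # pool:  28 -> 14
size = conv_out(size, 5)   # conv2: 14 -> 10
size = pool_out(size)      # pool:  10 -> 5

flattened = 16 * size * size  # 16 output channels from conv2
print(size, flattened)        # 5 400
```

Because `fc1` hard-codes this value, the model as written only accepts 32x32 inputs; changing the crop size in the YAML file would also require changing `fc1`.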


In [12]:
!corenet-train --common.config-file projects/playground_cifar10/classification/cifar10.yaml


2024-04-27 03:45:34 - DEBUG    - Cannot load internal arguments, skipping.
2024-04-27 03:45:34 - LOGS    - Random seeds are set to 0
2024-04-27 03:45:34 - LOGS    - Using PyTorch version 2.2.1+cu121
2024-04-27 03:45:34 - LOGS    - Setting --ddp.world-size the same as the number of available gpus.
2024-04-27 03:45:34 - LOGS    - Directory created at: results/run_1
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to /tmp/cifar10_cache/cifar-10-python.tar.gz
100% 170498071/170498071 [00:01<00:00, 105036546.72it/s]
Extracting /tmp/cifar10_cache/cifar-10-python.tar.gz to /tmp/cifar10_cache
2024-04-27 03:45:40 - LOGS    - Training dataset details are given below
Cifar10(
	root= 
	is_training=True 
	num_samples=50000
)
Files already downloaded and verified
2024-04-27 03:45:41 - LOGS    - Validation dataset details are given below
Cifar10(
	root= 
	is_training=False 
	num_samples=10000

In [15]:
!ls -lh results/run_1


total 1.5M
-rw-r--r-- 1 root root 246K Apr 27 03:49 checkpoint_best.pt
-rw-r--r-- 1 root root 246K Apr 27 03:49 checkpoint_last.pt
-rw-r--r-- 1 root root 246K Apr 27 03:47 checkpoint_score_48.4700.pt
-rw-r--r-- 1 root root 246K Apr 27 03:49 checkpoint_score_53.5300.pt
-rw-r--r-- 1 root root 1.9K Apr 27 03:45 config.yaml
-rw-r--r-- 1 root root 492K Apr 27 03:49 training_checkpoint_last.pt
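
The 246K checkpoint size is consistent with the parameter count of the `two_layer` model. The sketch below counts the weights and biases from the layer definitions in `Net.__init__` and converts them to a size, assuming each parameter is stored as a 4-byte float32; the file itself also carries some serialization overhead.

```python
# (weights, biases) per layer, read off the layer definitions in Net.__init__
layer_params = {
    "conv1": (6 * 3 * 5 * 5, 6),       # nn.Conv2d(3, 6, 5)
    "conv2": (16 * 6 * 5 * 5, 16),     # nn.Conv2d(6, 16, 5)
    "fc1": (120 * 16 * 5 * 5, 120),    # nn.Linear(16 * 5 * 5, 120)
    "fc2": (84 * 120, 84),             # nn.Linear(120, 84)
    "fc3": (10 * 84, 10),              # nn.Linear(84, 10)
}

total_params = sum(weights + biases for weights, biases in layer_params.values())
bytes_per_param = 4  # assumption: parameters are stored as float32
approx_kib = total_params * bytes_per_param / 1024

print(total_params)             # 62006
print(f"{approx_kib:.1f} KiB")  # 242.2 KiB
```

`training_checkpoint_last.pt` is about twice as large, probably because it also stores optimizer state such as the SGD momentum buffers; that is an inference from the file sizes, not something the output states.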


In [16]:
!corenet-eval \
    --common.config-file projects/playground_cifar10/classification/cifar10.yaml \
    --model.classification.pretrained results/run_1/checkpoint_best.pt


2024-04-27 05:47:58 - DEBUG    - Cannot load internal arguments, skipping.
2024-04-27 05:47:59 - LOGS    - Random seeds are set to 0
2024-04-27 05:47:59 - LOGS    - Using PyTorch version 2.2.1+cu121
2024-04-27 05:47:59 - LOGS    - Setting --ddp.world-size the same as the number of available gpus.
2024-04-27 05:47:59 - LOGS    - Directory exists at: results/run_1
Files already downloaded and verified
2024-04-27 05:48:00 - LOGS    - Evaluation dataset details: 
Cifar10(
	root= 
	is_training=False 
	num_samples=10000
)
2024-04-27 05:48:00 - LOGS    - Evaluation sampler details: BatchSampler(
	 num_repeat=1
	 trunc_rep_aug=False
	base_im_size=(h=32, w=32)
	base_batch_size=1
)
2024-04-27 05:48:00 - LOGS    - Pretrained weights are loaded from results/run_1/checkpoint_best.pt
2024-04-27 05:48:00 - INFO    - Trainable parameters: ['conv1.weight', 'conv1.bias', 'conv2.weight', '

In [19]:
!ls -hl results/run_1/checkpoint_best.pt projects/playground_cifar10/classification/cifar10.yaml

-rw-r--r-- 1 root root 1.9K Apr 27 03:29 projects/playground_cifar10/classification/cifar10.yaml
-rw-r--r-- 1 root root 246K Apr 27 03:49 results/run_1/checkpoint_best.pt


In [25]:
from corenet.options.opts import get_training_arguments
from corenet.modeling import get_model
from PIL import Image
import torch
from torchvision.transforms import Compose, Resize, PILToTensor, CenterCrop
from torchvision.transforms import ToPILImage
from corenet.data.datasets.classification.playground_dataset import Cifar10

config_file = "projects/playground_cifar10/classification/cifar10.yaml"
pretrained_weights = "results/run_1/checkpoint_best.pt"

opts = get_training_arguments(
    args=[
        "--common.config-file",
        config_file,
        "--model.classification.pretrained",
        pretrained_weights,
    ]
)

# Load the model
model = get_model(opts)
model.eval()

for image_path in ["assets/cat.jpeg", "assets/dog.jpeg"]:
    image = Image.open(image_path).convert("RGB")
    img_transforms = Compose([CenterCrop(600), Resize(size=(32, 32)), PILToTensor()])

    # Center-crop, resize to 32x32, and convert the image to a uint8 tensor
    input_tensor = img_transforms(image)

    # Show the transformed image
    ToPILImage()(input_tensor).show()

    # Scale pixel values from [0, 255] to [0, 1]
    input_tensor = input_tensor.to(torch.float).div(255.0)

    # Add a dummy batch dimension
    input_tensor = input_tensor[None, ...]

    with torch.no_grad():
        logits = model(input_tensor)[0]
        probs = torch.softmax(logits, dim=-1)
        predictions = sorted(zip(probs.tolist(), Cifar10.CLASS_NAMES), reverse=True)
        print(
            "Top 3 Predictions:",
            [f"{cls}: {prob:.1%}" for prob, cls in predictions[:3]],
        )


2024-04-27 09:35:47 - DEBUG    - Cannot load internal arguments, skipping.
2024-04-27 09:35:47 - LOGS    - Pretrained weights are loaded from results/run_1/checkpoint_best.pt
2024-04-27 09:35:47 - INFO    - Trainable parameters: ['conv1.weight', 'conv1.bias', 'conv2.weight', 'conv2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias']
Top 3 Predictions: ['bird: 40.6%', 'dog: 16.4%', 'cat: 15.3%']
Top 3 Predictions: ['bird: 25.3%', 'ship: 25.1%', 'cat: 24.6%']
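
One detail worth noticing when reading these predictions: the `Cifar10` dataset class applies `Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` after `ToTensor()`, whereas the inference cell only divides by 255. The pure-Python sketch below shows the value range each path produces for a single pixel; `scale_to_unit` and `normalize` are illustrative helpers, and whether aligning the two would change the predictions above has not been tested here.

```python
def scale_to_unit(pixel: int) -> float:
    """Map a uint8 pixel value to [0, 1], as ToTensor() or div(255.0) does."""
    return pixel / 255.0


def normalize(value: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Per-channel normalization, as Normalize(mean, std) does."""
    return (value - mean) / std


for pixel in (0, 128, 255):
    inference_value = scale_to_unit(pixel)          # what the inference cell feeds the model
    training_value = normalize(inference_value)     # what the model saw during training
    print(pixel, round(inference_value, 3), round(training_value, 3))
```

The training path maps pixels to [-1, 1], while the inference path maps them to [0, 1].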


This is only an example of using CoreNet to train a model: it trains an image classification model for just a small number of epochs.