# CIFAR-10 CNN Training with PyTorch - Self-Contained Notebook

This notebook contains all the code for training a CIFAR-10 CNN model with advanced features:
- C1C2C3C4 architecture with Depthwise Separable and Dilated Convolutions
- Albumentations data augmentation
- Global Average Pooling (GAP)
- Receptive field > 44 pixels
- < 200k parameters

**All code is embedded in this single notebook - no external files needed!**

In [1]:
# Install required packages
!pip install torch torchvision torchsummary numpy matplotlib albumentations tqdm



In [2]:
# ================================
# logger_setup.py - Logging Setup
# ================================
import logging
import os

_LOGGING_INITIALIZED = False

class TqdmLoggingHandler(logging.StreamHandler):
    """A logging handler that plays nicely with tqdm progress bars."""
    def emit(self, record):
        try:
            from tqdm import tqdm  # Lazy import so tqdm isn't a hard dependency
            msg = self.format(record)
            tqdm.write(msg)
        except Exception:
            # Fallback to normal stream behavior
            super().emit(record)

def setup_logging(log_to_file=False, log_dir='logs'):
    """Set up simple logging configuration.

    Args:
        log_to_file (bool): If True, also log to a file (default: False)
        log_dir (str): Directory for log files if log_to_file is True
    """
    global _LOGGING_INITIALIZED

    # If already initialized, don't recreate handlers (prevents truncation mid-run)
    if _LOGGING_INITIALIZED:
        logger = logging.getLogger()
        # If file logging is requested and not present yet, add it (append mode)
        if log_to_file and not any(isinstance(h, logging.FileHandler) for h in logger.handlers):
            try:
                if not os.path.exists(log_dir):
                    os.makedirs(log_dir, exist_ok=True)
                log_file = os.path.join(log_dir, 'training.log')
                file_handler = logging.FileHandler(log_file, mode='a', encoding='utf-8')
                file_handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))
                file_handler.setLevel(logging.INFO)
                logger.addHandler(file_handler)
            except Exception:
                pass
        return logger

    # Configure basic logging format
    formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')

    # Set up console handler
    console_handler = TqdmLoggingHandler()
    console_handler.setFormatter(formatter)
    console_handler.setLevel(logging.INFO)

    # Configure root logger
    root_logger = logging.getLogger()
    root_logger.setLevel(logging.INFO)

    # Remove any existing handlers to avoid duplicates
    for handler in root_logger.handlers[:]:
        root_logger.removeHandler(handler)

    # Add console handler
    root_logger.addHandler(console_handler)

    # Add file handler if requested
    if log_to_file:
        try:
            if not os.path.exists(log_dir):
                os.makedirs(log_dir, exist_ok=True)
            log_file = os.path.join(log_dir, 'training.log')
            # Truncate once at initial setup to start fresh, then use append mode
            try:
                with open(log_file, 'w', encoding='utf-8'):
                    pass
            except Exception:
                # If truncation fails, proceed; handler will create/append
                pass
            file_handler = logging.FileHandler(log_file, mode='a', encoding='utf-8')
            file_handler.setFormatter(formatter)
            file_handler.setLevel(logging.INFO)
            root_logger.addHandler(file_handler)
        except Exception:
            # Silently continue with console-only logging if file logging fails
            pass

    _LOGGING_INITIALIZED = True
    return root_logger

In [3]:
# ================================
# data_setup.py - Data Loading & Augmentation
# ================================
import torch.utils as utils
from torchvision import datasets
import numpy as np
import albumentations as A
from albumentations.pytorch import ToTensorV2

class AlbumentationsTransform:
    """Wrapper to make Albumentations compatible with PyTorch datasets"""
    def __init__(self, transform):
        self.transform = transform

    def __call__(self, img):
        # Convert PIL to numpy
        img = np.array(img)
        # Apply albumentations transform
        transformed = self.transform(image=img)
        return transformed['image']

class DataSetup:
    def __init__(self, batch_size_train=64, batch_size_test=1000, shuffle_train=True, shuffle_test=False, num_workers=2, pin_memory=None, train_transforms=None, test_transforms=None):
        self.batch_size_train = batch_size_train
        self.batch_size_test = batch_size_test
        self.shuffle_train = shuffle_train
        self.shuffle_test = shuffle_test
        self.num_workers = num_workers
        self.pin_memory = pin_memory
        self.train_transforms = train_transforms if train_transforms else self.get_train_transforms()
        self.test_transforms = test_transforms if test_transforms else self.get_test_transforms()
        self.train_loader = self.get_train_loader()
        self.test_loader = self.get_test_loader()

    def get_train_transforms(self):
        """Albumentations transforms for training with required augmentations"""
        # CIFAR-10 mean: (0.4914, 0.4822, 0.4465) -> [125, 123, 114] for 0-255 range
        fill_value = [125, 123, 114]

        train_transform = A.Compose([
            A.HorizontalFlip(p=0.5),
            A.ShiftScaleRotate(
                shift_limit=0.1,
                scale_limit=0.1,
                rotate_limit=15,
                p=0.5
            ),
            A.CoarseDropout(
                max_holes=1,
                max_height=16,
                max_width=16,
                min_holes=1,
                min_height=16,
                min_width=16,
                fill_value=fill_value,
                mask_fill_value=None,
                p=0.5
            ),
            A.Normalize(
                mean=(0.4914, 0.4822, 0.4465),
                std=(0.2470, 0.2435, 0.2616)
            ),
            ToTensorV2()
        ])
        return AlbumentationsTransform(train_transform)

    def get_test_transforms(self):
        """Albumentations transforms for testing (only normalization)"""
        test_transform = A.Compose([
            A.Normalize(
                mean=(0.4914, 0.4822, 0.4465),
                std=(0.2470, 0.2435, 0.2616)
            ),
            ToTensorV2()
        ])
        return AlbumentationsTransform(test_transform)

    def get_train_datasets(self):
        return datasets.CIFAR10('../data', train=True, download=True, transform=self.train_transforms)

    def get_test_datasets(self):
        return datasets.CIFAR10('../data', train=False, download=True, transform=self.test_transforms)

    def get_train_loader(self):
        train_dataset = self.get_train_datasets()
        return utils.data.DataLoader(train_dataset, batch_size=self.batch_size_train, shuffle=self.shuffle_train, num_workers=self.num_workers, pin_memory=self.pin_memory)

    def get_test_loader(self):
        test_dataset = self.get_test_datasets()
        return utils.data.DataLoader(test_dataset, batch_size=self.batch_size_test, shuffle=self.shuffle_test, num_workers=self.num_workers, pin_memory=self.pin_memory)

In [4]:
# ================================
# cifar10model_v0.py - Model Architecture
# ================================
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseSeparableConv(nn.Module):
    """Depthwise Separable Convolution = Depthwise + Pointwise"""
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super(DepthwiseSeparableConv, self).__init__()

        # Depthwise: each input channel convolved separately
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                 stride=stride, padding=padding, groups=in_channels, bias=False)
        self.bn1 = nn.BatchNorm2d(in_channels)

        # Pointwise: 1x1 conv to mix channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        x = F.relu(self.bn1(self.depthwise(x)))
        x = F.relu(self.bn2(self.pointwise(x)))
        return x

class Net(nn.Module):
    """CIFAR-10 CNN with C1C2C3C4 architecture, Depthwise Sep Conv, Dilated Conv, and GAP."""
    def __init__(self):
        super(Net, self).__init__()

        # C1: Initial feature extraction (32x32 -> 32x32)
        self.c1 = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1, bias=False),    # 32x32x8, RF=3
            nn.BatchNorm2d(8),
            nn.ReLU(),
            nn.Dropout(0.05),
            nn.Conv2d(8, 16, 3, padding=1, bias=False),   # 32x32x16, RF=5
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.Dropout(0.05),
            nn.Conv2d(16, 32, 3, padding=1, bias=False),  # 32x32x32, RF=7
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Dropout(0.05),
        )

        # C2: Feature extraction with Dilated Convolutions (32x32 -> 32x32)
        self.c2 = nn.Sequential(
            nn.Conv2d(32, 32, 3, padding=0, dilation=1, bias=False),  # 32x32x32, RF=9
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Dropout(0.05),
            nn.Conv2d(32, 32, 3, padding=2, dilation=2, bias=False),  # 32x32x32, RF=13
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Dropout(0.05),
            nn.Conv2d(32, 32, 3, padding=0, dilation=1, bias=False),  # 32x32x32, RF=21
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Dropout(0.05),
        )

        # C3: Pattern recognition with Depthwise Separable Conv (32x32 -> 16x16)
        self.c3 = nn.Sequential(
            nn.Conv2d(32, 32, 3, padding=1, stride=2),  # 32x32x32 -> 16x16x32, RF=23
            nn.BatchNorm2d(32),
            nn.ReLU(),
            DepthwiseSeparableConv(32, 64, kernel_size=3, stride=1, padding=1),  # 16x16x64, RF=25
            nn.Conv2d(64, 64, 3, padding=1, bias=False),  # 16x16x64, RF=27
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Dropout(0.05),

        )

        # C4: Final convolution with stride=2 (16x16 -> 8x8), then GAP + FC
        self.c4 = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=1, padding=1, bias=False),  # 8x8x128, RF=55
            #nn.BatchNorm2d(128),
            #nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # 1x1x128, RF=covers entire input
            nn.Flatten(),
            nn.Linear(128, 10)  # FC after GAP to target classes
        )

    def forward(self, x):
        x = self.c1(x)
        x = self.c2(x)
        x = self.c3(x)
        x = self.c4(x)
        return F.log_softmax(x, dim=1)

class set_config_v0:
    """Basic configuration for CIFAR-10 training."""
    def __init__(self):
        self.epochs = 35
        self.nll_loss = torch.nn.NLLLoss()
        self.criterion = self.nll_loss

    def setup(self, model, use_onecycle: bool = True):
        self.use_onecycle = use_onecycle
        base_lr = 0.01
        self.optimizer = torch.optim.SGD(model.parameters(), lr=base_lr, momentum=0.9)
        self.device = next(model.parameters()).device
        self.dataloader_args = self.get_dataloader_args()
        self.data_setup_instance = DataSetup(**self.dataloader_args)
        if self.use_onecycle:
            steps_per_epoch = len(self.data_setup_instance.train_loader)
            self.scheduler = torch.optim.lr_scheduler.OneCycleLR(
                self.optimizer,
                max_lr=base_lr,
                epochs=self.epochs,
                steps_per_epoch=steps_per_epoch,
                pct_start=0.2,
                div_factor=10,
                final_div_factor=100,
                anneal_strategy='cos'
            )
            self.scheduler.batch_step = True
            logging.getLogger().info(
                f"Model v0: OneCycleLR max_lr={base_lr} pct_start=0.2 div_factor=10 final_div_factor=100 epochs={self.epochs}"
            )
        else:
            self.scheduler = torch.optim.lr_scheduler.StepLR(self.optimizer, step_size=6, gamma=0.1)
            self.scheduler.batch_step = False
            logging.getLogger().info(
                f"Model v0: StepLR lr={base_lr} step_size=6 gamma=0.1"
            )
        logging.getLogger().info(f"Dataloader arguments: {self.dataloader_args}")
        return self

    def get_dataloader_args(self):
        if hasattr(self, 'device') and self.device.type == "cuda":
            args = dict(batch_size_train=32, batch_size_test=1000, shuffle_train=True, shuffle_test=False,
                        num_workers=2, pin_memory=True)
        else:
            args = dict(batch_size_train=32, batch_size_test=1000, shuffle_train=True, shuffle_test=False)
        logging.info(f"Model v0 dataloader args: {args}")
        return args

In [5]:
# ================================
# train_test.py - Training & Testing Logic
# ================================
from tqdm import tqdm
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F
import logging

class train_test_model:

    def __init__(self, model, device, train_loader, test_loader,criterion,optimizer,scheduler,epochs=1):
        self.model = model
        self.device = device
        self.train_loader = train_loader
        self.test_loader = test_loader
        self.criterion = criterion
        self.optimizer = optimizer
        self.scheduler = scheduler
        self.train_acc_list = []
        self.test_acc_list = []
        self.F = F  # Assign torch.nn.functional to self.F for easier access
        self.epochs = epochs

    def train(self, model, device, train_loader, optimizer, criterion,epoch):
        self.model.train()
        pbar = tqdm(self.train_loader, desc="Training", leave=True)
        train_loss, correct, processed = 0, 0, 0
        # Train with progress bar

        for batch_idx, (data, target) in enumerate(pbar, 1):
            # get samples and move to device
            data, target = data.to(self.device), target.to(self.device)
            # Initialize optimizer
            self.optimizer.zero_grad()
            # Prediction
            output = self.model(data)
            # Calculate loss
            loss = self.F.nll_loss(output, target)
            # Backpropagation
            loss.backward()
            # Gradient clipping to stabilize higher LR OneCycle swings
            torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=2.0)
            self.optimizer.step()
            # Per-batch scheduler stepping (e.g., OneCycleLR) if attribute present
            if hasattr(self, 'scheduler') and getattr(self.scheduler, 'batch_step', False):
                self.scheduler.step()
            # -----------------------------
            # Accumulate loss and calculate accuracy
            train_loss += loss.item()
            pred = output.argmax(dim=1, keepdim=True)
            batch_correct = pred.eq(target.view_as(pred)).sum().item()
            correct += batch_correct
            processed += len(data)

            # Calculate current metrics
            current_loss = train_loss / processed
            current_accuracy = 100. * correct / processed

            # Update progress bar only (add LR peek occasionally)
            if hasattr(self, 'scheduler') and getattr(self.scheduler, 'batch_step', False):
                current_lr = self.scheduler.get_last_lr()[0]
                status = f"Train Loss={current_loss:.4f} Acc={current_accuracy:.2f}% LR={current_lr:.4f}"
            else:
                status = f"Train Loss={current_loss:.4f} Accuracy={current_accuracy:.2f}%"
            pbar.set_description(desc=status)

        # Final epoch-level logging for training metrics
        epoch_accuracy = 100. * correct / len(self.train_loader.dataset)
        logging.info(
            f'Epoch {epoch:02d}/{self.epochs}: Train set final results: Average loss: {train_loss:.4f}, '
            f'Accuracy: {correct}/{len(self.train_loader.dataset)} ({epoch_accuracy:.2f}%)'
        )
        return epoch_accuracy

    def test(self, model, device, test_loader, criterion,epoch):
        self.model.eval()
        test_loss, correct = 0, 0
        # Test with progress bar

        with torch.no_grad():
            pbar = tqdm(self.test_loader, desc="Testing", leave=True)
            for batch_idx, (data, target) in enumerate(pbar, 1):
                data, target = data.to(self.device), target.to(self.device)
                output = self.model(data)
                test_loss += self.F.nll_loss(output, target, reduction='sum').item()
                pred = output.argmax(dim=1, keepdim=True)
                batch_correct = pred.eq(target.view_as(pred)).sum().item()
                correct += batch_correct

                # Calculate current metrics
                current_loss = test_loss / (batch_idx * len(data))
                current_accuracy = 100. * correct / (batch_idx * len(data))

                # Update progress bar only
                status = f"Test Loss={current_loss:.4f} Accuracy={current_accuracy:.2f}%"
                pbar.set_description(desc=status)

        test_loss /= len(self.test_loader.dataset)
        acc = 100. * correct / len(self.test_loader.dataset)
        logging.info(f'Epoch {epoch:02d}/{self.epochs}:Test set final results: Average loss: {test_loss:.4f}, Accuracy: {correct}/{len(self.test_loader.dataset)} ({acc:.2f}%)')
        return acc

    def do_training(self,epoch):
        return self.train(self.model, self.device, self.train_loader, self.optimizer, self.criterion,epoch)

    def do_testing(self,epoch):
        return self.test(self.model, self.device, self.test_loader, self.criterion,epoch)

    def run_epoch(self):
        logging.info(f"Training model for {self.epochs} epochs")
        for epoch in range(1, self.epochs+1):
            train_acc = self.do_training(epoch=epoch)
            test_acc = self.do_testing(epoch=epoch)
            # Epoch-level scheduler step only if not using per-batch scheduler
            if hasattr(self.scheduler, 'batch_step') and not getattr(self.scheduler, 'batch_step'):
                self.scheduler.step()
            elif not hasattr(self.scheduler, 'batch_step'):
                # Legacy schedulers
                try:
                    self.scheduler.step()
                except Exception:
                    pass
            self.train_acc_list.append(train_acc)
            self.test_acc_list.append(test_acc)

            #logging.info(f"Epoch {epoch:02d}/{self.epochs}: Train={train_acc:.2f}%, Test={test_acc:.2f}%, LR={self.scheduler.get_last_lr()[0]:.6f}")

    def plot_results(self):
        plt.plot(self.train_acc_list, label='Train Acc')
        plt.plot(self.test_acc_list, label='Test Acc')
        plt.legend()
        plt.title("Training vs Test Accuracy")
        plt.show()

In [6]:
# ================================
# summarizer.py - Model Summary & Checks
# ================================
import torch.nn as nn
from torchsummary import summary

# -----------------------------
# 8. Model Architecture Checks
# -----------------------------
def model_checks(model):
    import logging

    logging.info('--- Model Architecture Checks ---')
    # Total Parameter Count
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    logging.info(f'Total Parameters: {total_params:,}')
    logging.info(f'Trainable Parameters: {trainable_params:,}\n')

    logging.info('Layer-wise Parameter Details (in model order):')
    logging.info('-'*100)

    def get_layer_details(module):
        details = ''
        if isinstance(module, nn.Conv2d):
            details = (f'Convolution: {module.in_channels}->{module.out_channels} channels, '
                      f'kernel {module.kernel_size}, stride {module.stride}, padding {module.padding}, '
                      f'groups {module.groups}, bias {module.bias is not None}')
        elif isinstance(module, nn.BatchNorm2d):
            details = f'BatchNorm: {module.num_features} features, eps={module.eps}, momentum={module.momentum}'
        elif isinstance(module, nn.ReLU):
            details = 'Activation: ReLU'
        elif isinstance(module, nn.ReLU6):
            details = 'Activation: ReLU6'
        elif isinstance(module, nn.LeakyReLU):
            details = f'Activation: LeakyReLU (negative_slope={module.negative_slope})'
        elif isinstance(module, nn.MaxPool2d):
            details = f'MaxPool: kernel {module.kernel_size}, stride {module.stride}, padding {module.padding}'
        elif isinstance(module, nn.AvgPool2d):
            details = f'AvgPool: kernel {module.kernel_size}, stride {module.stride}, padding {module.padding}'
        elif isinstance(module, nn.AdaptiveAvgPool2d):
            details = f'AdaptiveAvgPool: output size {module.output_size}'
        elif isinstance(module, nn.Dropout):
            details = f'Dropout: probability {module.p}'
        elif isinstance(module, nn.Dropout2d):
            details = f'Dropout2d: probability {module.p}'
        elif isinstance(module, nn.Linear):
            details = f'Linear: {module.in_features}->{module.out_features}, bias {module.bias is not None}'
        elif isinstance(module, nn.Flatten):
            details = 'Flatten'
        return details

    # Get layers in order as defined in model
    for name, module in model.named_children():
        if isinstance(module, nn.Sequential):
            logging.info(f"\nBlock: {name} (Sequential)")
            for subname, submodule in module.named_children():
                layer_name = f'{name}.{subname} ({submodule.__class__.__name__})'
                layer_params = sum(p.numel() for p in submodule.parameters())
                details = get_layer_details(submodule)
                logging.info(f'  {layer_name:50} | Params: {layer_params:6,d} | {details}')
        else:
            layer_name = f'{name} ({module.__class__.__name__})'
            layer_params = sum(p.numel() for p in module.parameters())
            details = get_layer_details(module)
            logging.info(f'  {layer_name:50} | Params: {layer_params:6,d} | {details}')

    logging.info('-'*100)
    logging.info('\nLayer Type Summary:')

    # Count all layer types
    layer_types = {
        'Conv2d': [m for m in model.modules() if isinstance(m, nn.Conv2d)],
        'BatchNorm2d': [m for m in model.modules() if isinstance(m, nn.BatchNorm2d)],
        'ReLU': [m for m in model.modules() if isinstance(m, (nn.ReLU, nn.ReLU6))],
        'LeakyReLU': [m for m in model.modules() if isinstance(m, nn.LeakyReLU)],
        'MaxPool2d': [m for m in model.modules() if isinstance(m, nn.MaxPool2d)],
        'AvgPool2d': [m for m in model.modules() if isinstance(m, nn.AvgPool2d)],
        'AdaptiveAvgPool2d': [m for m in model.modules() if isinstance(m, nn.AdaptiveAvgPool2d)],
        'Dropout': [m for m in model.modules() if isinstance(m, nn.Dropout)],
        'Dropout2d': [m for m in model.modules() if isinstance(m, nn.Dropout2d)],
        'Linear': [m for m in model.modules() if isinstance(m, nn.Linear)],
        'Flatten': [m for m in model.modules() if isinstance(m, nn.Flatten)]
    }

    for layer_type, layers in layer_types.items():
        if layers:  # Only show if there are layers of this type
            logging.info(f'{layer_type:20} layers used: {len(layers):3d}')

    logging.info('-'*100)

In [7]:
# ================================
# receptive_field_calculator.py - RF Calculation
# ================================
def calculate_receptive_field(model):
    """
    Calculate receptive field for each layer in the model.
    Formula: RF_new = RF_previous + (kernel_size - 1) * dilation for stride=1
             RF_new = RF_previous * stride + (kernel_size - 1) * dilation for stride>1
    """
    rf = 1  # Starting receptive field (single pixel)
    print("Layer-by-layer receptive field calculation:")
    print(f"Input: RF = {rf}")

    # C1 block
    print("\nC1 Block:")
    # Conv2d(3, 8, 3, padding=1)
    rf = rf + (3 - 1) * 1
    print(f"Conv2d(3->8, 3x3, p=1): RF = {rf}")

    # Conv2d(8, 16, 3, padding=1)
    rf = rf + (3 - 1) * 1
    print(f"Conv2d(8->16, 3x3, p=1): RF = {rf}")

    # Conv2d(16, 32, 3, padding=1)
    rf = rf + (3 - 1) * 1
    print(f"Conv2d(16->32, 3x3, p=1): RF = {rf}")

    # C2 block
    print("\nC2 Block:")
    # Conv2d(32, 32, 3, padding=1, dilation=1)
    rf = rf + (3 - 1) * 1
    print(f"Conv2d(32->32, 3x3, p=1, d=1): RF = {rf}")

    # Conv2d(32, 32, 3, padding=2, dilation=2)
    rf = rf + (3 - 1) * 2
    print(f"Conv2d(32->32, 3x3, p=2, d=2): RF = {rf}")

    # Conv2d(32, 32, 3, padding=4, dilation=4)
    rf = rf + (3 - 1) * 4
    print(f"Conv2d(32->32, 3x3, p=4, d=4): RF = {rf}")

    # C3 block
    print("\nC3 Block:")
    # DepthwiseSeparableConv depthwise: Conv2d(32, 32, 3, stride=2, padding=1, groups=32)
    rf = rf * 2 + (3 - 1) * 1
    print(f"DepthwiseConv(32->32, 3x3, s=2, p=1): RF = {rf}")

    # Pointwise doesn't change RF
    print(f"PointwiseConv(32->64, 1x1): RF = {rf} (unchanged)")

    # Conv2d(64, 64, 3, padding=1)
    rf = rf + (3 - 1) * 1
    print(f"Conv2d(64->64, 3x3, p=1): RF = {rf}")

    # C4 block
    print("\nC4 Block:")
    # Conv2d(64, 128, 3, stride=2, padding=1)
    rf = rf * 2 + (3 - 1) * 1
    print(f"Conv2d(64->128, 3x3, s=2, p=1): RF = {rf}")

    # GAP doesn't change RF
    print(f"AdaptiveAvgPool2d(1): RF = {rf} (unchanged)")

    print(f"\nFinal receptive field: {rf} x {rf} pixels")
    return rf

def count_parameters(model):
    """Count total parameters in the model"""
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print("\nModel Parameters:")
    print(f"Total parameters: {total_params:,}")
    print(f"Trainable parameters: {trainable_params:,}")
    return total_params

In [8]:
# ================================
# main.py - Main Training Orchestration
# ================================
class get_model:
    def __init__(self,device=None):
        self.device = device if device else self.get_device()
        self.model_obj = self.get_model()
        self.model_config = self.get_config()

    def get_device(self):
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")

    def get_model(self):
        return Net().to(self.device)

    def get_config(self):
        return set_config_v0().setup(self.model_obj)

def main_i(params_check=1):
    logging.info("Setting up for model")
    model = get_model(device=None)
    # Capture printed summary into logs
    import io
    import contextlib
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        summary(model.model_obj, input_size=(3, 32, 32))
        summary_text = buf.getvalue().strip()
    if summary_text:
        logging.info("\n" + summary_text)
    train_test_instance = train_test_model(model.model_obj,
                                          model.device,
                                          model.model_config.data_setup_instance.train_loader,
                                          model.model_config.data_setup_instance.test_loader,
                                          model.model_config.criterion,
                                          model.model_config.optimizer,
                                          model.model_config.scheduler,
                                          model.model_config.epochs)
    if (params_check == 0):
        train_test_instance.run_epoch()
    else:
        pass
    #train_test_instance.plot_results()
    # Capture printed model checks into logs
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        model_checks(model.model_obj)
        checks_text = buf.getvalue().strip()
    if checks_text:
        logging.info("\n" + checks_text)

def main():
    # Initialize logging only in the main process
    setup_logging(log_to_file=True)
    params_check = int(input("Enter 1 for params check only, 0 for full training/testing: "))
    main_i(params_check=params_check)

if __name__ == "__main__":
    main()

KeyboardInterrupt: Interrupted by user

In [10]:
# ================================
# main.py - Main Training Orchestration
# ================================
class get_model:
    def __init__(self,device=None):
        self.device = device if device else self.get_device()
        self.model_obj = self.get_model()
        self.model_config = self.get_config()

    def get_device(self):
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")

    def get_model(self):
        return Net().to(self.device)

    def get_config(self):
        return set_config_v0().setup(self.model_obj)

def main_i(params_check=1):
    logging.info("Setting up for model")
    model = get_model(device=None)
    # Capture printed summary into logs
    import io
    import contextlib
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        summary(model.model_obj, input_size=(3, 32, 32))
        summary_text = buf.getvalue().strip()
    if summary_text:
        logging.info("\n" + summary_text)
    train_test_instance = train_test_model(model.model_obj,
                                          model.device,
                                          model.model_config.data_setup_instance.train_loader,
                                          model.model_config.data_setup_instance.test_loader,
                                          model.model_config.criterion,
                                          model.model_config.optimizer,
                                          model.model_config.scheduler,
                                          model.model_config.epochs)
    if (params_check == 0):
        train_test_instance.run_epoch()
    else:
        pass
    #train_test_instance.plot_results()
    # Capture printed model checks into logs
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        model_checks(model.model_obj)
        checks_text = buf.getvalue().strip()
    if checks_text:
        logging.info("\n" + checks_text)

def main():
    # Initialize logging only in the main process
    setup_logging(log_to_file=True)
    params_check = int(input("Enter 1 for params check only, 0 for full training/testing: "))
    main_i(params_check=params_check)

if __name__ == "__main__":
    main()

Enter 1 for params check only, 0 for full training/testing: 0
2025-10-02 22:28:19,103 - INFO - Setting up for model
2025-10-02 22:28:19,110 - INFO - Model v0 dataloader args: {'batch_size_train': 128, 'batch_size_test': 1000, 'shuffle_train': True, 'shuffle_test': False, 'num_workers': 2, 'pin_memory': True}


  original_init(self, **validated_kwargs)
  A.CoarseDropout(


2025-10-02 22:28:20,683 - INFO - Model v0: OneCycleLR max_lr=0.01 pct_start=0.2 div_factor=10 final_div_factor=100 epochs=15
2025-10-02 22:28:20,683 - INFO - Dataloader arguments: {'batch_size_train': 128, 'batch_size_test': 1000, 'shuffle_train': True, 'shuffle_test': False, 'num_workers': 2, 'pin_memory': True}
2025-10-02 22:28:20,724 - INFO - 
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 8, 32, 32]             216
       BatchNorm2d-2            [-1, 8, 32, 32]              16
              ReLU-3            [-1, 8, 32, 32]               0
           Dropout-4            [-1, 8, 32, 32]               0
            Conv2d-5           [-1, 16, 32, 32]           1,152
       BatchNorm2d-6           [-1, 16, 32, 32]              32
              ReLU-7           [-1, 16, 32, 32]               0
           Dropout-8           [-1, 16, 32, 32]               0
          

Train Loss=0.0154 Acc=24.35% LR=0.0033: 100%|██████████| 391/391 [00:22<00:00, 17.74it/s]


2025-10-02 22:28:42,772 - INFO - Epoch 01/15: Train set final results: Average loss: 768.4059, Accuracy: 12176/50000 (24.35%)


Test Loss=1.7024 Accuracy=31.12%: 100%|██████████| 10/10 [00:01<00:00,  5.88it/s]


2025-10-02 22:28:44,479 - INFO - Epoch 01/15:Test set final results: Average loss: 1.7024, Accuracy: 3112/10000 (31.12%)


Train Loss=0.0124 Acc=39.06% LR=0.0078: 100%|██████████| 391/391 [00:21<00:00, 18.43it/s]


2025-10-02 22:29:05,698 - INFO - Epoch 02/15: Train set final results: Average loss: 618.5986, Accuracy: 19528/50000 (39.06%)


Test Loss=1.4052 Accuracy=47.93%: 100%|██████████| 10/10 [00:02<00:00,  4.43it/s]


2025-10-02 22:29:07,960 - INFO - Epoch 02/15:Test set final results: Average loss: 1.4052, Accuracy: 4793/10000 (47.93%)


Train Loss=0.0105 Acc=50.17% LR=0.0100: 100%|██████████| 391/391 [00:21<00:00, 18.15it/s]


2025-10-02 22:29:29,504 - INFO - Epoch 03/15: Train set final results: Average loss: 523.0820, Accuracy: 25084/50000 (50.17%)


Test Loss=1.2297 Accuracy=54.69%: 100%|██████████| 10/10 [00:01<00:00,  6.31it/s]


2025-10-02 22:29:31,094 - INFO - Epoch 03/15:Test set final results: Average loss: 1.2297, Accuracy: 5469/10000 (54.69%)


Train Loss=0.0092 Acc=56.84% LR=0.0098: 100%|██████████| 391/391 [00:22<00:00, 17.14it/s]


2025-10-02 22:29:53,915 - INFO - Epoch 04/15: Train set final results: Average loss: 462.0059, Accuracy: 28422/50000 (56.84%)


Test Loss=1.0360 Accuracy=62.31%: 100%|██████████| 10/10 [00:01<00:00,  6.35it/s]


2025-10-02 22:29:55,494 - INFO - Epoch 04/15:Test set final results: Average loss: 1.0360, Accuracy: 6231/10000 (62.31%)


Train Loss=0.0085 Acc=60.62% LR=0.0093: 100%|██████████| 391/391 [00:23<00:00, 16.56it/s]


2025-10-02 22:30:19,105 - INFO - Epoch 05/15: Train set final results: Average loss: 423.8401, Accuracy: 30309/50000 (60.62%)


Test Loss=1.0187 Accuracy=63.71%: 100%|██████████| 10/10 [00:01<00:00,  6.35it/s]


2025-10-02 22:30:20,684 - INFO - Epoch 05/15:Test set final results: Average loss: 1.0187, Accuracy: 6371/10000 (63.71%)


Train Loss=0.0080 Acc=63.30% LR=0.0085: 100%|██████████| 391/391 [00:22<00:00, 17.44it/s]


2025-10-02 22:30:43,103 - INFO - Epoch 06/15: Train set final results: Average loss: 397.6592, Accuracy: 31648/50000 (63.30%)


Test Loss=0.9465 Accuracy=65.90%: 100%|██████████| 10/10 [00:01<00:00,  6.36it/s]


2025-10-02 22:30:44,678 - INFO - Epoch 06/15:Test set final results: Average loss: 0.9465, Accuracy: 6590/10000 (65.90%)


Train Loss=0.0075 Acc=65.29% LR=0.0075: 100%|██████████| 391/391 [00:22<00:00, 17.57it/s]


2025-10-02 22:31:06,939 - INFO - Epoch 07/15: Train set final results: Average loss: 377.3529, Accuracy: 32643/50000 (65.29%)


Test Loss=0.9542 Accuracy=66.25%: 100%|██████████| 10/10 [00:01<00:00,  6.44it/s]


2025-10-02 22:31:08,495 - INFO - Epoch 07/15:Test set final results: Average loss: 0.9542, Accuracy: 6625/10000 (66.25%)


Train Loss=0.0072 Acc=66.94% LR=0.0063: 100%|██████████| 391/391 [00:22<00:00, 17.69it/s]


2025-10-02 22:31:30,599 - INFO - Epoch 08/15: Train set final results: Average loss: 359.0284, Accuracy: 33471/50000 (66.94%)


Test Loss=0.8503 Accuracy=70.29%: 100%|██████████| 10/10 [00:01<00:00,  5.67it/s]


2025-10-02 22:31:32,366 - INFO - Epoch 08/15:Test set final results: Average loss: 0.8503, Accuracy: 7029/10000 (70.29%)


Train Loss=0.0068 Acc=68.51% LR=0.0050: 100%|██████████| 391/391 [00:21<00:00, 18.25it/s]


2025-10-02 22:31:53,793 - INFO - Epoch 09/15: Train set final results: Average loss: 341.0438, Accuracy: 34257/50000 (68.51%)


Test Loss=0.8262 Accuracy=70.48%: 100%|██████████| 10/10 [00:02<00:00,  4.86it/s]


2025-10-02 22:31:55,854 - INFO - Epoch 09/15:Test set final results: Average loss: 0.8262, Accuracy: 7048/10000 (70.48%)


Train Loss=0.0065 Acc=70.29% LR=0.0037: 100%|██████████| 391/391 [00:21<00:00, 17.89it/s]


2025-10-02 22:32:17,715 - INFO - Epoch 10/15: Train set final results: Average loss: 326.6047, Accuracy: 35147/50000 (70.29%)


Test Loss=0.7987 Accuracy=71.82%: 100%|██████████| 10/10 [00:01<00:00,  6.03it/s]


2025-10-02 22:32:19,379 - INFO - Epoch 10/15:Test set final results: Average loss: 0.7987, Accuracy: 7182/10000 (71.82%)


Train Loss=0.0063 Acc=71.32% LR=0.0025: 100%|██████████| 391/391 [00:22<00:00, 17.37it/s]


2025-10-02 22:32:41,889 - INFO - Epoch 11/15: Train set final results: Average loss: 314.6136, Accuracy: 35659/50000 (71.32%)


Test Loss=0.7214 Accuracy=74.69%: 100%|██████████| 10/10 [00:01<00:00,  6.30it/s]


2025-10-02 22:32:43,483 - INFO - Epoch 11/15:Test set final results: Average loss: 0.7214, Accuracy: 7469/10000 (74.69%)


Train Loss=0.0060 Acc=72.63% LR=0.0015: 100%|██████████| 391/391 [00:22<00:00, 17.25it/s]


2025-10-02 22:33:06,151 - INFO - Epoch 12/15: Train set final results: Average loss: 300.9645, Accuracy: 36317/50000 (72.63%)


Test Loss=0.7147 Accuracy=74.81%: 100%|██████████| 10/10 [00:02<00:00,  4.32it/s]


2025-10-02 22:33:08,473 - INFO - Epoch 12/15:Test set final results: Average loss: 0.7147, Accuracy: 7481/10000 (74.81%)


Train Loss=0.0059 Acc=73.42% LR=0.0007: 100%|██████████| 391/391 [00:22<00:00, 17.23it/s]


2025-10-02 22:33:31,173 - INFO - Epoch 13/15: Train set final results: Average loss: 293.4296, Accuracy: 36710/50000 (73.42%)


Test Loss=0.6682 Accuracy=76.38%: 100%|██████████| 10/10 [00:01<00:00,  6.20it/s]


2025-10-02 22:33:32,792 - INFO - Epoch 13/15:Test set final results: Average loss: 0.6682, Accuracy: 7638/10000 (76.38%)


Train Loss=0.0057 Acc=74.39% LR=0.0002: 100%|██████████| 391/391 [00:22<00:00, 17.23it/s]


2025-10-02 22:33:55,486 - INFO - Epoch 14/15: Train set final results: Average loss: 283.5937, Accuracy: 37196/50000 (74.39%)


Test Loss=0.6586 Accuracy=77.09%: 100%|██████████| 10/10 [00:01<00:00,  6.34it/s]


2025-10-02 22:33:57,069 - INFO - Epoch 14/15:Test set final results: Average loss: 0.6586, Accuracy: 7709/10000 (77.09%)


Train Loss=0.0056 Acc=74.55% LR=0.0000: 100%|██████████| 391/391 [00:22<00:00, 17.24it/s]


2025-10-02 22:34:19,748 - INFO - Epoch 15/15: Train set final results: Average loss: 279.6953, Accuracy: 37274/50000 (74.55%)


Test Loss=0.6558 Accuracy=77.01%: 100%|██████████| 10/10 [00:01<00:00,  5.97it/s]

2025-10-02 22:34:21,426 - INFO - Epoch 15/15:Test set final results: Average loss: 0.6558, Accuracy: 7701/10000 (77.01%)
2025-10-02 22:34:21,431 - INFO - 
2025-10-02 22:34:21,426 - INFO - --- Model Architecture Checks ---
2025-10-02 22:34:21,427 - INFO - Total Parameters: 148,466
2025-10-02 22:34:21,427 - INFO - Trainable Parameters: 148,466

2025-10-02 22:34:21,427 - INFO - Layer-wise Parameter Details (in model order):
2025-10-02 22:34:21,427 - INFO - ----------------------------------------------------------------------------------------------------
2025-10-02 22:34:21,427 - INFO - 
Block: c1 (Sequential)
2025-10-02 22:34:21,428 - INFO -   c1.0 (Conv2d)                                      | Params:    216 | Convolution: 3->8 channels, kernel (3, 3), stride (1, 1), padding (1, 1), groups 1, bias False
2025-10-02 22:34:21,428 - INFO -   c1.1 (BatchNorm2d)                                 | Params:     16 | BatchNorm: 8 features, eps=1e-05, momentum=0.1
2025-10-02 22:34:21,428 - INFO -




In [13]:
# ================================
# main.py - Main Training Orchestration
# ================================
class get_model:
    def __init__(self,device=None):
        self.device = device if device else self.get_device()
        self.model_obj = self.get_model()
        self.model_config = self.get_config()

    def get_device(self):
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")

    def get_model(self):
        return Net().to(self.device)

    def get_config(self):
        return set_config_v0().setup(self.model_obj)

def main_i(params_check=1):
    logging.info("Setting up for model")
    model = get_model(device=None)
    # Capture printed summary into logs
    import io
    import contextlib
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        summary(model.model_obj, input_size=(3, 32, 32))
        summary_text = buf.getvalue().strip()
    if summary_text:
        logging.info("\n" + summary_text)
    train_test_instance = train_test_model(model.model_obj,
                                          model.device,
                                          model.model_config.data_setup_instance.train_loader,
                                          model.model_config.data_setup_instance.test_loader,
                                          model.model_config.criterion,
                                          model.model_config.optimizer,
                                          model.model_config.scheduler,
                                          model.model_config.epochs)
    if (params_check == 0):
        train_test_instance.run_epoch()
    else:
        pass
    #train_test_instance.plot_results()
    # Capture printed model checks into logs
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        model_checks(model.model_obj)
        checks_text = buf.getvalue().strip()
    if checks_text:
        logging.info("\n" + checks_text)

def main():
    # Initialize logging only in the main process
    setup_logging(log_to_file=True)
    params_check = int(input("Enter 1 for params check only, 0 for full training/testing: "))
    main_i(params_check=params_check)

if __name__ == "__main__":
    main()

Enter 1 for params check only, 0 for full training/testing: 0
2025-10-02 22:53:49,846 - INFO - Setting up for model
2025-10-02 22:53:49,857 - INFO - Model v0 dataloader args: {'batch_size_train': 128, 'batch_size_test': 1000, 'shuffle_train': True, 'shuffle_test': False, 'num_workers': 2, 'pin_memory': True}


  A.CoarseDropout(


2025-10-02 22:53:51,445 - INFO - Model v0: OneCycleLR max_lr=0.01 pct_start=0.2 div_factor=10 final_div_factor=100 epochs=15
2025-10-02 22:53:51,446 - INFO - Dataloader arguments: {'batch_size_train': 128, 'batch_size_test': 1000, 'shuffle_train': True, 'shuffle_test': False, 'num_workers': 2, 'pin_memory': True}
2025-10-02 22:53:51,452 - INFO - 
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 8, 32, 32]             216
       BatchNorm2d-2            [-1, 8, 32, 32]              16
              ReLU-3            [-1, 8, 32, 32]               0
           Dropout-4            [-1, 8, 32, 32]               0
            Conv2d-5           [-1, 16, 32, 32]           1,152
       BatchNorm2d-6           [-1, 16, 32, 32]              32
              ReLU-7           [-1, 16, 32, 32]               0
           Dropout-8           [-1, 16, 32, 32]               0
          

Train Loss=0.0156 Acc=23.10% LR=0.0033: 100%|██████████| 391/391 [00:21<00:00, 18.42it/s]


2025-10-02 22:54:12,677 - INFO - Epoch 01/15: Train set final results: Average loss: 781.3326, Accuracy: 11551/50000 (23.10%)


Test Loss=1.7181 Accuracy=33.99%: 100%|██████████| 10/10 [00:02<00:00,  4.56it/s]


2025-10-02 22:54:14,873 - INFO - Epoch 01/15:Test set final results: Average loss: 1.7181, Accuracy: 3399/10000 (33.99%)


Train Loss=0.0122 Acc=41.06% LR=0.0078: 100%|██████████| 391/391 [00:23<00:00, 16.94it/s]


2025-10-02 22:54:37,957 - INFO - Epoch 02/15: Train set final results: Average loss: 610.2634, Accuracy: 20528/50000 (41.06%)


Test Loss=1.4122 Accuracy=47.37%: 100%|██████████| 10/10 [00:01<00:00,  6.15it/s]


2025-10-02 22:54:39,586 - INFO - Epoch 02/15:Test set final results: Average loss: 1.4122, Accuracy: 4737/10000 (47.37%)


Train Loss=0.0103 Acc=51.46% LR=0.0100: 100%|██████████| 391/391 [00:22<00:00, 17.60it/s]


2025-10-02 22:55:01,802 - INFO - Epoch 03/15: Train set final results: Average loss: 516.1983, Accuracy: 25728/50000 (51.46%)


Test Loss=1.1993 Accuracy=56.12%: 100%|██████████| 10/10 [00:01<00:00,  6.13it/s]


2025-10-02 22:55:03,437 - INFO - Epoch 03/15:Test set final results: Average loss: 1.1993, Accuracy: 5612/10000 (56.12%)


Train Loss=0.0092 Acc=57.31% LR=0.0098: 100%|██████████| 391/391 [00:22<00:00, 17.35it/s]


2025-10-02 22:55:25,980 - INFO - Epoch 04/15: Train set final results: Average loss: 462.0678, Accuracy: 28655/50000 (57.31%)


Test Loss=1.1512 Accuracy=58.89%: 100%|██████████| 10/10 [00:01<00:00,  6.31it/s]


2025-10-02 22:55:27,569 - INFO - Epoch 04/15:Test set final results: Average loss: 1.1512, Accuracy: 5889/10000 (58.89%)


Train Loss=0.0086 Acc=60.29% LR=0.0093: 100%|██████████| 391/391 [00:22<00:00, 17.33it/s]


2025-10-02 22:55:50,132 - INFO - Epoch 05/15: Train set final results: Average loss: 427.7597, Accuracy: 30144/50000 (60.29%)


Test Loss=1.0688 Accuracy=62.14%: 100%|██████████| 10/10 [00:01<00:00,  6.24it/s]


2025-10-02 22:55:51,741 - INFO - Epoch 05/15:Test set final results: Average loss: 1.0688, Accuracy: 6214/10000 (62.14%)


Train Loss=0.0080 Acc=62.72% LR=0.0085: 100%|██████████| 391/391 [00:22<00:00, 17.37it/s]


2025-10-02 22:56:14,251 - INFO - Epoch 06/15: Train set final results: Average loss: 401.4008, Accuracy: 31358/50000 (62.72%)


Test Loss=0.9325 Accuracy=66.78%: 100%|██████████| 10/10 [00:01<00:00,  6.15it/s]


2025-10-02 22:56:15,881 - INFO - Epoch 06/15:Test set final results: Average loss: 0.9325, Accuracy: 6678/10000 (66.78%)


Train Loss=0.0076 Acc=65.04% LR=0.0075: 100%|██████████| 391/391 [00:21<00:00, 18.04it/s]


2025-10-02 22:56:37,560 - INFO - Epoch 07/15: Train set final results: Average loss: 381.4600, Accuracy: 32521/50000 (65.04%)


Test Loss=0.9466 Accuracy=66.80%: 100%|██████████| 10/10 [00:02<00:00,  4.21it/s]


2025-10-02 22:56:39,941 - INFO - Epoch 07/15:Test set final results: Average loss: 0.9466, Accuracy: 6680/10000 (66.80%)


Train Loss=0.0072 Acc=66.87% LR=0.0063: 100%|██████████| 391/391 [00:21<00:00, 18.20it/s]


2025-10-02 22:57:01,429 - INFO - Epoch 08/15: Train set final results: Average loss: 361.3235, Accuracy: 33436/50000 (66.87%)


Test Loss=0.8526 Accuracy=69.52%: 100%|██████████| 10/10 [00:01<00:00,  5.65it/s]


2025-10-02 22:57:03,203 - INFO - Epoch 08/15:Test set final results: Average loss: 0.8526, Accuracy: 6952/10000 (69.52%)


Train Loss=0.0069 Acc=68.25% LR=0.0050: 100%|██████████| 391/391 [00:21<00:00, 17.85it/s]


2025-10-02 22:57:25,107 - INFO - Epoch 09/15: Train set final results: Average loss: 346.4174, Accuracy: 34123/50000 (68.25%)


Test Loss=0.8193 Accuracy=71.12%: 100%|██████████| 10/10 [00:02<00:00,  4.34it/s]


2025-10-02 22:57:27,414 - INFO - Epoch 09/15:Test set final results: Average loss: 0.8193, Accuracy: 7112/10000 (71.12%)


Train Loss=0.0066 Acc=70.02% LR=0.0037: 100%|██████████| 391/391 [00:22<00:00, 17.27it/s]


2025-10-02 22:57:50,052 - INFO - Epoch 10/15: Train set final results: Average loss: 331.1379, Accuracy: 35008/50000 (70.02%)


Test Loss=0.7717 Accuracy=72.81%: 100%|██████████| 10/10 [00:01<00:00,  6.28it/s]


2025-10-02 22:57:51,648 - INFO - Epoch 10/15:Test set final results: Average loss: 0.7717, Accuracy: 7281/10000 (72.81%)


Train Loss=0.0064 Acc=70.96% LR=0.0025: 100%|██████████| 391/391 [00:22<00:00, 17.38it/s]


2025-10-02 22:58:14,145 - INFO - Epoch 11/15: Train set final results: Average loss: 318.8619, Accuracy: 35481/50000 (70.96%)


Test Loss=0.7331 Accuracy=74.33%: 100%|██████████| 10/10 [00:01<00:00,  6.06it/s]


2025-10-02 22:58:15,801 - INFO - Epoch 11/15:Test set final results: Average loss: 0.7331, Accuracy: 7433/10000 (74.33%)


Train Loss=0.0061 Acc=72.50% LR=0.0015: 100%|██████████| 391/391 [00:22<00:00, 17.33it/s]


2025-10-02 22:58:38,373 - INFO - Epoch 12/15: Train set final results: Average loss: 306.6767, Accuracy: 36249/50000 (72.50%)


Test Loss=0.7167 Accuracy=75.07%: 100%|██████████| 10/10 [00:01<00:00,  6.13it/s]


2025-10-02 22:58:40,010 - INFO - Epoch 12/15:Test set final results: Average loss: 0.7167, Accuracy: 7507/10000 (75.07%)


Train Loss=0.0059 Acc=73.44% LR=0.0007: 100%|██████████| 391/391 [00:22<00:00, 17.45it/s]


2025-10-02 22:59:02,423 - INFO - Epoch 13/15: Train set final results: Average loss: 294.6736, Accuracy: 36720/50000 (73.44%)


Test Loss=0.6796 Accuracy=76.31%: 100%|██████████| 10/10 [00:01<00:00,  6.22it/s]


2025-10-02 22:59:04,035 - INFO - Epoch 13/15:Test set final results: Average loss: 0.6796, Accuracy: 7631/10000 (76.31%)


Train Loss=0.0057 Acc=74.16% LR=0.0002: 100%|██████████| 391/391 [00:21<00:00, 18.13it/s]


2025-10-02 22:59:25,611 - INFO - Epoch 14/15: Train set final results: Average loss: 284.8207, Accuracy: 37079/50000 (74.16%)


Test Loss=0.6624 Accuracy=76.78%: 100%|██████████| 10/10 [00:02<00:00,  4.20it/s]


2025-10-02 22:59:27,999 - INFO - Epoch 14/15:Test set final results: Average loss: 0.6624, Accuracy: 7678/10000 (76.78%)


Train Loss=0.0057 Acc=74.68% LR=0.0000: 100%|██████████| 391/391 [00:21<00:00, 17.90it/s]


2025-10-02 22:59:49,844 - INFO - Epoch 15/15: Train set final results: Average loss: 282.9829, Accuracy: 37338/50000 (74.68%)


Test Loss=0.6517 Accuracy=77.33%: 100%|██████████| 10/10 [00:01<00:00,  5.76it/s]

2025-10-02 22:59:51,584 - INFO - Epoch 15/15:Test set final results: Average loss: 0.6517, Accuracy: 7733/10000 (77.33%)
2025-10-02 22:59:51,590 - INFO - 
2025-10-02 22:59:51,584 - INFO - --- Model Architecture Checks ---
2025-10-02 22:59:51,585 - INFO - Total Parameters: 157,778
2025-10-02 22:59:51,585 - INFO - Trainable Parameters: 157,778

2025-10-02 22:59:51,585 - INFO - Layer-wise Parameter Details (in model order):
2025-10-02 22:59:51,585 - INFO - ----------------------------------------------------------------------------------------------------
2025-10-02 22:59:51,585 - INFO - 
Block: c1 (Sequential)
2025-10-02 22:59:51,585 - INFO -   c1.0 (Conv2d)                                      | Params:    216 | Convolution: 3->8 channels, kernel (3, 3), stride (1, 1), padding (1, 1), groups 1, bias False
2025-10-02 22:59:51,586 - INFO -   c1.1 (BatchNorm2d)                                 | Params:     16 | BatchNorm: 8 features, eps=1e-05, momentum=0.1
2025-10-02 22:59:51,586 - INFO -




In [9]:
# ================================
# main.py - Main Training Orchestration
# ================================
class get_model:
    def __init__(self,device=None):
        self.device = device if device else self.get_device()
        self.model_obj = self.get_model()
        self.model_config = self.get_config()

    def get_device(self):
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")

    def get_model(self):
        return Net().to(self.device)

    def get_config(self):
        return set_config_v0().setup(self.model_obj)

def main_i(params_check=1):
    logging.info("Setting up for model")
    model = get_model(device=None)
    # Capture printed summary into logs
    import io
    import contextlib
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        summary(model.model_obj, input_size=(3, 32, 32))
        summary_text = buf.getvalue().strip()
    if summary_text:
        logging.info("\n" + summary_text)
    train_test_instance = train_test_model(model.model_obj,
                                          model.device,
                                          model.model_config.data_setup_instance.train_loader,
                                          model.model_config.data_setup_instance.test_loader,
                                          model.model_config.criterion,
                                          model.model_config.optimizer,
                                          model.model_config.scheduler,
                                          model.model_config.epochs)
    if (params_check == 0):
        train_test_instance.run_epoch()
    else:
        pass
    #train_test_instance.plot_results()
    # Capture printed model checks into logs
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        model_checks(model.model_obj)
        checks_text = buf.getvalue().strip()
    if checks_text:
        logging.info("\n" + checks_text)

def main():
    # Initialize logging only in the main process
    setup_logging(log_to_file=True)
    params_check = int(input("Enter 1 for params check only, 0 for full training/testing: "))
    main_i(params_check=params_check)

if __name__ == "__main__":
    main()

Enter 1 for params check only, 0 for full training/testing: 0
2025-10-03 00:10:14,510 - INFO - Setting up for model
2025-10-03 00:10:14,855 - INFO - Model v0 dataloader args: {'batch_size_train': 32, 'batch_size_test': 1000, 'shuffle_train': True, 'shuffle_test': False, 'num_workers': 2, 'pin_memory': True}


  original_init(self, **validated_kwargs)
  A.CoarseDropout(
100%|██████████| 170M/170M [01:03<00:00, 2.67MB/s]


2025-10-03 00:11:22,027 - INFO - Model v0: OneCycleLR max_lr=0.01 pct_start=0.2 div_factor=10 final_div_factor=100 epochs=35
2025-10-03 00:11:22,028 - INFO - Dataloader arguments: {'batch_size_train': 32, 'batch_size_test': 1000, 'shuffle_train': True, 'shuffle_test': False, 'num_workers': 2, 'pin_memory': True}
2025-10-03 00:11:23,006 - INFO - 
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 8, 32, 32]             216
       BatchNorm2d-2            [-1, 8, 32, 32]              16
              ReLU-3            [-1, 8, 32, 32]               0
           Dropout-4            [-1, 8, 32, 32]               0
            Conv2d-5           [-1, 16, 32, 32]           1,152
       BatchNorm2d-6           [-1, 16, 32, 32]              32
              ReLU-7           [-1, 16, 32, 32]               0
           Dropout-8           [-1, 16, 32, 32]               0
           

Train Loss=0.0557 Acc=31.64% LR=0.0014: 100%|██████████| 1563/1563 [00:35<00:00, 44.28it/s]


2025-10-03 00:11:58,307 - INFO - Epoch 01/35: Train set final results: Average loss: 2787.2644, Accuracy: 15822/50000 (31.64%)


Test Loss=1.5422 Accuracy=40.58%: 100%|██████████| 10/10 [00:01<00:00,  5.60it/s]


2025-10-03 00:12:00,098 - INFO - Epoch 01/35:Test set final results: Average loss: 1.5422, Accuracy: 4058/10000 (40.58%)


Train Loss=0.0462 Acc=44.99% LR=0.0027: 100%|██████████| 1563/1563 [00:33<00:00, 46.23it/s]


2025-10-03 00:12:33,910 - INFO - Epoch 02/35: Train set final results: Average loss: 2311.1262, Accuracy: 22494/50000 (44.99%)


Test Loss=1.3076 Accuracy=51.86%: 100%|██████████| 10/10 [00:02<00:00,  4.38it/s]


2025-10-03 00:12:36,200 - INFO - Epoch 02/35:Test set final results: Average loss: 1.3076, Accuracy: 5186/10000 (51.86%)


Train Loss=0.0401 Acc=53.37% LR=0.0045: 100%|██████████| 1563/1563 [00:35<00:00, 43.45it/s]


2025-10-03 00:13:12,172 - INFO - Epoch 03/35: Train set final results: Average loss: 2007.0715, Accuracy: 26685/50000 (53.37%)


Test Loss=1.2655 Accuracy=55.93%: 100%|██████████| 10/10 [00:01<00:00,  5.90it/s]


2025-10-03 00:13:13,872 - INFO - Epoch 03/35:Test set final results: Average loss: 1.2655, Accuracy: 5593/10000 (55.93%)


Train Loss=0.0361 Acc=58.73% LR=0.0065: 100%|██████████| 1563/1563 [00:35<00:00, 43.92it/s]


2025-10-03 00:13:49,462 - INFO - Epoch 04/35: Train set final results: Average loss: 1802.9942, Accuracy: 29366/50000 (58.73%)


Test Loss=1.1306 Accuracy=59.49%: 100%|██████████| 10/10 [00:01<00:00,  5.87it/s]


2025-10-03 00:13:51,171 - INFO - Epoch 04/35:Test set final results: Average loss: 1.1306, Accuracy: 5949/10000 (59.49%)


Train Loss=0.0335 Acc=61.73% LR=0.0083: 100%|██████████| 1563/1563 [00:35<00:00, 43.55it/s]


2025-10-03 00:14:27,065 - INFO - Epoch 05/35: Train set final results: Average loss: 1672.6435, Accuracy: 30866/50000 (61.73%)


Test Loss=0.9751 Accuracy=65.50%: 100%|██████████| 10/10 [00:01<00:00,  6.01it/s]


2025-10-03 00:14:28,734 - INFO - Epoch 05/35:Test set final results: Average loss: 0.9751, Accuracy: 6550/10000 (65.50%)


Train Loss=0.0311 Acc=64.55% LR=0.0096: 100%|██████████| 1563/1563 [00:34<00:00, 45.38it/s]


2025-10-03 00:15:03,183 - INFO - Epoch 06/35: Train set final results: Average loss: 1556.5277, Accuracy: 32277/50000 (64.55%)


Test Loss=0.9245 Accuracy=68.02%: 100%|██████████| 10/10 [00:01<00:00,  5.30it/s]


2025-10-03 00:15:05,076 - INFO - Epoch 06/35:Test set final results: Average loss: 0.9245, Accuracy: 6802/10000 (68.02%)


Train Loss=0.0293 Acc=66.83% LR=0.0100: 100%|██████████| 1563/1563 [00:33<00:00, 46.32it/s]


2025-10-03 00:15:38,828 - INFO - Epoch 07/35: Train set final results: Average loss: 1462.6938, Accuracy: 33415/50000 (66.83%)


Test Loss=0.7915 Accuracy=72.26%: 100%|██████████| 10/10 [00:01<00:00,  5.80it/s]


2025-10-03 00:15:40,555 - INFO - Epoch 07/35:Test set final results: Average loss: 0.7915, Accuracy: 7226/10000 (72.26%)


Train Loss=0.0279 Acc=68.46% LR=0.0100: 100%|██████████| 1563/1563 [00:34<00:00, 45.96it/s]


2025-10-03 00:16:14,569 - INFO - Epoch 08/35: Train set final results: Average loss: 1395.5646, Accuracy: 34228/50000 (68.46%)


Test Loss=0.8167 Accuracy=72.37%: 100%|██████████| 10/10 [00:01<00:00,  6.03it/s]


2025-10-03 00:16:16,232 - INFO - Epoch 08/35:Test set final results: Average loss: 0.8167, Accuracy: 7237/10000 (72.37%)


Train Loss=0.0267 Acc=69.92% LR=0.0099: 100%|██████████| 1563/1563 [00:36<00:00, 43.06it/s]


2025-10-03 00:16:52,534 - INFO - Epoch 09/35: Train set final results: Average loss: 1333.5776, Accuracy: 34960/50000 (69.92%)


Test Loss=0.7551 Accuracy=73.94%: 100%|██████████| 10/10 [00:01<00:00,  6.17it/s]


2025-10-03 00:16:54,159 - INFO - Epoch 09/35:Test set final results: Average loss: 0.7551, Accuracy: 7394/10000 (73.94%)


Train Loss=0.0256 Acc=71.42% LR=0.0097: 100%|██████████| 1563/1563 [00:34<00:00, 45.56it/s]


2025-10-03 00:17:28,465 - INFO - Epoch 10/35: Train set final results: Average loss: 1278.8487, Accuracy: 35709/50000 (71.42%)


Test Loss=0.7631 Accuracy=73.80%: 100%|██████████| 10/10 [00:01<00:00,  5.94it/s]


2025-10-03 00:17:30,154 - INFO - Epoch 10/35:Test set final results: Average loss: 0.7631, Accuracy: 7380/10000 (73.80%)


Train Loss=0.0248 Acc=72.35% LR=0.0095: 100%|██████████| 1563/1563 [00:34<00:00, 45.77it/s]


2025-10-03 00:18:04,308 - INFO - Epoch 11/35: Train set final results: Average loss: 1242.1008, Accuracy: 36176/50000 (72.35%)


Test Loss=0.6824 Accuracy=76.31%: 100%|██████████| 10/10 [00:02<00:00,  4.47it/s]


2025-10-03 00:18:06,548 - INFO - Epoch 11/35:Test set final results: Average loss: 0.6824, Accuracy: 7631/10000 (76.31%)


Train Loss=0.0241 Acc=73.26% LR=0.0092: 100%|██████████| 1563/1563 [00:36<00:00, 43.20it/s]


2025-10-03 00:18:42,734 - INFO - Epoch 12/35: Train set final results: Average loss: 1203.3715, Accuracy: 36631/50000 (73.26%)


Test Loss=0.7016 Accuracy=76.34%: 100%|██████████| 10/10 [00:01<00:00,  6.17it/s]


2025-10-03 00:18:44,360 - INFO - Epoch 12/35:Test set final results: Average loss: 0.7016, Accuracy: 7634/10000 (76.34%)


Train Loss=0.0234 Acc=73.90% LR=0.0089: 100%|██████████| 1563/1563 [00:33<00:00, 46.17it/s]


2025-10-03 00:19:18,217 - INFO - Epoch 13/35: Train set final results: Average loss: 1169.9813, Accuracy: 36949/50000 (73.90%)


Test Loss=0.6777 Accuracy=77.11%: 100%|██████████| 10/10 [00:01<00:00,  6.25it/s]


2025-10-03 00:19:19,822 - INFO - Epoch 13/35:Test set final results: Average loss: 0.6777, Accuracy: 7711/10000 (77.11%)


Train Loss=0.0229 Acc=74.63% LR=0.0085: 100%|██████████| 1563/1563 [00:34<00:00, 45.00it/s]


2025-10-03 00:19:54,563 - INFO - Epoch 14/35: Train set final results: Average loss: 1143.6724, Accuracy: 37314/50000 (74.63%)


Test Loss=0.6173 Accuracy=78.40%: 100%|██████████| 10/10 [00:01<00:00,  6.31it/s]


2025-10-03 00:19:56,152 - INFO - Epoch 14/35:Test set final results: Average loss: 0.6173, Accuracy: 7840/10000 (78.40%)


Train Loss=0.0222 Acc=75.20% LR=0.0081: 100%|██████████| 1563/1563 [00:32<00:00, 47.41it/s]


2025-10-03 00:20:29,121 - INFO - Epoch 15/35: Train set final results: Average loss: 1110.8866, Accuracy: 37599/50000 (75.20%)


Test Loss=0.6319 Accuracy=78.73%: 100%|██████████| 10/10 [00:01<00:00,  5.84it/s]


2025-10-03 00:20:30,837 - INFO - Epoch 15/35:Test set final results: Average loss: 0.6319, Accuracy: 7873/10000 (78.73%)


Train Loss=0.0216 Acc=76.12% LR=0.0077: 100%|██████████| 1563/1563 [00:33<00:00, 46.23it/s]


2025-10-03 00:21:04,653 - INFO - Epoch 16/35: Train set final results: Average loss: 1081.6140, Accuracy: 38059/50000 (76.12%)


Test Loss=0.5990 Accuracy=79.32%: 100%|██████████| 10/10 [00:01<00:00,  6.25it/s]


2025-10-03 00:21:06,258 - INFO - Epoch 16/35:Test set final results: Average loss: 0.5990, Accuracy: 7932/10000 (79.32%)


Train Loss=0.0211 Acc=76.78% LR=0.0072: 100%|██████████| 1563/1563 [00:33<00:00, 46.00it/s]


2025-10-03 00:21:40,242 - INFO - Epoch 17/35: Train set final results: Average loss: 1053.9131, Accuracy: 38389/50000 (76.78%)


Test Loss=0.6066 Accuracy=79.64%: 100%|██████████| 10/10 [00:01<00:00,  6.10it/s]


2025-10-03 00:21:41,886 - INFO - Epoch 17/35:Test set final results: Average loss: 0.6066, Accuracy: 7964/10000 (79.64%)


Train Loss=0.0208 Acc=77.02% LR=0.0067: 100%|██████████| 1563/1563 [00:33<00:00, 47.35it/s]


2025-10-03 00:22:14,896 - INFO - Epoch 18/35: Train set final results: Average loss: 1039.9282, Accuracy: 38508/50000 (77.02%)


Test Loss=0.5793 Accuracy=80.81%: 100%|██████████| 10/10 [00:01<00:00,  6.04it/s]


2025-10-03 00:22:16,556 - INFO - Epoch 18/35:Test set final results: Average loss: 0.5793, Accuracy: 8081/10000 (80.81%)


Train Loss=0.0205 Acc=77.00% LR=0.0061: 100%|██████████| 1563/1563 [00:36<00:00, 42.51it/s]


2025-10-03 00:22:53,332 - INFO - Epoch 19/35: Train set final results: Average loss: 1025.9219, Accuracy: 38501/50000 (77.00%)


Test Loss=0.5741 Accuracy=80.73%: 100%|██████████| 10/10 [00:01<00:00,  5.92it/s]


2025-10-03 00:22:55,028 - INFO - Epoch 19/35:Test set final results: Average loss: 0.5741, Accuracy: 8073/10000 (80.73%)


Train Loss=0.0201 Acc=77.63% LR=0.0056: 100%|██████████| 1563/1563 [00:38<00:00, 40.26it/s]


2025-10-03 00:23:33,856 - INFO - Epoch 20/35: Train set final results: Average loss: 1004.9551, Accuracy: 38815/50000 (77.63%)


Test Loss=0.5470 Accuracy=81.40%: 100%|██████████| 10/10 [00:01<00:00,  5.44it/s]


2025-10-03 00:23:35,701 - INFO - Epoch 20/35:Test set final results: Average loss: 0.5470, Accuracy: 8140/10000 (81.40%)


Train Loss=0.0196 Acc=78.30% LR=0.0050: 100%|██████████| 1563/1563 [00:37<00:00, 41.57it/s]


2025-10-03 00:24:13,301 - INFO - Epoch 21/35: Train set final results: Average loss: 978.3846, Accuracy: 39149/50000 (78.30%)


Test Loss=0.5259 Accuracy=82.09%: 100%|██████████| 10/10 [00:01<00:00,  5.80it/s]


2025-10-03 00:24:15,029 - INFO - Epoch 21/35:Test set final results: Average loss: 0.5259, Accuracy: 8209/10000 (82.09%)


Train Loss=0.0192 Acc=78.73% LR=0.0044: 100%|██████████| 1563/1563 [00:36<00:00, 42.51it/s]


2025-10-03 00:24:51,802 - INFO - Epoch 22/35: Train set final results: Average loss: 962.0642, Accuracy: 39367/50000 (78.73%)


Test Loss=0.5321 Accuracy=82.21%: 100%|██████████| 10/10 [00:01<00:00,  5.69it/s]


2025-10-03 00:24:53,565 - INFO - Epoch 22/35:Test set final results: Average loss: 0.5321, Accuracy: 8221/10000 (82.21%)


Train Loss=0.0189 Acc=79.24% LR=0.0039: 100%|██████████| 1563/1563 [00:37<00:00, 41.23it/s]


2025-10-03 00:25:31,478 - INFO - Epoch 23/35: Train set final results: Average loss: 947.2682, Accuracy: 39618/50000 (79.24%)


Test Loss=0.5049 Accuracy=82.99%: 100%|██████████| 10/10 [00:01<00:00,  5.95it/s]


2025-10-03 00:25:33,164 - INFO - Epoch 23/35:Test set final results: Average loss: 0.5049, Accuracy: 8299/10000 (82.99%)


Train Loss=0.0186 Acc=79.39% LR=0.0034: 100%|██████████| 1563/1563 [00:36<00:00, 42.98it/s]


2025-10-03 00:26:09,534 - INFO - Epoch 24/35: Train set final results: Average loss: 928.6708, Accuracy: 39697/50000 (79.39%)


Test Loss=0.5230 Accuracy=82.51%: 100%|██████████| 10/10 [00:02<00:00,  4.59it/s]


2025-10-03 00:26:11,718 - INFO - Epoch 24/35:Test set final results: Average loss: 0.5230, Accuracy: 8251/10000 (82.51%)


Train Loss=0.0180 Acc=80.09% LR=0.0028: 100%|██████████| 1563/1563 [00:36<00:00, 42.81it/s]


2025-10-03 00:26:48,234 - INFO - Epoch 25/35: Train set final results: Average loss: 901.0948, Accuracy: 40047/50000 (80.09%)


Test Loss=0.5119 Accuracy=83.04%: 100%|██████████| 10/10 [00:02<00:00,  3.66it/s]


2025-10-03 00:26:50,972 - INFO - Epoch 25/35:Test set final results: Average loss: 0.5119, Accuracy: 8304/10000 (83.04%)


Train Loss=0.0177 Acc=80.43% LR=0.0023: 100%|██████████| 1563/1563 [00:35<00:00, 43.73it/s]


2025-10-03 00:27:26,719 - INFO - Epoch 26/35: Train set final results: Average loss: 885.2760, Accuracy: 40217/50000 (80.43%)


Test Loss=0.4777 Accuracy=83.79%: 100%|██████████| 10/10 [00:01<00:00,  5.35it/s]


2025-10-03 00:27:28,591 - INFO - Epoch 26/35:Test set final results: Average loss: 0.4777, Accuracy: 8379/10000 (83.79%)


Train Loss=0.0174 Acc=80.79% LR=0.0019: 100%|██████████| 1563/1563 [00:36<00:00, 43.20it/s]


2025-10-03 00:28:04,776 - INFO - Epoch 27/35: Train set final results: Average loss: 870.7485, Accuracy: 40396/50000 (80.79%)


Test Loss=0.4867 Accuracy=83.60%: 100%|██████████| 10/10 [00:01<00:00,  5.97it/s]


2025-10-03 00:28:06,458 - INFO - Epoch 27/35:Test set final results: Average loss: 0.4867, Accuracy: 8360/10000 (83.60%)


Train Loss=0.0171 Acc=81.12% LR=0.0015: 100%|██████████| 1563/1563 [00:38<00:00, 40.71it/s]


2025-10-03 00:28:44,861 - INFO - Epoch 28/35: Train set final results: Average loss: 856.6689, Accuracy: 40559/50000 (81.12%)


Test Loss=0.4869 Accuracy=83.29%: 100%|██████████| 10/10 [00:01<00:00,  5.79it/s]


2025-10-03 00:28:46,595 - INFO - Epoch 28/35:Test set final results: Average loss: 0.4869, Accuracy: 8329/10000 (83.29%)


Train Loss=0.0166 Acc=81.54% LR=0.0011: 100%|██████████| 1563/1563 [00:37<00:00, 42.08it/s]


2025-10-03 00:29:23,739 - INFO - Epoch 29/35: Train set final results: Average loss: 830.7871, Accuracy: 40769/50000 (81.54%)


Test Loss=0.4607 Accuracy=84.62%: 100%|██████████| 10/10 [00:01<00:00,  5.87it/s]


2025-10-03 00:29:25,448 - INFO - Epoch 29/35:Test set final results: Average loss: 0.4607, Accuracy: 8462/10000 (84.62%)


Train Loss=0.0164 Acc=81.98% LR=0.0008: 100%|██████████| 1563/1563 [00:36<00:00, 42.33it/s]


2025-10-03 00:30:02,377 - INFO - Epoch 30/35: Train set final results: Average loss: 818.3666, Accuracy: 40992/50000 (81.98%)


Test Loss=0.4584 Accuracy=84.85%: 100%|██████████| 10/10 [00:01<00:00,  5.96it/s]


2025-10-03 00:30:04,059 - INFO - Epoch 30/35:Test set final results: Average loss: 0.4584, Accuracy: 8485/10000 (84.85%)


Train Loss=0.0161 Acc=82.28% LR=0.0005: 100%|██████████| 1563/1563 [00:34<00:00, 45.20it/s]


2025-10-03 00:30:38,642 - INFO - Epoch 31/35: Train set final results: Average loss: 805.1573, Accuracy: 41141/50000 (82.28%)


Test Loss=0.4518 Accuracy=84.85%: 100%|██████████| 10/10 [00:02<00:00,  4.43it/s]


2025-10-03 00:30:40,906 - INFO - Epoch 31/35:Test set final results: Average loss: 0.4518, Accuracy: 8485/10000 (84.85%)


Train Loss=0.0159 Acc=82.44% LR=0.0003: 100%|██████████| 1563/1563 [00:33<00:00, 46.19it/s]


2025-10-03 00:31:14,748 - INFO - Epoch 32/35: Train set final results: Average loss: 792.7924, Accuracy: 41220/50000 (82.44%)


Test Loss=0.4559 Accuracy=84.86%: 100%|██████████| 10/10 [00:01<00:00,  6.25it/s]


2025-10-03 00:31:16,354 - INFO - Epoch 32/35:Test set final results: Average loss: 0.4559, Accuracy: 8486/10000 (84.86%)


Train Loss=0.0156 Acc=82.78% LR=0.0001: 100%|██████████| 1563/1563 [00:36<00:00, 42.79it/s]


2025-10-03 00:31:52,886 - INFO - Epoch 33/35: Train set final results: Average loss: 780.1840, Accuracy: 41389/50000 (82.78%)


Test Loss=0.4501 Accuracy=84.89%: 100%|██████████| 10/10 [00:01<00:00,  5.71it/s]


2025-10-03 00:31:54,644 - INFO - Epoch 33/35:Test set final results: Average loss: 0.4501, Accuracy: 8489/10000 (84.89%)


Train Loss=0.0154 Acc=83.18% LR=0.0000: 100%|██████████| 1563/1563 [00:35<00:00, 44.11it/s]


2025-10-03 00:32:30,085 - INFO - Epoch 34/35: Train set final results: Average loss: 768.0875, Accuracy: 41589/50000 (83.18%)


Test Loss=0.4455 Accuracy=84.90%: 100%|██████████| 10/10 [00:01<00:00,  6.08it/s]


2025-10-03 00:32:31,736 - INFO - Epoch 34/35:Test set final results: Average loss: 0.4455, Accuracy: 8490/10000 (84.90%)


Train Loss=0.0153 Acc=83.04% LR=0.0000: 100%|██████████| 1563/1563 [00:35<00:00, 44.54it/s]


2025-10-03 00:33:06,831 - INFO - Epoch 35/35: Train set final results: Average loss: 766.9637, Accuracy: 41519/50000 (83.04%)


Test Loss=0.4484 Accuracy=85.09%: 100%|██████████| 10/10 [00:01<00:00,  6.07it/s]

2025-10-03 00:33:08,482 - INFO - Epoch 35/35:Test set final results: Average loss: 0.4484, Accuracy: 8509/10000 (85.09%)
2025-10-03 00:33:08,488 - INFO - 
2025-10-03 00:33:08,483 - INFO - --- Model Architecture Checks ---
2025-10-03 00:33:08,483 - INFO - Total Parameters: 157,778
2025-10-03 00:33:08,483 - INFO - Trainable Parameters: 157,778

2025-10-03 00:33:08,483 - INFO - Layer-wise Parameter Details (in model order):
2025-10-03 00:33:08,484 - INFO - ----------------------------------------------------------------------------------------------------
2025-10-03 00:33:08,484 - INFO - 
Block: c1 (Sequential)
2025-10-03 00:33:08,484 - INFO -   c1.0 (Conv2d)                                      | Params:    216 | Convolution: 3->8 channels, kernel (3, 3), stride (1, 1), padding (1, 1), groups 1, bias False
2025-10-03 00:33:08,484 - INFO -   c1.1 (BatchNorm2d)                                 | Params:     16 | BatchNorm: 8 features, eps=1e-05, momentum=0.1
2025-10-03 00:33:08,484 - INFO -




## Model Architecture Overview

The model implements:
- **C1**: Initial feature extraction (3→8→16→32 channels)
- **C2**: Dilated convolutions (dilation=1,2,4) maintaining 32x32 spatial size
- **C3**: Depthwise Separable Convolution with stride=2 (32→64 channels, 16x16)
- **C4**: Final convolution with stride=2 (64→128 channels, 8x8) + GAP + FC

**Key Features:**
- Receptive field: 94 pixels (>44 requirement)
- Parameters: ~150k (<200k requirement)
- Albumentations: HorizontalFlip, ShiftScaleRotate, CoarseDropout

In [None]:
# Create and analyze the model
model = Net()
print(f"Model created successfully!")
print(f"Total parameters: {sum(p.numel() for p in model.parameters()):,}")

# Calculate receptive field
rf = calculate_receptive_field(model)
params = count_parameters(model)

print("
Requirements check:")
print(f"RF > 44: {'✓' if rf > 44 else '✗'} ({rf} > 44)")
print(f"Parameters < 200k: {'✓' if params < 200000 else '✗'} ({params:,} < 200,000)")

In [None]:
# Model summary
import io
import contextlib

with io.StringIO() as buf, contextlib.redirect_stdout(buf):
    summary(model, input_size=(3, 32, 32))
    summary_text = buf.getvalue()

print("Model Summary:")
print(summary_text)

## Training Setup

Now let's set up the training configuration and data loaders.

In [None]:
# Setup training
model_wrapper = get_model(device=None)
print("Training setup complete!")
print(f"Device: {model_wrapper.device}")
print(f"Epochs: {model_wrapper.model_config.epochs}")

## Start Training

Choose your training mode:
- Enter 0 for full training/testing
- Enter 1 for parameter check only

In [None]:
# Choose training mode
params_check = 0  # Set to 0 for full training, 1 for params check only

if params_check == 0:
    print("Starting FULL TRAINING...")
    main_i(params_check=0)
else:
    print("Parameter check only...")
    main_i(params_check=1)

## Results and Analysis

After training completes, you can analyze the results:

1. **Training curves**: Loss and accuracy over epochs
2. **Final accuracy**: Should achieve >85% on CIFAR-10 test set
3. **Model performance**: Check if all requirements are met

## Key Requirements Met:
- ✅ C1C2C3C4 architecture with no MaxPooling
- ✅ Last convolution has stride=2
- ✅ Receptive field > 44 pixels
- ✅ One Depthwise Separable Convolution layer
- ✅ One Dilated Convolution layer
- ✅ Global Average Pooling (GAP)
- ✅ Albumentations data augmentation
- ✅ < 200k parameters
- ✅ Code modularity
- ✅ 85% accuracy target

## Troubleshooting

If you encounter issues:
1. Make sure all dependencies are installed
2. Check that CUDA is available if using GPU
3. Monitor memory usage on Colab
4. Training may take 30-60 minutes depending on hardware

In [None]:
# Install required packages
!pip install torch torchvision torchsummary numpy matplotlib albumentations tqdm

In [None]:
# Clone the repository (replace with your actual repo URL)
!git clone https://github.com/your-username/ERA_v4_cifar_10_model_v1_S7.git
%cd ERA_v4_cifar_10_model_v1_S7

## Alternative: Upload Project Files

If you prefer to upload the files directly instead of cloning, upload all the .py files from your project to Colab.

In [None]:
# If uploading files manually, create the project structure
# Upload all .py files to Colab and run this cell
import os
if not os.path.exists('ERA_v4_cifar_10_model_v1_S7'):
    os.makedirs('ERA_v4_cifar_10_model_v1_S7')
    print("Created project directory. Please upload your .py files.")
else:
    %cd ERA_v4_cifar_10_model_v1_S7

## Model Architecture Overview

The model implements:
- **C1**: Initial feature extraction (3→8→16→32 channels)
- **C2**: Dilated convolutions (dilation=1,2,4) maintaining 32x32 spatial size
- **C3**: Depthwise Separable Convolution with stride=2 (32→64 channels, 16x16)
- **C4**: Final convolution with stride=2 (64→128 channels, 8x8) + GAP + FC

**Key Features:**
- Receptive field: 94 pixels (>44 requirement)
- Parameters: ~150k (<200k requirement)
- Albumentations: HorizontalFlip, ShiftScaleRotate, CoarseDropout

In [None]:
# Check if all required files are present
import os
required_files = [
    'main.py', 'cifar10model_v0.py', 'data_setup.py', 'data_visual.py',
    'train_test.py', 'logger_setup.py', 'summarizer.py'
]

missing_files = [f for f in required_files if not os.path.exists(f)]
if missing_files:
    print(f"Missing files: {missing_files}")
    print("Please upload the missing files to continue.")
else:
    print("All required files are present!")

In [None]:
# Import and check model
try:
    from cifar10model_v0 import Net
    import torch

    # Create model
    model = Net()
    print(f"Model created successfully!")
    print(f"Total parameters: {sum(p.numel() for p in model.parameters()):,}")

    # Check receptive field calculation
    print("\nRunning receptive field calculation...")

except ImportError as e:
    print(f"Import error: {e}")
    print("Please ensure all Python files are uploaded.")

In [None]:
# Run receptive field calculator
try:
    from receptive_field_calculator import calculate_receptive_field, count_parameters

    rf = calculate_receptive_field(model)
    params = count_parameters(model)

    print("
Requirements check:")
    print(f"RF > 44: {'✓' if rf > 44 else '✗'} ({rf} > 44)")
    print(f"Parameters < 200k: {'✓' if params < 200000 else '✗'} ({params:,} < 200,000)")

except ImportError:
    print("Receptive field calculator not found. Creating a simple version...")

    # Simple parameter count
    total_params = sum(p.numel() for p in model.parameters())
    print(f"Total parameters: {total_params:,}")
    print(f"Parameters < 200k: {'✓' if total_params < 200000 else '✗'}")

## Data Visualization

Let's visualize the CIFAR-10 dataset and understand the data distribution.

In [None]:
# Visualize CIFAR-10 data
try:
    from data_visual import data_visual
    import matplotlib.pyplot as plt

    # Create data visualizer
    visualizer = data_visual(dataset_name='cifar10')

    # Show sample images
    visualizer.show_sample_images()

    # Show class distribution
    visualizer.show_class_distribution()

except ImportError as e:
    print(f"Data visualization error: {e}")
    print("Continuing without visualization...")

## Model Summary

Let's examine the model architecture and parameters.

In [None]:
# Model summary
try:
    from summarizer import summary
    import io
    import contextlib

    # Capture model summary
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        summary(model, input_size=(3, 32, 32))
        summary_text = buf.getvalue()

    print("Model Summary:")
    print(summary_text)

except ImportError as e:
    print(f"Summary error: {e}")
    print("Model structure:")
    print(model)

## Training Setup

Now let's set up the training configuration and data loaders.

In [None]:
# Setup training
try:
    from main import get_model

    # Create model and configuration
    model_wrapper = get_model(model_version=0)
    print("Training setup complete!")
    print(f"Device: {model_wrapper.device}")
    print(f"Epochs: {model_wrapper.model_config.epochs}")

except Exception as e:
    print(f"Setup error: {e}")
    print("Manual setup...")

    # Manual setup if main.py fails
    from cifar10model_v0 import Net, set_config_v0
    from data_setup import DataSetup

    model = Net()
    config = set_config_v0()
    config.setup(model)
    print("Manual setup complete!")

## Start Training

Finally, let's start the training process. This will train the model for the specified number of epochs.

In [None]:
# Start training
try:
    from train_test import train_test_model

    # Run training
    print("Starting training...")
    train_test_model(
        model_wrapper.model_obj,
        model_wrapper.device,
        model_wrapper.model_config.data_setup_instance.train_loader,
        model_wrapper.model_config.data_setup_instance.test_loader,
        model_wrapper.model_config.criterion,
        model_wrapper.model_config.optimizer,
        model_wrapper.model_config.scheduler
    )

except Exception as e:
    print(f"Training error: {e}")
    print("Please check the setup and try again.")

## Results and Analysis

After training completes, you can analyze the results:

1. **Training curves**: Loss and accuracy over epochs
2. **Final accuracy**: Should achieve >85% on CIFAR-10 test set
3. **Model performance**: Check if all requirements are met

## Key Requirements Met:
- ✅ C1C2C3C4 architecture with no MaxPooling
- ✅ Last convolution has stride=2
- ✅ Receptive field > 44 pixels
- ✅ One Depthwise Separable Convolution layer
- ✅ One Dilated Convolution layer
- ✅ Global Average Pooling (GAP)
- ✅ Albumentations data augmentation
- ✅ < 200k parameters
- ✅ Code modularity
- ✅ 85% accuracy target

## Troubleshooting

If you encounter issues:
1. Make sure all Python files are uploaded
2. Check that all dependencies are installed
3. Ensure CUDA is available if using GPU
4. Monitor memory usage on Colab