# Ada-FracBNN Testing Notebook

This notebook provides an interactive environment to test and explore the Adaptive FracBNN (Ada-FracBNN) implementation.

## Features:
1. **Baseline FracBNN** - Original FracBNN with fixed gates
2. **Adaptive PG** - Precision gating (PG) with learnable per-channel fractionalization
3. **Knowledge Distillation** - Distillation from a compact full-precision (FP) teacher
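
To make the difference between fixed and learnable gates more concrete, here is a minimal sketch of a per-channel gate with a trainable threshold. It is an illustration only, not the module shipped in `model.fracbnn_cifar10`: the class name `ChannelGate`, the `sharpness` parameter, and the sigmoid straight-through estimator are all assumptions made for this example. A baseline with fixed gates would roughly correspond to freezing `threshold`.

```python
import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    """Hypothetical per-channel gate; NOT the project's actual module."""

    def __init__(self, num_channels, init_threshold=0.0, sharpness=5.0):
        super().__init__()
        # One learnable threshold per channel, broadcast over N, H, W.
        self.threshold = nn.Parameter(
            torch.full((1, num_channels, 1, 1), float(init_threshold))
        )
        self.sharpness = sharpness  # assumed slope of the soft surrogate

    def forward(self, coarse_out):
        # Soft gate: differentiable surrogate, used only for gradients.
        soft = torch.sigmoid(self.sharpness * (coarse_out - self.threshold))
        # Hard gate: 1 where the coarse result exceeds the channel threshold.
        hard = (coarse_out > self.threshold).to(soft.dtype)
        # Straight-through estimator: forward value is `hard`,
        # backward gradient is that of `soft`.
        return soft + (hard - soft).detach()


gate = ChannelGate(num_channels=16)
coarse = torch.randn(8, 16, 32, 32)  # stand-in for a low-precision conv output
mask = gate(coarse)
active_rate = mask.mean()            # fraction of positions with an open gate
active_rate.backward()               # gradients reach the per-channel thresholds
print(f"active fraction: {active_rate.item():.3f}")
```

The mean of the mask gives a single number that a sparsity regularizer could push toward a target value.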

## Quick Start:
Run cells sequentially to:
- Load and configure models
- Test on CIFAR-10 dataset
- Visualize gate statistics
- Compare model performance


## 1. Setup and Imports


In [None]:
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm.notebook import tqdm
import os
import sys

# Add project root to path (assumes the notebook is launched from the repository root)
project_root = os.path.abspath('.')
if project_root not in sys.path:
    sys.path.insert(0, project_root)

# Import project modules
import utils.utils as util
import utils.quantization as q
import model.fracbnn_cifar10 as m

# Set style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))


## 2. Configuration

In [None]:
# Configuration
config = {
    'device': 'cuda' if torch.cuda.is_available() else 'cpu',
    'batch_size': 128,
    'num_workers': 2,
    'data_dir': './data/cifar10',
    
    # Adaptive PG parameters
    'target_sparsity': 0.15,  # Target 15% sparsity, i.e. about 85% of channels stay 1-bit and about 15% are gated to higher precision
    'sparsity_weight': 0.01,   # Weight for sparsity regularization
    
    # Knowledge Distillation parameters
    'kd_temperature': 4.0,
    'kd_alpha': 0.7,
    
    # Training parameters
    'learning_rate': 1e-3,
    'num_epochs': 5,  # Small number for quick testing
}

print("Configuration:")
for key, value in config.items():
    print(f"  {key}: {value}")
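
The sketch below shows one plausible way the `kd_*` and `*sparsity*` settings above could enter a training loss. It assumes the standard temperature-scaled distillation loss, with `kd_alpha` weighting the distillation term and `1 - kd_alpha` the hard-label term, plus a squared penalty on the gap between the observed gate rate and `target_sparsity`. The helper names `kd_loss` and `sparsity_penalty` and the `gate_rate` value are hypothetical; the project's own loss may be defined differently.

```python
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, targets, temperature=4.0, alpha=0.7):
    """Temperature-scaled distillation loss (assumed form, not the project's code)."""
    # Soft targets from the teacher; detach so no gradient reaches the teacher.
    soft_teacher = F.softmax(teacher_logits.detach() / temperature, dim=1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=1)
    # KL(teacher || student), scaled by T^2 so its gradients stay comparable
    # in magnitude to the hard-label term.
    distill = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean")
    distill = distill * temperature ** 2
    hard = F.cross_entropy(student_logits, targets)
    # Assumption: alpha weights the distillation term.
    return alpha * distill + (1.0 - alpha) * hard


def sparsity_penalty(gate_rate, target=0.15, weight=0.01):
    """Squared deviation of the observed gate rate from the target (assumed form)."""
    return weight * (gate_rate - target) ** 2


# Values mirror the configuration cell above.
cfg = {"kd_temperature": 4.0, "kd_alpha": 0.7,
       "target_sparsity": 0.15, "sparsity_weight": 0.01}

student = torch.randn(4, 10, requires_grad=True)  # dummy student logits
teacher = torch.randn(4, 10)                      # dummy teacher logits
labels = torch.tensor([0, 3, 5, 9])
gate_rate = torch.tensor(0.25)                    # hypothetical measured gate rate

loss = kd_loss(student, teacher, labels, cfg["kd_temperature"], cfg["kd_alpha"])
loss = loss + sparsity_penalty(gate_rate, cfg["target_sparsity"], cfg["sparsity_weight"])
loss.backward()
print(f"total loss: {loss.item():.4f}")
```

With `kd_alpha = 0.7` most of the training signal comes from the teacher's soft targets, and the small `sparsity_weight` keeps the gate penalty from dominating the classification objective.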
