# C-SFDA Stage 2 - Kaggle Notebook

## Setup Instructions
1. Upload this notebook to Kaggle
2. Turn ON GPU: Settings → Accelerator → GPU P100
3. Run cells in order

**Expected time:** 4-6 hours  
**Cost:** $0 (free tier)

## Step 1: Setup Environment

In [1]:
# # Clone repository
# !git clone https://github.com/nazmul-karim170/C-SFDA_Source-Free-Domain-Adaptation.git
# %cd C-SFDA_Source-Free-Domain-Adaptation

In [2]:
# Install PyTorch 2.0.1 and torchvision 0.15.2 built for CUDA 11.8 (pip replaces any other installed version)
!pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --index-url https://download.pytorch.org/whl/cu118
!pip install hydra-core omegaconf scikit-learn tqdm wandb matplotlib
!pip install 'numpy<2'  # Fix NumPy compatibility with torchvision 0.15.2

Looking in indexes: https://download.pytorch.org/whl/cu118
Collecting torch==2.0.1+cu118
  Downloading https://download.pytorch.org/whl/cu118/torch-2.0.1%2Bcu118-cp310-cp310-linux_x86_64.whl (2267.3 MB)
Collecting torchvision==0.15.2+cu118
  Downloading https://download.pytorch.org/whl/cu118/torchvision-0.15.2%2Bcu118-cp310-cp310-linux_x86_64.whl (6.1 MB)
Collecting triton==2.0.0 (from torch==2.0.1+cu118)
  Downloading https://download.pytorch.org/whl/triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
Collecting cmake (from triton==2.0.0->torch==2.0.1+cu118)


In [3]:
!conda install -c conda-forge libstdcxx-ng -y
!conda install -c conda-forge pillow -y

Retrieving notices: done
Channels:
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/ec2-user/anaconda3/envs/python3

  added / updated specs:
    - libstdcxx-ng


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2025.11.12 |       hbd8a1cb_0         149 KB  conda-forge
    certifi-2025.11.12         |     pyhd8ed1ab_0         153 KB  conda-forge
    openssl-3.6.0              |       h26f9b46_0         3.0 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         3.3 MB

The following packages will be UPDATED:

  ca-certificates                      2025.10.5-hbd8a1cb_0 --> 2025.11.12-hbd8a1cb_0 
  certifi                            2025.10.5-pyhd8ed1ab_0 --> 2025.11.12-pyhd8ed1ab_0 

In [4]:
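import os
import sys

# Prepend the active environment's lib directory so that subprocesses
# (e.g. the `!python main_csfda.py` calls below) pick up conda's libstdc++.
# Changing LD_LIBRARY_PATH here does not affect libraries this kernel has already loaded.
# In the recorded run the environment was /home/ec2-user/anaconda3/envs/python3.
conda_lib = os.path.join(sys.prefix, 'lib')
if 'LD_LIBRARY_PATH' in os.environ:
    os.environ['LD_LIBRARY_PATH'] = f"{conda_lib}:{os.environ['LD_LIBRARY_PATH']}"
else:
    os.environ['LD_LIBRARY_PATH'] = conda_lib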

In [5]:
# Verify GPU
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("⚠️ No GPU! Go to Settings → Accelerator → GPU P100")

PyTorch version: 2.0.1+cu118
CUDA available: True
GPU: NVIDIA A10G
VRAM: 23.7 GB


## Step 2: Download VisDA-C Dataset

**About VisDA-C:**
- Training domain: Synthetic object images (rendered from CAD models)
- Validation domain: Real object images (cropped from COCO dataset)
- 12 categories: aeroplane, bicycle, bus, car, horse, knife, motorcycle, person, plant, skateboard, train, truck

**For Stage 2, you only need the VALIDATION set (target domain)**

Only Option A (direct download) is provided below. Uncomment the cell to run it:

In [6]:
# # ==================== OPTION A: Direct Download (Recommended) ====================
# # Official VisDA-C dataset from Boston University server

# !mkdir -p data/VISDA-C

# # Download validation set (target domain) - ~5GB
# !wget -P data/VISDA-C http://csr.bu.edu/ftp/visda17/clf/validation.tar

# # Extract
# !tar -xvf data/VISDA-C/validation.tar -C data/VISDA-C/

# # Clean up tar files to save space
# !rm -f data/VISDA-C/*.tar

# !mv data/VISDA-C/validation/image_list.txt data/VISDA-C/validation_list.txt

# print("\n✓ Dataset downloaded!")
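
A quick sanity check after the download is to count how many images each label has in the list file. The sketch below assumes each line of `validation_list.txt` has the form `<relative_path> <integer_label>`; that format is an assumption here, so inspect the file first. `count_images_per_label` is a hypothetical helper, not part of the C-SFDA repository.

```python
from collections import Counter
from pathlib import Path


def count_images_per_label(list_file):
    """Count entries per integer label in a '<relative_path> <label>' list file."""
    counts = Counter()
    for line in Path(list_file).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so paths containing spaces still parse.
        _path, label = line.rsplit(maxsplit=1)
        counts[int(label)] += 1
    return counts


# Path taken from the `mv` command in the download cell.
list_file = Path("data/VISDA-C/validation_list.txt")
if list_file.exists():
    counts = count_images_per_label(list_file)
    print(f"{sum(counts.values())} images across {len(counts)} labels")
    for label in sorted(counts):
        print(f"  label {label}: {counts[label]}")
else:
    print(f"{list_file} not found; run the download cell first")
```

If the assumption about the file format holds, 12 distinct labels should appear, one per VisDA-C category.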


## Step 3: Download Pre-trained Checkpoint

In [7]:
# # Download checkpoint from Google Drive
# # Link: https://drive.google.com/drive/folders/16vTNNzzAt4M1mmeLsOxSFDRzBogaNkJw

# !pip install -q gdown

# # Download specific file (replace FILE_ID with actual ID from Drive link)
# # Get FILE_ID by: Right-click file in Drive → Get link → Copy ID
# # Example: https://drive.google.com/file/d/1a2B3c4D5e6F7g8H9i0J/view → FILE_ID = 1a2B3c4D5e6F7g8H9i0J

# # !gdown --id FILE_ID_FOR_best_train_2020 -O checkpoint/best_train_2020.pth.tar

# # Alternative: Download entire folder
# !mkdir -p checkpoint
# # !gdown --folder https://drive.google.com/drive/folders/16vTNNzzAt4M1mmeLsOxSFDRzBogaNkJw --output checkpoint/
# !gdown --folder https://drive.google.com/drive/folders/1gJhqu00z536tPB3wwBw6zcWIxPjbh5Ri --output checkpoint/

# # Verify checkpoint
# !ls -lh checkpoint/
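
Before launching a run, it can help to confirm that the checkpoint sits where the training script will look for it. The sketch below assumes the path pattern `<src_log_dir>/best_train_<seed>.pth.tar`, which is inferred from the `Loaded from ./checkpoint/VISDA-C/best_train_2022.pth.tar` line in the Step 4A log rather than from the repository code. `expected_checkpoint` is a hypothetical helper.

```python
from pathlib import Path


def expected_checkpoint(src_log_dir, seed):
    """Build <src_log_dir>/best_train_<seed>.pth.tar (pattern inferred from the run log)."""
    return Path(src_log_dir) / f"best_train_{seed}.pth.tar"


# The two (src_log_dir, seed) pairs used in Steps 4A and 4B of this notebook.
for src_log_dir, seed in [("./checkpoint/VISDA-C", 2022), ("./checkpoint/", 2020)]:
    ckpt = expected_checkpoint(src_log_dir, seed)
    status = "found" if ckpt.is_file() else "MISSING"
    print(f"{ckpt}: {status}")
```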

## Step 4B: Full Training Run (4-6 hours)

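**Only run this after Step 4A (the quick test further down) succeeds!**

This will take 4-6 hours. On Kaggle, an interactive session may stop when the browser is closed or left idle, so use Save Version → Save & Run All (Commit) if the run needs to continue in the background.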

In [8]:
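# # Full training - 25 epochs (4-6 hours)
# # Note: the quick test in Step 4A used seed=2022 with model_tta.src_log_dir="./checkpoint/VISDA-C";
# # adjust seed and src_log_dir below to match the checkpoint you downloaded.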
# !python main_csfda.py \
#     train_source=false \
#     seed=2020 \
#     data.dataset="VISDA-C" \
#     data.data_root="./data/" \
#     data.source_domains="[train]" \
#     data.target_domains="[validation]" \
#     data.batch_size=64 \
#     data.workers=4 \
#     model_src.arch="resnet101" \
#     model_tta.src_log_dir="./checkpoint/" \
#     learn.epochs=25 \
#     optim.lr=2e-4 \
#     multiprocessing_distributed=false \
#     use_wandb=false

## Step 4A: Quick Test Run (10-15 minutes)

**Run this first to verify everything works!**

This will do 1 epoch to test:
- Dataset loads correctly
- Checkpoint loads
- Model initializes
- Training loop runs
- Output saves properly

Once this succeeds, go back to Step 4B (the cell above this section) and uncomment it for full training.
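
Because the quick test and the full run share almost every override, one way to keep them in sync is to build the command from a single dictionary. This is only a sketch: `BASE_OVERRIDES` copies the values from the Step 4A cell, and `build_command` is a hypothetical helper that is not part of the repository.

```python
import shlex
import sys

# Overrides copied from the Step 4A cell in this notebook.
BASE_OVERRIDES = {
    "train_source": "false",
    "seed": "2022",
    "data.dataset": "VISDA-C",
    "data.data_root": "./data/",
    "data.source_domains": "[train]",
    "data.target_domains": "[validation]",
    "data.batch_size": "64",
    "data.workers": "4",
    "model_src.arch": "resnet101",
    "model_tta.src_log_dir": "./checkpoint/VISDA-C",
    "learn.epochs": "1",
    "optim.lr": "2e-4",
    "multiprocessing_distributed": "false",
    "use_wandb": "false",
}


def build_command(changes=None, script="main_csfda.py"):
    """Return an argv list: the notebook's Python, the script, then key=value overrides."""
    overrides = {**BASE_OVERRIDES, **(changes or {})}
    return [sys.executable, script] + [f"{key}={value}" for key, value in overrides.items()]


quick = build_command()                        # 1 epoch, as in Step 4A
full = build_command({"learn.epochs": "25"})   # same settings, 25 epochs
print(shlex.join(full))
```

Either list can be passed to `subprocess.run(quick, check=True)` from the repository directory. Passing an argv list avoids shell quoting issues with values such as `[train]`.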

In [9]:
# Quick test run - just 1 epoch to verify everything works
# Use sys.executable to ensure we're using the notebook's Python with correct PyTorch
import sys
python_path = sys.executable
print(f"Using Python: {python_path}")

!{python_path} main_csfda.py \
    train_source=false \
    seed=2022 \
    data.dataset="VISDA-C" \
    data.data_root="./data/" \
    data.source_domains="[train]" \
    data.target_domains="[validation]" \
    data.batch_size=64 \
    data.workers=4 \
    model_src.arch="resnet101" \
    model_tta.src_log_dir="./checkpoint/VISDA-C" \
    learn.epochs=1 \
    optim.lr=2e-4 \
    multiprocessing_distributed=false \
    use_wandb=false

Using Python: /home/ec2-user/anaconda3/envs/python3/bin/python
[INFO] 2025-11-15 04:34:15 main_csfda.py:96 Dataset: VISDA-C, Source domains: ['train'], Target domains: ['validation'], Pipeline: target
[INFO] 2025-11-15 04:34:15 target_csfda.py:160 Start target training on train-validation...
Downloading: "https://download.pytorch.org/models/resnet101-63fe2227.pth" to /home/ec2-user/.cache/torch/hub/checkpoints/resnet101-63fe2227.pth
100%|█████████████████████████████████████████| 171M/171M [00:00<00:00, 304MB/s]
  nn.init.orthogonal(m.weight.data)   # Initializing with orthogonal rows
[INFO] 2025-11-15 04:34:18 classifier.py:67 Loaded from ./checkpoint/VISDA-C/best_train_2022.pth.tar; missing params: []
[INFO] 2025-11-15 04:34:19 classifier.py:67 Loaded from ./checkpoint/VISDA-C/best_train_2022.pth.tar; missing params: []
[INFO] 2025-11-15 04:34:20 target_csfda.py:195 1 - Created target model
[INFO] 2025-11-15 04:34:20 target_csfda.py:49 Eval and labeling...
100%|██████████████████████

## Step 5: Check Results

In [10]:
# List output files
!find output/ -name "*.pth.tar" -o -name "*.txt" -o -name "*.yaml" | head -20

output/VISDA-C/test/.hydra-2022/overrides.yaml
output/VISDA-C/test/.hydra-2022/hydra.yaml
output/VISDA-C/test/.hydra-2022/config.yaml


In [11]:
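# Read final results (adjust path based on actual output; this run wrote
# output/VISDA-C/test/main_csfda.log, so the logs.txt pattern below matched nothing)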
!tail -50 output/VISDA-C/*/logs.txt 2>/dev/null || echo "Check output/ directory structure"

Check output/ directory structure
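
When the log file name is not known in advance, searching for every `*.log` file under `output/` is more robust than a fixed glob. The following is a minimal sketch; `find_logs` and `tail` are hypothetical helpers, and the only assumption about the layout is that logs end in `.log`, as `main_csfda.log` does in this run.

```python
from pathlib import Path


def find_logs(root="output"):
    """Return all *.log files under root, skipping Jupyter's .ipynb_checkpoints copies."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*.log")
        if ".ipynb_checkpoints" not in p.parts
    )


def tail(path, n=50):
    """Return the last n lines of a text file as a list of strings."""
    return Path(path).read_text(errors="replace").splitlines()[-n:]


for log in find_logs():
    lines = tail(log)
    print(f"== {log} ({len(lines)} lines shown) ==")
    print("\n".join(lines) if lines else "(empty file)")
```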


In [12]:
# Download results
!zip -r stage2_results.zip output/
from IPython.display import FileLink
FileLink('stage2_results.zip')

  adding: output/ (stored 0%)
  adding: output/VISDA-C/ (stored 0%)
  adding: output/VISDA-C/test/ (stored 0%)
  adding: output/VISDA-C/test/.hydra-2022/ (stored 0%)
  adding: output/VISDA-C/test/.hydra-2022/overrides.yaml (deflated 38%)
  adding: output/VISDA-C/test/.hydra-2022/hydra.yaml (deflated 66%)
  adding: output/VISDA-C/test/.hydra-2022/config.yaml (deflated 46%)
  adding: output/VISDA-C/test/.ipynb_checkpoints/ (stored 0%)
  adding: output/VISDA-C/test/.ipynb_checkpoints/main_csfda-checkpoint.log (stored 0%)
  adding: output/VISDA-C/test/main_csfda.log (stored 0%)


## Expected Results

According to the paper, on VisDA-C you should see:
- **Test Accuracy:** ~85%
- **Per-class Average:** ~83-85%
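
The two numbers above measure different things: overall accuracy weights every image equally, while the per-class average weights every class equally, so they diverge when classes are imbalanced. The sketch below illustrates the difference on made-up toy labels. It is not the repository's evaluation code, and `overall_and_per_class_accuracy` is a hypothetical helper.

```python
from collections import Counter


def overall_and_per_class_accuracy(y_true, y_pred):
    """Return (overall accuracy, mean of per-class accuracies) for two label sequences."""
    total = Counter(y_true)  # number of samples per true class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    overall = sum(correct.values()) / len(y_true)
    per_class = {c: correct[c] / n for c, n in total.items()}
    return overall, sum(per_class.values()) / len(per_class)


# Toy example: class 0 has 4 samples (all correct), class 1 has 2 (one correct).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
overall, per_class_avg = overall_and_per_class_accuracy(y_true, y_pred)
print(f"overall={overall:.3f}, per-class average={per_class_avg:.3f}")
# prints: overall=0.833, per-class average=0.750
```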

