# 🛠️ Colab File Manipulation Cheatsheet
**Use ! before shell commands and % for magic commands (like %cd or %timeit) in Colab.**

### 📁 File & Directory Management
!ls                      # List files in current directory  
!ls -la                  # Long listing with hidden files  
!pwd                     # Show current working directory  
%cd path/to/dir          # Change directory (use %cd; !cd runs in a subshell and does not persist)
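
The reason a `!cd` does not stick can be shown outside Colab. The sketch below assumes that each `!` command behaves like a separate child shell, and uses `subprocess` to stand in for it; `cwd_after_child_cd` is an illustrative helper name. A directory change made in the child ends with the child, while `os.chdir` (roughly what `%cd` does for the kernel) changes the notebook's own working directory.

```python
import os
import subprocess
import tempfile

def cwd_after_child_cd(target: str) -> str:
    """Run `cd target` in a child shell and return the parent's cwd afterwards."""
    subprocess.run(f'cd "{target}"', shell=True, check=True)
    return os.getcwd()

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    after_child = cwd_after_child_cd(tmp)  # the parent's cwd is unchanged
    os.chdir(tmp)                          # changes the current process's cwd
    after_chdir = os.getcwd()
    os.chdir(start)                        # restore before the temp dir is removed
```

Here `after_child` still equals `start`, while `after_chdir` points at the temporary directory.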

### 📦 Zip & Unzip
!zip -r archive.zip folder_name       # Zip folder  
!unzip archive.zip -d output_folder   # Unzip to specific folder
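
The same zip and unzip steps can also be done from Python, which is handy when paths are already held in variables. This is a minimal sketch using the standard-library `shutil` module; the helper names `zip_folder` and `unzip_to` are illustrative.

```python
import shutil
import tempfile
from pathlib import Path

def zip_folder(folder: Path, archive_stem: Path) -> Path:
    """Zip the contents of `folder` into `<archive_stem>.zip` and return its path."""
    return Path(shutil.make_archive(str(archive_stem), "zip", root_dir=folder))

def unzip_to(archive: Path, output_folder: Path) -> None:
    """Extract `archive` into `output_folder`."""
    shutil.unpack_archive(str(archive), str(output_folder))

# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    src = work / "folder_name"
    src.mkdir()
    (src / "a.txt").write_text("hello")
    archive = zip_folder(src, work / "archive")
    unzip_to(archive, work / "output_folder")
    extracted = (work / "output_folder" / "a.txt").read_text()
```

One difference from `zip -r archive.zip folder_name`: with `root_dir=folder`, the archive stores paths relative to the folder, so extraction yields `output_folder/a.txt` rather than `output_folder/folder_name/a.txt`.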

### 🧹 Create / Remove Files & Folders
!mkdir new_folder         # Create a folder  
!rm filename.txt          # Delete a file  
!rm -r folder_name        # Delete folder recursively  
!rm -rf folder_name       # Force delete (be careful!)  
!touch file.txt           # Create an empty file

### 📝 Move / Copy / Rename
!cp source.txt dest.txt             # Copy file  
!cp -r src_folder dest_folder       # Copy folder  
!mv oldname.txt newname.txt         # Rename or move



# **GANalyzer**
A hybrid model that combines neural-network analysis of image content with file forensics for more robust detection of AI-generated images.

## Goals
### Phase 1: Pixel-Only CNN Classifier
Train a CNN-based binary classifier to distinguish between real and AI-generated images using pixel data alone.

### Phase 2: Hybrid Classifier (Pixel + File-Level Features)
Build a combined model that fuses CNN image embeddings with file-based forensics features to improve detection accuracy.
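
The notebook does not yet say which file-based features will be used. As a purely hypothetical sketch, the code below computes three cheap ones with the standard library: file size, byte-level Shannon entropy, and whether the file starts with the JPEG magic bytes. The names `byte_entropy` and `file_features`, and the choice of features, are assumptions for illustration and are not taken from the GANalyzer repo.

```python
import math
import os
import tempfile
from collections import Counter

JPEG_MAGIC = b"\xff\xd8\xff"  # first bytes of a JPEG file

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def file_features(path: str) -> list[float]:
    """Hypothetical feature vector: [size in bytes, byte entropy, is-JPEG flag]."""
    with open(path, "rb") as f:
        data = f.read()
    return [float(len(data)), byte_entropy(data), float(data.startswith(JPEG_MAGIC))]

# Tiny demonstration on a synthetic file (not a real image)
with tempfile.NamedTemporaryFile(delete=False, suffix=".jpg") as tmp:
    tmp.write(JPEG_MAGIC + bytes(range(256)))
features = file_features(tmp.name)
os.remove(tmp.name)
```

In a hybrid model, a vector like this would typically be normalised and combined with the CNN embedding before the final classification layers; how the fusion is done is left open here.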

### Phase 3: Evaluation, Explainability, and Robustness
Evaluate the hybrid model's performance and robustness, and explain its predictions using Grad-CAM and feature attribution tools.

### Phase 4: Deployment & Real-World Integration
Package the hybrid detection system into a usable interface or tool for real-world testing, inference, or interactive demos.

---

## Notebook Structure

1. **Setup & Dependencies**
   - Mount Google Drive
   - Retrieve helper scripts from GitHub
   - Install required Python packages

2. **Import Modules**
   - Import standard libraries, custom config, and dataloader scripts
   - Enable autoreload to auto-update external `.py` files without restarting

3. **Data Loader Verification**
   - Run quick checks to validate dataset loading (optional cleanup later)

---

## 1. Setup & Dependencies

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# Define your project folder
PROJECT_PATH = '/content/drive/MyDrive/GANalyzer/'

DATA_PATH = PROJECT_PATH + 'data/'
REAL_PATH = DATA_PATH + 'real/'
FAKE_PATH = DATA_PATH + 'fake/'

MODEL_PATH = PROJECT_PATH + 'models/'
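
Because these paths are built by string concatenation, a typo or a missing Drive folder only surfaces later, when data loading fails. A small check like the one below can catch that early. It is a sketch with an illustrative helper name, `missing_dirs`, and it runs here against a throwaway layout; in the notebook the dictionary would hold `DATA_PATH`, `REAL_PATH`, `FAKE_PATH`, and `MODEL_PATH`.

```python
import tempfile
from pathlib import Path

def missing_dirs(paths: dict[str, str]) -> list[str]:
    """Return the names of entries whose path is not an existing directory."""
    return [name for name, p in paths.items() if not Path(p).is_dir()]

# Demonstration: only data/real/ exists in this temporary layout
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "data" / "real").mkdir(parents=True)
    report = missing_dirs({
        "REAL_PATH": f"{root}/data/real/",
        "FAKE_PATH": f"{root}/data/fake/",
    })
```

In this demonstration `report` lists only `FAKE_PATH`, the folder that was never created.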

In [None]:
import os
import sys

REPO_NAME = "GANalyzer"
GIT_URL = "https://github.com/ericwang210000/GANalyzer.git"
REPO_DIR = f"/content/{REPO_NAME}"

# Only clone if repo doesn't exist
if not os.path.exists(REPO_DIR):
    !git clone {GIT_URL}
else:
    # Optional: ensure we’re at repo root before pulling
    %cd {REPO_DIR}
    !git pull
    os.chdir("/content") # return to base so we don't cd into nested folders next time

# Add repo to Python path just once
if REPO_DIR not in sys.path:
    sys.path.append(REPO_DIR)

# Always CD cleanly to repo root when needed
%cd {REPO_DIR}

Cloning into 'GANalyzer'...
remote: Enumerating objects: 96, done.
remote: Counting objects: 100% (96/96), done.
remote: Compressing objects: 100% (66/66), done.
remote: Total 96 (delta 48), reused 66 (delta 22), pack-reused 0 (from 0)
Receiving objects: 100% (96/96), 4.69 MiB | 28.07 MiB/s, done.
Resolving deltas: 100% (48/48), done.
/content/GANalyzer


In [None]:
!pip install -r requirements.txt

Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch->-r requirements.txt (line 1))
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch->-r requirements.txt (line 1))
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch->-r requirements.txt (line 1))
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch->-r requirements.txt (line 1))
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch->-r requirements.txt (line 1))
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.2.1.3 (from torch->-r requirements.txt (line 1))
  Downl

## 2. Import Modules

In [None]:
from src.dataloader import get_dataloaders
from src.config import BATCH_SIZE, IMAGE_SIZE

In [None]:
# autoreload for external .py scripts
%load_ext autoreload
%autoreload 2

In [None]:
train_loader, val_loader = get_dataloaders(DATA_PATH, batch_size=BATCH_SIZE, image_size=IMAGE_SIZE)

In [None]:
from src.train import train_model
from src.utils import get_loss
from src.model import get_model

In [None]:
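import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # GPU if available

model = get_model()
loss_fn = get_loss  # if get_loss is a factory like get_model, call it instead: get_loss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Store the result under a new name so the train_model function is not overwritten
result = train_model(model, train_loader, val_loader, epochs=10, loss_fn=loss_fn, optimizer=optimizer, device=device)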

## 3. Data Loader Verification

In [None]:
from torchvision.datasets import ImageFolder

# Temporarily create dataset just to check metadata
dataset = ImageFolder(DATA_PATH)
print("Classes:", dataset.classes)
print("Class-to-idx mapping:", dataset.class_to_idx)
print("Total images:", len(dataset))

Classes: ['fake', 'real']
Class-to-idx mapping: {'fake': 0, 'real': 1}
Total images: 20000
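
`ImageFolder` sorts the class directory names and numbers them from 0, which is why `fake` maps to 0 and `real` to 1 above. The sketch below mimics that rule with the standard library, so the mapping can be checked without torchvision; `class_to_idx_for` is an illustrative helper, not a torchvision function.

```python
import os
import tempfile

def class_to_idx_for(root: str) -> dict[str, int]:
    """Map each subdirectory of `root` to an index in sorted-name order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: idx for idx, name in enumerate(classes)}

# Demonstration: creation order does not matter, only the sorted names do
with tempfile.TemporaryDirectory() as root:
    for name in ("real", "fake"):
        os.mkdir(os.path.join(root, name))
    mapping = class_to_idx_for(root)
```

With this mapping, label 1 means "real". If "fake" is meant to be the positive class when reporting precision or recall, the metric code needs to be told so explicitly.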
