# 🔧 01_setup_environment.ipynb

This notebook prepares the development environment for the fire detection dissertation project. It performs the following setup tasks:

- Mounts your Google Drive for persistent access to datasets, models, and outputs.
- Creates and verifies the required folder structure in Drive (if not already present).
- Securely loads a GitHub access token stored in Drive and clones the GitHub repository into the Colab environment.
- Sets your Git identity for proper commit tracking (required once per session).
- Writes the project's `.gitignore` rules and creates the `notebooks/` and `utils/` folders inside the cloned repository.

> ⚠️ Note: This notebook should be run at the beginning of each Colab session.  
> Your GitHub token is stored safely in Drive and should **never** be pushed to GitHub.



## 📂 Step 1: Mount Google Drive

This step mounts your personal Google Drive into the Colab environment at `/content/drive`.  
This allows you to access and save files persistently across sessions — including datasets, trained models, results, and the GitHub token.

Run this cell once at the start of each session. You’ll be prompted to grant access to your Drive.


In [1]:
# Mount your Google Drive first (if not already mounted)
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
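
Later cells fail with confusing file errors if the mount did not succeed. A minimal sketch of a check that could run right after mounting, assuming the default mount point `/content/drive` and that a mounted Drive appears as a `MyDrive` folder beneath it. The helper names `drive_is_mounted` and `require_drive` are hypothetical, not part of the Colab API.

```python
import os


def drive_is_mounted(mount_point="/content/drive"):
    """Return True if a MyDrive folder is visible under the mount point."""
    # Assumption: a mounted Drive shows up as <mount_point>/MyDrive
    return os.path.isdir(os.path.join(mount_point, "MyDrive"))


def require_drive(mount_point="/content/drive"):
    """Return the MyDrive path, or raise if Drive is not mounted."""
    if not drive_is_mounted(mount_point):
        raise RuntimeError(
            f"Google Drive is not mounted at {mount_point}; run the mount cell first."
        )
    return os.path.join(mount_point, "MyDrive")
```

Calling `require_drive()` at the top of any later notebook gives an explicit error message instead of a missing-file traceback.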


## 🗂️ Step 2: Create Project Folder Structure in Drive

This step defines a base path for the project in your Google Drive and creates a standard folder structure to organize your work.

The folders include:
- `data/raw` → for unprocessed datasets (e.g., downloaded real and synthetic images)
- `data/processed` → for cleaned and resized versions of the datasets
- `models` → to store trained model checkpoints
- `figures` → for Grad-CAM visualizations or other result plots
- `results` → for evaluation metrics, tables, and summary outputs
- `notebooks` → to back up working notebooks for reference

This cell is safe to re-run — it won’t overwrite existing folders.


In [9]:
# Define your base project path inside Drive
base_path = "/content/drive/MyDrive/fire-detection-dissertation"

# Create folders
!mkdir -p {base_path}/data/raw
!mkdir -p {base_path}/data/processed
!mkdir -p {base_path}/models
!mkdir -p {base_path}/figures
!mkdir -p {base_path}/results
!mkdir -p {base_path}/notebooks

# List the resulting tree to verify the structure
!ls -R {base_path}


/content/drive/MyDrive/fire-detection-dissertation:
data  figures  models  notebooks  results  secrets

/content/drive/MyDrive/fire-detection-dissertation/data:
processed  raw

/content/drive/MyDrive/fire-detection-dissertation/data/processed:

/content/drive/MyDrive/fire-detection-dissertation/data/raw:

/content/drive/MyDrive/fire-detection-dissertation/figures:

/content/drive/MyDrive/fire-detection-dissertation/models:

/content/drive/MyDrive/fire-detection-dissertation/notebooks:
01_setup_environment.ipynb

/content/drive/MyDrive/fire-detection-dissertation/results:

/content/drive/MyDrive/fire-detection-dissertation/secrets:
github_token.txt
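
The `!mkdir -p {base_path}/...` lines interpolate the path into a shell command without quoting, so they would break if the base path ever contained a space. A sketch of the same idempotent folder creation in pure Python, using the subfolder names listed above. The function name `create_project_folders` is hypothetical.

```python
from pathlib import Path

# Subfolders described in Step 2 of this notebook
PROJECT_SUBFOLDERS = [
    "data/raw",
    "data/processed",
    "models",
    "figures",
    "results",
    "notebooks",
]


def create_project_folders(base_path, subfolders=PROJECT_SUBFOLDERS):
    """Create each subfolder under base_path, leaving existing ones untouched."""
    base = Path(base_path)
    folders = []
    for sub in subfolders:
        folder = base / sub
        # parents=True creates intermediate folders (like mkdir -p);
        # exist_ok=True makes the call safe to repeat.
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders
```

In this notebook it would be called as `create_project_folders(base_path)`. Existing files inside the folders are not touched.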


## 🔐 Step 3: Load GitHub Token and Clone Repository

This step loads your GitHub personal access token from `secrets/github_token.txt` in your Google Drive, so the token never appears in the notebook itself.  
Using this token, the project repository is cloned into the temporary Colab workspace at `/content/`.

Key details:
- The token is stored in `secrets/github_token.txt` and **should never be pushed to GitHub**.
- The repository is freshly cloned into `/content/` each session to allow fast access and clean execution.
- If the folder already exists from a previous run, it is removed before re-cloning to prevent conflicts. Any changes in the old clone that were not committed and pushed are lost.

This setup ensures you always work with the latest version of your GitHub code in a safe and reproducible way.


In [4]:
# STEP 3: Load GitHub token from Drive and clone the repo

# Path to the token file you just created
token_path = "/content/drive/MyDrive/fire-detection-dissertation/secrets/github_token.txt"

# Read the token securely
with open(token_path, "r") as f:
    token = f.read().strip()

# GitHub credentials
username = "Misharasapu"  # <- Replace this
repo = "fire-detection-dissertation"

# Construct the clone URL (the token is embedded in it, so Git saves it in the clone's .git/config)
clone_url = f"https://{token}@github.com/{username}/{repo}.git"

# OPTIONAL: Remove old clone if it exists (safe to run every time)
!rm -rf /content/{repo}

# Clone the GitHub repo into Colab VM
%cd /content
!git clone {clone_url}
%cd {repo}


/content
Cloning into 'fire-detection-dissertation'...
remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 7 (delta 0), reused 3 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (7/7), done.
/content/fire-detection-dissertation
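
An empty or missing token file only shows up as an authentication failure during the clone. A sketch of how the token handling could be made more defensive, under the assumption that the token is a single line of text. The helpers `read_token`, `build_clone_url`, and `redact` are hypothetical names introduced for illustration.

```python
from pathlib import Path


def read_token(token_path):
    """Read the token file and fail early if it is empty."""
    token = Path(token_path).read_text().strip()
    if not token:
        raise ValueError(f"Token file is empty: {token_path}")
    return token


def build_clone_url(token, username, repo):
    """Build the same HTTPS clone URL used in the cell above."""
    return f"https://{token}@github.com/{username}/{repo}.git"


def redact(text, token):
    """Replace the token with a placeholder before printing or logging."""
    return text.replace(token, "***")
```

Printing `redact(clone_url, token)` rather than `clone_url` keeps the token out of saved cell outputs.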


## 🧾 Step 4: Configure Git Identity for Commits

This step sets your Git author information (name and email) so that any commits you make from Colab are properly attributed to you on GitHub.

This configuration is required once per session because Colab resets the environment each time it restarts.

If you skip this step, Git will reject your commits with an "author identity unknown" error.


In [7]:
!git config --global user.name "Misharasapu"
!git config --global user.email "misharasapu@gmail.com"
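
To confirm the identity was actually set, the values can be read back before committing. A minimal sketch using `git config --global --get`, which exits with a non-zero status when a key is unset. The function names `get_git_config` and `missing_identity_fields` are hypothetical.

```python
import subprocess


def get_git_config(key):
    """Return the global Git config value for key, or None if it is unset."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # git is not installed or not on PATH
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def missing_identity_fields():
    """List which of user.name / user.email still need to be configured."""
    return [k for k in ("user.name", "user.email") if get_git_config(k) is None]
```

An empty list from `missing_identity_fields()` means both values are configured for this session.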


## 📄 Step 5: Overwrite and Confirm `.gitignore` Rules

This step ensures that your `.gitignore` file contains all the necessary rules to keep your GitHub repository clean and lightweight.

The following items are excluded from version control:
- Python cache files and virtual environments
- Notebook checkpoints and OS/system files
- Model checkpoints (e.g., `.pt`, `.h5`)
- Project folders holding large or sensitive files (`data/`, `models/`, `figures/`, `results/`, `secrets/`)
- Your mounted Google Drive directory

This cell replaces the existing `.gitignore` in the cloned GitHub repo (any rules previously in the file are discarded) and prints its contents for verification.


In [6]:
%cd /content/fire-detection-dissertation

# STEP 5: Overwrite .gitignore with custom rules for your project

custom_gitignore = """
# Python cache and virtual envs
__pycache__/
*.py[cod]
*.so
.env/
venv/
dist/
build/

# Notebook checkpoints and system files
.ipynb_checkpoints/
.DS_Store

# Output files and logs
*.log
*.out
*.zip

# Model checkpoints
*.pt
*.pth
*.h5

# Project folders not to track
data/
models/
figures/
results/
secrets/

# Google Drive mount (just in case)
drive/
"""

# Write the updated .gitignore file (ending with a newline, as text files conventionally do)
gitignore_path = "/content/fire-detection-dissertation/.gitignore"
with open(gitignore_path, "w") as f:
    f.write(custom_gitignore.strip() + "\n")

# Confirm contents
!echo "✅ .gitignore contents:"
!cat /content/fire-detection-dissertation/.gitignore


/content/fire-detection-dissertation
✅ .gitignore contents:
# Python cache and virtual envs
__pycache__/
*.py[cod]
*.so
.env/
venv/
dist/
build/

# Notebook checkpoints and system files
.ipynb_checkpoints/
.DS_Store

# Output files and logs
*.log
*.out
*.zip

# Model checkpoints
*.pt
*.pth
*.h5

# Project folders not to track
data/
models/
figures/
results/
secrets/

# Google Drive mount (just in case)
drive/
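
Writing the file in `"w"` mode discards any rules that were added to `.gitignore` elsewhere, for example through the GitHub web interface. If that matters, one alternative is to append only the rules that are missing. The sketch below is not what the cell above does; `merge_gitignore` is a hypothetical helper, and it compares rules as exact lines without interpreting patterns.

```python
from pathlib import Path


def merge_gitignore(gitignore_path, required_rules):
    """Append any required rules not already present; return the ones added."""
    path = Path(gitignore_path)
    existing_lines = path.read_text().splitlines() if path.exists() else []
    present = {line.strip() for line in existing_lines}
    # Keep the caller's order for rules that are not yet in the file
    missing = [rule for rule in required_rules if rule not in present]
    if missing:
        path.write_text("\n".join(existing_lines + missing) + "\n")
    return missing
```

For instance, `merge_gitignore(gitignore_path, ["secrets/", "data/", "*.pt"])` leaves existing lines in place and adds only what is absent.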

## 🧱 Step 6: Create Folder Structure in GitHub Repository

This step creates the internal folder structure within the cloned GitHub repository.  
These folders are used to organize project scripts and notebooks consistently:

- `notebooks/` → for all experiment and development notebooks
- `utils/` → for reusable Python modules (e.g., data loaders, training logic)

This step only needs to be run once after cloning, but is safe to include in the setup notebook for completeness and future reuse.


In [11]:
# STEP 6: Create folders inside the cloned GitHub repo

!mkdir -p /content/fire-detection-dissertation/notebooks
!mkdir -p /content/fire-detection-dissertation/utils
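
Git does not track empty directories, so a folder created with `mkdir` appears on GitHub only once it contains at least one tracked file. A common workaround is to drop a placeholder file into each empty folder. The sketch below assumes the conventional name `.gitkeep` (a convention, not a Git feature), and the function name `create_tracked_folders` is hypothetical.

```python
from pathlib import Path


def create_tracked_folders(repo_path, folders=("notebooks", "utils")):
    """Create folders in the repo and add a .gitkeep to any that are empty."""
    placeholders = []
    for name in folders:
        folder = Path(repo_path) / name
        folder.mkdir(parents=True, exist_ok=True)
        # Only add a placeholder when the folder has no contents yet
        if not any(folder.iterdir()):
            keep = folder / ".gitkeep"
            keep.touch()
            placeholders.append(keep)
    return placeholders
```

Here it would be called as `create_tracked_folders("/content/fire-detection-dissertation")`. Folders that already contain files are left as they are.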


In [None]:
# Copy this setup notebook from Drive into the repo's notebooks/ folder so it can be committed
!cp /content/drive/MyDrive/fire-detection-dissertation/notebooks/01_setup_environment.ipynb /content/fire-detection-dissertation/notebooks/
