<a href="https://colab.research.google.com/github/Aniruddha072/Noise-classification/blob/main/notebooks/01_environment.ipynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Noise Classification Project — Environment & Google Drive Setup (Colab Ready)

This notebook sets up a reproducible Google Colab environment for working with UrbanSound8K, ESC-50, and custom campus recordings.
It mounts Google Drive, creates persistent project folders, installs dependencies, and logs the versions of the key libraries.

---

## 1. Mount Google Drive

```python
from google.colab import drive
drive.mount('/content/drive')

# Create main project directory on Drive for persistent saving
import os

project_drive_root = '/content/drive/MyDrive/NoiseClassification'
folders = [
    'datasets',              # Raw + processed audio
    'datasets/features',     # Extracted features
    'datasets/UrbanSound8K', # Raw UrbanSound8K
    'datasets/ESC-50',       # Raw ESC-50
    'datasets/campus',       # Raw campus recordings
    'models',                # Saved models
    'reports',               # CSVs, tables, figures
    'figures',               # Visualization outputs
    'src',                   # Modular scripts
]

for folder in folders:
    path = os.path.join(project_drive_root, folder)
    os.makedirs(path, exist_ok=True)
print("Project folders created under:", project_drive_root)
```
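Later notebooks rely on this folder tree, so it can be worth confirming that every folder really exists after the mount. The following is a minimal sketch, not part of the original setup: `missing_folders` is a hypothetical helper, and the shortened folder list is only an illustration of the names used in the cell above.

```python
import os

def missing_folders(root, folders):
    """Return the subfolders of `root` that do not exist yet."""
    return [f for f in folders if not os.path.isdir(os.path.join(root, f))]

# Hypothetical usage with the same root and a subset of the folder names above
root = '/content/drive/MyDrive/NoiseClassification'
expected = ['datasets', 'datasets/features', 'models', 'reports', 'figures', 'src']

missing = missing_folders(root, expected)
print("Missing folders:", missing if missing else "none")
```

An empty result suggests the Drive mount and folder creation both succeeded; a non-empty one usually means Drive was not mounted when the folders were created.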

---

## 2. Set up Colab & Install Dependencies

```python
# Write requirements.txt for reproducibility
requirements = '''
numpy>=1.22
pandas>=1.4
librosa>=0.10.0
scikit-learn>=1.1
matplotlib>=3.6
seaborn>=0.12
tensorflow>=2.12
torch>=2.0
torchaudio>=2.0
umap-learn
openl3
tqdm
audiomentations
wget
'''

with open('requirements.txt', 'w') as f:
    f.write(requirements)

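# Install requirements in Colab (remove tensorflow or torch/torchaudio from the
# list above if you only need one framework)
!pip install -q -r requirements.txt
# PANNs pretrained audio-tagging models; YAMNet is not part of this package
!pip install -q panns-inference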

# Log versions for reproducibility
import sys
import numpy, pandas, librosa, sklearn, matplotlib, seaborn
import tensorflow as tf
import torch

print("Python:", sys.version)
print("numpy:", numpy.__version__)
print("pandas:", pandas.__version__)
print("librosa:", librosa.__version__)
print("scikit-learn:", sklearn.__version__)
print("matplotlib:", matplotlib.__version__)
print("seaborn:", seaborn.__version__)
print("TensorFlow:", tf.__version__)
print("PyTorch:", torch.__version__)
```
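Importing every library just to read `__version__` fails as soon as one package is missing. As an alternative sketch, the standard-library `importlib.metadata` module can look up installed versions by distribution name without importing the packages. The helper name `collect_versions` and the package list below are assumptions for illustration.

```python
import json
import sys
from importlib import metadata

def collect_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions

# Distribution names as they appear in requirements.txt (e.g. scikit-learn, not sklearn)
packages = ["numpy", "pandas", "librosa", "scikit-learn", "tensorflow", "torch"]
versions = collect_versions(packages)
print(json.dumps(versions, indent=2))
```

The resulting dictionary could be saved next to the project reports so that each run records the environment it used.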

---

## 3. Set up Paths & Random Seeds (config.py)

```python
# Contents of src/config.py (save this code to that file so every script can import it)
import os
import numpy as np
import random

# Paths
COLAB_ROOT = '/content'
DRIVE_ROOT = '/content/drive/MyDrive/NoiseClassification'
DATASET_ROOT = os.path.join(DRIVE_ROOT, 'datasets')
FEATURES_ROOT = os.path.join(DATASET_ROOT, 'features')
MODELS_ROOT = os.path.join(DRIVE_ROOT, 'models')
REPORTS_ROOT = os.path.join(DRIVE_ROOT, 'reports')
FIGURES_ROOT = os.path.join(DRIVE_ROOT, 'figures')

# Audio params
SR = 22050
DURATION = 4.0
N_MELS = 128
N_FFT = 2048
HOP_LENGTH = 512

# Random seeds for reproducibility
SEED = 42
np.random.seed(SEED)
random.seed(SEED)
try:
    import torch
    torch.manual_seed(SEED)
except ImportError:
    pass
try:
    import tensorflow as tf
    tf.random.set_seed(SEED)
except ImportError:
    pass

# Print the key audio parameters as a sanity check
print(f"SR={SR}, Duration={DURATION}, N_MELS={N_MELS}")
```
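The audio parameters above determine the shape of each mel spectrogram computed later. The arithmetic below is a quick check under one assumption that the notebook does not state: that features are computed with librosa's default centered framing (`center=True`), where the frame count is `1 + n_samples // hop_length`.

```python
# Same values as in the config cell
SR = 22050
DURATION = 4.0
N_MELS = 128
HOP_LENGTH = 512

# Samples in one fixed-length clip
n_samples = int(SR * DURATION)          # 88200

# Frames per clip, assuming centered framing (librosa's default center=True)
n_frames = 1 + n_samples // HOP_LENGTH  # 173

# Expected mel spectrogram shape: (mel bands, time frames)
mel_shape = (N_MELS, n_frames)          # (128, 173)

# Time step between consecutive frames, in milliseconds
hop_ms = 1000 * HOP_LENGTH / SR

print(f"samples={n_samples}, frames={n_frames}, mel shape={mel_shape}, hop={hop_ms:.1f} ms")
```

Clips shorter than `DURATION` would need padding, and longer ones cropping, to reach this fixed shape.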

---

## 4. Instructions for GPU/CPU Selection

```python
# Colab: Check GPU availability and set device
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print("Using device:", device)

# For TensorFlow:
import tensorflow as tf
print("TF GPU available:", tf.config.list_physical_devices('GPU'))
```

---

## 5. Manifest for Outputs

```python
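import json
import os

# DRIVE_ROOT comes from the config cell above
manifest_path = os.path.join(DRIVE_ROOT, 'manifest.json')

# Create an empty manifest only if none exists, so re-running this notebook
# does not wipe entries written by later pipeline stages
if not os.path.exists(manifest_path):
    with open(manifest_path, 'w') as f:
        json.dump({}, f)  # filled in as the pipeline runs
    print("Initialized manifest.json for output tracking.")
else:
    print("Found existing manifest.json; leaving it unchanged.")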
```
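The notebook does not define how later stages should record their outputs in the manifest. One possible approach is sketched below: a hypothetical `update_manifest` helper that loads the file, adds one entry, and writes it back through a temporary file so an interrupted run does not leave a half-written manifest. The keys and entry fields are invented for illustration.

```python
import json
import os
import tempfile

def update_manifest(manifest_path, key, entry):
    """Add or overwrite one entry in the manifest and return the updated dict."""
    manifest = {}
    if os.path.exists(manifest_path):
        with open(manifest_path) as f:
            manifest = json.load(f)
    manifest[key] = entry

    # Write to a temporary file first, then swap it into place
    tmp_path = manifest_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(manifest, f, indent=2)
    os.replace(tmp_path, manifest_path)
    return manifest

# Demo against a temporary directory; in the notebook, pass the real manifest_path
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "manifest.json")
    update_manifest(demo_path, "features/mel", {"path": "datasets/features/mel.npy"})
    result = update_manifest(demo_path, "models/baseline", {"path": "models/baseline.pkl"})
    print(sorted(result))  # ['features/mel', 'models/baseline']
```

Keying entries by pipeline stage keeps repeated runs idempotent: re-running a stage overwrites its own entry and leaves the others untouched.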

---

> **Next steps:** Run the next notebook (`02_download_datasets.ipynb`) to download UrbanSound8K and ESC-50 and to upload your custom campus recordings.