This notebook mounts the drive, builds the manifest CSV, makes sample grids, and saves them.  

Mount Google Drive and set folder paths within Google Drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# Base paths: Colab's local disk for working data, and the TFM folder on Drive
BASE = "/content" # Colab's local disk; faster than reading from Google Drive.
DRIVE_BASE = "/content/drive/MyDrive/TFM"

REPO_NAME = "CNNs-distracted-driving"

# Keep the repo under /content, where file access is faster than on Drive
REPO_ROOT = f"{BASE}/{REPO_NAME}"
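
Since every later cell depends on these folders, it can help to confirm they exist right after mounting. The following is a minimal sketch, not part of the original notebook: `missing_dirs` is a hypothetical helper, and the two paths simply repeat the values assumed in the cell above.

```python
from pathlib import Path


def missing_dirs(paths):
    """Return the entries of `paths` that are not existing directories."""
    return [str(p) for p in paths if not Path(p).is_dir()]


# Assumed values, repeated from the cell above so this sketch runs on its own.
BASE = "/content"
DRIVE_BASE = "/content/drive/MyDrive/TFM"

missing = missing_dirs([BASE, DRIVE_BASE])
if missing:
    print("Missing folders (is Drive mounted?):", missing)
else:
    print("All folders found.")
```

Outside Colab both paths will normally be reported as missing, which is expected.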


Clone the repo from GitHub, or pull the latest changes if it is already cloned.

In [None]:
import os

if not os.path.exists(REPO_ROOT):
  # First run: clone the repo into /content (fast local disk)
  !git clone https://github.com/ClaudiaCPach/CNNs-distracted-driving.git "{REPO_ROOT}"
else:
  # Later runs: fetch and pull the latest code from the GitHub repo
  !git -C "{REPO_ROOT}" fetch --all # Get updates for every branch
  !git -C "{REPO_ROOT}" pull
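
A failing `!` shell command does not raise a Python exception, so a failed clone or pull can go unnoticed. The same clone-or-pull logic can be sketched in plain Python so that errors stop the cell. This is an illustration under assumptions, not the notebook's own method: `sync_commands` and `sync_repo` are hypothetical helper names, and `git` is assumed to be on the PATH.

```python
import os
import subprocess


def sync_commands(repo_url, repo_root):
    """Return the git commands to run: clone if repo_root is absent, else fetch and pull."""
    if not os.path.exists(repo_root):
        return [["git", "clone", repo_url, repo_root]]
    return [
        ["git", "-C", repo_root, "fetch", "--all"],
        ["git", "-C", repo_root, "pull"],
    ]


def sync_repo(repo_url, repo_root):
    """Run the commands in order, stopping at the first failure."""
    for cmd in sync_commands(repo_url, repo_root):
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(cmd, check=True)


# Example call (not executed here; it needs network access):
# sync_repo("https://github.com/ClaudiaCPach/CNNs-distracted-driving.git", REPO_ROOT)
```

Keeping the command selection separate from the execution makes the branch logic easy to check without touching the network.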

Install the package and its dependencies.

In [None]:
# Install the repo as a package in editable mode.

# `python -m pip` makes sure the pip we run corresponds to the Python version we're using.
# -q means quiet mode.
# Upgrade pip first, because Colab virtual machines can sometimes start with a pip that
# doesn't correspond to our Python version (3.11).

!python -m pip install -q --upgrade pip
# Move into the repo so that pip can see the package.
%cd "{REPO_ROOT}"
# Install the package in editable, quiet mode.
!python -m pip install -q -e .
# Go back to /content so that future logs, checkpoints, code, etc. don't get saved to the repo.
%cd /content
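
After an editable install it can be worth checking where the package actually resolves from. The sketch below uses only the standard library; `package_origin` is a hypothetical helper, and the package name `ddriver` is taken from the import used later in this notebook.

```python
import importlib
import importlib.util


def package_origin(name):
    """Return the file a top-level package would be imported from, or None if it is not found."""
    importlib.invalidate_caches()  # pick up packages installed after the kernel started
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("ddriver resolves to:", package_origin("ddriver"))
```

For an editable install the printed path would be expected to sit inside the cloned repo. If it prints `None` right after installing, restarting the runtime may be needed before the package becomes importable.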

Speed up data loading by copying the dataset from Drive to /content.

In [None]:
from ddriver.config import DATASET_ROOT # use the path declared in config.py  



In [None]:
# check space on /content (fast disk space) 
!df -h /content

# check the size of the dataset on Drive
!du -sh "{DATASET_ROOT}" || echo "Couldn't measure Drive size"

# Copy the data to /content at the start of each session (the local disk does not persist between sessions)
FAST_DATA = "/content/data"
os.makedirs(FAST_DATA, exist_ok=True) # create a data folder within /content

# rsync is a fast copy tool that skips files already up to date at the destination
# -a = archive mode (recurses into folders; keeps times and permissions)
# -h = human-readable sizes and speeds
# --info=progress2 = shows overall transfer progress
!rsync -ah --info=progress2 "{DATASET_ROOT}/" "{FAST_DATA}/"
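
Once the copy finishes, a quick sanity check is to compare the file count and total size of the two folders. This is a rough sketch rather than a full integrity check (it does not compare file contents), and `tree_summary` and `copies_match` are hypothetical helper names.

```python
import os


def tree_summary(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    count, total = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                count += 1
                total += os.path.getsize(path)
    return count, total


def copies_match(src, dst):
    """True when both trees hold the same number of files and the same total size."""
    return tree_summary(src) == tree_summary(dst)


# Example call in this notebook (assumes DATASET_ROOT and FAST_DATA from the cells above):
# print(copies_match(DATASET_ROOT, FAST_DATA))
```

A mismatch usually means the transfer was interrupted or the destination already held other files; rerunning the `rsync` command only transfers what is still missing.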
