In [1]:
import os
# 1. Mount your Google Drive
from google.colab import drive
drive.mount('/content/drive')

# 2. Define your project path and Git URL
PROJECT_PATH = '/content/drive/MyDrive/my_freelance_projects'
GIT_REPO_URL = 'https://github.com/AncyJohn/research-notebook.git'
REPO_NAME = 'research-notebook'
FULL_PATH = os.path.join(PROJECT_PATH, REPO_NAME)

# 3. Check and Clone/Update
if not os.path.exists(FULL_PATH):
    # Ensure the parent folder exists
    os.makedirs(PROJECT_PATH, exist_ok=True)

    # Clone from the parent folder, then move into the new repository
    %cd {PROJECT_PATH}
    !git clone {GIT_REPO_URL}
    %cd {FULL_PATH}
    print(f"Successfully cloned into {FULL_PATH}")
else:
    # If it exists, just move inside and get the latest changes
    print("--- Project already exists in Drive. Moving to folder. ---")
    %cd {FULL_PATH}
    # Optional: pull the latest changes if you made edits elsewhere
    !git pull
    print("Repository already exists. Updated with 'git pull'.")

print(f"Current Working Directory: {os.getcwd()}")

Mounted at /content/drive
--- Project already exists in Drive. Moving to folder. ---
/content/drive/MyDrive/my_freelance_projects/research-notebook
Already up to date.
Repository already exists. Updated with 'git pull'.
Current Working Directory: /content/drive/MyDrive/my_freelance_projects/research-notebook
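
The clone-or-pull decision in the cell above can also be written as a plain Python function, which is easier to reuse outside a notebook. This is a minimal sketch, not the notebook's own code: `git_sync_command` is a hypothetical helper name, and it only builds the command and the folder to run it in.

```python
import os


def git_sync_command(repo_url, parent_dir, repo_name):
    """Return (command, working_dir) for cloning or updating a repository.

    Hypothetical helper: mirrors the if/else in the cell above without
    relying on IPython magics such as %cd or !git.
    """
    full_path = os.path.join(parent_dir, repo_name)
    if os.path.exists(full_path):
        # Repository folder already present: pull from inside it
        return ["git", "pull"], full_path
    # First run: make sure the parent folder exists, then clone from there
    os.makedirs(parent_dir, exist_ok=True)
    return ["git", "clone", repo_url], parent_dir
```

The returned pair could then be passed to `subprocess.run(cmd, cwd=cwd, check=True)`, assuming `git` is installed in the runtime.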


In [2]:
import torch
torch.cuda.is_available()

True

The True above means PyTorch can see a CUDA-capable GPU in this Colab runtime. CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by NVIDIA that allows developers to use a GPU for general-purpose processing.
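
A common way to use this check is to pick the device once and fall back to the CPU when no GPU is attached. The sketch below assumes nothing beyond `torch.cuda.is_available()`; `pick_device` is a hypothetical helper name, and the lazy import only exists so the function also runs where PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the function also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The resulting string can be passed to `model.to(device)` and `tensor.to(device)` so the same notebook runs on both GPU and CPU runtimes.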

Open the Specific Notebook.
You must open the notebook file to execute its cells.
In the left sidebar of Colab, click the Folder icon.
Navigate to: my_freelance_projects → research-notebook → phase2 → deep_learning → tabular.
Double-click 01_pytorch_baseline.ipynb. This will open the file in a new tab.

Set the Working Directory (Inside the New Tab)
Colab notebooks always start in '/content' by default, regardless of where the file is stored. In the first cell of your newly opened 01_pytorch_baseline.ipynb, run:

In [None]:
from google.colab import drive
import os

# Mount drive if not already connected in this session
drive.mount('/content/drive')

# Change directory to the repository root so that relative paths
# such as 'train.csv' resolve against the cloned project
%cd /content/drive/MyDrive/my_freelance_projects/research-notebook
# Alternative: work from the notebook's own folder instead, which allows
# relative paths like '../../kaggle/data.csv'
#%cd /content/drive/MyDrive/my_freelance_projects/research-notebook/phase2/deep_learning/tabular/
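
The choice of working directory decides where relative paths point. The sketch below only manipulates path strings, so it does not need Drive to be mounted; `resolve` is a hypothetical helper, and `../../kaggle/data.csv` is the example path from the cell comment, not a file confirmed to exist.

```python
import posixpath

REPO_ROOT = "/content/drive/MyDrive/my_freelance_projects/research-notebook"
NOTEBOOK_DIR = posixpath.join(REPO_ROOT, "phase2/deep_learning/tabular")


def resolve(working_dir, relative_path):
    """Show which absolute path a relative path points to from working_dir."""
    return posixpath.normpath(posixpath.join(working_dir, relative_path))


# From the notebook's folder, '../..' climbs two levels up to phase2
from_notebook_dir = resolve(NOTEBOOK_DIR, "../../kaggle/data.csv")
# From the repository root, a bare file name stays in the root
from_repo_root = resolve(REPO_ROOT, "train.csv")
print(from_notebook_dir)
print(from_repo_root)
```

The first path resolves to `phase2/kaggle/data.csv` under the repository, the second to `train.csv` directly in the repository root.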

To access the data source from Kaggle, generate an API token from your Kaggle account settings page. I used the 'Create Legacy API Key' button, which automatically downloads a file named kaggle.json to your local machine. Upload that file to Colab and place it in the directory the Kaggle tool expects.

In [3]:
# Upload the file
from google.colab import files
uploaded = files.upload()  # Select the 'kaggle.json' file you downloaded; assigning the result keeps the API key out of the cell output

Saving kaggle.json to kaggle.json

In [4]:
# Copy it to the location the Kaggle CLI expects
!mkdir -p ~/.kaggle  # Create the hidden directory
!cp kaggle.json ~/.kaggle/  # Copy the file there
!chmod 600 ~/.kaggle/kaggle.json  # Set permissions (required by Kaggle for security)

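Default API Path: By default, the Kaggle API tool looks for credentials in ~/.kaggle/ (which translates to /root/.kaggle/ in Colab).
Security: Running chmod 600 ensures that only the owner can read and write the file, which prevents the Kaggle tool from warning that the API key is readable by other users. Because the cell copies the file rather than moving it, the original kaggle.json stays in the working directory, so keep it out of version control.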
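
The same three shell steps can be sketched in Python, with a basic check that the token file has the expected fields. `install_kaggle_token` is a hypothetical helper, not part of the Kaggle package; in the setup above `config_dir` would be `~/.kaggle` (expanded with `os.path.expanduser`).

```python
import json
import os
import shutil
import stat


def install_kaggle_token(source, config_dir):
    """Copy a kaggle.json token into config_dir with owner-only permissions."""
    with open(source) as fh:
        token = json.load(fh)
    # A usable token file carries both of these fields
    missing = {"username", "key"} - set(token)
    if missing:
        raise ValueError(f"kaggle.json is missing fields: {sorted(missing)}")

    os.makedirs(config_dir, exist_ok=True)  # same as `mkdir -p`
    target = os.path.join(config_dir, "kaggle.json")
    shutil.copyfile(source, target)  # same as `cp`
    # Same effect as `chmod 600`: read/write for the owner only
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target
```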

In [5]:
# Verify that the credentials work by searching the public datasets
!kaggle datasets list -s "titanic"

ref                                  title                                                size  lastUpdated                 downloadCount  voteCount  usabilityRating  
-----------------------------------  ---------------------------------------------  ----------  --------------------------  -------------  ---------  ---------------  
heptapod/titanic                     Titanic                                             11090  2017-05-16 08:14:22.210000         143886       1859  0.7058824        
brendan45774/test-file               Titanic dataset                                     11514  2021-12-02 16:11:42.367000         219291       1715  1.0              
yasserh/titanic-dataset              Titanic Dataset                                     22564  2021-12-24 14:53:06.913000         271408        832  1.0              
azeembootwala/titanic                Titanic                                             12406  2017-06-05 12:14:37.477000          26137        209  0.8235294 

In [6]:
# Download the data
!kaggle competitions download -c titanic
!unzip -o titanic.zip

Downloading titanic.zip to /content/drive/MyDrive/my_freelance_projects/research-notebook
  0% 0.00/34.1k [00:00<?, ?B/s]
100% 34.1k/34.1k [00:00<00:00, 4.68MB/s]
Archive:  titanic.zip
  inflating: gender_submission.csv   
  inflating: test.csv                
  inflating: train.csv               
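
If `unzip` is not available, the standard library can do the same extraction. This is a minimal sketch using `zipfile`; `extract_archive` is a hypothetical helper name, and like `unzip -o` it overwrites files that already exist.

```python
import zipfile


def extract_archive(zip_path, dest_dir="."):
    """Extract every member of zip_path into dest_dir, overwriting existing files.

    Returns the sorted list of member names for a quick sanity check.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())
```

For the archive listed above, `extract_archive("titanic.zip")` would return the three CSV names shown in the `unzip` output.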


In [7]:
!ls

 gender_submission.csv	'Phase 1'   README.md   titanic.zip
 kaggle.json		'Phase 2'   test.csv    train.csv


Now you can run 01_pytorch_baseline.ipynb in its own tab. Because the CSV files were unzipped into the repository root, pd.read_csv("train.csv") loads the training data as long as the working directory is set to that root.
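
A wrong working directory is the usual reason a relative path fails, so it can help to check for the file before loading it. The sketch below uses only the standard library and reads just the header row; `read_header` is a hypothetical helper, not something the notebook defines.

```python
import csv
from pathlib import Path


def read_header(csv_name, base_dir="."):
    """Return the column names of a CSV, failing early with the full path if it is missing."""
    path = Path(base_dir) / csv_name
    if not path.is_file():
        raise FileNotFoundError(
            f"{path.resolve()} not found; check the working directory with os.getcwd()"
        )
    with path.open(newline="") as fh:
        return next(csv.reader(fh))
```

Calling `read_header("train.csv")` before `pd.read_csv("train.csv")` confirms that the notebook is running from the folder that holds the data.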

## Reflection
My work today was portable. My model ran outside Kaggle, on cloud compute. It didn't depend on my local system. It didn't break.