# Git Setup and Workflow for Google Colab

This notebook walks through managing a project with Git and pushing it to GitHub over SSH from Google Colab.

## Initial Setup: Mount Drive and Prepare Environment

This step mounts your Google Drive so that the project directory is accessible from the Colab runtime. `force_remount=True` remounts the drive even if it is already mounted.

In [None]:
from google.colab import drive
drive.mount("/content/drive", force_remount=True)
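
Before running any Git commands, it can help to confirm that the mount succeeded and the project folder is visible. The following is a minimal sketch, assuming the project lives at `/content/drive/Shareddrives/GCCP` (the path used later in this notebook); `project_dir_ready` is a hypothetical helper name, not part of Colab.

```python
from pathlib import Path

# Project path used throughout this notebook
PROJECT_DIR = Path("/content/drive/Shareddrives/GCCP")

def project_dir_ready(path: Path) -> bool:
    """Return True if the path exists and is a directory."""
    return path.is_dir()

if project_dir_ready(PROJECT_DIR):
    print(f"Found project directory: {PROJECT_DIR}")
else:
    print(f"Project directory not found: {PROJECT_DIR} (is Drive mounted?)")
```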

## One-Time Git and SSH Configuration (Run only when setting up a new repository or session)

This section handles generating SSH keys, configuring Git, initializing the repository in your target directory, and making the initial commit while correctly ignoring Google Drive-specific files.

### 1. Generate SSH Key Pair (Non-Interactively)

This command generates an Ed25519 SSH key pair with an empty passphrase, which GitHub uses to authenticate pushes over SSH. Run it once per Colab session, or again if your keys are lost. If `~/.ssh/id_ed25519` already exists, `ssh-keygen` asks before overwriting it.

In [None]:
! ssh-keygen -t ed25519 -C "sghosh6@lbl.gov" -N "" -f ~/.ssh/id_ed25519
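
Because `ssh-keygen` prompts before overwriting an existing key, re-running the cell above can stall. One possible way around this is to generate the key only when it is missing. This is a sketch under assumptions: `keygen_command` and `ensure_key` are hypothetical helper names, and the email in the usage note is a placeholder.

```python
import subprocess
from pathlib import Path

def keygen_command(key_path: Path, comment: str) -> list:
    """Build the same ssh-keygen call as the cell above (empty passphrase)."""
    return ["ssh-keygen", "-t", "ed25519", "-C", comment, "-N", "", "-f", str(key_path)]

def ensure_key(key_path: Path, comment: str) -> bool:
    """Generate the key pair only if it does not exist yet.

    Returns True if a new key was created, False if one was already present.
    """
    if key_path.exists():
        return False
    # Make sure ~/.ssh exists with restrictive permissions before writing the key
    key_path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    subprocess.run(keygen_command(key_path, comment), check=True)
    return True
```

A call such as `ensure_key(Path.home() / ".ssh" / "id_ed25519", "you@example.com")` would then be safe to run repeatedly in the same session.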

### 2. Display Public Key for GitHub

Copy the output of this cell and add it to your [GitHub SSH Keys settings](https://github.com/settings/keys) under a new SSH key. Give it a descriptive title (e.g., "Colab Key").

In [None]:
! cat ~/.ssh/id_ed25519.pub
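
Once the key has been added on GitHub, `ssh -T git@github.com` can confirm that authentication works. GitHub replies with a greeting and then closes the connection with a non-zero exit status, so the sketch below inspects the text rather than the return code. It assumes the installed OpenSSH supports `StrictHostKeyChecking=accept-new`; the function names are hypothetical.

```python
import subprocess

def looks_authenticated(ssh_output: str) -> bool:
    """GitHub's greeting contains this phrase when the key is accepted."""
    return "successfully authenticated" in ssh_output

def check_github_ssh() -> bool:
    """Run `ssh -T git@github.com` and inspect what GitHub answers."""
    result = subprocess.run(
        ["ssh", "-T", "-o", "StrictHostKeyChecking=accept-new", "git@github.com"],
        capture_output=True,
        text=True,
    )
    # Combine both streams so the check does not depend on which one carries the greeting
    return looks_authenticated(result.stdout + result.stderr)
```

If `check_github_ssh()` returns `False`, the usual suspects are a key that has not been registered on GitHub yet or a missing private key in `~/.ssh`.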

### 3. Initialize Git Repository and Configure SSH Agent

This block performs several critical steps:
*   Changes to your project directory (`/content/drive/Shareddrives/GCCP`).
*   **Removes any old `.git` directory** (use with caution, this wipes Git history! Only for fresh setup).
*   Initializes a **new** Git repository.
*   Configures your Git user name and email.
*   Creates (or overwrites) a `.gitignore` that excludes `.gdoc` and `.gsheet` files and any nested `GCCP/` directory, then commits it.
*   Starts the SSH agent and adds your private key.
*   Adds GitHub's host keys to your known hosts.
*   Adds the remote GitHub repository using the SSH URL.
*   Performs the **initial commit** of your project files.
*   Renames the default branch to `main`.

In [None]:
import os

# Change the current working directory to the target path
%cd /content/drive/Shareddrives/GCCP

# Remove any existing .git directory to start fresh (use with caution if you have other branches/history!)
# Only run this if you want to completely re-initialize the repository.
! rm -rf .git

# Initialize a new Git repository in this directory
! git init

# Configure your user email and name
! git config user.email "soumyadeepghosh35@gmail.com"
! git config user.name "soumyadeepghosh35"

# Create/Update .gitignore file to correctly exclude problematic Google Drive file types
with open('.gitignore', 'w') as f:
    f.write('*.gdoc\n')
    f.write('*.gsheet\n')
    f.write('GCCP/\n') # Ignore potential nested directory if not intended as submodule

# Add and commit the updated .gitignore file immediately
! git add .gitignore
! git commit -m "Add .gitignore to ignore Google Drive files"

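# Start an SSH agent and add your private key in one go.
# Note: each `!` line runs in its own subshell, so the agent's environment variables
# do not carry over to later commands. SSH still finds the passphrase-less key
# because it sits at the default path ~/.ssh/id_ed25519.
! eval "$(ssh-agent -s)" && ssh-add ~/.ssh/id_ed25519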

# Add GitHub's host keys to your known hosts file
! ssh-keyscan github.com >> ~/.ssh/known_hosts

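# Add the remote repository using the SSH URL (only if not already added)
# split() compares whole remote names, so a remote such as 'origin2' is not mistaken for 'origin'
if 'origin' not in os.popen('git remote').read().split():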
    ! git remote add origin git@github.com:soumyadeepghosh35/AminesThermoPhysicalPrediction.git

# Verify the remote URL
! git remote -v

# Add all remaining files in the current directory to the staging area (now respecting .gitignore)
! git add .

# Commit the changes (if there are any new changes after ignoring files)
! git commit -m "Initial commit of project files from Colab via SSH"

# Rename the current branch to 'main' (for example, from the default 'master')
! git branch -M main
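
Because every `!` line in Colab starts a fresh subshell, variables exported by `eval "$(ssh-agent -s)"` are gone by the next command. If an agent is actually needed (for instance, with a passphrase-protected key), one option is to parse the agent's output in Python and copy the variables into `os.environ`, which later `!` commands inherit. This is a sketch, with `parse_agent_output` and `start_agent` as hypothetical helper names.

```python
import os
import re
import subprocess

def parse_agent_output(text: str) -> dict:
    """Extract SSH_AUTH_SOCK and SSH_AGENT_PID from `ssh-agent -s` output."""
    return dict(re.findall(r"(SSH_AUTH_SOCK|SSH_AGENT_PID)=([^;]+);", text))

def start_agent() -> dict:
    """Start ssh-agent and export its variables into this Python process."""
    output = subprocess.run(
        ["ssh-agent", "-s"], capture_output=True, text=True, check=True
    ).stdout
    env = parse_agent_output(output)
    # Child processes started by later `!` commands inherit os.environ
    os.environ.update(env)
    return env
```

After `start_agent()`, a later `! ssh-add ~/.ssh/id_ed25519` would talk to that same agent.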

### Remove Nested Git Repository and Re-commit Files

This step removes the `.git` directory from the nested `GCCP/` subdirectory so that its contents are tracked as ordinary files in the main repository instead of as an embedded repository (a gitlink). The cell then re-adds the files, commits them, and pushes.

In [None]:
import os

# Change to the project root directory
%cd /content/drive/Shareddrives/GCCP

# Remove the nested .git directory if it exists
# This assumes the nested repository is directly inside GCCP, e.g., /content/drive/Shareddrives/GCCP/GCCP/.git
if os.path.exists('./GCCP/.git'):
    ! rm -rf ./GCCP/.git
    print("Removed nested .git directory from GCCP/")
else:
    print("No nested .git directory found in GCCP/")

# Remove the 'GCCP' gitlink from the index so its contents can be tracked normally
! git rm --cached GCCP

# Re-add all files in the current directory to the staging area
# This will now include the contents of the GCCP/ subdirectory
! git add .

# Commit the changes
! git commit -m "Re-added GCCP/ contents after removing nested .git and gitlink"

# Push the changes to your remote 'main' branch
! git push -u origin main

/content/drive/Shareddrives/GCCP
Removed nested .git directory from GCCP/
rm 'GCCP'
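
The cell above checks a single known location, `./GCCP/.git`. To find embedded repositories anywhere below the project root before deleting anything, a directory walk along these lines could be used. It is a sketch: `find_nested_git_dirs` is a hypothetical helper, and it only detects `.git` directories, not the `.git` files used by submodules or worktrees.

```python
import os
from pathlib import Path

def find_nested_git_dirs(root: Path) -> list:
    """Return .git directories below root, excluding root's own .git."""
    nested = []
    for current, dirnames, _ in os.walk(root):
        current_path = Path(current)
        if ".git" in dirnames:
            dirnames.remove(".git")  # never descend into Git metadata
            if current_path != root:
                nested.append(current_path / ".git")
    return sorted(nested)
```

Calling `find_nested_git_dirs(Path("/content/drive/Shareddrives/GCCP"))` lists candidates for review; nothing is removed by the helper itself.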


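### Update `.gitignore` and Re-add Project Files

This cell rewrites `.gitignore` without the `GCCP/` entry, then stages, commits, and pushes the remaining files. In the recorded output, `git add .` stored `GCCP` as a gitlink (`create mode 160000`) because the nested directory still contained its own `.git`; the "Remove Nested Git Repository" step above corrects this.

In [50]: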
import os

# Change the current working directory to the target path
%cd /content/drive/Shareddrives/GCCP

# Rewrite .gitignore to exclude problematic Google Drive file types.
# The 'GCCP/' entry is dropped so the nested directory's contents can be tracked.
with open('.gitignore', 'w') as f:
    f.write('*.gdoc\n')
    f.write('*.gsheet\n')

# Display the content of the updated .gitignore
! cat .gitignore

# Add and commit the updated .gitignore file
! git add .gitignore
! git commit -m "Update .gitignore to remove GCCP/ entry for main repository content"

# Re-add all remaining files in the current directory to the staging area (now respecting the corrected .gitignore)
! git add .

# Commit the changes (if there are any new changes after ignoring files)
! git commit -m "adding data files"

# Push the changes to your remote 'main' branch
! git push -u origin main

/content/drive/Shareddrives/GCCP
*.gdoc
*.gsheet
[main 8928ac3] Update .gitignore to remove GCCP/ entry for main repository content
 1 file changed, 1 deletion(-)
hint: You've added another git repository inside your current repository.
hint: Clones of the outer repository will not contain the contents of
hint: the embedded repository and will not know how to obtain it.
hint: If you meant to add a submodule, use:
hint:
hint: 	git submodule add <url> GCCP
hint:
hint: If you added this path by mistake, you can remove it from the
hint: index with:
hint:
hint: 	git rm --cached GCCP
hint:
hint: See "git help submodule" for more information.
[main 2f92dae] adding data files
 1 file changed, 1 insertion(+)
 create mode 160000 GCCP
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 2 threads
Compressing objects: 100% (4/4), done.
Writing objec
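
The cells above open `.gitignore` in `'w'` mode, which discards any patterns that were already in the file. A gentler alternative is to add and remove individual patterns while keeping everything else. The sketch below shows one way to do that; `update_gitignore` is a hypothetical helper name.

```python
from pathlib import Path

def update_gitignore(path: Path, add=(), remove=()) -> list:
    """Add and remove patterns while keeping every other existing line."""
    lines = path.read_text().splitlines() if path.exists() else []
    # Drop the patterns that should no longer be ignored
    lines = [line for line in lines if line not in remove]
    # Append new patterns, skipping ones that are already present
    for pattern in add:
        if pattern not in lines:
            lines.append(pattern)
    path.write_text("".join(line + "\n" for line in lines))
    return lines
```

For this project, the equivalent call would be `update_gitignore(Path(".gitignore"), add=["*.gdoc", "*.gsheet"], remove=["GCCP/"])`.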

### 4. Push Initial Commit to GitHub

This pushes your newly configured repository and initial project files to the remote `main` branch on GitHub. The `-u` flag sets `origin/main` as the upstream of your local `main` branch.

In [None]:
# Push your changes to GitHub
! git push -u origin main
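
Pushing with the SSH key only works if `origin` uses an SSH URL. If the remote was added with an HTTPS URL instead, it can be rewritten. The sketch below converts a GitHub HTTPS URL to its SSH form; `https_to_ssh` is a hypothetical helper and handles only `https://github.com/owner/repo` style URLs.

```python
import re

def https_to_ssh(url: str) -> str:
    """Convert https://github.com/owner/repo(.git) to git@github.com:owner/repo.git."""
    match = re.fullmatch(r"https://github\.com/([^/]+)/([^/]+?)(?:\.git)?/?", url)
    if match is None:
        return url  # already SSH, or not a GitHub HTTPS URL: leave untouched
    owner, repo = match.groups()
    return f"git@github.com:{owner}/{repo}.git"
```

The result can be applied with `git remote set-url origin <ssh-url>`.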

---

## Git Workflow for Subsequent Commits (Template)

This section provides a template for checking the status of your repository, staging new or modified files, committing them with a custom message, and pushing them to your remote GitHub repository.

### 1. Check Repository Status

Before staging anything, it is a good idea to check which files have been changed, added, or deleted.

In [49]:
# Check the current status of your Git repository
! git status

On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
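
When the status needs to be checked from Python rather than read by eye, `git status --porcelain` prints one line per changed path, prefixed by a two-character status code. The following sketch counts entries per code; the function names are hypothetical.

```python
import subprocess

def summarize_porcelain(output: str) -> dict:
    """Count entries by their two-character status code."""
    counts = {}
    for line in output.splitlines():
        if line:
            code = line[:2]
            counts[code] = counts.get(code, 0) + 1
    return counts

def git_status_summary() -> dict:
    """Run `git status --porcelain` in the current directory and summarize it."""
    result = subprocess.run(
        ["git", "status", "--porcelain"], capture_output=True, text=True, check=True
    )
    return summarize_porcelain(result.stdout)
```

An empty dictionary corresponds to the "nothing to commit, working tree clean" message shown above.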


### 2. Stage, Commit, and Push Your Changes

Use this block to stage your changes, commit them with a descriptive message, and then push them to your remote `main` branch. **Remember to replace `"Your custom commit message here"` with an actual, meaningful message.**

In [None]:
# Stage all changes (or specify individual files like 'git add my_file.txt')
! git add .

# Commit the changes with a custom message
! git commit -m "Your custom commit message here"

# Push the changes to your remote 'main' branch
! git push -u origin main
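
When nothing has changed, `git commit` exits with an error and the push that follows is pointless. The template above can be wrapped so that it commits and pushes only when something is staged. This is a sketch, not a definitive implementation: `commit_and_push` is a hypothetical helper, and its `run` parameter exists only so the Git calls can be swapped out for testing.

```python
import subprocess

def commit_and_push(message: str, run=subprocess.run) -> bool:
    """Stage everything, then commit and push only if something is staged."""
    run(["git", "add", "."], check=True)
    # `git diff --cached --quiet` exits with 0 when nothing is staged
    if run(["git", "diff", "--cached", "--quiet"]).returncode == 0:
        print("Nothing to commit.")
        return False
    run(["git", "commit", "-m", message], check=True)
    run(["git", "push", "origin", "main"], check=True)
    return True
```

In a notebook cell this would be called as `commit_and_push("Describe your change here")` from the project directory.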