# General GitHub Integration Setup (Reusable)


This notebook sets up GitHub integration for a Google Colab project.

It connects Colab to Google Drive and a GitHub repository, so that work done
in Colab is saved to Drive and versioned on GitHub. It also loads any API keys
or tokens the project needs (e.g., GitHub, WandB, UMLS, HuggingFace) from Colab's Secrets tool.



**📁 Important:**


You must:
- Have a GitHub account and an existing repository
- Use that repository's exact name as `REPO_NAME` in the USER INPUT section
- Place your working notebooks inside the `notebooks` subfolder of the project folder in Google Drive. This ensures your work is tracked by Git and pushed to GitHub when you run the END-OF-SESSION PUSH at the end of each session.
- Create the API tokens listed below and add them as Colab secrets

🔐 Required Secrets:
- `GitHubToken`
- `wandb` (optional)
- `UMLS` (optional)
- `HF_TOKEN` (optional)

⚠️ Make sure you have created these API tokens on their respective platforms and added them via the Colab Secrets UI (🔑 icon on the left sidebar → Add new secret). The GitHub Token is required for proper setup. The other secrets are optional depending on the project needs.
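
Because only `GitHubToken` is required, the loading step can treat the other secrets as optional. The following is a minimal sketch of that idea, not this notebook's own implementation: `load_secrets` and its `get_secret` parameter are hypothetical names, and in Colab `get_secret` would be `google.colab.userdata.get`. A dictionary-backed stand-in with made-up values is used here so the example runs outside Colab.

```python
import os


def load_secrets(get_secret, required, optional, env=None):
    """Copy secrets into environment variables.

    get_secret: callable that takes a secret name and returns its value
                (assumed to raise an exception when the secret is unavailable).
    required / optional: dicts mapping environment variable -> secret name.
    Returns the names of optional secrets that were skipped.
    """
    env = os.environ if env is None else env
    for env_var, secret_name in required.items():
        env[env_var] = get_secret(secret_name)  # a missing required secret should fail loudly
    skipped = []
    for env_var, secret_name in optional.items():
        try:
            env[env_var] = get_secret(secret_name)
        except Exception:
            skipped.append(secret_name)
    return skipped


# Stand-in for the Colab Secrets store (hypothetical values).
fake_store = {"GitHubToken": "ghp_example", "HF_TOKEN": "hf_example"}
env = {}
skipped = load_secrets(
    fake_store.__getitem__,
    required={"GITHUB_TOKEN": "GitHubToken"},
    optional={"WANDB_API_KEY": "wandb", "UMLS_API_KEY": "UMLS", "HF_TOKEN": "HF_TOKEN"},
    env=env,
)
print("Skipped optional secrets:", skipped)
```

Passing the getter as an argument keeps the helper independent of Colab, so the same logic can be checked locally before it is used in a notebook.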



---



🚀 **To initiate a new project**
1.   Update the user inputs
2.   Run this notebook through RUN SETUP


📌 At the *start of each Colab session* run this notebook from the top through RUN SETUP to:
1. Mount your Google Drive
2. Connect to your GitHub repo
3. Load API secrets
4. Prepare project folder structure (only if not already present)
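
Step 4 is safe to repeat every session because folder creation is idempotent. A minimal sketch of that behaviour, using a temporary directory in place of the Drive project path; `scaffold_project` is a hypothetical helper name, while the folder names are the ones this notebook's setup function creates.

```python
import os
import tempfile

FOLDERS = ["notebooks", "models", "data", "src", "outputs"]


def scaffold_project(project_path, folders=FOLDERS):
    """Create any missing folders with a .gitkeep; return the folders created."""
    created = []
    for folder in folders:
        folder_path = os.path.join(project_path, folder)
        if not os.path.isdir(folder_path):
            created.append(folder)
        os.makedirs(folder_path, exist_ok=True)
        gitkeep = os.path.join(folder_path, ".gitkeep")
        if not os.path.exists(gitkeep):
            # Git does not track empty directories, so add a placeholder file.
            open(gitkeep, "w").close()
    return created


with tempfile.TemporaryDirectory() as tmp:
    first_run = scaffold_project(tmp)   # creates all five folders
    second_run = scaffold_project(tmp)  # finds them already present
    print(first_run)
    print(second_run)
```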


🔺 At the *end of your Colab session*, run the **END-OF-SESSION PUSH** to push any notebook/code changes back to GitHub using your authenticated token.
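
The push authenticates by embedding the token in the remote URL, so it is worth keeping that URL out of any output saved with the notebook. A minimal sketch, assuming two hypothetical helpers, `build_push_url` and `redact`; the user name, repository name, and token are placeholders.

```python
def build_push_url(github_user, repo_name, token):
    """Build an HTTPS push URL that authenticates with a personal access token."""
    return f"https://{github_user}:{token}@github.com/{github_user}/{repo_name}.git"


def redact(text, token, mask="***"):
    """Replace every occurrence of the token so the text can be displayed safely."""
    return text.replace(token, mask) if token else text


url = build_push_url("your-github-username", "your-repo-name", "ghp_example")
print(redact(url, "ghp_example"))
```

In the notebook, the text passed to `redact` would be whatever is about to be printed or logged, since messages that include the remote URL may also include the token.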



---



This single notebook handles both setup and closing tasks, simplifying workflow
and ensuring all progress is backed up, version-controlled, and shareable.





# ---- USER INPUT ----
# Only this section needs to be changed for reuse across projects

In [None]:
GITHUB_USER = "your-github-username"
REPO_NAME = "your-repo-name"
USER_EMAIL = "your-email@example.com"
USER_FULLNAME = "Your Full Name"
REPO_DESCRIPTION = """
Describe your project here. This could include the purpose, data sources, methods,
and intended deliverables. Keep it concise and informative for others browsing the repo.
"""
DRIVE_BASE = "MyDrive/ColabRepos"  # ✅ Adjust this if your folder structure is different

from google.colab import drive, userdata
import os

# ---- FUNCTIONALIZED SETUP ----

In [14]:
def setup_colab_project(github_user, repo_name, user_email, user_fullname, repo_description, drive_base):
    project_path = f"/content/drive/{drive_base}/{repo_name}"
    repo_url = f"https://github.com/{github_user}/{repo_name}.git"

    # Mount Google Drive
    if not os.path.ismount("/content/drive"):
        drive.mount('/content/drive')
    os.makedirs(f"/content/drive/{drive_base}", exist_ok=True)

    # Git identity setup
    !git config --global user.email "{user_email}"
    !git config --global user.name "{user_fullname}"

    # Clone repo if not already in Drive
    if not os.path.exists(project_path):
        !git clone {repo_url} "{project_path}"

    %cd "{project_path}"

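    # Load secrets securely. GitHubToken is required; the others are optional,
    # and userdata.get raises if a secret is missing or access is not granted.
    os.environ['GITHUB_TOKEN'] = userdata.get('GitHubToken')
    for env_var, secret in [('WANDB_API_KEY', 'wandb'), ('UMLS_API_KEY', 'UMLS'), ('HF_TOKEN', 'HF_TOKEN')]:
        try:
            os.environ[env_var] = userdata.get(secret)
        except (userdata.SecretNotFoundError, userdata.NotebookAccessError):
            print(f"Optional secret '{secret}' not available; skipping.")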

    # Create folder structure
    for folder in ["notebooks", "models", "data", "src", "outputs"]:
        folder_path = os.path.join(project_path, folder)
        os.makedirs(folder_path, exist_ok=True)
        gitkeep = os.path.join(folder_path, ".gitkeep")
        if not os.path.exists(gitkeep):
            open(gitkeep, "w").close()

    # Add README if not present
    readme_path = os.path.join(project_path, "README.md")
    if not os.path.exists(readme_path):
        with open(readme_path, "w") as f:
            f.write(f"""# {repo_name}

{repo_description}

---
Maintained by **{github_user}**, 2025
""")

    # Initial Git push
    push_url = f"https://{github_user}:{os.environ['GITHUB_TOKEN']}@github.com/{github_user}/{repo_name}.git"
    !git add .
    !git commit -m "Initial setup from Colab"
    !git push {push_url}

    return project_path, github_user, repo_name


# ---- RUN SETUP ----

In [17]:
project_path, github_user, repo_name = setup_colab_project(
    GITHUB_USER, REPO_NAME, USER_EMAIL, USER_FULLNAME, REPO_DESCRIPTION, DRIVE_BASE
)

/content/drive/MyDrive/ColabRepos/NLP-Qualifications-Project
On branch main
Your branch is ahead of 'origin/main' by 2 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
Everything up-to-date


# ---- END-OF-SESSION PUSH ----
## Run this manually after your work session to sync with GitHub

In [1]:
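# Push this session's changes to GitHub.
# Run the notebook through RUN SETUP first: in a fresh session github_user,
# repo_name, and GITHUB_TOKEN are undefined and this cell raises NameError.
push_url = f"https://{github_user}:{os.environ['GITHUB_TOKEN']}@github.com/{github_user}/{repo_name}.git"
!git add .
!git commit -m "End-of-session update from Colab"
!git push {push_url}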


NameError: name 'github_user' is not defined
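
This `NameError` occurs when the push cell runs in a fresh session, before RUN SETUP has defined `github_user` and `repo_name`. One way to fail with a clearer message is to check for the required names first. A minimal sketch; `missing_setup_names` is a hypothetical helper, the namespaces below are simulated, and in the notebook the argument would be `globals()`.

```python
def missing_setup_names(namespace, required=("github_user", "repo_name", "project_path")):
    """Return the required names that are not defined in the given namespace."""
    return [name for name in required if name not in namespace]


# Simulated namespaces: a fresh session, and one where RUN SETUP has completed.
fresh_session = {}
after_setup = {
    "github_user": "your-github-username",
    "repo_name": "your-repo-name",
    "project_path": "/content/drive/MyDrive/ColabRepos/your-repo-name",
}

for label, ns in [("fresh", fresh_session), ("after setup", after_setup)]:
    missing = missing_setup_names(ns)
    if missing:
        print(f"{label}: run RUN SETUP first (undefined: {', '.join(missing)})")
    else:
        print(f"{label}: ready to push")
```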