# Template and Instructions for Integrating Colab Notebooks with GitHub

This template is based on this Towards Data Science article by Vortana Say: https://towardsdatascience.com/google-drive-google-colab-github-dont-just-read-do-it-5554d5824228

Colab notebooks stored on Google Drive can be edited collaboratively and version-controlled with GitHub. Essentially, this process takes our group's master project repo on GitHub (called the remote repo) and creates a working copy on your personal Google Drive (called the local repo).

You can save your changes to your local repo by "adding" and "committing".  You can then update the remote repo by "pushing" your local to the remote.

If other group members have updated the remote repo with changes that you need, you can update your local repo from the remote by "pulling".  
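
The cycle described above can be sketched as an ordered list of Git commands. This is only an illustration: the helper name `sync_commands` and the commit message are hypothetical, and the branch name `develop` is borrowed from the push cell at the end of this notebook.

```python
import shlex

def sync_commands(message, branch="develop"):
    """Return the Git commands for one pull/add/commit/push cycle, in order."""
    return [
        ["git", "pull", "origin", branch],   # bring in teammates' changes from the remote
        ["git", "add", "."],                 # stage your changes
        ["git", "commit", "-m", message],    # save the staged changes to the local repo
        ["git", "push", "origin", branch],   # update the remote from the local repo
    ]

# Print the commands; nothing is executed here
for cmd in sync_commands("add data cleaning notebook"):
    print(shlex.join(cmd))
```

The function only builds the commands, so it is safe to run anywhere; the cells in Step 8 run the real commands one at a time.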

## Step 1: Copy this Colab Notebook to your personal Google Drive

Save it somewhere in your Google Drive other than the team's shared project folder before proceeding to the next step.  

This will ensure the integrity of the template and prevent you from exposing your personal GitHub access token.

## Step 2: Create a folder in your Google Drive for your local repo

Open your personal Google Drive (using the same Google account that your Colab account uses).

Create a folder on your Google Drive where you want to store your local repo.  

In order to avoid confusion, **DO NOT** create your personal, local repo inside the group's shared project folder.

Don't use periods or spaces in your path name.
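
If you want to double-check a folder path against that rule before continuing, a small check along these lines may help. The function name `is_safe_path` and the example paths are made up for illustration.

```python
def is_safe_path(path):
    """Return True if the path contains no spaces and no periods."""
    return not any(ch in path for ch in (" ", "."))

print(is_safe_path("MyDrive/adv_practical_data_science/local-repo"))  # True
print(is_safe_path("MyDrive/my project/v1.0"))                        # False
```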

## Step 3: Mount your Google Drive to this Colab Notebook

When you run the cell below it will prompt you for an authorization code from Google Drive.  Follow the link provided, copy the code, paste it in the cell below and hit enter.

You will also need to include a copy of the cell below in every notebook you use for the project.

In [None]:
# Mount Google Drive
from google.colab import drive # import drive from google colab

ROOT = "/content/drive"     # default location for the drive
print(ROOT)                 # print content of ROOT (Optional)

drive.mount(ROOT, force_remount=True)           # we mount the google drive at /content/drive

/content/drive
Mounted at /content/drive
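
Since every project notebook depends on the mount, it can be handy to confirm that the drive is actually visible before running later cells. The sketch below assumes that a mounted drive exposes a `MyDrive` folder under the mount point, as the paths in this notebook suggest; `drive_is_mounted` is a hypothetical helper name.

```python
import os

def drive_is_mounted(root="/content/drive"):
    """Return True if a MyDrive folder is visible under the mount point."""
    return os.path.isdir(os.path.join(root, "MyDrive"))

# Outside Colab, or before mounting, this is expected to print False
print(drive_is_mounted())
```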


## Step 4: Create a GitHub access token

4a) Sign in to GitHub. In the upper right corner, click on your icon to open the menu and click "Settings".

4b) Click on "Developer settings" in the menu on the left.

4c) Click on "Personal access tokens" in the menu on the left.

4d) Click the "Generate new token" button.

4e) Name your new token in the "Note" field. Set your expiration date to at least 90 days (that gets you to the end of the course plus some). Select the checkbox next to "repo". Click the "Generate token" button at the bottom.

4f) On the next screen, copy the token and store it somewhere safe. You only get to view the token once. Do not share it with anyone and do not put it in the group's repo. This token gives anyone who holds it access to all of your GitHub repos.
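
One optional way to keep the token out of a notebook's saved cells is to type it at a hidden prompt instead of pasting it into the code. The sketch below uses Python's standard `getpass` module; the helper name `read_token` is hypothetical and this is not part of the original template.

```python
from getpass import getpass

def read_token(prompt=getpass):
    """Ask for the token without echoing it, and strip stray whitespace."""
    token = prompt("GitHub token: ").strip()
    if not token:
        raise ValueError("No token entered")
    return token

# In Colab, call it like this instead of typing the token into the cell:
# GIT_TOKEN = read_token()
```

The `prompt` parameter exists only so the function can be tried with a stand-in such as `lambda _: "test-token"` without waiting for keyboard input.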

## Step 5: Set-up your paths

Insert the folder path you created in Step 2, your GitHub user name and your GitHub token in the appropriate lines of the cell below.

In [None]:
# Pre-cloning set-up
# import join used to join ROOT path and MY_GOOGLE_DRIVE_PATH
from os.path import join  

# path to your project on Google Drive
MY_GOOGLE_DRIVE_PATH = 'MyDrive/[insert the rest of your path here]' 
# replace with your Github username 
GIT_USERNAME = "" 
# definitely replace with your token
GIT_TOKEN = ""  
# name of the GitHub repository; for this project it is AC215_KKST
GIT_REPOSITORY = "AC215_KKST" 

PROJECT_PATH = join(ROOT, MY_GOOGLE_DRIVE_PATH)

# It's good to print out the value if you are not sure 
print("PROJECT_PATH: ", PROJECT_PATH)       

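# Note: the template below returned 400 Bad Request for me, probably because it is not an f-string and its {placeholders} are never filled in
#GIT_PATH = "https://{GIT_TOKEN}@github.com/{GIT_USERNAME}/{GIT_REPOSITORY}.git"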
GIT_PATH = "https://" + GIT_TOKEN + "@github.com/" + GIT_USERNAME + "/" + GIT_REPOSITORY + ".git"
print("GIT_PATH: ", GIT_PATH)

PROJECT_PATH:  /content/drive/MyDrive/adv_practical_data_science/local-repo
mkdir: cannot create directory ‘/content/drive/MyDrive/adv_practical_data_science/local-repo’: File exists
GIT_PATH:  https://<GIT_TOKEN>@github.com/skgithub14/AC215_KKST.git
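
Printing GIT_PATH writes the token into the notebook's saved output, so you may prefer to mask it when checking the URL. The following is a minimal sketch under that assumption; `build_git_path` and `mask` are hypothetical helper names, and the token and user name are placeholders.

```python
def build_git_path(token, username, repository):
    """Build the HTTPS clone URL with the token embedded."""
    return f"https://{token}@github.com/{username}/{repository}.git"

def mask(url, token):
    """Replace the token with asterisks so the URL is safe to print."""
    return url.replace(token, "****") if token else url

url = build_git_path("TOKEN_PLACEHOLDER", "some-user", "AC215_KKST")
print(mask(url, "TOKEN_PLACEHOLDER"))  # https://****@github.com/some-user/AC215_KKST.git
```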


Now check that your project path is correct by changing your working directory to it. If successful, the cell will show the path to the new folder you created.

In [None]:
# change directory
# if everything is correct the cell will show the correct path you want the repo
%cd "{PROJECT_PATH}"

/content/drive/MyDrive/adv_practical_data_science/local-repo


## Step 6: Create a .env to store your GitHub credentials

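6a) Create a plain-text file and name it ".env". Save it in the local repo on your Google Drive at the top level of AC215_KKST. The AC215_KKST folder is created by the clone in Step 7, so save the file there once that step has run.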

6b) In the first line copy/paste your PROJECT_PATH

6c) In the second line copy/paste your GIT_PATH

6d) In the third line put the email address linked to your GitHub account

6e) In the fourth line type your first and last name (or any name you want to show up next to your changes on the remote)

Note that the repo's .gitignore file excludes all files named .env, so Git will not push them to the remote repo. Your GitHub credentials therefore stay on your Google Drive, where other group members cannot access them.
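
The four-line layout can also be written and checked from code. The sketch below is one possible approach, not the template's own method: `write_env`, `read_env`, and `ENV_KEYS` are hypothetical names, and the demonstration uses placeholder values in a temporary folder rather than your real credentials.

```python
import os
import tempfile

# The order of the lines in the .env file, as described in this step
ENV_KEYS = ("PROJECT_PATH", "GIT_PATH", "EMAIL", "GIT_USERNAME")

def write_env(path, project_path, git_path, email, name):
    """Write the four values, one per line, in the expected order."""
    with open(path, "w") as env:
        env.write("\n".join([project_path, git_path, email, name]) + "\n")

def read_env(path):
    """Read the file into a dict, failing loudly if a line is missing or empty."""
    with open(path) as env:
        lines = [line.strip() for line in env.read().splitlines()]
    if len(lines) < len(ENV_KEYS) or not all(lines[:len(ENV_KEYS)]):
        raise ValueError(f"{path} must contain {len(ENV_KEYS)} non-empty lines")
    return dict(zip(ENV_KEYS, lines))

# Demonstration with placeholder values in a temporary folder
demo_path = os.path.join(tempfile.mkdtemp(), ".env")
write_env(demo_path, "/content/drive/MyDrive/example",
          "https://TOKEN@github.com/user/AC215_KKST.git",
          "me@example.com", "First Last")
print(read_env(demo_path)["EMAIL"])  # me@example.com
```

Checking the line count up front gives a clearer error than an `IndexError` later if one of the four lines was left out.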

## Step 7: Clone the remote repo to your Google Drive

In [None]:
# clone repo
!git clone "{GIT_PATH}" # clone the github repository

Cloning into 'AC215_KKST'...
remote: Enumerating objects: 61, done.
remote: Counting objects: 100% (61/61), done.
remote: Compressing objects: 100% (52/52), done.
remote: Total 61 (delta 21), reused 19 (delta 1), pack-reused 0
Unpacking objects: 100% (61/61), done.


You have now successfully created a local repo on your Google Drive which is linked to the remote repo.
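
A quick way to confirm that the clone worked is to look for the `.git` folder that an ordinary clone creates inside the repo. The helper name `is_git_repo` below is hypothetical, and the check is deliberately simple.

```python
import os

def is_git_repo(path):
    """Return True if the folder contains a .git directory."""
    return os.path.isdir(os.path.join(path, ".git"))

# Check the current working directory; pass the path to AC215_KKST to check the clone
print(is_git_repo(os.getcwd()))
```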

## Step 8: Working with other Colab Notebooks


You shouldn't need to use this Colab notebook again. The remaining instructions tell you how to save other Colab notebooks to your local and remote repos. 

If you're creating a new Notebook for the project from scratch, open a new Colab notebook, name it, then go to File, select Move, then navigate to your local repo, then to the notebooks folder in the repo.  

Copy and paste the cells below into every Colab notebook for the project.

Follow the steps in the cells below to integrate your work into the group's remote repo and vice versa.

### Run these cells every time you open the notebook (or your kernel resets)

In [None]:
# Mount Google Drive
from google.colab import drive # import drive from google colab

ROOT = "/content/drive"     # default location for the drive
print(ROOT)                 # print content of ROOT (Optional)

drive.mount(ROOT, force_remount=True)           # we mount the google drive at /content/drive

In [None]:
# Integrate GitHub

# each group member should add their own ENV_PATH here
# comment out other people's ENV_PATHs
# Steve:
ENV_PATH = "/content/drive/MyDrive/adv_practical_data_science/local-repo/AC215_KKST/.env"
# Matt:
# Shih-ye:
# Al:
# Ed:

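# load environment variables (one value per line, in the order written in Step 6)
with open(ENV_PATH) as env:
  env_list = env.read().splitlines()
PROJECT_PATH = env_list[0]   # line 1: path to the folder that holds the local repo
GIT_PATH = env_list[1]       # line 2: clone URL containing your token
EMAIL = env_list[2]          # line 3: email linked to your GitHub account
GIT_USERNAME = env_list[3]   # line 4: name shown next to your commits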

# expand paths
REPO_PATH = PROJECT_PATH + '/AC215_KKST'
NOTEBOOK_DIR_PATH = REPO_PATH + '/notebooks'

# change directory to the local repo's notebook folder
%cd '{NOTEBOOK_DIR_PATH}'

In [None]:
# This will update your local repo with work other group members have done on the project so you can build on their work.
!git pull

### Run these cells when you want to update your local repo with your work:

In [None]:
# check the status of the files you changed
# this lists which changes are staged, which are not yet staged, and which files are untracked
!git status

In [None]:
# stage the files you changed
# this is where you tell Git which changes you want to include in your next commit

# stage all new and changed files in the current folder and its subfolders
!git add .

# stage changes to files Git already tracks (new files are not included)
#!git add -u

# stage specific files by name
#!git add {filename1}

# check the status again
# the files you staged now appear under "Changes to be committed"
!git status

In [None]:
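# set the email and name that will be attached to your commits
# (do this before committing; otherwise the commit can fail with an identity error)
!git config --global user.email "{EMAIL}"
!git config --global user.name "{GIT_USERNAME}"

# commit the staged changes to your local repo
!git commit -m "brief description of changes"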

### Run this cell when you want to update the remote from your local repo

In [None]:
# push the commits in your local repo to the develop branch of the group's remote repo
!git push origin develop

# push the commits in your local repo to the master branch of the group's remote repo
# note that only changes which have been thoroughly tested and validated should be pushed to the master branch
#!git push origin master

# you can also create additional branches for the remote repo as desired
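
Creating an additional branch usually takes two commands: one to create it locally and one to publish it to the remote. The sketch below only builds and prints those commands; the helper name `new_branch_commands` and the branch name `feature-data-cleaning` are hypothetical examples.

```python
import shlex

def new_branch_commands(branch, remote="origin"):
    """Return the commands that create a branch locally and publish it."""
    return [
        ["git", "checkout", "-b", branch],                  # create the branch and switch to it
        ["git", "push", "--set-upstream", remote, branch],  # publish it and start tracking it
    ]

# Print the commands; nothing is executed here
for cmd in new_branch_commands("feature-data-cleaning"):
    print(shlex.join(cmd))
```

In a Colab cell, each printed command would be run with a leading `!`, in the same way as the push cell above.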