# GitHub workflow for MLflow projects run on Colab

Use this notebook to:
1. Start new projects linked to GitHub (by cloning or pulling from GitHub)
2. Push any code changes made in Colab to GitHub
3. Guide the workflow of MLtools projects (if appropriate)

The basic flow of this notebook is as follows:

1. Set the project variables (such as the project name).
2. Read the GitHub password as input (so it doesn't accidentally get saved to the notebook).
3. If running the notebook to commit, set the commit message in the flags cell near the top.
4. Try to clone the repository; if it already exists, pull instead to update it to the latest version.
5. Push any changes made in Colab to GitHub.

In [2]:
# Change the following details to the active project

GLOBAL_PARENT_DIRECTORY = '/content/drive/My Drive/projects/'
GLOBAL_PROJECT_NAME = 'mlflow-example'
GLOBAL_GIT_REPOSITORY = 'https://github.com/asrishaheem/mlflow-example.git'
GLOBAL_GITHUB_USERNAME = 'asrishaheem'

In [13]:
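from getpass import getpass
# getpass keeps the secret out of the saved notebook. GitHub no longer accepts
# account passwords for Git over HTTPS, so enter a personal access token here.
GLOBAL_GITHUB_PASSWORD = getpass('Enter GitHub password:')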

Enter GitHub password:··········


In [4]:
run_pull = True
run_push = True
run_commit = True

# If running commit, add the commit message here
GLOBAL_COMMIT_MESSAGE = '"Added run_mlflow_on_colab_github.ipynb"'
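The commit message above carries its own double quotes because it is later interpolated into a shell command. One possible alternative, sketched here on the assumption that the message is stored without embedded quotes, is to let `shlex.quote` produce the shell-safe form. The names `quote_commit_message`, `raw_message` and `quoted_message` are hypothetical and not part of the notebook.

```python
import shlex


def quote_commit_message(message: str) -> str:
    """Quote a message so a POSIX shell passes it to git as a single argument."""
    return shlex.quote(message)


# Hypothetical unquoted message; the notebook stores it with embedded quotes instead.
raw_message = "Added run_mlflow_on_colab_github.ipynb"
quoted_message = quote_commit_message(raw_message)
print(quoted_message)
```

With this approach the commit cell would interpolate `{quoted_message}` instead of `{GLOBAL_COMMIT_MESSAGE}`, and messages containing quotes or other shell metacharacters would still arrive at git intact.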

# Section 1: Setup

In [5]:
import os

if 'google.colab' in str(get_ipython()):

    print('Running on CoLab')

    from google.colab import drive
    drive.mount("/content/drive")

    path = GLOBAL_PARENT_DIRECTORY
    #path = os.path.join(GLOBAL_PARENT_DIRECTORY, GLOBAL_PROJECT_NAME)

    # if not os.path.exists(path):
    #     os.mkdir(path)

    if os.getcwd() != path:
        os.chdir(path)

else:
    print('This notebook is for Colab development only')


Running on CoLab
Mounted at /content/drive
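The setup cell detects Colab by inspecting `str(get_ipython())`, which only works inside IPython. A minimal alternative sketch is shown below. It assumes that the Colab runtime has already loaded the `google.colab` package into the kernel, and `running_on_colab` is a hypothetical helper name.

```python
import sys


def running_on_colab() -> bool:
    """Return True when the google.colab package is already loaded in this process."""
    # Assumption: the Colab kernel imports google.colab at startup.
    return "google.colab" in sys.modules


if running_on_colab():
    print("Running on Colab")
else:
    print("This notebook is for Colab development only")
```

Unlike the `get_ipython()` check, this version does not raise a `NameError` when the code is run as a plain Python script.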


In [6]:
!pwd

/content/drive/My Drive/projects


## Clone repository

If the project directory exists, we will assume the project has already been cloned and we skip to the pull section. Otherwise, we clone the project. 
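The decision described above can be sketched as a small function. This version is slightly stricter than the notebook's check: it looks for a `.git` directory inside the project folder, so a folder left behind by a failed clone is not mistaken for a repository. The function name `choose_git_action` and the `"inspect"` outcome are illustrative assumptions, not part of the notebook.

```python
import os
import tempfile


def choose_git_action(parent_dir: str, project_name: str) -> str:
    """Return 'clone', 'pull', or 'inspect' for the given project location."""
    project_path = os.path.join(parent_dir, project_name)
    if os.path.isdir(os.path.join(project_path, ".git")):
        return "pull"      # already a repository: update it
    if os.path.exists(project_path):
        return "inspect"   # folder exists but holds no repository
    return "clone"         # nothing there yet


# Demonstration in a throwaway directory instead of Google Drive.
with tempfile.TemporaryDirectory() as parent:
    print(choose_git_action(parent, "mlflow-example"))
    os.makedirs(os.path.join(parent, "mlflow-example", ".git"))
    print(choose_git_action(parent, "mlflow-example"))
```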

In [7]:
!git config --global user.email "asrishaheem@gmail.com"
!git config --global user.name "asrishaheem"

In [8]:
path = os.path.join(GLOBAL_PARENT_DIRECTORY, GLOBAL_PROJECT_NAME)

if not os.path.exists(path):
    print("Project does not exist in Colab. Attempting to clone repository.")
    !git clone '{GLOBAL_GIT_REPOSITORY}'
    os.chdir(path)

    # Also check the status
    !git status
else:
    print("Project already exists in Colab")
    os.chdir(path)

# Remote set-url

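Set the GitHub credentials and remote URL. Note that `git remote set-url` writes the URL, including the embedded credential, in plain text to the repository's `.git/config`.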

In [14]:
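# Note: a failing `!` shell command does not raise a Python exception, so the
# except branch below will not report a failed git command.
try:
    !git remote set-url origin https://{GLOBAL_GITHUB_USERNAME}:{GLOBAL_GITHUB_PASSWORD}@github.com/{GLOBAL_GITHUB_USERNAME}/{GLOBAL_PROJECT_NAME}.git
except Exception:
    print("Remote set-url failed")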
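The remote URL above is assembled by plain string interpolation, so a credential containing characters such as `@` or `/` would produce a malformed URL. A hedged sketch of one way to guard against that is to percent-encode the user and secret first. `build_remote_url` and its `host` parameter are hypothetical names introduced for illustration.

```python
from urllib.parse import quote


def build_remote_url(username: str, secret: str, project_name: str,
                     host: str = "github.com") -> str:
    """Build an HTTPS remote URL with the credential percent-encoded."""
    # safe="" encodes every reserved character, including "/" and "@".
    user = quote(username, safe="")
    token = quote(secret, safe="")
    return f"https://{user}:{token}@{host}/{username}/{project_name}.git"


# Made-up credential, used only to show the encoding.
print(build_remote_url("alice", "p@ss/word", "demo"))
```

The resulting string could then be passed to `git remote set-url origin` in place of the hand-built URL.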

## Pull latest changes to Colab

In [None]:
# Pull latest copy from repository
if run_pull:
    !git pull
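A `!git ...` line prints its output but does not raise when git exits with an error, so later cells keep running after a failed pull. A minimal sketch of a stricter alternative, using `subprocess.run` from the standard library, is shown below. `run_checked` is a hypothetical helper, and the demonstration runs the Python interpreter rather than git so that it works anywhere.

```python
import subprocess
import sys


def run_checked(command, cwd=None):
    """Run a command, return its stdout, and raise RuntimeError on a non-zero exit."""
    result = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{' '.join(command)} failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout


# In the notebook this might be: run_checked(["git", "pull"], cwd=path)
print(run_checked([sys.executable, "-c", "print('ok')"]).strip())
```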

# Push changes made on Colab to repository

In [18]:
# Check if local branch is up to date
if run_push:
    !git status

On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   run_from_colab.ipynb

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	mlruns/
	run_mlflow_on_colab_github.ipynb

no changes added to commit (use "git add" and/or "git commit -a")
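The status output above is written for people. To decide in code whether there is anything to commit, one option is to parse the machine-readable `git status --porcelain` format, where each line is a two-character state followed by a path. The sketch below is an illustration only: `summarize_porcelain` is a hypothetical helper, and the sample text is hand-written to mirror the status shown above rather than captured from git.

```python
def summarize_porcelain(output: str) -> dict:
    """Group paths from `git status --porcelain` (v1) output by state."""
    summary = {"staged": [], "unstaged": [], "untracked": []}
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        index_state, worktree_state, path = line[0], line[1], line[3:]
        if index_state == "?" and worktree_state == "?":
            summary["untracked"].append(path)
            continue
        if index_state == "!":
            continue  # ignored files
        if index_state != " ":
            summary["staged"].append(path)    # change recorded in the index
        if worktree_state != " ":
            summary["unstaged"].append(path)  # change only in the working tree
    return summary


# Hand-written sample mirroring the status output above.
sample = " M run_from_colab.ipynb\n?? mlruns/\n?? run_mlflow_on_colab_github.ipynb\n"
print(summarize_porcelain(sample))
```

A commit step could then be skipped whenever `summary["staged"]` is empty, which is the situation git reports as "no changes added to commit". Renamed entries (shown by git as `old -> new`) are kept as a single string in this sketch.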


In [None]:
# Stage changes for commit
if run_push:
    !git add --all
    !git status

On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   run_from_colab.ipynb



In [15]:
# Commit the staged changes
if run_push:
    !git commit -m {GLOBAL_COMMIT_MESSAGE}

On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
	modified:   run_from_colab.ipynb

Untracked files:
	mlruns/
	run_mlflow_on_colab_github.ipynb

no changes added to commit


In [16]:
# Push committed changes to the remote branch
if run_push:
    !git push

Everything up-to-date


In [20]:
!git pull

remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 2 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (2/2), done.
From https://github.com/asrishaheem/mlflow-example
   db93cb9..c594cca  master     -> origin/master
Updating db93cb9..c594cca
Fast-forward
 .gitignore | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 MLproject  |   1 +
 train.py   |  12 +++++---
 3 files changed, 110 insertions(+), 4 deletions(-)
 create mode 100644 .gitignore


In [None]:
!git commit -m "update test"

[master cfab90f] update test
