<a href="https://colab.research.google.com/github/harvard-visionlab/onboarding/blob/main/notebooks/visionlab_starter_private_repos.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1AaBFZDYizf8mxVZ70LXWmkGznonE9FlA?usp=sharing)

In [None]:
from google.colab import userdata

repo = "harvard-visionlab/private-repo-test"

# Clone the repository using the GitHub token
repo_url = f"https://{userdata.get('GITHUB_TOKEN')}:x-oauth-basic@github.com/{repo}.git"
!git clone {repo_url}

Cloning into 'private-repo-test'...
remote: Invalid username or password.
fatal: Authentication failed for 'https://github.com/harvard-visionlab/private-repo-test.git/'


# one-time setup

## overview
You are going to generate an ssh key that you will use to identify yourself from google colab.

On a personal workstation (e.g., your laptop, or a lab computer), this ssh key would be stored on the device permanently (unless you choose to delete it), which makes life easy.

On google colab, anything you set up on the computer is "ephemeral": when google powers down your "instance", all data you created will be lost.

To deal with this, we're going to set up our ssh key, but then we're going to store it as a colab "Secret", which we can then use to "reinstall" our ssh key whenever we start up a colab notebook that needs access to our private repos.



## a) create your ssh key

Open the terminal, then run this, replacing "your_email@example.com" with the e-mail address you use for google colab.
```
ssh-keygen -t ed25519 -C "your_email@example.com"
```

Accept the defaults for each field (no passphrase needed, leave password blank):
```
Enter file in which to save the key (/root/.ssh/id_ed25519):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_ed25519
Your public key has been saved in /root/.ssh/id_ed25519.pub
```

Notice that you now have two files, super secret `/root/.ssh/id_ed25519`, and public `/root/.ssh/id_ed25519.pub`. The secret key should never leave your laptop. It shouldn't be saved anywhere that could ever possibly be leaked by you. So don't put it in a notebook, or google drive, etc. Keep it in that hidden `/root/.ssh/` folder.



## b) save keys as colab secrets
OK, next we are going to break our own rules, and set the ssh private and public keys as colab Secrets, because google is promising to keep these safe, and we're trusting them.

- Open the colab "Secrets" panel (key on the left menu bar)
-  Click "Add new secret", name it SSH_PRIVATE_BASE64
-  In the terminal, run
```
base64 /root/.ssh/id_ed25519
```
-  Copy and paste the full contents as the "Value" for SSH_PRIVATE_BASE64.

-  Click "Add new secret", name it SSH_PUBLIC_BASE64
-  In the terminal, run
```
base64 /root/.ssh/id_ed25519.pub
```
-  Copy and paste the full contents as the "Value" for SSH_PUBLIC_BASE64
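
To make the encoding step less mysterious, here is a minimal sketch of the base64 round trip in Python. The key text is a made-up placeholder (not a real key), and `encode_key` / `decode_key` are illustrative names, not part of any visionlab code.

```python
import base64

def encode_key(key_bytes: bytes) -> str:
    """Encode raw key-file bytes as a single line of base64 text."""
    return base64.b64encode(key_bytes).decode("ascii")

def decode_key(secret_value: str) -> bytes:
    """Recover the original key bytes from a pasted secret value."""
    return base64.b64decode(secret_value)

# Placeholder standing in for the multi-line contents of a private key file.
fake_key = (
    b"-----BEGIN OPENSSH PRIVATE KEY-----\n"
    b"not-a-real-key\n"
    b"-----END OPENSSH PRIVATE KEY-----\n"
)

secret_value = encode_key(fake_key)
round_trip_ok = decode_key(secret_value) == fake_key

# Line-wrapped base64 (newlines inside the value) decodes to the same bytes,
# because b64decode discards non-alphabet characters by default.
wrapped_value = base64.encodebytes(fake_key).decode("ascii")
wrapped_ok = decode_key(wrapped_value) == fake_key

print(round_trip_ok, wrapped_ok)
```

The point of the encoding is that a multi-line key file becomes plain text that survives copy and paste, and decoding restores it byte for byte.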



## c) add your public key to github

Add the public key to your github account:
- https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account

# usage: use keys for private repo access

Now that your keys are created, you never have to create them again (unless your keys are compromised and you want to deactivate them; in that case, just delete the public key from your github account).

However, Colab will "forget" your keys when a session ends. So every new session you need to run a few lines of code to set up private repo access.

## Step 1: copy keys to files
So let's assume we are starting in that "blank" state. First we need to copy our keys from the "Secrets" to that hidden /root/.ssh folder:

In [None]:
import os
import base64
from google.colab import userdata

!mkdir -p /root/.ssh

# Write the secret to a file
with open('/root/.ssh/id_ed25519.pub', 'wb') as file:
    file.write(base64.b64decode(userdata.get('SSH_PUBLIC_BASE64')))

with open('/root/.ssh/id_ed25519', 'wb') as file:
    file.write(base64.b64decode(userdata.get('SSH_PRIVATE_BASE64')))

# Set the correct permissions for the private key file
os.chmod('/root/.ssh/id_ed25519', 0o600)
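
If you want to try the write-and-chmod logic without touching `/root/.ssh` or real secrets, here is a minimal sketch that does the same thing in a throwaway directory. `install_key` is a hypothetical helper name, and the "secret" is placeholder data rather than a value from `userdata.get`.

```python
import base64
import os
import stat
import tempfile

def install_key(ssh_dir: str, filename: str, b64_value: str, mode: int) -> str:
    """Decode a base64 secret and write it to ssh_dir/filename with the given permissions."""
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, filename)
    with open(path, "wb") as f:
        f.write(base64.b64decode(b64_value))
    os.chmod(path, mode)
    return path

# Demo with placeholder data in a temporary directory (deleted afterwards).
placeholder_secret = base64.b64encode(b"placeholder key contents").decode("ascii")
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = os.path.join(tmp, ".ssh")
    private_path = install_key(demo_dir, "id_ed25519", placeholder_secret, 0o600)
    with open(private_path, "rb") as f:
        written = f.read()
    private_mode = stat.S_IMODE(os.stat(private_path).st_mode)

print(written, oct(private_mode))
```

The `0o600` mode matters because ssh refuses to use a private key file that other users can read.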

## Step 2: Then we start our computer's ssh agent

In [None]:
!eval "$(ssh-agent -s)"

Agent pid 629


## Step 3: Then we tell the local ssh system that it can trust github.com

In [None]:
!ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts

# github.com:22 SSH-2.0-babeld-acf36093d
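
Because `>>` appends, rerunning that cell adds another copy of the same host key line to `known_hosts`. That is harmless, but if you want to check first, here is a minimal sketch of such a check. `host_is_known` is a hypothetical helper, the sample text is a placeholder, and it assumes plain (unhashed) entries where the host name is the first field on the line.

```python
def host_is_known(known_hosts_text: str, host: str) -> bool:
    """Return True if any non-comment line lists `host` in its first field."""
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        hosts_field = line.split()[0]
        # The first field may hold several comma-separated host names.
        if host in hosts_field.split(","):
            return True
    return False

# Placeholder text standing in for the contents of ~/.ssh/known_hosts.
sample = "# a comment line\ngithub.com ssh-rsa AAAAB3-placeholder\n"

print(host_is_known(sample, "github.com"))
print(host_is_known(sample, "example.org"))
```

In a notebook you would read the real file (if it exists) and pass its text to the function before deciding whether to run `ssh-keyscan` again.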


## Step 4: You can test your ssh key

In [None]:
!ssh -T git@github.com

Hi grez72! You've successfully authenticated, but GitHub does not provide shell access.


## Step 5: Configure your github user

In [None]:
!git config --global user.email "grez72@gmail.com"
!git config --global user.name "George Alvarez"
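
Note the quotes around the name: without them, the shell would split "George Alvarez" into two separate arguments. If you ever build these commands from Python variables, a minimal sketch like the following keeps the quoting safe. `git_identity_commands` is a hypothetical helper, and the email and name are placeholders.

```python
import shlex

def git_identity_commands(email: str, name: str) -> list:
    """Build shell-safe `git config` command strings for the given identity."""
    return [
        "git config --global user.email " + shlex.quote(email),
        "git config --global user.name " + shlex.quote(name),
    ]

commands = git_identity_commands("jane@example.com", "Jane Doe")
for command in commands:
    print(command)

# Splitting the command the way a shell would keeps the full name as one argument.
name_argument = shlex.split(commands[1])[-1]
```

Each string could then be run with `!{command}` in a notebook cell, or passed to `subprocess.run(shlex.split(command))`.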

## Step 6: verify your private repo access

Run the following to verify/test your private repo access (you really only need to do this when you are testing the system; for actual work the next step would be to clone the repo you want to work with).

In [None]:
!git clone git@github.com:harvard-visionlab/private-repo-test.git

Cloning into 'private-repo-test'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 12 (delta 3), reused 7 (delta 2), pack-reused 0 (from 0)
Receiving objects: 100% (12/12), done.
Resolving deltas: 100% (3/3), done.


In [None]:
%cd private-repo-test

/content/private-repo-test


In [None]:
!mkdir -p tests

In [None]:
test_filename = "tests/test_grez72_2024_08_26_take3.txt" # make this unique to you so you can add a new file to the repo!
!touch {test_filename}
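
If you would rather not type the unique filename by hand, here is a minimal sketch that builds one from a username and a date, following the naming pattern used above. `unique_test_filename` is a hypothetical helper, not something provided by the repo.

```python
from datetime import date

def unique_test_filename(username: str, day: date, suffix: str = "") -> str:
    """Build a filename like tests/test_<username>_<YYYY_MM_DD>[_<suffix>].txt."""
    stamp = day.strftime("%Y_%m_%d")
    tail = f"_{suffix}" if suffix else ""
    return f"tests/test_{username}_{stamp}{tail}.txt"

example = unique_test_filename("grez72", date(2024, 8, 26), "take3")
print(example)
# prints tests/test_grez72_2024_08_26_take3.txt
```

In practice you would pass your own github username and `date.today()`.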

In [None]:
!git add .
!git commit -m 'test private repo cloning'
!git push origin main

[main 48e59b1] test private repo cloning
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/test_grez72_2024_08_26_take3.txt
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 2 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 296 bytes | 296.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:harvard-visionlab/private-repo-test.git
   34d6dfb..48e59b1  main -> main


# test2: disconnect runtime

If this is your first time setting up/testing private repo access, you should test that you can run this in a fresh "runtime". To do so, go to Runtime => Disconnect and delete runtime.

Then run Steps 1-6 again and verify that everything still works. If so, you are now set up to use private repos. Any private repo that you have access/permissions to should be cloneable to your colab environment.



# test3: setup using snippets

If you want to get really fancy, you can set up "code snippets": reusable code that you can import from other notebooks, e.g., the standard visionlab snippets notebook. This makes setting up your ssh keys in a new notebook a simple 2-step process:

- 1. import the code snippet (search "setup_ssh_keys" in the snippet toolbar)
- 2. run setup_ssh_keys (with your own github email and username)

```
setup_ssh_keys("grez72@gmail.com", "George Alvarez")
```


## setup instructions

To try this out, I would suggest yet again going to Runtime => "Disconnect and delete runtime".

We have a number of standard "visionlab" code snippets that you might want to use. You can see them at https://colab.research.google.com/drive/16w0Q-KFqH98zFKtp9w3HGO31wSToRvS6#scrollTo=qQ0ALKl3Vm6m.

To set up access to these snippets in colab, go to Tools => Settings, then under "custom snippet notebook URL", paste the url of the snippets notebook. You can paste multiple URLs (one at a time), e.g., if you want to write your own "snippets" you can paste the url for your own notebook(s) too.

Once the url is pasted, you might have to refresh the page for the current notebook to be able to search your snippet notebook(s).

In the menu bar on the left, click on the < > icon to open up snippets. Visionlab snippets are prefaced with "visionlab", so you can start by typing visionlab. Once you see the snippet you want (here, "setup_ssh_keys"), highlight it and click "insert", and the code should be pasted in your notebook. It will look like this:
```
def setup_ssh_keys(github_email, github_username):
    import os
    import base64
    from google.colab import userdata

    !mkdir -p /root/.ssh

    # Write the secret to a file
    with open('/root/.ssh/id_ed25519.pub', 'wb') as file:
        file.write(base64.b64decode(userdata.get('SSH_PUBLIC_BASE64')))

    with open('/root/.ssh/id_ed25519', 'wb') as file:
        file.write(base64.b64decode(userdata.get('SSH_PRIVATE_BASE64')))

    # Set the correct permissions for the private key file
    os.chmod('/root/.ssh/id_ed25519', 0o600)

    !eval "$(ssh-agent -s)"
    !ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
    !ssh -T git@github.com

    # Quote the values so a name containing spaces is passed as one argument
    !git config --global user.email "{github_email}"
    !git config --global user.name "{github_username}"
```

So go ahead and run it with your github email and username, and you should see output that looks like this:
```
Agent pid 418
# github.com:22 SSH-2.0-babeld-acf36093d
Hi grez72! You've successfully authenticated, but GitHub does not provide shell access.
```

Then run the private-repo-access test again:
```
!git clone git@github.com:harvard-visionlab/private-repo-test.git
%cd private-repo-test
test_filename = "tests/test_<unique test filename>.txt"
!touch {test_filename}
!git add .
!git commit -m 'test private repo cloning'
!git push origin main
```

# Setting up SSH on your laptop or lab workstations or the cluster

Note that you can do the same ssh setup from any workstation where you have a password-protected user account, i.e., a machine where only you can ever go to your `~/.ssh` folder to see your private ssh keys (your laptop; your cluster home directory; a workstation where you have a password-protected user login). The only difference is that you won't have to put the keys in the "Secrets" manager (skip Step 1, because once created the keys persist on your machine's hard drive), and you only have to do Steps 2-5 once (generally right after you create the keys). In other words, after initial setup on, say, your laptop, you'll skip Steps 1-5 and jump directly to cloning/accessing your private repos on your machine.
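
On a shared machine it is worth confirming that your private key file really is readable only by you. Here is a minimal sketch of such a check for Linux/macOS. `private_key_is_locked_down` is a hypothetical helper, and the demo uses a temporary placeholder file rather than a real key.

```python
import os
import stat
import tempfile

def private_key_is_locked_down(path: str) -> bool:
    """True if the file exists and grants no permissions to group or others."""
    if not os.path.isfile(path):
        return False
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

# Demo on a throwaway file standing in for a private key.
with tempfile.TemporaryDirectory() as tmp:
    demo_key = os.path.join(tmp, "id_ed25519")
    with open(demo_key, "wb") as f:
        f.write(b"placeholder")

    os.chmod(demo_key, 0o644)  # readable by everyone: too open
    open_result = private_key_is_locked_down(demo_key)

    os.chmod(demo_key, 0o600)  # owner read/write only
    locked_result = private_key_is_locked_down(demo_key)

print(open_result, locked_result)
```

For a real check you would call it on your own key, e.g. `private_key_is_locked_down(os.path.expanduser("~/.ssh/id_ed25519"))`, assuming you accepted the default file location when creating the key.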