# Getting Comfortable with Colaboratory
---

Google Colaboratory (Colab) is a free online Python notebook environment that runs on a CPU or GPU in the cloud (i.e., on a server at Google). If you have a Google account, you can code in Colab without downloading an IDE (Integrated Development Environment) or installing any packages on your local machine.

Colab Notebooks are functionally very similar to Jupyter Notebooks (with a few exceptions for uploading/downloading to/from local files). They also work much like Google Docs: you can share notebooks to collaborate with group members in real time and comment on blocks of code/text.
<br></br>

Here are some basics for working in Colab:

* Hit Ctrl/⌘ + M + H to show Colab's keyboard shortcuts. You can even **customize shortcuts** to your own liking. Here are some examples of default shortcuts in Colab:
  * edit cell: enter
  * run current cell: shift + enter
  * insert code cell above: Ctrl/⌘ + M + A
  * insert code cell below: Ctrl/⌘ + M + B
  * delete cell/selection: Ctrl/⌘ + M + D
  * interrupt execution: Ctrl/⌘ + M + I
  * convert to code cell: Ctrl/⌘ + M + Y
  * convert to text cell: Ctrl/⌘ + M + M
* To run all cells in a notebook use Runtime > Run All.
* Move the cell you are currently editing up or down in the notebook's cell order by clicking the up/down arrows in the top right corner. 
* Download your current notebook to your local drive using File > Download .ipynb 
* To clear a code cell's output click on the 3 dots in the top right corner and select "clear output".

# Packages
---

Colab already has most common packages installed and ready for import, but if it doesn't have the particular package you need, you can install it using `!pip install mypackage` from inside a notebook cell.
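If you want to check whether a package is already available before reaching for `!pip install`, you can ask Python directly. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `is_installed` is only illustrative. Keep in mind that the import name can differ from the pip name (for example, `scikit-learn` is imported as `sklearn`).

```python
import importlib.util

def is_installed(module_name):
    """Return True if the top-level module `module_name` can be imported here."""
    return importlib.util.find_spec(module_name) is not None

# "json" ships with Python; "tensorflow" depends on the runtime you are using
for name in ["json", "tensorflow"]:
    if is_installed(name):
        print(name, "-> available")
    else:
        print(name, "-> missing (try: !pip install " + name + ")")
```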

## EXERCISE

Try installing `tensorflow` in the cell below. It should output "Requirement already satisfied", meaning that the package is already installed on the remote machine you are using.

In [None]:
# YOUR CODE HERE

# Downloading/Uploading Locally in Colab
---

Although your code runs on a remote CPU/GPU, you can still upload files from, and download files to, your local drive. Colab provides two special commands for this, and both require importing `files` from `google.colab`. Note that these commands only work in Colab, not in a local IDE, and the upload/download code you would normally use on your own machine will not work in Colab.

*Note: if you are loading data from a URL into a variable, you can do so the same way you would outside of Colab (e.g. `pd.read_csv(url)`).*

## Uploading File(s)

There are 3 different ways you can upload files to work with from Colab.

1.


```python
from google.colab import files

uploaded = files.upload()
```

Whenever `files.upload()` is run, a GUI popup will appear in the output. Select "Choose Files" and select the file(s) from your machine.
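`files.upload()` returns a dictionary that maps each uploaded filename to its contents as bytes. The sketch below shows one way to inspect that result. `describe_uploads` is a hypothetical helper, and the `uploaded` dictionary here is a hand-made stand-in so the cell also runs outside Colab.

```python
def describe_uploads(uploaded):
    """Return {filename: size in bytes} for a files.upload() style result."""
    return {name: len(data) for name, data in uploaded.items()}

# Stand-in for the result of `uploaded = files.upload()` in Colab
uploaded = {"notes.txt": b"hello colab\n"}

for name, size in describe_uploads(uploaded).items():
    print(f"{name}: {size} bytes")

# The contents of a text file can be decoded from bytes into a string
text = uploaded["notes.txt"].decode("utf-8")
print(text)
```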
<br></br>

2.

You can also upload file(s) to your Colab notebook through its UI.
1. Click on the files icon on the left side of your screen (it should be the third one down). That should open up a side bar and after a few seconds, 3 icons should appear lined up horizontally below "Files".
2. Select the leftmost icon to upload your files.

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_Public/master/Images/Week1/colab00.png)
<br></br>

3.

You can also let your Colab notebook access files from your Google Drive, in other words, mount your Drive.
1. Instead of selecting the file upload icon, select the rightmost one, which is labeled "Mount Drive" when you hover over it.
2. The first time you mount your Drive for a notebook, clicking the icon should make a code cell appear. Running this cell prints a link in its output. Click the link to open another tab that asks for permission to access your files. Once you've granted permission, copy the code you are given, paste it into the box in the code cell's output, and hit Enter.

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_Public/master/Images/Week1/colab01.png)
![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_Public/master/Images/Week1/colab02.jpg)
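The code cell that Colab generates for mounting typically looks like the sketch below. `drive.mount` is part of `google.colab`; the `/content/drive` mount point is the usual default but is an assumption here, and `in_colab` and `mount_drive` are hypothetical wrappers that let the cell run harmlessly outside Colab.

```python
def in_colab():
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (only available inside Colab)
        return True
    except ImportError:
        return False

def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive in Colab; return the mount point, or None elsewhere."""
    if not in_colab():
        return None
    from google.colab import drive
    drive.mount(mount_point)  # asks for authorization the first time
    return mount_point

mounted_at = mount_drive()
print("Drive mounted at:", mounted_at)
```

Once mounted, your Drive files can be read with ordinary Python file paths under the mount point.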

## Downloading File(s)

To download a file, call `files.download('filename.ext')` and the file will be saved to your Downloads folder. The filename you pass must refer to a file that already exists on the remote machine you are using.

```python
from google.colab import files

files.download('filename.ext')
```
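Because `files.download` can only send a file that already exists on the runtime, a common pattern is to write the file first and then download it. This is a minimal sketch, assuming a hypothetical filename `results.txt`; the download step is skipped when the code is not running in Colab.

```python
import os

filename = "results.txt"  # hypothetical filename, for illustration only

# Write the file on the runtime first so there is something to download
with open(filename, "w") as f:
    f.write("hello from colab\n")

if os.path.exists(filename):
    try:
        from google.colab import files
        files.download(filename)  # hands the file to your browser to save
    except ImportError:
        # Outside Colab there is nothing to download; just report the path
        print("Not running in Colab; file is at", os.path.abspath(filename))
```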

## EXERCISE
Create a .txt file somewhere on your local drive and upload it to Colab. Remember to assign the result of the upload to a variable. Then download the same file you just uploaded.

In [None]:
# Upload file
# YOUR CODE HERE

In [None]:
# Download file
# YOUR CODE HERE

# GPU Use
---

Colab has (limited) free GPU support. 
> To enable GPU support use Edit > Notebook Settings > Hardware Accelerator > GPU.

When switching between CPU and GPU, all your local variables will be cleared, so **don't switch in the middle of running something unless you are prepared to reload all your local variables**. 

Because the GPU is free and public, sometimes many people try to use it at the same time. This can leave you with fewer GPU resources to work with. If you get a "ResourcesExhausted" error while using the GPU, you have used all the GPU resources dedicated to your session and you will have to restart your runtime (Runtime > Restart Runtime). After restarting your runtime, your local variables will be wiped.
> **You do not need GPU power unless you're doing something computationally intensive, like training a large neural network**. My advice would be to not use the GPU unless you notice your runtime slowing down significantly or you know you will need it during your session. 
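One way to confirm whether your runtime actually has a GPU attached is to look for NVIDIA's `nvidia-smi` tool, which is normally present on GPU runtimes. This is a sketch under that assumption; `list_gpus` is a hypothetical helper, and frameworks such as TensorFlow offer their own checks.

```python
import shutil
import subprocess

def list_gpus():
    """Return one description line per GPU reported by nvidia-smi, or []."""
    if shutil.which("nvidia-smi") is None:
        return []  # tool not found: most likely a CPU-only runtime
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return []  # tool exists but could not talk to a GPU
    return [line for line in result.stdout.splitlines() if line.strip()]

gpus = list_gpus()
print("GPUs found:", gpus if gpus else "none (CPU runtime)")
```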

# Troubleshooting Colab
---

* If you get a "ResourcesExhausted" error, you have used all the GPU resources dedicated to your session and you will need to restart your runtime (Runtime > Restart Runtime). After restarting your runtime, your local environment will be wiped.
* If you get a warning window that says you are close to reaching the session's memory limit, this means you are nearing a ResourcesExhausted error. You can still run code until the error occurs, but note that you will likely need to restart your runtime soon.
* Colab will automatically end your runtime after 12 hours. Simply restart it if this happens.
* If things are running really slowly, double-check that you are using the GPU.
* If you run into trouble and a cell is taking too long to run, use Runtime > Interrupt execution. You may have to wait a minute for the interruption to take effect.
* If all else fails, try Runtime > Restart Runtime. This will wipe your local variables.
* Google is your best friend! Learning how to search Google effectively and efficiently for coding queries is a vital skill.
* You can always ask an instructor for help
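
Since restarting the runtime wipes your variables, it can help to save expensive results to disk beforehand and load them again afterwards. Below is a minimal sketch using the standard `pickle` module. The `checkpoint.pkl` filename and the contents of `state` are illustrative only, and this assumes the runtime's files are still there after the restart; they may not survive switching hardware or a session that has ended, in which case a mounted Drive folder is a safer place to save.

```python
import pickle

state = {"epoch": 3, "notes": "illustrative values only"}

# Before restarting: write the variables to a file on the runtime
with open("checkpoint.pkl", "wb") as f:
    pickle.dump(state, f)

# After restarting: read them back into a variable
with open("checkpoint.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored)
```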

# Saving Your Progress
---

Since we'll be pushing out notebooks and other material in public class repositories on **GitHub**, you will have to make your own **fork** of them each week. This way you'll be able to fetch any changes we make to the class repo while still being able to push edits to your own workspace. Conveniently, Colab also lets you push your notebook edits/commits directly to GitHub! Feel free to refer back to this notebook at any time throughout the course.

# Medlytics GitHub 101
---

## 1. Setting up your fork 🍴

*You only have to complete this first section once (in the beginning of the week)*

### 1a. Making the fork 
Just head over to the **public class'** GitHub repository and click the "Fork" button in the upper right corner. 

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab2.png)

After a few seconds you should be redirected to your fork. The top of your page should look something like this:

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab3.JPG)

### 1b. Making a local clone of *your fork*
First, click on the green "Clone" button and copy the link that drops down. *Again, make sure you're on the page of **your fork**, not the class repository.* 

Once you have that, open your terminal or Git Bash and clone your fork by running the following command:

```shell
git clone [INSERT LINK] 
# Your repository link should look something like https://github.com/[YOUR USERNAME]/[WeekNUMBER]_Public
```

### 1c. Adding an upstream

In order to keep your fork up to date with the class repository when we push new notebooks and files each day, you'll need to set up the original class repository as your "upstream". *Make sure you insert the link of the **class** repository and not your fork.*

```shell
# First make sure you're in the right directory
cd Week[NUMBER]_Public
# Now add the upstream
git remote add upstream [INSERT LINK] 
# Class link should look something like https://github.com/BeaverWorksMedlytics2020/[WeekNUMBER]_Public
```

Make sure your origin and upstream are all set by running:

```shell
git remote -v
```

You should see something like this, where your origin is linked to your fork (as seen by your username) and upstream is set to the original class repository (as seen by BeaverWorksMedlytics2020):

```
origin  https://github.com/emilygtan/Week1_Public.git (fetch)
origin  https://github.com/emilygtan/Week1_Public.git (push)
upstream        https://github.com/BeaverWorksMedlytics2020/Week1_Public (fetch)
upstream        https://github.com/BeaverWorksMedlytics2020/Week1_Public (push)
```
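
If you'd rather check this from a notebook cell than by eye, the output of `git remote -v` is easy to parse. The sketch below uses a hypothetical helper (`parse_remotes` is not part of Git or Colab) applied to the sample output above; in practice you could feed it the text captured from the real command.

```python
def parse_remotes(output):
    """Map each remote name to its fetch URL from `git remote -v` text."""
    remotes = {}
    for line in output.splitlines():
        parts = line.split()
        # Each line looks like: <name> <url> (fetch) or <name> <url> (push)
        if len(parts) == 3 and parts[2] == "(fetch)":
            remotes[parts[0]] = parts[1]
    return remotes

sample = """\
origin  https://github.com/emilygtan/Week1_Public.git (fetch)
origin  https://github.com/emilygtan/Week1_Public.git (push)
upstream        https://github.com/BeaverWorksMedlytics2020/Week1_Public (fetch)
upstream        https://github.com/BeaverWorksMedlytics2020/Week1_Public (push)
"""

remotes = parse_remotes(sample)
# upstream should point at the class organization, origin at your own fork
print("upstream ok:", "BeaverWorksMedlytics2020" in remotes["upstream"])
print("origin ok:", "BeaverWorksMedlytics2020" not in remotes["origin"])
```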

## 2. Keeping your fork up to date

Once you've edited your Colab exercise notebooks, you'll first need to pull those updates into your local copy. Then you can fetch the changes from upstream, merge them into your local master branch, and push the updated local branch to your remote repository.

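```shell
# Pull the edits you saved from Colab into your local clone
git pull

# Download the latest changes from the class repository
git fetch upstream

# Switch to your local master branch
git checkout master

# Only needed if you have local edits to commit (stage them with "git add" first)
git commit -m "[YOUR COMMIT MESSAGE HERE]"

# Merge the class repository's changes into your local master
git merge upstream/master

# Push the updated master branch to your fork
git push origin master
```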

## Git Logic Chart

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_Public/master/Images/Week1/gitlogic.png)

# Committing/Pushing to GitHub with Colab
---

First, click on File in the top left corner of your notebook and go down to "Save a copy in GitHub".

*Note: Make sure you've now opened up **your personal fork's notebook** and not the class repository's notebook, since you won't have permissions to push or save changes to the public class repository*

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_Public/master/Images/Week1/colab4.jpg)

The first time you use this feature, you will most likely get a pop-up asking for permission to access your GitHub account through Colab. Once you've granted it, you'll see a new pop-up in the middle of the screen for your commit, where you can choose which branch to push to and edit your commit message.

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_Public/master/Images/Week1/colab5.png)

Once you click "Ok" (aka push your commit), you'll be redirected to the page of the file you edited on GitHub. To confirm that it worked, check that the commit message you just wrote in Colab appears as the latest commit.

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_Public/master/Images/Week1/colab6.jpg)
