# Getting Comfortable with Colaboratory!

Google Colaboratory (Colab) is a free online Python notebook environment that runs on a CPU or GPU in the cloud (i.e., on a server at Google). If you have a Google account, you can code with Colab without downloading an IDE (Integrated Development Environment) or installing any packages on your local machine.
<br></br>

Colab notebooks are functionally very similar to Jupyter notebooks (with a few exceptions for uploading/downloading to/from local files). They also share features with Google Docs/Sheets/Slides: you can collaborate with group members in real time and comment on blocks of code/text.
<br></br>

Here are some basics for working in Colab:

* Hit Ctrl/⌘ + M + H to show Colab's keyboard shortcuts. You can even **customize shortcuts** to your own liking. Here are some examples of default shortcuts in Colab:
  * edit cell: enter
  * run current cell: shift + enter
  * insert code cell above: Ctrl/⌘ + M + A
  * insert code cell below: Ctrl/⌘ + M + B
  * delete cell/selection: Ctrl/⌘ + M + D
  * interrupt execution: Ctrl/⌘ + M + I
  * convert to code cell: Ctrl/⌘ + M + Y
  * convert to text cell: Ctrl/⌘ + M + M
* To run all cells in a notebook use Runtime > Run All.
* Move the cell you are currently editing up or down in the notebook's cell order by clicking the up/down arrows in the top right corner. 
* Download your current notebook to your local drive using File > Download .ipynb.
* To clear a code cell's output, click on the three dots in the top right corner of the cell and select "clear output".

# Packages

Colab already has most common packages installed and ready for import, but if it doesn't have the particular package you need, you can install it using "!pip install mypackage" from inside a notebook cell.

Try installing tensorflow in the cell below. It should output "Requirement already satisfied", meaning that the package is already installed on the remote machine you are using.

In [None]:
!pip install tensorflow
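
Before installing anything, it can help to check whether a package is already importable in the current runtime. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` and the second module name are made up for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be imported in this runtime."""
    # find_spec returns None when no top-level module of that name is found
    return importlib.util.find_spec(module_name) is not None


# "json" ships with Python; the second name is a deliberately fake package
print(is_installed("json"))
print(is_installed("surely_not_a_real_package_123"))
```

If the check reports that a package is missing, running "!pip install" for that package in a notebook cell is the next step.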



## Downloading/Uploading Locally in Colab

Although your code runs on a remote CPU/GPU, you can still upload files from, and download files to, your local drive. Colab provides two special commands for this, and both require importing "files" from "google.colab". These commands only work inside Colab (not in a local IDE), and the usual ways of reading or writing local files will not reach your own machine, because the code runs on the remote server. Note: if you are loading a file from a URL into a variable, you can do so the same way you would outside of Colab (e.g., pd.read_csv(url)).

To upload a file: 
 

```
from google.colab import files

uploaded = files.upload()
```
Whenever files.upload() is run, a GUI (graphical user interface) popup will appear in the output. Select "Choose Files" and select a file from your machine.
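
The value returned by files.upload() is a dictionary that maps each uploaded file name to its contents as bytes. The sketch below shows one way to turn that dictionary into text. So that it also runs outside Colab, it uses a stand-in dictionary with a made-up file name and contents rather than a real upload; `decode_uploads` is a hypothetical helper, and UTF-8 encoding is an assumption.

```python
def decode_uploads(uploaded):
    """Map each uploaded file name to its contents decoded as UTF-8 text."""
    return {name: data.decode("utf-8") for name, data in uploaded.items()}


# Stand-in for the dictionary returned by files.upload() (hypothetical file)
fake_uploaded = {"notes.txt": b"hello colab\n"}

texts = decode_uploads(fake_uploaded)
for name, text in texts.items():
    print(f"{name}: {len(text)} characters")
```

In Colab you would pass the real `uploaded` variable in place of `fake_uploaded`.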


When downloading files, use the files.download('filename.ext') function and the file will be saved into your downloads folder. The filename you pass must refer to a file that already exists on the remote server you are using, so you have to write the file there before downloading it to your own machine. How to write a file depends on the file type. An example for a CSV file is as follows:

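```
import pandas as pd
from google.colab import files

# Write the file on the remote machine first (placeholder data), then download it
pd.DataFrame({'a': [1, 2]}).to_csv('filename.csv', index=False)
files.download('filename.csv')
```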
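
Since how you write a file depends on its type, here is a rough sketch for two other common cases, plain text and JSON, using only the standard library. The file names, contents, and helper names are hypothetical. The download step is guarded so the code still runs outside Colab, where google.colab cannot be imported.

```python
import json
import os


def save_text(path, text):
    """Write a plain-text file on the machine running this code."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path


def save_json(path, obj):
    """Write a JSON-serializable object to a file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, indent=2)
    return path


# Hypothetical file names and contents, for illustration only
paths = [
    save_text("notes_example.txt", "first line\nsecond line\n"),
    save_json("results_example.json", {"accuracy": 0.9, "epochs": 3}),
]

try:
    from google.colab import files  # only importable inside Colab
except ImportError:
    files = None

for path in paths:
    if files is not None:
        files.download(path)  # the file exists on the remote machine by now
    else:
        size = os.path.getsize(path)
        print(f"Not in Colab: wrote {path} ({size} bytes), skipping download")
```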

Create a .txt file somewhere on your local drive and upload it to Colab. Remember to assign the result of files.upload() to a variable.

In [None]:
from google.colab import files

uploaded = files.upload()

Saving no_u.txt to no_u.txt


Now download the same file you just uploaded. 

In [None]:
files.download("no_u.txt")


# GPU Use

Colab has (limited) free GPU support. 
> To enable GPU support use Edit > Notebook Settings > Hardware Accelerator > GPU.

When switching between CPU and GPU, all your local variables will be cleared, so **don't switch in the middle of running something unless you are prepared to reload all your local variables**. 

Because the GPU is free and shared, many people sometimes try to use it at the same time, which can leave you with fewer GPU resources to work with. If you get a "ResourcesExhausted" error while using the GPU, you have used all the GPU resources allotted to your session and will have to restart your runtime (Runtime > Restart Runtime). After restarting your runtime, your local variables will be wiped.
> **You do not need GPU power unless you're doing something computationally intensive, like training a large neural network**. My advice would be to not use the GPU unless you notice your runtime slowing down significantly or you know you will need it during your session. 
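
To confirm whether the current runtime actually has a GPU attached, one option is to ask the NVIDIA command-line tool. This is a sketch that assumes the GPU is an NVIDIA device and that `nvidia-smi` is present on GPU runtimes; it reports False whenever the tool is missing or fails. The helper name `gpu_visible` is made up for illustration.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi runs successfully and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # tool not installed, so assume no NVIDIA GPU
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L prints one line per detected GPU
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU available:", gpu_visible())
```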

<br></br>
<br></br>
<br></br>

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab1.JPG)

# Troubleshooting Colab
* If you get a "ResourcesExhausted" error, this means you have used all the GPU resources dedicated to your session and you will need to restart your runtime (Runtime > Restart Runtime). After restarting your runtime your local environment will be wiped.
* If you get a warning window that says you are close to reaching the session's memory limit, this means you are nearing a ResourcesExhausted error. You can still run code until the error occurs, but note that you will likely need to restart your runtime soon.
* Colab will automatically end your runtime after 12 hours. Simply restart it if this happens.
* If things are running really slowly, double-check that you are using the GPU.
* If you run into trouble and a cell is taking too long to run, use Runtime > Interrupt execution. You may have to wait a minute for the interruption to take effect.
* If all else fails, try Runtime > Restart Runtime. This will wipe your local variables. 
* Google is your best friend! Learning how to use Google effectively and efficiently for coding queries is a vital skill.
* You can always ask an instructor for help.
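
Because restarting the runtime wipes your variables, it can be worth saving important objects to disk before you restart. The following is a minimal sketch using pickle; the file name, variable names, and helper functions are hypothetical. It assumes the file sits on the runtime's disk, which should survive a restart but may not survive the runtime being ended entirely.

```python
import pickle


def save_state(path, **variables):
    """Pickle the given keyword arguments into a single file."""
    with open(path, "wb") as f:
        pickle.dump(variables, f)


def load_state(path):
    """Load the dictionary of variables saved by save_state."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Before restarting: save whatever you would not want to recompute
save_state("checkpoint.pkl", epoch=3, losses=[0.9, 0.5, 0.3])

# After restarting: load it back into variables
state = load_state("checkpoint.pkl")
epoch, losses = state["epoch"], state["losses"]
print(epoch, losses)
```

To keep a copy on your own machine as well, you could download the checkpoint file with files.download.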

# Saving Your Progress

Since we'll be pushing out notebooks and other material in public class repositories on **GitHub**, you will have to make your own **fork** of them each week. This way you'll be able to fetch any changes we make to the class repo while still being able to push edits to your own workspace. Conveniently, Colab also lets you push your notebook edits/commits directly to GitHub. Feel free to refer back to this notebook at any time throughout the course.

## Github Forking

These snippets are from https://gist.github.com/Chaser324. We are including that file in the 01 folder in case the original gist goes away or anyone wants to reference it.

### 1. Making the fork 🍴
Just head over to the GitHub repository and click the "Fork" button in the upper right corner. 

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab2.png)

It's just that simple. After a few seconds you should be redirected to your fork, and the top of your page should look like this:

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab3.JPG)

Once you've done that, you can use your favorite git client (such as your command line or Git Bash) to clone your repo. To get the link, click on the green "Clone" button and copy the link from the drop-down:

```
git clone https://github.com/[INSERT YOUR USERNAME]/Week1_Public
```

Now that you have a local clone, don't forget to change into that directory for the following steps.

### 2. Keeping Your Fork Up to Date
Keep your fork up to date by tracking the original "upstream" repo (our class repo) that you forked. To do this, you'll need to add a remote:

```
# Add 'upstream' repo to list of remotes
git remote add upstream https://github.com/BeaverWorksMedlytics2020/Week1_Public

# Verify the new remote named 'upstream'
git remote -v
```

Whenever you want to update your fork with the latest upstream changes, you'll first need to fetch the upstream repo's branches and latest commits to bring them into your repository:

```
# Fetch from upstream remote
git fetch upstream
```

Now, check out your own master branch and merge the upstream repo's master branch:

```
# Checkout your master branch and merge upstream
git checkout master
git merge upstream/master
git push origin master
```
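
If you would rather run the sync steps above from Python (for example, from a script), they can be wrapped roughly as follows. This is only a sketch: the function names are made up, the remote names `upstream` and `origin` follow the commands above, and the branch is a parameter because your repo's default branch may not be named master. By default it only prints the commands (a dry run) instead of executing them.

```python
import subprocess


def sync_commands(branch="master", upstream="upstream", origin="origin"):
    """Build the git commands that bring a fork's branch up to date."""
    return [
        ["git", "fetch", upstream],
        ["git", "checkout", branch],
        ["git", "merge", f"{upstream}/{branch}"],
        ["git", "push", origin, branch],
    ]


def sync_fork(repo_dir, branch="master", dry_run=True):
    """Print each command and, unless dry_run is set, run it inside repo_dir."""
    for cmd in sync_commands(branch):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing command (e.g. a merge conflict)
            subprocess.run(cmd, cwd=repo_dir, check=True)


sync_fork(".", dry_run=True)
```

Calling `sync_fork` with `dry_run=False` and the path to your local clone would execute the commands for real.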


## Committing/Pushing to Github with Colab

First, click on File in the top left corner of your notebook and go down to "Save a copy in GitHub".

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab4.jpg)

You will likely get a pop-up asking for permission to access your GitHub account through Colab, since this is presumably most people's first time using this feature. Once you sort that out, you'll see a new pop-up in the middle of the screen for your commit, where you can choose which branch to push it to or edit your commit message.

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab5.png)

Once you click "OK" (i.e., push your commit), you'll be redirected to the page of the file you edited on GitHub. To confirm that it worked, you should see the commit message you just wrote in Colab as the latest commit.

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/colab6.jpg)

## Git forking clarification:

![](https://raw.githubusercontent.com/BeaverWorksMedlytics2020/Data_public/master/Images/Week1/gitlogic.png)