## If you are not viewing this notebook in Colab already, open this link:
https://colab.research.google.com/github/dbamman/nlp20/blob/master/setup/Colab_Intro.ipynb

# What is Colab?

* Colab is a free [Jupyter Notebook](https://jupyter.org/) environment hosted by Google that lets you develop and run code and analyze data using computing resources in the cloud.  In this class, we'll typically use it for programming assignments in Python.


* Colab Notebooks (like this one) consist of "Cells" that help organize code and text. From within Colab, you can add a Code Cell or a Text Cell by clicking the "+ Code" or "+ Text" buttons on the top left. 


* Google provides lots more details and tips about Colab in their [tutorials](https://colab.research.google.com/notebooks/welcome.ipynb).  If you are new to Colab, you might find these tutorials helpful to get up to speed.


* To execute the code in a cell, use the keyboard shortcut `shift+enter`. Try it out on the code cell below.

In [0]:
5 + 5

# Creating a Colab Notebook

* You need to be signed into a Google account in order to create, edit, and save Colab Notebooks.  You can create a new Notebook and find your saved Notebooks by going to https://colab.research.google.com/ while logged into your Google account. 


* We recommend that you use your Berkeley Gmail account rather than a personal Gmail for this class in order to avoid confusion.  When viewing a notebook, you can make sure that you're using the right Gmail account by clicking on your profile icon at the top right, and switch accounts if necessary.


* Saved notebooks will then be stored in your account on Google Drive.  

# Viewing a Colab Notebook hosted on GitHub

* Besides notebooks saved in Google Drive, Colab can also open notebooks that are stored on GitHub. For this class, assignments and examples will be posted on the course GitHub. Jupyter Notebooks are files with the extension ".ipynb". 


* If a Jupyter Notebook is posted on GitHub, you can open it in Colab by copying the web address of the '.ipynb' file, dropping the leading `https://github.com/`, and pointing your browser to `https://colab.research.google.com/github/{github-path-to-notebook}`


* For example, this notebook can be opened in Colab via `https://colab.research.google.com/github/dbamman/nlp20/blob/master/setup/Colab_Intro.ipynb`


* Once you've opened a Notebook in Colab, you can run the code and make edits. To save your changes, click on File -> Save in the menu.  **If you are editing a notebook that you opened from GitHub, your edits will only be saved once you save a copy in your Google Drive.** You'll be prompted to do this if you try to save and haven't already done so.


* For assignments, Colab notebooks will typically be posted on the course GitHub page and will have instructions, starter code, and space for you to add your own code. To get started on an assignment, open the starter notebook from GitHub in Colab and then save a copy into your Google Drive so that you can make changes to complete the assignment.
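
* The GitHub-to-Colab address pattern described above can be sketched in a few lines of Python. This is only an illustration: the helper name `github_to_colab_url` is made up for this example, and it assumes the link points at a `github.com` page for an `.ipynb` file.

```python
from urllib.parse import urlparse

COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def github_to_colab_url(github_url: str) -> str:
    """Turn a github.com link to an .ipynb file into a Colab link (hypothetical helper)."""
    parsed = urlparse(github_url)
    if parsed.netloc not in ("github.com", "www.github.com"):
        raise ValueError(f"not a github.com URL: {github_url}")
    if not parsed.path.endswith(".ipynb"):
        raise ValueError(f"not a notebook (.ipynb) URL: {github_url}")
    # parsed.path already starts with "/", e.g. "/dbamman/nlp20/blob/master/..."
    return COLAB_GITHUB_PREFIX + parsed.path


print(github_to_colab_url(
    "https://github.com/dbamman/nlp20/blob/master/setup/Colab_Intro.ipynb"
))
```

The printed address is the same Colab link given for this notebook above.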

# What is a Runtime?

* Each time you open a Notebook in Colab, you are actually connecting to a computer (a server running the Linux operating system) in the Cloud, hosted by Google. When you execute your code, it's actually running on that computer and sending back any results to display in your browser.  In Colab, this connection between a Google server and your browser is called a 'Runtime'.


* Because Google provides these servers free to Colab users, running code on Colab lets you take advantage of powerful hardware to run computations faster than you could on a laptop, for example.


* Because each connection gives you a fresh "Runtime", you'll need to re-run any setup code (such as installing Python libraries or downloading data) every time you re-connect. This can take a few minutes, but it's a necessary trade-off in order to be able to use these free computational resources.
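
* To see for yourself which machine your code is running on, you can ask Python to describe it. The sketch below uses only the standard library. The function name `describe_runtime` is invented for this example, and the values in the comments are what you would typically expect on a Colab server, not guaranteed output.

```python
import os
import platform
import sys


def describe_runtime() -> dict:
    """Collect a few facts about the machine that is executing this code."""
    return {
        "os": platform.system(),           # typically "Linux" on a Colab server
        "machine": platform.machine(),     # CPU architecture, e.g. "x86_64"
        "python": sys.version.split()[0],  # Python version of the runtime
        "cpu_count": os.cpu_count(),       # logical CPUs visible to Python
    }


for key, value in describe_runtime().items():
    print(f"{key}: {value}")
```

Running the same cell on your laptop and in Colab should give different answers, which is a quick way to confirm that the code really executes on the remote server.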

# Downloading Data

* To work with data in Colab, we need to download the data to the computer you're connecting to in your Runtime. Data can be downloaded using the `!wget` command or by connecting to files on Google Drive.


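* When possible, data for course assignments will be uploaded to the course GitHub and can then be downloaded from within Colab. Make sure to download the 'raw' content of the file (a `raw.githubusercontent.com` address, as in the cell below) rather than the GitHub web page that displays it.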

In [0]:
!wget https://raw.githubusercontent.com/dbamman/nlp20/master/setup/example_data.txt
!ls
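
* If you would rather stay in Python than use `!wget`, the download step can be sketched with the standard library. This is one possible approach, not the course's required method. The helper `download_if_missing` is a made-up name, and it skips the download when the file is already present, which saves time when you re-run setup cells.

```python
import os
import urllib.request


def download_if_missing(url: str, filename: str = "") -> str:
    """Download `url` into the current directory unless the file is already there."""
    # Default to the last path component of the URL, similar to what wget does.
    filename = filename or url.rsplit("/", 1)[-1]
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)
    return filename


# In a Colab cell (needs network access), the equivalent of the wget call above:
# download_if_missing("https://raw.githubusercontent.com/dbamman/nlp20/master/setup/example_data.txt")
```

Note that this simple version never refreshes a file that already exists, so delete the local copy first if the data on GitHub has changed.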

* Let's open the file we just downloaded from the course GitHub using Python and print out the contents.

In [0]:
open('example_data.txt').read().split('\n')
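
* The one-liner above is fine for a quick look, but it never closes the file. A slightly more careful variant is sketched below. The name `read_lines` and the throwaway file `demo_lines.txt` are made up for this example so that the cell runs on its own.

```python
def read_lines(path: str) -> list:
    """Return the lines of a text file, without newline characters."""
    # The `with` block closes the file even if reading raises an error.
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


# Self-contained demo using a throwaway file (hypothetical name).
with open("demo_lines.txt", "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

print(read_lines("demo_lines.txt"))  # ['first line', 'second line']
```

Unlike `split('\n')`, `splitlines()` does not produce a trailing empty string when the file ends with a newline.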

# Enabling GPU



* For some assignments in this class, you will want to enable Colab's GPU, which allows neural models to train much faster (~20x!) than on a CPU.


* GPU is not enabled by default on Colab.  To enable it, do the following:
    
    * Go to Edit -> Notebook Settings

    * Select "Hardware Accelerator"

    * Select "GPU"


* In order to confirm GPU is enabled, run the following command in your Colab notebook:

In [0]:
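import tensorflow as tf
# tf.Session moved to tf.compat.v1 in TensorFlow 2.x; this spelling also works in late 1.x releases.
sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))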

* If you see something like the following, GPU is enabled.

```
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5
```


* If, however, you see this, GPU is NOT enabled.

```
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
```
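
* A check that does not depend on TensorFlow can be sketched with NVIDIA's `nvidia-smi` command-line tool. This assumes that `nvidia-smi` is on the PATH whenever an NVIDIA GPU is attached, which is an assumption about the Colab image rather than something guaranteed here. The function name `gpu_visible` is made up for this example. In recent TensorFlow 2.x releases, `tf.config.list_physical_devices('GPU')` is another way to ask the same question.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` runs successfully and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # tool not found on PATH, so we assume no usable NVIDIA GPU
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU <index>:".
    lists_a_gpu = any(line.startswith("GPU ") for line in result.stdout.splitlines())
    return result.returncode == 0 and lists_a_gpu


print("GPU enabled" if gpu_visible() else "GPU NOT enabled")
```

If this prints "GPU NOT enabled" after you changed the notebook settings, re-connect to the Runtime and run the cell again.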