#**Google Colab Tutorial**

Colab is a free notebook environment that runs entirely in the cloud. It lets you and your team members edit notebooks together, the way you work on documents in Google Docs. Colab supports many popular machine learning libraries, which can be easily loaded in your notebook.

This tutorial covers the main features of Colab and aims to make you comfortable working with it.

#**Prerequisites**

Before you start working through the examples in this tutorial, we assume that you are already familiar with Jupyter, GitHub, and the basics of Python and other programming languages.

# **Google Colab - Introduction**

Google is quite aggressive in AI research. Over many years, Google developed an AI framework called TensorFlow and a development tool called Colaboratory. Today TensorFlow is open source, and since 2017 Google has made Colaboratory free for public use. Colaboratory is now known as Google Colab or simply Colab.

Another attractive feature that Google offers developers is the use of a GPU. Colab supports GPUs, and this is free. The reason for making it free to the public could be to make its software a standard in academia for teaching machine learning and data science. It may also have the long-term aim of building a customer base for Google Cloud APIs, which are sold on a per-use basis.

Irrespective of the reasons, the introduction of Colab has eased the learning and development of machine learning applications.

So, let us get started with Colab.

If you have used Jupyter notebooks before, you will quickly learn to use Google Colab. To be precise, Colab is a free Jupyter notebook environment that runs entirely in the cloud. Most importantly, it does not require any setup, and the notebooks that you create can be edited simultaneously by your team members, just the way you edit documents in Google Docs. Colab supports many popular machine learning libraries, which can be easily loaded in your notebook.

# **What Colab Offers You?**

As a programmer, you can perform the following using Google Colab.



*   Write and execute code in Python

*   Document your code with text that supports mathematical equations

*   Create/Upload/Share notebooks

*   Import/Save notebooks from/to Google Drive

*   Import/Publish notebooks from GitHub

*   Import external datasets e.g. from Kaggle

*   Integrate PyTorch, TensorFlow, Keras, OpenCV

*   Use a free cloud service with a free GPU



In this tutorial, you will create and execute your first trivial notebook. Follow the steps given below.

Note − As Colab implicitly uses Google Drive for storing your notebooks, ensure that you are logged in to your Google Drive account before proceeding further.

**Step 1** − Open the following URL in your browser − https://colab.research.google.com 

**Step 2** − Click on the NEW PYTHON 3 NOTEBOOK link at the bottom of the screen. A new notebook will open.

As you might have noticed, the notebook interface is quite similar to the one provided in Jupyter. There is a code window in which you enter your Python code.

#**Setting Notebook Name**

By default, the notebook uses the naming convention UntitledXX.ipynb. To rename the notebook, click on this name and type the desired name in the edit box.

We will call this notebook MyFirstColabNotebook. Type this name in the edit box and press ENTER. The notebook will take the name that you have given.

#**Entering Code**

You will now enter some trivial Python code in the code window and execute it.

Enter the following two Python statements in the code window −

In [2]:
import time
print(time.ctime())

Wed Jul 15 06:15:28 2020


#**Executing Code**

To execute the code, click on the arrow on the left side of the code window.

After a while, you will see the output underneath the code window.

You can clear the output anytime by clicking the icon on the left side of the output display.

#**Adding Code Cells**

To add more code to your notebook, select the following menu options −

`Insert / Code Cell`

Alternatively, just hover the mouse at the bottom center of the Code cell. When the CODE and TEXT buttons appear, click on CODE to add a new cell.


A new code cell will be added underneath the current cell. Add the following two statements in the newly created code window −


```
time.sleep(5)
print(time.ctime())
```
Now, if you run this cell, you will see the following output −
```
Mon Jun 17 04:50:27 2019
```

The difference between the two time strings is clearly more than 5 seconds, because you took some time to insert the new code before running it. Colab also lets you run all the code in your notebook without interruption.
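
To measure the delay introduced by `time.sleep` itself, independent of how long you take to insert and run a new cell, both timestamps can be taken inside a single cell. The following is a minimal sketch; the helper name `timed_sleep` and the short 0.2-second pause are illustrative choices, not part of the original example.

```python
import time

def timed_sleep(seconds):
    """Sleep for `seconds` and return the measured elapsed time."""
    start = time.monotonic()      # monotonic clock ignores system clock changes
    time.sleep(seconds)
    return time.monotonic() - start

elapsed = timed_sleep(0.2)
print(f"Slept for about {elapsed:.2f} seconds")
```

Replacing `0.2` with `5` reproduces the pause used in the cell above.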

#**Run All**

To run all the code in your notebook without interruption, select the following menu options −

`Runtime / Reset and run all…`

#**Changing Cell Order**

When your notebook contains a large number of code cells, you may come across situations where you would like to change the order of execution of these cells. You can do so by selecting the cell that you want to move and clicking the UP CELL or DOWN CELL buttons.

You may click the buttons multiple times to move the cell by more than one position.

#**Deleting Cell**

During the development of your project, you may have introduced a few now-unwanted cells in your notebook. You can remove such cells from your project easily with a single click. Click on the vertical-dotted icon at the top right corner of your code cell.
Click on the Delete cell option and the current cell will be deleted.

#**Google Colab - Documenting Your Code**

As the code cell supports full Python syntax, you may use Python comments in the code window to describe your code. However, you often need more than simple text-based comments to illustrate ML algorithms. ML relies heavily on mathematics, and to explain those terms and equations to your readers you need an editor that supports LaTeX, a language for mathematical representations. Colab provides Text Cells for this purpose.

Text Cells are formatted using Markdown, a simple markup language. Let us now see how to add text cells to your notebook and add some text containing mathematical equations to them.



**Markdown Examples**

Let us look at a few examples of Markdown syntax to demonstrate its capabilities.

Type in the following text in the Text cell.

```
This is **bold**.
This is *italic*.
This is ~strikethrough~.
```
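
Text cells also render LaTeX written between dollar signs, which is how mathematical equations are typed. The two expressions below are arbitrary illustrations and are not tied to any particular algorithm: a single `$` pair keeps the formula inline, while a `$$` pair places it on its own line.

```latex
$\sqrt{3x-1}+(1+x)^2$

$$e^x = \sum_{i=0}^{\infty} \frac{1}{i!} x^i$$
```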



#**Google Colab - Saving Your Work**
Colab allows you to save your work to Google Drive or even directly to your GitHub repository.

**Saving to Google Drive**
To save your notebook to Google Drive, select the following menu options −

`File / Save a copy in Drive…`

The action will create a copy of your notebook and save it to your drive. Later on you may rename the copy to your choice of name.

**Saving to GitHub**
You may also save your work to your GitHub repository by selecting the following menu options −

`File / Save a copy in GitHub...`

You will have to wait until you see the GitHub login screen.

Then, enter your credentials. If you do not have a repository, create a new one and save your project.

#**Google Colab - Sharing Notebook**

To share the notebook that you have created with other co-developers, you may share the copy that you have made in your Google Drive.

To publish the notebook to a general audience, you may share it from your GitHub repository.

There is one more way to share your work and that is by clicking on the SHARE link at the top right hand corner of your Colab notebook. 

You may enter the email IDs of the people with whom you would like to share the current document. You can set the kind of access by selecting one of the three options offered.

Click on the Get shareable link option to get the URL of your notebook. You will find options for whom to share as follows −

* Specified group of people

* Colleagues in your organization

* Anyone with the link

* All public on the web

Now you know how to create, execute, save, and share a notebook. In the Code cell, we have used Python so far. The code cell can also be used for invoking system commands. This is explained next.

#**Google Colab - Invoking System Commands**
Jupyter includes shortcuts for many common system operations. Colab code cells support this feature.

**Simple Commands**
Enter the following code in the Code cell that uses the system command echo.

```
message = 'A Great Tutorial on Colab by Tutorialspoint!'
greeting = !echo -e '$message\n$message'
greeting
```
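
The `!` syntax works only inside a notebook cell. For comparison, a rough plain-Python equivalent can be sketched with the standard `subprocess` module. This is an illustration rather than a description of what Colab does internally, and it uses `printf` instead of `echo -e` (an assumption made for portability across shells). As with `!`, the captured output ends up as a list of lines.

```python
import subprocess

message = 'A Great Tutorial on Colab by Tutorialspoint!'

# Run the command and capture its standard output as text.
result = subprocess.run(
    ['printf', '%s\n%s\n', message, message],
    capture_output=True,
    text=True,
    check=True,
)

# Split into lines, similar to the list-like value that `!` assigns.
greeting = result.stdout.splitlines()
print(greeting)
```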



**Getting Remote Data**

Let us look into another example that loads the dataset from a remote server. Type in the following command in your Code cell −


In [3]:
!wget http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.data -P "/10 Academy 2020/Content/Weekly schedules JH inputs/JH_Contents"

--2020-07-15 06:30:10--  http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.data
Resolving mlr.cs.umass.edu (mlr.cs.umass.edu)... 128.119.246.96
Connecting to mlr.cs.umass.edu (mlr.cs.umass.edu)|128.119.246.96|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3974305 (3.8M) [text/plain]
Saving to: ‘/10 Academy 2020/Content/Weekly schedules JH inputs/JH_Contents/adult.data’


2020-07-15 06:30:14 (1.27 MB/s) - ‘/10 Academy 2020/Content/Weekly schedules JH inputs/JH_Contents/adult.data’ saved [3974305/3974305]



As the message says, the adult.data file is now saved in the target folder. You can verify this by examining the contents of that folder. Alternatively, type in the following code in a new Code cell −

In [4]:
import pandas as pd
data = pd.read_csv("/10 Academy 2020/Content/Weekly schedules JH inputs/JH_Contents/adult.data")
data.head(5)

| | 39 | State-gov | 77516 | Bachelors | 13 | Never-married | Adm-clerical | Not-in-family | White | Male | 2174 | 0 | 40 | United-States | <=50K |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 50 | Self-emp-not-inc | 83311 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 13 | United-States | <=50K |
| 1 | 38 | Private | 215646 | HS-grad | 9 | Divorced | Handlers-cleaners | Not-in-family | White | Male | 0 | 0 | 40 | United-States | <=50K |
| 2 | 53 | Private | 234721 | 11th | 7 | Married-civ-spouse | Handlers-cleaners | Husband | Black | Male | 0 | 0 | 40 | United-States | <=50K |
| 3 | 28 | Private | 338409 | Bachelors | 13 | Married-civ-spouse | Prof-specialty | Wife | Black | Female | 0 | 0 | 40 | Cuba | <=50K |
| 4 | 37 | Private | 284582 | Masters | 14 | Married-civ-spouse | Exec-managerial | Wife | White | Female | 0 | 0 | 40 | United-States | <=50K |
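
Notice that the first data row (39, State-gov, ...) was taken as the column header, because `adult.data` has no header row. A minimal sketch of one way to avoid this is to pass `header=None` and supply the names explicitly. The column names below are an assumption based on the dataset's published description, and the two inline rows stand in for the downloaded file so that the snippet runs on its own.

```python
import io
import pandas as pd

# Assumed column names (not part of the original tutorial).
columns = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# Two rows copied from the output above, standing in for adult.data.
sample = io.StringIO(
    "39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,"
    "Not-in-family,White,Male,2174,0,40,United-States,<=50K\n"
    "50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,"
    "Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K\n"
)

# header=None keeps the first row as data; skipinitialspace drops any
# blank that follows a comma.
data = pd.read_csv(sample, header=None, names=columns, skipinitialspace=True)
print(data.shape)
print(data.head())
```

With the real file, replace `sample` with the path that was passed to `read_csv` earlier.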


Likewise, most of the system commands can be invoked in your code cell by prepending the command with an Exclamation Mark (!). Let us look into another example before giving out the complete list of commands that you can invoke.

**Cloning Git Repository**

You can clone an entire GitHub repository into Colab using the git command. For example, to clone the Keras tutorial, type the following command in the Code cell −

In [5]:
!git clone https://github.com/wxs/keras-mnist-tutorial.git

Cloning into 'keras-mnist-tutorial'...
remote: Enumerating objects: 26, done.
remote: Total 26 (delta 0), reused 0 (delta 0), pack-reused 26
Unpacking objects: 100% (26/26), done.


**System Aliases**

To get a list of shortcuts for common operations, execute the following command −

`!ls /bin`


#**Google Colab - Executing External Python Files**

Suppose you already have some Python code stored in your Google Drive, and you would now like to load this code in Colab for further modifications.

**Mounting Drive**

To mount your Google Drive, select the following menu options −

`Tools / Command palette`

Type a few letters like “m” in the search box to locate the mount command. Select the Mount Drive command from the list. The following code will be inserted in your Code cell.


In [6]:
from google.colab import drive
drive.mount('/content/gdrive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/gdrive


When you run this code, you will be asked to enter an authorization code.

**Listing Drive Contents**

You can list the contents of the drive using the ls command as follows −

`!ls "/content/gdrive/My Drive/Colab Notebooks"`

This command will list the contents of your Colab Notebooks folder. Sample output from my drive is shown here −

In [7]:
!ls "/content/gdrive/My Drive/Colab Notebooks"

'Copie de 02.04-Computation-on-arrays-aggregates.ipynb'
'Copie de Index.ipynb'
'Copie de Intro_api_twitter.ipynb'
 Hello.py
 Intro_api_twitter.ipynb
'MNIST in Keras.ipynb'
 Untitled0.ipynb


**Running Python Code**

Now, let us say that you want to run a Python file called Hello.py stored in your Google Drive. Type the following command in the Code cell −

`!python3 "/content/gdrive/My Drive/Colab Notebooks/Hello.py"`

In [8]:
!python3 "/content/gdrive/My Drive/Colab Notebooks/Hello.py"

Hello!


For convenience, save the code snippet below and paste it into Google Colab whenever you need to mount your Google Drive in a notebook.


```
from google.colab import drive
drive.mount('/content/gdrive')
root_path = 'gdrive/My Drive/your_project_folder/'  #path to your project folder, relative to /content
```






In [None]:
from google.colab import drive

drive.mount('/content/gdrive', force_remount=True)
root_path = 'gdrive/My Drive/Kaggle_project/'  #path to your project folder, relative to /content
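
Because `root_path` above is a relative path, it resolves correctly only while the working directory is the parent of the mount point (`/content` here). One way to avoid that dependency is to build absolute paths from the mount point, sketched below with `pathlib`. The helper `project_path` is a hypothetical name introduced for illustration, and the folder is only checked, not created.

```python
from pathlib import Path

# Same path that was passed to drive.mount() above.
MOUNT_POINT = Path("/content/gdrive")

def project_path(folder, mount_point=MOUNT_POINT):
    """Return the absolute path of `folder` under 'My Drive'."""
    return mount_point / "My Drive" / folder

root_path = project_path("your_project_folder")
print(root_path.as_posix())

if not root_path.is_dir():
    print("Folder not found: check that Drive is mounted and the folder exists.")
```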

#**Download the dataset directly to Google Drive via Google Colab**

In this section I will share my experience of downloading datasets from Kaggle and other competitions.

**Downloading Kaggle datasets via Kaggle API**

**Step 1 — Get the API key from your account**

Visit www.kaggle.com ⇨ login ⇨ My Account ⇨ Create New API Token

The “kaggle.json” file will be downloaded automatically.

**Step 2 — Upload the kaggle.json file**

Use this code snippet in Google Colab for the task:

```
from google.colab import files
files.upload()  #this will prompt you to upload the kaggle.json

```



The commands below install the Kaggle client, create the necessary folder, and copy the token into it.
```
!pip install -q kaggle
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!ls ~/.kaggle
!chmod 600 /root/.kaggle/kaggle.json  # set permission
```

In [9]:
from google.colab import files
files.upload() #this will prompt you to upload the kaggle.json

!pip install -q kaggle
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!ls ~/.kaggle
!chmod 600 /root/.kaggle/kaggle.json  # set permission

Saving kaggle (1).json to kaggle (1).json
cp: cannot stat 'kaggle.json': No such file or directory
chmod: cannot access '/root/.kaggle/kaggle.json': No such file or directory


**Step 3 — Download the required dataset**

Simply download the required dataset with the syntax:

`!kaggle competitions download -c 'name_of_competition' -p "target_colab_dir"`

In [None]:
pip install --upgrade kaggle

In [10]:
!kaggle competitions download -c histopathologic-cancer-detection -p /content/gdrive/My\ Drive/kaggle/cancer

Traceback (most recent call last):
  File "/usr/local/bin/kaggle", line 5, in <module>
    from kaggle.cli import main
  File "/usr/local/lib/python2.7/dist-packages/kaggle/__init__.py", line 23, in <module>
    api.authenticate()
  File "/usr/local/lib/python2.7/dist-packages/kaggle/api/kaggle_api_extended.py", line 146, in authenticate
    self.config_file, self.config_dir))
IOError: Could not find kaggle.json. Make sure it's located in /root/.kaggle. Or use the environment method.
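
The error message mentions an "environment method". The Kaggle client can also read credentials from the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, so one hedged alternative to copying `kaggle.json` is to export its two fields before calling `!kaggle`. The helper name and the placeholder token below are illustrative only.

```python
import json
import os

def export_kaggle_credentials(token_text):
    """Copy the username and key from kaggle.json text into the environment."""
    token = json.loads(token_text)
    os.environ["KAGGLE_USERNAME"] = token["username"]
    os.environ["KAGGLE_KEY"] = token["key"]

# Placeholder values; use the contents of your own kaggle.json instead.
export_kaggle_credentials('{"username": "your_username", "key": "your_api_key"}')
print(os.environ["KAGGLE_USERNAME"])
```

Commands started with `!` from the same notebook inherit these variables.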


**Step 4 — Unzip**

For a dataset with multiple zip files, like this example, I tend to change directory to the designated folder and unzip them one by one.
```
!unzip -q file[.zip] -d [exdir]
# -q          suppress the printing of the file names being extracted
# -d [exdir]  optional directory to which to extract files
```

In [None]:
import os
os.chdir('/content/gdrive/My Drive/kaggle/cancer')  #change dir
!mkdir -p train  #create a directory named train/ (no error if it exists)
!mkdir -p test  #create a directory named test/ (no error if it exists)
!unzip -q train.zip -d train/  #unzip data in train/
!unzip -q test.zip -d test/  #unzip data in test/
!unzip sample_submission.csv.zip
!unzip train_labels.csv.zip
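
The same extraction can be sketched in Python with the standard `zipfile` module, which avoids repeating one `!unzip` line per archive. The function name `extract_archives` and the choice to extract every archive into a subfolder named after it are assumptions made for this illustration; the shell commands above extract the two CSV archives into the current folder instead.

```python
import zipfile
from pathlib import Path

def extract_archives(folder):
    """Extract every .zip file in `folder` into a subfolder named after it.

    For example, train.zip goes to train/ and
    train_labels.csv.zip goes to train_labels/.
    Returns the list of folders that were written.
    """
    folder = Path(folder)
    extracted = []
    for archive in sorted(folder.glob("*.zip")):
        target = folder / archive.name.split(".")[0]
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)     # creates `target` if it does not exist
        extracted.append(target)
    return extracted

# Usage, assuming Drive is mounted at /content/gdrive (not executed here):
# extract_archives("/content/gdrive/My Drive/kaggle/cancer")
```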

**Downloading a dataset from a competition website that requires a username and password**

For competitions like ICIAR2018, you will need to provide a username and password when downloading the dataset.

To do this in Google Colab, first change your current directory to the folder where you wish to save the dataset. Then use wget rather than the curl command.

In [None]:
!wget --user=your_username --password=your_password http://cdn1.i3s.up.pt/digitalpathology/ICIAR2018_BACH_Challenge.zip

After downloading, you can unzip the file using the same approach as above.
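
Typing a password into a notebook cell leaves it in the saved notebook. As a hedged alternative, the download can be sketched in Python with the standard `urllib` module, prompting for the password at run time with `getpass`. This assumes the server uses HTTP Basic authentication, which is not stated above; the function names are hypothetical.

```python
import shutil
import urllib.request

def build_auth_opener(url, username, password):
    """Return an opener that answers HTTP Basic auth challenges for `url`."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler)

def download(url, destination, username, password):
    """Stream the file at `url` to `destination` using the credentials."""
    opener = build_auth_opener(url, username, password)
    with opener.open(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)

# Usage inside a notebook (not executed here):
# from getpass import getpass
# download(
#     "http://cdn1.i3s.up.pt/digitalpathology/ICIAR2018_BACH_Challenge.zip",
#     "ICIAR2018_BACH_Challenge.zip",
#     "your_username",
#     getpass("Password: "),
# )
```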