<a href="https://colab.research.google.com/github/kaizerpatrawala/AI-NLP/blob/main/OptusU_AIN_PreCourseWork_Jupyter_Notebooks_and_Google_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Jupyter Notebooks and Google Colab**

**Jupyter notebooks** are a type of computational notebook popularised by AI researchers and data scientists for combining coding, scripting, documenting and presenting in a single interface. Jupyter is a free, open-source, interactive web tool that brings together software code, computational output, explanatory text and multimedia resources in a single document. 
The name Jupyter is derived from the main languages it supports: Julia, Python, and R. Jupyter is not the only computational notebook; others include R Markdown, Apache Zeppelin, and Spark Notebook.  

**Google Colab** (or Colaboratory) is a free cloud service on Google Cloud Platform (GCP) for prototyping, and collaborating on, machine learning algorithms on powerful hardware such as GPUs and TPUs. It provides a "serverless" Jupyter notebook environment for interactive development. The service is free, with a maximum run time of 12 hours per session (after which you can connect to a different VM for further GPU compute).

Colab removes the need to install and configure an Integrated Development Environment (IDE), which can be technically challenging. All you need to get started is a Google account, a web browser and Internet access. 

Let us familiarise ourselves with Notebooks and Colab.

<br>

### **1. A first line of Python code**

A customary "Hello World" to begin with. 

The `print()` function takes a string (a series of characters) as an argument and displays it on the screen. 

To run the code, hover the mouse over [ ] and click Run, or press Shift+Enter.

In [2]:
print("Hello World!")

Hello World!


In [3]:
print("I want to build AI algorithms.")

I want to build AI algorithms.


<br>
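
`print()` also accepts several values at once, and its `sep` and `end` keyword arguments control how they are joined and what is printed last. The short sketch below is illustrative only; the variable names and values are placeholders rather than part of the course material.

```python
# Illustrative values; the names here are placeholders.
course = "AI"
week = 1

# print() converts each argument to text and joins them with a space by default.
print("Welcome to", course, "week", week)

# sep changes the separator between arguments; end changes what is printed last.
print("2022", "S2", "OptusAIN", sep="-", end="!\n")

# An f-string builds the same kind of message as a single string.
message = f"Welcome to {course} week {week}"
print(message)
```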

### **2. Managing notebooks**

**Creating a notebook**

Navigate to Google Drive and select Google Colaboratory as shown below. 

<figure>
<center>
<img src='https://raw.githubusercontent.com/harsha89/public/master/image_colab.png' />
<figcaption>Create a notebook</figcaption></center>
</figure>

* As a best practice, give the notebook a meaningful title, as the title is retained as its primary identifier on Google Drive and GitHub. 

<br>
<br>

**Open existing notebooks**

Select File -> Open notebook, then choose the notebook you want to open.

<figure>
<center>
<img src='https://raw.githubusercontent.com/harsha89/public/master/image_onote.PNG' />
<figcaption>Open a notebook</figcaption></center>
</figure>

<br>
<br>

**Converting a notebook to a Python script**

Following the experimentation stage, the notebook code has to be moved into development, staging and production systems. Download the notebook as a Python script using the File -> Download .py option, as shown below.

<figure>
<center>
<img src='https://drive.google.com/uc?id=1wrfq-BfNJG-NgIywc0c7Bpcgb9o9EIhK' />
<figcaption>Download python script</figcaption></center>
</figure>
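
Under the hood, a notebook file (`.ipynb`) is a JSON document whose `cells` list holds the code and text cells. As a rough sketch of what such an export involves, the hypothetical helper below collects the code cells into one script. It is a simplification: it assumes the version 4 notebook layout, and it comments out shell (`!`) and magic (`%`) lines because they are not valid in a plain Python script. The function name and the sample notebook are made up for illustration, and Colab's own export may differ in detail.

```python
import json

def notebook_to_script(notebook_json: str) -> str:
    """Return the code cells of a notebook (given as JSON text) as one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # 'source' may be a single string or a list of line strings
        text = source if isinstance(source, str) else "".join(source)
        lines = []
        for line in text.splitlines():
            # Shell commands and magics only work inside a notebook
            if line.lstrip().startswith(("!", "%")):
                line = "# " + line
            lines.append(line)
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"

# A tiny made-up notebook with one text cell and two code cells
sample = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["print('Hello World!')\n"]},
        {"cell_type": "code", "source": ["!ls\n", "x = 1\n"]},
    ],
})
script = notebook_to_script(sample)
print(script)
```

Reading a real notebook would only need the file contents passed in as text, for example `notebook_to_script(open("my_notebook.ipynb").read())`, where the file name is a placeholder.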

* The GitHub search option lets you list all the notebooks from a well-known organization, so you can speed up your experiments with existing resources.

<figure>
<center>
<img src='https://raw.githubusercontent.com/harsha89/public/master/imagr_jup.PNG' />
<figcaption>Github Notebook Search</figcaption></center>
</figure>

<br>
<br>

**Saving to Google Drive or Github**

Colab notebooks can be saved to Google Drive or to a GitHub repository, which lets you save changes to an organization-wide repository. Select File -> Save a copy in Drive or File -> Save a copy in GitHub, as shown in the following image.

<figure>
<center>
<img src='https://drive.google.com/uc?id=124PMS1GzZfRBpAJ2YW51A1shFR1qs_6F' />
<figcaption>Save notebook</figcaption></center>
</figure>

Colab is a useful resource for running quick machine learning experiments that can be shared across teams within an organization. We hope you will take full advantage of the power of notebooks and the free GPU and TPU resources.







### **3. Google Drive**

Besides GitHub, another effective way to manage data, an AI algorithm or a notebook is to store it in your Google Drive (cloud storage). 

* To save a notebook to your Google Drive you can use the file menu options: File > Save a copy in Drive

* You can connect Google Drive to this notebook using the code snippets given below. Having done this you can manage your files using the GUI of Google Drive.

Below is an example where you can connect Google Drive to the current runtime to create a new text file in Google Drive and write content through Python.


In [None]:
# Click 'Allow' in the authentication step, to permit this notebook to access the Google Drive contents

# Mounting your Google Drive on the runtime lets the notebook read and write Drive content
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Open a text file in Google Drive and write some content
# (the 'with' block closes the file automatically)
with open('/content/drive/My Drive/AINfile01.txt', 'w') as file01:
    file01.write('Hello Google Drive!')

In [None]:
# Read the content written in the text file
# (the 'with' block closes the file, which the original cell never did)
with open('/content/drive/My Drive/AINfile01.txt', 'r') as file01:
    print(file01.read())

Optionally, you can also click the Google Drive icon in the left panel, as shown in the image below, to connect the Drive to the notebook.

<figure>
<center>
<img src='https://raw.githubusercontent.com/harsha89/public/master/image_drive.PNG' />
<figcaption>Connect Google Drive</figcaption></center>
</figure>
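
The write-then-read round trip from the cells above can also be wrapped in a small helper. This is a sketch under assumptions: the function name `write_and_read` is hypothetical, and a temporary directory stands in for Google Drive so that the code also runs outside Colab. In a notebook with Drive mounted, you would pass a folder under `/content/drive` instead.

```python
import tempfile
from pathlib import Path

def write_and_read(folder, filename, text):
    """Write text to folder/filename, then read it back and return it."""
    path = Path(folder) / filename
    # write_text and read_text open and close the file for us
    path.write_text(text, encoding="utf-8")
    return path.read_text(encoding="utf-8")

# A temporary directory stands in for the mounted Drive folder here
with tempfile.TemporaryDirectory() as folder:
    content = write_and_read(folder, "AINfile01.txt", "Hello Google Drive!")

print(content)
```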

<br>

### **4. Colab Runtime options**

**Code execution options**

* To clear the output in Colab: Edit -> Clear all outputs

*  To reset the current runtime, use one of the following options:
  *   Runtime -> Restart runtime... resets the Python shell
  *   Runtime -> Factory reset runtime... resets the entire Colab instance
  *   Runtime -> Manage sessions -> Terminate... resets the entire Colab instance

If this doesn't work, call for help!

<br>

### **5. Hardware acceleration**

**Changing the Runtime**
* Use the Runtime tab to select your preferred Hardware accelerator for sophisticated deep learning algorithms or to process large datasets.
* Runtime -> Change runtime type -> Hardware accelerator -> GPU

<figure>
<center>
<img src='https://raw.githubusercontent.com/harsha89/public/master/image_gpu.PNG' />
<figcaption>Runtime Selection</figcaption></center>
</figure>
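
After switching the accelerator, it can be useful to confirm that the runtime actually sees a GPU. One simple check, sketched below, looks for NVIDIA's `nvidia-smi` command-line tool, which is normally present on machines with an NVIDIA GPU driver. The helper name `gpu_tool_available` is made up for this example, and the check is only a heuristic: it reports whether the tool is installed, not how much GPU capacity is free.

```python
import shutil
import subprocess

def gpu_tool_available() -> bool:
    """Return True if the nvidia-smi executable is on the PATH."""
    return shutil.which("nvidia-smi") is not None

if gpu_tool_available():
    # Print the driver's own summary of the attached GPU(s)
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    print(result.stdout or result.stderr)
else:
    print("nvidia-smi not found; this runtime does not appear to have a GPU.")
```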

**Checking Utilization**
* By default, Colab provides 12GB of dedicated RAM and 110GB of disk space for machine learning. 
A free upgrade to 24GB RAM is described in this [article](https://towardsdatascience.com/upgrade-your-memory-on-google-colab-for-free-1b8b18e8791d). 

<figure>
<center>
<img src='https://raw.githubusercontent.com/harsha89/public/master/iamge_utlization.png' />
<figcaption>Current Utilization</figcaption></center>
</figure>
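
The RAM and disk figures can also be read from code. The sketch below uses only the Python standard library. It assumes a Linux runtime (which Colab provides) for the memory reading via `/proc/meminfo`, and returns `None` for memory on other systems. The helper names are illustrative.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3

def disk_usage_gib(path="/"):
    """Return (total, used, free) disk space for path, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.total / GIB, usage.used / GIB, usage.free / GIB

def total_ram_gib():
    """Return total RAM in GiB from /proc/meminfo, or None if unavailable."""
    meminfo = Path("/proc/meminfo")
    if not meminfo.exists():
        return None  # not a Linux system
    for line in meminfo.read_text().splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib * 1024 / GIB
    return None

total, used, free = disk_usage_gib()
print(f"Disk: {used:.1f} GiB used of {total:.1f} GiB ({free:.1f} GiB free)")

ram = total_ram_gib()
print("RAM:", "unknown" if ram is None else f"{ram:.1f} GiB")
```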



<br>


### **6. Where does my notebook execute?**

Your Colab notebook runs on a dedicated instance that is bound to your session. This means that if you are familiar with shell commands, you can run them from the notebook as follows.

To run a shell command, add ***!*** in front of the command.

Example: 
* **!ls** will list the contents of the folder
* **!pwd** will output the current path
* **!ls -la** will output the permissions of the directory and contents
* **!mkdir** ***directory_name*** will make a new directory



In [None]:
!ls
!pwd
!ls -la

<br>
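
Shell commands are convenient, but the same tasks can be done in plain Python, which is handy when the results are needed as Python values. The sketch below shows rough standard-library counterparts of `!pwd`, `!ls` and `!mkdir`. The directory name `demo_dir` is a placeholder, and a temporary folder is used so that nothing is left behind.

```python
import os
import tempfile

# Counterpart of !pwd: the current working directory as a string
print(os.getcwd())

with tempfile.TemporaryDirectory() as workdir:
    # Counterpart of !mkdir demo_dir (exist_ok avoids an error if it already exists)
    os.makedirs(os.path.join(workdir, "demo_dir"), exist_ok=True)

    # Counterpart of !ls: names of the entries in a folder, as a Python list
    entries = sorted(os.listdir(workdir))
    print(entries)
```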

### **7. Utilizing existing helper functions**

Google Colab provides a set of helper functions that may be useful in some settings. View all the helper functions by clicking the '<>' icon in the left panel, as shown in the following image. 

<figure>
<center>
<img src='https://raw.githubusercontent.com/harsha89/public/master/image_func.PNG' />
<figcaption>Helper Functions</figcaption></center>
</figure>

Helper functions include:

* Form fields to capture inputs
* Camera capture
* Download file from workspace
* Execute javascript snippets
* Sample commands for importing libraries
* Connect Google Drive
* Listing files in Google Drive
* Open files from Github






In [None]:
#@markdown Forms support many types of fields.

no_type_checking = ''  #@param
string_type = 'example'  #@param {type: "string"}
slider_value = 142  #@param {type: "slider", min: 100, max: 200}
number = 102  #@param {type: "number"}
date = '2010-11-05'  #@param {type: "date"}
pick_me = "monday"  #@param ['monday', 'tuesday', 'wednesday', 'thursday']
select_or_input = "apples" #@param ["apples", "bananas", "oranges"] {allow-input: true}
#@markdown ---


<br>
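
The `#@param` annotations are ordinary Python comments, so the values set through a form are plain variables that later code can use and check. As a small illustration, the sketch below repeats three of the fields and validates them. The checks are not part of Colab; they are only an example of working with the values.

```python
import datetime

# The same kind of assignments a form cell produces; Python ignores the comments
slider_value = 142  #@param {type: "slider", min: 100, max: 200}
date = '2010-11-05'  #@param {type: "date"}
pick_me = "monday"  #@param ['monday', 'tuesday', 'wednesday', 'thursday']

# Form values arrive as ordinary ints and strings
assert 100 <= slider_value <= 200, "slider value outside the form's range"

# A date field is a 'YYYY-MM-DD' string; convert it to work with it as a date
parsed_date = datetime.date.fromisoformat(date)
print(parsed_date.year, pick_me.capitalize())
```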

### **8. GitHub**

GitHub is a code hosting platform for version control, collaboration and, more recently, public sharing of code. 
<br></br>

### GitHub Tutorial

If you haven't already, please complete the introductory tutorial provided by GitHub to familiarise yourself with the GitHub essentials.

https://guides.github.com/activities/hello-world/
<br></br>

### Connecting to a notebook in GitHub

To load a specific notebook from GitHub, append its GitHub path (the part of the link after `https://github.com/`) to `https://colab.research.google.com/github/`.

As an example, suppose the GitHub link for the Colab notebook is:

https://github.com/udacity/machine-learning/blob/master/projects/practice_projects/imdb/IMDB_In_Keras_Solutions.ipynb


Then append `udacity/machine-learning/blob/master/projects/practice_projects/imdb/IMDB_In_Keras_Solutions.ipynb` to `https://colab.research.google.com/github/`.

Link after appending:

https://colab.research.google.com/github/udacity/machine-learning/blob/master/projects/practice_projects/imdb/IMDB_In_Keras_Solutions.ipynb
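
The same rewrite can be expressed as a few lines of Python. This is a minimal sketch: the function name `github_to_colab` is hypothetical, and it assumes the input is a standard `https://github.com/...` notebook link like the one above.

```python
GITHUB_PREFIX = "https://github.com/"
COLAB_PREFIX = "https://colab.research.google.com/github/"

def github_to_colab(github_url: str) -> str:
    """Turn a GitHub notebook URL into the matching Colab URL."""
    if not github_url.startswith(GITHUB_PREFIX):
        raise ValueError("expected a URL starting with " + GITHUB_PREFIX)
    # Keep everything after the host (owner/repo/blob/branch/path.ipynb)
    return COLAB_PREFIX + github_url[len(GITHUB_PREFIX):]

github_url = (
    "https://github.com/udacity/machine-learning/blob/master/"
    "projects/practice_projects/imdb/IMDB_In_Keras_Solutions.ipynb"
)
colab_url = github_to_colab(github_url)
print(colab_url)
```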

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/udacity/machine-learning/blob/master/projects/practice_projects/imdb/IMDB_In_Keras_Solutions.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/udacity/machine-learning/blob/master/projects/practice_projects/imdb/IMDB_In_Keras_Solutions.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>
<br></br>
<br></br>

 **Make sure you send your GitHub username to cdac@latrobe.edu.au, to get access to the 2022-S2-OptusAIN GitHub repository.**

**1. Cloning the 2022-S2-OptusAIN repository (working copy)**

Cloning a repository is the process of downloading a local copy of a remote repository, such that you can contribute to the existing work.



In [None]:
# Create a new directory 'repos' before cloning the GitHub repository
# (-p avoids an error if the directory already exists)
%cd /content/drive/MyDrive/
!mkdir -p repos
%cd /content/drive/MyDrive/repos

In [None]:
# Specify your github username here
username = 'replace this text with your github username'  #@param {type: "string"}
github_root = 'CDAC-Lab'  #@param {type: "string"}
repository = '2022-S2-OptusAIN' #@param {type: "string"}

To generate your personal access token:
1. Navigate to [2022-S2-OptusAIN GitHub repo](https://github.com/CDAC-lab/2022-S2-OptusAIN)
2. Click on your profile icon at the top right-hand corner and select **Settings**

![](https://drive.google.com/uc?export=view&id=1Q2reNAPY2s5Fa1HusVbM4_MCODsvN7rM)

3. From the list of setting options listed on the left, select **Developer Settings**
4. From the developer settings, locate **Personal access tokens** and **Generate new token**
5. Fill in the section as follows:
      - Note: 2022-S2-OptusAIN-PreCourseWork
      - Select the **repo** checkbox


![](https://drive.google.com/uc?export=view&id=1hxrBJBzP56bEo1B169lCFodjsbFohXtb)

6. Select **Generate token** at the bottom of the page
7. Copy the generated token and paste it into the following code segment

In [None]:
# Specify your personal access token here
access_token = 'replace this text with your github personal access token' #@param {type: "string"}

In [None]:
# Cloning the GitHub repository
!git clone https://{username}:{access_token}@github.com/{github_root}/{repository}
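
In the clone command above, the notebook substitutes the Python variables `username`, `access_token`, `github_root` and `repository` into the shell command. The sketch below builds the same URL in Python and also produces a masked copy that is safer to print or share. The helper name and the sample values are placeholders; `quote` is applied as a precaution in case a value contains characters that are not allowed in a URL.

```python
from urllib.parse import quote

def build_clone_url(username, access_token, github_root, repository, mask=False):
    """Compose https://user:token@github.com/root/repo, optionally hiding the token."""
    token = "****" if mask else quote(access_token, safe="")
    return (
        f"https://{quote(username, safe='')}:{token}"
        f"@github.com/{quote(github_root, safe='')}/{quote(repository, safe='')}"
    )

# Placeholder values for illustration only
clone_url = build_clone_url("octocat", "ghp_exampletoken", "CDAC-Lab", "2022-S2-OptusAIN")
masked_url = build_clone_url("octocat", "ghp_exampletoken", "CDAC-Lab", "2022-S2-OptusAIN", mask=True)

# Only the masked form is printed, so the token does not end up in the cell output
print(masked_url)
```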

In [None]:
# Viewing the cloned contents
%cd /content/drive/MyDrive/repos/2022-S2-OptusAIN
!ls
!pwd

In [None]:
# Check if there are any changes made to the content in the cloned repository
!git status

**2. Create a new branch**

When contributing to a repository, make sure to create a new branch for the development work. Do not submit it directly to the master branch.

In [None]:
# Create a new branch and switch to it; your modifications will be committed there
!git checkout -b {username}/PreCourseWork-mods
!git status

**3. Staging Area**

In [None]:
# Add the changes done in the current working directory to a staging area
!git add .

**4. Committing to local repository**

In [None]:
# Set the author identity for this repository, then commit the staged changes
!git config user.email "<replace this text including the enclosing symbols <> with your github email>"
!git config user.name {username}

!git commit -m "commit pre-course work modifications"

**5. Pushing to the remote repository**

In [None]:
# Push the local repository content to the remote repository branch
!git push -u origin {username}/PreCourseWork-mods

Once you have pushed the modifications to the repository, you can navigate to the repository and view the changes in the corresponding branch.

To navigate to the branch, locate the **master branch** dropdown option and choose the specific branch you pushed the changes to.

![](https://drive.google.com/uc?export=view&id=1gTEKpIw0ecPdBvOFtr_nKWcpgyLLmoKm)
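
Steps 2 to 5 can be summarised as one ordered list of commands. The sketch below only assembles and prints the commands, so it is safe to run anywhere. The helper name is hypothetical and the username is a placeholder. Passing each inner list to `subprocess.run` from inside the cloned repository would execute it, assuming `git` is installed and `user.email` and `user.name` are already configured.

```python
import shlex

def branch_workflow(username, message, suffix="PreCourseWork-mods"):
    """Return the git commands for branch, stage, commit and push, in order."""
    branch = f"{username}/{suffix}"
    return [
        ["git", "checkout", "-b", branch],        # 2. create and switch to a new branch
        ["git", "add", "."],                      # 3. stage all changes
        ["git", "commit", "-m", message],         # 4. commit to the local repository
        ["git", "push", "-u", "origin", branch],  # 5. push the branch to the remote
    ]

commands = branch_workflow("octocat", "commit pre-course work modifications")
for argv in commands:
    # shlex.join quotes arguments that contain spaces
    print(shlex.join(argv))
```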

<br>

### **9. More exercises**

Please follow this [link](https://colab.research.google.com/notebooks/intro.ipynb) for Google Colab's own resources for getting started. 

Try to complete as many of these as you can, including the [Machine Learning Examples](https://colab.research.google.com/notebooks/intro.ipynb#scrollTo=P-H6Lw1vyNNd).

If you have any questions or concerns, please post them on the [LMS Discussion Forum](https://lms.latrobe.edu.au/mod/forum/view.php?id=5850706). 

