

# **Online MESA Bootcamp Machine Learning Notebook**

## **Introduction**

Machine learning is being applied to many problems in science and technology. In this tutorial, we will

* Introduce the basic concepts of machine learning,
* Run through some examples of supervised and unsupervised machine learning, and finally
* Apply machine learning to various real-world applications.



## **How to use Google Colab**

Today we will be using Google Colab. This is an online notebook program that is like Microsoft Word, but it also lets us run the computer programs that we will use to build machine learning models.

To begin, we want to download all the notebooks that we will be using to your Google Drive. This requires you to have a Google account, which includes your own personal Google Drive. Once you download these notebooks to your Google Drive, you will be able to save answers to questions, add notes to your notebooks, and access your notebooks when you get home!

### **First: Open Google Chrome**

We prefer that you use Google Chrome for this project.

If this is not possible, don't worry. Use the web browser that you would usually use and are comfortable with.

### **Second: Open your Google Account**

**Make sure you have a Google account and can log in to it**. If you have Gmail, then you have a Google account.
* If you have a Google account, make sure you can log in to it. [Click here to test if you can log in to your Gmail account](http://www.gmail.com).
* If you don't have a Google account, [click here to get a new Google account](https://accounts.google.com/signup/v2/webcreateaccount?hl=en&flowName=GlifWebSignIn&flowEntry=SignUp) before continuing.

Once you have done this, click the blue `Sign in` button at the top right of this page (see below):

<center>
<img src="https://github.com/geoffreyweal/MESA_Bootcamp_2023_ML_Tutorial/blob/main/images/Part_1.0/signin.png?raw=true" alt="drawing" width="900"/>
</center>

If you are asked to log in, go through the login page, where you will enter your Gmail address (Google account) and your Google password.

Once you have signed in, you will see that the blue `Sign in` button will have changed to a coloured circle with a capital letter in it.

<center>
<img src="https://github.com/geoffreyweal/MESA_Bootcamp_2023_ML_Tutorial/blob/main/images/Part_1.0/signedin.png?raw=true" alt="drawing" width="900"/>
</center>

### **Third: Connect this notebook to your Google Drive**

We will now download the notebooks that we will be using to your Google Drive.

1. To begin, **hover your mouse over the line of code in the shaded box below and click on the <img src="https://github.com/GardenGroupUO/Computational_Silver_Nanoparticle_Exercise_Data/blob/main/Images/stop_images/playsvg.png?raw=true" alt="drawing" width="25"/> play button that will appear to the left of the shaded box**. This will run our code below.
   * If a message appears that says **"Warning: This notebook was not authored by Google."**, click the button that says **"Run anyway"**.
2. Next, you **may** see one of the following prompts pop up in your web browser:
   * You may be asked to **"Permit this notebook to access your Google Drive files?"**. Click the button that says **"Connect to Google Drive"**.
   * Or, you may be asked to **"Go to this URL in a browser"**. **Click on this link**.
3. A new webpage will open in a new tab. **This webpage will ask you to log in to your Google account**.
   * You may be told that **Google Drive for desktop wants additional access to your Google Account**. Under the **Select what Google Drive for desktop can access** title, tick all the boxes that you are given (don't worry if you don't see any boxes). Then click the `Continue` button at the bottom of this pop-up page (you may need to scroll down to see this button).
4. Once you have logged in through this website, the website *may* give you an authorization code. If it does, **copy the authorization code that the website gives you**. Then **paste this authorization code (Windows/Linux: Ctrl+V / Mac:** <img src="https://github.com/GardenGroupUO/Computational_Silver_Nanoparticle_Exercise_Data/blob/main/Images/Part_1.0/command_mac.png?raw=true" alt="drawing" width="14"/>**+V)** into the box below called "Enter your authorization code:". **Then press the Enter key**.
   * If you don't get an authorization code but instead the code tells you the drive is mounted, that's all good. Google has done everything we need without requiring an authorization code.

In [None]:
from google.colab import drive
print('-------------------------------------')
print('If you see a message below, ignore it')
# Unmount any Drive left over from an earlier run so that we start fresh.
drive.flush_and_unmount()
print('-------------------------------------')
print('The below message should say that a drive has been mounted')
# Mount your Google Drive so that it appears under /content/drive.
drive.mount('/content/drive')
print('-------------------------------------')

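After running the cell above, one way to confirm that the mount worked is to check that the Drive folder is visible to the notebook. The sketch below is a minimal illustration only: the helper name `drive_is_mounted` is hypothetical, and it assumes the Drive appears at `/content/drive/MyDrive`, the path used by the download cell in this notebook.

```python
import os

def drive_is_mounted(path='/content/drive/MyDrive'):
    # True when the Google Drive folder is visible to this notebook.
    return os.path.isdir(path)

if drive_is_mounted():
    print('Google Drive is mounted.')
else:
    print('Google Drive is not mounted yet. Run the mount cell again.')
```

If the check reports that the Drive is not mounted, rerun the mount cell and follow the prompts before moving on.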

### **Fourth: Download notebooks to your Google Drive**

You have now connected your Google Drive to this notebook. Now we will download the notebooks we will be using to your Google Drive.

**Run the code below by hovering your mouse over the line of code below and clicking on the <img src="https://github.com/GardenGroupUO/Computational_Silver_Nanoparticle_Exercise_Data/blob/main/Images/stop_images/playsvg.png?raw=true" alt="drawing" width="25"/> button when it appears**.

If a message appears that says "Warning: This notebook was not authored by Google.", click the button that says "Run anyway".

In [None]:
#@markdown <font color="black" size="+2">←</font><font color="red" size="+1"> **Click the play button to download our notebooks to your Google Drive**</font>

!echo --------------------------------
!echo Installing programs for downloading notebooks from Github
!apt install subversion &> /dev/null
!echo Installed programs for downloading notebooks from Github
!echo --------------------------------

import os, subprocess

def download_notebooks(name_of_folder, path_to_folder):
    # Export the repository's trunk from GitHub into path_to_folder/name_of_folder.
    url = 'https://github.com/geoffreyweal/' + name_of_folder + '/trunk/'
    destination = path_to_folder + '/' + name_of_folder
    # Pass the command as a list so that paths containing spaces are not split apart.
    result = subprocess.run(['svn', 'export', url, destination], stdout=subprocess.PIPE)
    # A return code of 0 means svn finished without an error.
    return result.returncode == 0

path_to_folder = '/content/drive/MyDrive'
name_of_folder = 'MESA_Bootcamp_2023_ML_Tutorial_Notebooks'

if os.path.exists(path_to_folder):
    if name_of_folder not in os.listdir(path_to_folder):
        print('Downloading Notebooks')
        if download_notebooks(name_of_folder, path_to_folder):
            print('Notebooks downloaded to '+str(name_of_folder)+' folder.')
        else:
            print('Error: The notebooks could not be downloaded.')
    else:
        print('You already have downloaded these notebooks to your Google Drive.')
        print('If you want to re-download these notebooks to your Google Drive, ')
        print('delete the folder called '+str(name_of_folder)+' and try running this code again.')
        print('See https://drive.google.com/ to access your Google Drive.')
        print('You should see the folder called '+str(name_of_folder)+' here.')
        print('Delete this folder.')
else:
    print('Error: Could not download notebooks because you have not mounted your Google Drive')
    print('Run the "Connect this notebook to your Google Drive" step above before trying this step again')
!echo --------------------------------
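
GitHub has retired the Subversion interface that `svn export` relies on, so the download cell may fail on a current setup. As a hedged alternative, the sketch below builds a `git clone` command for the same repository. The helper name `build_clone_command` and the `.git` URL form are assumptions and not part of the original notebook. The sketch only prints the command; the line that runs it is left commented out.

```python
import subprocess

def build_clone_command(name_of_folder, path_to_folder, owner='geoffreyweal'):
    # Assumed repository URL, derived from the owner and folder name used in this notebook.
    url = 'https://github.com/' + owner + '/' + name_of_folder + '.git'
    destination = path_to_folder + '/' + name_of_folder
    # --depth 1 fetches only the latest snapshot of the repository.
    return ['git', 'clone', '--depth', '1', url, destination]

command = build_clone_command('MESA_Bootcamp_2023_ML_Tutorial_Notebooks',
                              '/content/drive/MyDrive')
print(' '.join(command))
# To actually download, uncomment the next line (requires git and internet access):
# subprocess.run(command, check=True)
```

Unlike `svn export`, `git clone` also creates a hidden `.git` folder inside the destination. It can be left alone or deleted.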

We have now downloaded our notebooks to your Google Drive, into a folder called ``MESA_Bootcamp_2023_ML_Tutorial_Notebooks``. We are now ready to go!

### **Fifth: Open your Google Drive and look at your notebooks for this lesson**

We will now open our Google Drive and look at our downloaded folders.
1. Click the following link to **open your Google Drive: https://drive.google.com/drive/u/0/my-drive**. This will **open a new tab that contains your Google Drive**.
2. **Click to open the folder called ``MESA_Bootcamp_2023_ML_Tutorial_Notebooks``**.

You will now see all the notebooks in this folder. **These notebooks all have names that begin with ``Part`` and end with ``.ipynb``**.


<img src="https://github.com/geoffreyweal/MESA_Bootcamp_2023_ML_Tutorial/blob/main/images/Part_1.0/gdrive.png?raw=true" alt="drawing" width="3000"/>

-----------------------------------------------------------------------------------------------------------------------------------------------------------

<img src="https://github.com/geoffreyweal/MESA_Bootcamp_2023_ML_Tutorial/blob/main/images/Part_1.0/MESA_Bootcamp_ML_google_colab_gdrive_page.png?raw=true" alt="drawing" width="3000"/>
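
The same list of worksheets can also be produced from a code cell. The sketch below is a minimal illustration: the helper name `list_part_notebooks` is hypothetical, and the folder path assumes the Drive is mounted at `/content/drive/MyDrive` and that the folder has the name set in the download cell (`MESA_Bootcamp_2023_ML_Tutorial_Notebooks`).

```python
import os

def list_part_notebooks(folder):
    # Return the file names that begin with 'Part' and end with '.ipynb', sorted by name.
    if not os.path.isdir(folder):
        return []
    return sorted(name for name in os.listdir(folder)
                  if name.startswith('Part') and name.endswith('.ipynb'))

folder = '/content/drive/MyDrive/MESA_Bootcamp_2023_ML_Tutorial_Notebooks'
for name in list_part_notebooks(folder):
    print(name)
```

An empty result usually means that the Drive is not mounted or that the download step has not been run yet.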

You are now ready to go and learn about machine learning!

## **Worksheet Lessons**

In this online lesson, we will use and write Python code to create machine learning programs using scikit-learn.

**Whenever you want to work on one of the worksheets below, go to your [Google Drive](https://drive.google.com/drive/u/0/) and click on the ``MESA_Bootcamp_2023_ML_Tutorial_Notebooks`` folder to open up these notebooks**. Get familiar with using the Google Drive webpage, as we will be using it to access all our worksheets.

The worksheets we will be using are:

### **Part 1: Getting started**

In this section, we will learn some basics of running Python scripts on Google Colab, as well as review machine learning and scikit-learn, the Python library that we will use to make our machine learning models.

* **Part 1.1** (*Part_1.1_Intro_to_Colab.ipynb*): Introduction to Python3 and Google Colab. **When you are ready, start with this worksheet.**
* **Part 1.2** (*Part_1.2_Introduction_to_Machine_Learning.ipynb*): General Introduction to Machine Learning with Python: scikit-learn.

### **Part 2: Supervised Learning: Classification**

In this section, we will learn about supervised machine learning, and specifically look at how classification methods work and how to use them, using the k-Nearest Neighbour (kNN) algorithm.
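
To give a flavour of the idea before opening the Part 2 worksheets, here is a minimal pure-Python sketch of kNN classification: a new point is given the label that is most common among its k closest training points. This is an illustration only and not the worksheets' code. The function name `knn_predict` and the toy data are made up for this example.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    # Sort the training points by their Euclidean distance to the query point.
    by_distance = sorted(zip(train_points, train_labels),
                         key=lambda pair: math.dist(pair[0], query))
    # Take the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data (hypothetical): two small clusters labelled 'A' and 'B'.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ['A', 'A', 'A', 'B', 'B', 'B']

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (4.0, 4.0)))
```

The first query sits inside the 'A' cluster and the second inside the 'B' cluster, so each takes the label of its neighbours.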

### **Part 3: Unsupervised Learning: Dimensionality Reduction**

In this section, we will learn about unsupervised machine learning, and specifically look at dimensionality reduction methods such as the Principal Component Analysis (PCA) algorithm.

We will learn how to use the PCA algorithm, and how to combine the results of PCA and other unsupervised algorithms with classification algorithms to build a machine vision algorithm.

### **Part 4: Applications of Machine Learning Algorithms**

In this section, we will use the skills we have learnt from Parts 2 and 3 to apply machine learning to a practical chemistry application. Here, we will look at how we can predict the quality of red and white wine by using various properties of several wines.

Designed for Google Colab (online) and Visual Studio Code (on computer)

References:

* Andreas C. Müller and Sarah Guido: *Introduction to Machine Learning with Python*
* Jake Vanderplas: https://github.com/jakevdp/sklearn_tutorial/
* NeSI (nesi.org.nz): https://github.com/nesi/sklearn_tutorial