# <a id='toc1_'></a>[COMP712 Classical Artificial Intelligence](#toc0_)

# <a id='toc2_'></a>[Workshop: Data Science using Python](#toc0_)

Dr Daniel Zhang @ Falmouth University\
2023-2024 Study Block 1

<div id="top"></div>

# Table of contents<a id='top'></a><a id='toc0_'></a>    
- [COMP712 Classical Artificial Intelligence](#toc1_)    
- [Workshop: Data Science using Python](#toc2_)    
- [<a id='toc0_'></a>](#toc3_)    
  - [Introduction](#toc3_1_)    
  - [Jupyter Notebook Basics](#toc3_2_)    
- [The Notebook Elements](#toc4_)    
  - [The Main Toolbar](#toc4_1_)    
  - [Cell Types](#toc4_2_)    
    - [Code Cells](#toc4_2_1_)    
    - [Markdown Cell](#toc4_2_2_)    
    - [Raw Cell](#toc4_2_3_)    
- [Working with `NumPy` and `Matplotlib`](#toc5_)    
  - [Task 1: Array manipulation](#toc5_1_)    
  - [Task 2: Plot the first 10 rows of your matrix](#toc5_2_)    
  - [Task 3: Implement the Game of Life](#toc5_3_)    
  - [Task 4: The magic square](#toc5_4_)    
- [Note](#toc6_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

## <a id='toc3_1_'></a>[Introduction](#toc0_)

[Top](#top)

Welcome to this workshop on Data Science using Python! If this notebook has launched successfully, all the necessary prerequisites are met and you're all set to begin!

## <a id='toc3_2_'></a>[Jupyter Notebook Basics](#toc0_)

[Top](#top)

JupyterLab, the successor of Jupyter Notebook, is a powerful interactive development environment (IDE) that revolutionises the way we work with data, code, and visualisations. At its core, JupyterLab provides a flexible and user-friendly interface that combines various elements crucial for data analysis and scientific computing. 

JupyterLab provides an interactive notebook interface where you can write and execute code, add text explanations using markdown, and visualise results in real-time. This enables seamless transitions between various tasks, promoting a highly interactive and collaborative workflow.

At the heart of JupyterLab lies the concept of kernels, which are computational engines responsible for executing code within notebooks. This architecture allows users to work with different programming languages within the same interface. Users can choose from various kernels, such as `Python`, `R`, `Julia`, and others, making JupyterLab a versatile environment for multi-language development and analysis. Overall, JupyterLab's simplicity, versatility, and integration of different tools make it an indispensable platform for data scientists, researchers, and educators seeking an intuitive yet powerful environment for data exploration and analysis.

# <a id='toc4_'></a>[The Notebook Elements](#toc0_)

[Top](#top)

After opening a notebook, the left file panel remains unchanged. The main area displays the notebook's contents and handles interaction with the Python kernel.

## <a id='toc4_1_'></a>[The Main Toolbar](#toc0_)

[Top](#top)

The JupyterLab main toolbar consists of buttons that provide quick access to essential functionalities, as shown below.

![JupyterLab Toolbar Buttons](jupyterlab_toolbar.png)

Here's an overview of some common toolbar buttons:

1. `Save`: The floppy disk icon represents the "Save" button, allowing users to save changes made to the notebook. Clicking this button or using the shortcut <kbd>Ctrl + S</kbd> (or <kbd>Cmd + S</kbd> on Mac) saves the notebook.

2. `Add Cell`: This button adds a new cell to the notebook. Clicking the "<kbd>+</kbd>" icon creates a new cell below the currently selected cell.

3,4,5. `Cut`, `Copy`, `Paste`: The scissors, copy, and clipboard icons respectively perform cut, copy, and paste operations on cells within the notebook.

6. `Run`: The "Run" button executes the code in the currently selected cell. Pressing this button or using <kbd>Shift + Enter</kbd> runs the cell and displays the output below the code cell.

7. `Interrupt Kernel`: The square "stop" icon is used to interrupt or halt the execution of code cells. It stops the execution of a cell that's taking too long to run or is stuck in an infinite loop.

8. `Restart Kernel`: The circular arrow icon restarts the kernel. Restarting the kernel resets the computational state, clearing all variables and previously executed code. Use this button cautiously, as it resets the notebook's memory.

9. `Restart Kernel and Run All Cells`: The fast-forward icon restarts the kernel and executes all cells in the current notebook one after another. This button resets the notebook's memory as well.

10. `Cell Type`: The dropdown menu allows users to change the cell type (`Code`, `Markdown`, and `Raw`) of the selected cell, which will be explained below in detail.


There is a kernel status indicator area on the notebook's top-right corner. This indicator displays the status of the kernel (the computational engine) associated with the notebook. It shows whether the kernel is idle, busy executing code, or has encountered an error, as shown below.

![Kernel Status](jupyterlab_kernel.png)

## <a id='toc4_2_'></a>[Cell Types](#toc0_)

[Top](#top)

Inside the notebook, you'll find cells where you can write code or text. The current cell can be executed by pressing <kbd>Shift + Enter</kbd> or the `Run` button in the main toolbar. When a cell is selected, extra shortcut buttons appear at its top-right corner. These buttons move the cell up or down, duplicate it, add a new cell above or below it, or delete it from the notebook.

In JupyterLab, there are primarily three types of cells: `Code`, `Markdown`, and `Raw`.

### <a id='toc4_2_1_'></a>[Code Cells](#toc0_)

[Top](#top)

These cells are used to write and execute code. When you enter Python, R, Julia, or any other supported language's code into a code cell, you can run it by pressing <kbd>Shift + Enter</kbd>. The output of the code appears directly below the cell. Code cells are where you perform computations, define functions, import libraries, and execute algorithms. They are the core components for interactive programming within Jupyter notebooks.

For example, the following cell is a **`code`** cell that displays the information of your machine and operating system. 

> **Note**: `%%time` is a handy magic command that I use a lot. Placed on the first line of any `code cell`, it measures the running time of that cell.

In [None]:
%%time

import platform
print('System information: ' + platform.machine() + '-' + platform.system()  + '-' + platform.version())

> **Note**: The output of a cell can be removed by right-clicking on the cell and selecting `Clear Cell Output` from the pop-up menu items. Or you can clear all the output blocks in the current notebook by selecting `Clear Outputs of All Cells`.

### <a id='toc4_2_2_'></a>[Markdown Cell](#toc0_)

[Top](#top)

Markdown cells are used for text explanations, formatted documentation, and commentary within the notebook. Markdown is a lightweight markup language that allows users to add formatted text, headings, lists, images, hyperlinks, and more. Users can create rich text content by applying simple syntax, making it a versatile tool for adding context, explanations, or instructions alongside code. For instance, most of this workshop's material is written in `Markdown` cells.

### <a id='toc4_2_3_'></a>[Raw Cell](#toc0_)

[Top](#top)

Raw cells are rarely used. They store unformatted text or content that should not be executed: whatever you type is saved in the notebook file as-is and is neither rendered as Markdown nor run as code. This can be useful for including raw data or annotations that don't require formatting. 

An example of a `Raw` cell is shown below. In practice, you might not use it very often, or at all, as the combination of `Code` and `Markdown` cells can already fulfil our requirements.

# <a id='toc5_'></a>[Working with `NumPy` and `Matplotlib`](#toc0_)

[Top](#top)

`NumPy` is the core library for scientific computing in Python. It provides high-performance arrays and matrices for efficient data manipulation. `Matplotlib` provides useful functions for data visualisation. As we already discussed their capabilities in the lecture session, some tasks are set up for you to practise. 

Let's import the relevant libraries first!

In [None]:
%%time 

# import numpy library 
import numpy as np 
print(f'NumPy version: {np.__version__}')

# import matplotlib for visualisation, you'll be learning this later
import matplotlib
from matplotlib import pyplot as plt
print(f'Matplotlib version: {matplotlib.__version__}')

# import time and other modules
import time
from IPython.display import clear_output

def animate_3d_mat(mat, t=0.1):
    ''' display a 3D matrix as an animation; each layer is shown as one image frame
        mat: TxMxN matrix (ndarray in NumPy), where T is the number of time ticks
        t: pause between frames in seconds (default 0.1)
    '''
    if not isinstance(mat,np.ndarray):
        mat = np.array(mat) 
    if mat.ndim != 3:
        print(f'ERROR: the input must be a 3D matrix, current dimension = {mat.ndim}')
        return 
    for layer in mat: 
        clear_output(wait=True)
        display_as_image(layer)
        if t>0:
            time.sleep(t)

def is_2d_array(arr):
    ''' return True if the input is a 2D NumPy array or a Python list of lists '''
    if not isinstance(arr, np.ndarray) and not isinstance(arr,list):
        print('ERROR: the input must be either NumPy ndarray or Python 2D list')
        return False
    if isinstance(arr, np.ndarray) and arr.ndim != 2:
        print(f'ERROR: the input NumPy array must be a 2D array, current dim = {arr.ndim}')
        return False
    if isinstance(arr, list):
        if len(arr) == 0:
            return False
        if not isinstance(arr[0], list):
            print('ERROR: the input Python list must be a list of lists')
            return False
    return True

def print_data(arr):
    ''' print numpy array and python list raw data separated by space  '''
    if not is_2d_array(arr):
        return 
    for row in arr:
        for col in row:
            print(f'{col}\t', end=' ')
        print()

def print_text(arr): 
    ''' print numpy array and python list data as text without space '''
    if not is_2d_array(arr):
        return 
    for row in arr:
        print(''.join([chr(c) for c in row]))

def display_as_image(arr): 
    ''' display a 2D array as image '''
    if not is_2d_array(arr):
        return
    arr = np.array(arr)
    plt.imshow(arr)
    plt.axis('off')
    plt.show()

## <a id='toc5_1_'></a>[Task 1: Array manipulation](#toc0_)

Create a 2D array of random integers and sort each row in ascending order.
Then concatenate two arrays vertically, horizontally, and as layers along a third dimension.
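
If you are unsure where to start, here is a minimal sketch of the NumPy calls that could be used for each step. The array shapes, the value range, and the seed are arbitrary choices for illustration, not part of the task.

```python
import numpy as np

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(seed=42)

# Two 3x4 arrays of random integers in [0, 100).
a = rng.integers(0, 100, size=(3, 4))
b = rng.integers(0, 100, size=(3, 4))

# Sort each row in ascending order (axis=1 sorts within every row).
a_sorted = np.sort(a, axis=1)

# Stack vertically (more rows), horizontally (more columns), and in depth (layers).
v = np.vstack((a, b))   # shape (6, 4)
h = np.hstack((a, b))   # shape (3, 8)
d = np.dstack((a, b))   # shape (3, 4, 2)

print(a_sorted)
print(v.shape, h.shape, d.shape)
```

`np.dstack` puts the layers on the last axis. If you would rather have the layers on the first axis, `np.stack((a, b), axis=0)` gives shape `(2, 3, 4)`.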

In [None]:
%%time

# YOUR CODE HERE

# create a 2D array with random integers

# sorting each row in ascending order

# concatenate two arrays vertically

# horizontally

# and as layers in the third dimension


## <a id='toc5_2_'></a>[Task 2: Plot the first 10 rows of your matrix](#toc0_)

Now, make use of the matrix (or 2D array) you created in the last task since all variables created in the notebook are available during the session. They will be ready to use unless the kernel is restarted. Plot the first 10 rows in the same plot.
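
The comments in the code cell for this task ask for the rows to be ordered by the first column, with ties broken by the second, third, and so on. One possible way to do that is `np.lexsort`, sketched below together with a simple line plot. The helper names `sort_rows_lexicographically` and `plot_first_rows` are made up for this example, and the value range is chosen only so that ties actually occur.

```python
import numpy as np

def sort_rows_lexicographically(mat):
    """Return a copy of `mat` with rows ordered by column 0, ties broken by column 1, and so on."""
    mat = np.asarray(mat)
    # np.lexsort treats the LAST key as the primary one, so the columns are passed in reverse order.
    order = np.lexsort(mat.T[::-1])
    return mat[order]

def plot_first_rows(mat, k=10):
    """Draw the first k rows of `mat` as k lines on one set of axes."""
    # Imported here so the sorting helper above can be used without Matplotlib installed.
    from matplotlib import pyplot as plt
    for i, row in enumerate(mat[:k]):
        plt.plot(row, label=f'row {i}')
    plt.xlabel('column index')
    plt.ylabel('value')
    plt.legend(fontsize='small', ncol=2)
    plt.show()

rng = np.random.default_rng(seed=0)
# A small value range makes ties in the first column likely, which exercises the tie-breaking.
mat = rng.integers(0, 10, size=(50, 100))
sorted_mat = sort_rows_lexicographically(mat)
```

Calling `plot_first_rows(sorted_mat)` then draws the ten rows on the same axes.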

In [None]:
%%time 

# YOUR CODE HERE

# create a 2D integer matrix of size 50x100

# sort the rows according to the columns, if the 1st column is equal, then compare the 2nd, 3rd, and so on

# plot the first 10 rows on the same plot


## <a id='toc5_3_'></a>[Task 3: Implement the Game of Life](#toc0_)

Implement the Game of Life on a 50x100 grid board using NumPy arrays and the rules of the game to simulate cell evolution over 60 generations (increase the number of generations if you like).

**The Rules** 

The universe of the Game of Life is an infinite, two-dimensional orthogonal grid of square cells, each of which is in one of two possible states, live or dead (or populated and unpopulated, respectively). Every cell interacts with its eight neighbours, which are the cells that are horizontally, vertically, or diagonally adjacent. At each step in time, the following transitions occur:

- Any live cell with fewer than two live neighbours dies, as if by underpopulation.
- Any live cell with two or three live neighbours lives on to the next generation.
- Any live cell with more than three live neighbours dies, as if by overpopulation.
- Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.

The initial pattern constitutes the seed of the system. The first generation is created by applying the above rules simultaneously to every cell in the seed, live or dead; births and deaths occur simultaneously, and the discrete moment at which this happens is sometimes called a tick. Each generation is a pure function of the preceding one. The rules continue to be applied repeatedly to create further generations.
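
As a rough sketch of how the four rules can be applied to every cell at once, the function below counts live neighbours by summing eight shifted copies of the board. It assumes a finite board where everything outside the edges is permanently dead, which is only one way to approximate the infinite grid (wrapping around the edges is another). The name `life_step` and the small demo board are illustrative choices.

```python
import numpy as np

def life_step(board):
    """Apply one tick of the rules to a 2D array of 0s (dead) and 1s (live)."""
    board = np.asarray(board, dtype=np.uint8)
    m, n = board.shape
    # Surround the board with a ring of dead cells so that edge cells also have eight neighbours.
    padded = np.pad(board, 1, mode='constant', constant_values=0)
    # Sum the eight shifted views of the padded board to count live neighbours of every cell.
    neighbours = sum(
        padded[1 + dy : 1 + dy + m, 1 + dx : 1 + dx + n]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    survive = (board == 1) & ((neighbours == 2) | (neighbours == 3))
    born = (board == 0) & (neighbours == 3)
    return (survive | born).astype(np.uint8)

# Demo: a 3-cell "blinker" on a 5x5 board, stored generation by generation.
generations = np.zeros((4, 5, 5), dtype=np.uint8)
generations[0, 2, 1:4] = 1
for t in range(1, len(generations)):
    generations[t] = life_step(generations[t - 1])
```

Because each layer depends only on the previous one, the same loop can fill a larger `T x M x N` array from any seed placed in layer 0.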

Check [**the Wiki page**](https://www.wikiwand.com/en/Conway%27s_Game_of_Life) for more information.

In [None]:
%%time

# define the parameters
T, M, N = 60, 50, 100
mat = np.zeros((T,M,N))

# YOUR CODE HERE, fill the 3D matrix mat

# animate the generations layer by layer
animate_3d_mat(mat,0.1)

Challenge yourself to explore additional initial states that enable the Game of Life to run indefinitely.

## <a id='toc5_4_'></a>[Task 4: The magic square](#toc0_)

[Top](#top)

A square matrix of order `n` filled with the distinct positive integers `1` to `n^2` is a magic square if the sums of the numbers in each row, each column, and both main diagonals are the same. For example, the `3x3` magic square shown below has a sum of 15 for each row, each column, and both diagonals. The size of the square is sometimes called its `order`.

```python
    2 7 6
    9 5 1
    3 4 8
```

The concept of the 3x3 magic square traces back to ancient China, as early as 650 BCE, documented in the legend of the ***Lo Shu*** (洛书) or the "***scroll of the river Lo***". Legend has it that during an ancient flood in China, King Yu endeavoured to divert the water. During this time, a turtle emerged from the floodwaters, bearing an intriguing pattern on its shell: a 3×3 grid with circular dots of numbers. The arrangement was such that the sum of numbers in each row, column, and diagonal equalled the magical number 15, as shown below.


![Luo Shu](https://upload.wikimedia.org/wikipedia/commons/e/e2/Magic_square_Lo_Shu.png)

Chinese mathematicians were familiar with the third-order magic square as early as 190 BCE. This marks the earliest recorded instance of a magic square in human history.

Create a function that returns the magic square of a given size `n`. The sum, also called the *magic number* `M`, can be calculated using the equation: 

```latex
M = n(n^2+1)/2
```

In [None]:
%%time 

def magic_square(n):
    ''' generate a magic square of size n with magic number n(n^2+1)/2
        return empty matrix if n < 2 
    '''

    # YOUR CODE HERE
    
    return np.array([[2,7,6],[9,5,1],[3,4,8]])

# show the magic square
N = 3
m = magic_square(N)
print_data(m)

## Task 5: Plotting the Monte Carlo `pi`

[Top](#top)

Use `matplotlib` to visualise the estimation of `pi` with the Monte Carlo method. The mathematics involved is very simple, so you can focus on the plotting capabilities of `matplotlib`. 

![Monte Carlo Calculation of PI](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d4/Pi_monte_carlo_all.gif/406px-Pi_monte_carlo_all.gif)
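
For reference, the idea behind the animation can be sketched as follows: points are drawn uniformly from the unit square, and the fraction that falls inside the quarter circle of radius 1 approximates `pi/4`. The function names, the sample size, and the seed below are illustrative assumptions, and the static scatter plot is just one of many ways to present the result.

```python
import numpy as np

def sample_points(n, seed=None):
    """Draw n points uniformly from the unit square and flag those inside the quarter circle."""
    rng = np.random.default_rng(seed)
    xy = rng.random((n, 2))
    inside = (xy ** 2).sum(axis=1) <= 1.0
    return xy, inside

def estimate_pi(inside):
    """The quarter circle covers pi/4 of the unit square, so pi is about 4 * (fraction inside)."""
    return 4.0 * inside.mean()

def plot_points(xy, inside):
    """Scatter the sampled points, coloured by whether they fall inside the quarter circle."""
    # Imported here so the estimate can be computed without Matplotlib installed.
    from matplotlib import pyplot as plt
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.scatter(xy[inside, 0], xy[inside, 1], s=2, color='tab:red', label='inside')
    ax.scatter(xy[~inside, 0], xy[~inside, 1], s=2, color='tab:blue', label='outside')
    ax.set_aspect('equal')
    ax.set_title(f'n = {len(xy)}, pi estimate = {estimate_pi(inside):.4f}')
    ax.legend(loc='upper right')
    plt.show()

xy, inside = sample_points(20_000, seed=1)
print(estimate_pi(inside))
```

Calling `plot_points(xy, inside)` draws the figure. Increasing `n` should tighten the estimate, at the cost of a slower plot.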

In [None]:
%%time

# YOUR CODE HERE

# <a id='toc6_'></a>[Note](#toc0_)

No submission is required for this workshop.