---
# INTERMEDIATE PYTHON PROGRAMMING
# CHAPTER 5 - Use of AI, Data Visualization & Getting Started with Cloud Computing
---

# Use of AI for Programming Learning and Programming Productivity

![Github Copilot](https://code.visualstudio.com/assets/docs/copilot/shared/github-copilot-social.png)

## AI for Programming Learning
- Chatbots guide you through concepts like **loops**, **functions**, and **object-oriented programming**
- AI tools generate explanations of existing code
- AI can translate code from one programming language to another
- AI suggests quick fixes for common errors and points out best practices

## AI for Programming Productivity
- Code assistant: completes code as you type and generates code snippets
- Debugging: identifies errors and recommends fixes
- Documentation: generates comments and documentation, making the code more readable

Many developer tools now come with built-in AI features, and some include generative AI. **GitHub Copilot** is one example. Similar products include Tabnine and CodeWhisperer.

[Click here to find out more about Github Copilot](https://github.com/features/copilot)




## Sample Data for Practicing Generative AI Prompts

This part demonstrates how to use generative AI (GenAI) to produce code and how to use that code in a notebook.

### Importing pandas

In [None]:
import pandas as pd

### Creating `students` sample DataFrame

In [None]:
students = pd.DataFrame({
    "StudentID": ["1001", "1002", "1003", "1004", "1005"],
    "Name": ["Andy", "Ben", "Cathy", "Debra", "Eva"],
    "Age": [20, 22, 23, 22, 21],
    "Sex": ["male", "male", "female", "female", "female"],
    "Year": [1, 3, 4, 3, 2]
})
students.set_index('StudentID', inplace=True) # set StudentID as indexing column
students

In [None]:
students

### Showing metadata

In [None]:
students.info()

In [None]:
students.index

## Prompt Examples
Let's use Microsoft Copilot to practice.

Go to [https://copilot.microsoft.com/](https://copilot.microsoft.com/) and try the following prompts.

### Simple prompts
- shorter  
- quicker to type  
- the code returned is usually not directly usable and needs revising  
```
pandas select row or rows

explain pandas iloc function

compare pandas iloc function to loc function

delete row in pandas dataframe

pandas drop a column

explain pandas dataframe dropna() function

```

### Complex prompts
- longer and more specific
- take more effort to describe and type
- the code returned is very likely to be directly usable in your notebook

```
a pandas dataframe named 'students', how to delete the column named 'Name'

a pandas dataframe named 'students', how to delete the column named 'Name' and 'Sex'

a pandas dataframe named 'students', show all the rows that have 'female' value for 'Sex' column

a pandas dataframe named 'students', show all the rows that have 'female' value for 'Sex' column or value for 'Year' column is '3'
```
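
The complex prompts above tend to produce pandas code along the following lines. This is a hand-written sketch of plausible answers, not actual Copilot output. It rebuilds the `students` sample DataFrame so that it runs by itself, and the result names (`without_name`, `without_name_sex`, `females`, `female_or_year3`) are illustrative only.

```python
import pandas as pd

# Rebuild the sample DataFrame from the earlier section
students = pd.DataFrame({
    "StudentID": ["1001", "1002", "1003", "1004", "1005"],
    "Name": ["Andy", "Ben", "Cathy", "Debra", "Eva"],
    "Age": [20, 22, 23, 22, 21],
    "Sex": ["male", "male", "female", "female", "female"],
    "Year": [1, 3, 4, 3, 2],
}).set_index("StudentID")

# Delete one column; drop() returns a new DataFrame and leaves `students` unchanged
without_name = students.drop(columns="Name")

# Delete several columns at once by passing a list
without_name_sex = students.drop(columns=["Name", "Sex"])

# Rows where Sex is 'female'
females = students[students["Sex"] == "female"]

# Rows where Sex is 'female' OR Year is 3
# (Year holds integers, so compare with 3 rather than the string '3')
female_or_year3 = students[(students["Sex"] == "female") | (students["Year"] == 3)]

print(female_or_year3)
```

Note that the last prompt writes the year as `'3'`. Generated code may copy that as a string comparison, which matches nothing in an integer column, so always check the answer against your data types.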

# WHAT IS DATA VISUALIZATION?

![Data Visualization](https://echarts.apache.org/en/asset/theme/thumb/vintage.png?_v_=20240226)

Data visualization is the process of representing data graphically to **make complex information easier to understand and analyze**. It uses charts, graphs, maps, and other visual formats to reveal patterns, trends, and relationships within data.

Data visualization is widely used in business analytics, scientific research, finance, healthcare, and AI to extract valuable insights.

## Why is Data Visualization Important?

**Simplifies Complex Data** – Helps make large datasets more digestible.

**Enhances Decision-Making** – Identifies trends and insights quickly.

**Highlights Patterns & Outliers** – Makes anomalies and correlations easier to spot.

**Engages & Communicates Effectively** – Improves storytelling with data.

## Types of Charts

![Chart Types](https://res.cloudinary.com/dry8rzbyx/image/fetch/s--qnRgKQnZ--/f_auto/q_auto/c_scale,w_550/https://www.knime.com/sites/default/files/public/2024-03/echarts-blog-header.jpg)

1. **Bar Chart** 📊  Used to compare categories or groups using rectangular bars.  
Example: Comparing sales revenue across different product categories.
1. **Line Chart** 📈  Shows trends over time by connecting data points with a continuous line.  
Example: Tracking stock prices or temperature changes.
1. **Pie Chart** 🥧  Represents proportions in a circular format, dividing sections based on percentage values.  
Example: Market share distribution among competitors.  
1. **Scatter Plot** ⚫  Displays relationships between two continuous variables, highlighting correlations.  
Example: Examining height vs. weight of individuals.
1. **Histogram** 📉  Used to show the distribution of data, grouping values into bins.  
Example: Examining student test score frequency.
1. **Heatmap** 🔥  Represents density or intensity using colors in a grid format.  
Example: Website click heatmaps or correlation matrices.
1. **Box Plot** (Box-and-Whisker Plot) 📦  Summarizes data distribution through quartiles and identifies outliers.  
Example: Analyzing salary distributions across industries.

Each chart serves a unique purpose depending on the dataset and the insights needed.

## When to use which type of chart
There are so many ways to visualise data – how do we know which one to pick?

The Financial Times compiled the [Visual Vocabulary](https://ft-interactive.github.io/visual-vocabulary/) to help you understand the different types of charts and when to use them.

![Visual Vocabulary](./images/ft-visual-vocabulary.png)

# Matplotlib Introduction
Matplotlib is a low-level graph plotting library for Python.

## Import `matplotlib`

Most of the features that we will use belong to the `pyplot` submodule.

The submodule is usually imported under the alias `plt`.

```
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
```

`%matplotlib inline` is a magic command used in Jupyter Notebook to display Matplotlib plots directly within the notebook instead of opening a separate window.

You can ask an AI chatbot to explain it:
```
what is python %matplotlib inline
```

If your chart does not show up, the notebook's plotting backend is the likely cause. Try running `%matplotlib inline` again. If it still doesn't work, add `plt.show()` as the last line of each cell that generates a chart.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## Simple Line Plot

**Syntax**:

Use the `plot()` function to draw points.

By default, the `plot()` function draws a **line** from point to point.

It takes two arguments that together define the points:

* `x` is an array containing the x-coordinates of the points.

* `y` is an array containing the y-coordinates of the points.

```
plt.plot(x, y)
```

* `x`: array-like values (for example a list, a NumPy array or a pandas `Series`) for the x-axis
* `y`: array-like values of the same length for the y-axis

**`plot()` function generates line plot by default**:
```
n = np.arange(1,11) # a range of number from 1 to 10

plt.plot(n, n*2)
```


In [None]:
n = np.arange(1,11) # a range of number from 1 to 10
plt.plot(n, n*2)

## Plot Without Line (Markers Only)

Pass a marker code as the third argument to draw markers only, with no connecting line.

```
plt.plot(n, n*2, 'o') # the third parameter 'o' is the marker shape
```

In [None]:
plt.plot(n, n*2, 'o') # the third parameter 'o' is the marker shape

## Other Marker Shapes

Other markers to consider:

- `'o'`	Circle	
- `'*'`	Star	
- `'.'`	Point	
- `','`	Pixel	
- `'x'`	X	
- `'X'`	X (filled)	
- `'+'`	Plus	
- `'P'`	Plus (filled)	
- `'s'`	Square	
- `'D'`	Diamond	
- `'d'`	Diamond (thin)	
- `'p'`	Pentagon	
- `'H'`	Hexagon (`hexagon2`)	
- `'h'`	Hexagon (`hexagon1`)	
- `'v'`	Triangle Down	
- `'^'`	Triangle Up	
- `'<'`	Triangle Left	
- `'>'`	Triangle Right	
- `'1'`	Tri Down	
- `'2'`	Tri Up	
- `'3'`	Tri Left	
- `'4'`	Tri Right	
- `'|'`	Vline	
- `'_'`	Hline

Examples:
```
plt.plot(n, n*2, 'D') # diamond marker
```

In [None]:
plt.plot(n, n*2, 'D') # diamond marker

## Plotting Both Line and Markers

Pass the keyword argument `marker` to `plot()` to keep both the line and the markers.

**Examples**:
```
n = np.arange(1,11) # a range of number from 1 to 10
plt.plot(n, n*2, marker='D') # diamond marker
```

**Specifying Marker Size**:
```
n = np.arange(1,11) # a range of number from 1 to 10
plt.plot(n, n*2, marker='o', ms=10) # ms: marker size
```

**Specifying Marker Fill Color**:
```
plt.plot(n, n*2, marker='o', mfc='#FF00FF') # mfc is marker fill color
```

In [None]:
plt.plot(n, n*2, marker='o', mfc='#FF00FF') # mfc is marker fill color

## Line Styles

Add the `ls` (or `linestyle`) parameter to specify the line style:

* `-`: solid line
* `:`: dotted line
* `--`: dashed line
* `-.`: dashdot line

Example:
```
plt.plot(n, n*2, ls=":") # ls: line style
```

Add the `color` (or `c`) parameter to specify the line color:
```
plt.plot(n, n*2, color="purple") # color: line color
```

In [None]:
plt.plot(n, n*2, color="purple") # color: line color
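
Marker, line style and color can also be combined in the third positional argument, which Matplotlib calls a format string. The block below is a minimal sketch of that shorthand. The `matplotlib.use("Agg")` line is only there so that the block also runs outside a notebook, and can be omitted in Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

n = np.arange(1, 11)  # integers from 1 to 10

# 'D:m' = diamond marker, dotted line, magenta
(line,) = plt.plot(n, n * 2, "D:m")

# plot() returns a list of Line2D objects; the settings can be read back from them
print(line.get_marker(), line.get_linestyle(), line.get_color())
```

The keyword form (`marker=`, `ls=`, `color=`) is longer but easier to read, and it also accepts color names and hex codes that the format string does not.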

## Help with `plot()` function

Type
```
?plt.plot
```

or

```
help(plt.plot)
```

In [None]:
?plt.plot

## Plot with Real Data

**Import the `graduates.csv` data file**
```
gra = pd.read_csv('./data/graduates.csv') 
```

In [None]:
gra = pd.read_csv('./data/graduates.csv')

In [None]:
gra

**Use the `unique()` function to list the distinct values of a column**
```
gra['AcademicYear'].unique()
gra['LevelOfStudy'].unique()
gra['ProgrammeCategory'].unique()
gra['Sex'].unique()
```

In [None]:
gra['AcademicYear'].unique()

In [None]:
gra['LevelOfStudy'].unique()

In [None]:
gra['ProgrammeCategory'].unique()

In [None]:
gra['Sex'].unique()

**Select the subset of male undergraduate students in 'Business and Management'**
```
ug_bm_m = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management') 
                    & (gra['Sex']=='Male')]
ug_bm_m.head() # preview the first rows
ug_bm_m.info() # show DataFrame information
```

In [None]:
ug_bm_m = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management') 
                    & (gra['Sex']=='Male')]

In [None]:
ug_bm_m.head() # preview the first rows

In [None]:
ug_bm_m.info() # show dataframe information

**Extract the required columns for the x-axis and y-axis**

Extract the `AcademicYear` column as the pandas `Series` `year`:
```
year = ug_bm_m['AcademicYear']
```

Extract the `Headcount` column as the pandas `Series` `headcount`:
```
headcount = ug_bm_m['Headcount'] 
```

Plot `year` on the x-axis and `headcount` on the y-axis:
```
plt.plot(year, headcount)
```

In [None]:
year = ug_bm_m['AcademicYear']

In [None]:
headcount = ug_bm_m['Headcount'] 

In [None]:
plt.plot(year, headcount)

**Plot without providing x values**:

You can plot without providing x values. The index of the `Series` (here, the row labels carried over from `gra`) is then used for the x-axis.
```
plt.plot(headcount) 
```

In [None]:
plt.plot(headcount)
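
To see which x values Matplotlib picks when only one `Series` is passed, you can inspect the returned line. The sketch below uses a small made-up `Series` whose row labels are deliberately non-consecutive, standing in for a filtered DataFrame column. The `Agg` backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for a filtered column: note the non-consecutive row labels
headcount = pd.Series([120, 135, 150], index=[3, 7, 11], name="Headcount")

(line,) = plt.plot(headcount)    # no x given
print(list(line.get_xdata()))    # x values taken from the Series index

# reset_index(drop=True) renumbers the rows 0, 1, 2, ... before plotting
(line2,) = plt.plot(headcount.reset_index(drop=True))
print(list(line2.get_xdata()))
```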

## Rotating the x-ticks

Call the `xticks()` function to rotate the x-axis tick labels.  

Pass the angle in degrees as the `rotation` parameter.

Example:
```
plt.xticks(rotation=90)
plt.plot(year, headcount)
```

In [None]:
plt.xticks(rotation=90)
plt.plot(year, headcount)

## Setting Title

**Use `title()` function to add plot title**

Example:
```
plt.title('Undergraduate Students \n Business and Management')

plt.plot(year, headcount)
```

You can specify the title location with the `loc` parameter
```
plt.title('Undergraduate Students \n Business and Management', loc="right")
plt.title('Undergraduate Students \n Business and Management', loc="left")

```

You can specify the title font size with the `fontsize` parameter
```
plt.title('Undergraduate Students \n Business and Management', fontsize=22) 
```

In [None]:
plt.title('Undergraduate Students \n Business and Management') 
plt.plot(year, headcount)

## Setting Axis Labels

**Use `xlabel()` and `ylabel()` to add axis labels**

Example:
```
plt.xlabel('Academic Year')
plt.xticks(rotation=60)
plt.ylabel('Student Headcount', fontsize=14)
plt.plot(year, headcount)
```

In [None]:
plt.xlabel('Academic Year')
plt.ylabel('Student Headcount', fontsize=14)
plt.xticks(rotation=60)
plt.plot(year, headcount)

## Plotting Multiple Lines

You can plot multiple lines by calling `plot()` once per line.

**Extracting Female undergraduate in 'Business and Management' for another plot**
```
ug_bm_f = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management') 
                    & (gra['Sex']=='Female')]
```

**Plotting multiple lines**
```
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'])
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'])
```

In [None]:
ug_bm_f = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management') 
                    & (gra['Sex']=='Female')]
ug_bm_f.head()

In [None]:
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'])
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'])

## Add Labels to Line and Show Legend


**Adding Labels and Color to Plots**

In the plot above, you can't tell which line is which.

Let's add a label and a color to differentiate the lines:
```
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")

plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")

plt.legend() # call this function to show legend
```

You can change the location of the legend if it blocks your lines:
```
plt.legend(loc="center right")
```

In [None]:
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")
plt.legend()
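
The title, axis label, tick and legend calls from the last few sections can be combined in one cell. The sketch below does so with made-up headcounts (not taken from `graduates.csv`), so it runs without the data file. The `Agg` backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# Made-up numbers for illustration only
years = ["2020/21", "2021/22", "2022/23", "2023/24"]
female = [510, 530, 560, 580]
male = [420, 415, 440, 455]

plt.figure()
plt.plot(years, female, label="Female", c="tomato", marker="o")
plt.plot(years, male, label="Male", marker="o")

plt.title("Undergraduate Students \n Business and Management")
plt.xlabel("Academic Year")
plt.ylabel("Student Headcount")
plt.xticks(rotation=60)
plt.legend(loc="center right")

ax = plt.gca()  # current Axes, used here only to inspect the result
print([text.get_text() for text in ax.get_legend().get_texts()])
```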

## Configuring Grid Lines

**Call `grid()` function to configure grid line**

Example:
```
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")

plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")

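# Use ONE of the calls below: grid() without styling arguments toggles
# the grid lines, so repeating it switches them off again
plt.grid() # shows both x and y axis grid lines

plt.grid(axis='y') # shows y-axis grid lines only

plt.grid(axis='x') # shows x-axis grid lines only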

```

**Configuring `color` and `linewidth`**
```
plt.grid(c="purple", linewidth=0.1)
```


**Check help for more on configuring grid lines**
```
?plt.grid
help(plt.grid)
```


In [None]:
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")
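# grid() without styling arguments toggles the grid lines, so keep one call active at a time
plt.grid() # shows both x and y axis grid lines
# plt.grid(axis='y') # alternative: y-axis grid lines only
# plt.grid(axis='x') # alternative: x-axis grid lines only
# plt.grid(c="purple", linewidth=0.1) # alternative: thin purple grid lines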

## Subplots

To draw multiple plots in one figure, use the `subplot()` function to specify the number of rows, the number of columns and the position of the current plot.

**Example 1: 1 Row x 2 Columns**
```
plt.subplot(1,2,1) # the first two number are row and column while the last number is position
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")
plt.title("Female")

plt.subplot(1,2,2)
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")
plt.title("Male")
```

In [None]:
plt.subplot(1,2,1) # the first two number are row and column while the last number is position
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")
plt.title("Female")

plt.subplot(1,2,2)
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")
plt.title("Male")


**Example 2: Two Rows x 1 Column**
```
plt.subplot(2,1,1)
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")
plt.title("Female")

plt.subplot(2,1,2)
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")
plt.title("Male")
```

In [None]:
plt.subplot(2,1,1)
plt.plot(ug_bm_f['AcademicYear'], ug_bm_f['Headcount'], label = "Female", c="tomato")
plt.title("Female")

plt.subplot(2,1,2)
plt.plot(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'], label = "Male")
plt.title("Male")
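
Side-by-side subplots often end up cramped at the default figure size. One possible remedy, sketched below with made-up data, is to set the figure size first with `plt.figure(figsize=...)` and to call `plt.tight_layout()` at the end. Neither call appears in the examples above, so treat this as an optional refinement. The `Agg` backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

years = ["2021/22", "2022/23", "2023/24"]   # made-up sample data
female = [530, 560, 580]
male = [415, 440, 455]

fig = plt.figure(figsize=(10, 4))   # width and height in inches

plt.subplot(1, 2, 1)                # 1 row, 2 columns, first position
plt.plot(years, female, c="tomato")
plt.title("Female")

plt.subplot(1, 2, 2)                # 1 row, 2 columns, second position
plt.plot(years, male)
plt.title("Male")

plt.tight_layout()                  # adjust spacing so titles and tick labels do not overlap
print(len(fig.axes))
```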

## Bar Plot

A bar plot can be either vertical (`bar()`) or horizontal (`barh()`).

**Example (vertical)**:

```
plt.bar(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'])
```

In [None]:
plt.bar(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'])

**Example (horizontal):**
```
plt.barh(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'])
```

In [None]:
plt.barh(ug_bm_m['AcademicYear'], ug_bm_m['Headcount'])

## Histograms

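A histogram is a graph showing frequency distributions. The example below draws 250 random heights from a normal distribution with mean 170 and standard deviation 10.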
```
heights = np.random.normal(170, 10, 250)
print(heights)

plt.hist(heights)
```



In [None]:
heights = np.random.normal(170, 10, 250)
print(heights)

plt.hist(heights)
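
`hist()` groups the values into 10 bins by default. The sketch below shows how the `bins` parameter changes that and what `hist()` returns. It uses a seeded random generator (an addition for reproducibility, not part of the example above). The `Agg` backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=42)                # fixed seed so the sample is reproducible
heights = rng.normal(loc=170, scale=10, size=250)   # mean 170, standard deviation 10, 250 values

# hist() returns the count per bin, the bin edges and the drawn bars
counts, edges, patches = plt.hist(heights, bins=20, edgecolor="white")

print(int(counts.sum()))   # every value falls into exactly one bin
print(len(edges))          # there is one more edge than there are bins
```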

## Pie Chart

A pie chart shows how each category contributes to a whole.

**Use `unique()` function to show the distinct values of a column**
```
gra['LevelOfStudy'].unique()
gra['ProgrammeCategory'].unique()
```

In [None]:
gra['LevelOfStudy'].unique()

In [None]:
gra['ProgrammeCategory'].unique()

**Extracting Rows**
```
engr_m_2023 = gra[(gra['ProgrammeCategory']=="Engineering and Technology") &
             (gra['Sex']=='Male') &
             (gra['AcademicYear']=='2023/24')]

ug_m_2023 = gra[(gra['LevelOfStudy']=="Undergraduate") &
             (gra['Sex']=='Male') &
             (gra['AcademicYear']=='2023/24')]
```

In [None]:
engr_m_2023 = gra[(gra['ProgrammeCategory']=="Engineering and Technology") &
             (gra['Sex']=='Male') &
             (gra['AcademicYear']=='2023/24')]

ug_m_2023 = gra[(gra['LevelOfStudy']=="Undergraduate") &
             (gra['Sex']=='Male') &
             (gra['AcademicYear']=='2023/24')]

**Plotting Pie-chart with Labels**
```
plt.pie(engr_m_2023['Headcount'], labels=engr_m_2023['LevelOfStudy'])
plt.show()

plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'])
plt.show()
```

In [None]:
plt.pie(engr_m_2023['Headcount'], labels=engr_m_2023['LevelOfStudy'])

In [None]:
plt.pie(engr_m_2023['Headcount'], labels=engr_m_2023['LevelOfStudy'])
plt.show()

In [None]:
plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'])
plt.show()
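
A pie chart is easier to read when each wedge shows its percentage. The sketch below uses the `autopct` parameter of `plt.pie()` for this, with made-up headcounts and labels rather than values from `graduates.csv`. The `Agg` backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# Made-up headcounts and labels for illustration only
labels = ["Level A", "Level B", "Level C"]
headcount = [600, 300, 100]

plt.figure()
# autopct formats each wedge's share; '%1.1f%%' prints one decimal place and a percent sign
wedges, texts, autotexts = plt.pie(headcount, labels=labels, autopct="%1.1f%%")

print([t.get_text() for t in autotexts])
```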

## Saving charts for later use

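Use the `savefig()` function to export generated charts as external files so that you can use them in your PowerPoint presentations or Word reports. Call it before `plt.show()`, otherwise the saved file may be blank, and make sure the target folder (here `./charts`) already exists, because `savefig()` does not create it.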

```
plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'])
plt.savefig("./charts/pie.png")
plt.show()
```

In [None]:
plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'])
plt.savefig("./charts/pie.png")
plt.show()
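
If the target folder may be missing, it can be created from Python before saving. The sketch below writes to a temporary folder standing in for `./charts`, so it does not touch your project files. The `dpi` and `bbox_inches` arguments are optional extras not used above, and the pie values are made up. The `Agg` backend line is only needed outside Jupyter.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# A temporary folder stands in for ./charts in this sketch
charts_dir = os.path.join(tempfile.mkdtemp(), "charts")
os.makedirs(charts_dir, exist_ok=True)   # create the folder if it does not exist yet

plt.figure()
plt.pie([600, 300, 100], labels=["A", "B", "C"])   # made-up values

out_path = os.path.join(charts_dir, "pie.png")
# dpi sets the resolution; bbox_inches="tight" trims the surrounding white space
plt.savefig(out_path, dpi=150, bbox_inches="tight")

print(os.path.exists(out_path))
```

The file extension decides the output format, so `pie.pdf` or `pie.svg` would produce a vector graphic instead of a PNG image.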

## More on Matplotlib
[Matplotlib Tutorial](https://matplotlib.org/stable/tutorials/index.html)

# Using Pandas Built-in Plotting

The `plot()` method on `Series` and `DataFrame` is a simple wrapper around Matplotlib's `plt.plot()`.

The built-in plotting is handy for quick charts.

## Importing Packages

**Required Imports**
```
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```

**plot inline**:
```
%matplotlib inline
```

**reading students data file**
```
students = pd.read_excel('./data/students.xlsx') 
```

In [None]:
students = pd.read_excel('./data/students.xlsx') 
students

**simple plot**
```
students.plot()

students.plot.bar()

students.plot(x='AcademicYear', y='Undergraduate')
```

The built-in plots are good enough to get an early insight into your data.

When you need theming and styling for your plots, go for Seaborn.

In [None]:
students.plot()

In [None]:
students.plot.bar()

In [None]:
students.plot(x='AcademicYear', y='Undergraduate')
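
Because the pandas `plot()` method returns a Matplotlib `Axes` object, the result can be refined with Matplotlib calls afterwards. The sketch below uses a made-up stand-in for `students.xlsx`. Only the `AcademicYear` and `Undergraduate` columns appear in the text, and the `Postgraduate` column is invented for illustration. The `Agg` backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for students.xlsx (the Postgraduate column is hypothetical)
students = pd.DataFrame({
    "AcademicYear": ["2021/22", "2022/23", "2023/24"],
    "Undergraduate": [1500, 1550, 1620],
    "Postgraduate": [400, 430, 480],
})

# Passing a list to y draws one line per column
ax = students.plot(x="AcademicYear", y=["Undergraduate", "Postgraduate"], marker="o")

# ax is a Matplotlib Axes, so the usual Matplotlib methods apply
ax.set_ylabel("Headcount")
ax.set_title("Students by level")

print(len(ax.get_lines()))
```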

## Pandas Built-in Plot Types

Besides the default line plot, you can choose from the following plot types.

* `bar` or `barh` for bar plots
* `hist` for histogram
* `box` for boxplot
* `kde` or `density` for density plots
* `area` for area plots
* `scatter` for scatter plots
* `hexbin` for hexagonal bin plots
* `pie` for pie plots


Example:
```
students.plot(kind='bar')

students.plot(kind='kde')

students.plot(kind='box')
plt.xticks(rotation=90)

students.plot(x='AcademicYear', y='Undergraduate')
```

alternative syntax:
```
students.plot.bar()
students.plot.bar(stacked=True)
students.plot.kde()

```

In [None]:
students.plot(kind='bar')

In [None]:
students.plot(kind='kde')

In [None]:
students.plot(kind='box')
plt.xticks(rotation=90)

In [None]:
students.plot(x='AcademicYear', y='Undergraduate')

## More on pandas chart visualization

[Pandas Chart Visualization](https://pandas.pydata.org/docs/user_guide/visualization.html)

# Using Seaborn

* Seaborn is a library that uses Matplotlib underneath to plot graphs.
* Matplotlib functions are usually given arrays or individual columns, while Seaborn works directly with a pandas `DataFrame` and its column names. (Remember: a `DataFrame` is the most convenient form for data analysis.)
* Seaborn offers built-in themes and styles, which makes good-looking plots easier to produce.
* Seaborn does not replace Matplotlib; it complements it.  
* Seaborn and Matplotlib are usually used together.

## Required Imports
```
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
```

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

## Built-in Seaborn Data Sets

Use the following command to list the names of the built-in data sets (this requires an internet connection)
```
sns.get_dataset_names()
```

To load a built-in data set
```
iris = sns.load_dataset('iris')
```

Simple Scatter Plot
```
sns.scatterplot(data=iris, x='sepal_length', y='sepal_width')
sns.scatterplot(data=iris, x='petal_length', y='petal_width')
```

In [None]:
sns.get_dataset_names()

In [None]:
iris = sns.load_dataset('iris')

In [None]:
sns.scatterplot(data=iris, x='sepal_length', y='sepal_width')

In [None]:
sns.scatterplot(data=iris, x='petal_length', y='petal_width')

## Loading Data

**Reading CSV File**
```
gra = pd.read_csv('./data/graduates.csv') 
```

**Extracting Rows**
```
ug_bm_m = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management') 
                    & (gra['Sex']=='Male')]
```

**Extracting the column required for plotting**
```
year = ug_bm_m['AcademicYear']
headcount = ug_bm_m['Headcount']
```

In [None]:
gra = pd.read_csv('./data/graduates.csv') 

In [None]:
ug_bm_m = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management') 
                    & (gra['Sex']=='Male')]

In [None]:
year = ug_bm_m['AcademicYear']
year

In [None]:
headcount = ug_bm_m['Headcount']
headcount

## Matplotlib Style

Before switching to Seaborn, let's generate a chart in the default Matplotlib style so that you can compare the Matplotlib and Seaborn charts side by side.
```
plt.plot(year, headcount)
```

In [None]:
plt.plot(year, headcount)

## Set to Seaborn Style

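Call the `set()` function to activate the Seaborn style. In recent Seaborn versions, `set()` is an alias of `set_theme()`, which is the preferred name.

Activate the Seaborn style: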
```
sns.set() 
```

Use Matplotlib's `plot()` function for plotting:
```
plt.plot(year, headcount) 
```

In [None]:
sns.set() 
plt.plot(year, headcount) 

## Seaborn and Matplotlib are Usually Used Together

In the example below, we call seaborn's `set()` to activate Seaborn style. 

Then we use Matplotlib's `xticks()` to set the x-tick rotation angle and call its `plot()` function to generate the plot.

**Example:**
```
plt.xticks(rotation=90) # calling Matplotlib xticks rotation function
plt.plot(year, headcount)

```

In [None]:
sns.set() 
plt.xticks(rotation=90) # calling Matplotlib xticks rotation function
plt.plot(year, headcount)

## Style
Seaborn splits the Matplotlib parameters into two groups:

* Plot style (controlled with `set_style()`)
* Plot scale (controlled with `set_context()`)


Use Seaborn's `set_style()` function to change the style.

The built-in style themes are:
* `darkgrid`
* `whitegrid`
* `dark`
* `white`
* `ticks`

**Example**:
```
sns.set_style('white')
plt.plot(year, headcount)
```

**More on `set_style()`**:

[set_style()](https://seaborn.pydata.org/generated/seaborn.set_style.html)

In [None]:
sns.set_style('white')
plt.plot(year, headcount)

## Removing Axes Spines

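Call the `despine()` function after plotting to remove the top and right spines for a **cleaner** output. It works on the axes of the current figure, so calling it before anything has been drawn has no effect.

```
sns.set_style('white')
plt.plot(year, headcount)
sns.despine()
```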

In [None]:
sns.set_style('white')
plt.plot(year, headcount)
sns.despine() # call after plotting so that the axes already exist

## Customizing Axes Style

Call `sns.axes_style()` to show the current axes style.
```
sns.axes_style()
```

**To change the axes style**:
```
sns.set_style("darkgrid", {'axes.facecolor': 'yellow', 'grid.color': '.8'})
plt.plot(year, headcount)

sns.set_style("darkgrid", {'axes.facecolor': 'white', 'grid.color': '.8'})
plt.plot(year, headcount)

```

In [None]:
sns.axes_style()
sns.set_style("darkgrid", {'axes.facecolor': 'yellow', 'grid.color': '.8'})
plt.plot(year, headcount)

In [None]:
sns.set_style("darkgrid", {'axes.facecolor': 'white', 'grid.color': '.8'})
plt.plot(year, headcount)
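
`set_style()` changes the style for every plot that follows. When a style should apply to one chart only, `axes_style()` can be used as a context manager instead. The sketch below shows this with made-up data. The `lightyellow` face color is an arbitrary choice, and the `Agg` backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import seaborn as sns

years = ["2021/22", "2022/23", "2023/24"]   # made-up sample data
headcount = [415, 440, 455]

before = plt.rcParams["axes.facecolor"]

# The style, including the override, applies only inside the with-block
with sns.axes_style("darkgrid", {"axes.facecolor": "lightyellow"}):
    inside = plt.rcParams["axes.facecolor"]
    plt.figure()
    plt.plot(years, headcount)

after = plt.rcParams["axes.facecolor"]   # restored to the previous value
print(before, inside, after)
```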

## Default Color Palettes

**Use `color_palette()` to choose the colors of your plots**

```
current_palette = sns.color_palette()
sns.palplot(current_palette) # palplot() plots the array of colors horizontally
plt.show()
```

In [None]:
current_palette = sns.color_palette()
sns.palplot(current_palette) # palplot() plots the array of colors horizontally
plt.show()

## Ready to Use Palette

**Some built-in Seaborn color palettes**:
* `deep`
* `muted`
* `bright`
* `pastel`
* `dark`
* `colorblind`

**Show the palette**:
```
sns.palplot(sns.color_palette('pastel')) 
plt.show()
```

**Show more palettes**:
```
sns.color_palette("Set1")
sns.color_palette("Set2")
sns.color_palette("Set3")
```

In [None]:
sns.palplot(sns.color_palette('pastel')) 
plt.show()

## Reset Style

Call `reset_defaults()` to restore all Matplotlib settings to their defaults, which removes the Seaborn style.

```
sns.reset_defaults()
%matplotlib inline
```

In [None]:
sns.reset_defaults()
%matplotlib inline

## Line Plot

**Examples: undergraduate business management (male students only)**
```
ug_bm_m
sns.lineplot(x='AcademicYear', y='Headcount', data=ug_bm_m)
```

In [None]:
ug_bm_m
sns.lineplot(x='AcademicYear', y='Headcount', data=ug_bm_m)

**Examples: undergraduate business management**
```
ug_bm = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management')]
ug_bm


plt.xticks(rotation=90)
sns.lineplot(x='AcademicYear', 
             y='Headcount', 
             data=ug_bm, 
             hue='Sex', 
             marker='o')
```

In [None]:
ug_bm = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['ProgrammeCategory']=='Business and Management')]
ug_bm

In [None]:
plt.xticks(rotation=90)
sns.lineplot(x='AcademicYear', 
             y='Headcount', 
             data=ug_bm, 
             hue='Sex', 
             marker='o')

**Examples: undergraduate female student**
```
ug_f = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['Sex']=='Female')]
ug_f

plt.xticks(rotation=90)
sns.lineplot(x='AcademicYear', y='Headcount', data=ug_f, hue='ProgrammeCategory')
plt.legend(loc='upper left')
```

In [None]:
ug_f = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['Sex']=='Female')]
ug_f

In [None]:
plt.xticks(rotation=90)
sns.lineplot(x='AcademicYear', y='Headcount', data=ug_f, hue='ProgrammeCategory')
plt.legend(loc='upper left')

## Scatter Plot

```
plt.xticks(rotation=90)
sns.scatterplot(x='AcademicYear', 
                y='Headcount', 
                data=ug_bm, 
                hue='Sex')
```

In [None]:
plt.xticks(rotation=90)
sns.scatterplot(x='AcademicYear', 
                y='Headcount', 
                data=ug_bm, 
                hue='Sex')

## Bar Plot

```
sns.color_palette("Set1")
```

```
sns.barplot(x='AcademicYear', 
            y='Headcount', 
            data=ug_bm_m, 
            hue='Headcount',
            palette=sns.color_palette("Set1")
           )
```


In [None]:
sns.color_palette("Set1")

In [None]:
sns.barplot(x='AcademicYear', 
            y='Headcount', 
            data=ug_bm_m, 
            hue='Headcount',
            palette=sns.color_palette("Set1")
           )

## Pie Chart

```
ug_m_2023 = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['Sex']=='Male')
                    & (gra['AcademicYear']=='2023/24')]
ug_m_2023

colors = sns.color_palette('pastel')
plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'], colors=colors)
plt.show()

colors = sns.color_palette('dark')
plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'], colors=colors)
plt.show()
```

In [None]:
ug_m_2023 = gra[(gra['LevelOfStudy']=='Undergraduate') 
                    & (gra['Sex']=='Male')
                    & (gra['AcademicYear']=='2023/24')]
ug_m_2023

In [None]:
colors = sns.color_palette('pastel')
plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'], colors=colors)
plt.show()

In [None]:
colors = sns.color_palette('dark')
plt.pie(ug_m_2023['Headcount'], labels=ug_m_2023['ProgrammeCategory'], colors=colors)
plt.show()

## Boxplots

```
ug_f

sns.catplot(x="ProgrammeCategory", y="Headcount", kind="box", data=ug_f)
plt.xticks(rotation=90)

ug = gra[gra['LevelOfStudy']=='Undergraduate']
ug

sns.catplot(x="ProgrammeCategory", y="Headcount", hue='Sex', kind="box", data=ug)
plt.xticks(rotation=90)
```

In [None]:
ug_f

In [None]:
sns.catplot(x="ProgrammeCategory", y="Headcount", kind="box", data=ug_f)
plt.xticks(rotation=90)

In [None]:
ug = gra[gra['LevelOfStudy']=='Undergraduate']
ug

In [None]:
sns.catplot(x="ProgrammeCategory", y="Headcount", hue='Sex', kind="box", data=ug)
plt.xticks(rotation=90)

## Build Your Own Palette

**Use the `color_palette()` function to build your own palette**

```
sns.color_palette(n_colors=4) # specify number of colors to use
sns.palplot(sns.color_palette("Reds")) # specify a base color. Don't forget the ending 's'
sns.color_palette("light:purple") # light theme for a chosen color
sns.color_palette("light:#5A9") # light theme for a chosen color
sns.color_palette("dark:#f00") # dark theme for a chosen color
sns.color_palette("blend:#f00,#00F") # blending from one color to another color
```

**To reset style to default**
```
sns.reset_defaults()

```

**More on `color_palette()`**:

[seaborn.color_palette](https://seaborn.pydata.org/generated/seaborn.color_palette.html#seaborn.color_palette)

### specify `n_colors`

```
sns.color_palette(n_colors=4)
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette(n_colors=4))
plt.show()
```

In [None]:
sns.color_palette(n_colors=4)
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette(n_colors=4))
plt.show()

### specify fading color series

```
sns.color_palette("Greens")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("Greens"))
plt.show()
```

In [None]:
sns.color_palette("Greens")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("Greens"))
plt.show()

### specify light color series using color name

```
sns.color_palette("light:purple")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("light:purple"))
plt.show()
```

In [None]:
sns.color_palette("light:purple")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("light:purple"))
plt.show()

### specify light color series using color code

```
sns.color_palette("light:#5A9")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("light:#5A9"))
plt.show()
```

In [None]:
sns.color_palette("light:#5A9")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("light:#5A9"))
plt.show()

### specify dark color series

```
sns.color_palette("dark:#f00")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("dark:#f00"))
plt.show()
```

In [None]:
sns.color_palette("dark:#f00")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("dark:#f00"))
plt.show()

### specify blend color series

```
sns.color_palette("blend:#f00,#00F")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("blend:#f00,#00F"))
plt.show()
```

In [None]:
sns.color_palette("blend:#f00,#00F")
plt.pie(ug_m_2023['Headcount'], 
        labels=ug_m_2023['ProgrammeCategory'], 
        colors=sns.color_palette("blend:#f00,#00F"))
plt.show()

## Grouping and Groups' Aggregation

```
grouped_by_level = gra.groupby(['LevelOfStudy', 'AcademicYear'])

type(grouped_by_level)

grouped_by_level.agg("sum")

sns.set_style('whitegrid') # set the style first so it applies to this plot
plt.xticks(rotation=90)
sns.lineplot(x='AcademicYear', y='Headcount', 
             data=grouped_by_level.agg("sum"),
             hue='LevelOfStudy')
```


In [None]:
grouped_by_level = gra.groupby(['LevelOfStudy', 'AcademicYear'])

In [None]:
type(grouped_by_level)

In [None]:
grouped_by_level.agg("sum")
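The following is a minimal sketch using a made-up DataFrame (the `sample` frame and its numbers are invented for illustration and are not the `gra` dataset). It suggests how aggregating over two group keys yields a two-level index, and how `reset_index()` turns the keys back into ordinary columns, which can be easier to inspect or pass to a plotting function.

```python
import pandas as pd

# Hypothetical data for illustration only; not the real `gra` dataset.
sample = pd.DataFrame({
    "LevelOfStudy": ["UG", "UG", "UG", "PG", "PG"],
    "AcademicYear": ["2021/22", "2021/22", "2022/23", "2021/22", "2022/23"],
    "Headcount": [10, 20, 35, 5, 8],
})

# Group by two keys and sum only the numeric column of interest.
totals = sample.groupby(["LevelOfStudy", "AcademicYear"])["Headcount"].sum()

# The group keys now form a two-level index on the result.
print(totals.index.names)

# reset_index() restores the group keys as regular columns.
totals_flat = totals.reset_index()
print(totals_flat)
```

Selecting `["Headcount"]` before summing is one way to keep non-numeric columns out of the aggregation.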

In [None]:
sns.set_style('whitegrid') # set the style first so it applies to this plot
plt.xticks(rotation=90)
sns.lineplot(x='AcademicYear', y='Headcount', 
             data=grouped_by_level.agg("sum"),
             hue='LevelOfStudy')

## Seaborn Example Gallery

See the page below for the seaborn example gallery.

[Seaborn Example Gallery](https://seaborn.pydata.org/examples/index.html)

# JSON Live Data Loading & Visualization



## HTTP `requests` 

`requests` is a Python package for fetching data with HTTP requests.

**To import**

```
import requests
```

In [None]:
import requests

**Provide a web page url (link) that you want to fetch data from**
```
hk_weather_api = 'https://data.weather.gov.hk/weatherAPI/opendata/weather.php?dataType=fnd&lang=tc'
hk_weather_api
```

In [None]:
hk_weather_api = 'https://data.weather.gov.hk/weatherAPI/opendata/weather.php?dataType=fnd&lang=tc'
hk_weather_api

**Issue an HTTP GET request and save the server's response in the variable `weather_response`**
```
weather_response = requests.get(hk_weather_api)
```

## Starting HTTP request

In [None]:
weather_response = requests.get(hk_weather_api)

In [None]:
weather_response
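Network calls can hang or fail, so a slightly more defensive variant of the request may be useful. The sketch below is one possible approach, not the only one: `fetch_json` is a hypothetical helper name, and the 10-second timeout is an arbitrary choice.

```python
import requests

def fetch_json(url, timeout=10):
    """Fetch `url` and return the decoded JSON body (hypothetical helper)."""
    # Give up if the server does not respond within `timeout` seconds.
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx status codes instead of continuing silently.
    response.raise_for_status()
    # Parse the response body as JSON.
    return response.json()
```

With this helper, `fetch_json(hk_weather_api)` would return the parsed JSON directly.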

**Exploring the response from server**

* `weather_response.status_code` # returns 200 if the request succeeded
* `weather_response.headers` # descriptive headers about the server's response
* `weather_response.headers['content-type']` # returns 'application/json; charset=utf-8'
* `weather_response.encoding` # returns 'utf-8'
* `weather_response.text` # the server's response as plain text (type `str`)
* `weather_response.json()` # the server's response parsed from JSON (type `dict`)

In [None]:
weather_response.status_code # returns 200 if everything has gone fine

In [None]:
weather_response.headers # descriptive headers about server's response

In [None]:
weather_response.headers['content-type'] # returns 'application/json; charset=utf-8'

In [None]:
weather_response.encoding # returns 'utf-8'

In [None]:
weather_response.text # retrieve server's response in plain text format. type of str

In [None]:
weather_response.json() # retrieve server's response in json format, type of dict

## Retrieving Data From JSON Object (Dictionary)

**How to retrieve JSON child element**
```
weather_9_days = weather_response.json()
type(weather_9_days) # JSON data is stored as a Python dictionary
weather_9_days['generalSituation']
weather_9_days['weatherForecast']
weather_9_days['weatherForecast'][0]
weather_9_days['weatherForecast'][0]['week']
weather_9_days['weatherForecast'][0]['forecastWeather']
```

In [None]:
weather_9_days = weather_response.json()

In [None]:
weather_9_days

In [None]:
type(weather_9_days) # JSON data is stored as a Python dictionary

In [None]:
weather_9_days['generalSituation']

In [None]:
weather_9_days['weatherForecast'][0]

In [None]:
weather_9_days['weatherForecast'][0]['week']

In [None]:
weather_9_days['weatherForecast'][0]['forecastWeather']
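To practice the same navigation without calling the API, the sketch below parses a small invented JSON string. The key names mirror those used above, while the values and the `unit` key are assumptions made for illustration and are not real forecast data.

```python
import json

# Invented sample that mimics the shape of the forecast response; not real data.
sample_text = """
{
  "generalSituation": "sample text",
  "weatherForecast": [
    {"forecastDate": "20250101", "week": "Wednesday",
     "forecastMaxtemp": {"value": 20, "unit": "C"}},
    {"forecastDate": "20250102", "week": "Thursday",
     "forecastMaxtemp": {"value": 22, "unit": "C"}}
  ]
}
"""

sample = json.loads(sample_text)            # JSON object -> dict
first_day = sample["weatherForecast"][0]    # JSON array -> list, so index with [0]
print(first_day["week"])
print(first_day["forecastMaxtemp"]["value"])

# .get() returns a default instead of raising KeyError when a key is missing.
print(first_day.get("forecastMintemp", "not provided"))

# Collect one field from every element of the list.
dates = [day["forecastDate"] for day in sample["weatherForecast"]]
print(dates)
```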

## Plotting Weather Forecast

**Retrieving weatherForecast element**
```
forecast = weather_9_days['weatherForecast']
forecast
type(forecast)
```

In [None]:
forecast = weather_9_days['weatherForecast']
forecast
type(forecast)

In [None]:
forecast

## Normalizing JSON dictionary to `DataFrame`

**Retrieve the `weatherForecast` element and convert it to a DataFrame with flattened columns**
```
forecast_normalized = pd.json_normalize(forecast, sep='_')
type(forecast_normalized)
forecast_normalized.info()
sns.lineplot(x='forecastDate', y='forecastMaxtemp_value', data=forecast_normalized)
```

In [None]:
forecast_normalized = pd.json_normalize(forecast, sep='_')
forecast_normalized

In [None]:
type(forecast_normalized)
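The sketch below tries to show where a column name such as `forecastMaxtemp_value` comes from. It uses two invented records (the values and the `unit` key are assumptions, not real forecast data): each nested dictionary is flattened into columns named with the parent key, the separator, and the child key.

```python
import pandas as pd

# Invented records shaped like the forecast entries; not real data.
sample_forecast = [
    {"forecastDate": "20250101", "forecastMaxtemp": {"value": 20, "unit": "C"}},
    {"forecastDate": "20250102", "forecastMaxtemp": {"value": 22, "unit": "C"}},
]

# sep='_' joins the parent and child keys with an underscore.
flat = pd.json_normalize(sample_forecast, sep="_")
print(flat.columns.tolist())
print(flat)
```

Without `sep='_'`, the default separator is a dot, which gives names such as `forecastMaxtemp.value`.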

## Plotting

In [None]:
sns.lineplot(x='forecastDate', y='forecastMaxtemp_value', data=forecast_normalized)

**Multiple Plots**
```
sns.set_style('darkgrid')
plt.xticks(rotation=90)
sns.lineplot(x='forecastDate', 
             y='forecastMaxtemp_value', 
             data=forecast_normalized, 
             label='Max Temp', 
             marker='o')

sns.lineplot(x='forecastDate', 
             y='forecastMintemp_value', 
             data=forecast_normalized, 
             label='Min Temp', 
             marker='o')
```

In [None]:
sns.set_style('darkgrid')
plt.xticks(rotation=90)
sns.lineplot(x='forecastDate', 
             y='forecastMaxtemp_value', 
             data=forecast_normalized, 
             label='Max Temp', 
             marker='o')

sns.lineplot(x='forecastDate', 
             y='forecastMintemp_value', 
             data=forecast_normalized, 
             label='Min Temp', 
             marker='o')

# EXERCISE: Interbank Liquidity

**The URL for retrieving the daily figures of interbank liquidity**
```
inter_bank_liq_url = "https://api.hkma.gov.hk/public/market-data-and-statistics/daily-monetary-statistics/daily-figures-interbank-liquidity"
inter_bank_liq_url
```

## Constructing the API URL

In [None]:
inter_bank_liq_url = "https://api.hkma.gov.hk/public/market-data-and-statistics/daily-monetary-statistics/daily-figures-interbank-liquidity"
inter_bank_liq_url

**HTTP Request**
```
inter_liq_response = requests.get(inter_bank_liq_url)
type(inter_liq_response) # returns requests.models.Response
inter_liq_json = inter_liq_response.json()
type(inter_liq_json) # returns dict
type(inter_liq_json["result"]["records"]) # returns list
```

In [None]:
inter_liq_response = requests.get(inter_bank_liq_url)

In [None]:
type(inter_liq_response) # returns requests.models.Response

## Converting the HTTP response to a JSON object

In [None]:
inter_liq_json = inter_liq_response.json()

In [None]:
type(inter_liq_json) # returns dict

In [None]:
len(inter_liq_json)

In [None]:
inter_liq_json

In [None]:
inter_liq_json["result"]["records"]

In [None]:
type(inter_liq_json["result"]["records"])

## Converting JSON to Pandas `DataFrame`

```
inter_bank_liq_df = pd.DataFrame(inter_liq_json["result"]["records"])
```

In [None]:
inter_bank_liq_df = pd.DataFrame(inter_liq_json["result"]["records"])

In [None]:
type(inter_bank_liq_df)

In [None]:
inter_bank_liq_df.head(10)

In [None]:
inter_bank_liq_df[2:5]

**Let's try converting `inter_liq_json` directly to a DataFrame and see what happens**
```
pd.DataFrame(inter_liq_json)
pd.json_normalize(inter_liq_json)
```

The results are NOT quite what we want.

In [None]:
pd.DataFrame(inter_liq_json)

In [None]:
pd.json_normalize(inter_liq_json)
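The sketch below suggests why the whole response does not convert cleanly. It uses an invented wrapper dictionary: the `result` and `records` keys follow the ones used above, while `header`, `datasize`, and all values are assumptions for illustration. Passing the wrapper turns its top-level keys into columns, whereas passing only the list of records gives one row per record.

```python
import pandas as pd

# Invented response: a wrapper dict around a list of records; not real data.
sample_response = {
    "header": {"success": True},
    "result": {
        "datasize": 2,
        "records": [
            {"end_of_date": "2025-01-02", "hibor_overnight": 4.1},
            {"end_of_date": "2025-01-01", "hibor_overnight": 4.0},
        ],
    },
}

# Passing the whole wrapper: the top-level keys become the columns.
wrapper_df = pd.DataFrame(sample_response)
print(wrapper_df.columns.tolist())

# Passing only the list of records: one row per record, one column per field.
records_df = pd.DataFrame(sample_response["result"]["records"])
print(records_df)
```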

## Data Tidying
- Reverse the order: sort the rows by date in ascending order
- Reset the row index numbers

In [None]:
inter_bank_liq_df = inter_bank_liq_df.sort_values(by="end_of_date")
inter_bank_liq_df

In [None]:
inter_bank_liq_df = inter_bank_liq_df.reset_index(drop=True)
inter_bank_liq_df

In [None]:
inter_bank_liq_df.head()
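If the dates arrive as text, it may help to parse them before sorting and plotting. The sketch below is a minimal illustration on invented records: the column names follow the ones used above, while the values and the `YYYY-MM-DD` date format are assumptions.

```python
import pandas as pd

# Invented records; the values are not real data.
sample_df = pd.DataFrame([
    {"end_of_date": "2025-01-03", "hibor_overnight": 4.2},
    {"end_of_date": "2025-01-01", "hibor_overnight": 4.0},
    {"end_of_date": "2025-01-02", "hibor_overnight": 4.1},
])

# Parse the date strings so that sorting and plotting treat them as dates.
sample_df["end_of_date"] = pd.to_datetime(sample_df["end_of_date"])

# Sort oldest-first and renumber the rows from 0.
sample_df = sample_df.sort_values(by="end_of_date").reset_index(drop=True)
print(sample_df)
```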

## Plotting

```
inter_bank_liq_df.plot() # pandas built-in plot; it draws every numeric column, so it is not a good plot
```

In [None]:
inter_bank_liq_df.plot() # pandas built-in plot; it draws every numeric column, so it is not a good plot

In [None]:
inter_bank_liq_df.plot(y='hibor_fixing_1m') # pandas built-in plots

In [None]:
inter_bank_liq_df.info()

**hibor_fixing_1m & hibor_overnight Overlay**
```
plt.xticks(rotation=90)
plt.xticks([]) # removes xticks
plt.yticks([]) # removes yticks
sns.lineplot(x='end_of_date', y='hibor_fixing_1m', data=inter_bank_liq_df, label="Hibor One Month")
sns.lineplot(x='end_of_date', y='hibor_overnight', data=inter_bank_liq_df, label="Hibor Overnight")
```

In [None]:
plt.xticks(rotation=90)
plt.xticks([]) # removes xticks
plt.yticks([]) # removes yticks
sns.lineplot(x='end_of_date', y='hibor_fixing_1m', data=inter_bank_liq_df, label="Hibor One Month")
sns.lineplot(x='end_of_date', y='hibor_overnight', data=inter_bank_liq_df, label="Hibor Overnight")

# Introduction to Cloud Computing

## What is Cloud Computing?
Cloud computing is the delivery of computing services, including **servers**, **storage**, **databases**, **networking**, **software**, and **analytics**, over the internet instead of using physical hardware or local servers.

![](https://upload.wikimedia.org/wikipedia/commons/b/b5/Cloud_computing.svg)

**A more user-friendly definition**  
Computing resources (software or hardware) available for rent.

## Types of Cloud Computing
- **Infrastructure as a Service (IaaS)** – Provides virtual machines, storage, and networks (e.g., AWS EC2, Azure Virtual Machines).

- **Platform as a Service (PaaS)** – Offers development environments to build applications (e.g., Google App Engine, Azure App Services).

- **Software as a Service (SaaS)** – Delivers software over the internet (e.g., Gmail, Dropbox, Microsoft 365).

## Deployment Models

- **Public Cloud** – Shared infrastructure for multiple users (AWS, Azure, Google Cloud).

- **Private Cloud** – Dedicated resources for a single organization (on-premise or hosted).

- **Hybrid Cloud** – A mix of public and private clouds for flexibility.

## Major Cloud Providers

The major cloud providers dominating the market today are:

- **Amazon Web Services (AWS)** – The largest cloud provider, offering a vast range of services including computing, storage, AI, and machine learning.
- **Microsoft Azure** – A strong competitor to AWS, widely used for enterprise solutions, hybrid cloud, and AI-powered services.
- **Google Cloud Platform (GCP)** – Known for its data analytics, AI, and Kubernetes-based cloud solutions.
- **IBM Cloud** – Focuses on AI, blockchain, and enterprise cloud solutions.
- **Oracle Cloud** – Specializes in database and enterprise applications.
- **Alibaba Cloud** – A leading cloud provider in Asia, offering scalable cloud computing and AI services.

These providers offer Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) to businesses worldwide. 

## Accepting Invitation to Join AWS Academy
You will receive an invitation from Sunny to join AWS Academy, a free yet brilliant platform for university students to learn cloud skills. It is by invitation only, and the invitation email will come directly from AWS Academy.

If you don't see the invitation email, check your junk mail folder.

Follow the instructions in the invitation email to register and activate your AWS Academy account. You will have **USD 50** in free credits to try various cloud products on the AWS cloud. It is a lab environment that shuts itself down every 4 hours, but you can start it up again whenever you wish.

To find the AWS Academy login page, search for `aws academy student login` in Google.

Or you can directly go to this address [https://awsacademy.instructure.com/](https://awsacademy.instructure.com/)