**Digital Literacy and Computational Thinking, CUHK**

**Hands-on Python DEMO**

To try this in Jupyter Notebook:
* Click the menu item Cell --> Run All
* Click a Code cell to edit it
* Double-click a Text cell to edit its Markdown content, then click Run to confirm
* Press Ctrl-Enter to confirm the edit and run the current cell

To try this in Google Colab:
* Click the menu item Run All (Ctrl+F9)
* Click a Code cell to edit it
* Double-click a Text cell to edit its Markdown content, then press ESC to confirm
* Press Ctrl-Enter to confirm the edit and run the current cell


**Using online data from HK Fire Services Department**

**Ambulance Service Indicators**


In [1]:
# Using online data from HK Fire Services Department
# Ambulance Service Indicators

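# Read data from CSV using pandas
import io

import pandas as pd
import requests

url = 'https://www.chp.gov.hk/files/misc/enhanced_sur_covid_19_eng.csv'

# send browser-like headers, since some servers may refuse requests without them
header = {
  "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
  "X-Requested-With": "XMLHttpRequest"
}

r = requests.get(url, headers=header)
r.raise_for_status()  # stop early if the download failed

# read_csv expects a path or a file-like object, so wrap the downloaded text
case = pd.read_csv(io.StringIO(r.text))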


# show the first 3 rows in the data set
case.head(3)


ModuleNotFoundError: No module named 'requests'
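
The cell above needs network access and the `requests` package. As an offline sketch of the same loading step, the snippet below parses CSV text held in a string by wrapping it in `io.StringIO`, because `pandas.read_csv` expects a path or a file-like object rather than raw CSV text. The column names follow those used later in this notebook; the three rows are invented for illustration only and are not real Fire Services Department figures.

```python
import io

import pandas as pd

# Hypothetical sample: column names match this notebook, values are made up.
csv_text = """Ambulance Service Indicators,no. of emergency calls,calls per ambulance
01/2020,60000,150.5
02/2020,52000,130.0
03/2020,55000,137.5
"""

# read_csv needs a path or a file-like object, so wrap the text in StringIO
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(3))
print(sample.shape)
```

The same wrapping applies to text downloaded with `requests`, where the CSV content arrives as a string.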

In [None]:
# show the last 4 rows in the data set
case.tail(4)


**Statistical Summary**

In [None]:
case.describe()


**List Available "keys"/ Field names**

In [None]:
print(case.keys())


**Extract Some Fields/ Columns by keys**

In [None]:
# Caution: there are two pairs of square brackets!
print(case[
          ["Ambulance Service Indicators", "no. of emergency calls"]
      ])
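
The two pairs of brackets matter because the inner pair is a Python list of column names. A minimal sketch of the difference, using a made-up two-row table (the values are hypothetical):

```python
import pandas as pd

# Hypothetical two-row table; values are made up for illustration.
df = pd.DataFrame({
    "Ambulance Service Indicators": ["01/2020", "02/2020"],
    "no. of emergency calls": [60000, 52000],
})

one_column = df["no. of emergency calls"]     # single brackets -> Series
sub_table = df[["no. of emergency calls"]]    # double brackets -> DataFrame

print(type(one_column).__name__)
print(type(sub_table).__name__)
```

A single column name returns a one-dimensional Series, while a list of names (even a list with one name) returns a DataFrame.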


In [None]:
# another way of extracting fields for study:
fields = ["Ambulance Service Indicators", "no. of emergency calls"]
target = case[fields]

# show the first 5 records
print(target.head())

print()
print("...not showing some rows...")
print()

# show the last 3 records
print(target.tail(3))

**Find Individual Aggregated Values**

In [None]:
print("Mean calls per ambulance:", case["calls per ambulance"].mean())
print("Median calls per ambulance:", case["calls per ambulance"].median())
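
Mean and median can differ noticeably when a few values are unusually large or small. The sketch below uses five made-up values (not taken from the data set), one of which is an outlier, to show the effect:

```python
import pandas as pd

# Hypothetical monthly values; the last one is an unusually busy month.
calls = pd.Series([100.0, 102.0, 98.0, 101.0, 199.0])

print("Mean:", calls.mean())      # pulled upwards by the outlier
print("Median:", calls.median())  # middle value, barely affected
```

Here the mean is 120.0 while the median is 101.0, so comparing the two gives a quick hint about skew in a column.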


---

**Plot a Graph**

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(15, 6))

plt.plot(case["Ambulance Service Indicators"],
         case["no. of emergency calls"])

plt.ylabel("no. of emergency calls")
plt.xlabel("Month/Year")

plt.ylim( bottom=0 )   # avoid BROKEN y-axis

axes = plt.gca() # get current axes
plt.setp(axes.get_xticklabels(), rotation=40, horizontalalignment='right', fontsize='medium')
plt.grid()
plt.show()

---

**Find Trend by Curve Fitting**

In [None]:
from numpy import arange
from scipy.optimize import curve_fit

# define an objective function
# this example is a cubic polynomial
# x is a numpy.ndarray, not a scalar; so these are numpy array operators; possible to use numpy.sin(), etc.
def objective_function(x, a, b, c, d):
    return a * x**3 + b * x**2 + c * x + d

y = case["no. of emergency calls"]
print("Number of data points:", y.size)

x = arange(start=0, stop=y.size, step=1)

# documentation shows usage as: popt, _ = curve_fit(objective_function, x, y)
(Parameter_OPTimal_values, Parameter_estimated_COVariance_table) = curve_fit(objective_function, x, y)
(a, b, c, d) = Parameter_OPTimal_values

print('y = %.3fx^3%+.3fx^2%+.3fx%+.3f' % (a, b, c, d))
# define a sequence of inputs from the smallest to the largest known input (inclusive)
x_line = arange(min(x), max(x) + 1, 1)
# calculate the output for the range
y_line = objective_function(x_line, a, b, c, d)

print()
print("type(x) is", type(x))
print("type(y) is", type(y))
print("type(x_line) is", type(x_line))
print("type(y_line) is", type(y_line))
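
One way to sanity-check a polynomial fit is to fit data generated from known coefficients and see whether they are recovered. The sketch below does this with `numpy.polyfit`, which is a different routine from the `curve_fit` call used above but solves the same least-squares problem when the model is a polynomial. The coefficients and the names `x_demo` and `y_demo` are made up for illustration.

```python
import numpy as np

# Synthetic, noise-free data from a known cubic (coefficients are made up)
true_coeffs = [0.5, -2.0, 3.0, 10.0]   # a, b, c, d
x_demo = np.arange(0, 20)
y_demo = np.polyval(true_coeffs, x_demo)

# Least-squares fit of a degree-3 polynomial; the highest power comes first
fitted = np.polyfit(x_demo, y_demo, deg=3)

print("fitted a, b, c, d:", np.round(fitted, 3))
```

For models that are not polynomials (for example, ones using `numpy.sin()`), `curve_fit` remains the more general tool.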


**Find Derivative**

Find differences between consecutive data points (months).

In [None]:
from numpy import diff

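def compute_derivative(x, y):
    # Forward finite difference: diff(y)[i] = y[i+1] - y[i], so the result has one
    # element fewer than y and is paired with x[:-1]. Because x advances in steps
    # of 1 (one month), each difference equals the slope per month.
    # See https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.diff.html
    return (x[:-1], diff(y))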

(dx, dy) = compute_derivative(x, y)

print("type(dx) is", type(dx))
print("type(dy) is", type(dy))
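
To make the finite difference concrete, here is a small sketch on four made-up values. The names `x_demo` and `y_demo` are illustrative and separate from the notebook's `x` and `y`.

```python
import numpy as np

x_demo = np.arange(4)                 # 0, 1, 2, 3 (e.g. a month index)
y_demo = np.array([10, 13, 11, 18])   # made-up values

dy_demo = np.diff(y_demo)             # y[i+1] - y[i], one element shorter than y
dx_demo = x_demo[:-1]                 # pair each difference with its left-hand point

print(dx_demo)   # [0 1 2]
print(dy_demo)   # [ 3 -2  7]
```

A positive entry means the value rose from one point to the next, and a negative entry means it fell.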


**Plot a Combined Graph with Twin Axes**

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(15, 6))

# create a plot with two axes
axes1 = plt.gca() # get current axes
# Create a new Axes with an invisible x-axis and an independent y-axis positioned opposite to the original one (i.e. at right).
axes2 = axes1.twinx()


# show original data points
axes1.plot(case["Ambulance Service Indicators"],
           case["no. of emergency calls"])

# allow BROKEN y-axis to magnify details:
# keep the automatic limits, i.e. DO NOT call axes1.set_ylim(bottom=0)

axes1.set_ylabel("no. of emergency calls / Curve fitting")
axes1.set_xlabel("Month/Year")
plt.setp(axes1.get_xticklabels(), rotation=40, horizontalalignment='right', fontsize='medium')


# create a plot for curve fitting
axes1.plot(x_line, y_line, '--', color='red', label='Curve fitting')


# create a plot for derivative
axes2.plot(dx, dy, '-*', color='green', label='Derivative')
axes2.set_ylabel("Derivative", color='green')


# tidy up and show the plot
plt.grid()
plt.show()
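
The `label=` arguments passed to `plot()` above only become visible once a legend is drawn, and with twin axes each Axes keeps its own set of labels. A minimal sketch of one way to merge them into a single legend follows; the data points are made up, and the `Agg` backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt

fig, ax_left = plt.subplots(figsize=(6, 3))
ax_right = ax_left.twinx()  # second y-axis sharing the same x-axis

# made-up data, styled like the curve-fitting and derivative lines above
ax_left.plot([0, 1, 2, 3], [10, 13, 11, 18], '--', color='red', label='Curve fitting')
ax_right.plot([0, 1, 2], [3, -2, 7], '-*', color='green', label='Derivative')

# collect handles and labels from both axes and draw a single legend
handles_left, labels_left = ax_left.get_legend_handles_labels()
handles_right, labels_right = ax_right.get_legend_handles_labels()
legend_labels = labels_left + labels_right
ax_left.legend(handles_left + handles_right, legend_labels, loc='upper left')

print(legend_labels)
```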
