# Bridging - Coding with Jupyter Notebook


0. Markdowns, Headings, Comments
1. Built-in modules, third party packages 
2. Arrays, Dictionary and Data frames 

You may go to `View` -> `Table of Contents` to see the flow of content clearly. 

## 1. Built-in Python Modules and Third-party Packages

### 1.1 Built-in modules 

Built-in Python modules can be imported without installing anything. For example, here we are going to use the `math` module. 

- Browse all standard-library modules distributed with Python 3.12: https://docs.python.org/3.12/library/index.html

In [None]:
import math 
# Documentation: https://docs.python.org/3.12/library/math.html
# Some functions/constants in the math module: ceil, floor, sqrt, pi, log10.

print(math.ceil(1.44))
print(math.floor(10.44))
print(math.sqrt(16))
print(math.log10(100))
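
The comment above also lists `pi`, which the cell does not use. As a small sketch, the cell below shows that `math.pi` is a constant (no parentheses) and prints the type each function returns. The `radius` value and the `results` dictionary are illustrative names, not part of the original notebook.

```python
import math

# math.pi is a constant, not a function, so it is used without parentheses.
radius = 2.0  # hypothetical value chosen for illustration
area = math.pi * radius ** 2

# Collect the same calls as above so that value and type can be printed together.
results = {
    "ceil": math.ceil(1.44),
    "floor": math.floor(10.44),
    "sqrt": math.sqrt(16),
    "log10": math.log10(100),
}

for name, value in results.items():
    # ceil and floor give int values; sqrt and log10 give float values
    print(name, value, type(value).__name__)

print("area of the circle:", round(area, 2))
```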

### 1.2  Third-party packages
Unlike built-in modules, third-party Python packages **must be installed before they can be imported and used**.
- In Jupyter Notebook, we can use ``pip`` to install packages in the current environment.
- Here let's install the three third-party packages (``pandas``, ``numpy``, ``matplotlib``) that we will use today.  

In [None]:
# %pip install pandas numpy matplotlib    # uncomment and run if the packages are not installed yet

List all packages installed in the current environment.

In [None]:
%pip list
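
The full list can be long. To check only specific packages, the standard library can look them up by name. This is a minimal sketch; `is_installed` is a hypothetical helper name, and it only checks top-level module names.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


for name in ["pandas", "numpy", "matplotlib"]:
    status = "installed" if is_installed(name) else "missing (install it with pip first)"
    print(name, "->", status)
```

Note that the import name and the name used with ``pip`` are the same for these three packages, but that is not guaranteed for every package.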

## 2. Arrays, Dictionary and Data frames 

### 2.1 Handle arrays with `numpy`

Import `numpy` before using its functions. 

- See the `numpy` [documentation](https://numpy.org/doc/stable/reference/) for details.

In [None]:
import numpy as np

arr = np.array([[1,2,6],[4,5,1]])
arr

In [None]:
arr.shape   # (number of rows, number of columns)

In [None]:
print(arr[0])    # print the values in the first row
print(arr[0,0])  # print the value in the first row, first column

In [None]:
print(np.max(arr))         # max value of the whole (flattened) array
print(np.max(arr,axis=0))  # max of each column (compares down the rows)
print(np.max(arr,axis=1))  # max of each row (compares across the columns)
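
The `axis` argument names the dimension that gets collapsed, which is easy to mix up. The sketch below repeats the two calls on the same array and prints the shape of each result; `col_max` and `row_max` are illustrative variable names.

```python
import numpy as np

arr = np.array([[1, 2, 6], [4, 5, 1]])  # same 2 x 3 array as above

col_max = np.max(arr, axis=0)  # collapses the rows: one value per column
row_max = np.max(arr, axis=1)  # collapses the columns: one value per row

print(col_max, col_max.shape)  # three entries, one per column
print(row_max, row_max.shape)  # two entries, one per row
```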

In [None]:
arr + 10   # element-by-element computation

In [None]:
arr ** 2     # raise each element to the power of 2
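
Element-wise arithmetic is one of the main differences between a `numpy` array and a plain Python list. A short sketch of the contrast, using an illustrative one-dimensional example (`values` and `arr_1d` are names introduced here):

```python
import numpy as np

values = [1, 2, 6]          # a plain Python list
arr_1d = np.array(values)   # the same numbers as a numpy array

doubled_list = values * 2   # a list is repeated, not multiplied
doubled_arr = arr_1d * 2    # an array is multiplied element by element

print(doubled_list)
print(doubled_arr)
print(arr_1d + arr_1d)      # two arrays of the same shape are added element by element
```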

### 2.2 Handle data frames with `pandas`

Import ``pandas`` first.

- See the `pandas` [documentation](https://pandas.pydata.org/docs/reference/) for details.

First, let's create a data frame based on the array `arr`.

In [None]:
import pandas as pd

df = pd.DataFrame(data=arr,
                  columns=['a', 'b', 'c'],
                  index=['r1', 'r2'])

df       # display the data frame
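
The section title also mentions dictionaries. As a sketch, the same data frame can be built from a Python dictionary, where each key becomes a column name. The names `data` and `df_from_dict` are introduced here for illustration.

```python
import pandas as pd

# Each key becomes a column name; each list holds that column's values, top to bottom.
data = {"a": [1, 4], "b": [2, 5], "c": [6, 1]}

df_from_dict = pd.DataFrame(data, index=["r1", "r2"])

print(df_from_dict)
```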

In [None]:
df.shape       # check data shape

In [None]:
df.columns     # check column names

In [None]:
df.index       # check row names

In [None]:
df['a']        # select a column by its name
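
Columns are selected with square brackets, while rows and single cells are usually selected with `.loc` (by label) or `.iloc` (by position). A minimal sketch that rebuilds the same small data frame so the cell runs on its own:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 6], [4, 5, 1]]),
                  columns=["a", "b", "c"],
                  index=["r1", "r2"])

row_r1 = df.loc["r1"]        # select a row by its label
first_row = df.iloc[0]       # select the same row by its position
cell = df.loc["r2", "b"]     # select one value by row label and column name
two_cols = df[["a", "c"]]    # select several columns with a list of names

print(row_r1)
print(cell)
print(two_cols)
```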

Save the data frame ``df`` above as a csv file named ``df.csv``. Giving only the file name is enough to save it in the working folder where this notebook is located (i.e., ``/Users/jingliu/OneDrive - Hong Kong Baptist University/Bridging_python``). 

In [None]:
df.to_csv('df.csv', index=False)    # save the data frame as a csv file in the current working directory (CWD), without the row index
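
To see what `index=False` changes, `to_csv` can be called without a file name, in which case it returns the csv text as a string instead of writing a file. The sketch below rebuilds the small data frame and compares both variants; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 6], [4, 5, 1]]),
                  columns=["a", "b", "c"],
                  index=["r1", "r2"])

csv_without_index = df.to_csv(index=False)  # row labels r1, r2 are left out
csv_with_index = df.to_csv()                # row labels are written as the first column

print(csv_without_index)
print(csv_with_index)
```

If the row labels carry information that is needed later, keep the index when saving.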

### 2.3 Simple Data Visualization with `matplotlib`

Here we have saved a csv file named ``diabetes.csv`` in the current working folder. 

- The absolute path to this file is `/Users/jingliu/OneDrive - Hong Kong Baptist University/Bridging_python/diabetes.csv`
- Note that we already imported `pandas` earlier. 

In [None]:
df = pd.read_csv('diabetes.csv')  # read a csv file saved in the current directory (relative path)

df.head()
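
A relative path such as `'diabetes.csv'` is resolved against the current working directory. If reading the file fails, a quick check like the sketch below shows where Python is looking; `cwd` and `csv_path` are names introduced here.

```python
from pathlib import Path

cwd = Path.cwd()                    # the folder that relative paths are resolved against
csv_path = cwd / "diabetes.csv"     # the absolute path the relative name points to

print(cwd)
print(csv_path)
print("file found:", csv_path.exists())
```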

Visualize the relationship between ``Age`` and ``BMI`` with a simple scatter plot. 

- ``matplotlib`` needs to be imported first (make sure it is installed).  
- See the `matplotlib` [documentation](https://matplotlib.org/stable/) for details. 

In [None]:
import matplotlib.pyplot as plt      

fig = plt.figure(figsize=(10,6))    # create a new figure; figsize is (width, height) in inches
plt.scatter(df['Age'], df['BMI'], color='lightblue')
plt.xlabel("Age")
plt.ylabel("BMI")
plt.title('BMI and Age');

In [None]:
fig.savefig('BMI and Age.jpeg')   # save the figure in the current working directory  
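
The same plot can also be written with `matplotlib`'s axes-based interface, and `savefig` accepts options such as `dpi` and `bbox_inches`. The sketch below uses a handful of made-up values rather than the real `diabetes.csv` data, and the output file name is hypothetical.

```python
from pathlib import Path

import matplotlib.pyplot as plt

# Made-up values for illustration only; they are not taken from diabetes.csv.
ages = [25, 32, 47, 51, 60]
bmis = [22.1, 27.5, 30.2, 26.8, 29.4]

fig, ax = plt.subplots(figsize=(10, 6))   # a figure plus one set of axes
ax.scatter(ages, bmis, color="lightblue")
ax.set_xlabel("Age")
ax.set_ylabel("BMI")
ax.set_title("BMI and Age")

out_path = Path("bmi_and_age_sketch.png")            # hypothetical file name
fig.savefig(out_path, dpi=150, bbox_inches="tight")  # higher resolution, trimmed margins
```

A file name without spaces, as used here, is often easier to handle in later scripts or on the command line.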

Finally, let's save our work in a readable format.

- First, go to `Kernel` -> `Restart Kernel and Run all Cells`
- Then go to `File` -> `Save and Export Notebook As` -> `HTML` or `PDF`. 
