# Part 1: Exercises on Python basics

### Exercise 1: Data types
- Use the assignment operator (=) to assign values to variables
- Try different data types
- Find more properties and operations of the different data types in the online documentation at https://www.w3schools.com/python/

Python has several built-in data types that you can use to store and manipulate data. These include:

**Numeric types**: `int`, `float`, and `complex`. These are used to represent numbers. For example, you can create them like this:

In [None]:
my_int = 5
my_int

In [None]:
my_float = 3.14
my_float

In [None]:
my_complex = 3 + 4j
my_complex

**Boolean type**: `bool`. This is used to represent truth values, either `True` or `False`. For example, you can create a boolean variable like this:

In [None]:
my_bool = True
my_bool

**String type**: `str`. This is used to represent text. You can create a string variable by enclosing text in quotation marks, like this:

In [None]:
my_string = "Hello, World!"
my_string

**List type**: `list`. This is used to represent an ordered collection of items. You can create a list by enclosing a comma-separated list of items in square brackets, like this:

In [None]:
list1 = [1,'str', True]
print(list1[1])
list1.append(3.14)
print(list1)

**Tuple type**: `tuple`. This is similar to a list, but it is immutable, which means you cannot change its contents once it is created. You can create a tuple by enclosing a comma-separated list of items in parentheses, like this:

In [None]:
tuple1 = (1,'str', True)
print(tuple1[0])
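To make the immutability point concrete, here is a minimal sketch of what happens when you try to modify a tuple. The variable names `point` and `new_point` are chosen only for this illustration.

```python
# Tuples are immutable: assigning to an element raises a TypeError
point = (1, 'str', True)
try:
    point[0] = 99
except TypeError as err:
    print(f"Cannot modify a tuple: {err}")

# To "change" a tuple, build a new one from pieces of the old one
new_point = (99,) + point[1:]
print(new_point)
```

The original tuple is left untouched; only the new name refers to the modified contents.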

**Set type**: `set`. This is used to represent an unordered collection of unique items. You can create a set by enclosing a comma-separated list of items in curly braces, like this:

In [None]:
set1 = {"18","19",True,32}
set1

Sets have several useful methods for performing set operations, such as union and intersection.

In [None]:
# Create two sets
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# Union of A and B
union_AB = A.union(B)
print(f"Union of A and B: {union_AB}")

# Intersection of A and B
intersection_AB = A.intersection(B)
print(f"Intersection of A and B: {intersection_AB}")
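As a small additional sketch, the snippet below shows that duplicates are dropped when a set is built, and that set operations can also be written with operators. The names `numbers`, `difference_AB`, and `sym_diff_AB` are illustrative only.

```python
# Duplicates are removed automatically when a set is created
numbers = {1, 2, 2, 3, 3, 3}
print(numbers)

A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# Operator forms of the set methods
print(A | B)             # union, same as A.union(B)
print(A & B)             # intersection, same as A.intersection(B)

difference_AB = A - B    # items in A but not in B
sym_diff_AB = A ^ B      # items in exactly one of the two sets
print(difference_AB)
print(sym_diff_AB)
```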

**Dictionary type**: `dict`. This is used to represent a collection of key-value pairs. You can create a dictionary by enclosing a comma-separated list of key-value pairs in curly braces, like this:

In [None]:
dict1 = {
  "brand": "Ford",
  "electric": False,
  "year": 1964,
  "colors": ["red", "white", "blue"]
}
dict1

You can access the value associated with a key using square brackets:

In [None]:
value = dict1["year"]
print(value)
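Beyond reading a value, a few other dictionary operations come up often. The following is a short sketch using a separate dictionary named `car` (a name made up for this example) so that `dict1` above stays unchanged.

```python
car = {"brand": "Ford", "year": 1964}

car["color"] = "red"   # add a new key-value pair
car["year"] = 1965     # update the value of an existing key

# .get() returns a default instead of raising KeyError when the key is missing
owner = car.get("owner", "unknown")
print(owner)

# Iterate over all key-value pairs
for key, value in car.items():
    print(f"{key}: {value}")
```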

### Exercise 2: Loops
- Calculate the sum from 1 to 100.
- Calculate the sum of the even numbers from 1 to 100.
- Try both "for loop" and "while loop"

In [None]:
# Calculate the sum from 1 to 100
total = 0  # avoid the name `sum`, which would shadow Python's built-in sum()
for i in range(1, 101):
    total += i
print(f"The sum from 1 to 100 is: {total}")

# Calculate the sum of the even numbers from 1 to 100
even_sum = 0
for i in range(2, 101, 2):
    even_sum += i
print(f"The sum of the even numbers from 1 to 100 is: {even_sum}")

In [None]:
# Calculate the sum from 1 to 100
total = 0  # avoid the name `sum`, which would shadow Python's built-in sum()
i = 1
while i <= 100:
    total += i
    i += 1
print(f"The sum from 1 to 100 is: {total}")

# Calculate the sum of the even numbers from 1 to 100
even_sum = 0
i = 2
while i <= 100:
    even_sum += i
    i += 2
print(f"The sum of the even numbers from 1 to 100 is: {even_sum}")
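One way to check the loop results is against the closed-form formulas 1 + 2 + ... + n = n(n + 1)/2 and 2 + 4 + ... + 2m = m(m + 1). The sketch below assumes n = 100; the names `n`, `m`, `closed_total`, and `closed_even_total` are illustrative.

```python
n = 100
closed_total = n * (n + 1) // 2    # 1 + 2 + ... + n

m = n // 2                         # number of even terms up to n
closed_even_total = m * (m + 1)    # 2 + 4 + ... + 2m

# Recompute with a loop and compare against the formulas
loop_total = 0
loop_even_total = 0
for i in range(1, n + 1):
    loop_total += i
    if i % 2 == 0:
        loop_even_total += i

print(closed_total, loop_total)
print(closed_even_total, loop_even_total)
```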

### Exercise 3: Write a function
- Write a function to calculate the area of a rectangle
- It takes two arguments, length and width, and returns the area of a rectangle with the given dimensions
- The function should check that the input values are positive numbers and print an error message if they are not.

In [None]:
def rectangle_area(length, width):
    if length <= 0 or width <= 0:
        print("Length and width must be positive numbers")
        return
    return length * width

In [None]:
rectangle_area(2,4)

In [None]:
rectangle_area(1,-1)
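The version above prints a message and returns `None` for invalid input. A common alternative, sketched here under the hypothetical name `rectangle_area_strict`, is to raise an exception so the caller decides how to handle the problem. This is a variation on the exercise, not a requirement of it.

```python
def rectangle_area_strict(length, width):
    """Return the area of a rectangle, raising ValueError for non-positive sides."""
    if length <= 0 or width <= 0:
        raise ValueError("Length and width must be positive numbers")
    return length * width

print(rectangle_area_strict(2, 4))

# The caller handles the error explicitly
try:
    rectangle_area_strict(1, -1)
except ValueError as err:
    print(f"Error: {err}")
```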

### Exercise 4: File operations
- Create a new file
- Write to the file
- Close the file
- Read the contents of the file

In [None]:
# Step 1: Create a new file
file = open("example.txt", "w")

# Step 2: Write to the file
file.write("This is an example of writing to a file using Python.\n")
file.write("We can write multiple lines by calling the write method multiple times.\n")
file.write("Don't forget to close the file when you're done!\n")

# Step 3: Close the file
file.close()

# Step 4: Read the contents of the file
file = open("example.txt", "r")
contents = file.read()
print(contents)
file.close()
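The same steps are often written with a `with` statement, which closes the file automatically when the block ends, even if an error occurs inside it. A minimal sketch, using the hypothetical file name `example_with.txt`:

```python
# Write to the file; it is closed automatically at the end of the block
with open("example_with.txt", "w") as f:
    f.write("First line\n")
    f.write("Second line\n")

# Read the contents back as a list of lines
with open("example_with.txt", "r") as f:
    lines = f.read().splitlines()

print(lines)
print(f.closed)  # the file object is already closed here
```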

# Part 2: Tutorials on some Python packages we will encounter

Please consult the official documentation to learn more.
- Numpy: https://numpy.org/doc/stable/
- Pandas: https://pandas.pydata.org/pandas-docs/stable/
- Matplotlib: https://matplotlib.org/stable/users/index
- NetworkX: https://networkx.org/documentation/stable/reference/index.html#

# 1. NumPy

### 1.1 Introduction

NumPy, short for Numerical Python, is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

The main feature of NumPy is its array class, `ndarray`, which stores and processes data far more efficiently than Python lists as the data grows. NumPy is often cited as being up to 50x faster than equivalent code on Python lists, although the actual speedup depends on the operation and the array size. It also comes with many supporting functions that make working with `ndarray` easy.

Unlike tuples and lists, ndarrays can only store objects of the same type (e.g. only floats or only ints).
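The single-type rule can be observed through the `dtype` attribute. In this small sketch (the names `ints` and `mixed` are illustrative), mixing integers and a float makes NumPy convert every element to float.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])  # the integers are converted to floats

print(ints.dtype)   # an integer type (the exact width depends on the platform)
print(mixed.dtype)
print(mixed)
```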

### 1.2 Get started

NumPy can be installed using pip: `pip install numpy`.

Then import it into your Python script or notebook with `import numpy as np`.

In [None]:
import numpy as np  # This is how we usually import numpy

### 1.3 Array Creation

Arrays can be created from lists using numpy.array(), or using various other methods like numpy.zeros(), numpy.ones(), numpy.arange(), numpy.linspace(), etc.

In [None]:
arr = np.array([1, 2, 3])
print(f"Array: {arr}")

Use numpy.arange() to create an array of evenly spaced values within a given interval.

In [None]:
# Create an array of 10 values from 0 to 9
arr = np.arange(10)
print(arr)

Use numpy.linspace() to create an array with a given number of evenly spaced values over a specified interval (both endpoints included by default).

In [None]:
# Create an array of 5 values from 0 to 1
arr = np.linspace(0, 1, 5)
print(arr)

Use numpy.zeros() to create an array of all zeros with a given shape.

In [None]:
# Create a 3x3 array of zeros
arr = np.zeros((3, 3))
print(arr)

Use numpy.ones() to create an array of all ones with a given shape.

In [None]:
# Create a 2x4 array of ones
arr = np.ones((2, 4))
print(arr)

Use numpy.random.rand() to create an array of random numbers from a uniform distribution over [0, 1).

In [None]:
# Create a 4x4 array of random numbers
arr = np.random.rand(4, 4)
print(arr)

Use numpy.random.randn() to create an array of random numbers from a standard normal distribution.

In [None]:
# Create a 2x3 array of random numbers
arr = np.random.randn(2, 3)
print(arr)

### 1.4 Array Operations

Standard mathematical operations on arrays are applied elementwise. This includes addition, subtraction, multiplication, division, and more.


In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(f"a + b: {a + b}")
print(f"a - b: {a - b}")
print(f"a * b: {a * b}")
print(f"a / b: {a / b}")
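Elementwise operations also work between an array and a single number, which is applied to every element. The sketch below contrasts this with a Python list, where `*` means repetition. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])

doubled = a * 2     # every element is multiplied by 2
shifted = a + 10    # 10 is added to every element
print(doubled)
print(shifted)

# With a plain list, * repeats the list instead of multiplying its elements
repeated_list = [1, 2, 3] * 2
print(repeated_list)
```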

### 1.5 Indexing and Slicing

Elements in NumPy arrays can be accessed by index. Slicing is also possible in NumPy arrays, similar to Python lists.

In [None]:
arr = np.array([1, 2, 3, 4])
print("First element:", arr[0])
print("Second element:", arr[1])
print("Last element:", arr[-1])

In [None]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print("Elements from index 1 up to (but not including) index 5:", arr[1:5])
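Indexing and slicing extend to two-dimensional arrays by giving one index or slice per axis, separated by a comma. A short sketch, using an example array named `grid`:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

print("Row 0, column 1:", grid[0, 1])
print("Second row:", grid[1])
print("Third column:", grid[:, 2])

# First two rows, last two columns (slice ends are exclusive)
block = grid[:2, 1:]
print(f"Sub-array:\n{block}")
```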

### 1.6 Advanced Array Manipulation

There are many advanced ways you can manipulate arrays such as changing the shape of arrays, stacking multiple arrays, and more.


In [None]:
arr = np.array([1, 2, 3, 4, 5])
reshaped_arr = arr.reshape((5, 1))
stacked_arr = np.hstack((arr, arr))

print(f"Reshaped array: \n{reshaped_arr}")
print(f"Stacked array: \n{stacked_arr}")

### 1.7 Mathematical Functions

NumPy provides a variety of mathematical functions that can be performed on arrays such as trigonometric functions, exponential and logarithmic functions, etc.

In [None]:
arr = np.array([1, 2, 3])

print(f"np.sin(arr): {np.sin(arr)}")
print(f"np.exp(arr): {np.exp(arr)}")
print(f"np.log(arr): {np.log(arr)}")

### 1.8 Linear Algebra in NumPy

NumPy provides many functions to perform linear algebra operations like dot product, matrix multiplication, finding determinants, solving linear equations and more.

In [None]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(f"a: \n{a}")
print(f"b: \n{b}")

dot_product = np.dot(a, b)
matrix_multiplication = a @ b
determinant = np.linalg.det(a)

print(f"Dot product: \n{dot_product}")
print(f"Matrix multiplication: \n{matrix_multiplication}")
print(f"Determinant of a: {determinant}")
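Solving linear equations is mentioned above but not shown. As a sketch, `np.linalg.solve` finds `x` such that `a @ x = rhs`; the right-hand side `rhs` here is an arbitrary example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
rhs = np.array([5, 6])

# Solve the system:  1*x0 + 2*x1 = 5
#                    3*x0 + 4*x1 = 6
x = np.linalg.solve(a, rhs)
print(f"Solution: {x}")

# Check the solution by substituting it back
print(f"a @ x: {a @ x}")
print(np.allclose(a @ x, rhs))
```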

# 2. Pandas

### 2.1 Introduction

Pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables.

Core Components of Pandas:
- Series: It's a one-dimensional array holding data of any type.
- DataFrame: It's a two-dimensional table of data with rows and columns.
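The DataFrame is covered in detail below; for completeness, here is a minimal sketch of a Series. The data and the name `scores` are made up for illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array
scores = pd.Series([9.1, 8.7, 9.3], index=['Tom', 'Steve', 'Jack'], name='Score')
print(scores)

print(scores['Steve'])        # access a value by its index label
print(scores[scores > 9.0])   # filter with a boolean condition
```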

### 2.2 Get started

Install pandas with pip by running `pip install pandas`, then import it into your Python script or notebook with `import pandas as pd`.

In [None]:
import pandas as pd

### 2.3  Creating a DataFrame

You can create a DataFrame from a dictionary. Each key-value pair in the dictionary corresponds to a column in the DataFrame.

In [None]:
data = {'Name':['Tom', 'Tom', 'Steve', 'Steve', 'Jack', 'Jack', 'Emma', 'Emma', 'Oliver', 'Oliver'],
        'Age':[28, 28, 29, 29, 34, 34, 30, 30, 32, 32],
        'Score':[9.1, 8.5, 8.7, 8.9, 9.3, 8.8, 9.0, 8.6, 9.2, 8.7],
        'Subject':['Math', 'Science', 'Math', 'Science', 'Math', 'Science', 'Math', 'Science', 'Math', 'Science']}
df = pd.DataFrame(data)
print(df)

Or you can read data from a CSV file using the `read_csv` function.

In [None]:
# df = pd.read_csv('file.csv')
# print(df.head())

Then, you can write data to a CSV file using the `to_csv` function.

In [None]:
# df.to_csv('file.csv')

### 2.4 Dataframe Manipulation

The index of a DataFrame works like an address: it is how any data point in a DataFrame or Series can be accessed.

This is how to display the index of the DataFrame.

In [None]:
print("DataFrame Index: ", df.index)

You can also display the column names of the DataFrame.

In [None]:
print("DataFrame Columns: ", df.columns)

You can select a single row by its index label, or several rows by passing a list of labels.

In [None]:
print(df.loc[0]) # Selects the first row

In [None]:
print(df.loc[[0, 3]])  # Selects the rows labeled 0 and 3

You can sum the selected column.

In [None]:
# Sum of the 'Age' column
print(df['Age'].sum())

You can add a new column directly.

In [None]:
df['NewColumn'] = range(1, len(df) + 1)
print(df)

Or you can add a new column from a Series. Its values are matched to rows by index label, not by position.

In [None]:
df['NewColumn2'] = pd.Series(range(len(df), 0, -1),index =[0,1,2,3,4,9,8,7,6,5])
df
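Assigning a Series as a column aligns on index labels, which can be surprising. The sketch below, on a small made-up DataFrame named `frame`, contrasts label alignment with positional assignment through `to_numpy()`.

```python
import pandas as pd

frame = pd.DataFrame({'letter': ['a', 'b', 'c']})   # default index 0, 1, 2
values = pd.Series([10, 20, 30], index=[2, 1, 0])   # labels in reverse order

# Aligned by label: the row labeled 0 receives the value labeled 0 (30)
frame['aligned'] = values

# Converting to a NumPy array discards the labels, so assignment is by position
frame['positional'] = values.to_numpy()

print(frame)
```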

You can add a new row to the DataFrame.

In [None]:
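df.reset_index(drop=True, inplace=True)  # drop=True avoids adding the old index as a column
new_data = {'Name':'John', 'Age':31, 'Score':8.9, 'Subject':'Math',
            'NewColumn':11, 'NewColumn2':0}
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead
df = pd.concat([df, pd.DataFrame([new_data])], ignore_index=True)
print(df)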

You can select and filter data based on conditions. For example, to select rows where Age is greater than 30:

In [None]:
print(df[df['Age'] > 30])

You can sort data like this:

In [None]:
print(df.sort_values('Age'))

You can group rows of data together based on some column value:

In [None]:
# Grouping data by 'Name' and calculating mean score
grouped = df.groupby(['Name'])['Score'].mean()
print(grouped.to_frame())

Combine DataFrames together:

- The join() function is used to combine two DataFrames on a common index or column. For example, suppose we have two DataFrames, df1 and df2, that share a `name` column but otherwise have different columns:

In [None]:
# Create two dataframes with some common and some different columns
df1 = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie', 'David'],
                    'age': [25, 30, 35, 40],
                    'gender': ['F', 'M', 'M', 'M']})

df2 = pd.DataFrame({'name': ['Alice', 'Bob', 'Eve', 'Frank'],
                    'height': [160, 170, 165, 180],
                    'weight': [50, 70, 55, 80]})

print(f"df1: \n{df1}\n")
print(f"df2: \n{df2}\n")

# Perform an outer join on the two dataframes using the name column as the key
# This will include all rows from both dataframes, and fill missing values with NaN
outer_join = df1.join(df2.set_index('name'), on='name', how='outer')
print(f"Outer join result:\n{outer_join}\n")

# Perform an inner join on the two dataframes using the name column as the key
# This will include only rows that have a matching value in the name column in both dataframes
inner_join = df1.join(df2.set_index('name'), on='name', how='inner')
print(f"Inner join result:\n{inner_join}\n")

- The concat() function is used to combine two or more DataFrames along a specified axis. For example, suppose we have two DataFrames, df1 and df2, that have the same columns but different rows:

In [None]:
import pandas as pd

# create two sample DataFrames
df1 = pd.DataFrame({"name": ["Alice", "Bob", "Charlie"], "age": [25, 30, 35]})
df2 = pd.DataFrame({"name": ["David", "Eve", "Frank"], "age": [40, 45, 50]})
print(f"df1: \n{df1}\n")
print(f"df2: \n{df2}\n")


# concatenate them along the 0-axis
result = pd.concat([df1, df2], axis=0, ignore_index=True)

# print the result
print(f"result: \n{result}\n")
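A column-based join like the one on `name` shown earlier can also be written with `pd.merge`, which matches on a shared column directly and so needs no `set_index` call. A minimal sketch with made-up DataFrames `left` and `right`:

```python
import pandas as pd

left = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]})
right = pd.DataFrame({'name': ['Alice', 'Bob', 'Eve'], 'height': [160, 170, 165]})

# Keep only names present in both DataFrames
inner = pd.merge(left, right, on='name', how='inner')
print(f"Inner merge:\n{inner}\n")

# Keep all names from both DataFrames; missing values become NaN
outer = pd.merge(left, right, on='name', how='outer')
print(f"Outer merge:\n{outer}\n")
```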


### 2.5 Handling Missing Data (Detecting, Dropping, and Filling)

This is a DataFrame with some missing values. 

In [None]:
import numpy as np

data = {'A':[1, 2, np.nan], 'B':[5, np.nan, np.nan], 'C':[1, 2, 3]}
df = pd.DataFrame(data)

print(df)

The `isnull` function returns a DataFrame where each cell is either True or False depending on that cell's null status.

In [None]:
print(df.isnull())
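Because `True` counts as 1 when summed, chaining `.sum()` after `isnull()` gives the number of missing values per column. A short sketch, rebuilding the same data under the illustrative name `frame`:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# Count the missing values in each column
missing_per_column = frame.isnull().sum()
print(missing_per_column)

# Total number of missing cells in the whole DataFrame
total_missing = int(missing_per_column.sum())
print(total_missing)
```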

The `dropna` function removes missing values. By default, it removes any row which contains at least one missing value.

In [None]:
print(df.dropna())

You can also remove columns which contain at least one missing value by setting the `axis` parameter to 1.

In [None]:
print(df.dropna(axis=1))

The `fillna` function fills the missing values with a value you specify.

For example, you can fill all missing values with the mean of the non-missing values in the same column:

In [None]:
print(df.fillna(value=df.mean()))

# 3. Matplotlib

### 3.1 Introduction to Matplotlib
Matplotlib is a powerful data visualization library for the Python programming language. It provides an object-oriented API for creating a wide range of static, animated, and interactive visualizations. Matplotlib is built on top of NumPy, a fundamental library for scientific computing in Python, and can be used in conjunction with other Python libraries such as SciPy and Pandas. With Matplotlib, you can create high-quality figures and graphs that can be customized to your needs. It is widely used in academia and industry for data analysis and presentation.

### 3.2 Get started
Import it into your Python script or notebook with `import matplotlib.pyplot as plt`.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

### 3.3 Plotting different charts
Matplotlib can create a wide variety of plots, including line plots, scatter plots, bar plots, pie charts, and histograms. To create a plot, you need to use the appropriate plotting function.

In [None]:
# Line plot
x = np.arange(10)
y = np.random.rand(10)
plt.plot(x, y)
plt.show()

In [None]:
# Scatter plot
x = np.random.rand(100)
y = np.random.rand(100)
plt.scatter(x, y)
plt.show()

In [None]:
# Bar plot
x = ['A', 'B', 'C']
y = [3, 1, 4]
plt.bar(x, y)
plt.show()

In [None]:
# Pie chart
x = ['A', 'B', 'C']
y = [3, 1, 4]
plt.pie(y, labels=x)
plt.show()

In [None]:
# Histogram
x = np.random.randn(1000)
plt.hist(x)
plt.show()

### 3.4 Customizing display attributes
You can change the appearance of your plots by customizing the display attributes.

Change the color and marker style of a line plot.

In [None]:
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y, color='red', marker='o')
plt.show()

### 3.5 Adding labels and titles
You can add labels to the x- and y-axes of your plot using the `xlabel` and `ylabel` functions. You can also add a title to your plot using the `title` function.

In [None]:
# Adding labels and titles
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.title('A simple line plot')
plt.show()

### 3.6 Multiple plotting
You can plot multiple lines on the same graph by calling the plotting function multiple times before calling `show`.

In [None]:
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)
plt.plot(x, y1)
plt.plot(x, y2)
plt.show()

### 3.7 Legends
You can add a legend to your plot to label the different lines. To do this, you need to pass the `label` argument to the plotting function when creating each line and then call the `legend` function before calling `show`.

In [None]:
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)
plt.plot(x, y1, label='sin(x)')
plt.plot(x, y2, label='cos(x)')
plt.legend()
plt.show()
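The examples above use the `plt` functions directly. The object-oriented API mentioned in the introduction works with explicit figure and axes objects instead, which is convenient when placing several plots side by side. A minimal sketch using `plt.subplots`:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

# One figure with two axes arranged in 1 row and 2 columns
fig, axes = plt.subplots(1, 2, figsize=(8, 3))

axes[0].plot(x, np.sin(x), color='red')
axes[0].set_title('sin(x)')
axes[0].set_xlabel('x')

axes[1].plot(x, np.cos(x), color='blue')
axes[1].set_title('cos(x)')
axes[1].set_xlabel('x')

fig.tight_layout()  # reduce overlap between the two subplots
plt.show()
```

Each axes object has its own `set_title`, `set_xlabel`, and `set_ylabel` methods, corresponding to the `plt.title`, `plt.xlabel`, and `plt.ylabel` functions used earlier.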

# 4. NetworkX

### 4.1 Introduction to NetworkX
NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. It provides tools for working with graphs and networks, including classes for graph objects, generators to create standard graphs, IO routines for reading in existing datasets, algorithms to analyze the resulting networks and some basic drawing tools.

### 4.2 Get started
To get started with NetworkX, install it by running `pip install networkx` on the command line, or `!pip install networkx` in a Jupyter notebook cell. Then import it into your Python script or notebook with `import networkx as nx`.

### 4.3 Creating a Graph

You can create an empty graph with no nodes and no edges using the `Graph` class:

In [None]:
import networkx as nx
G = nx.Graph()

By definition, a Graph is a collection of nodes (vertices) along with identified pairs of nodes (called edges, links, etc). In NetworkX, nodes can be any hashable object e.g., a text string, an image, an XML object, another Graph, a customized node object, etc.

In [None]:
import math
g = nx.Graph()
g.add_node('string')
g.add_node(math.cos)  # cosine function
f = open('temp.txt', 'w')  # file handle
g.add_node(f)
print(g.nodes())
f.close()  # close the file handle once we are done with it

### 4.4 Adding and Removing Nodes

The graph `G` can be grown in several ways. You can add one node at a time:

In [None]:
G.add_node(1)

Or add nodes from any iterable container, such as a list:

In [None]:
G.add_nodes_from([2, 3])

Use matplotlib to help draw the graph.

In [None]:
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (4, 3) #Set the image size
nx.draw(G, node_color='red', with_labels = True)
plt.show()

You can also add nodes from another graph.

In [None]:
h = nx.path_graph(5)  # A path graph connects its nodes in a single line: 0-1-2-3-4
G.add_nodes_from(h)  # Add all the nodes (but not the edges) from graph h to graph G
nx.draw(G,node_color='red',with_labels = True)
plt.show()

You can also remove any node of the graph.

In [None]:
G.remove_node(2)
nx.draw(G,node_color='red',with_labels = True)
plt.show()

### 4.5 Adding Edges

`G` can be grown by adding one edge at a time:

In [None]:
G.add_edge(1, 2)
G.add_edge(3, 4, weight=0.1)
nx.draw(G,node_color='red',with_labels = True)
plt.show()

Or by adding a list of edges:

In [None]:
G.add_edges_from([(1, 3), (2, 4)])
nx.draw(G,node_color='red',with_labels = True)
plt.show()

You can add edges with weights (let's do this in a different graph G1):

In [None]:
G1 = nx.Graph()
G1.add_edge(0, 1, weight = 0.1)
G1.add_edge(0, 2, weight = 1.5)
G1.add_edge(0, 3, weight = 1.0)
G1.add_edge(0, 4, weight = 2.2)
weights = [G1[u][v]['weight'] for u,v in G1.edges()]
nx.draw(G1, node_color='yellow', with_labels = True, width=weights)
plt.show()

### 4.6 Accessing nodes and edges

Show all the nodes.

In [None]:
print("Nodes of G:", G.nodes())

Show all the edges.

In [None]:
print("Edges:", G.edges())

Give the number of nodes and edges.

In [None]:
print("Number of nodes:", G.number_of_nodes())
print("Number of edges:", G.number_of_edges())

Show the neighbors of a node and give its degree.

In [None]:
print("Neighbors of the node 1:", list(G.neighbors(1)))
print("Degree of the node 1:", G.degree(1))
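The introduction mentions algorithms for analyzing networks. As one small illustration, the sketch below builds a separate weighted graph (the name `road` and its node labels are made up) and finds the lowest-weight path between two nodes with `nx.shortest_path`.

```python
import networkx as nx

road = nx.Graph()
road.add_edge('A', 'B', weight=1.0)
road.add_edge('B', 'C', weight=2.0)
road.add_edge('A', 'C', weight=5.0)

# Passing weight='weight' makes the search minimize total edge weight
# instead of the number of edges
path = nx.shortest_path(road, source='A', target='C', weight='weight')
length = nx.shortest_path_length(road, source='A', target='C', weight='weight')
print("Shortest path from A to C:", path)
print("Total weight:", length)

# Degree of every node as a dictionary
degrees = dict(road.degree())
print("Degrees:", degrees)
```

Here the direct edge from A to C has weight 5.0, so the path through B (total weight 3.0) is preferred.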