<a href="https://colab.research.google.com/github/shmiiing/machine_learning/blob/main/Additional_Materials/Programming_Session_0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://drive.google.com/uc?export=view&id=1gmxxmwCR1WXK0IYtNqvE4QXFleznWqQO" height="100"/>

# **<center>Machine Learning and Finance </center>**


## <center> Programming Session 0 - Introduction to Python </center>



# Introduction to Python

Welcome to this comprehensive guide designed for our course and anyone interested in learning Python, the leading programming language in data analysis and machine learning. This notebook will take you through the basics of Python and its libraries, with a focus on data handling and visualization techniques.
## Outline

- [Python Basics](#Python-Basics)

Here, we will start with the fundamentals of Python programming. Learn about variables, data types, loops, and functions to get started on your Python journey.
- [Understanding Numpy](#Understanding-Numpy)

Dive into Numpy, Python's library for numerical computing. We'll cover array creation, operations, and manipulations to help you handle large, multi-dimensional arrays and matrices.

- [Exploring Pandas](#Exploring-Pandas)

Get to know Pandas, the must-have library for data manipulation and analysis. This section will guide you through data structures like Series and DataFrames, data cleaning, and preprocessing steps.

- [Visualizing Data with Matplotlib](#Visualizing-Data-with-Matplotlib)

Learn the basics of Matplotlib, Python's plotting library. We'll explore different types of plots and visualizations to bring your data to life and gain insights from it.

# 1. Python Basics

## 0- `print` Function in Python
> In Python, the `print` function is one of the most basic and frequently used built-in functions. Its primary purpose is to display information to the console.

### Basic Usage

To print a simple message or variable's value, you can use the `print` function like this:
```
print("Hello, World!")
```

**Printing Multiple Items**
 You can print multiple items in a single call by separating them with commas:
```
name = "Alice"
print("Hello,", name, "!") # This will output: Hello, Alice !

```

**String Formatting**
Python provides several methods to format strings for display:



1.   Using f-strings:
```
age = 30
print(f"Alice is {age} years old.")
```
2.   Using the str.format() method:
```
print("Alice is {} years old.".format(age))
```
3. Using %-formatting:
```
print("Alice is %d years old." % age)
```

**End and Separator**

The print function also offers parameters like end and sep to control the output:

- end: Specifies what to print at the end.
- sep: Specifies how to separate multiple items.
```
print("Hello", "World", sep="-", end="!") # This will output: Hello-World!
```




- Use the `print` function to display the message "Hello, Python learners!" to the console.
- Print the words "Python", "is", "fun" to the console, but use dashes (`-`) instead of spaces as separators between the words.

In [None]:
##Q0 Insert your code here
print("Hello, Python learners!")
print("Python", "is", "fun", sep="-")

## 1 - Basic Variable Manipulation
> Let's delve into variable creation in Python.

> When creating a variable in Python, the computer allocates a spot in the RAM. This spot holds the value of the created variable.

> To establish a connection between a variable's name and its content, we use the symbol '='. The syntax is as follows:
`variable_name = variable_value`. Commonly, we say we are assigning the variable_value to our variable variable_name, but in reality, Python associates a reference, an address to our variable_value with our `variable_name`. Thus, when you copy a variable, you are only copying the reference to that variable, making the copy operation faster. We will explore this concept further below.

>- Create a variable called `my_variable`.
> - Assign it the value 2.


In [None]:
## Q1: Insert your code here

>- Create another variable called global.
>- Assign it any value.

In [None]:
## Q2: Insert your code here

> Python returns an invalid syntax error. Indeed, there are **reserved words** in Python, and **global** is one of them. Similarly, a variable name cannot start with a digit.

<a href="https://fr.wikibooks.org/wiki/Programmation_Python/Tableau_des_mots_r%C3%A9serv%C3%A9s"> For more information on **reserved words** </a>

> To display the value of a variable, there are several methods. Here are two:

> 1. Simply type and enter the variable name at the keyboard.
> 2. Use the print() function.
> `print(X)` will output the value of the variable X.


>- Create a variable called `number`.
>- Assign it the value 3.0.
>- Display the value on the screen using the first method.



In [None]:
## Q3: Insert your code here

> A variable is a piece of data stored by the computer in a specific location in the RAM. There are various types of variables in Python:
<center>
<table>
  <tr>
    <th>Type</th>
    <th>Meaning</th>
    <th>Example</th>
  </tr>
  <tr>
    <td><i>int</i></td>
    <td>Integer number</td>
    <td>2</td>
  </tr>
  <tr>
    <td><i>float</i></td>
    <td>Floating-point number</td>
    <td>3.0</td>
  </tr>
  <tr>
    <td><i>complex</i></td>
    <td>Complex number</td>
    <td>2j</td>
  </tr>
  <tr>
    <td><i>str</i></td>
    <td>String</td>
    <td>"Dauphine"</td>
  </tr>
  <tr>
    <td><i>tuple</i></td>
    <td>Fixed-length list</td>
    <td>(1,2)</td>
  </tr>
  <tr>
    <td><i>list</i></td>
    <td>Variable-length list</td>
    <td>[1,2]</td>
  </tr>
  <tr>
    <td><i>dict</i></td>
    <td>Dictionary</td>
    <td>{0:'a',1:'b'}</td>
  </tr>
  <tr>
    <td><i>bool</i></td>
    <td>Boolean</td>
    <td>True</td>
  </tr>
</table>
</center>




> To know the type of data in Python, use `type(X)` where `X` is the variable whose type you want to know.
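
As a quick illustration of `type()`, the sketch below applies it to the example values from the table above. The variable names `value` and `value_type` are hypothetical and chosen only for this example.

```python
# type() returns the class of the object it is given.
print(type(2))           # <class 'int'>
print(type(3.0))         # <class 'float'>
print(type(2j))          # <class 'complex'>
print(type("Dauphine"))  # <class 'str'>
print(type((1, 2)))      # <class 'tuple'>
print(type([1, 2]))      # <class 'list'>
print(type({0: "a"}))    # <class 'dict'>
print(type(True))        # <class 'bool'>

# The result can be stored and compared like any other value.
value = 3.0
value_type = type(value)
print(value_type is float)  # True
```
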

- Create four variables named `a`, `b`, `c`, and `d`.
- Assign:
>1. `a` the value 2
>2. `b` the value 3.0
>3. `c` the value "Hello"
>4. `d` the value True

In [None]:
## Q4: Insert your code here

> To better grasp the concept of references in Python, we suggest the following exercise:

- Create a variable `a` and assign it a list [1,2].
- Create another variable `b` and assign it the value of `a`.
- Add the element 3 to b by writing `b.append(3)` where 3 is the element you want to add to the list `b`.
- Display `a`. What do you notice?

In [None]:
## Q5: Insert your code here

> In reality, `a` and `b` refer to the same object: the assignment `b = a` copies the reference, not the list. Because a list is a mutable object, a change made through `b` is also visible through `a`. We haven't covered objects yet, but understand that in Python, everything you manipulate is an object.

> There are thus two kinds of objects: mutable (lists, dictionaries) and immutable (strings, int, complex, floats, tuples). Mutable objects can be altered in place after creation. By contrast, when you "modify" an immutable object, Python creates a new object at a new memory address and rebinds the variable name to it; the original object is left unchanged.
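
The difference between mutable and immutable objects can be checked with the `is` operator, which tests whether two names refer to the same object. This is only a minimal sketch; the names `prices`, `alias`, `label` and `other` are hypothetical.

```python
# Mutable case: both names refer to one and the same list.
prices = [10, 20]
alias = prices            # copies the reference, not the list
alias.append(30)          # in-place change, visible through both names
same_list = alias is prices

# Immutable case: "modifying" a string builds a new object.
label = "spot"
other = label             # both names refer to the same string
other = other + "_fx"     # creates a new string and rebinds `other`
same_string = other is label

print(prices, same_list)   # [10, 20, 30] True
print(label, same_string)  # spot False
```

To obtain an independent list rather than a second reference, `prices.copy()` or `list(prices)` creates a new list object.
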

> The following exercise illuminates this principle.
- Create a variable `a` and assign it the value 1.
- Create a variable `b` and assign it the value of `a`.
- Increment b by 1.
- Display `a`.

In [None]:
## Q6: Insert your code here

##  Lists in Python

> Lists in Python are ordered collections of items which can be of any type. Lists are very flexible and can be modified after they have been created. A list is created by placing all the items (elements) inside square brackets `[]`, separated by commas.

Here's a simple example:

```
my_list = [1, 2, 3, 4, 5]
```
 **Accessing Elements**
> Elements in a list can be accessed using an index, with the first element at index 0. For example:

```
first_element = my_list[0]  # This will be 1
```

**Modifying Lists**
> Lists are mutable, meaning that you can change their content:
```
my_list[0] = 10  # Now, my_list is [10, 2, 3, 4, 5]
```

**Length of a List**

> The length of a list can be obtained with the len() function:
```
list_length = len(my_list)  # This will be 5
```

**Adding Elements**
> You can add elements to the end of a list using the append() method:
```
my_list.append(6)  # Now, my_list is [10, 2, 3, 4, 5, 6]
```

**Removing Elements**
> You can remove elements from a list using the remove() method, or the pop() method which removes and returns the last item:

```
my_list.remove(2)  # Now, my_list is [10, 3, 4, 5, 6]
last_item = my_list.pop()  # last_item is 6, my_list is [10, 3, 4, 5]
```

- Create a list named fruits containing the following items: apple, banana, cherry.
- Print the second item in the fruits list.
- Change the value of the second item of the fruits list to blackberry.
- Add orange to the end of the fruits list.
- Remove apple from the fruits list.


In [None]:
## Q7: Insert your code here

## 2- Dictionary in Python

Dictionaries in Python are collections of key-value pairs, where each key must be unique. They are mutable and, since Python 3.7, preserve the order in which keys were inserted. Dictionaries are defined by enclosing a comma-separated sequence of key-value pairs in curly braces `{}`, with a colon `:` separating each key from its value.

```
my_dict = {
    'key1': 'value1',
    'key2': 'value2',
    'key3': 'value3',
}
```

**Accessing Elements**
> To access the value associated with a particular key, put the key in square brackets after the dictionary name.
```
print(my_dict['key1'])  # Output: value1
```

**Adding and Updating Elements**

> You can add new key-value pairs or update the value of an existing key.
```
my_dict['key4'] = 'value4'  # Adds a new key-value pair
my_dict['key1'] = 'new_value1'  # Updates the value of an existing key
```


**Removing Elements**

> You can remove a particular key-value pair with the pop method, or remove all entries with the clear method.
```
my_dict.pop('key2')  # Removes the key-value pair with key 'key2'
my_dict.clear()  # Removes all key-value pairs
```

- Create a dictionary student with keys 'name', 'age', and 'course', and assign some values to these keys.
- Print the value associated with the key 'name'.

In [None]:
## Q8: Insert your code here

- Update the age in the student dictionary to 26.
- Add a new key-value pair 'grade' with a value 'A' to the student dictionary.
- Print the updated dictionary.


In [None]:
## Q9: Insert your code here

- Remove the key-value pair 'course' from the student dictionary using the pop method.
- Print the updated dictionary.


In [None]:
## Q10: Insert your code here

> Now we are going to look at the different operators in Python. The following table summarizes the different mathematical operators.

<center>

| Symbol | Effect                | Example         |
|--------|-----------------------|-----------------|
| +      | Addition              | 6 + 4 returns 10|
| -      | Subtraction           | 6 - 4 returns 2  |
| *      | Multiplication        | 6 * 4 returns 24 |
| /      | Real Division         | 6 / 4 returns 1.5|
| **     | Exponentiation        | 12 ** 2 returns 144 |
| //     | Integer Division      | 6 // 4 returns 1 |
| %      | Remainder of Division | 6 % 4 returns 2  |

</center>
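
The operators in the table can be tried on a pair of arbitrary numbers. The sketch below (with hypothetical variable names) also shows how integer division and the remainder fit together.

```python
a, b = 17, 5  # arbitrary example values

quotient = a // b       # integer division
remainder = a % b       # remainder of the division
true_division = a / b   # real division, always returns a float
power = b ** 2          # exponentiation

# Integer division and remainder rebuild the original number.
rebuilt = quotient * b + remainder

print(quotient, remainder, true_division, power, rebuilt)  # 3 2 3.4 25 17
```
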

- Define a variable `age`.
- Assign the value 35 to `age`.
- Display the number of days associated with the age using a mathematical operator.

In [None]:
## Q11: Insert your code here

- Define a new variable `day`.
- Assign to day a number of days greater than 10,000.
- Display the corresponding number of years using a mathematical operator. Assume that a year has 365 days.

In [None]:
## Q12: Insert your code here

- Create the variables `distance` and `time`.
- Assign the value 15 to `distance`.
- Assign the value 14.4 to `time`.
- Create a new variable `speed` and assign it the corresponding speed, using the formula $speed = \frac{distance}{time}$.
- Display the variable `speed`.


In [None]:
## Q13: Insert your code here

There are two other types of operators in Python that return Boolean values (True or False): logical operators and comparison operators.   

Here's a table summarizing the logical operators:
<center>

| Expression | Meaning |
|------------|---------|
| X or Y     | Logical OR. If either X or Y is True, the expression is True. If neither is True, the expression is False. |
| X and Y    | Logical AND. If both X and Y are True, the expression is True. Otherwise, the expression is False. |
| not X      | Logical NOT. Opposite of X. If X is True, it returns False. If X is False, it returns True. |

</center>
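
As an illustration of the table, here is a small sketch that combines the three logical operators. The variable names are hypothetical and serve only as an example.

```python
is_weekday = True
is_holiday = False

# `and` needs both sides to be True; `not` flips a boolean.
market_open = is_weekday and not is_holiday
# `or` needs at least one side to be True.
day_off = (not is_weekday) or is_holiday

print(market_open, day_off)  # True False
```
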


- Create two variables `x` and `y`.
- Assign `True` to `x` and `False` to `y`.
- Create a new variable `z_false`.
- Using a logical operator with `x` and `y`, assign the value `False` to `z_false`.
- Create a new variable `z_true`.
- Using a logical operator with `x` and `y`, assign the value `True` to `z_true`.
- Display `z_false`.
- Display `z_true`.

In [None]:
## Q14: Insert your code here

Now we turn our attention to the last type of operators: comparison operators.

Below is a table summarizing the comparison operators:
<center>

| Expression | Meaning            |
|------------|--------------------|
| <          | Strictly less than |
| >          | Strictly greater than |
| <=         | Less than or equal to |
| >=         | Greater than or equal to |
| ==         | Equality            |
| !=         | Inequality          |

</center>
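
Each comparison returns a boolean. The sketch below uses hypothetical values; the last line shows that Python also accepts chained comparisons, which are not covered in the table.

```python
price = 101.5
strike = 100

print(price > strike)    # True
print(price <= strike)   # False
print(price != strike)   # True

# Chained comparison: equivalent to (100 <= price) and (price < 105).
in_range = 100 <= price < 105
print(in_range)          # True
```
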

- Create two variables `x` and `y`.
- Assign a float `2.0` to `x` and an int `2` to `y`.
- Return `True` in three different ways using only comparison operators.

In [None]:
## Q15: Insert your code here

> Conditionals are a fundamental concept in programming, allowing you to perform different actions based on certain conditions. In Python, the `if`, `elif` (else if), and `else` statements are used to control the flow of execution in a program based on the evaluation of specified conditions.





```
if condition:
    # code to execute if condition is True
elif another_condition:
    # code to execute if another_condition is True
else:
    # code to execute if no conditions are True
```

### `if` Statement

>The `if` statement evaluates a condition and executes the indented block of code only if the condition is true.



```
x = 10
if x > 5:
    print("x is greater than 5")
```

### `elif` Statement

> The `elif` (else if) statement allows you to check multiple conditions, executing its block of code if its condition is true and all previous conditions have been false.



```
x = 10
if x > 15:
    print("x is greater than 15")
elif x > 5:
    print("x is greater than 5 but not greater than 15")
```

### `else` Statement

The `else` statement executes its block of code if no previous conditions have been true.



```
x = 10
if x > 15:
    print("x is greater than 15")
else:
    print("x is not greater than 15")
```
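
The three statements are usually combined into a single chain. In the sketch below (hypothetical variable names and thresholds), conditions are tested from top to bottom and only the first matching block runs.

```python
temperature = 12  # hypothetical reading in degrees Celsius

if temperature < 0:
    label = "freezing"
elif temperature < 15:
    label = "cool"      # first true condition: this block runs
elif temperature < 25:
    label = "mild"      # also true, but never reached
else:
    label = "hot"

print(label)  # cool
```
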





- You are given the age variable.
- Write a sequence of instructions that prints whether the person is a minor (under 18), an adult (18 to 64), or a senior (65 or older).

In [None]:
## Q16: Insert your code here

- You are given the variable number.
- Write a sequence of instructions that checks whether the number is positive, negative, or zero and prints the result.

In [None]:
## Q17: Insert your code here

## To go further..
### Exercise 1: Mathematical Operators
- Define a variable `radius` with a value of `7`.
- Calculate the area of a circle using the formula $Area = \pi r^2$ (you can use `3.1415` for $\pi$) and store the result in a variable called `area`.
- Display the `area` variable.

### Exercise 2: Logical Operators
- Create two variables `p` and `q`.
- Assign `True` to `p` and `False` to `q`.
- Create a new variable `r` and use a logical operator to assign it a value of `True` based on the values of `p` and `q`.
- Create a new variable `s` and use a different logical operator to assign it a value of `False` based on the values of `p` and `q`.
- Display `r` and `s`.

### Exercise 3: Comparison Operators
- Create two variables `a` with a value of `10` and `b` with a value of `20`.
- Return `True` in three different ways using only comparison operators.

### Exercise 4: Variable Types and Casting
- Define four variables `m`, `n`, `o`, and `p`.
- Assign the following values to them: `5`, `5.0`, `"5"`, and `True`, respectively.
- Display the type of each variable using the `type()` function.
- Cast variable `n` to integer, `o` to float, and `p` to string, and display the updated variable types.


### Exercise 5: Understanding References in Python
Understanding how references work in Python is crucial, especially when dealing with mutable and immutable objects. In this exercise, you will explore this concept through a practical example:

- Create a variable `str_1` and assign the string value `"hello"` to it.
- Now create another variable `str_2` and assign `str_1` to it.
- Modify `str_2` by concatenating the string `" world"` to it (i.e., `str_2 = str_2 + " world"`).
- Display both `str_1` and `str_2`.
- Now create a variable `list_1` and assign a list containing two string elements: `["hello", "world"]` to it.
- Create another variable `list_2` and assign `list_1` to it.
- Modify `list_2` by appending another string `"!"` to it (i.e., `list_2.append("!")`).
- Display both `list_1` and `list_2`.

## 3- Loops and Functions in Python

### Introduction to Loops in Python

Loops in programming allow for the repetition of a block of code as long as a specified condition is met. They are essential for performing repetitive tasks with minimal code. Python provides two main types of loops: `for` loops and `while` loops.

#### For Loops
A `for` loop in Python is used to iterate over a sequence (such as a list, tuple, or string) or other iterable objects.


```
# Example of a for loop
for i in range(5):
    print(i)
```

#### While Loops
A `while` loop in Python is used to repeatedly execute a block of code as long as a condition is true.

```
# Example of a while loop
count = 0
while count < 5:
    print(count)
    count += 1
```


#### Loop Control Statements
Loop control statements change the execution of a loop from its normal sequence. Python supports the following control statements:

- `break`: Terminates the loop and transfers execution to the statement immediately following the loop.
- `continue`: Skips the rest of the loop body for the current iteration and moves on to the next iteration (a `while` loop retests its condition first).


```
# Example using break and continue
for num in range(10):
    if num == 5:
        break  # Exit loop when num is 5
    elif num == 3:
        continue  # Skip iteration when num is 3
    print(num)
```








As a reminder, **a block in Python is defined by a certain level of indentation** (and begins after a line ending with "**`:`**").

The condition of an `if` or `while` statement is _evaluated_ as a **boolean** (for example, if the condition is a string, it evaluates to `True` if and only if the string is non-empty).
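
The way non-boolean values are evaluated in a condition can be inspected with `bool()`. This is a minimal sketch with arbitrary sample values; the names `values` and `truthy` are hypothetical.

```python
values = ["", "text", 0, 3.5, [], [0], None]

truthy = []
for value in values:
    print(repr(value), "->", bool(value))
    if value:                 # same evaluation as bool(value)
        truthy.append(value)

# Empty strings, zero, empty lists and None count as False.
print(truthy)  # ['text', 3.5, [0]]
```
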

>- Using your knowledge of the `print` function, define a `while` loop to display the multiplication table of the number 2 (factors 1 to 10):

In [None]:
## Q18: Insert your code here

>- As previously, display the multiplication table of 2, this time using a **`for`** loop.





In [None]:
## Q19: Insert your code here

### Functions in Python

Now that the loop structure has been introduced, we will present more advanced use cases.

The example of the multiplication table may seem a bit simplistic to you. To make it more general, we will define a **function** that displays this multiplication table for any number (we keep the same factors from 1 to 10 for now).

We remind you that the definition of a function in Python is done as follows:

```
def my_function(arg_1, arg_2):
    instruction_1
    ...
    instruction_N
    return something # if needed
```
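
As a concrete instance of this template, the sketch below defines a function with three arguments, a loop in its body, and a return value. The function name `compound` and its parameters are hypothetical.

```python
def compound(principal, rate, years):
    """Return the value of `principal` after `years` of annual compounding."""
    value = principal
    for _ in range(years):
        value = value * (1 + rate)
    return value

# Calling the function: 1000 invested at 5% for 2 years.
final_value = compound(1000, 0.05, 2)
print(round(final_value, 2))  # 1102.5
```
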

- Define a function, called `simple_multiplication`, that takes a number n as an argument and displays the multiplication table as in previous questions:



In [None]:
## Q20: Insert your code here

## To go further..
### Exercise 1: Sum of Natural Numbers
- Write a Python program to find the sum of all natural numbers between 1 and a given number `n` using a `for` loop.

### Exercise 2: Factorial Calculation
- Write a Python program to calculate the factorial of a given number `n` using a `for` loop.

### Exercise 3: Table of Squares
- Write a Python program that displays the table of squares from 1 to a given number `n` using a `for` loop. Each line should be formatted as "`i` squared is `i*i`".

### Exercise 4: Counting the Occurrences
- Write a Python program to count the occurrences of a specific character in a given string using a `for` loop.

### Exercise 5: Accumulating the Elements of a List
- Write a Python program to find the cumulative sum of a list using a `for` loop.

### Exercise 6: Reversing a List
- Write a Python program to reverse the order of the elements in a list using a `for` loop.

### Exercise 7: Odd-Even Count
- Write a Python program to count the number of even and odd numbers from a series of numbers using a `for` loop.


# 2. Understanding Numpy


## Context and Objective

Python is an almost indispensable programming language in the world of quantitative finance. It's simple, open source, and increasingly popular.  
In this exercise, you will learn to use the NumPy module. NumPy is a Python package specialized in the manipulation of arrays.  
This exercise will only focus on one-dimensional arrays (vectors) and two-dimensional arrays (matrices).

[For more information on NumPy](http://www.numpy.org/)

## Prerequisite Skills

- Basic programming concepts
- Lists
- Basic linear algebra concepts

## Instructions

The exercise is composed of several questions; please answer them in order.

To begin, you need to import the `numpy` module using the alias `np`. Execute the following preamble cell:



In [None]:
import numpy as np

In NumPy, an array is an ordered collection of values that all share a single data type. The values are usually numbers, but **not only numbers**: booleans and strings are also possible.

The `np.array()` function allows you to define a **one-dimensional array** from a list. Given `X` as a list of values, you can use the command `np.array(X)` to transform the list into a one-dimensional array.

* Create an array from the list `[1,1,1,1]`

In [None]:
## Q21: Insert your code here


There are commands to inquire about the variables we are manipulating. Here's a table summarizing these commands:

| Command    | Effect                                         | Example                     |
|------------|------------------------------------------------|-----------------------------|
| type(X)    | Returns the type of the variable X            | type(2) returns `<class 'int'>`      |
| np.shape(X)| Returns the dimension of the variable X       | np.shape([1,2]) returns (2,) |

By default, NumPy creates a one-dimensional array from a flat list. If you want a different shape, use the command `np.reshape(X, new_shape)`, where `X` is the array whose shape you want to change and `new_shape` is a tuple whose entries multiply to the number of elements in `X`.
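
For instance, a flat array of six values can be rearranged into two rows and three columns. This is only a small sketch; the names `flat` and `grid` are hypothetical.

```python
import numpy as np

flat = np.array([1, 2, 3, 4, 5, 6])
grid = np.reshape(flat, (2, 3))   # 2 rows, 3 columns, filled row by row

print(np.shape(flat))  # (6,)
print(np.shape(grid))  # (2, 3)
print(grid)
```
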

* Create a variable *a* and assign to it an array with the list [1,2,3,4,5]
* Verify that its dimension is indeed (5,)

In [None]:
## Q22: Insert your code here



Now that we've seen how to get information about arrays, we'd like to create some. There are various commands to generate one-dimensional arrays. Here's a table summarizing them:

| Command               | Meaning                                                        | Example                                    |
|-----------------------|----------------------------------------------------------------|--------------------------------------------|
| np.ones(n)            | Returns an array of shape (n,) filled with 1s (floats by default) | np.ones(5) returns array([1., 1., 1., 1., 1.]) |
| np.zeros(n)           | Returns an array of shape (n,) filled with 0s (floats by default) | np.zeros(5) returns array([0., 0., 0., 0., 0.]) |
| np.arange(n)          | Returns an array of shape (n,) of the integers from 0 to n-1    | np.arange(5) returns array([0, 1, 2, 3, 4])|
| np.linspace(a,b,n)    | Returns an array of shape (n,) of n numbers evenly spaced between a and b (both included) | np.linspace(0,5,5) returns array([0., 1.25, 2.5, 3.75, 5.])|
| np.linspace(a,b)      | Returns an array of shape (50,) of 50 numbers evenly spaced between a and b |                                            |
| np.concatenate((X,Y)) | Returns an array of shape (len(X)+len(Y),) obtained by joining X and Y | np.concatenate((np.array([1]), np.array([0]))) returns array([1, 0])|

- Create 3 variables a, b, c
- Assign to a an array with 5 zeros
- Assign to b an array with 5 ones
- Assign to c an array of size 10 containing 5 zeros followed by 5 ones, arranged judiciously.


In [None]:
## Q23: Insert your code here

- Generate two arrays of ordered numbers from 0 to 10 (thus of size 11) using different commands.
- Display them.

In [None]:
## Q24: Insert your code here

- Create a list `c` with numbers ranging from 0 to 10 (**a list, not an array**). Use the following syntax: `list(range())`.
- Add 5 to all the terms in `c`. *You may need to change the type of c*.
- Display `c`.


In [None]:
## Q25: Insert your code here

We can perform similar operations with matrices, which are 2-dimensional arrays.

Thus, `np.ones((n, p))` returns an `n x p` matrix filled with ones, and `np.zeros((n, p))` returns an `n x p` matrix filled with zeros.

`np.diag(v)` returns a square matrix whose main diagonal is the vector `v`. Moreover, `np.diag(v, k)` returns a matrix whose k-th diagonal is the vector `v`. `k` can be positive or negative: if `k` is positive, the diagonal is shifted to the right (above the main diagonal); if `k` is negative, it is shifted to the left (below the main diagonal).
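
The effect of the offset `k` is easiest to see on a short vector. In this sketch the vector `v` and the result names are hypothetical; note that a non-zero offset makes the matrix one row and one column larger.

```python
import numpy as np

v = np.array([7, 8, 9])

main = np.diag(v)        # 3x3, v on the main diagonal
upper = np.diag(v, 1)    # 4x4, v one step above the main diagonal
lower = np.diag(v, -1)   # 4x4, v one step below the main diagonal

print(main)
print(upper)
print(lower)
```
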

- Create a matrix *mat* of size 5x5 with 1s on the diagonal.
- Display it.

In [None]:
## Q26: Insert your code here


We can use mathematical operators **+**, **-**, on arrays provided that the mathematical operation makes sense.

**Caution: If you use the operators '\*' or '/' you will only perform a term-by-term operation**
- Create a 6x6 matrix with 1s on the diagonal and on the sub-diagonal using a mathematical operator.
- Display it.



In [None]:
## Q27: Insert your code here


Accessing specific elements of an array is done similarly to lists. If the array is two-dimensional, two parameters are needed.

*For example*: let `X` be a two-dimensional array. `X[0, 0]` returns the element in the first row and first column (indices start at 0). `X[:, 0]` returns the first column. `X[0:3, 0]` returns the first three rows of the first column. This method is referred to as *slicing* in programming.
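
The sketch below applies these indexing patterns to a small 3x4 array (an arbitrary example) and also shows that a slice can appear on the left-hand side of an assignment.

```python
import numpy as np

X = np.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]

print(X[0, 0])     # 0  (first row, first column)
print(X[:, 0])     # [0 4 8]  (first column)
print(X[0:2, 1])   # [1 5]  (first two rows of the second column)

X[1, :] = -1       # overwrite the whole second row
print(X[1])        # [-1 -1 -1 -1]
```
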

- Create this matrix using `np.ones()`, `np.diag()` and slicing:
$$
\begin{pmatrix}
5 & 0 & 0 & 0 \\
5 & 1 & 0 & 0 \\
4 & 4 & 4 & 4 \\
5 & 0 & 0 & 1
\end{pmatrix}
$$

In [None]:
## Q28: Insert your code here

With the NumPy module, you can draw random numbers uniformly distributed between 0 and 1. The syntax is as follows: `np.random.rand()` returns a single draw, `np.random.rand(n)` returns a one-dimensional array of n draws, and `np.random.rand(n, p)` returns an `n x p` matrix of uniformly distributed random draws.
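
The three call forms differ only in the shape of what they return, which the following sketch checks (the variable names are hypothetical, and the drawn values change at every run).

```python
import numpy as np

single = np.random.rand()        # one float in [0, 1)
vector = np.random.rand(4)       # one-dimensional array, shape (4,)
matrix = np.random.rand(2, 3)    # two-dimensional array, shape (2, 3)

print(type(single), vector.shape, matrix.shape)
```
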

- Display a random number distributed between 0 and 1


In [None]:
## Q29: Insert your code here

- Display a 5x5 matrix of random numbers distributed between 0 and 1

In [None]:
## Q30: Insert your code here

- Write a function `random_number()` that takes two integer parameters and returns a random number uniformly distributed between the two integers.
- Call the function `random_number(10, 15)`

*Note: If $X \sim U[0,1]$, then $Y := (b-a)X + a \sim U[a,b]$*



In [None]:
## Q31: Insert your code here

- Write a function `random_matrix()` that takes an integer parameter N and returns a NxN matrix with 1s everywhere except on the diagonal where there are numbers uniformly distributed between 0 and 1.
- Test for N=3 and N=5

> Example: `random_matrix(3)` should return a matrix similar to
$$
\begin{pmatrix}
0.62678954 & 1 & 1 \\
1 & 0.94077299 & 1 \\
1 & 1 & 0.29263003 \\
\end{pmatrix}
$$

In [None]:
## Q32: Insert your code here

- In NumPy, operations can be performed between arrays and scalars.
> Example:
```
a = np.array([1, 2, 3])
a * 4  # returns array([ 4,  8, 12])
a + 2  # returns array([3, 4, 5])
```

- Create a matrix mat_one of size 5x5 with fives on the diagonal
- Create two matrices mat_two and mat_two_bis of size 5x5 with twos everywhere, in two different ways
- Display the matrices

In [None]:
## Q33: Insert your code here

In NumPy, operations between arrays are performed element-wise by default.

> Example:
```
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
a * b  # returns array([ 4, 10, 18])
```

To perform matrix multiplication in the mathematical sense, use `np.dot(X, Y)` (for two-dimensional arrays, `X @ Y` is equivalent).

If the dimensions are incompatible, NumPy raises a `ValueError`.
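
Here is a minimal sketch of both situations, using two small hand-written matrices (hypothetical names `A` and `B`): a valid product, then a product whose inner dimensions do not match.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])           # shape (3, 2)

product = np.dot(A, B)           # (2, 3) times (3, 2) gives (2, 2)
print(product)                   # rows: [4 5] and [10 11]

try:
    np.dot(A, A)                 # (2, 3) times (2, 3): inner sizes differ
    failed = False
except ValueError as error:
    failed = True
    print("Incompatible shapes:", error)
```
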



- Create a matrix `mat_one` of size 5x5 with random numbers.
- Create a matrix `mat_two` of size 5x5 with ones everywhere.
- Create a matrix `mat_three` and assign to it the element-wise product between `mat_one` and `mat_two`
- Create a matrix mat_four and assign to it the matrix product between `mat_one` and `mat_two`
- Display `mat_three` and `mat_four`

In [None]:
## Q34: Insert your code here


- Create a matrix `a` with dimensions 5x2 with arbitrary values
- Create another matrix `b` with dimensions 2x5 with arbitrary values
- Return the meaningful product of the two matrices here



In [None]:
## Q35: Insert your code here

Comparison operators can also be applied to NumPy arrays; they work element-wise and return an array of booleans.

- Create two matrices *mat_one* and *mat_two* of size 5x5 with random values.
- Using the operator `*` and the comparison operator '`==`', return a 5x5 matrix of True.
- Using matrix multiplication and the comparison operator '`==`', return a 5x5 matrix of False.


In [None]:
## Q36: Insert your code here

Lastly, it is possible to analyze data with NumPy. Here are some functions summarized:

| Command   | Meaning                 |
|-----------|-------------------------|
| np.mean(X) | returns the mean of X   |
| np.var(X)  | returns the variance of X|
| np.std(X)  | returns the standard deviation of X |
| X.sum()  | sums the elements of X   |
| X.prod() | multiplies the elements of X |
| X.min()  | returns the minimum of X |
| X.max()  | returns the maximum of X |

Furthermore, when working with matrices, you can pass the `axis` parameter to specify the direction along which the operation is carried out. For example:

```
mat = np.random.rand(5, 5)
np.mean(mat, axis = 0)  ## returns the mean of each column (computed down the rows)
np.mean(mat, axis = 1) ## returns the mean of each row (computed across the columns)
mat.sum(axis = 0) ## returns the sum of each column
```
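
A small matrix with known values makes the effect of `axis` easy to check by hand. The names below are hypothetical and the matrix is an arbitrary example.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6]])           # 2 rows, 3 columns

column_means = np.mean(mat, axis=0)   # one value per column
row_means = np.mean(mat, axis=1)      # one value per row
column_sums = mat.sum(axis=0)         # one value per column

print(column_means)  # [2.5 3.5 4.5]
print(row_means)     # [2. 5.]
print(column_sums)   # [5 7 9]
```
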
- Verify that the mean of draws from a uniform distribution on [0, 1] is close to 0.5 for a large number of draws.

**Note:**

*As the number of draws increases, the mean value of the uniformly distributed random values should converge to 0.5 according to the Law of Large Numbers.*

In [None]:
## Q37: Insert your code here

- Approximate this infinite product by truncating it at a large number of terms, using only NumPy methods, and display the result

$$
\frac{\pi}2=\prod_{n=1}^{\infty}\frac{4n^2}{4n^2-1}
$$

- Compare with `np.pi` (the `np.math` alias is no longer available in recent NumPy versions)

---

In [None]:
## Q38: Insert your code here


## To go further..

**Exercise 1:**
- Create an array of 10 zeros.
- Create an array of 10 ones.
- Create an array of 10 fives.
- Create an array of the integers from 10 to 50.

**Exercise 2:**
- Create an array of all even integers from 10 to 50.
- Create a 3x3 matrix with values ranging from 0 to 8.
- Create a 3x3 identity matrix.
- Use indexing to replace the top row of the matrix from Exercise 2 with 9s.

**Exercise 3**
- Generate a random array of size 25. Find its mean.
- Generate a random matrix of size 5x5. Find the sum of all the elements, the sum of the columns, and the sum of the rows.

**Exercise 4**
- Multiply a 5x3 matrix by a 3x2 matrix using matrix multiplication.
- Multiply a 5x5 matrix by a 5x1 vector.

**Exercise 5**
- Create an array of 10 random numbers. Replace all the values less than 0.5 with 0.

# 3. Exploring Pandas

In this exercise, you will learn to use the pandas module. Pandas is a Python package specialized in data manipulation.

[For more information on pandas click here](http://pandas.pydata.org/)

#### Required Skills

- Basic programming knowledge
- Lists
- Linear algebra concepts
- Introduction to NumPy

#### Instructions

The exercise consists of several questions, complete them in order.

To begin, you need to import the `pandas` module under the abbreviated name `pd`. Therefore, execute this preamble cell. NumPy will also be used in this exercise.



In [None]:
# Importing necessary libraries
import pandas as pd
import numpy as np

> A Series is a one-dimensional labelled array that can hold any data type. The labels are referred to as the index. Its syntax is as follows: `pd.Series(X)`, where `X` is a list or an array.
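
As a minimal sketch of this syntax, the snippet below builds a Series from a short hand-written list (the name `prices` and its values are purely illustrative). When no index is given, pandas labels the entries 0, 1, 2, and so on.

```python
import pandas as pd

# Illustrative data (hand-picked values)
prices = pd.Series([10.5, 11.0, 9.8])

print(prices)         # values shown next to their default labels 0, 1, 2
print(prices.index)   # the default index
print(prices.values)  # the underlying NumPy array
```
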
- Create a Series of 5 data points from a random list of numbers distributed between 0 and 1.

In [None]:
## Q39: Insert your code here

> It is possible to specify the indices using the following syntax: `pd.Series(X, index = Y)`, where `X` is the data list and `Y` is a list of associated indices.
- Create a Series of 4 data points, each with a value of 1, and specify the following list of indices: ['a', 'b', 'c', 'd'].

In [None]:
## Q40: Insert your code here

> By using indices, you can access the data in the Series in the same way you access elements in a list.

Slicing is also possible.
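
The sketch below illustrates label-based access and slicing on a small Series. The name `temps`, its labels, and its values are made up for the example. One point worth noting: slicing by label includes both endpoints, whereas slicing by position with `.iloc` excludes the stop, as with lists.

```python
import pandas as pd

# Illustrative Series with hand-picked values and string labels
temps = pd.Series([18.0, 21.5, 19.2], index=['mon', 'tue', 'wed'])

first = temps['mon']          # access by label
first_by_pos = temps.iloc[0]  # access by integer position
subset = temps['mon':'tue']   # label slicing includes both endpoints
head = temps.iloc[0:2]        # positional slicing excludes the stop

print(first, first_by_pos)
print(subset)
```
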
- Create a variable named series_one and assign it a Series created from a list of 4 random numbers distributed between 0 and 1.
- Use the list of indices from the previous question: ['a', 'b', 'c', 'd'].
- Retrieve the first element of series_one using the corresponding index.

In [None]:
## Q41: Insert your code here

- Change the fourth data point of `series_one` to 0.
- Display `series_one`.

In [None]:
## Q42: Insert your code here

When a Series of floats is displayed, pandas prints `dtype: float64` below the values. This is the data type (here floating-point numbers) and its encoding (here 64 bits). You can specify the data type you want to handle when creating a Series.

Furthermore, you can name the Series using the `name` parameter.
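
For instance, the following sketch sets both parameters at creation time. The name `weights` and the choice of `float32` are assumptions made for illustration only.

```python
import pandas as pd

# Illustrative Series with an explicit data type and a name
weights = pd.Series([1.5, 2.5, 3.5], dtype='float32', name='weights')

print(weights)        # the footer shows both the name and the dtype
print(weights.dtype)  # float32
print(weights.name)   # weights
```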

- Create a variable `series_two` from an array of four ones.
- Specify the data type `dtype` as `int`.
- Name this Series `my_series`.
- Display the Series.

In [None]:
## Q43: Insert your code here

For a numeric Series, the `describe()` method returns summary statistics: the count, mean, standard deviation, minimum, quartiles, and maximum.
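
A short sketch on hand-picked values (the names `scores` and `summary` are illustrative) shows that the result is itself a Series, so individual statistics can be read by label:

```python
import pandas as pd

# Illustrative numeric Series
scores = pd.Series([1, 2, 3, 4, 5])

summary = scores.describe()  # a Series indexed by statistic name
print(summary)
print(summary['mean'])  # 3.0
print(summary['50%'])   # 3.0 (the median)
```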

- Create a variable `series_three` from an array of 20 random numbers uniformly distributed between 0 and 1.
- Display information about the Series using `describe()`.


In [None]:
## Q44: Insert your code here

It's possible to add Series together. Pandas will sum the data with matching *indices*. If an *index* is missing in one of the Series, the resulting sum Series will display `NaN` (Not a Number) at that index.

- Create a Series `series_four` from an array of 19 random numbers uniformly distributed between 0 and 1.
- Sum `series_three` and `series_four`.

In [None]:
## Q45: Insert your code here

However, you can specify a particular value to use where the *indices* do not match during a summation. The following syntax is used:
```
## Assume a and b are two Series
a.add(b, fill_value = 0)  ## we decide to replace with 0
```
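
To see the difference between the `+` operator and `add` with `fill_value`, consider this sketch with two small Series whose indices only partly overlap. The labels and values are invented for the example.

```python
import pandas as pd

# Two illustrative Series; label 'z' exists only in the first one
a = pd.Series([1.0, 2.0, 3.0], index=['x', 'y', 'z'])
b = pd.Series([10.0, 20.0], index=['x', 'y'])

plain_sum = a + b                    # 'z' has no partner in b, so the result is NaN
filled_sum = a.add(b, fill_value=0)  # the missing 'z' in b is treated as 0

print(plain_sum)
print(filled_sum)
```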

- Sum `series_three` and `series_four` by specifying fill_value equal to 100.

In [None]:
## Q46: Insert your code here

Lastly, it's possible to use comparison and arithmetic operators on Series. The following syntax is used:
```
# Assume a is a Series
a[a >= 0.5]  ## returns the entries of a that are greater than or equal to 0.5
a * 2  ## multiplies each entry of a by two
```
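
The filtering syntax works in two stages: the comparison first produces a boolean Series (a mask), and indexing with that mask keeps only the entries marked `True`. A minimal sketch with hand-picked values (the names `s`, `mask`, `selected`, and `doubled` are illustrative):

```python
import pandas as pd

# Illustrative Series
s = pd.Series([0.2, 0.5, 0.9, 0.4])

mask = s >= 0.5     # boolean Series, one True/False per entry
selected = s[mask]  # keeps only the entries where the mask is True
doubled = s * 2     # element-wise arithmetic

print(mask.tolist())      # [False, True, True, False]
print(selected.tolist())  # [0.5, 0.9]
print(doubled.tolist())   # [0.4, 1.0, 1.8, 0.8]
```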

- Create a variable `a`, and assign it a Series of integer numbers uniformly distributed between 1 and 20, with a size of 20.
- Display the Series with data strictly greater than 10.

In [None]:
## Q47: Insert your code here

To conclude on the manipulation of Series, we propose the following instructions:

- Create an *index* of size 20 that includes "boy" or "girl" randomly distributed.
- Create an array of size 20 that displays ages ranging from 3 to 16 years, randomly distributed.
- Create a Series `cousins` with `name = "my cousins"`, the index created previously, and data from the array.
- Retrieve the Series of "boys" into a Series `boys` and the Series of "girls" into a variable `girls`.
- Display information about these two Series.


In [None]:
## Q48: Insert your code here

Now we turn our attention to DataFrames. DataFrames are the two-dimensional extension of Series. Thus, the *indices* are shared among the columns of the DataFrame.

A common way to create a DataFrame is by using a dictionary. The syntax is as follows:
```
pd.DataFrame({'Name of the first column': data, 'Name of the second column': data})
```
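
As an illustration of this syntax, the sketch below builds a small DataFrame from a dictionary of lists. The column names `Product` and `Quantity` and their values are placeholders invented for the example.

```python
import pandas as pd

# Illustrative columns; each key becomes a column name,
# and all columns share the same row index (0, 1, 2)
df = pd.DataFrame({'Product': ['A', 'B', 'C'],
                   'Quantity': [3, 5, 2]})

print(df)
print(df.shape)          # (3, 2): three rows, two columns
print(list(df.columns))  # ['Product', 'Quantity']
```
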
- Create a DataFrame data_one with two columns: 'Gender' and 'Age', using the data from the previous question.
- Display the DataFrame.

In [None]:
## Q49: Insert your code here

- Create a list *dominant_hand* of size 20 that contains "left-handed" or "right-handed" distributed randomly.
- Add this list as a new column to *data_one*.
  *Use the command `data_one['Dominant Hand'] = dominant_hand`*


In [None]:
## Q50: Insert your code here

*Slicing* is possible with DataFrames.
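
The sketch below shows a few common ways to slice a DataFrame: by rows, by columns, and by both at once. The DataFrame `df` and its contents are invented for illustration.

```python
import pandas as pd

# Illustrative DataFrame
df = pd.DataFrame({'Product': ['A', 'B', 'C', 'D'],
                   'Quantity': [3, 5, 2, 7],
                   'Price': [1.5, 2.0, 0.5, 4.0]})

first_two = df[0:2]                  # row slice by position (rows 0 and 1)
same_rows = df.head(2)               # shortcut for the first rows
two_cols = df[['Product', 'Price']]  # a list of names selects columns
block = df.iloc[1:3, 0:2]            # rows 1 and 2, first two columns

print(first_two)
print(two_cols)
print(block)
```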

- Display the first 5 rows of *data_one*.

In [None]:
## Q51: Insert your code here

- Display the columns "Gender" and "Dominant Hand".


It is possible to concatenate two DataFrames using the command `pd.concat()`. The syntax is as follows:
```
# Assume X and Y are two DataFrames
pd.concat([X,Y], axis = 0)  ## concatenates vertically
pd.concat([X,Y], axis = 1)  ## concatenates horizontally
```
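
To illustrate the two directions, here is a sketch with two tiny DataFrames (`left` and `right` are hypothetical names). Note that vertical concatenation keeps the original row labels, so they may repeat; passing `ignore_index=True` renumbers the rows.

```python
import pandas as pd

# Two illustrative DataFrames with the same columns and the same index
left = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
right = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

stacked = pd.concat([left, right], axis=0)       # 4 rows, row labels 0, 1, 0, 1
renumbered = pd.concat([left, right], axis=0, ignore_index=True)  # labels 0..3
side_by_side = pd.concat([left, right], axis=1)  # 2 rows, 4 columns

print(stacked.shape)       # (4, 2)
print(side_by_side.shape)  # (2, 4)
```
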
- Create a list of size 20 that includes "red", "blue", or "green" distributed randomly.
- Create a DataFrame data_two from the previous list.
- Add a name to the column using the command data_two.columns = ['Name of Column'].
- Concatenate data_one and data_two into a new variable data_three.


In [None]:
## Q52: Insert your code here

## **Open Exercise**
We advise you to use the pandas documentation or Stack Overflow to find the answers to the following exercises.
#### 1. Basic DataFrame Operations:

- Create a DataFrame from a dictionary with keys: 'Name', 'Age', 'City' and populate it with some data.
- Display the first 5 rows of the DataFrame.
- Display the last 3 rows of the DataFrame.
- Display the data types of each column.

#### 2. Indexing and Selection:

- Select the 'Name' and 'City' columns from the DataFrame.
- Select the row at index 2 from the DataFrame.
- Select the rows where 'Age' is greater than 25.

#### 3. Sorting and Ranking:

- Sort the DataFrame based on 'Age' in descending order.
- Rank the DataFrame based on 'Age', with the oldest as rank 1.

#### 4. Missing Data:

- Introduce some missing values in the DataFrame using np.nan.
- Fill the missing values with the mean of the non-missing values.
- Drop the rows with missing values.

#### 5. Grouping and Aggregation:

- Group the DataFrame by 'City' and calculate the mean age for each city.
- Find the maximum and minimum age for each city.

#### 6. Merging, Joining, and Concatenating:

- Create a second DataFrame with keys: 'Name', 'Job Title'.
- Merge the two DataFrames on the 'Name' column.
- Concatenate the two DataFrames vertically and then horizontally.

#4. Visualizing Data with Matplotlib



> Matplotlib is a Python library that serves as a powerful tool for plotting and visualizing data. It is designed to produce a wide variety of plots and graphs. Matplotlib includes a sub-library called pyplot, which provides an interface similar to that of the commercial software MATLAB, with functions that closely mirror it.

> There are libraries like Seaborn that can automatically beautify the figures or give them a different style, but we will not be integrating them in this training.

> Furthermore, Matplotlib is a very rich library, and not all its functionalities can be covered. This training chooses to explore certain functions more than others, with the overall goal of enabling any student to become proficient with the module by the end of the course.

> To begin, import the `matplotlib.pyplot` module under the shortened name `plt`. <br>
Once a graph is constructed, the `plt.show()` command displays it.<br>
However, in a notebook, as here on Colab, adding `%matplotlib inline` at the beginning of the notebook automatically displays the figures after the execution of each cell that uses a pyplot command.<br>

- Import matplotlib.pyplot and add %matplotlib inline
- Import numpy

In [None]:
## Q53: Insert your code here


**Curve Definition and Plotting**
> A curve is, by definition, a set of points with coordinates (x, y) that may or may not be connected by a line. The more points there are, the smoother the curve will appear.

> The plot() method allows for plotting curves that connect points whose x (abscissa) and y (ordinate) values are provided in lists or arrays.<br>
To plot a graph with 'x' values on the horizontal axis and 'y' values on the vertical axis, we write: plt.plot(x,y).

- Plot a curve with the x-values [0,2,4,6] and the y-values [1,4,4,8].

In [None]:
## Q54: Insert your code here

**Automatic Abscissa Generation**

> If only a single list or array is inserted into the plot() command, Matplotlib assumes that it's a sequence of y (ordinate) values and automatically generates the x (abscissa) values for you. The x values will be the indices of the y values, starting from 0.
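
A quick way to check this behaviour is to read back the x values that Matplotlib generated. In the sketch below, the list `y_values` is illustrative, and `get_xdata()` is used only to inspect the line that `plt.plot` returns.

```python
import matplotlib.pyplot as plt

y_values = [2, 5, 3]          # illustrative y values

plt.figure()
(line,) = plt.plot(y_values)  # only y is given, so x defaults to 0, 1, 2
print(line.get_xdata())       # the x values Matplotlib generated
plt.show()
```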

- Plot a curve using the list [1,3,2,4].


In [None]:
## Q55: Insert your code here

**Adding Titles and Axis Labels**

> To add a title to the graphs, we use the `plt.title()` function.
> To add labels to the axes, we use the `plt.xlabel()` and `plt.ylabel()` functions.
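
As a sketch, the three calls can simply follow the `plot` command. The points and the text strings below are placeholders chosen for illustration; `plt.gca()` is used only to read the labels back from the current axes.

```python
import matplotlib.pyplot as plt

plt.figure()
plt.plot([1, 2, 3], [2, 4, 8])  # illustrative points
plt.title('Growth over time')   # title shown above the plot
plt.xlabel('time')              # label of the horizontal axis
plt.ylabel('size')              # label of the vertical axis

ax = plt.gca()                  # current axes, to read the labels back
print(ax.get_title(), '|', ax.get_xlabel(), '|', ax.get_ylabel())
plt.show()
```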

- Plot a curve passing through the following points: (50,1), (100,3), (200,4).
- Title the figure 'My First Curve'.
- Label the x-axis as 'abscissa' and the y-axis as 'ordinates'.

In [None]:
## Q56: Insert your code here

> The plot() function simply connects points in the order they are provided. It's possible to provide multiple points with the same x-coordinate to draw a specific shape.

- Create the following x and y lists: x = [0, 0, 1, 1, 0, 0.5, 1] , y = [1, 0, 0, 1, 1, 2, 1].<br>
Use the plot function to connect these points and set the limits of both axes from -1 to 2.

In [None]:
## Q57: Insert your code here

Similarly, one can plot a parametric curve using a sequence **t**. For this, provide the `plot` method with one function of **t** for the x-coordinates and another for the y-coordinates.
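
As a minimal sketch of the idea, the unit circle can be drawn as the parametric curve $(\cos t, \sin t)$. The variable names and the number of points are arbitrary choices for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 200)  # parameter values
x = np.cos(t)                       # x-coordinate as a function of t
y = np.sin(t)                       # y-coordinate as a function of t

plt.figure()
plt.plot(x, y)
plt.axis('equal')                   # same scale on both axes so the circle looks round
plt.title('Unit circle as a parametric curve')
plt.show()
```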

- Use the **linspace** method from *Numpy* to create a sequence of 100 numbers between 0 and $2\pi$.
- Plot the parametric curve defined by $$ f(t) = (\sin(2t), \sin(3t)), t \in [0, 2\pi].$$


In [None]:
## Q58: Insert your code here

- Create an array of values from 0 to $2\pi$.
- Calculate sine values for these points.
- Calculate cosine values for these points.
- Plot both curves on the same graph.
- Make the sine curve blue and the cosine curve red.
- Add a title, legend, and labels for the X and Y axes.


In [None]:
## Q59: Insert your code here




**Basic Scatter Plot**

*We advise you to check the Matplotlib documentation to see how to plot a basic scatter plot.*

- Generate two arrays of 100 random values each for X and Y coordinates. You can use numpy for this.
- Generate an array of 100 random values to be used for coloring the points.
- Create a scatter plot using the X, Y, and color arrays.
- Add a colorbar to the plot.


In [None]:
## Q60: Insert your code here


**Histogram**

- Generate 1000 random numbers from a normal distribution with mean 0 and standard deviation 1.
- Create a histogram using these numbers.
- Label the X axis as "Value" and the Y axis as "Frequency".
- Add a title "Histogram of Normally Distributed Random Numbers".

In [None]:
## Q61: Insert your code here

**Bar Chart**
- Create an array of months: ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'].
- Generate a random array of sales values for each month (for simplicity, use integer values between 5000 and 15000).
- Plot a bar chart using the months and sales data.
- Add a title, and labels for the X and Y axes.
- Color the bar of the month with the highest sales in green.

In [None]:
## Q62: Insert your code here

**Pie Chart**
- Create an array of departments: ['R&D', 'Marketing', 'Sales', 'Operations', 'Others'].
- Generate a random array of budget allocations for these departments (making sure they sum to 1).
- Plot a pie chart with labels being the department names.
- Highlight the department with the largest budget by "exploding" that slice.
- Add a title "Budget Distribution Across Departments".

In [None]:
## Q63: Insert your code here

**Exercise: Simulating Asset Paths using Geometric Brownian Motion**

**Goal:** Create a simulation of potential future stock prices using the Geometric Brownian Motion (GBM) model and visualize the results.

**Import Required Libraries**
- Import numpy and matplotlib.pyplot

**Define GBM Parameters**
- $S_0$: Initial stock price
- $T$: Time horizon (in years)
- $r$: Risk-free rate (annualized)
- $\sigma$: Volatility (annualized)
- $dt$: Time increment, e.g., a day
- $n$: Number of time steps
- $m$: Number of potential paths to simulate

**Simulate GBM Paths**

We recall that GBM is defined as
$$ dS_t = \mu S_t dt + \sigma S_t dW_t $$

Where:
- $S_t$: Stock price at time $t$.
- $\mu$: Expected return (the drift); in this exercise it is taken to be the risk-free rate $r$.
- $\sigma$: Volatility of the stock.
- $dW_t$: Wiener process or Brownian motion.

For discrete time intervals, the formula to simulate the stock price $S_{t+1}$  given a stock price $S_t$ is:

$$ S_{t+1} = S_t  \exp \left( (\mu - \frac{\sigma^2}{2}) \Delta t + \sigma \sqrt{\Delta t} Z \right) $$

Where:
- $ \Delta t$: Size of the time step.
- $Z$: A random draw from a standard normal distribution.
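
For readers who want to see where the discrete formula comes from, here is a sketch of the standard argument, assuming $\mu$ and $\sigma$ are constant. Applying Itô's lemma to $\ln S_t$ gives

$$ d(\ln S_t) = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma \, dW_t $$

Integrating over one time step of length $\Delta t$, and writing $\Delta W$ for the increment of the Brownian motion over that step:

$$ \ln S_{t+1} - \ln S_t = \left( \mu - \frac{\sigma^2}{2} \right) \Delta t + \sigma \, \Delta W $$

Since $\Delta W$ is normally distributed with mean 0 and variance $\Delta t$, it has the same distribution as $\sqrt{\Delta t} \, Z$ with $Z$ standard normal. Exponentiating both sides then gives the update rule for $S_{t+1}$ stated above.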

**Visualize the Simulated Paths**




In [None]:
## Q64: Insert your code here

## Matplotlib Exercises

### 1. Basic Line Plot
- Generate a sequence of numbers from -10 to 10.
- Plot their square values on a graph. Label the x-axis as "Numbers" and the y-axis as "Squares" and give the plot a title.

### 2. Multiple Line Plots
- Using the same sequence from the previous exercise, plot both the square and cube values on the same graph. Use different colors and line styles for each plot and add a legend to distinguish between them.

### 3. Scatter Plot
- Generate 50 random numbers for x-values and y-values.
- Plot them using a scatter plot. Experiment with changing the size, color, and marker style.

### 4. Histogram
- Generate 1000 random numbers following a normal distribution using `numpy`.
- Plot a histogram to visualize the distribution. Play around with the number of bins.

### 5. Bar Chart
- Take a list of fruits: `['Apple', 'Banana', 'Cherry', 'Date', 'Elderberry']`.
- Assume some random sale values for each fruit and plot a bar chart to visualize fruit sales.

### 6. Pie Chart
- Using the fruit sales data from the previous exercise, plot a pie chart to show the proportion of sales by fruit. Ensure each section of the pie chart is labeled and displays a percentage.

### 7. Subplots
- Create four subplots in a 2x2 grid.
  - In the top-left, plot a sine curve.
  - In the top-right, plot a cosine curve.
  - In the bottom-left, plot a tangent curve.
  - In the bottom-right, plot a circle.
- Ensure each subplot has a title.

### 8. 3D Plotting
- Create a 3D surface plot. You can use the `numpy` meshgrid function to generate x, y, and z values. For instance, plot the surface $z = x^2 + y^2$.

### 9. Animation
- Create an animation of a sine wave whose frequency increases over time.

### 10. Customization
- Choose any of the previous exercises and experiment with the aesthetics. Change colors, fonts, line styles, marker styles, etc. Also, explore how to add text annotations, grid lines, and other custom elements.

