![ecole.png](attachment:1fa979c5-d55c-4dc3-a945-05619b9b66b1.png)
### Higher School of Technology

# Python basics
**Prerequisites**

[Install tools](local_install.ipynb)

## Variable Assignment

The first thing we will learn is the idea of *variable assignment*.

Variable assignment associates a value to a variable name.

Below, we assign the value “Hello World” to the variable `x`.

In [None]:
x = "Hello World"

Once we have assigned a value to a variable, Python will remember this variable as long as the *current* session of Python is still running.

Notice how writing `x` into the prompt below outputs the value “Hello World”.

In [None]:
x

However, Python returns an error if we ask it about variables that have not yet
been created.

In [None]:
# uncomment (delete the # and the space) the line below and run
# y
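
As a minimal sketch of what that error looks like, the cell below triggers it on purpose and catches it with `try`/`except` (a construct this notebook does not otherwise cover) so that execution can continue. The name `undefined_name` is hypothetical and is intentionally never assigned.

```python
try:
    undefined_name  # hypothetical name that was never assigned
except NameError as err:
    error_message = str(err)  # keep the text of the error for printing

print(error_message)
```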

### Exercise 1

What do you think the value of `z` is after running the code below?

In [None]:
z = 3.1456
z = z + 4
z

### Code Comments

Comments are short notes that you leave to explain what the code does.

A comment is made with the `#`. Python ignores everything in a line that follows a `#`.

Let’s practice making some comments.

In [None]:
i = 1 # Assign the value 1 to variable i
j = 2 # Assign the value 2 to variable j
i + j # Add i and j

## Functions

Functions are processes that take an input (or inputs) and produce an output.

If we had a function called `f` that took two arguments `x` and
`y`, we would write `f(x, y)` to use the function.

For example, the function `print` displays whatever it is given.
Below, we reassign the variable `x` and print it along with the variable `z` from Exercise 1.

In [None]:
x = 3
print('x=',x)
print('x='+str(x))
print('z=',round(z,2))

You can use either `"` or `'` to create a string. Just make sure
that you start and end the string with the same one!
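
To illustrate the `f(x, y)` calling pattern, here is a small sketch that uses two built-in functions, `max` and `round`. The variable names are chosen for illustration only. It also checks that both quote styles produce the same string.

```python
larger = max(3, 7)            # call a function with two arguments
rounded = round(3.14159, 2)   # round the first argument to 2 decimal places

double_quoted = "Hello World"
single_quoted = 'Hello World'
same_string = double_quoted == single_quoted  # compare the two strings

print(larger, rounded, same_string)
```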

### Getting Help

We can figure out what a function does by asking for help.

In Jupyter notebooks, this is done by placing a `?` after the function
name (without using parentheses) and evaluating the cell.

For example, we can ask for help on the print function by writing `print?`.

In [None]:
# print? # remove the comment and <Ctrl-Enter>

## Numeric data

Python has two types of numbers:

1. Integer (`int`): These can only take the values of the integers i.e. $ \{\dots, -2, -1, 0, 1, 2, \dots\} $  
1. Floating Point Number (`float`): Think of these as any real number such as $ 1.0 $, $ 3.1415 $, or $ -100.022358923223 $… 

The easiest way to tell these types of numbers apart is to look for a decimal point in the number.

A float has a decimal point, but an integer does not.

Below, we assign integers to the variables `xi` and `zi` and assign floating point numbers to the variables `xf` and `zf`.

In [None]:
xi = 1
xf = 1.0
zi = 123
zf = 1230.5  # Notice -- There are no commas!
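
One way to check which kind of number a variable holds is the built-in `type` function. The sketch below re-creates two of the values above under illustrative names so that it can run on its own.

```python
whole = 123       # no decimal point, so Python stores an int
decimal = 1230.5  # decimal point, so Python stores a float

print(type(whole))
print(type(decimal))
```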

### Exercise
Create the following variables:

- `a`: An integer number with the value 3  
- `b`: An integer number with value 2  
- `c`: A floating point number with value 2.5  

We will use them in a later exercise.

In [None]:
# your code here!

### Python as a Calculator

You can use Python to perform mathematical calculations.

In [None]:
print("a + b is", a + b)
print("a - b is", a - b)
print("a * b is", a * b)
print("a / b is", a / b)
print("a ** b is", a**b)

You could likely have guessed most of these outputs.

Note that Python uses `**` for exponentiation (raising a number to a power)!

An arithmetic operation involving a float results in a float.

An operation involving only integers generally results in an integer; the division operator `/` is the notable exception and always returns a float.
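
A quick way to see these rules in action is to inspect the type of each result. This is only a sketch; the variable names are illustrative.

```python
int_sum = 3 + 2       # int combined with int
mixed_sum = 3 + 2.5   # int combined with float
quotient = 3 / 2      # division of two ints

print(type(int_sum), type(mixed_sum), type(quotient))
print(quotient)
```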

We can also chain together operations.

When doing this, Python follows the standard [order of operations](https://en.wikipedia.org/wiki/Order_of_operations): parentheses, exponents,
multiplication and division, followed by addition and subtraction.

For example:

In [None]:
x = 2.0
y = 3.0
z1 = x + y * x
z2 = (x + y) * x

What do you think `z1` is?

How about `z2`?

### Exercise
Calculate the monthly payment `mp` for a loan with a principal `pl` of 1 000 000 DA, a loan duration in years `ldy` of 10, and a fixed annual interest rate `ir` of 6%, using the following formulas.

First, compute the number of payments `np`:
$$
\text{np} = \text{ldy} \times 12
$$
Then compute the monthly rate `mr`:
$$
\text{mr} = \frac{\text{ir}}{100 \times 12}
$$
Finally, compute the monthly payment `mp`:
$$
\text{mp} = \text{pl} \times \left(\text{mr} + \frac{\text{mr}}{(1 + \text{mr})^{\text{np}} - 1}\right)
$$

In [None]:
# initialize the variables

# Since payments are once per month, number of payments is number of years for the loan * 12

# Calculate the monthly payment based on the formulas

# Result


### Comparison Operators

Comparisons produce Boolean values (`True` or `False`). For example, you might want to evaluate whether the price of a particular asset
is greater than or less than some price.

For two variables `x` and `y`, we can do the following comparisons:

- Greater than: `x > y`  
- Less than: `x < y`  
- Equal to: `x == y`  
- Greater than or equal to: `x >= y`  
- Less than or equal to: `x <= y`  

We demonstrate these below.

In [None]:
a = 4
b = 2
print("a > b", "is", a > b)
print("a < b", "is", a < b)
print("a == b", "is", a == b)
print("a >= b", "is", a >= b)
print("a <= b", "is", a <= b)
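
Because a comparison evaluates to a Boolean, its result can be stored in a variable like any other value. A short sketch, using a hypothetical asset price and threshold:

```python
asset_price = 105.0  # hypothetical price of an asset
threshold = 100.0    # hypothetical price level to compare against

is_above = asset_price > threshold  # the comparison result is a bool
print(is_above, type(is_above))
```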

# Modules

A module is a set of related tools bundled together; you will also see these bundles called *packages* or *libraries*.

For example:

- `pandas` is a package that implements the tools necessary to do scalable data analysis.  
- `matplotlib` is a package that implements visualization tools.  
- `requests` and `urllib` are packages that allow Python to interface with the internet.  

As we move further into the class, being able to access these packages will become very important.

We can bring a package’s functionality into our current Python session by writing

In [None]:
import package_name

Once we have done this, any function or object from that package can be accessed by writing `package_name.name`.

Here’s an example.

In [None]:
import sys   # for dealing with your computer's system
sys.version  # information about the Python version in use
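
As another sketch of this dotted-access pattern, the standard-library `math` module (not used elsewhere in this notebook) provides common mathematical functions and constants:

```python
import math  # standard-library module of mathematical tools

root = math.sqrt(16)           # a function accessed through the module name
circle_area = math.pi * 2**2   # a constant accessed the same way

print(root, circle_area)
```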

### Module Aliases
Some packages have long names (see `matplotlib`, for example) which makes accessing the package functionality somewhat inconvenient.

To ease this burden, Python allows us to give aliases or “shortnames” to packages.

For example we can write:

In [None]:
import package as p

This statement allows us to access the package's functionality as
`p.function_name` rather than `package.function_name`.

Some common aliases for packages are

- `import pandas as pd`  
- `import numpy as np`
- `import matplotlib.pyplot as plt`
- `import datetime as dt`  

While you *can* choose any name for an alias, we suggest that you stick to the common ones.
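
For instance, here is a small sketch with the `dt` alias from the list above; the dates are arbitrary values picked for illustration.

```python
import datetime as dt  # `dt` is one of the common aliases

month_start = dt.date(2021, 1, 1)
month_end = dt.date(2021, 1, 31)
days_between = (month_end - month_start).days  # subtracting dates gives a timedelta

print(month_end.isoformat(), days_between)
```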

# Used Modules
In the following section, we present the libraries used in the labs and describe their most important functions.

## 1. Pandas (pandas)

`pandas` is a Python library designed for data manipulation, cleaning, and analysis—especially powerful when working with tabular financial data like stock prices, earnings reports, and time series.

### Common Instructions in Finance Labs:

✅ 1. Importing pandas

Loads the library and allows you to use `pd` as a shortcut prefix throughout your code.

In [None]:
import pandas as pd

✅ 2. Loading Financial Datasets

`pd.read_csv` reads data from a CSV file into a DataFrame (a 2D table of data). This is the starting point for most analysis.

You can also load Excel or JSON files with the matching reader functions:

- `data_frame_name = pd.read_excel('data_path.xls')`
- `data_frame_name = pd.read_json('data_path.json')`

In [None]:
stock_data = pd.read_csv('datasets/stock_details_3_years.csv')

✅ 3. Exploring the Data

These methods help you understand the structure of the dataset.

In [None]:
stock_data.head() # Shows the first 5 rows
stock_data.info() # Shows data types and null values
stock_data.describe() # Shows statistical summary (mean, std, min, max, etc.)
stock_data.shape # Shows the size of the data

✅ 4. Access Data by iloc

iloc is used for position-based access (like arrays).

In [None]:
stock_data.iloc[0] # First row of data (positions start at 0)
stock_data.iloc[:5] # First five rows
stock_data.iloc[2,3] # Element at row position 2 and column position 3
stock_data.iloc[2,:] # Entire row at position 2
stock_data.iloc[:,3] # Entire column at position 3
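
The lab's CSV file is not needed to try `iloc`. The sketch below builds a tiny DataFrame with made-up prices (the values and the name `toy_prices` are illustrative, not taken from the dataset) and reads it by position.

```python
import pandas as pd

# Made-up prices for illustration only
toy_prices = pd.DataFrame({
    "Open":   [10.0, 11.0, 12.0],
    "Close":  [10.5, 10.8, 12.4],
    "Volume": [100, 150, 120],
})

first_row = toy_prices.iloc[0]        # row at position 0, returned as a Series
one_value = toy_prices.iloc[2, 1]     # row position 2, column position 1 ('Close')
close_column = toy_prices.iloc[:, 1]  # every row of column position 1

print(one_value)
print(toy_prices.iloc[:2].shape)
```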

✅ 5. Access Columns by Name

Accessing columns using their string labels is more readable and flexible.

In [None]:
close_prices = stock_data['Close'] # Get the 'Close' column by name
close_prices = stock_data.Close # Same column, using attribute access
price_volume = stock_data[['Close', 'Volume']] # Get multiple columns by name
high_closes = stock_data[stock_data['Close'] > 180] # Filter rows by value
high_closes = stock_data[stock_data.Close > 180] # Same filter, using attribute access
stock_data_google = stock_data[stock_data['Company'] == 'GOOGL'].copy() # Keep only the GOOGL rows, as an independent copy
stock_data_google = stock_data[stock_data.Company == 'GOOGL'].copy() # Same selection, using attribute access

✅ 6. Grouping Data by Time Period or Categories

Use groupby() to summarize data.

In [None]:
group_stats = stock_data.groupby('Company')['Open'].mean() # average open price per company
group_stats = stock_data.groupby('Company')['Open'].count() # number of non-missing open prices per company
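
Filtering and grouping can be tried on a small made-up table before running them on the full dataset. In this sketch, the tickers `AAA` and `BBB` and all prices are hypothetical.

```python
import pandas as pd

# Made-up data; 'AAA' and 'BBB' are hypothetical tickers
toy_stocks = pd.DataFrame({
    "Company": ["AAA", "AAA", "BBB", "BBB"],
    "Open":    [10.0, 12.0, 100.0, 104.0],
})

above_11 = toy_stocks[toy_stocks["Open"] > 11]            # Boolean filter on rows
mean_open = toy_stocks.groupby("Company")["Open"].mean()  # one average per company

print(len(above_11))
print(mean_open)
```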

✅ 7. Correlation Between Financial Features

Use corr() to analyze relationships between variables (like returns and volume).

In [None]:
correlation_matrix = stock_data[['Open', 'Close', 'Volume']].corr() 
# Correlation matrix of numeric columns.
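
To get a feel for how to read the matrix, the sketch below uses made-up columns whose relationships are known in advance: one moves exactly with `Open` and one moves exactly against it. The values are invented for illustration.

```python
import pandas as pd

toy_features = pd.DataFrame({
    "Open":   [1.0, 2.0, 3.0, 4.0],
    "Close":  [2.0, 4.0, 6.0, 8.0],  # moves exactly with Open
    "Volume": [4.0, 3.0, 2.0, 1.0],  # moves exactly against Open
})

toy_corr = toy_features.corr()  # pairwise correlation between the columns
print(toy_corr)
```

The `Open`/`Close` entry should be close to 1 and the `Open`/`Volume` entry close to -1, while each column's correlation with itself is 1.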

✅ 8. Setting Date as Index

In pandas, the `set_index()` method sets one (or more) of the columns of a DataFrame as the row index. When analyzing time series (like stock prices), it's common to set the date as the index.

In [None]:
stock_data = stock_data.set_index('Date') # Setting the Date as the index helps to filter data by Date
# or, equivalently, modify the DataFrame in place (use only one of the two forms):
# stock_data.set_index('Date', inplace=True)

In [None]:
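start_date = '2021-01-01'
end_date = '2021-01-31'
# stock_data_google was copied before 'Date' became the index, so rebuild it from the indexed data
stock_data_google = stock_data[stock_data['Company'] == 'GOOGL'].copy()
stock_data = stock_data_google.loc[start_date:end_date] # Filter data by date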
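
The same idea can be sketched on a small made-up table. Note one assumption that goes beyond the cell above: the dates are first parsed with `pd.to_datetime`, so that the index holds real dates rather than strings. All names and values here are illustrative.

```python
import pandas as pd

# Made-up closing prices on three hypothetical dates
toy_series = pd.DataFrame({
    "Date":  ["2021-01-04", "2021-01-15", "2021-02-01"],
    "Close": [10.0, 11.0, 12.0],
})
toy_series["Date"] = pd.to_datetime(toy_series["Date"])  # parse strings into dates
toy_series = toy_series.set_index("Date")                # use the dates as the row index

january = toy_series.loc["2021-01-01":"2021-01-31"]  # both endpoints are included
print(january)
```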

✅ 9. Drop column

Drops columns that are irrelevant for analysis or modeling (e.g., a column that is just an identifier).

In [None]:
stock_data = stock_data.drop(columns=['Company'])  # Preferred
# OR, in place (use only one of the two forms):
# stock_data.drop(columns=['Company'], inplace=True)

## 2. NumPy (numpy)

`numpy` provides high-performance multidimensional arrays and tools for numerical computations—often used under the hood by pandas, matplotlib, and ML libraries.

### Common Instructions in Finance Labs:

✅ 1. Importing numpy

Loads the library and allows you to use np as a shortcut prefix throughout your code.

In [None]:
import numpy as np

In [None]:
stock_data.iloc[5,2] = np.nan # Simulate a missing value; np.nan is NumPy's "not a number" marker
correct = np.sum(stock_data.Close == stock_data.Open) # Count the rows where close equals open
absolute_errors = np.abs(stock_data.Close - stock_data.Open) # Take the absolute value of each difference
mae = np.mean(absolute_errors) # Compute the mean of the absolute differences
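
These NumPy functions also work directly on arrays, without a DataFrame. A minimal sketch with made-up prices (the names `toy_open` and `toy_close` are illustrative):

```python
import numpy as np

# Made-up open and close prices
toy_open = np.array([10.0, 11.0, 12.0, 13.0])
toy_close = np.array([10.0, 11.5, 11.0, 13.0])

unchanged_days = np.sum(toy_close == toy_open)  # each True counts as 1
absolute_moves = np.abs(toy_close - toy_open)   # size of each move, ignoring direction
mean_absolute_move = np.mean(absolute_moves)

print(unchanged_days, mean_absolute_move)
```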

## 3. Matplotlib (matplotlib)

`matplotlib` is the most widely used plotting library in Python. Its `pyplot` interface creates static, animated, and interactive plots.

### Common Instructions in Finance Labs:

✅ 1. Importing matplotlib

In [None]:
import matplotlib.pyplot as plt

✅ 2. Line plot of closing prices

In [None]:
plt.plot(stock_data_google.index, stock_data_google['Close'])
plt.title("Google Stock Prices")
plt.xlabel("Date")
plt.ylabel("Google Close Price (USD)")
plt.show()

## 4. Seaborn (seaborn)

`seaborn` offers more advanced visualizations and simpler syntax for statistical graphics.

### Common Instructions in Finance Labs:

✅ 1. Importing seaborn

In [None]:
import seaborn as sn

✅ 2. Plot histogram of distribution

In [None]:
sn.histplot(data=stock_data_google['Open'])
plt.title("Google Opening Price of the Stock")
plt.xlabel("Open")
plt.show()

✅ 3. Plot time series line

Helps to identify trends and seasonality.

In [None]:
sn.lineplot(data=stock_data_google, x="Date", y="Open")
plt.title("Google Opening Price of the Stock")
plt.xlabel("Date")
plt.ylabel("Stock Price")
plt.show()

✅ 4. Plot a boxplot

Excellent for analyzing volatility, price ranges, and detecting abnormal market behavior.

In [None]:
sn.boxplot(x=stock_data_google['Close'])
plt.title("Google Close Price of the Stock Boxplot")
plt.show()

✅ 5. Plot a correlation heatmap

Key for portfolio analysis, feature selection, and understanding relationships between financial variables (e.g., Open, Close, Volume).

In [None]:
corr = stock_data_google.corr(numeric_only=True) # compute the correlation
sn.heatmap(round(corr, 2), annot=True, cmap='coolwarm')
plt.title("Correlation Heatmap")
plt.show()

## 5. Scikit-learn (sklearn)

`sklearn` is a machine learning library with tools for classification, regression, clustering, and more.

### Common Instructions in Finance Labs:

✅ 1. Importing sklearn

In [None]:
from sklearn import tree, svm, preprocessing, metrics # import only the submodules used in the cells below

✅ 2. Initialize Model

Build different models:
- DecisionTreeClassifier: Predicts categories (e.g., "loan approved" vs "rejected")
- SVC: Support Vector Classifier (robust to noise, good in high-dimensional space)
- DecisionTreeRegressor: Predicts continuous values (e.g., stock price)
- SVR: Support Vector Regression

In [None]:
model = tree.DecisionTreeClassifier()
model = svm.SVC()
model = tree.DecisionTreeRegressor()
model = svm.SVR()

✅ 3. Train Model

Trains the model on the training data so that it learns the relationships between the features and the target.

In [None]:
model.fit(x_train, y_train)

✅ 4. Make Predictions

Makes predictions using the trained model on new/unseen data.

In [None]:
y_pred = model.predict(x_test) 

✅ 5. Normalize Features

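Standardizes features (zero mean, unit variance) to prevent features with larger scales from dominating.

In [None]:
scaler = preprocessing.StandardScaler()
x_train_scaled = scaler.fit_transform(x_train) # Learn the mean and std on the training set, then scale it
x_test_scaled = scaler.transform(x_test) # Reuse the training statistics on the test set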

✅ 6. Evaluate Model Using Confusion Matrix

Computes the confusion matrix to evaluate classification performance.

In [None]:
cm = metrics.confusion_matrix(y_test, y_pred)
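
The steps above can be sketched end to end on a toy problem. Everything here is made up for illustration: a single feature, four training points, and two test points. Scaling is applied before training in this sketch, and `random_state=0` is set only to make the run repeatable.

```python
from sklearn import metrics, preprocessing, tree

# Toy data: one feature, two classes (values invented for illustration)
toy_x_train = [[0.0], [1.0], [2.0], [3.0]]
toy_y_train = [0, 0, 1, 1]
toy_x_test = [[0.5], [2.5]]
toy_y_test = [0, 1]

# Scale with statistics learned from the training set only
toy_scaler = preprocessing.StandardScaler()
toy_x_train_scaled = toy_scaler.fit_transform(toy_x_train)
toy_x_test_scaled = toy_scaler.transform(toy_x_test)

# Train, predict, and evaluate
toy_model = tree.DecisionTreeClassifier(random_state=0)
toy_model.fit(toy_x_train_scaled, toy_y_train)
toy_y_pred = toy_model.predict(toy_x_test_scaled)
toy_cm = metrics.confusion_matrix(toy_y_test, toy_y_pred)

print(toy_y_pred)
print(toy_cm)
```

In the confusion matrix, rows correspond to the true classes and columns to the predicted classes, so correct predictions appear on the diagonal.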

## 6. Keras (tensorflow.keras)

`keras` is a high-level deep learning API that runs on top of TensorFlow, used to build, train, and evaluate neural networks. It’s ideal for time series forecasting, stock trend prediction, and price classification.

### Common Instructions in Finance Labs:

✅ 1. Importing keras

Imports the Keras API from TensorFlow. This allows you to access modules like:
- Sequential (model architecture)
- Dense, Conv1D, LSTM, etc. (layers)
- optimizers, losses, metrics

In [None]:
from tensorflow import keras
from tensorflow.keras import layers # Dense, Conv1D, LSTM, etc.

✅ 2. Build Model

Initializes a Sequential model, which stacks layers linearly (one after the other).

In [None]:
model = keras.Sequential()

✅ 3. Model Summary

Prints a summary of the model architecture: layers, output shapes, number of parameters.

In [None]:
model.summary()

✅ 4. Compile the Model

Prepares the model for training:
- optimizer='adam': adaptive optimizer for faster convergence.
- loss='mse': mean squared error, often used for regression tasks like predicting stock prices.

In [None]:
model.compile(optimizer='adam', loss='mse')

✅ 5. Train the Model

The training process adjusts model weights to minimize the loss.

In [None]:
model.fit(x_train, y_train)

✅ 6. Make Predictions

Uses the trained model to make predictions on the test set (x_test).

For classification, you might convert output probabilities into class labels using `numpy`:

In [None]:
y_pred_cnn = model.predict(x_test)
y_pred_cnn = np.argmax(y_pred_cnn, axis=1)
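
To see what `np.argmax(..., axis=1)` does without training a network, the sketch below applies it to a made-up array of predicted probabilities, with one row per sample and one column per class.

```python
import numpy as np

# Made-up predicted probabilities: one row per sample, one column per class
toy_probabilities = np.array([
    [0.1, 0.9],
    [0.8, 0.2],
    [0.3, 0.7],
])

toy_labels = np.argmax(toy_probabilities, axis=1)  # index of the largest value in each row
print(toy_labels)
```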