# Introduction to Machine Learning


## Outline
1. Introduction to Artificial Intelligence and Machine Learning
    * What is Artificial Intelligence (AI)?
    * What is Machine Learning (ML)?
    * Relationship between AI and ML
    * Importance of Machine Learning
2. Goals and Objectives of Machine Learning
    * Supervised Learning
    * Unsupervised Learning
    * Reinforcement Learning
    * Semi-supervised Learning
    * Self-supervised Learning
    * Transfer Learning
3. Basic Concepts in Machine Learning
    * Features and Labels
    * Training and Testing Data
    * Model Evaluation Metrics (Accuracy, Precision, Recall, F1-Score)
4. Main Directions in Machine Learning
    * Regression
    * Classification
    * Clustering
    * Dimensionality Reduction
    * Anomaly Detection
5. Basic Machine Learning Methods
    * Linear Regression
    * Logistic Regression
    * Decision Trees
    * k-Nearest Neighbors
    * Support Vector Machines
6. Introduction to Deep Learning
    * Artificial Neural Networks
    * Convolutional Neural Networks
    * Recurrent Neural Networks
    * Generative Adversarial Networks
    * Transfer Learning
7. Python Libraries for Machine Learning
    * NumPy
    * pandas
    * Matplotlib
    * scikit-learn
    * TensorFlow
    * Keras
    * PyTorch
8. Machine Learning Examples
    * Simple Linear Regression Example
    * Image Classification Example
    * Time Series Forecasting Example
9. Conclusion
    * Recap of the tutorial
    * Real-world applications of Machine Learning
    * Further resources and recommendations

## 1. Introduction to Artificial Intelligence and Machine Learning
### 1.1. What is Artificial Intelligence (AI)?
Artificial Intelligence (AI) is the field of computer science that aims to create machines that can simulate or mimic human intelligence. The goal is to enable computers to perform tasks that would typically require human intelligence, such as speech recognition, visual perception, decision-making, and natural language understanding.
<img src="https://www.researchgate.net/profile/Stephan-De-Spiegeleire/publication/316983844/figure/fig5/AS:494820222095361@1494985742395/AN-OVerVieW-OF-NOTABLe-APPrOACHeS-AND-DiSCiPLiNeS-iN-Ai-AND-MACHiNe-LeArNiNg-101_W640.jpg" alt="AI Overview" width="600"/>


### 1.2. What is Machine Learning (ML)?
Machine Learning (ML) is a subset of Artificial Intelligence that involves the development of algorithms that enable computers to learn from data and make predictions or decisions based on it. Rather than following explicitly programmed rules, a computer running these algorithms improves its performance on a specific task as it is exposed to more data.
<img src="https://miro.medium.com/max/1000/1*ZB6H4HuF58VcMOWbdpcRxQ.png" alt="ML Overview" width="600"/>


### 1.3. Relationship between AI and ML
AI is a broad field that encompasses various techniques and approaches to achieve intelligent behavior in machines. Machine Learning is one of the approaches within AI, and it's currently the most successful and widely used. In other words, Machine Learning is a means to achieve AI.
<img src="https://i0.wp.com/blog.forumias.com/wp-content/uploads/2022/07/AI-ML-DL.jpg?w=958&ssl=1" alt="AI and ML Relationship" width="600"/>

### 1.4. Importance of Machine Learning
Machine Learning has become increasingly important in recent years due to its ability to solve complex problems and provide valuable insights from vast amounts of data. Some key factors driving the growth of Machine Learning include:
1. **Data Availability**: The exponential growth of data generated by various sources, such as social media, IoT devices, and e-commerce, has provided a rich source of information for Machine Learning algorithms to learn from.
2. **Computational Power**: Advances in computing hardware, such as GPUs and TPUs, have enabled the processing of large datasets and the training of complex models in a much shorter time.
3. **Algorithmic Advancements**: Researchers and practitioners have developed more efficient and accurate Machine Learning algorithms, enabling better performance across a wide range of tasks.
4. **Real-world Applications**: Machine Learning has been successfully applied to various domains, including healthcare, finance, transportation, and marketing, leading to significant improvements in efficiency, accuracy, and decision-making.
![Importance of ML](https://miro.medium.com/max/1200/1*U_L8qI8RmYS-MOBrYvXhSA.png)

## 2. Goals and Objectives of Machine Learning
Machine learning is a subfield of artificial intelligence that focuses on creating algorithms that can learn from and make predictions or decisions based on data. The primary goal of machine learning is to enable machines to automatically learn from past experiences, adapt to new situations, and improve their performance without explicit programming. There are several different approaches to achieving this goal, and these can be broadly categorized into six types: Supervised Learning, Unsupervised Learning, Reinforcement Learning, Semi-supervised Learning, Self-supervised Learning, and Transfer Learning.

### 2.1. Supervised Learning
![Supervised Learning](https://techvidvan.com/tutorials/wp-content/uploads/sites/2/2020/07/Supervised-Learning-in-ML.jpg)

Supervised Learning is the most common type of machine learning. In this approach, an algorithm is trained on a labeled dataset, where the input features and their corresponding output labels are provided. The primary goal of supervised learning is to learn a mapping from inputs to outputs and use this mapping to make predictions on new, unseen data. Supervised learning can be further divided into two types: classification and regression. In classification, the output is a discrete label, while in regression, the output is a continuous value.

### 2.2. Unsupervised Learning
![Unsupervised Learning](https://miro.medium.com/v2/resize:fit:1400/0*tamvSiqDneDfw2Vr)

Unsupervised Learning is a type of machine learning where the algorithm is provided with an unlabeled dataset, and its objective is to find patterns or structures in the data. Common unsupervised learning tasks include clustering, dimensionality reduction, and anomaly detection. In clustering, the goal is to group similar data points together, while in dimensionality reduction, the goal is to reduce the number of features in the dataset while preserving its underlying structure.

### 2.3. Reinforcement Learning
![Reinforcement Learning](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*4u2GtNnMa9xso1WkLh7hVA.png)

Reinforcement Learning is an approach to machine learning where an agent learns to make decisions by interacting with an environment. The agent receives feedback in the form of rewards or penalties for its actions and adjusts its behavior accordingly to maximize the cumulative reward. Reinforcement learning is particularly useful in situations where the optimal solution is not known in advance and must be learned through trial and error.
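
The trial-and-error loop described above can be illustrated with a very small sketch. It assumes a made-up environment (a three-armed "slot machine" with hidden win probabilities) and an epsilon-greedy agent; the names `true_win_probs` and `epsilon` are illustrative choices, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: three slot-machine arms with hidden win probabilities
true_win_probs = np.array([0.2, 0.5, 0.8])
n_arms = len(true_win_probs)
n_steps = 2000

value_estimates = np.zeros(n_arms)  # the agent's running estimate of each arm's reward
pull_counts = np.zeros(n_arms)      # how often each arm has been tried
epsilon = 0.1                       # probability of exploring a random arm
total_reward = 0.0

for step in range(n_steps):
    # Explore with probability epsilon, otherwise exploit the best-looking arm
    if rng.random() < epsilon:
        arm = int(rng.integers(n_arms))
    else:
        arm = int(np.argmax(value_estimates))

    # The environment answers with a reward of 1 (win) or 0 (loss)
    reward = float(rng.random() < true_win_probs[arm])
    total_reward += reward

    # Incremental average: nudge the estimate toward the observed reward
    pull_counts[arm] += 1
    value_estimates[arm] += (reward - value_estimates[arm]) / pull_counts[arm]

print("Estimated arm values:", np.round(value_estimates, 2))
print("Average reward per step:", total_reward / n_steps)
```

The agent is never told which arm is best; it only sees rewards, which is the key difference from supervised learning.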

### 2.4. Semi-supervised Learning
<img src="https://miro.medium.com/v2/resize:fit:1400/format:webp/1*CzVENZj3bWrwhRBN4hQq7Q.png" alt="Semi-supervised Learning" width="600"/>

Semi-supervised Learning is a type of machine learning that combines elements of both supervised and unsupervised learning. In this approach, the algorithm is trained on a dataset that contains both labeled and unlabeled data. The goal of semi-supervised learning is to leverage the structure of the unlabeled data to build a model that generalizes better than one trained on the labeled examples alone. This can be particularly useful in situations where obtaining labeled data is expensive or time-consuming.
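
As a rough illustration, the sketch below hides about 80% of the Iris labels and lets scikit-learn's `LabelSpreading` infer them from the remaining labeled points. The dataset, the 80% figure, and the random seed are assumptions made for this example only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)

# Pretend most labels are unknown: scikit-learn marks unlabeled samples with -1
rng = np.random.default_rng(42)
unlabeled_mask = rng.random(len(y)) < 0.8
y_partial = np.where(unlabeled_mask, -1, y)

# Fit on all samples; labels spread from labeled points to nearby unlabeled ones
model = LabelSpreading()
model.fit(X, y_partial)

# transduction_ holds the label inferred for every training sample
inferred = model.transduction_
accuracy_on_hidden = (inferred[unlabeled_mask] == y[unlabeled_mask]).mean()
print("Labeled samples used:", int((~unlabeled_mask).sum()))
print(f"Accuracy on the hidden labels: {accuracy_on_hidden:.2f}")
```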



### 2.5. Self-supervised Learning
<img src="https://velog.velcdn.com/images/jaeheon-lee/post/b7c932cc-cc96-4b1d-a3a9-d890e8ba3d3c/image.png" alt="Self-supervised Learning" width="600"/>
Self-supervised Learning is an emerging type of machine learning that aims to learn useful representations of data without relying on human-annotated labels. It can be considered as a subcategory of unsupervised learning but with a specific focus on learning representations that can be useful for downstream tasks, such as classification or regression.

In self-supervised learning, an algorithm learns to solve a "pretext" task, which is derived from the input data itself, without requiring any additional labels. For example, the pretext task could be predicting the next word in a sentence, predicting the rotation of an image, or predicting the color of a grayscale image. By solving these pretext tasks, the algorithm learns to extract meaningful features from the data that can be later used for other tasks.

One of the main advantages of self-supervised learning is that it can leverage large amounts of unlabeled data to learn useful representations, which can then be fine-tuned on smaller labeled datasets for specific tasks. This approach has been particularly successful in the field of natural language processing and computer vision, where self-supervised learning has led to significant improvements in performance on a variety of tasks.

In [None]:
# Example of self-supervised learning using image rotation as the pretext task
import torch
from torchvision import transforms
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader

# Load the CIFAR10 dataset (download=True fetches the data if it is not in ./data yet)
dataset = CIFAR10(root="./data", train=True, download=True, transform=transforms.ToTensor())

# Create a DataLoader
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# In the training loop, rotate each image and use the rotation index as the pretext task label.
# The original class labels are ignored (the "_" below).
for images, _ in dataloader:
    # Pick a rotation per image: 0, 1, 2, 3 -> 0, 90, 180, 270 degrees
    rotation_labels = torch.randint(0, 4, (images.size(0),))
    rotated_images = torch.stack([
        torch.rot90(img, k=int(k), dims=(1, 2)) for img, k in zip(images, rotation_labels)
    ])
    # Train the self-supervised model to predict rotation_labels from rotated_images
    break  # placeholder: a real loop would run the model, loss, and optimizer step here



In this example, the self-supervised learning task is to predict the rotation angle applied to each image in the CIFAR-10 dataset. By learning to solve this task, the model will learn useful features that can be later used for other tasks, such as image classification.

### 2.6. Transfer Learning
![Transfer Learning](https://www.mdpi.com/sensors/sensors-23-00570/article_deploy/html/images/sensors-23-00570-g001-550.jpg)

Transfer Learning is an approach to machine learning where a pre-trained model is fine-tuned on a new dataset. The idea behind transfer learning is that the knowledge gained from solving one problem can be applied to solve a related problem more efficiently. This is particularly useful in deep learning, where training a model from scratch can be computationally expensive and time-consuming. By using a pre-trained model as a starting point, the training process can be significantly accelerated, and better performance can be achieved with less data.



## 3. Basic Concepts in Machine Learning
### 3.1. Features and Labels
<img src="https://majdarbash.github.io/assets/images/aws-cmls/observations-features-labels.png" alt="Features and Labels" width="600"/>
 

In machine learning, the data we work with is usually represented in a structured format, with a set of attributes known as features, and a target attribute known as the label. Features are the variables that describe the data, while the label is the output variable that we want to predict or classify.

For example, in an email spam detection problem, features could be the words or phrases in the email, the sender's email address, and the time the email was sent. The label would be a binary variable indicating whether the email is spam or not spam.

### 3.2. Training and Testing Data
 
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/0/0e/Traintest.svg/2880px-Traintest.svg.png" alt="Training and Testing Data" width="600"/>

In order to build a machine learning model, we need to split our dataset into two separate parts: the training data and the testing data. The training data is used to train the model and learn the relationship between the features and the label. The testing data is used to evaluate the model's performance on unseen data and to ensure that the model is not overfitting to the training data.

Typically, the dataset is split into 70-80% training data and 20-30% testing data. It is important that the training and testing data have a similar distribution of the label variable, so that performance measured on the testing data is a representative estimate of how the model will behave on new data.
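
A minimal sketch of such a split, assuming the Iris dataset as a stand-in for your own data. Passing `stratify=y` to `train_test_split` is one way to keep the label distribution the same in both parts.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# 80% training, 20% testing; stratify=y keeps the class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print("Training samples:", X_train.shape[0])
print("Testing samples:", X_test.shape[0])
print("Class counts in training data:", np.bincount(y_train))
print("Class counts in testing data:", np.bincount(y_test))
```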

### 3.3. Model Evaluation Metrics

<img src="https://cdn-images-1.medium.com/max/800/1*1WPbfzztdv50V22TpA6njw.png" alt="Model Evaluation Metrics" width="600"/>

Model evaluation metrics are used to measure the performance of a machine learning model. There are several evaluation metrics available, and the choice of which one to use depends on the problem at hand. Some of the most common evaluation metrics are:
1. **Accuracy**: The proportion of correct predictions out of the total predictions made. It is suitable for balanced datasets, but not for imbalanced datasets where one class is significantly more frequent than the other(s).
2. **Precision**: The proportion of true positive predictions out of the total positive predictions made. Precision measures how many of the instances the model flags as positive really are positive.
3. **Recall**: The proportion of true positive predictions out of the total actual positive instances. Recall is a measure of how well the model identifies all the positive instances.
4. **F1-Score**: The harmonic mean of precision and recall. F1-score is a balanced metric that considers both precision and recall, making it suitable for imbalanced datasets.

In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Example of calculating evaluation metrics for a classification problem
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 1]

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1-Score: {f1:.2f}")

In this example, we calculate the accuracy, precision, recall, and F1-score for a binary classification problem using the **scikit-learn** library.
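
To see where these numbers come from, the same four metrics can be computed by hand from the counts of true/false positives and negatives. This is only a sketch for the binary case, reusing the `y_true` and `y_pred` values from the cell above.

```python
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
print(f"Accuracy: {accuracy:.2f}")    # 0.67
print(f"Precision: {precision:.2f}")  # 0.75
print(f"Recall: {recall:.2f}")        # 0.75
print(f"F1-Score: {f1:.2f}")          # 0.75
```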

## 4. Main Directions in Machine Learning
### 4.1. Regression
 
<img src="https://static.javatpoint.com/tutorial/machine-learning/images/linear-regression-in-machine-learning.png" alt="Regression" width="600"/>

Regression is a type of supervised learning task where the goal is to predict a continuous output variable based on the input features. In regression, the relationship between the input features and the output variable is modeled using a mathematical function. Common examples of regression tasks include predicting house prices, stock prices, and customer lifetime value.

In [None]:
# Load the necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

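# Load the dataset (California Housing; load_boston was removed in scikit-learn 1.2)
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()  # downloads the data on first use
data = pd.DataFrame(housing.data, columns=housing.feature_names)
data['PRICE'] = housing.target  # median house value, in units of $100,000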

# Split the dataset into training and testing sets
X = data.drop('PRICE', axis=1)
y = data['PRICE']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Make predictions and evaluate the model
y_pred = regressor.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("Mean squared error:", mse)

#### Linear Regression
Linear Regression is a supervised machine learning algorithm used for predicting a continuous target variable based on one or more input features. It assumes that there's a linear relationship between the input features and the target variable. The goal of the algorithm is to find the best-fitting line (in the case of simple linear regression) or plane/hyperplane (in the case of multiple linear regression) that minimizes the sum of the squared differences between the actual target values and the predicted values.
![Linear Regression](https://miro.medium.com/max/1280/1*fX95txC9xSwSPeP6ch2nmg.gif)
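
The "best-fitting line" idea can be sketched directly with NumPy's least-squares solver. The data here is synthetic (generated from a known line plus noise), which is an assumption made so the recovered slope and intercept can be compared with the true ones.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=50)

# Design matrix with a column of ones so the model can learn an intercept
A = np.column_stack([x, np.ones_like(x)])

# Solve the least-squares problem: minimize ||A @ [slope, intercept] - y||^2
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

predictions = slope * x + intercept
sum_squared_error = np.sum((y - predictions) ** 2)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, SSE={sum_squared_error:.2f}")
```

`LinearRegression` in scikit-learn solves the same least-squares problem, with extra conveniences such as multiple features and a `predict` method.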

#### Exercise 1 
Try using a different regression algorithm and compare the results.

In [None]:
# Hint: Look into Ridge, Lasso, or ElasticNet from sklearn.linear_model

### 4.2. Classification
 
<img src="https://static.javatpoint.com/tutorial/machine-learning/images/classification-algorithm-in-machine-learning.png" alt="Classification" width="400"/>

Classification is another type of supervised learning task, where the goal is to predict a discrete output variable (class label) based on the input features. In classification, the relationship between the input features and the output variable is modeled using a function that maps the input features to one of the possible class labels. Common examples of classification tasks include email spam detection, image recognition, and medical diagnosis.

#### K Neighbors Classification
K Neighbors Classification is a supervised machine learning algorithm used for classification tasks. It is a non-parametric, lazy learning algorithm, meaning that there is no explicit training phase, and the algorithm makes predictions based on the training data itself. The idea behind K Neighbors Classification is to predict the class of a new data point by looking at the K closest data points in the training set and assigning the most common class among these neighbors.
![K Neighbors Classification](https://miro.medium.com/max/900/1*OyYyr9qY-w8RkaRh2TKo0w.png)

In [None]:
# Load the necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the dataset (e.g., Iris dataset)
from sklearn.datasets import load_iris
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['Species'] = iris.target

# Split the dataset into training and testing sets
X = data.drop('Species', axis=1)
y = data['Species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X_train, y_train)

# Make predictions and evaluate the model
y_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)


#### Exercise 2  
Try using a different classification algorithm and compare the results.


In [None]:
# Hint: Look into LogisticRegression, DecisionTreeClassifier, or RandomForestClassifier from sklearn

### 4.3. Clustering

<img src="https://media.geeksforgeeks.org/wp-content/uploads/merge3cluster.jpg" alt="Clustering" width="600"/>

Clustering is an unsupervised learning task where the goal is to group similar instances together based on their features. Clustering algorithms analyze the input features and identify patterns or structures within the data. The resulting groups, or clusters, are formed such that instances within the same cluster are more similar to each other than to instances in other clusters. Common examples of clustering tasks include customer segmentation, image segmentation, and anomaly detection.

#### K Means
K Means is an unsupervised machine learning algorithm used for clustering tasks. The goal of the algorithm is to partition the input data into K distinct clusters based on similarity (usually Euclidean distance). The algorithm works iteratively: it assigns each data point to the cluster with the nearest centroid, recomputes each centroid as the mean of the points assigned to it, and repeats until the assignments stop changing.
![K Means](https://miro.medium.com/max/1400/1*KrcZK0xYgTa4qFrVr0fO2w.gif)
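
The iterative procedure can be written out in a few lines of NumPy. This is a bare-bones sketch on synthetic blobs (an assumption for illustration); it uses a plain random initialization, so it may settle in a poorer solution than scikit-learn's `KMeans`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): three blobs of 50 points each
blob_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
points = np.vstack([c + rng.normal(0, 0.5, size=(50, 2)) for c in blob_centers])

k = 3
# Start from k distinct data points chosen at random
centroids = points[rng.choice(len(points), size=k, replace=False)]

for iteration in range(100):
    # Assignment step: each point joins its nearest centroid (Euclidean distance)
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)

    # Update step: move each centroid to the mean of its assigned points
    new_centroids = np.array([
        points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
        for j in range(k)
    ])

    if np.allclose(new_centroids, centroids):
        break  # centroids stopped moving, so the assignments are stable
    centroids = new_centroids

print("Final centroids:\n", np.round(centroids, 2))
```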


In [None]:
# Load the necessary libraries
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Load the dataset (e.g., Iris dataset)
from sklearn.datasets import load_iris
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)

# Train the model (random_state makes the cluster assignments reproducible)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(data)

# Visualize the clusters
plt.scatter(data['sepal length (cm)'], data['sepal width (cm)'], c=kmeans.labels_, cmap='viridis')
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('K-Means Clustering')
plt.show()

#### Exercise 3  
Try using a different clustering algorithm and compare the results.

In [None]:
# Hint: Look into DBSCAN, AgglomerativeClustering, or SpectralClustering from sklearn.cluster

### 4.4. Dimensionality Reduction
 
<img src="https://media.geeksforgeeks.org/wp-content/uploads/Dimensionality_Reduction_1.jpg" alt="Dimensionality Reduction" width="600"/>

Dimensionality reduction is another unsupervised learning task, where the goal is to reduce the number of input features while retaining the most important information in the data. Dimensionality reduction techniques transform the original high-dimensional data into a lower-dimensional representation that is easier to analyze and visualize. This can help improve the performance of machine learning models and reduce the computational cost. Common dimensionality reduction techniques include Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Linear Discriminant Analysis (LDA); unlike the other two, LDA is supervised and uses the class labels.

#### Principal Component Analysis (PCA)
PCA is an unsupervised machine learning algorithm used for dimensionality reduction. It is a linear transformation technique that aims to project the original data into a lower-dimensional subspace while retaining as much of the data's variance as possible. This is achieved by identifying the directions (called principal components) in the feature space along which the variance is maximized.
![PCA](https://miro.medium.com/max/874/1*H38t3YUv_QktLwalzDYRRg.png)
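
One common way to find the principal components is a singular value decomposition (SVD) of the centered data. The sketch below does this with NumPy on the Iris features; it is meant to show the idea, not to replace `sklearn.decomposition.PCA`.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data

# Center the data so the components describe variance around the mean
X_centered = X - X.mean(axis=0)

# SVD: the rows of Vt are the principal directions, ordered by decreasing variance
U, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project the data onto the first two principal components
projected = X_centered @ Vt[:2].T

# Fraction of the total variance captured by each component
explained_variance_ratio = singular_values**2 / np.sum(singular_values**2)
print("Projected shape:", projected.shape)
print("Variance kept by two components:", round(float(explained_variance_ratio[:2].sum()), 3))
```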

In [None]:
# Load the necessary libraries
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Load the dataset (e.g., Iris dataset)
from sklearn.datasets import load_iris
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)

# Apply PCA
pca = PCA(n_components=2)
principal_components = pca.fit_transform(data)

# Visualize the results
plt.scatter(principal_components[:, 0], principal_components[:, 1], c=iris.target, cmap='viridis')
plt.xlabel('First Principal Component')
plt.ylabel('Second Principal Component')
plt.title('PCA Dimensionality Reduction')
plt.show()



#### Exercise 4 
Try using a different dimensionality reduction algorithm and compare the results.


In [None]:
# Hint: Look into TSNE from sklearn.manifold, LinearDiscriminantAnalysis from sklearn.discriminant_analysis, or UMAP from the separate umap-learn package

### 4.5. Anomaly Detection
 
<img src="https://dezyre.gumlet.io/images/blog/anomaly-detection-using-machine-learning-in-python-with-example/image_22571226771643385810847.png?w=900&dpr=2.0" alt="Anomaly Detection" width="600"/>

Anomaly detection is a type of machine learning task where the goal is to identify instances that deviate significantly from the norm or the majority of the data. These instances, or anomalies, can represent errors, fraud, or other interesting events that warrant further investigation. Anomaly detection algorithms analyze the input features and assign a score to each instance based on how similar or dissimilar it is to the rest of the data. Common examples of anomaly detection tasks include fraud detection, network intrusion detection, and equipment failure prediction.

#### Elliptic Envelope
Elliptic Envelope is an unsupervised machine learning algorithm used for anomaly detection. The algorithm assumes that the data is normally distributed and works by fitting an ellipse (in two dimensions) or an ellipsoid (in higher dimensions) around the central data points. Data points that lie outside the ellipse or ellipsoid are considered outliers or anomalies. The method is sensitive to the assumption of normality, but it is robust in detecting outliers in multivariate data.

![Elliptic Envelope](https://miro.medium.com/v2/resize:fit:952/format:webp/1*CYgTfGIjb35GkZCh5NgD4g.png)
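
The "ellipse around the central data points" can be expressed through the Mahalanobis distance, which measures how far a point is from the center relative to the spread of the data. The sketch below is a simplified version that uses the ordinary sample covariance; scikit-learn's `EllipticEnvelope` relies on a robust covariance estimate instead, so its results will differ somewhat.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.multivariate_normal(mean=[0, 0], cov=[[1, 0.5], [0.5, 1]], size=300)

# Estimate the center and shape of the data cloud
center = data.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(data, rowvar=False))

# Squared Mahalanobis distance of every point from the center
diff = data - center
mahalanobis_sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

# Flag the 10% of points farthest from the center as outliers
threshold = np.quantile(mahalanobis_sq, 0.9)
is_outlier = mahalanobis_sq > threshold
print("Points flagged as outliers:", int(is_outlier.sum()))
```

Points with equal Mahalanobis distance lie on the same ellipse, so thresholding this distance is the same as drawing an elliptical boundary.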


In [None]:
# Load the necessary libraries
import numpy as np
import pandas as pd
from sklearn.covariance import EllipticEnvelope
import matplotlib.pyplot as plt

# Generate some sample data
np.random.seed(42)
data = np.random.multivariate_normal(mean=[0, 0], cov=[[1, 0.5], [0.5, 1]], size=300)

# Train the model
outlier_detector = EllipticEnvelope(contamination=0.1)
outlier_detector.fit(data)

# Predict outliers (predict returns 1 for inliers and -1 for outliers)
outliers = outlier_detector.predict(data)

# Visualize the results
plt.scatter(data[:, 0], data[:, 1], c=outliers, cmap='viridis', marker='o')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Anomaly Detection')
plt.show()


#### Exercise 5 
Try using a different anomaly detection algorithm and compare the results.


In [None]:
# Hint: Look into IsolationForest, LocalOutlierFactor, or OneClassSVM from sklearn

## 5. Basic Machine Learning Methods
### 5.1. Linear Regression
Linear Regression is a simple machine learning algorithm that models the linear relationship between a dependent variable (output) and one or more independent variables (features). It is mainly used for regression tasks, i.e., predicting continuous values.
![Linear Regression](https://miro.medium.com/max/1280/1*fX95txC9xSwSPeP6ch2nmg.gif)

In [None]:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

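# Load your dataset; the bundled diabetes data is used here only as a stand-in
from sklearn.datasets import load_diabetes
X, y = load_diabetes(return_X_y=True, as_frame=True)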

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Linear Regression model
lr = LinearRegression()

# Train the model
lr.fit(X_train, y_train)

# Make predictions
y_pred = lr.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)

#### Exercise 6
Try using different independent variables for the Linear Regression model and see how it affects the mean squared error.

In [None]:
# Hints:
# 1. Select different columns from the dataset as independent variables.
# 2. Train and evaluate the Linear Regression model as shown in the example above.

### 5.2. Logistic Regression
Logistic Regression is a variation of Linear Regression that is used for binary classification tasks. It models the probability of the positive class by passing a linear combination of the input features through the logistic (sigmoid) function, which outputs a value between 0 and 1. Logistic Regression can also be extended to multi-class classification using techniques like one-vs-rest or multinomial logistic regression.
<img src="https://miro.medium.com/max/1400/1*RqXFpiNGwdiKBWyLJc_E7g.png" alt="Logistic Regression" width="600"/>
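
The prediction step of logistic regression is short enough to write by hand. In this sketch the `weights` and `bias` are hypothetical values picked for illustration, not learned from data; a trained model would obtain them during fitting.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias, chosen by hand rather than learned
weights = np.array([1.5, -2.0])
bias = 0.25

samples = np.array([
    [2.0, 0.5],
    [0.0, 0.0],
    [-1.0, 1.0],
])

# Linear score first, then squash it into a probability of the positive class
scores = samples @ weights + bias
probabilities = sigmoid(scores)

# Threshold at 0.5 to turn probabilities into class predictions
predictions = (probabilities >= 0.5).astype(int)
print("Scores:", scores)
print("Probabilities:", np.round(probabilities, 3))
print("Predictions:", predictions)
```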

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

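# Load your dataset; the bundled Iris data is used here only as a stand-in
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True, as_frame=True)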

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Logistic Regression model
log_reg = LogisticRegression()

# Train the model
log_reg.fit(X_train, y_train)

# Make predictions
y_pred = log_reg.predict(X_test)

# Evaluate the model
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)

#### Exercise 7
Try using different independent variables for the Logistic Regression model and see how it affects the accuracy score.

In [None]:
# Hints:
# 1. Select different columns from the dataset as independent variables.
# 2. Train and evaluate the Logistic Regression model as shown in the example above.

### 5.3. Decision Trees
Decision Trees are a class of machine learning algorithms that are used for both regression and classification tasks. They work by recursively splitting the input space into regions based on the values of the input features. The tree is grown until a stopping criterion is reached, such as a maximum tree depth or minimum number of samples per leaf.

![Decision Trees](https://miro.medium.com/max/1400/1*XMId5sJqPtm8-RIwVVz2tg.png)
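
A single splitting step can be sketched as a search for the threshold that makes the two resulting groups as pure as possible. The example below assumes Gini impurity as the purity measure (the default criterion of `DecisionTreeClassifier`) and uses a tiny hand-made dataset; the helper names `gini` and `best_threshold` are illustrative.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 0 for a pure group, larger when classes are mixed."""
    _, counts = np.unique(labels, return_counts=True)
    proportions = counts / counts.sum()
    return 1.0 - np.sum(proportions ** 2)

def best_threshold(feature, labels):
    """Try each midpoint between sorted unique values; keep the lowest weighted impurity."""
    values = np.unique(feature)
    candidates = (values[:-1] + values[1:]) / 2
    best = (None, np.inf)
    for threshold in candidates:
        left = labels[feature <= threshold]
        right = labels[feature > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Toy data (an assumption for illustration): one feature, two classes
feature = np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0])
labels = np.array([0, 0, 0, 1, 1, 1])

threshold, impurity = best_threshold(feature, labels)
print(f"Best split: feature <= {threshold} (weighted Gini = {impurity:.2f})")
```

A full tree repeats this search over every feature and then recurses into the left and right groups until a stopping criterion is met.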

In [None]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

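# Load your dataset; the bundled Iris data is used here only as a stand-in
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True, as_frame=True)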

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Decision Tree model
dt = DecisionTreeClassifier(random_state=42)  # fixed seed so repeated runs build the same tree

# Train the model
dt.fit(X_train, y_train)

# Make predictions
y_pred = dt.predict(X_test)

# Evaluate the model
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)


#### Exercise 8
Try using different hyperparameters for the Decision Tree model, such as the maximum depth or minimum samples split, and see how it affects the accuracy score.

In [None]:
# Hints:
# 1. Adjust the hyperparameters when initializing the DecisionTreeClassifier.
# 2. Train and evaluate the Decision Tree model as shown in the example above.


### 5.4. k-Nearest Neighbors
k-Nearest Neighbors (k-NN) is a simple machine learning algorithm used for classification and regression tasks. It works by finding the k training examples that are closest to a new input instance and predicting the output based on the majority class (for classification) or the average value (for regression) of these neighbors.

![k-Nearest Neighbors](https://miro.medium.com/max/1000/1*OyYyr9qY-w8RkaRh2TKo0w.png)
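
Because k-NN has no real training phase, the whole classifier fits in one small function. The sketch below assumes Euclidean distance and a simple majority vote, on a hand-made two-class dataset; `knn_predict` is an illustrative helper, not a library function.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - query, axis=1)  # Euclidean distance to every point
    nearest = np.argsort(distances)[:k]                  # indices of the k closest points
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Toy data (an assumption for illustration): two small groups of points
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                    [6.0, 6.0], [6.5, 7.0], [7.0, 6.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

prediction_a = knn_predict(X_train, y_train, np.array([1.2, 1.4]))
prediction_b = knn_predict(X_train, y_train, np.array([6.4, 6.2]))
print("Prediction for [1.2, 1.4]:", prediction_a)  # 0
print("Prediction for [6.4, 6.2]:", prediction_b)  # 1
```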


In [None]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

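# Load your dataset; the bundled Iris data is used here only as a stand-in
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True, as_frame=True)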

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a k-Nearest Neighbors model
knn = KNeighborsClassifier(n_neighbors=5)

# Train the model
knn.fit(X_train, y_train)

# Make predictions
y_pred = knn.predict(X_test)

# Evaluate the model
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)

#### Exercise 9 
Try using different values of k and different distance metrics for the k-NN model and see how it affects the accuracy score.

In [None]:
# Hints:
# 1. Adjust the value of n_neighbors and the distance metric when initializing the KNeighborsClassifier.
# 2. Train and evaluate the k-NN model as shown in the example above.

### 5.5. Support Vector Machines
Support Vector Machines (SVM) are a class of machine learning algorithms used for classification and regression tasks. They work by finding the hyperplane that best separates the classes (for classification) or predicts the target values (for regression) while maximizing the margin between the hyperplane and the nearest data points (support vectors).


<img src="https://torchbearer.readthedocs.io/en/0.3.0/_images/svm_fit.gif" alt="Support Vector Machines" width="600"/>
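
For a linear SVM, the separating hyperplane and the margin can be read off the fitted model. The sketch below uses a tiny, linearly separable toy dataset (an assumption for illustration) and a large `C` so that the result approximates a hard-margin classifier; the margin width is computed as `2 / ||w||`.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (an assumption for illustration): two linearly separable groups
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel with a large C approximates a hard-margin SVM
clf = SVC(kernel="linear", C=1000.0)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:\n", clf.support_vectors_)
print("Hyperplane: w =", np.round(w, 3), "b =", round(float(b), 3))
print("Margin width:", round(float(margin_width), 3))
```

Only the support vectors determine the hyperplane; moving any other point (without crossing the margin) leaves the model unchanged.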

In [None]:
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

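# Load your dataset; the bundled Iris data is used here only as a stand-in
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True, as_frame=True)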

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Support Vector Machines model
svm = SVC()

# Train the model
svm.fit(X_train, y_train)

# Make predictions
y_pred = svm.predict(X_test)

# Evaluate the model
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)

#### Exercise 10 
Try using different kernel functions and regularization parameters for the SVM model and see how it affects the accuracy score.

In [None]:
# Hints:
# 1. Adjust the kernel function and regularization parameter when initializing the SVC.
# 2. Train and evaluate the SVM model as shown in the example above.

## 6. Introduction to Deep Learning
Deep learning is a subset of machine learning that focuses on neural network architectures with many layers, also known as deep neural networks. These deep neural networks can automatically learn to represent data by training on large datasets. In this section, we will discuss various deep learning architectures and their applications.

### 6.1. Artificial Neural Networks
Artificial Neural Networks (ANNs) are inspired by the human brain and consist of interconnected artificial neurons. ANNs are designed to recognize patterns and make decisions by learning from input data.

![Artificial Neural Network](https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Colored_neural_network.svg/1200px-Colored_neural_network.svg.png)


Here is a simple implementation of a feedforward neural network using TensorFlow and Keras:

In [None]:
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(784,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Assuming you have loaded and preprocessed the dataset
# model.fit(X_train, y_train, epochs=10, batch_size=32)
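To make it clearer what the `Dense` layers above compute, here is a rough NumPy sketch of the same forward pass: each layer multiplies its input by a weight matrix, adds a bias, and applies an activation function. The random weights and the helper names (`relu`, `softmax`, `x_batch`) are assumptions for illustration; a real Keras model learns its weights during training.

```python
import numpy as np

def relu(z):
    # ReLU keeps positive values and sets negative values to zero
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row-wise maximum for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    exp_z = np.exp(z)
    return exp_z / exp_z.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x_batch = rng.normal(size=(4, 784))  # 4 made-up samples with 784 features each

# Randomly initialized weights and zero biases for the three layers (784 -> 64 -> 64 -> 10)
W1, b1 = rng.normal(scale=0.05, size=(784, 64)), np.zeros(64)
W2, b2 = rng.normal(scale=0.05, size=(64, 64)), np.zeros(64)
W3, b3 = rng.normal(scale=0.05, size=(64, 10)), np.zeros(10)

h1 = relu(x_batch @ W1 + b1)     # first hidden layer
h2 = relu(h1 @ W2 + b2)          # second hidden layer
probs = softmax(h2 @ W3 + b3)    # output layer: one probability per class

print(probs.shape)        # (4, 10)
print(probs.sum(axis=1))  # each row sums to 1
```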

#### Exercise 11
Modify the architecture of the neural network by adding more hidden layers or changing the number of neurons in each layer.

In [None]:
# You can add more layers to the model by adding more `layers.Dense()` instances
# For example, you can add another hidden layer with 128 neurons like this:

model = tf.keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(784,)),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])


### 6.2. Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are a type of deep learning architecture designed specifically for processing grid-like data, such as images. CNNs consist of convolutional layers, pooling layers, and fully connected layers.

![Convolutional Neural Network](https://miro.medium.com/max/3288/1*uAeANQIOQPqWZnnuH-VEyw.jpeg)


Here's a simple implementation of a CNN using TensorFlow and Keras:

In [None]:
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Assuming you have loaded and preprocessed the dataset
# model.fit(X_train, y_train, epochs=10, batch_size=32)
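To see what a convolutional layer and a pooling layer do to the shape of an image, here is a small NumPy sketch of a single 3x3 filter followed by 2x2 max pooling. The helper functions `conv2d_valid` and `max_pool2d` are hypothetical names written for this illustration. They use stride 1 and no padding for the convolution, which matches the Keras defaults, but they handle only one channel and one filter.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel and the patch under it, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping `size` x `size` block."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

image = np.ones((28, 28))   # a dummy 28x28 grayscale image
kernel = np.ones((3, 3))    # a dummy 3x3 filter

feature_map = conv2d_valid(image, kernel)
pooled = max_pool2d(feature_map)

print(feature_map.shape)  # (26, 26)
print(pooled.shape)       # (13, 13)
```

A `Conv2D(32, (3, 3))` layer applies 32 such filters, each with its own learned weights, so its output for a 28x28 input has shape (26, 26, 32).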


#### Exercise 11 
Try changing the number of filters in the convolutional layers or modify the kernel size to see how it affects the performance of the network.

In [None]:
# You can change the number of filters in the convolutional layers by changing the first argument of `layers.Conv2D()`
# For example, you can change the first convolutional layer to have 64 filters like this:

layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1))

# You can also change the kernel size by modifying the second argument of `layers.Conv2D()`
# For example, you can change the kernel size of the first convolutional layer to (5, 5) like this:

layers.Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1))

### 6.3. Recurrent Neural Networks
Recurrent Neural Networks (RNNs) are a type of neural network architecture designed for processing sequences of data. RNNs contain loops that allow information to persist across time steps, making them well-suited for time series data or natural language processing tasks.

![Recurrent Neural Network](https://miro.medium.com/max/1838/1*SKGAqkVVzT6co-sZ29ze-g.png)

![Recurrent Neural Network](https://miro.medium.com/max/1400/1*WMnFSJHzOloFlJHU6fVN-g.gif)


Here's a simple implementation of an RNN using TensorFlow and Keras:

In [None]:
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.SimpleRNN(64, activation='relu', input_shape=(None, 28)),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Assuming you have loaded and preprocessed the dataset
# model.fit(X_train, y_train, epochs=10, batch_size=32)
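The "loop" that lets an RNN carry information across time steps can be written out in a few lines of NumPy. This is a rough sketch of the recurrence inside a simple RNN layer, not the Keras implementation. The random weights and variable names are assumptions, and it uses `tanh` (the Keras default for `SimpleRNN`) rather than the `relu` activation chosen in the model above.

```python
import numpy as np

rng = np.random.default_rng(0)
timesteps, n_features, n_units = 5, 28, 64

x_seq = rng.normal(size=(timesteps, n_features))          # one made-up input sequence
W_x = rng.normal(scale=0.1, size=(n_features, n_units))   # input-to-hidden weights
W_h = rng.normal(scale=0.1, size=(n_units, n_units))      # hidden-to-hidden weights
b = np.zeros(n_units)

h = np.zeros(n_units)  # initial hidden state
states = []
for x_t in x_seq:
    # The new state depends on the current input AND the previous state
    h = np.tanh(x_t @ W_x + h @ W_h + b)
    states.append(h)
states = np.stack(states)

print(states.shape)  # (5, 64): one hidden state per time step
print(h.shape)       # (64,): the final hidden state
```

The final state `h` corresponds to what a `SimpleRNN` layer returns by default, while the full `states` array corresponds to its output when `return_sequences=True`.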


#### Exercise 12
Modify the RNN architecture by adding more recurrent layers or changing the number of units in the recurrent layers.

In [None]:
# You can add more recurrent layers to the model by adding more `layers.SimpleRNN()` instances
# For example, you can add another SimpleRNN layer with 128 units like this:

model = tf.keras.Sequential([
    layers.SimpleRNN(64, activation='relu', input_shape=(None, 28), return_sequences=True),
    layers.SimpleRNN(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])


### 6.4. Generative Adversarial Networks
Generative Adversarial Networks (GANs) are a type of deep learning architecture that consists of two neural networks: a generator and a discriminator. The generator learns to create realistic samples, while the discriminator learns to distinguish between real and generated samples. The two networks are trained in a competitive setting, where the generator tries to fool the discriminator and the discriminator tries to correctly identify the samples.

![Generative Adversarial Network](https://i.imgur.com/6NMdO9u.png)
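The competitive training described above can be expressed as two loss functions. Here is a minimal NumPy sketch that computes them from a few hypothetical discriminator outputs (the probability that a sample is real). The numbers are invented for illustration, and the generator loss shown is the commonly used "non-saturating" variant, which is an assumption rather than something specified in this tutorial.

```python
import numpy as np

def binary_cross_entropy(targets, probs, eps=1e-12):
    # Clip to avoid log(0)
    probs = np.clip(probs, eps, 1.0 - eps)
    return -np.mean(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))

# Hypothetical discriminator outputs: probability that each sample is real
d_real = np.array([0.9, 0.8, 0.95])  # on real samples
d_fake = np.array([0.1, 0.3, 0.2])   # on generated samples

# Discriminator: label real samples as 1 and generated samples as 0
d_loss = (binary_cross_entropy(np.ones_like(d_real), d_real)
          + binary_cross_entropy(np.zeros_like(d_fake), d_fake))

# Generator: wants the discriminator to label generated samples as 1
g_loss = binary_cross_entropy(np.ones_like(d_fake), d_fake)

print("Discriminator loss:", d_loss)
print("Generator loss:", g_loss)
```

In this made-up situation the discriminator is doing well, so its loss is low and the generator's loss is high. During training, the two networks are updated in alternation, each trying to reduce its own loss.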


### 6.5. Transfer Learning
Transfer learning is a technique in deep learning where a pre-trained model is used as a starting point for training a new model. By leveraging the knowledge from the pre-trained model, transfer learning can lead to faster training times and improved performance, especially when dealing with small datasets.

![Transfer Learning](https://www.bbvaaifactory.com/media/post_transfer_learning/gif_transferlearning_prediction_v3.gif)


Here's an example of using transfer learning with TensorFlow and Keras:

In [None]:
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers

# Load the pre-trained VGG16 model without the top
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the base model layers
for layer in base_model.layers:
    layer.trainable = False

# Add custom layers on top of the base model
model = tf.keras.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1024, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Assuming you have loaded and preprocessed the dataset
# model.fit(X_train, y_train, epochs=10, batch_size=32)


#### Exercise 13 
Try using a different pre-trained model, such as **ResNet50**, **InceptionV3**, or **MobileNet**, and see how it affects the performance of your new model.

In [None]:
# To use a different pre-trained model, import the corresponding module and instantiate it as the base model
# For example, to use ResNet50, do the following:

from tensorflow.keras.applications import ResNet50

base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

## 7. Python Libraries for Machine Learning
### 7.1. NumPy
NumPy is an open-source library in Python that provides support for arrays and matrices, along with a large collection of high-level mathematical functions. It is an essential library for machine learning as it provides efficient numerical computations and a wide range of mathematical operations that are commonly used in machine learning algorithms.


**Example: Creating a NumPy array**

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print(arr)
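Much of NumPy's efficiency comes from operating on whole arrays at once instead of looping over elements in Python. Here is a short sketch of three operations that appear in many machine learning algorithms; the arrays and variable names are made up for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
M = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])

# Element-wise arithmetic: applied to every element without a Python loop
scaled = 2 * a + 1

# Broadcasting: the column means (shape (3,)) are subtracted from every row of M
centered = M - M.mean(axis=0)

# Matrix-vector product, as used in linear models and neural network layers
product = M @ a

print(scaled)    # [3. 5. 7.]
print(centered)
print(product)   # [ 7. 11.]
```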

#### Exercise 14 
Create a 3x3 NumPy array containing random numbers.

In [None]:
# Hints:
# - Use np.random.rand() function

# Code sketch
import numpy as np
random_arr = # your code here
print(random_arr)

### 7.2. pandas
pandas is an open-source library that provides data manipulation and data analysis tools in Python. It offers data structures such as Series and DataFrame, which are built on top of NumPy arrays. pandas makes it easy to load, manipulate, and analyze data in a tabular format, which is very important in the data preprocessing stage of machine learning.

**Example: Creating a pandas DataFrame**

In [None]:
import pandas as pd

data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}

df = pd.DataFrame(data)
print(df)
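Beyond creating a DataFrame, typical preprocessing steps include filtering rows and summarizing columns by group. Here is a small sketch of both on a made-up table; the column names and values are assumptions for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "length": [1.0, 3.0, 2.0, 6.0],
})

# Boolean indexing: keep only the rows where the condition holds
long_rows = df_demo[df_demo["length"] > 1.5]

# Group rows by a categorical column and compute the mean of each group
mean_by_species = df_demo.groupby("species")["length"].mean()

print(long_rows)
print(mean_by_species)
```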

#### Exercise 15
Load the Iris dataset from a CSV file using pandas.

In [None]:
# Hints:
# - Use pd.read_csv() function
# - Iris dataset URL: 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

# Code sketch
import pandas as pd
url = # your code here
iris_df = # your code here
print(iris_df.head())

### 7.3. Matplotlib
Matplotlib is a widely used Python library for creating static, animated, and interactive visualizations. It provides a MATLAB-like interface for creating plots, histograms, bar charts, scatter plots, and more. Visualizing data is an important step in understanding the underlying patterns and relationships in the data, which helps in building better machine learning models.

**Example: Plotting a simple line graph using Matplotlib**

In [None]:
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.plot(x, y)
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.title('Line graph')
plt.show()


#### Exercise 16 
Create a bar chart to visualize the distribution of a categorical variable in the Iris dataset.

In [None]:
# Hints:
# - Use plt.bar() function

# Code sketch
import pandas as pd
import matplotlib.pyplot as plt

# Load the Iris dataset
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris_df = pd.read_csv(url, header=None)

# Count the occurrences of each class
class_counts = # your code here

# Create the bar chart
plt.bar(# your code here)
plt.xlabel('Class')
plt.ylabel('Count')
plt.title('Class Distribution')
plt.show()


### 7.4. scikit-learn
scikit-learn is a popular open-source Python library for machine learning. It provides a wide range of algorithms for classification, regression, clustering, and dimensionality reduction, along with tools for model evaluation, parameter tuning, and preprocessing. It is built on NumPy and SciPy and works with Matplotlib for plotting, so it fits naturally alongside the other libraries in this section.

**Example: Training a simple linear regression model using scikit-learn**

In [None]:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic data
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a linear regression model
model = LinearRegression()

# Train the model on the training data
model.fit(X_train, y_train)

# Make predictions on the test data
y_pred = model.predict(X_test)

# Calculate the mean squared error
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")
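The preprocessing and parameter tuning tools mentioned above can be combined with a model in a single pipeline. Here is a minimal sketch, assuming the Iris dataset bundled with scikit-learn and a k-NN classifier. The step names (`"scale"`, `"knn"`) and the candidate values of `n_neighbors` are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X_iris, y_iris = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_iris, y_iris, test_size=0.2, random_state=42, stratify=y_iris
)

# Preprocessing and model in one object, so scaling is fit on training folds only
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Try several values of k with 5-fold cross-validation on the training set
param_grid = {"knn__n_neighbors": [1, 3, 5, 7]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_tr, y_tr)

test_accuracy = search.score(X_te, y_te)
print("Best parameters:", search.best_params_)
print(f"Test accuracy: {test_accuracy:.4f}")
```

Keeping the test set out of the grid search means the reported test accuracy is not influenced by the choice of `n_neighbors`.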

#### Exercise 17 
Train a decision tree classifier on the Iris dataset and evaluate its accuracy.

In [None]:
# Hints:
# - Use DecisionTreeClassifier from sklearn.tree
# - Use train_test_split() function
# - Use accuracy_score() function from sklearn.metrics

# Code sketch
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import pandas as pd

# Load the Iris dataset
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris_df = pd.read_csv(url, header=None)

# Split the data into features and labels
X = iris_df.iloc[:, :-1].values
y = iris_df.iloc[:, -1].values

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a decision tree classifier
clf = DecisionTreeClassifier()

# Train the classifier on the training data
clf.fit(X_train, y_train)

# Make predictions on the test data
y_pred = clf.predict(X_test)

# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

### 7.5. TensorFlow
TensorFlow is an open-source machine learning library developed by Google. It provides a flexible platform for defining and running machine learning algorithms, with support for deep learning and other advanced techniques. TensorFlow is particularly suited for building large-scale neural networks and deploying them on various hardware platforms, such as CPUs, GPUs, and TPUs.

### 7.6. Keras
Keras is a high-level neural networks API written in Python. It was developed with a focus on enabling fast experimentation, providing an easy-to-use interface for building and training deep learning models. Early versions could run on top of TensorFlow, Microsoft Cognitive Toolkit (CNTK), or Theano. Since TensorFlow 2.0, Keras has been integrated into TensorFlow as its official high-level API (`tf.keras`), and Keras 3 additionally supports JAX and PyTorch as backends.

### 7.7. PyTorch
PyTorch is an open-source machine learning library developed by Facebook's AI Research lab (now part of Meta AI). It provides tensor computation and deep neural networks built on a tape-based autograd system. It is widely used for applications such as natural language processing, computer vision, and reinforcement learning. PyTorch is known for its dynamic computational graph, which allows for easier debugging and more flexibility in building complex models.
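Here is a minimal sketch of what the dynamic graph and autograd look like in practice. The function being differentiated is made up for illustration. The point is that ordinary Python control flow decides which operations are recorded, and `backward()` then computes the gradient of whatever was actually executed.

```python
import torch

# A scalar tensor that autograd should track
x = torch.tensor(3.0, requires_grad=True)

# The graph is built as the code runs, so a plain `if` statement works
if x.item() > 0:
    y = x ** 2 + 2 * x
else:
    y = -x

# Backpropagate through the recorded operations
y.backward()

print(y)       # tensor(15., grad_fn=...)
print(x.grad)  # tensor(8.), since dy/dx = 2x + 2 at x = 3
```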

## 8. Machine Learning Examples
### 8.1. Simple Linear Regression Example
In this example, we will use a simple linear regression model to predict the relationship between two variables. We will generate synthetic data and split it into training and testing sets. Then, we will train a linear regression model using scikit-learn and evaluate its performance.



In [None]:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic data
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a linear regression model
model = LinearRegression()

# Train the model on the training data
model.fit(X_train, y_train)

# Make predictions on the test data
y_pred = model.predict(X_test)

# Calculate the mean squared error
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")

# Visualize the data and the fitted regression line
plt.scatter(X, y, color='blue')
plt.plot(X, model.predict(X), color='red', linewidth=2)
plt.xlabel('X')
plt.ylabel('y')
plt.title('Simple Linear Regression Example')
plt.show()
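Because the synthetic data is generated as `y = 2 * X + 1` plus noise, a useful sanity check is to compare the learned parameters with the true slope (2) and intercept (1). Here is a self-contained sketch of that check. The fixed random seed and the variable names are assumptions added so the result is reproducible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data-generating process as above, with a fixed seed for reproducibility
rng = np.random.default_rng(0)
X_syn = rng.random((100, 1))
y_syn = 2 * X_syn + 1 + 0.1 * rng.standard_normal((100, 1))

reg = LinearRegression().fit(X_syn, y_syn)

# y_syn is 2D, so coef_ has shape (1, 1) and intercept_ has shape (1,)
slope = reg.coef_[0, 0]
intercept = reg.intercept_[0]

print(f"Learned slope: {slope:.3f} (true value: 2)")
print(f"Learned intercept: {intercept:.3f} (true value: 1)")
```

The learned values should be close to, but not exactly, 2 and 1, because the noise term shifts the best-fitting line slightly.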


#### Exercise 18
Modify the above code to use a different dataset, and observe the results.

In [None]:
# Hints:
# - Create your own dataset, or use an existing one
# - Ensure the dataset has only one input feature

# Code sketch
# your code here

### 8.2. Image Classification Example
In this example, we will use the famous MNIST dataset, which consists of 70,000 handwritten digits, to train a classifier for image classification.

See the [Class4Extra](https://colab.research.google.com/drive/1R6VS3TdYDitCYi4MF7APtXjrz-v69dnK?usp=sharing) tutorial for this example.

### 8.3. Time Series Forecasting Example
In this example, we will use the famous Airline Passengers dataset, which shows the monthly number of airline passengers between 1949 and 1960, to train a simple time series forecasting model using the ARIMA method.