In [7]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt 

In [None]:
cols = ["fLength", "fWidth", "fSize", "fConc", "fConc1", "fAsym", "fM3Long", "fM3Trans", "fAlpha", "fDist", "class"]
df = pd.read_csv("magic04.data", names=cols)
df.head()


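|   | fLength | fWidth | fSize | fConc | fConc1 | fAsym | fM3Long | fM3Trans | fAlpha | fDist | class |
|---|---------|--------|-------|-------|--------|-------|---------|----------|--------|-------|-------|
| 0 | 28.7967 | 16.0021 | 2.6449 | 0.3918 | 0.1982 | 27.7004 | 22.011 | -8.2027 | 40.092 | 81.8828 | g |
| 1 | 31.6036 | 11.7235 | 2.5185 | 0.5303 | 0.3773 | 26.2722 | 23.8238 | -9.9574 | 6.3609 | 205.261 | g |
| 2 | 162.052 | 136.031 | 4.0612 | 0.0374 | 0.0187 | 116.741 | -64.858 | -45.216 | 76.96 | 256.788 | g |
| 3 | 23.8172 | 9.5728 | 2.3385 | 0.6147 | 0.3922 | 27.2107 | -6.4633 | -7.1513 | 10.449 | 116.737 | g |
| 4 | 75.1362 | 30.9205 | 3.1611 | 0.3168 | 0.1832 | -5.5277 | 28.5525 | 21.8393 | 4.648 | 356.462 | g |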


In [None]:
# Encode the string labels as integers: 'g' (gamma) -> 0, 'h' (hadron) -> 1
df['class'] = df['class'].map({'g': 0, 'h': 1})


## Artificial Intelligence

Artificial Intelligence is an area of computer science where the goal is for computers and machines to mimic human behavior.

---

## Machine Learning

Machine learning is a subdomain of computer science that focuses on algorithms which help a computer learn from data without being explicitly programmed.

Machine learning is also a subset of Artificial Intelligence (AI) that tries to solve specific problems and make predictions using data. It uses statistical techniques to give computers the ability to "learn" from data, improving their performance on a specific task over time.



## Types of Machine Learning

1. **Supervised Learning**  
   The model is trained on labeled data, where the input data is paired with the correct output. The model learns to map inputs to outputs and can make predictions on new, unseen data.

2. **Unsupervised Learning**  
   The model is trained on unlabeled data, where the input data does not have corresponding output labels. The model learns to find patterns and relationships in the data without explicit guidance.

3. **Reinforcement Learning**  
   The model learns by interacting with an environment and receiving feedback in the form of rewards or penalties to improve its future behavior.


## Supervised Learning

![Basic Machine Learning Diagram](Machine_Learning_basic.png)  
*Figure 1: Basic structure of machine learning approaches.*

Supervised learning trains a model on labeled data, where each input is paired with a known output, so that the model learns the mapping from inputs to outputs. It is used when the goal is to predict outcomes for new inputs based on historical examples.

---

### Types of Input Data (Feature Vectors)

- **Nominal (Categorical) Data**  
  Qualitative data grouped into categories without any inherent order.  
  _Example: colors, brands, animal species._

- **Ordinal Data**  
  Categorical data that has a defined, meaningful order or ranking.  
  _Example: customer satisfaction levels (e.g., low, medium, high)._

- **Quantitative (Numerical) Data**  
  Data expressed in numbers, allowing for mathematical operations.  
  _Example: height, weight, temperature._
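
Because ordinal categories carry an order, one simple option is to map them to integers that preserve that order. The snippet below is a minimal sketch; the `levels` list and the `ratings` values are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Hypothetical ordinal scale, listed from lowest to highest.
levels = ["low", "medium", "high"]

# Hypothetical survey responses.
ratings = pd.Series(["medium", "low", "high", "low"])

# Map each level to its rank so that low < medium < high is preserved.
rank_of = {level: rank for rank, level in enumerate(levels)}
encoded = ratings.map(rank_of)

print(encoded.tolist())  # [1, 0, 2, 0]
```

This kind of integer mapping is not suitable for nominal data, because it would impose an order that does not exist.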

---

### Encoding Categorical Data

- **One-Hot Encoding**  
  A technique used to convert categorical variables into a binary matrix. Each unique category is represented as a vector with one element set to 1 (hot) and all others to 0 (cold).  
  _Used to make categorical data usable by machine learning algorithms._
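
As a rough illustration of one-hot encoding, the sketch below uses `pandas.get_dummies` on a toy column. The column name `color` and its values are made up for this example and are not part of the dataset used in this notebook.

```python
import pandas as pd

# Hypothetical toy data: one nominal feature with three categories.
toy = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Create one 0/1 column per unique category.
# dtype=int yields 0/1 integers instead of booleans.
one_hot = pd.get_dummies(toy["color"], dtype=int)

print(one_hot)
```

Each row of `one_hot` contains exactly one 1, in the column of that row's category.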

---

### Common Supervised Learning Tasks

- **Classification**  
  Predicts a discrete label or category.  
  - **Binary classification**: Two classes (e.g., spam vs. not spam)  
  - **Multiclass classification**: More than two classes (e.g., digit recognition)

- **Regression**  
  Predicts a continuous numerical value.  
  _Example: house price prediction, stock price forecasting_

---

### Data Splitting Strategy

To evaluate and train supervised learning models effectively, the dataset is commonly split into three subsets:

1. **Training Set**  
   Used to train the model. The model learns by comparing its predictions to the true labels and adjusting parameters to minimize error.

2. **Validation Set**  
   Used during training to tune hyperparameters and to detect overfitting. It serves as a "reality check" on data the model was not trained on, indicating whether the model generalizes.

3. **Test Set**  
   Used only after model training and validation to evaluate final performance. It measures how well the model performs on completely new, unseen data.
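
One possible way to produce the three subsets is to shuffle the rows and cut them at fixed fractions. The sketch below assumes a 60/20/20 split and uses a small synthetic `data` frame as a stand-in; both the proportions and the data are assumptions for illustration, not choices made elsewhere in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in dataset: 100 rows, one feature, one binary label.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "x": rng.normal(size=100),
    "label": rng.integers(0, 2, size=100),
})

# Shuffle the rows so each subset is a random sample of the data.
shuffled = data.sample(frac=1, random_state=0).reset_index(drop=True)

# Cut at 60% and 80% of the rows: train / validation / test.
n = len(shuffled)
train_end = int(0.6 * n)
valid_end = int(0.8 * n)

train_set = shuffled.iloc[:train_end]
valid_set = shuffled.iloc[train_end:valid_end]
test_set = shuffled.iloc[valid_end:]

print(len(train_set), len(valid_set), len(test_set))  # 60 20 20
```

The subsets are disjoint, so no row used for training leaks into validation or testing.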

---

### Loss Functions

Loss functions measure how far off a model's predictions are from the actual values. Three common types are:

- **L1 Loss (Mean Absolute Error, MAE)**  
  $$
  \text{Loss} = \left| y_{\text{true}} - y_{\text{pred}} \right|
  $$  
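  This is the absolute difference for a single prediction; averaging it over all samples gives the MAE. Because it grows only linearly with the error, it is more robust to outliers than L2 loss.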

- **L2 Loss (Mean Squared Error, MSE)**  
  $$
  \text{Loss} = \left( y_{\text{true}} - y_{\text{pred}} \right)^2
  $$  
  This is the squared difference for a single prediction; averaging it over all samples gives the MSE. Squaring penalizes larger errors more heavily.

- **Binary Cross-Entropy Loss**  
  $$
  \text{Loss} = -\left[ y \cdot \log(p) + (1 - y) \cdot \log(1 - p) \right]
  $$  
  Here $y$ is the true label (0 or 1) and $p$ is the predicted probability that the label is 1. Suited to binary classification tasks where the model outputs a probability. A confident but wrong prediction (for example, $p$ close to 1 when $y = 0$) incurs a very large loss.
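
The three formulas above can be sketched directly in NumPy. The function names, the sample values, and the `eps` clipping (added only to avoid `log(0)`) are assumptions of this example rather than part of any particular library.

```python
import numpy as np

def l1_loss(y_true, y_pred):
    """Per-sample absolute error."""
    return np.abs(y_true - y_pred)

def l2_loss(y_true, y_pred):
    """Per-sample squared error."""
    return (y_true - y_pred) ** 2

def binary_cross_entropy(y, p, eps=1e-12):
    """Per-sample binary cross-entropy; p is clipped to avoid log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical labels and predicted probabilities.
y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.4])

print("L1: ", l1_loss(y_true, y_pred))
print("L2: ", l2_loss(y_true, y_pred))
print("BCE:", binary_cross_entropy(y_true, y_pred))
```

Taking `np.mean` of the L1 or L2 values gives the MAE or MSE over the batch.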

---

**Which loss function should you use?**

- Use **L1 Loss (MAE)** for regression tasks where you want a model that is more robust to outliers.
- Use **L2 Loss (MSE)** for regression tasks where you want to penalize large errors more strongly.
- Use **Binary Cross-Entropy Loss** for binary classification problems where the model outputs a probability (e.g., logistic regression, or a neural network with a sigmoid output).


---

### Metrics of Performance

Metrics are used to evaluate how well a machine learning model performs on a given task. Different tasks (e.g., classification vs regression) use different evaluation metrics.

---

- **Accuracy**  
  $$
  \text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}
  $$  
  Accuracy measures the proportion of correctly predicted instances out of all predictions made.  
  - Commonly used in **classification tasks**.  
  - Works well when classes are **balanced**, but may be misleading if classes are **imbalanced**.
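
To see why accuracy can mislead on imbalanced classes, consider a sketch with made-up labels in which only 5% of the samples are positive and a trivial "model" always predicts the majority class.

```python
import numpy as np

# Hypothetical labels: 95 negatives (0) and 5 positives (1).
y_true = np.array([0] * 95 + [1] * 5)

# A trivial "model" that always predicts the majority class.
y_pred = np.zeros_like(y_true)

# Fraction of predictions that match the true labels.
accuracy = np.mean(y_true == y_pred)

print(accuracy)  # 0.95
```

The accuracy looks high even though not a single positive sample is detected.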

---

- **Precision**  
  $$
  \text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}
  $$  
  Measures how many of the predicted positives are actually correct.  
  - Useful when **false positives are costly** (e.g., spam detection).

---

- **Recall (Sensitivity or True Positive Rate)**  
  $$
  \text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}
  $$  
  Measures how many actual positives were correctly predicted.  
  - Useful when **false negatives are costly** (e.g., medical diagnosis).

---

- **F1 Score**  
  $$
  \text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
  $$  
  Harmonic mean of precision and recall.  
  - Balances false positives and false negatives in a single number.  
  - Is low whenever either precision or recall is low, so a model cannot score well by optimizing only one of them.
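
Precision, recall, and F1 can all be computed from counts of true positives, false positives, and false negatives. The labels and predictions below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical true labels and model predictions (1 = positive class).
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 1])

tp = np.sum((y_pred == 1) & (y_true == 1))  # predicted positive, actually positive
fp = np.sum((y_pred == 1) & (y_true == 0))  # predicted positive, actually negative
fn = np.sum((y_pred == 0) & (y_true == 1))  # predicted negative, actually positive

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn)             # 3 1 1
print(precision, recall, f1)  # 0.75 0.75 0.75
```

This sketch does not guard against division by zero, which occurs when the model predicts no positives or when there are no actual positives.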

---

- **Mean Absolute Error (MAE)**  
  $$
  \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
  $$  
  Average of the absolute differences between actual and predicted values.  
  - Used in **regression tasks**.  
  - Less sensitive to outliers than MSE.

---

- **Mean Squared Error (MSE)**  
  $$
  \text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
  $$  
  Average of the squared differences between actual and predicted values.  
  - Used in **regression tasks**.  
  - **Penalizes larger errors** more heavily than MAE.
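
The difference in outlier sensitivity between MAE and MSE shows up in a small numeric comparison. The target values and both sets of predictions are hypothetical; the second set differs from the first only by one large error.

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))

def mse(y, y_hat):
    """Mean squared error."""
    return np.mean((y - y_hat) ** 2)

# Hypothetical targets.
y = np.array([10.0, 12.0, 11.0, 13.0])

# Every prediction is off by exactly 1.
y_hat_even = np.array([11.0, 11.0, 12.0, 12.0])

# Same as above, except the last prediction is off by 10.
y_hat_outlier = np.array([11.0, 11.0, 12.0, 23.0])

print(mae(y, y_hat_even), mse(y, y_hat_even))        # 1.0 1.0
print(mae(y, y_hat_outlier), mse(y, y_hat_outlier))  # 3.25 25.75
```

A single large error raises the MAE from 1.0 to 3.25 but raises the MSE from 1.0 to 25.75.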

---


