# Class 3 Notebook – Machine Learning & Deep Learning Basics

This notebook extends **Class 2 – Machine Learning Basics** by going beyond data preprocessing into **training and evaluating simple models**, and then introducing **deep learning basics**.

We will reuse the same preprocessing ideas from `class-2-machine-learning-basics/` and follow a common step structure:

1. Define the objective
2. Install/import libraries
3. Load or create a dataset
4. Separate features and target
5. Train/test split
6. Train model
7. Make predictions
8. Evaluate model
9. (Optional) Visualize and iterate

Run the first code cell to confirm your environment works.

## Run in the browser (no local setup)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adzuci/ai-fundamentals/blob/class-3/class-3-machine-learning-deep-learning-basics/class-3-ml-dl-basics.ipynb)

> Tip: Make sure you have walked through the notebooks in `class-2-machine-learning-basics/` first, especially `data-preprocessing.ipynb` and `scaling-data.ipynb`.

In [51]:
# Environment sanity check
import platform

print("Python:", platform.python_version())
print("OS:", platform.system(), platform.release())

# Core ML / visualization / model libraries
try:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error, r2_score

    print("NumPy:", np.__version__, "| Pandas:", pd.__version__)
except ModuleNotFoundError as exc:
    print("Missing dependency:", exc)
    print("Install with: python -m pip install numpy pandas scikit-learn matplotlib")
    raise

# TODO: In later steps we will add a simple deep learning library import
# (e.g., PyTorch or Keras) and build a tiny feed-forward network on top
# of the same preprocessing pipeline used in Class 2.

Python: 3.10.14
OS: Darwin 25.2.0
NumPy: 2.2.6 | Pandas: 2.3.3


In [52]:
# Create a simple house-prices dataset (size vs price)
X = np.array([300, 700, 950, 1300, 1500, 1900, 2000, 2500, 2800, 3200])
y = np.array([15, 35, 50, 57, 80, 95, 100, 125, 150, 170])

mydata = pd.DataFrame({
    "Size": X,
    "Price": y,
})

print(mydata)

   Size  Price
0   300     15
1   700     35
2   950     50
3  1300     57
4  1500     80
5  1900     95
6  2000    100
7  2500    125
8  2800    150
9  3200    170


In [53]:
print(X)
print(y)


[ 300  700  950 1300 1500 1900 2000 2500 2800 3200]
[ 15  35  50  57  80  95 100 125 150 170]


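In [None]:
# 5. Separate features (X) and target (y)
# Double brackets keep X as a 2-D DataFrame, which scikit-learn expects.
X = mydata[["Size"]]
y = mydata["Price"]

# 6. Split into training (80%) and test (20%) sets
# random_state fixes the shuffle so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)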

# Checking values
print(X_train)
print(y_train)




   Size
5  1900
0   300
7  2500
2   950
9  3200
4  1500
3  1300
6  2000
5     95
0     15
7    125
2     50
9    170
4     80
3     57
6    100
Name: Price, dtype: int64
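The split above keeps 8 of the 10 rows for training and holds out 2 for testing. The sketch below illustrates the two arguments that control this, `test_size` and `random_state`, on a stand-in dataset. The arrays `X_demo` and `y_demo` are hypothetical and exist only for this illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 10 rows, one feature column.
X_demo = np.arange(10).reshape(-1, 1)
y_demo = np.arange(10)

# test_size=0.2 holds out 20% of the rows (2 of 10) for testing.
X_tr_a, X_te_a, y_tr_a, y_te_a = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Repeating the call with the same random_state gives the same split.
X_tr_b, X_te_b, y_tr_b, y_te_b = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

print("Train rows:", len(X_tr_a), "| Test rows:", len(X_te_a))
print("Same test rows on both calls:", np.array_equal(X_te_a, X_te_b))
```

`train_test_split` shuffles rows by default, which is why the training indices in the output are not in order.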


In [55]:
# 7. Train the Linear Regression model
lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)

# (Optional) check learned parameters
print("Intercept:", lin_reg.intercept_)
print("Slope:", lin_reg.coef_)

Intercept: -3.348571428571475
Slope: [0.0526585]
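To see where these numbers come from, the sketch below recomputes the slope and intercept with the closed-form least-squares formulas for a single feature. The training rows are copied from the split output earlier in the notebook; the variable names `size`, `price`, `m`, and `b` are chosen for this illustration only.

```python
import numpy as np

# Training rows copied from the train/test split output (Size, Price).
size = np.array([1900, 300, 2500, 950, 3200, 1500, 1300, 2000], dtype=float)
price = np.array([95, 15, 125, 50, 170, 80, 57, 100], dtype=float)

# Closed-form simple linear regression:
#   m = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
#   b = mean_y - m * mean_x
dx = size - size.mean()
dy = price - price.mean()
m = (dx * dy).sum() / (dx ** 2).sum()
b = price.mean() - m * size.mean()

print("Slope (m):", m)
print("Intercept (b):", b)
```

The printed values should agree with `lin_reg.coef_` and `lin_reg.intercept_` up to floating-point rounding.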


In [None]:
# 8–13. Inspect model, predict, evaluate, and visualize

# 8. View equation values
print("Slope (m):", lin_reg.coef_[0])
print("Intercept (b):", lin_reg.intercept_)
print("Equation: Price = m * Size + b")

# 9. Make predictions on the test set
y_pred = lin_reg.predict(X_test)
print("Actual:", list(y_test))
print("Predicted:", list(y_pred))

# 10. Evaluate model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("MSE:", mse)
print("R^2 Score:", r2)

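# 11. Predict a new house price (e.g., size = 2200)
# Pass a DataFrame with the same column name used in training so that
# scikit-learn does not warn about missing feature names.
new_size = pd.DataFrame({"Size": [2200]})
new_price = lin_reg.predict(new_size)
print("Predicted price for size 2200:", new_price[0])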

# 12–13. Visualization: data points + best-fit line
plt.figure(figsize=(8, 6))
plt.scatter(X, y, label="Data points")
plt.plot(X, lin_reg.predict(X), color="red", linewidth=2, label="Best-fit line")
plt.xlabel("House Size")
plt.ylabel("House Price")
plt.title("Linear Regression")
plt.legend()
plt.show()
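The two metrics used in step 10 can be written out by hand, which makes it clearer what they measure. The sketch below is a minimal NumPy version, assuming the true values are not all identical (otherwise the R^2 denominator is zero). The helper `mse_and_r2` and the toy numbers are hypothetical, not part of the notebook's data.

```python
import numpy as np

def mse_and_r2(y_true, y_pred):
    """Return (MSE, R^2) computed directly from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    # MSE: average of the squared residuals.
    mse = np.mean(residuals ** 2)
    # R^2: 1 - (residual sum of squares / total sum of squares).
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return mse, r2

# Toy values for illustration only.
toy_mse, toy_r2 = mse_and_r2([3, 5, 7], [2, 5, 9])
print("MSE:", toy_mse)
print("R^2:", toy_r2)
```

With only two rows in this notebook's test set, both metrics are rough estimates and can change noticeably with a different `random_state`.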

### 14. Key learning

- **Linear Regression finds a best-fit line** for the relationship between input (Size) and output (Price).
- **The model learns the relationship from data**, using the training set to estimate slope (`m`) and intercept (`b`).
- **The model can predict unseen values** (e.g., a new house size like 2200) by plugging into the learned equation.

In short: Linear Regression predicts continuous values by fitting a straight line that minimizes the sum of squared differences between the actual and predicted values.
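As a small worked example of "plugging into the learned equation", the sketch below applies `Price = m * Size + b` by hand. The slope and intercept are copied from the printed output of the training cell (the slope is rounded as printed), and `predict_price` is a hypothetical helper for illustration.

```python
# Parameters copied from the printed output of the training cell.
m = 0.0526585            # slope, rounded as printed
b = -3.348571428571475   # intercept

def predict_price(size):
    """Apply the learned line: Price = m * Size + b."""
    return m * size + b

print("Predicted price for size 2200:", predict_price(2200))
```

With these rounded parameters the result is about 112.5, in the same units as the `Price` column.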