# <font color="#418FDE" size="6.5" uppercase>**Loss And Error**</font>

>Last update: 20260201.
    
By the end of this Lecture, you will be able to:
- Define a loss function as a numerical measure of prediction error. 
- Compute simple loss values for individual predictions in regression and classification examples. 
- Explain why training focuses on reducing average loss across a dataset. 


## **1. Understanding Loss Functions**

### **1.1. Quantifying Prediction Error**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_01_01.jpg?v=1769959794" width="250">



>* Loss is a number that measures how wrong a prediction is
>* Small loss means an accurate prediction; large loss means an inaccurate one

>* Different tasks weigh the same size of error differently
>* A numeric loss gives training a clear goal: make errors smaller

>* Loss is a penalty for bad predictions
>* Penalties guide training to reduce costly mistakes



### **1.2. Task Specific Loss**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_01_02.jpg?v=1769959806" width="250">



>* Loss depends on the specific prediction task
>* The loss formula should reflect which mistakes count as serious

>* Different tasks value errors very differently
>* Loss functions weight costly mistakes more heavily

>* Loss is shaped to match task success
>* Design loss to mirror real-world evaluation criteria
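
To make the idea of weighting costly mistakes more heavily concrete, here is a minimal sketch of an asymmetric loss. The function name `asymmetric_loss`, the demand-forecasting scenario, and the weights 3.0 and 1.0 are assumptions chosen for illustration; a real task would set them from its own costs.

```python
def asymmetric_loss(prediction: float, target: float,
                    under_weight: float = 3.0,
                    over_weight: float = 1.0) -> float:
    """Weighted absolute error where predicting too low costs more than too high."""
    error = prediction - target
    if error < 0:
        # Prediction was too low: apply the heavier (assumed) penalty.
        return under_weight * (-error)
    # Prediction was too high (or exact): apply the lighter penalty.
    return over_weight * error


# Hypothetical scenario: true demand was 100 units.
target_demand = 100.0
loss_under = asymmetric_loss(90.0, target_demand)   # 10 units too low
loss_over = asymmetric_loss(110.0, target_demand)   # 10 units too high

print("Loss when predicting too low: ", loss_under)
print("Loss when predicting too high:", loss_over)
```

Both predictions miss by the same amount, yet the loss differs because this task is assumed to treat running out of stock as the more serious mistake.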



### **1.3. Sample Loss Calculations**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_01_03.jpg?v=1769959823" width="250">



>* Use examples to turn prediction errors into numbers
>* A simple loss is the size of the gap between predicted and actual values

>* For a house price, loss can be the absolute difference between predicted and actual price
>* The result is a single number showing how severe the mistake is

>* Classification loss scores correct versus incorrect labels
>* Loss numbers make model performance comparable and visible
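
The sample calculations above can be sketched in a few lines. The helper names `price_loss` and `label_loss`, the prices, and the spam labels are hypothetical; absolute difference and a 0/1 score are just two simple choices of loss.

```python
def price_loss(predicted_price: float, actual_price: float) -> float:
    """Regression example: loss is the absolute gap between the two prices."""
    return abs(predicted_price - actual_price)


def label_loss(predicted_label: str, actual_label: str) -> int:
    """Classification example: 0 for a correct label, 1 for an incorrect one."""
    return 0 if predicted_label == actual_label else 1


# Hypothetical house: predicted 310,000 but it sold for 300,000.
house_loss = price_loss(310_000, 300_000)

# Hypothetical emails: one labeled correctly, one labeled incorrectly.
spam_loss_right = label_loss("spam", "spam")
spam_loss_wrong = label_loss("not spam", "spam")

print("House price loss:", house_loss)
print("Correct label loss:", spam_loss_right)
print("Incorrect label loss:", spam_loss_wrong)
```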



## **2. Understanding Regression Loss**

### **2.1. Comparing Predictions and Targets**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_02_01.jpg?v=1769959839" width="250">



>* Loss compares model prediction with actual outcome
>* Smaller gaps mean smaller loss; larger gaps mean larger loss

>* Car mileage examples show prediction versus reality
>* Loss turns prediction gaps into single numeric scores

>* Compare prediction and target for every example
>* Loss gives shared scale for errors across domains
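
As a small illustration of comparing predictions with targets for every example, the sketch below uses made-up car mileage values in miles per gallon. The array names and numbers are assumptions, and absolute error is used here only as one simple way to turn each gap into a score.

```python
import numpy as np

# Hypothetical fuel economy data in miles per gallon (mpg).
actual_mpg = np.array([30.0, 22.0, 41.0, 18.0])
predicted_mpg = np.array([28.0, 25.0, 40.0, 18.0])

# Gap between prediction and target for every example (can be negative).
gaps = predicted_mpg - actual_mpg

# The absolute value turns each gap into a non-negative loss.
per_example_loss = np.abs(gaps)

# Averaging gives one score for the whole set of predictions.
mean_absolute_error = per_example_loss.mean()

print("Gaps (pred - actual):", gaps)
print("Loss per example:", per_example_loss)
print("Mean absolute error:", mean_absolute_error)
```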



### **2.2. Why Squared Errors**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_02_02.jpg?v=1769959858" width="250">



>* Squaring makes every error non-negative, so errors of opposite sign cannot cancel
>* It penalizes large mistakes much more than small ones

>* Squared error changes smoothly as predictions change
>* Smooth loss gives clear gradients for training

>* Large prediction errors can be especially harmful
>* Doubling an error quadruples its squared loss
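
The following sketch shows why a smooth loss is convenient for training. For a single prediction, the squared error $(\hat{y} - y)^2$ has the gradient $2(\hat{y} - y)$ with respect to the prediction, so each update can move the prediction toward the target. The starting value, target, and learning rate are arbitrary choices for illustration, and updating a prediction directly (rather than model parameters) is a simplification.

```python
def squared_error(prediction: float, target: float) -> float:
    """Squared difference between prediction and target."""
    return (prediction - target) ** 2


def squared_error_gradient(prediction: float, target: float) -> float:
    """Derivative of the squared error with respect to the prediction."""
    return 2.0 * (prediction - target)


target = 5.0
prediction = 1.0       # arbitrary starting guess
learning_rate = 0.25   # arbitrary step size for this toy example

history = []
for step in range(5):
    # Record the current prediction and its loss before updating.
    history.append((prediction, squared_error(prediction, target)))
    # Step against the gradient to reduce the loss.
    prediction -= learning_rate * squared_error_gradient(prediction, target)

for step, (pred, loss) in enumerate(history):
    print(f"step {step}: prediction={pred:.4f}, loss={loss:.4f}")
```

The gradient is large when the prediction is far from the target and shrinks as it gets closer, which is why the steps become smaller over time.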



### **2.3. Impact of Large Errors**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_02_03.jpg?v=1769959871" width="250">



>* Squaring errors makes big mistakes count much more
>* Training updates focus strongly on fixing huge errors

>* Squared loss highlights rare, very large mistakes
>* This helps avoid dangerous errors in fields like medicine and finance

>* Outliers and bad data can dominate training
>* Use robust losses, data cleaning, and preprocessing to limit their influence



```python
#@title Python Code - Impact of Large Errors

# This script shows squared error impact visually.
# We compare small and large regression prediction errors.
# Notice how large errors dominate the total loss.

# Import required numerical and plotting libraries.
import numpy as np
import matplotlib.pyplot as plt

# Create true house prices for a tiny toy dataset.
true_prices = np.array([200_000, 220_000, 250_000], dtype=float)

# Create mostly good predictions with one huge mistake.
pred_prices = np.array([205_000, 218_000, 400_000], dtype=float)

# Compute prediction errors as predicted minus true values.
errors = pred_prices - true_prices

# Compute squared errors to highlight large mistakes.
squared_errors = errors ** 2

# Print values to compare errors and squared errors.
print("True prices:", true_prices)
print("Predicted prices:", pred_prices)
print("Errors (pred - true):", errors)
print("Squared errors:", squared_errors)
print("Total squared loss:", squared_errors.sum())

# Prepare labels for each house example on x axis.
example_labels = ["House 1", "House 2", "House 3"]

# Create a bar chart comparing squared errors per house.
plt.figure(figsize=(6, 4))
plt.bar(example_labels, squared_errors, color=["green", "green", "red"])

# Add title and axis labels explaining large error impact.
plt.title("Squared error makes one large mistake dominate loss")
plt.ylabel("Squared error value")

# Adjust layout and display the single plot.
plt.tight_layout()
plt.show()
```
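
To illustrate how the choice of loss affects robustness to outliers, the sketch below searches for the single constant prediction that minimizes average squared error and the one that minimizes average absolute error. The prices are made up, with the 2,000,000 entry standing in for a hypothetical data-entry mistake, and absolute error is used only as one example of a loss that is less sensitive to outliers.

```python
import numpy as np

# Hypothetical house prices; the last value is an assumed bad data point.
prices = np.array([200_000, 220_000, 250_000, 240_000, 2_000_000], dtype=float)

# Candidate constant predictions to try, in steps of 1,000.
candidates = np.arange(200_000, 2_000_001, 1_000, dtype=float)

# Error of every candidate against every price (rows: candidates, columns: prices).
errors = candidates[:, None] - prices[None, :]

# Average loss of each candidate under the two loss functions.
mse_per_candidate = (errors ** 2).mean(axis=1)
mae_per_candidate = np.abs(errors).mean(axis=1)

# Pick the candidate with the lowest average loss in each case.
best_for_squared = candidates[np.argmin(mse_per_candidate)]
best_for_absolute = candidates[np.argmin(mae_per_candidate)]

print("Best constant under squared error:", best_for_squared)
print("Best constant under absolute error:", best_for_absolute)
print("Mean of prices:", prices.mean())
print("Median of prices:", np.median(prices))
```

Under squared error the best constant matches the mean, which the outlier pulls far above the typical price. Under absolute error it matches the median, which stays close to the ordinary houses.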



## **3. Understanding Classification Loss**

### **3.1. Correct vs Incorrect Labels**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_03_01.jpg?v=1769959916" width="250">



>* Loss depends on matching predicted and true labels
>* This simple right-or-wrong view underlies classification loss

>* We judge models over many examples together
>* Average loss guides training to reduce overall mistakes

>* Average loss links training to real performance
>* Rewards consistent accuracy across many real situations



### **3.2. Counting Classification Errors**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_03_02.jpg?v=1769959932" width="250">



>* Count how many predictions are classified incorrectly
>* Divide the number of errors by the number of examples to get the error rate

>* Average error links model performance to real impact
>* Training aims to lower this overall error rate

>* Optimize average error, not single mistakes
>* An error signal averaged over all examples improves overall accuracy
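
A short sketch of counting classification errors and turning the count into an error rate follows. The labels, the two sets of predictions, and the names `model_a_preds` and `model_b_preds` are invented for illustration.

```python
import numpy as np

# Hypothetical true labels for ten examples (binary classes 0 and 1).
true_labels = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])

# Hypothetical predictions from two models on the same examples.
model_a_preds = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
model_b_preds = np.array([0, 1, 1, 0, 0, 1, 1, 0, 1, 0])


def error_rate(predictions: np.ndarray, targets: np.ndarray) -> float:
    """Count mismatched labels and divide by the number of examples."""
    num_errors = int(np.sum(predictions != targets))
    return num_errors / targets.size


error_rate_a = error_rate(model_a_preds, true_labels)
error_rate_b = error_rate(model_b_preds, true_labels)

print("Model A error rate:", error_rate_a)
print("Model B error rate:", error_rate_b)
```

Model B is correct on the last two examples, where Model A is wrong, yet Model A makes fewer mistakes overall. Comparing average error rather than individual cases is what identifies it as the better model on this data.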



### **3.3. Confidence Weighted Penalties**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_05/Lecture_A/image_03_03.jpg?v=1769959944" width="250">



>* Loss considers prediction confidence, not just correctness
>* Confident mistakes get larger penalties than uncertain ones

>* High-confidence wrong predictions get large loss penalties
>* Average loss reflects both how often and how confidently the model is wrong

>* Minimizing average confidence-weighted loss encourages better-calibrated probabilities
>* This leads to safer, more reliable predictions



```python
#@title Python Code - Confidence Weighted Penalties

# This script illustrates confidence weighted classification penalties.
# We compare simple accuracy with confidence sensitive loss values.
# Focus on average loss across several small prediction examples.

# Import numpy for numerical arrays and calculations.
import numpy as np

# Define the true class label (0, 1, or 2) for each of five examples.
true_labels = np.array([0, 1, 2, 1, 0])

# Define the model's predicted probabilities; each row is one example and sums to 1.
pred_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
    [0.4, 0.3, 0.3],
])

# Check that there is one row of probabilities per label.
assert pred_probs.shape[0] == true_labels.shape[0]

# Compute predicted labels using the highest probability for each example.
pred_labels = np.argmax(pred_probs, axis=1)

# Compute simple accuracy, ignoring confidence information entirely.
accuracy = np.mean(pred_labels == true_labels)

# Define a small epsilon used as the lower clipping bound.
epsilon = 1e-9

# Gather the probability assigned to the correct class for each example.
correct_class_probs = pred_probs[np.arange(true_labels.size), true_labels]

# Clip probabilities so the logarithm never receives zero.
correct_class_probs = np.clip(correct_class_probs, epsilon, 1.0)

# Compute negative log likelihood as the confidence weighted loss.
loss_values = -np.log(correct_class_probs)

# Compute average loss across all examples in this dataset.
average_loss = float(np.mean(loss_values))

# Print accuracy and average loss to compare training objectives.
print("Simple accuracy ignoring confidence:", float(accuracy))

# Print individual loss values to show confidence weighted penalties.
print("Confidence weighted loss values:", loss_values.tolist())

# Print average loss summarizing overall confidence quality.
print("Average confidence weighted loss:", average_loss)
```
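
To isolate the effect of confidence, the sketch below evaluates the negative log loss for a range of probabilities assigned to the true class in a two-class problem. The helper name `log_loss_single`, the probability values, and the descriptions attached to them are illustrative assumptions.

```python
import math


def log_loss_single(prob_of_true_class: float) -> float:
    """Negative log of the probability the model gave to the correct class."""
    return -math.log(prob_of_true_class)


# Probability assigned to the TRUE class, with a rough description of each case.
cases = [
    (0.90, "confident and correct"),
    (0.60, "hesitant and correct"),
    (0.40, "hesitant and wrong"),
    (0.10, "confident and wrong"),
    (0.01, "very confident and wrong"),
]

losses = [log_loss_single(prob) for prob, _ in cases]

for (prob, description), loss in zip(cases, losses):
    print(f"p(true class)={prob:.2f}  loss={loss:.3f}  ({description})")
```

The loss rises slowly while the model is merely unsure and climbs steeply as it commits to the wrong answer, which is how confident mistakes end up dominating the average.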




# <font color="#418FDE" size="6.5" uppercase>**Loss And Error**</font>


In this lecture, you learned to:
- Define a loss function as a numerical measure of prediction error. 
- Compute simple loss values for individual predictions in regression and classification examples. 
- Explain why training focuses on reducing average loss across a dataset. 

In the next Lecture (Lecture B), we will go over 'Fitting And Overfitting'.