In machine learning, evaluating the performance of a model is crucial to understanding how well it generalizes to new data. Three common metrics used for classification tasks are precision, recall, and accuracy.

### 1. Accuracy
Definition: Accuracy is the ratio of correctly predicted instances to the total instances in the dataset.
Formula:

```
Accuracy = (True Positives + True Negatives) / Total Instances
```

Usage: Accuracy is useful when the classes are balanced, meaning the number of instances of each class is roughly equal.

Example: In a spam detection system, if out of 100 emails, 90 are correctly classified (either as spam or not spam), the accuracy is 90%.
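The arithmetic in this example can be sketched in Python. The text only states that 90 of 100 emails are classified correctly, so the split into 40 true positives and 50 true negatives (and 4 false positives, 6 false negatives) is an assumption made purely for illustration, as is the helper name `accuracy`.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no instances to evaluate")
    return (tp + tn) / total


# Assumed breakdown of the 100 emails: 90 correct (40 + 50), 10 wrong (4 + 6).
spam_accuracy = accuracy(tp=40, tn=50, fp=4, fn=6)
print(f"Accuracy: {spam_accuracy:.2f}")  # Accuracy: 0.90
```

Any other split of the 90 correct predictions gives the same result, since accuracy only depends on the total number of correct predictions.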

### 2. Precision
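Definition: Precision is the ratio of correctly predicted positive observations to the total predicted positives.
Formula:

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$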

Usage: Precision is important when the cost of a false positive is high. For example, in a spam detection system, high precision means that most of the emails classified as spam are actually spam.

Example: If the model predicts 30 emails as spam and 25 of them are actually spam, the precision is $\frac{25}{30} \approx 0.83$.
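As a minimal sketch, the same calculation can be written as a small Python function. The name `precision` and the choice to raise an error when there are no positive predictions are assumptions for this illustration, not part of any library.

```python
def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that are truly positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        raise ValueError("the model made no positive predictions")
    return tp / predicted_positive


# 30 emails flagged as spam, 25 of them truly spam, so 5 are false positives.
spam_precision = precision(tp=25, fp=5)
print(f"Precision: {spam_precision:.2f}")  # Precision: 0.83
```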


### 3. Recall
Definition: Recall (or Sensitivity) is the ratio of correctly predicted positive observations to all observations that actually belong to the positive class.
Formula:

$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$


Usage: Recall is important when the cost of a false negative is high. For example, in a spam detection system, high recall means that most of the actual spam emails are correctly identified as spam.

Example: If there are 50 spam emails and the model correctly identifies 40 of them, the recall is $\frac{40}{50} = 0.8$.
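The recall example can be checked with a similarly small sketch. The function name `recall` is hypothetical, and the 10 false negatives follow from the 50 actual spam emails minus the 40 that were caught.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual positives that the model found."""
    actual_positive = tp + fn
    if actual_positive == 0:
        raise ValueError("there are no actual positives")
    return tp / actual_positive


# 50 actual spam emails, 40 caught by the model, so 10 are false negatives.
spam_recall = recall(tp=40, fn=10)
print(f"Recall: {spam_recall:.2f}")  # Recall: 0.80
```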

### Example Scenario
In the spam detection example, every prediction falls into one of four categories:

* True Positives (TP): Emails correctly identified as spam.
* True Negatives (TN): Emails correctly identified as not spam.
* False Positives (FP): Emails incorrectly identified as spam.
* False Negatives (FN): Emails incorrectly identified as not spam.
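To make these four categories concrete, the sketch below tallies them from a pair of label lists. The labels are made up for illustration (1 = spam, 0 = not spam), and `confusion_counts` is a hypothetical helper rather than a library function.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a binary classification problem."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            # Predicted spam: correct only if the email really is spam.
            key = "TP" if actual == positive else "FP"
        else:
            # Predicted not spam: a miss if the email really is spam.
            key = "FN" if actual == positive else "TN"
        counts[key] += 1
    return counts


# Hypothetical labels: 1 = spam, 0 = not spam.
actual_labels = [1, 1, 0, 0, 1, 0]
predicted_labels = [1, 0, 0, 1, 1, 0]
print(confusion_counts(actual_labels, predicted_labels))
# {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```

The four counts always sum to the number of instances, which is why accuracy, precision, and recall can all be derived from them.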

### Choosing the Right Metric
The choice between precision, recall, and accuracy depends on the specific problem and its context:

* Accuracy is a good measure when classes are balanced and there is a similar cost for false positives and false negatives.
* Precision is preferred when the cost of false positives is high, such as in spam detection or fraud detection.
* Recall is important when the cost of false negatives is high, such as in medical diagnosis or fault detection systems.
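The caveat about balanced classes can be illustrated with a small sketch. The dataset below is invented for this purpose: 95 legitimate emails, 5 spam emails, and a trivial "model" that never flags anything as spam.

```python
# Hypothetical imbalanced dataset: 95 legitimate emails (0) and 5 spam emails (1).
y_true_imbalanced = [0] * 95 + [1] * 5
# A trivial "model" that predicts "not spam" for every email.
y_pred_all_negative = [0] * 100

pairs = list(zip(y_true_imbalanced, y_pred_all_negative))
correct = sum(t == p for t, p in pairs)
tp = sum(t == 1 and p == 1 for t, p in pairs)
fn = sum(t == 1 and p == 0 for t, p in pairs)

accuracy_value = correct / len(pairs)
recall_value = tp / (tp + fn)

print(f"Accuracy: {accuracy_value:.2f}")  # Accuracy: 0.95
print(f"Recall: {recall_value:.2f}")      # Recall: 0.00
```

Accuracy looks strong even though the model misses every spam email, which is why recall (or precision, depending on which error is costlier) is the more informative metric when classes are imbalanced.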

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, confusion_matrix

# Predicted and ground-truth labels (1 = positive class, 0 = negative class)
y_pred = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_true = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy:.2f}")

precision = precision_score(y_true, y_pred)
print(f"Precision: {precision:.2f}")

recall = recall_score(y_true, y_pred)
print(f"Recall: {recall:.2f}")

# Rows are actual classes, columns are predicted classes: [[TN, FP], [FN, TP]]
conf_matrix = confusion_matrix(y_true, y_pred)
print("Confusion Matrix:")
print(conf_matrix)
```

Output:

```
Accuracy: 0.80
Precision: 0.80
Recall: 0.80
Confusion Matrix:
[[4 1]
 [1 4]]
```
