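# 1. Accuracy
**Definition:** The ratio of correctly predicted instances to the total number of instances.

$$ Accuracy = \frac{True\ Positives\ (TP) + True\ Negatives\ (TN)}{Total\ Samples} $$

Here Total Samples = TP + TN + FP + FN, where FP and FN are the false positives and false negatives.

**When to Use:** Suitable when the dataset is balanced, that is, when the classes occur in roughly equal proportions.
Not recommended for **imbalanced** datasets, where a model that always predicts the majority class scores high accuracy while missing every minority-class instance.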
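
To illustrate why accuracy can mislead on imbalanced data, here is a minimal sketch. The labels are made up for illustration (5 positives, 95 negatives), and the "model" is a hypothetical one that always predicts the negative class.

```python
# Made-up, heavily imbalanced labels: 5 positives and 95 negatives.
y_true = [1] * 5 + [0] * 95
# Hypothetical model that always predicts the majority (negative) class.
y_pred = [0] * 100

# Accuracy: fraction of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Number of actual positives the model managed to detect.
positives_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
total_positives = sum(y_true)

print(f"accuracy = {accuracy:.2f}")
print(f"positives found = {positives_found} of {total_positives}")
```

The accuracy comes out at 0.95 even though none of the 5 positive instances is detected, which is why accuracy alone is not a reliable summary here.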

#### **O'zbekcha:**
**Aniqlik (Accuracy):**
To'g'ri taxmin qilingan holatlar sonining umumiy holatlar soniga nisbati.

$$ Aniqlik = \frac{To'g'ri\ Ijobiy\ (TP) + To'g'ri\ Salbiy\ (TN)}{Umumiy\ Namuna\ Soni} $$

**Qachon qo'llaniladi:**
Ma'lumotlar to'plami muvozanatli bo'lsa (sinflar teng taqsimlangan).
**Muvozanatsiz** ma'lumotlar to'plamlari uchun tavsiya etilmaydi.

# 2. Precision (Positive Predictive Value)
**Definition:**
The proportion of correctly predicted positive instances out of all predicted positive instances.

$$ Precision = \frac{True\ Positives\ (TP)}{True\ Positives\ (TP) + False\ Positives\ (FP)} $$

**When to Use:** When the cost of false positives is high (e.g., spam detection, where marking a legitimate email as spam is undesirable).

#### **O'zbekcha:**
**Aniqlik (Precision):**
To'g'ri ijobiy deb topilgan holatlarning, barcha ijobiy deb taxmin qilingan holatlarga nisbati.

$$ Aniqlik = \frac{To'g'ri\ Ijobiy\ (TP)}{To'g'ri\ Ijobiy\ (TP) + Noto'g'ri\ Ijobiy\ (FP)} $$

**Qachon qo'llaniladi:**
Noto'g'ri ijobiy natijalar narxi yuqori bo'lsa (masalan, spamni aniqlashda haqiqiy xat noto'g'ri spam deb belgilanmasligi kerak).

# 3. Recall (Sensitivity/True Positive Rate)
**Definition:**
The proportion of correctly predicted positive instances out of all actual positive instances.

$$ Recall = \frac{True\ Positives\ (TP)}{True\ Positives\ (TP) + False\ Negatives\ (FN)} $$

**When to Use:**
When the cost of false negatives is high (e.g., medical diagnosis, where failing to detect a disease can have severe consequences).
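
Precision and recall usually pull in opposite directions, which the following sketch tries to show. The scores, labels, and thresholds are hypothetical values chosen for illustration, and `precision_recall` is a helper defined here, not a library function.

```python
# Hypothetical classifier scores and true labels (1 = positive), for illustration only.
scores = [0.95, 0.90, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

def precision_recall(threshold):
    """Return (precision, recall) when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for threshold in (0.5, 0.8):
    p, r = precision_recall(threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

With these numbers, the stricter threshold removes both false positives (precision rises from 0.67 to 1.00) but misses one more actual positive (recall falls from 0.80 to 0.60). Which side to favor depends on whether false positives or false negatives are more costly.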

#### **O'zbekcha:**
**Xotira (Recall):**
To'g'ri ijobiy deb topilgan holatlarning, haqiqatan ham mavjud ijobiy holatlarga nisbati.

$$ Xotira = \frac{To'g'ri\ Ijobiy\ (TP)}{To'g'ri\ Ijobiy\ (TP) + Noto'g'ri\ Salbiy\ (FN)} $$

**Qachon qo'llaniladi:**
Noto'g'ri salbiy natijalar narxi yuqori bo'lsa (masalan, tibbiy tashxisda kasallikni aniqlay olmaslik jiddiy oqibatlarga olib kelishi mumkin).

# 4. F1 Score
**Definition:**
The harmonic mean of precision and recall, balancing the trade-off between the two.

$$ F1\ Score = 2 \times \frac{Precision \times Recall}{Precision + Recall} $$

**When to Use:**
When the classes are imbalanced and both precision and recall matter.
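
As a rough illustration of why the harmonic mean is used, the sketch below compares F1 with the plain arithmetic mean. The precision/recall pairs are made-up values, and `f1` is a small helper written for this example.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Made-up (precision, recall) pairs for illustration.
pairs = [(0.8, 0.8), (1.0, 0.1), (0.5, 1.0)]

for p, r in pairs:
    arithmetic = (p + r) / 2
    print(f"P={p:.1f} R={r:.1f}  arithmetic mean={arithmetic:.3f}  F1={f1(p, r):.3f}")
```

For the pair (1.0, 0.1) the arithmetic mean is 0.550 while F1 is about 0.182: the harmonic mean stays close to the weaker of the two values, so a model cannot reach a high F1 by excelling at only one of them.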

#### **O'zbekcha:**
**F1 Bahosi (F1 Score):**
Aniqlik va xotiraning garmoniyaviy o'rtacha qiymati, ikkalasi o'rtasidagi balansni saqlash uchun ishlatiladi.

$$ F1\ Bahosi = 2 \times \frac{Aniqlik \times Xotira}{Aniqlik + Xotira} $$

**Qachon qo'llaniladi:**
Sinflar muvozanatsiz bo'lsa va aniqlik hamda xotira birdek muhim bo'lsa.

# 5. Confusion Matrix
**Definition:**
A table that summarizes prediction results for a classification problem.

|                     | Predicted Positive  | Predicted Negative  |
|---------------------|---------------------|---------------------|
| Actual Positive     | True Positive (TP)  | False Negative (FN) |
| Actual Negative     | False Positive (FP) | True Negative (TN)  |

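**Usefulness:**
Shows how many errors of each type (false positives and false negatives) the model makes. Accuracy, precision, recall, and F1 can all be computed from its four cells.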
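
The four cells can be counted directly from paired labels, as in this minimal sketch. The label lists are made up for illustration, and the matrix is arranged with the positive class first, as in the table above.

```python
from collections import Counter

# Made-up true and predicted labels (1 = positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

# Count each (actual, predicted) combination.
counts = Counter(zip(y_true, y_pred))
tp = counts[(1, 1)]  # actual positive, predicted positive
fn = counts[(1, 0)]  # actual positive, predicted negative
fp = counts[(0, 1)]  # actual negative, predicted positive
tn = counts[(0, 0)]  # actual negative, predicted negative

# Rows: actual positive, actual negative. Columns: predicted positive, predicted negative.
matrix = [[tp, fn], [fp, tn]]
for row in matrix:
    print(row)
```

Note that scikit-learn's `confusion_matrix` sorts the labels, so for labels 0 and 1 it returns the cells in the order `[[TN, FP], [FN, TP]]`, with the negative class first. That is the reverse of the layout in the table above.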

#### **O'zbekcha:**
**Qaror Matritsasi (Confusion Matrix):**
Klassifikatsiya masalalari uchun taxmin natijalarini jamlashga mo'ljallangan jadval.

|                 | Bashorat Ijobiy       | Bashorat Salbiy       |
|-----------------|-----------------------|-----------------------|
| Haqiqiy Ijobiy  | To'g'ri Ijobiy (TP)   | Noto'g'ri Salbiy (FN) |
| Haqiqiy Salbiy  | Noto'g'ri Ijobiy (FP) | To'g'ri Salbiy (TN)   |

**Foydasi:**
Modelning ishlashini to'liq tasavvur qilish imkonini beradi.

In [2]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

In [5]:
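from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the data and encode the target: "yes" -> 1, anything else -> 0.
# Note: a missing rainfall value is therefore encoded as 0 before dropna() runs.
df = pd.read_csv("./data/Rainfall.csv")
df["rainfall"] = df["rainfall"].apply(lambda x: 1 if x == "yes" else 0)
df = df.dropna()

# Standardize the features, leaving out the temperature columns and the target.
# Caveat: the scaler is fit on all rows before the split, so test-set statistics
# influence the scaling; fitting it on X_train only would avoid this leakage.
scaler = StandardScaler()
X = scaler.fit_transform(df.drop(["maxtemp", "mintemp", "temparature", "rainfall"], axis=1))
y = df["rainfall"]

# Hold out 15% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

# Train a support vector classifier with default settings.
model = SVC(random_state=42)
model.fit(X_train, y_train)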

In [6]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

# Predict on the held-out test set.
y_pred = model.predict(X_test)

# Compute each metric; the positive class is rainfall == 1.
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1_sc = f1_score(y_test, y_pred)
# Rows are actual classes and columns are predicted classes, ordered 0 then 1.
con_matrix = confusion_matrix(y_test, y_pred)

print("Accuracy: ", accuracy)
print("Precision: ", precision)
print("Recall: ", recall)
print("F1 Score: ", f1_sc)
print("Confusion Matrix: ", con_matrix)

Accuracy:  0.7454545454545455
Precision:  0.813953488372093
Recall:  0.8536585365853658
F1 Score:  0.8333333333333334
Confusion Matrix:  [[ 6  8]
 [ 6 35]]
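
As a sanity check, the printed metrics can be recomputed by hand from the confusion matrix using the formulas from sections 1 to 4. This sketch assumes scikit-learn's layout `[[TN, FP], [FN, TP]]` for labels 0 and 1; the variable names are chosen for this example.

```python
# Counts read from the printed confusion matrix [[6, 8], [6, 35]],
# assuming scikit-learn's layout [[TN, FP], [FN, TP]] for labels 0 and 1.
tn, fp, fn, tp = 6, 8, 6, 35
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Accuracy of a trivial model that always predicts the more frequent class.
majority_baseline = max(tp + fn, tn + fp) / total

print(f"accuracy  = {accuracy:.4f}")
print(f"precision = {precision:.4f}")
print(f"recall    = {recall:.4f}")
print(f"f1        = {f1:.4f}")
print(f"majority-class baseline accuracy = {majority_baseline:.4f}")
```

The recomputed values agree with the printed output. Under the same layout assumption, 41 of the 55 test samples are positive, so always predicting the positive class would also score 41/55 ≈ 0.745 accuracy, the same as the model. This is the imbalance caveat from section 1 showing up in practice.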
