Q 42. What evaluation metrics are more suitable for logistic regression than accuracy?

Accuracy alone is not a dependable measure for Logistic Regression, especially on imbalanced datasets or when different kinds of error carry different costs. Logistic Regression outputs probabilities, so metrics that assess per-class performance and the quality of those probabilities are more informative.
Precision measures how many predicted positives are actually positive, which matters when false positives are costly. Recall measures how many actual positives the model correctly identifies, which is crucial when false negatives are expensive. The F1-Score is the harmonic mean of precision and recall and is useful for imbalanced data. ROC-AUC and log loss evaluate the predicted probabilities directly: ROC-AUC measures how well the model ranks positives above negatives across all thresholds, and log loss penalizes confident wrong predictions. Together, these metrics give a more informative and robust evaluation than accuracy.
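To make the contrast with accuracy concrete, here is a minimal sketch on made-up labels (not the weather dataset used below). The label lists are hypothetical: 95 negatives, 5 positives, and a set of predictions that finds only one of the five positives while raising one false alarm.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical imbalanced labels: 95 negatives (0) followed by 5 positives (1)
y_true = [0] * 95 + [1] * 5
# Hypothetical predictions: 94 true negatives, 1 false positive,
# 1 true positive, 4 false negatives
y_pred = [0] * 94 + [1] + [1] + [0] * 4

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

print(f"Accuracy:  {accuracy:.3f}")   # 0.950
print(f"Precision: {precision:.3f}")  # 0.500
print(f"Recall:    {recall:.3f}")     # 0.200
print(f"F1 Score:  {f1:.3f}")         # 0.286
```

Accuracy looks strong at 0.95 because the negatives dominate, while recall shows that four of the five positives were missed.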

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score, log_loss

# Load dataset
data = pd.read_csv("weather_classification_data.csv")

# Encode all categorical columns
label_encoders = {}
for col in data.select_dtypes(include=['object']).columns:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col])
    label_encoders[col] = le

# Split features and target
X = data.drop("Weather Type", axis=1)
y = data["Weather Type"]

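# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale the features, fitting the scaler on the training split only to avoid leakage.
# Unscaled features are the likely cause of the convergence warning in the
# recorded output below.
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)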

# Model
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

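# Predictions
y_pred = model.predict(X_test)
# Class probabilities, one column per class. The weighted averaging below suggests
# a multiclass target, for which roc_auc_score and log_loss need the full matrix.
y_prob = model.predict_proba(X_test)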

# Metrics
print("Precision:", precision_score(y_test, y_pred, average='weighted'))
print("Recall:", recall_score(y_test, y_pred, average='weighted'))
print("F1 Score:", f1_score(y_test, y_pred, average='weighted'))



Precision: 0.8456989220316136
Recall: 0.8462121212121212
F1 Score: 0.8455368812137377


STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT

Increase the number of iterations to improve the convergence (max_iter=5000).
You might also want to scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
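The warning above recommends scaling the features, and the cell imports `roc_auc_score` and `log_loss` without calling them. The following is a minimal sketch of how both points could be handled together. It is not the author's code: it assumes a synthetic four-class dataset from `make_classification` as a stand-in for the weather data, and the names `clf`, `auc`, and `ll` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 4-class data standing in for the weather dataset (an assumption)
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The pipeline fits the scaler on the training split only, then fits the model
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

# Full probability matrix: one column per class
y_prob = clf.predict_proba(X_test)

# One-vs-rest ROC-AUC, averaged with weights proportional to class frequency
auc = roc_auc_score(y_test, y_prob, multi_class="ovr", average="weighted")
# Log loss penalizes confident wrong probabilities; lower is better
ll = log_loss(y_test, y_prob)

print("ROC-AUC (OvR, weighted):", auc)
print("Log loss:", ll)
```

Wrapping the scaler and the model in one pipeline keeps the test split out of the scaling statistics, and standardized features usually let the default solver converge in far fewer iterations.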
