## **ASSIGNMENT - LOGISTIC REGRESSION**

---

**Question 1**: What is Logistic Regression, and how does it differ from Linear
Regression?

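- Logistic Regression is a classification model: it passes a linear combination of the features through the sigmoid function to estimate the probability of a class (like yes/no).
It differs from Linear Regression, which uses the linear combination directly to predict a continuous number (like prices or marks).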
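
The difference can be seen on a toy problem. The sketch below is only an illustration: the ten-point dataset and the query value 20 are made up for this example, and both models are fitted on the same 0/1 labels.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical toy data: one feature, label is 1 when the feature is >= 5
X = np.arange(10, dtype=float).reshape(-1, 1)
y = (X.ravel() >= 5).astype(int)

linear = LinearRegression().fit(X, y)
logistic = LogisticRegression().fit(X, y)

# Query a point far outside the training range
x_new = np.array([[20.0]])
linear_output = linear.predict(x_new)[0]             # unbounded number
logistic_prob = logistic.predict_proba(x_new)[0, 1]  # probability of class 1
logistic_label = logistic.predict(x_new)[0]          # predicted category

print("Linear Regression output:", linear_output)
print("Logistic Regression P(class 1):", logistic_prob)
print("Logistic Regression label:", logistic_label)
```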
---
**Question 2**: Explain the role of the Sigmoid function in Logistic Regression.

- The Sigmoid function, `sigmoid(z) = 1 / (1 + e^(-z))`, maps any real number to a value between 0 and 1.
It turns the model’s linear output into a probability, which is compared with a threshold (usually 0.5) to predict the class.
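
A minimal NumPy sketch of this mapping, assuming a 0.5 decision threshold. The `sigmoid` helper and the sample scores are written for this illustration and are not taken from scikit-learn's internals.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to the open interval (0, 1)."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical linear outputs (w.x + b) for three samples
scores = np.array([-5.0, 0.0, 5.0])
probs = sigmoid(scores)

# Assumed threshold of 0.5 to turn probabilities into class labels
labels = (probs >= 0.5).astype(int)

print("Probabilities:", probs)
print("Labels:", labels)
```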
---
**Question 3**: What is Regularization in Logistic Regression and why is it needed?

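- Regularization helps stop the model from overfitting by adding a penalty for large weights to the loss function. L1 (Lasso) penalizes the absolute values of the weights and L2 (Ridge) penalizes their squares. In scikit-learn the strength is set by `C`, where a smaller `C` means a stronger penalty.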
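
The effect of the penalty can be sketched by fitting the same model with different values of `C` and comparing the size of the learned weights. The three `C` values, the scaling step and `max_iter=1000` are choices made for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Smaller C = stronger L2 penalty (L2 is the default penalty)
norms = {}
for C in [0.01, 1.0, 100.0]:
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X, y)
    norms[C] = float(np.linalg.norm(model.coef_))
    print(f"C={C:>6}: coefficient norm = {norms[C]:.3f}")
```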
---
**Question 4**: What are some common evaluation metrics for classification models, and why are they important?

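- Accuracy: Fraction of all predictions that are correct
- Precision: Fraction of predicted positives that are actually positive
- Recall: Fraction of actual positives that the model finds
- F1-score: Harmonic mean of precision and recall
- They are important because accuracy alone can be misleading on imbalanced data, while precision and recall show which kind of error (false positives or false negatives) the model makes.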
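
As a worked example, the metrics can be computed by hand from the counts of true/false positives and negatives and checked against scikit-learn. The ten labels below are made up for this illustration.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical labels for a binary problem (1 = positive class)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print("Accuracy:", accuracy, "sklearn:", accuracy_score(y_true, y_pred))
print("Precision:", precision, "sklearn:", precision_score(y_true, y_pred))
print("Recall:", recall, "sklearn:", recall_score(y_true, y_pred))
```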
---




**Question 5**: Write a Python program that loads a CSV file into a Pandas DataFrame, splits into train/test sets, trains a Logistic Regression model, and prints its accuracy.

In [5]:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# The built-in Iris dataset stands in for a CSV file here;
# with a real file, pd.read_csv would build the DataFrame instead.
data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

# Separate the features from the label column
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=1)

model = LogisticRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print('Accuracy: ',accuracy)

Accuracy:  0.9666666666666667
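
The question asks for a CSV file, while the program above uses `load_iris` directly. A possible sketch of the CSV route is shown below. It assumes a hypothetical file named `iris.csv`, which is first written to a temporary directory so the example runs on its own, and it sets `max_iter=1000`, which is not in the original program.

```python
import os
import tempfile

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Create a CSV to load; "iris.csv" is a hypothetical file name
iris = load_iris(as_frame=True)
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "iris.csv")
    iris.frame.to_csv(csv_path, index=False)
    # Load the CSV file into a Pandas DataFrame
    df = pd.read_csv(csv_path)

X = df.drop(columns="target")
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print("Accuracy:", accuracy)
```

With a real dataset, only the `pd.read_csv` call and the name of the label column would need to change.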


**Question 6**: Write a Python program to train a Logistic Regression model using L2 regularization (Ridge) and print the model coefficients and accuracy.


In [8]:
import pandas as pd
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=1)

# Note: RidgeClassifier is an L2-penalized least-squares classifier, not a logistic model
model = RidgeClassifier()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

print('Model Coefficients: ', model.coef_)
print('Accuracy: ',accuracy)

Model Coefficients:  [[ 0.13553884  0.47299258 -0.45202482 -0.1122233 ]
 [ 0.13213034 -0.96637265  0.21522235 -0.66136199]
 [-0.26766918  0.49338008  0.23680247  0.77358529]]
Accuracy:  0.7666666666666667
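
`RidgeClassifier` applies the L2 penalty to a least-squares fit on encoded class labels rather than to the logistic loss, so it is not a Logistic Regression model. A sketch that stays closer to the question is shown below. It relies on L2 being the default penalty of `LogisticRegression`; `C=1.0` and `max_iter=1000` are illustrative choices, and the results will differ from the output above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# L2 is the default penalty; C is the inverse of the regularization strength
model = LogisticRegression(C=1.0, max_iter=1000)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))

print("Model Coefficients:\n", model.coef_)
print("Accuracy:", accuracy)
```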


**Question 7**: Write a Python program to train a Logistic Regression model for multiclass classification using multi_class='ovr' and print the classification report.

In [10]:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=1)

model = LogisticRegression(multi_class='ovr')
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

print('Classification report:')
print(classification_report(y_test, y_pred))

Classification report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        11
           1       1.00      0.69      0.82        13
           2       0.60      1.00      0.75         6

    accuracy                           0.87        30
   macro avg       0.87      0.90      0.86        30
weighted avg       0.92      0.87      0.87        30
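
The `multi_class` argument appears to be deprecated in recent scikit-learn releases. One alternative for one-vs-rest training is to wrap the model in `OneVsRestClassifier`, as in this sketch. Here `max_iter=1000` is an illustrative choice and the report may not match the output above exactly.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Trains one binary Logistic Regression per class (class vs. the rest)
model = OneVsRestClassifier(LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

report = classification_report(y_test, model.predict(X_test))
print("Classification report:")
print(report)
```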





**Question 8**: Write a Python program to apply GridSearchCV to tune C and penalty hyperparameters for Logistic Regression and print the best parameters and validation accuracy.

In [14]:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.datasets import load_iris
import warnings
# Silences convergence and failed-fit warnings raised during the search
warnings.filterwarnings('ignore')

data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=1)

params = {
    'penalty': ['l1', 'l2', 'elasticnet', 'none'],
    'C': [0.1, 1, 10, 20, 30],
    'solver': ['saga']
}

model = LogisticRegression()
classifier = GridSearchCV(model, param_grid=params, cv=5, verbose=2, n_jobs=-1)
classifier.fit(X_train, y_train)

print('Best parameters: ',classifier.best_params_)
print('Best score', classifier.best_score_)

Fitting 5 folds for each of 20 candidates, totalling 100 fits
Best parameters:  {'C': 1, 'penalty': 'l2', 'solver': 'saga'}
Best score 0.9833333333333334
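
Two entries in the grid above may not behave as intended: `'elasticnet'` also needs an `l1_ratio` value, and newer scikit-learn releases expect `None` rather than the string `'none'`. With warnings suppressed, such candidates can fail without any visible message. The sketch below is one possible variant, restricted to `'l1'` and `'l2'`. The scaling step, `max_iter=5000`, the smaller `C` grid and the final test-set score are assumptions added for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Scaling helps the saga solver converge; saga supports both l1 and l2
pipeline = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver="saga", max_iter=5000),
)

# Parameter names are prefixed with the lowercase pipeline step name
param_grid = {
    "logisticregression__penalty": ["l1", "l2"],
    "logisticregression__C": [0.1, 1, 10],
}

search = GridSearchCV(pipeline, param_grid=param_grid, cv=5)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best cross-validation accuracy:", search.best_score_)
print("Test accuracy:", search.score(X_test, y_test))
```

`best_score_` is the mean accuracy over the 5 validation folds, so the held-out test score is printed separately.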


**Question 9**: Write a Python program to standardise the features before training Logistic Regression and compare the model's accuracy with and without scaling.

In [2]:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import warnings
warnings.filterwarnings('ignore')

data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=1)

# without scaling
model1 = LogisticRegression()
model1.fit(X_train, y_train)
y_pred1 = model1.predict(X_test)
accuracy1 = accuracy_score(y_test, y_pred1)

# with scaling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
model2 = LogisticRegression()
model2.fit(X_train_scaled, y_train)
y_pred2 = model2.predict(X_test_scaled)
accuracy2 = accuracy_score(y_test, y_pred2)

print(f"Accuracy without scaling: {accuracy1}")
print(f"Accuracy with scaling:    {accuracy2}")

Accuracy without scaling: 0.9666666666666667
Accuracy with scaling:    0.9666666666666667
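
Both accuracies are identical here, which is plausible for a 30-sample test set with features on similar scales. A steadier comparison could use cross-validation, with the scaler placed inside a pipeline so it is fitted on the training folds only. The sketch below assumes 5 folds and `max_iter=1000`, neither of which comes from the original program.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Model without scaling
plain = LogisticRegression(max_iter=1000)

# Model with scaling; the scaler is re-fitted inside every training fold
scaled = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

plain_scores = cross_val_score(plain, X, y, cv=5)
scaled_scores = cross_val_score(scaled, X, y, cv=5)

print(f"Mean CV accuracy without scaling: {plain_scores.mean():.4f}")
print(f"Mean CV accuracy with scaling:    {scaled_scores.mean():.4f}")
```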
