# Logistic Regression for Binary Classification
**Authors:** Magudeshwaran and Senthilkumaran

**Goal:** Build a Logistic Regression model to classify data into two categories.

### Step 1: Import Libraries
We need `pandas` for tabular data handling, `numpy` for numerical arrays, `matplotlib` for plotting, and `sklearn` (scikit-learn) for data generation, train/test splitting, feature scaling, the Logistic Regression model, and the evaluation metrics.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

### Step 2: Generate the Dataset
Logistic Regression is a classification algorithm, so we create a synthetic dataset of 100 samples with two features and two classes using `make_classification`. Using only two features makes the data easy to plot.

In [None]:
X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, random_state=42)

# Convert to DataFrame for easier handling (optional but good practice)
df = pd.DataFrame(X, columns=['Feature 1', 'Feature 2'])
df['Target'] = y
df.head()
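
Before splitting, it can help to confirm the dataset's shape and class balance. The following is a minimal sketch that regenerates the same data with the same `random_state`; the name `class_counts` is introduced here for illustration and is not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import make_classification

# Same call as above, repeated so this cell runs on its own
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)

# Shape of the feature matrix and of the label vector
print("X shape:", X.shape)
print("y shape:", y.shape)

# Samples per class: index 0 holds the count for class 0, index 1 for class 1
class_counts = np.bincount(y)
print("Samples per class:", class_counts)
```

If one class turned out to be much rarer than the other, accuracy alone would be a less informative metric in Step 7.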

### Step 3: Split the Data
We hold out 30% of the data as a test set (`test_size=0.3`) so that we can evaluate the model on data it did not see during training.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
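
An optional variation, not used in the rest of this notebook: passing `stratify=y` asks `train_test_split` to keep the class proportions roughly equal in the training and test sets, which can matter for small datasets. The names `X_tr`, `X_te`, `y_tr`, and `y_te` are illustrative and chosen so they do not overwrite the variables above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Regenerate the dataset so this cell runs on its own
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)

# Stratified split: class proportions in y are preserved in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

print("Train size:", len(X_tr), "Test size:", len(X_te))

# Because the labels are 0/1, the mean is the fraction of class 1
print("Fraction of class 1 in train:", y_tr.mean())
print("Fraction of class 1 in test: ", y_te.mean())
```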

### Step 4: Scale the Features
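Scaling puts all features on a comparable range (zero mean and unit variance with `StandardScaler`). For Logistic Regression this matters because scikit-learn's implementation applies L2 regularization by default, which penalizes coefficients according to their magnitude, and because the solver tends to converge faster on standardized inputs. We fit the scaler on the training set only and reuse its statistics on the test set, so no information from the test data leaks into training.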

In [None]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
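
To make the effect of scaling concrete, the sketch below checks that the scaled training features have mean close to 0 and standard deviation close to 1, and that `transform` on the test set is the same as subtracting the training mean and dividing by the training standard deviation. The name `manual_test_scaled` is introduced only for this illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the data and split so this cell runs on its own
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Per-feature statistics of the scaled training data
print("Train mean:", X_train_scaled.mean(axis=0))
print("Train std: ", X_train_scaled.std(axis=0))

# The test set is scaled with the statistics learned from the training set
manual_test_scaled = (X_test - scaler.mean_) / scaler.scale_
print("Manual scaling matches:", np.allclose(manual_test_scaled, X_test_scaled))
```

The scaled test set will generally not have mean exactly 0 or standard deviation exactly 1, since it is scaled with the training statistics.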

### Step 5: Train the Logistic Regression Model
We create and train our `LogisticRegression` model using the scaled training data.

In [None]:
model = LogisticRegression(random_state=42)
model.fit(X_train_scaled, y_train)
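
After fitting, the model is described by its coefficients (`model.coef_`) and intercept (`model.intercept_`). As a rough sketch of what the fitted model computes in the binary case, the predicted probability of class 1 is the sigmoid of a linear combination of the features. The names `weights`, `bias`, `z`, and `manual_proba` are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the pipeline so this cell runs on its own
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

model = LogisticRegression(random_state=42)
model.fit(X_train_scaled, y_train)

weights = model.coef_[0]     # one weight per feature
bias = model.intercept_[0]   # scalar intercept

# Linear score z = w . x + b, then the sigmoid 1 / (1 + exp(-z))
z = X_train_scaled @ weights + bias
manual_proba = 1.0 / (1.0 + np.exp(-z))

# Compare with scikit-learn's probability for class 1
sklearn_proba = model.predict_proba(X_train_scaled)[:, 1]
print("Manual sigmoid matches predict_proba:",
      np.allclose(manual_proba, sklearn_proba))
```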

### Step 6: Make Predictions
We use the trained model to predict the class labels for our scaled test data.

In [None]:
y_pred = model.predict(X_test_scaled)
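
For a binary problem, `predict` amounts to thresholding the class 1 probability at 0.5. The sketch below reproduces the predictions from `predict_proba` and shows how a stricter threshold could be applied; the threshold value 0.7 and the names `proba`, `manual_pred`, and `strict_pred` are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the pipeline so this cell runs on its own
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
model = LogisticRegression(random_state=42).fit(X_train_scaled, y_train)

y_pred = model.predict(X_test_scaled)

# Probability of class 1 for every test sample
proba = model.predict_proba(X_test_scaled)[:, 1]

# Thresholding at 0.5 reproduces the labels returned by predict
manual_pred = (proba >= 0.5).astype(int)
print("Matches model.predict:", np.array_equal(manual_pred, y_pred))

# A stricter (hypothetical) threshold labels fewer samples as class 1
strict_pred = (proba >= 0.7).astype(int)
print("Predicted positives at 0.5:", manual_pred.sum())
print("Predicted positives at 0.7:", strict_pred.sum())
```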

### Step 7: Evaluate the Model
We evaluate our model's performance using accuracy, a classification report, and a confusion matrix.

In [None]:
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
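
The accuracy, precision, and recall in the report can all be derived from the four cells of the confusion matrix. The following sketch recomputes them by hand for class 1 and compares them with scikit-learn's own functions; the names `tn`, `fp`, `fn`, `tp`, and the `manual_*` variables are introduced for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the pipeline so this cell runs on its own
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
model = LogisticRegression(random_state=42).fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

manual_accuracy = (tp + tn) / cm.sum()   # fraction of correct predictions
manual_precision = tp / (tp + fp)        # of predicted class 1, how many are right
manual_recall = tp / (tp + fn)           # of true class 1, how many were found

print(f"Accuracy:  {manual_accuracy:.2f}")
print(f"Precision: {manual_precision:.2f}")
print(f"Recall:    {manual_recall:.2f}")
```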

### Step 8: Visualize the Decision Boundary
With only two features, we can plot the predicted probability of class 1 over the scaled feature plane. The decision boundary is where that probability equals 0.5.

In [None]:
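# Scale the full dataset with the scaler fitted on the training set (for plotting only)
X_scaled = scaler.transform(X)

# Create a meshgrid for plotting the decision boundary
x_min, x_max = X_scaled[:, 0].min() - 1, X_scaled[:, 0].max() + 1
y_min, y_max = X_scaled[:, 1].min() - 1, X_scaled[:, 1].max() + 1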
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                     np.arange(y_min, y_max, 0.01))

# Predict probabilities for each point in the meshgrid
Z = model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]
Z = Z.reshape(xx.shape)

plt.figure(figsize=(10, 7))
plt.contourf(xx, yy, Z, alpha=0.8, cmap='RdBu')
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y, cmap='RdBu', edgecolor='k', s=80)
plt.title('Logistic Regression Decision Boundary')
plt.xlabel('Scaled Feature 1')
plt.ylabel('Scaled Feature 2')
plt.colorbar(label='Probability of Class 1')
plt.show()
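
Because the model is linear in the scaled features, the decision boundary is a straight line where `w1 * x1 + w2 * x2 + b = 0`. The sketch below computes that line directly from the fitted coefficients and checks that the predicted probability along it is 0.5. It assumes the second coefficient is not zero; the names `w1`, `w2`, `b`, `x1_line`, and `x2_line` and the plotted range of -3 to 3 are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the pipeline so this cell runs on its own
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
model = LogisticRegression(random_state=42).fit(X_train_scaled, y_train)

w1, w2 = model.coef_[0]
b = model.intercept_[0]

# Solve w1 * x1 + w2 * x2 + b = 0 for x2 (assumes w2 != 0)
x1_line = np.linspace(-3, 3, 50)
x2_line = -(w1 * x1_line + b) / w2

# Every point on the line should have probability 0.5 for class 1
proba_on_line = model.predict_proba(np.column_stack([x1_line, x2_line]))[:, 1]
print("Probability on the boundary is 0.5:", np.allclose(proba_on_line, 0.5))
```

The resulting line could be drawn over the contour plot with `plt.plot(x1_line, x2_line, 'k--')` before calling `plt.show()`.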