<a href="https://colab.research.google.com/github/Francis-Njuguna/30_Days_Of_Python/blob/main/Logistic_Regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

### 1. Load the Dataset

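We'll use the Iris dataset, a classic benchmark for classification problems. It contains 150 iris flowers, each described by four measurements (sepal length, sepal width, petal length, and petal width, in centimeters) and labeled with one of three species.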

In [2]:
# Load the Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target variable (species)

print("Features (X) shape:", X.shape)
print("Target (y) shape:", y.shape)

# Display first 5 rows of features
df_features = pd.DataFrame(X, columns=iris.feature_names)
display(df_features.head())

Features (X) shape: (150, 4)
Target (y) shape: (150,)


|   | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
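
The feature table does not show what the integer labels in `y` stand for. As a minimal sketch, the species names and the number of samples per class can be inspected like this (the variable names `species_names` and `class_counts` are illustrative and not part of the original notebook):

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# iris.target_names maps each integer label in iris.target to a species name
species_names = list(iris.target_names)
print("Species:", species_names)

# Count how many samples belong to each class
class_counts = np.bincount(iris.target)
for name, count in zip(species_names, class_counts):
    print(f"{name}: {count} samples")
```

Checking the class balance up front helps when interpreting accuracy later, since accuracy can be misleading on heavily imbalanced data.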


### 2. Split Data into Training and Testing Sets

Splitting the data lets us evaluate the model on samples it never saw during training. We'll use 80% for training and 20% for testing, and fix `random_state` so the split is reproducible.

In [3]:
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(f"Training features shape: {X_train.shape}")
print(f"Testing features shape: {X_test.shape}")
print(f"Training target shape: {y_train.shape}")
print(f"Testing target shape: {y_test.shape}")

Training features shape: (120, 4)
Testing features shape: (30, 4)
Training target shape: (120,)
Testing target shape: (30,)
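
A plain random split does not guarantee that each species is equally represented in the test set. One possible variation, shown here as a sketch rather than as part of the original workflow, is to pass `stratify=y` so both subsets keep the class proportions of the full dataset (the `_s` suffixed names are illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions the same in both subsets
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Per-class sample counts in each subset
print("Training class counts:", np.bincount(y_train_s))
print("Testing class counts:", np.bincount(y_test_s))
```

Stratification matters most for small or imbalanced datasets, where a random split can leave a class underrepresented in the test set.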


### 3. Initialize and Train the Model

We'll use Logistic Regression, a linear model that estimates the probability of each class and predicts the most likely one.

In [4]:
# Initialize the Logistic Regression model
model = LogisticRegression(max_iter=200)  # Raise max_iter above the default of 100 to help the solver converge

# Train the model using the training data
model.fit(X_train, y_train)

print("Model trained successfully!")

Model trained successfully!
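
To see what the fitted model has actually learned, it can help to look at its coefficients and predicted probabilities. The sketch below repeats the split and training steps so that it runs on its own; the variable name `probabilities` is illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# One row of coefficients per class, one column per feature
print("Coefficient matrix shape:", model.coef_.shape)
print("Classes:", model.classes_)

# Predicted class probabilities for the first three test samples;
# each row sums to 1 and predict() returns the class with the highest value
probabilities = model.predict_proba(X_test[:3])
print(probabilities.round(3))
```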


### 4. Evaluate the Model

We'll make predictions on the test set and calculate the accuracy, which is the proportion of correctly classified instances.

In [5]:
# Make predictions on the test set
y_pred = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)

print(f"Model Accuracy on the test set: {accuracy:.2f}")

Model Accuracy on the test set: 1.00
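
A single accuracy number measured on 30 samples is a coarse estimate and hides which classes, if any, get confused with each other. As a hedged sketch of two complementary checks that are not part of the original notebook, a confusion matrix breaks the predictions down per class, and 5-fold cross-validation averages accuracy over several different splits:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# 5-fold cross-validation: train and score a fresh model on five different splits
cv_scores = cross_val_score(LogisticRegression(max_iter=200), X, y, cv=5)
print(f"Cross-validated accuracy: {cv_scores.mean():.2f} (+/- {cv_scores.std():.2f})")
```

Off-diagonal entries in the confusion matrix count misclassifications, and the spread of the cross-validation scores gives a rough sense of how much the result depends on the particular split.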


This example demonstrates the fundamental steps of a classification workflow: loading data, splitting it into training and testing sets, training a model, and evaluating its performance on held-out data.