## Steps in the Code:

1. **Data Loading and Exploration**:
   - Loads the credit card transactions dataset (`creditcard.csv`).
   - Reports the number of missing values per column and drops any incomplete rows.

2. **Data Preprocessing**:
   - Separates features (`X`) and target variable (`y`).
   - Standardizes the features with `StandardScaler` so each has zero mean and unit variance.
   - Splits the data into training (80%) and test (20%) sets with `train_test_split`.

3. **Handling Class Imbalance**:
   - Uses `SMOTE` (Synthetic Minority Over-sampling Technique) to oversample the minority class (fraudulent transactions) in the training set only, so the test set keeps its original class distribution.

4. **Model Training**:
   - Initializes a `LogisticRegression` model with `max_iter=1000`.
   - Trains the model on the resampled training data (`X_train_resampled`, `y_train_resampled`).

5. **Model Evaluation**:
   - Makes predictions on the test set (`X_test`).
   - Evaluates the model's performance using `classification_report`, which includes metrics like precision, recall, and F1-score.
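
To make step 3 more concrete, here is a minimal plain-Python sketch of the interpolation idea behind SMOTE: pick a minority point, pick one of its nearest minority neighbours, and create a new point somewhere on the line between them. The function name `smote_like_samples`, the toy `fraud` points, and `k=2` are assumptions for illustration only; the real `imblearn` implementation is more involved and uses `k_neighbors=5` by default.

```python
import random

def smote_like_samples(minority, n_new, k=2, seed=42):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation weight in [0, 1)
        # New point lies on the segment between base and neighbour
        synthetic.append(tuple(b + lam * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical 2-D minority (fraud) samples
fraud = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote_like_samples(fraud, n_new=6)
print(new_points)
```

Because every synthetic point is a blend of two existing fraud samples, it stays inside the region the minority class already occupies rather than duplicating rows exactly.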

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Load the dataset
data = pd.read_csv('creditcard.csv')

# Check for missing values
print("Missing values in dataset:")
print(data.isnull().sum())

# Drop rows with missing values (if any)
data.dropna(inplace=True)

# Separate features and target variable
X = data.drop('Class', axis=1)
y = data['Class']

# Normalize features using StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Apply SMOTE to handle class imbalance
smote = SMOTE(random_state=42)
X_train_resampled, y_train_resampled = smote.fit_resample(X_train, y_train)

# Initialize the model
model = LogisticRegression(max_iter=1000)

# Train the model on resampled data
model.fit(X_train_resampled, y_train_resampled)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Evaluate model performance
print(classification_report(y_test, y_pred))
```
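
One caveat about the code above: the scaler is fitted on the full dataset before the train/test split, so test-set statistics leak into preprocessing. A common alternative is to split first and compute the mean and standard deviation from the training rows only. The sketch below shows the idea in plain Python; `fit_scaler`, `transform`, and the sample values are hypothetical. With scikit-learn the equivalent would be `scaler.fit_transform(X_train)` followed by `scaler.transform(X_test)`.

```python
from statistics import fmean, pstdev

def fit_scaler(column):
    """Return (mean, std) of one feature; population std, as StandardScaler uses."""
    return fmean(column), pstdev(column)

def transform(column, mean, std):
    """Apply z-score scaling with previously fitted statistics."""
    return [(v - mean) / std for v in column]

amount_train = [10.0, 20.0, 30.0, 40.0]  # hypothetical training values
amount_test = [25.0, 100.0]              # hypothetical held-out values

mean, std = fit_scaler(amount_train)              # statistics from training rows only
train_scaled = transform(amount_train, mean, std)
test_scaled = transform(amount_test, mean, std)   # reuse them; never refit on test data

print(train_scaled)
print(test_scaled)
```

The scaled training column has zero mean and unit variance by construction, while the test column generally does not. That is expected.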


```text
Missing values in dataset:
Time      0
V1        0
V2        0
V3        0
V4        0
V5        0
V6        0
V7        0
V8        0
V9        0
V10       0
V11       1
V12       1
V13       1
V14       1
V15       1
V16       1
V17       1
V18       1
V19       1
V20       1
V21       1
V22       1
V23       1
V24       1
V25       1
V26       1
V27       1
V28       1
Amount    1
Class     1
dtype: int64
              precision    recall  f1-score   support

         0.0       1.00      0.99      0.99      3966
         1.0       0.23      1.00      0.37        14

    accuracy                           0.99      3980
   macro avg       0.61      0.99      0.68      3980
weighted avg       1.00      0.99      0.99      3980
```
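
The 0.99 accuracy hides how the fraud class behaves: all 14 fraud cases are caught (recall 1.00), but only about one alert in four is real (precision 0.23). As a rough sanity check, the sketch below searches for confusion-matrix counts that reproduce the rounded class-1 values in the report. The helper `prf` is a hypothetical name, and the search range of 200 false positives is an arbitrary bound.

```python
# Values read from the class-1 row of the report above
support = 14               # actual fraud cases in the test set
reported_precision = 0.23
reported_recall = 1.00
reported_f1 = 0.37

def prf(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f1

# Find (TP, FP, FN) combinations consistent with the rounded report values
candidates = []
for tp in range(1, support + 1):
    fn = support - tp
    for fp in range(200):
        p, r, f = prf(tp, fp, fn)
        if (round(p, 2), round(r, 2), round(f, 2)) == (
            reported_precision, reported_recall, reported_f1
        ):
            candidates.append((tp, fp, fn))

print(candidates)  # [(14, 47, 0), (14, 48, 0)]
```

So the model flags roughly 47 or 48 genuine transactions as fraud in order to catch all 14 fraudulent ones, which is why F1 for the fraud class is only 0.37.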

