# Final Assignment — Data Science Methodology (CRISP-DM)
### Topic: **Credit Cards — Fraud Detection**
**Business Goal:** To detect fraudulent transactions using historical credit card data.

**Dataset File Name:** `credit_card_fraud.csv`

---
### Overview
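This notebook applies the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology to the business problem defined above, working through its six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.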


## 1. Business Understanding
The client defines the business problem as follows:

- For credit card fraud detection, the goal is to detect fraudulent transactions using historical credit card data.

As data scientists, our objective is to use data-driven methods to solve this business challenge.


## 2. Data Understanding
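In this stage, we explore the available data to understand its structure and contents.
We assume that `credit_card_fraud.csv` contains the transaction features and a label column named `target` that marks fraudulent transactions; the actual column names may differ.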


In [None]:
# Example: Load and preview dataset
import pandas as pd

df = pd.read_csv('credit_card_fraud.csv')
df.head()
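
Beyond previewing the first rows, it usually helps to check column types, missing values, and how rare the fraud class is. The following is a minimal sketch under stated assumptions: the `sample` frame and its columns `amount` and `merchant_category` are hypothetical stand-ins for the real file, and `summarize` is an illustrative helper, not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for credit_card_fraud.csv; the real columns may differ.
sample = pd.DataFrame({
    "amount": [12.5, 250.0, None, 8.99, 1200.0, 43.1],
    "merchant_category": ["grocery", "electronics", "grocery", None, "travel", "grocery"],
    "target": [0, 0, 0, 0, 1, 0],  # assumed label column: 1 = fraud
})

def summarize(frame: pd.DataFrame, label: str = "target") -> dict:
    """Collect basic structure facts: shape, dtypes, missing values, class balance."""
    return {
        "shape": frame.shape,
        "dtypes": frame.dtypes.astype(str).to_dict(),
        "missing": frame.isna().sum().to_dict(),
        "class_share": frame[label].value_counts(normalize=True).to_dict(),
    }

summary = summarize(sample)
for key, value in summary.items():
    print(key, value)
```

The same helper could be called on `df` once the real file is loaded, provided the label column is actually named `target`.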

## 3. Data Preparation
We clean and prepare the data for modeling: handling missing values, encoding categorical features, and splitting the data into training and test sets.


In [None]:
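# Example: Data cleaning and splitting
from sklearn.model_selection import train_test_split

df = df.dropna()  # drop rows with missing values
# 'target' is the assumed name of the fraud label column
X = df.drop('target', axis=1)
y = df['target']
# Stratify so the rare fraud class keeps the same share in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)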
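
The preparation step mentions handling missing values and encoding categories, which dropping rows does not cover. One possible approach, sketched here on hypothetical toy data (the columns `amount` and `merchant_category` are assumptions, not the real schema), is to impute and one-hot encode with scikit-learn's `ColumnTransformer`, fitting it on the training split only so that no information leaks from the test split.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data; column names are assumptions, not the real schema.
toy = pd.DataFrame({
    "amount": [12.5, 250.0, None, 8.99, 1200.0, 43.1, 75.0, 310.0, 5.0, 990.0],
    "merchant_category": ["grocery", "electronics", "grocery", "grocery", "travel",
                          "grocery", "travel", "electronics", "grocery", "travel"],
    "target": [0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
})
features = toy.drop(columns="target")
labels = toy["target"]

# Stratify so both splits keep the same share of fraud cases.
feat_train, feat_test, lab_train, lab_test = train_test_split(
    features, labels, test_size=0.5, stratify=labels, random_state=42
)

preprocess = ColumnTransformer([
    # Fill missing amounts with the training median.
    ("num", SimpleImputer(strategy="median"), ["amount"]),
    # One-hot encode categories; unseen test categories become all-zero rows.
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["merchant_category"]),
])

# Fit on the training split only, then reuse the fitted transformer on the test split.
train_matrix = preprocess.fit_transform(feat_train)
test_matrix = preprocess.transform(feat_test)
print(train_matrix.shape, test_matrix.shape)
```

Imputing instead of dropping rows keeps more of the rare fraud examples, which can matter when positives are scarce.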


## 4. Modeling
We train a random forest classifier to predict whether a transaction is fraudulent.


In [None]:
# Example: Train a simple model
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
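
Fraud data is typically imbalanced, and a default classifier may favor the majority class. As a hedged sketch of one common mitigation, the example below trains a random forest with `class_weight="balanced"` on synthetic data from `make_classification`; the data, the roughly 5% positive rate, and the variable names are assumptions for illustration, not properties of the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (about 5% positives) standing in for the real dataset.
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# class_weight="balanced" reweights classes inversely to their frequency.
weighted_model = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=42
)
weighted_model.fit(X_tr, y_tr)

# Probability of the positive (fraud) class for each held-out row.
fraud_scores = weighted_model.predict_proba(X_te)[:, 1]
print(fraud_scores[:5])
```

Working with probabilities rather than hard labels leaves room to choose a decision threshold that matches the cost of missed fraud versus false alarms.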


## 5. Evaluation
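We evaluate model performance on the held-out test set using accuracy together with the per-class precision, recall, and F1 scores from the classification report. Because fraud is typically rare, accuracy alone can be misleading, so precision and recall for the fraud class deserve the most attention.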


In [None]:
# Example: Evaluate model
from sklearn.metrics import accuracy_score, classification_report
y_pred = model.predict(X_test)
print('Accuracy:', accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
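
To illustrate why accuracy should be read alongside other metrics, the sketch below compares a trivial baseline that never flags fraud with a hypothetical model on hand-made labels. The label vectors are invented for illustration and are not results from the dataset.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hand-made labels: 18 legitimate (0) and 2 fraudulent (1) transactions.
y_true = [0] * 18 + [1] * 2
# A trivial baseline that never flags fraud.
y_never = [0] * 20
# A hypothetical model: one false alarm, one fraud caught, one fraud missed.
y_model = [0] * 17 + [1] + [1, 0]

baseline_accuracy = accuracy_score(y_true, y_never)
baseline_recall = recall_score(y_true, y_never)
model_accuracy = accuracy_score(y_true, y_model)
model_precision = precision_score(y_true, y_model)
model_recall = recall_score(y_true, y_model)

# For binary labels, ravel() yields counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_model).ravel()

print("baseline accuracy/recall:", baseline_accuracy, baseline_recall)
print("model accuracy/precision/recall:", model_accuracy, model_precision, model_recall)
print("tn, fp, fn, tp:", tn, fp, fn, tp)
```

Both predictors reach the same accuracy here, yet only the second one catches any fraud, which is why recall and precision on the fraud class are the more informative numbers.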

## 6. Deployment
Finally, we outline how this model could be deployed into production, for example by integrating it with business systems or exposing it via an API.

### Example:
- For Emails: integrate spam detection with an email filtering system.
- For Hospitals: integrate patient risk prediction with hospital record systems.
- For Credit Cards: integrate the fraud detection model with real-time transaction monitoring.
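
As a rough sketch of the credit card case, a monitoring service could load a persisted model once and score each incoming transaction. Everything below is an assumption for illustration: the synthetic training data, the file name `fraud_model.joblib`, the `score_transaction` helper, and the 0.5 threshold are hypothetical, and a real deployment would add input validation, logging, and monitoring.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a small stand-in model on synthetic data (not the real dataset).
X_demo, y_demo = make_classification(
    n_samples=500, n_features=6, weights=[0.9, 0.1], random_state=42
)
trained = RandomForestClassifier(n_estimators=50, random_state=42).fit(X_demo, y_demo)

# Persist the fitted model, as a serving process would load it at startup.
model_path = Path(tempfile.mkdtemp()) / "fraud_model.joblib"  # hypothetical file name
joblib.dump(trained, model_path)
served_model = joblib.load(model_path)

def score_transaction(features, threshold=0.5):
    """Return the fraud probability and a flag for one transaction's feature vector."""
    probability = float(served_model.predict_proba([features])[0, 1])
    return {"fraud_probability": probability, "flagged": probability >= threshold}

result = score_transaction(list(X_demo[0]))
print(result)
```

A function like this could sit behind an API endpoint or a stream consumer; the threshold is a business decision that trades missed fraud against false alarms.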

---
**End of Notebook**