# Fraud Detection: Exploratory Data Analysis (EDA)

In this notebook, we perform exploratory data analysis on a small simulated fraud detection dataset. We inspect its structure, visualize the distribution of transaction amounts, and look for patterns or correlations among the features.

## Step 1: Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Step 2: Load the Data

Here, we build a small simulated dataset of 10 transactions directly in code. In a real-world application, the data would be loaded from a file or a database.

In [None]:
data = {
    'age': [23, 45, 34, 56, 22, 43, 36, 33, 54, 28],
    'transaction_amount': [100, 250, 75, 400, 200, 180, 300, 150, 350, 90],
    'location': ['NY', 'CA', 'CA', 'TX', 'TX', 'NY', 'NY', 'TX', 'CA', 'NY'],
    'is_fraud': [0, 1, 0, 1, 0, 0, 1, 0, 1, 0]  # 1 = Fraud, 0 = Non-Fraud
}

df = pd.DataFrame(data)
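
For comparison, loading the same kind of data from a file might look like the sketch below. The CSV content and the name `df_from_csv` are hypothetical and only illustrate the idea; `io.StringIO` stands in for a real file path so the cell runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real transactions file.
csv_text = """age,transaction_amount,location,is_fraud
23,100,NY,0
45,250,CA,1
34,75,CA,0
"""

# pd.read_csv accepts a file path or any file-like object.
# With a real file, you would pass the path instead of the StringIO buffer.
df_from_csv = pd.read_csv(io.StringIO(csv_text))

print(df_from_csv.dtypes)
```
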

## Step 3: Inspect the Data

Let's take a quick look at the first few rows of the dataset to understand its structure and spot any obviously missing values.

In [None]:
df.head()  # Display first few rows of the dataset
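
`df.head()` only shows the first rows, so it cannot confirm that the whole dataset is free of missing values. A minimal sketch of a more complete check, rebuilding the same simulated data so the cell is self-contained, could be:

```python
import pandas as pd

# Same simulated data as in Step 2, repeated so this cell runs on its own.
data = {
    'age': [23, 45, 34, 56, 22, 43, 36, 33, 54, 28],
    'transaction_amount': [100, 250, 75, 400, 200, 180, 300, 150, 350, 90],
    'location': ['NY', 'CA', 'CA', 'TX', 'TX', 'NY', 'NY', 'TX', 'CA', 'NY'],
    'is_fraud': [0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
}
df = pd.DataFrame(data)

# Column dtypes and non-null counts
df.info()

# Number of missing values per column
missing_counts = df.isnull().sum()
print(missing_counts)

# Summary statistics for the numeric columns
summary = df.describe()
print(summary)
```
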

## Step 4: Visualize the Transaction Amount Distribution

Here, we'll plot the distribution of the transaction amounts to get a sense of the data.

In [None]:
# Visualize transaction amount distribution
plt.figure(figsize=(8, 6))
sns.histplot(df['transaction_amount'], kde=True, color='blue')
plt.title('Transaction Amount Distribution')
plt.xlabel('Transaction Amount')
plt.ylabel('Frequency')
plt.show()
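
The introduction also mentions looking for patterns and correlations. One possible way to do that, sketched here with plain pandas aggregations, is shown below. The variable names are illustrative, and with only 10 simulated rows the numbers should not be read as real findings.

```python
import pandas as pd

# Same simulated data as in Step 2, repeated so this cell runs on its own.
data = {
    'age': [23, 45, 34, 56, 22, 43, 36, 33, 54, 28],
    'transaction_amount': [100, 250, 75, 400, 200, 180, 300, 150, 350, 90],
    'location': ['NY', 'CA', 'CA', 'TX', 'TX', 'NY', 'NY', 'TX', 'CA', 'NY'],
    'is_fraud': [0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
}
df = pd.DataFrame(data)

# Overall share of transactions labeled as fraud
fraud_rate = df['is_fraud'].mean()
print(f"Overall fraud rate: {fraud_rate:.2f}")

# Average transaction amount for non-fraud (0) and fraud (1) transactions
mean_amount_by_class = df.groupby('is_fraud')['transaction_amount'].mean()
print(mean_amount_by_class)

# Fraud rate per location
rate_by_location = df.groupby('location')['is_fraud'].mean()
print(rate_by_location)

# Pearson correlation between the numeric columns
corr = df[['age', 'transaction_amount', 'is_fraud']].corr()
print(corr)
```
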

## Step 6: Save the Notebook

Don't forget to save your work! You can do this by selecting "File" → "Save and Checkpoint" in the Jupyter notebook interface.