## Main Goal
The main goal of this project is to identify unusual or unexpected patterns in transactions and related activities.  
**Anomaly detection**, sometimes called outlier detection, is the process of finding patterns or instances in a dataset that deviate significantly from expected, or "normal," behavior.
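
To make the definition concrete, here is a minimal sketch of one simple way to flag values that deviate from "normal": a z-score rule. The function name `zscore_outliers`, the threshold of 2.0, and the sample amounts are all illustrative assumptions, not the method or data used in this project.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return the indices of values whose z-score magnitude exceeds threshold.

    The threshold of 2.0 is an arbitrary illustrative choice.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing deviates from the mean.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts: seven near 1000 and one far larger.
amounts = [1000, 1010, 990, 1005, 995, 1002, 998, 3000]
print(zscore_outliers(amounts))  # [7]
```

A single extreme value inflates both the mean and the standard deviation, so this rule is only a first approximation. Median-based statistics or model-based detectors are usually more robust.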

### Data Understanding
Each row of the dataset describes one financial transaction through the following 12 features:

1. **Transaction_ID**: Unique identifier for each transaction.
2. **Transaction_Amount**: The monetary value of the transaction.
3. **Transaction_Volume**: The quantity or number of items/actions involved in the transaction.
4. **Average_Transaction_Amount**: The historical average transaction amount for the account.
5. **Frequency_of_Transactions**: How often transactions are typically performed by the account.
6. **Time_Since_Last_Transaction**: Time elapsed since the last transaction.
7. **Day_of_Week**: The day of the week when the transaction occurred.
8. **Time_of_Day**: The time of day when the transaction occurred.
9. **Age**: Age of the account holder.
10. **Gender**: Gender of the account holder.
11. **Income**: Income of the account holder.
12. **Account_Type**: Type of account (e.g., personal, business).
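
The feature list above can double as a schema check once the data is loaded. The sketch below is a hedged illustration: `EXPECTED_COLUMNS` and `check_columns` are hypothetical helpers, and the demo frame is built by hand rather than read from the project's CSV file.

```python
import pandas as pd

# Column names taken from the feature list above.
EXPECTED_COLUMNS = [
    "Transaction_ID",
    "Transaction_Amount",
    "Transaction_Volume",
    "Average_Transaction_Amount",
    "Frequency_of_Transactions",
    "Time_Since_Last_Transaction",
    "Day_of_Week",
    "Time_of_Day",
    "Age",
    "Gender",
    "Income",
    "Account_Type",
]


def check_columns(df, expected=EXPECTED_COLUMNS):
    """Return (missing, unexpected) column names relative to the expected list."""
    missing = [c for c in expected if c not in df.columns]
    unexpected = [c for c in df.columns if c not in expected]
    return missing, unexpected


# Hand-built empty frame for illustration: "Income" is left out and a stray
# "Unnamed: 0" column is added, as can happen when a CSV was saved with its index.
demo_columns = [c for c in EXPECTED_COLUMNS if c != "Income"] + ["Unnamed: 0"]
demo = pd.DataFrame(columns=demo_columns)

missing, unexpected = check_columns(demo)
print("missing:", missing)        # missing: ['Income']
print("unexpected:", unexpected)  # unexpected: ['Unnamed: 0']
```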

In [1]:
import pandas as pd

In [3]:
dataset = pd.read_csv("datasets/transaction_anomalies_dataset.csv")

In [5]:
dataset.head(2)

| | Transaction_ID | Transaction_Amount | Transaction_Volume | Average_Transaction_Amount | Frequency_of_Transactions | Time_Since_Last_Transaction | Day_of_Week | Time_of_Day | Age | Gender | Income | Account_Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TX0 | 1024.835708 | 3 | 997.234714 | 12 | 29 | Friday | 06:00 | 36 | Male | 1436074 | Savings |
| 1 | TX1 | 1013.952065 | 4 | 1020.210306 | 7 | 22 | Friday | 01:00 | 41 | Female | 627069 | Savings |


In [6]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 12 columns):
 #   Column                       Non-Null Count  Dtype  
---  ------                       --------------  -----  
 0   Transaction_ID               1000 non-null   object 
 1   Transaction_Amount           1000 non-null   float64
 2   Transaction_Volume           1000 non-null   int64  
 3   Average_Transaction_Amount   1000 non-null   float64
 4   Frequency_of_Transactions    1000 non-null   int64  
 5   Time_Since_Last_Transaction  1000 non-null   int64  
 6   Day_of_Week                  1000 non-null   object 
 7   Time_of_Day                  1000 non-null   object 
 8   Age                          1000 non-null   int64  
 9   Gender                       1000 non-null   object 
 10  Income                       1000 non-null   int64  
 11  Account_Type                 1000 non-null   object 
dtypes: float64(2), int64(5), object(5)
memory usage: 93.9+ KB


In [None]:
dataset.isnull().sum()

Transaction_ID                 0
Transaction_Amount             0
Transaction_Volume             0
Average_Transaction_Amount     0
Frequency_of_Transactions      0
Time_Since_Last_Transaction    0
Day_of_Week                    0
Time_of_Day                    0
Age                            0
Gender                         0
Income                         0
Account_Type                   0
dtype: int64
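
The missing-value check above can be folded into a small reusable data-quality summary. The following is a sketch under stated assumptions: `quality_report` is a hypothetical helper, and the frame is rebuilt from the two rows shown in the `head` output so that the example runs without the CSV file.

```python
import io

import pandas as pd

# Two rows copied from the head() output; the real dataset has 1000 rows.
csv_text = """Transaction_ID,Transaction_Amount,Transaction_Volume,Average_Transaction_Amount,Frequency_of_Transactions,Time_Since_Last_Transaction,Day_of_Week,Time_of_Day,Age,Gender,Income,Account_Type
TX0,1024.835708,3,997.234714,12,29,Friday,06:00,36,Male,1436074,Savings
TX1,1013.952065,4,1020.210306,7,22,Friday,01:00,41,Female,627069,Savings
"""
sample = pd.read_csv(io.StringIO(csv_text))


def quality_report(df, id_column="Transaction_ID"):
    """Summarise row count, total missing cells, and repeated identifiers."""
    return {
        "rows": len(df),
        "missing_cells": int(df.isnull().sum().sum()),
        "duplicate_ids": int(df[id_column].duplicated().sum()),
    }


report = quality_report(sample)
print(report)  # {'rows': 2, 'missing_cells': 0, 'duplicate_ids': 0}
```

Calling `quality_report(dataset)` on the full frame would give the same kind of summary. The duplicate-ID count is an extra check that the notebook itself does not perform.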