## OVERVIEW OF DATASET
### 📊 Nigerian Financial Transactions and Fraud Detection Dataset

This synthetic dataset simulates **5 million Nigerian financial transactions** over a 12-month period, tailored for fraud detection and behavioral analysis. It includes **45 features** spanning transaction details, user behavior, device/IP intelligence, temporal patterns, and risk scores.

---

### 🔍 Key Characteristics

- **Fraud Rate**: ~15% (realistic for emerging markets)  
- **Unique Sender Accounts**: ~500,000  
- **Geographic Coverage**: 20 major cities across 6 Nigerian regions  
- **Payment Channels**: USSD, Mobile App, Card, Bank Transfer  
- **User Personas**: Salary Earner, Student, Trader  
- **Fraud Types**:
  - Account Takeover  
  - Identity Fraud  
  - SIM Swap  
  - Money Laundering  
  - Deposit Fraud  
  - Card-Not-Present  
  - Impossible Travel Fraud  

---

### ⚙️ Feature Engineering

- **No Data Leakage**: Expanding windows with `.shift(1)` logic, so each feature value is computed only from transactions that occurred earlier  
- **Rolling Statistics**: Merchant fraud rate, user behavior metrics  
- **Temporal Consistency**: Chronologically computed features  
- **Realistic Patterns**: Based on Nigerian fintech behaviors  
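
The expanding-window-plus-`.shift(1)` idea can be illustrated with a small sketch. It is not the dataset authors' actual pipeline: the function name `add_prior_merchant_fraud_rate`, the output column `merchant_prior_fraud_rate`, and the five-row demo frame are assumptions made for illustration. Only the column names `timestamp`, `merchant_category`, and `is_fraud` come from the dataset.

```python
import pandas as pd


def add_prior_merchant_fraud_rate(df: pd.DataFrame) -> pd.DataFrame:
    """Add, per row, the fraud rate its merchant category had BEFORE that row."""
    # Sort chronologically so "earlier" means earlier in time, not earlier in the file.
    out = df.sort_values("timestamp", kind="stable").copy()
    out["merchant_prior_fraud_rate"] = (
        out.groupby("merchant_category")["is_fraud"]
        # shift(1) drops the current row's label from the window,
        # then expanding().mean() averages everything seen so far.
        .transform(lambda s: s.astype(float).shift(1).expanding().mean())
    )
    return out


# Hypothetical demo data (not taken from the real file).
demo = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05"]
        ),
        "merchant_category": ["A", "A", "B", "A", "B"],
        "is_fraud": [True, False, False, True, True],
    }
)
result = add_prior_merchant_fraud_rate(demo)
print(result[["merchant_category", "is_fraud", "merchant_prior_fraud_rate"]])
```

The first transaction of each merchant category has no history, so its feature value is `NaN`. A row's own label never contributes to its own feature, which is the property the "no data leakage" claim depends on.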

---

### 🎯 Use Cases

- Supervised fraud detection model training  
- Real-time risk scoring system development  
- Behavioral analytics and segmentation  
- Unsupervised anomaly detection  
- Nigerian fintech research and benchmarking  

---

### ✅ Data Quality

- No missing values in critical fields  
- Logical consistency across features  
- Realistic transaction amounts and timing  
- Balanced fraud-to-legitimate ratio  

---

### ⚠️ Limitations

- Synthetic data, not based on real transactions  
- Limited to a 12-month simulation  
- May not capture evolving fraud patterns  
- Region-specific, tailored to the Nigerian context  

---

### 💡 Recommended Usage

- Use as a training supplement, not a replacement for real data  
- Validate models on actual Nigerian transaction data  
- Retrain models regularly to adapt to evolving fraud tactics  
- Combine with external sources for production-grade systems

### Summary: Downloading the Nigerian Financial Transactions and Fraud Detection Dataset Locally

#### Steps Taken

1. **Installed Git LFS**
   ```powershell
   git lfs install
   ```
2. **Generated a Hugging Face access token** (with read access)
3. **Logged in with the Hugging Face CLI**: `huggingface-cli login`
4. **Cloned the dataset repository**: `git clone https://huggingface.co/datasets/electricsheepafrica/Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset`
5. **Moved into the repository folder**: `cd Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset`
6. **Pulled the LFS-tracked files**: `git lfs pull`



In [1]:
%pip install datasets psycopg2-binary sqlalchemy hf_xet





In [2]:
import pandas as pd
import numpy as np

In [3]:


df1 = pd.read_csv(r"C:\Users\PAMELA\Desktop\altschool-hackathon\Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset\V1-nigerian-financial-transactions-and-fraud-detection-dataset.csv")
#df2 = pd.read_csv("C:/Users/PAMELA/Desktop/altschool-hackathon/Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset/V2-nigerian-financial-transactions-and-fraud-detection-dataset-for-model-training.csv")

print(df1.head())


  transaction_id                   timestamp  sender_account  \
0        T100000  2023-08-27 09:03:17.516168      9899691027   
1        T100001  2023-08-25 14:11:12.606711      2194178774   
2        T100002  2023-05-19 09:30:37.742963      4193666484   
3        T100003  2023-10-28 07:26:44.195112      9174692071   
4        T100004  2023-09-16 17:58:14.700162      8722569311   

   receiver_account transaction_type           merchant_category location  \
0        5792850510       withdrawal  Data Subscription (Airtel)    Lagos   
1        7275770518       withdrawal            Paystack Payment      Aba   
2        7538320427          deposit              ATM Withdrawal   Ibadan   
3        9091723192          deposit                 Konga Order   Ibadan   
4        5128595934         transfer        Airtime Top-up (MTN)   Ibadan   

  device_used  is_fraud fraud_type  ...  spending_deviation_score  \
0      mobile     False        NaN  ...                     -0.21   
1         atm 
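
With about 5 million rows, a plain `pd.read_csv` call stores every text column as Python objects and leaves `timestamp` as a string. The sketch below shows one possible way to load the file more economically, using the `dtype`, `parse_dates`, and `nrows` parameters of `pd.read_csv`. The helper name `load_transactions`, the choice of which columns to treat as categorical, and the two-row in-memory CSV standing in for the real file are all assumptions.

```python
import io

import pandas as pd

# Assumption: these low-cardinality text columns are good candidates for "category".
CATEGORY_COLUMNS = {
    "transaction_type": "category",
    "merchant_category": "category",
    "location": "category",
    "device_used": "category",
    "payment_channel": "category",
    "sender_persona": "category",
}


def load_transactions(source, nrows=None):
    """Read the transactions CSV with parsed timestamps and categorical text columns."""
    return pd.read_csv(
        source,
        dtype=CATEGORY_COLUMNS,
        parse_dates=["timestamp"],
        nrows=nrows,  # e.g. nrows=100_000 for a quick first look
    )


# Tiny stand-in for the real file, using a subset of its columns.
sample_csv = io.StringIO(
    "transaction_id,timestamp,transaction_type,merchant_category,location,"
    "device_used,payment_channel,sender_persona,amount_ngn\n"
    "T3059801,2023-03-25 18:58:30.886696,transfer,Jumia Purchase,Lagos,"
    "mobile,USSD,Salary Earner,37322.16\n"
    "T308061,2023-09-18 01:56:47.177576,withdrawal,Bolt Ride,Kano,"
    "web,Card,Trader,2429858.63\n"
)
demo_df = load_transactions(sample_csv)
print(demo_df.dtypes)
```

For the real data, `source` would be the path to the V1 CSV file. Inspecting `df.memory_usage(deep=True)` before and after is a simple way to check whether the categorical conversion actually helps.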

In [4]:
df1.isnull()

| | transaction_id | timestamp | sender_account | receiver_account | transaction_type | merchant_category | location | device_used | is_fraud | fraud_type | ... | spending_deviation_score | velocity_score | geo_anomaly_score | payment_channel | ip_address | device_hash | amount_ngn | bvn_linked | new_device_transaction | sender_persona |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| 1 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| 2 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| 3 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| 4 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4999995 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| 4999996 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| 4999997 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |
| 4999998 | False | False | False | False | False | False | False | False | False | True | ... | False | False | False | False | False | False | False | False | False | False |


In [5]:
df1.isnull().value_counts()

transaction_id  timestamp  sender_account  receiver_account  transaction_type  merchant_category  location  device_used  is_fraud  fraud_type  time_since_last_transaction  spending_deviation_score  velocity_score  geo_anomaly_score  payment_channel  ip_address  device_hash  amount_ngn  bvn_linked  new_device_transaction  sender_persona
False           False      False           False             False             False              False     False        False     True        False                        False                     False           False              False            False       False        False       False       False                   False             3923934
                                                                                                                                               True                         False                     False           False              False            False       False        False       False       False         

In [6]:
df1.isnull().sum()


transaction_id                       0
timestamp                            0
sender_account                       0
receiver_account                     0
transaction_type                     0
merchant_category                    0
location                             0
device_used                          0
is_fraud                             0
fraud_type                     4820447
time_since_last_transaction     896513
spending_deviation_score             0
velocity_score                       0
geo_anomaly_score                    0
payment_channel                      0
ip_address                           0
device_hash                          0
amount_ngn                           0
bvn_linked                           0
new_device_transaction               0
sender_persona                       0
dtype: int64
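
Only `fraud_type` and `time_since_last_transaction` contain missing values. If the file holds 5,000,000 rows, as the overview states, then 4,820,447 missing `fraud_type` values would leave about 179,553 rows with a fraud type, or roughly 3.6% of the data. That figure is worth checking directly against `is_fraud`. The sketch below shows one way to summarise missingness as percentages and to confirm that `fraud_type` is empty exactly on non-fraud rows. The helper names `missingness_report` and `fraud_type_matches_label` and the four-row demo frame are assumptions for illustration.

```python
import pandas as pd


def missingness_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return missing-value counts and percentages for columns that have any."""
    counts = df.isnull().sum()
    report = pd.DataFrame(
        {"missing": counts, "missing_pct": (counts / len(df) * 100).round(2)}
    )
    return report[report["missing"] > 0]


def fraud_type_matches_label(df: pd.DataFrame) -> bool:
    """True when fraud_type is missing on exactly the rows where is_fraud is False."""
    not_fraud = ~df["is_fraud"].astype(bool)
    return bool((df["fraud_type"].isnull() == not_fraud).all())


# Hypothetical demo data (not taken from the real file).
demo_nulls = pd.DataFrame(
    {
        "is_fraud": [False, True, False, False],
        "fraud_type": [None, "Account Takeover", None, None],
        "time_since_last_transaction": [None, 2.5, 1.0, None],
    }
)
report = missingness_report(demo_nulls)
print(report)
print(fraud_type_matches_label(demo_nulls))
```

On the real frame, `missingness_report(df1)` and `df1["is_fraud"].mean()` together would show whether the missing `fraud_type` values are structural (non-fraud rows) rather than a data quality problem.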

In [10]:
df1.columns

Index(['transaction_id', 'timestamp', 'sender_account', 'receiver_account',
       'transaction_type', 'merchant_category', 'location', 'device_used',
       'is_fraud', 'fraud_type', 'time_since_last_transaction',
       'spending_deviation_score', 'velocity_score', 'geo_anomaly_score',
       'payment_channel', 'ip_address', 'device_hash', 'amount_ngn',
       'bvn_linked', 'new_device_transaction', 'sender_persona'],
      dtype='object')

In [13]:
df1.sample(10)

| | transaction_id | timestamp | sender_account | receiver_account | transaction_type | merchant_category | location | device_used | is_fraud | fraud_type | ... | spending_deviation_score | velocity_score | geo_anomaly_score | payment_channel | ip_address | device_hash | amount_ngn | bvn_linked | new_device_transaction | sender_persona |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2959801 | T3059801 | 2023-03-25 18:58:30.886696 | 8766496791 | 4739102957 | transfer | Jumia Purchase | Lagos | mobile | False | NaN | ... | -0.74 | 19 | 0.02 | USSD | 102.89.37.137 | D1852403 | 37322.16 | True | True | Salary Earner |
| 208061 | T308061 | 2023-09-18 01:56:47.177576 | 7438598957 | 4454111348 | withdrawal | Bolt Ride | Kano | web | False | NaN | ... | 1.36 | 17 | 0.97 | Card | 105.113.110.216 | D7110979 | 2429858.63 | True | False | Trader |
| 3347500 | T3447500 | 2023-07-26 13:46:59.364248 | 1906271286 | 2031702380 | withdrawal | Airtime Top-up (MTN) | Lagos | mobile | False | NaN | ... | 0.26 | 12 | 0.69 | Mobile App | 41.58.235.57 | D1507151 | 548938.08 | True | True | Salary Earner |
| 28507 | T128507 | 2023-03-06 06:10:26.704781 | 2773648643 | 2515953714 | deposit | Ikeja Electric Bill | Onitsha | pos | False | NaN | ... | -1.26 | 19 | 0.24 | Bank Transfer | 197.210.25.197 | D7031748 | 1188872.85 | True | False | Salary Earner |
| 3392635 | T3492635 | 2023-02-13 10:20:14.987032 | 1169258684 | 6021089401 | deposit | Filmhouse Cinemas Ticket | Abuja | mobile | True | Account Takeover | ... | 0.99 | 16 | 0.33 | Bank Transfer | 197.210.109.255 | D1144421 | 733175.73 | True | True | Student |
| 586105 | T686105 | 2023-07-10 07:30:29.261109 | 8241002954 | 6904842381 | payment | SPAR Purchase | Port Harcourt | web | True | Account Takeover | ... | 0.55 | 18 | 0.16 | Card | 197.210.99.221 | D3865912 | 34058.09 | True | True | Salary Earner |
| 4388469 | T4488469 | 2023-11-17 15:34:23.519677 | 3409274688 | 8968114944 | withdrawal | Church Offering | Ibadan | mobile | False | NaN | ... | 0.67 | 20 | 0.29 | Mobile App | 41.58.213.75 | D9188706 | 439508.23 | True | True | Salary Earner |
| 2930589 | T3030589 | 2023-05-24 11:31:08.685386 | 5079121320 | 3472634069 | withdrawal | Airtime Top-up (MTN) | Aba | web | False | NaN | ... | 1.71 | 2 | 0.23 | Bank Transfer | 105.112.158.198 | D1233168 | 1362369.95 | True | True | Trader |
| 878945 | T978945 | 2023-11-03 14:25:33.550839 | 4126288859 | 6361428441 | payment | SPAR Purchase | Ibadan | pos | False | NaN | ... | 0.18 | 5 | 0.69 | Mobile App | 105.113.235.57 | D6479100 | 59648.53 | True | True | Student |
| 3771098 | T3871098 | 2023-05-02 13:51:51.294976 | 3587048643 | 7859554477 | payment | Konga Order | Aba | pos | False | NaN | ... | 0.33 | 4 | 0.04 | Bank Transfer | 105.112.9.196 | D2816892 | 39857.96 | False | True | Student |
