<a href="https://colab.research.google.com/github/sna-ds/Fraudulent-Transaction-Analysis-in-E-Commerce/blob/main/Final_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# About Project

## Business Objective
Fraudulent e-commerce transactions cause direct financial losses and erode customer trust. The company wants to identify the factors that contribute to fraudulent transactions, analyze risky transaction patterns, find the customer groups with high fraud risk, and design fraud-prevention strategies.

The results of this project can help the company:
- Reduce losses
- Detect fraudulent transactions early
- Set different verification policies per segment
- Allocate risk-management resources more effectively

## Business Relevance
- Reduce financial loss from chargebacks and fraud.
- Strengthen brand trust and customer loyalty.
- Give the risk-management team insights and pattern-based recommendations for building prevention policies.

## Analysis Method
**Customer Segmentation**: Group customers/transactions by fraud-risk level.
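
As a rough illustration of segment-level risk, the sketch below computes a fraud rate per segment on a small made-up table. The rows, the helper name `fraud_rate_by_segment`, and the choice of `Payment Method` as the segment are assumptions for illustration; the label is assumed to still be the raw 0/1 integer.

```python
import pandas as pd

# Toy rows for illustration only; not taken from the project dataset.
toy = pd.DataFrame({
    "Payment Method": ["PayPal", "PayPal", "debit card", "debit card", "debit card"],
    "Is Fraudulent": [1, 0, 0, 0, 1],
})

def fraud_rate_by_segment(df, segment_col, label_col="Is Fraudulent"):
    """Hypothetical helper: transaction count, fraud count and fraud rate per segment."""
    summary = df.groupby(segment_col)[label_col].agg(
        transactions="count",  # rows in the segment
        frauds="sum",          # rows with label 1
    )
    summary["fraud_rate"] = summary["frauds"] / summary["transactions"]
    # Highest-risk segments first.
    return summary.sort_values("fraud_rate", ascending=False)

rates = fraud_rate_by_segment(toy, "Payment Method")
print(rates)
```

Reporting the transaction count next to the rate helps avoid over-reading segments that contain only a handful of rows.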

# Explore and Prepare Dataset

In [None]:
import pandas as pd

In [None]:
fraud = pd.read_csv('/content/Fraudulent_E-Commerce_Transaction_Data_2.csv')

In [None]:
fraud.head()

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour
0,c12e07a0-8a06-4c0d-b5cc-04f3af688570,8ca9f102-02a4-4207-ab63-484e83a1bdf0,42.32,2024-03-24 23:42:43,PayPal,electronics,1,40,East Jameshaven,desktop,110.87.246.85,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,0,282,23
1,7d187603-7961-4fce-9827-9698e2b6a201,4d158416-caae-4b09-bd5b-15235deb9129,301.34,2024-01-22 00:53:31,credit card,electronics,3,35,Kingstad,tablet,14.73.104.153,"5230 Stephanie Forge\nCollinsbury, PR 81853","5230 Stephanie Forge\nCollinsbury, PR 81853",0,223,0
2,f2c14f9d-92df-4aaf-8931-ceaf4e63ed72,ccae47b8-75c7-4f5a-aa9e-957deced2137,340.32,2024-01-22 08:06:03,debit card,toys & games,5,29,North Ryan,desktop,67.58.94.93,"195 Cole Oval\nPort Larry, IA 58422","4772 David Stravenue Apt. 447\nVelasquezside, ...",0,360,8
3,e9949bfa-194d-486b-84da-9565fca9e5ce,b04960c0-aeee-4907-b1cd-4819016adcef,95.77,2024-01-16 20:34:53,credit card,electronics,5,45,Kaylaville,mobile,202.122.126.216,"7609 Cynthia Square\nWest Brenda, NV 23016","7609 Cynthia Square\nWest Brenda, NV 23016",0,325,20
4,7362837c-7538-434e-8731-0df713f5f26d,de9d6351-b3a7-4bc7-9a55-8f013eb66928,77.45,2024-01-16 15:47:23,credit card,clothing,5,42,North Edwardborough,desktop,96.77.232.76,"2494 Robert Ramp Suite 313\nRobinsonport, AS 5...","2494 Robert Ramp Suite 313\nRobinsonport, AS 5...",0,116,15


In [None]:
fraud.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23634 entries, 0 to 23633
Data columns (total 16 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Transaction ID      23634 non-null  object 
 1   Customer ID         23634 non-null  object 
 2   Transaction Amount  23634 non-null  float64
 3   Transaction Date    23634 non-null  object 
 4   Payment Method      23634 non-null  object 
 5   Product Category    23634 non-null  object 
 6   Quantity            23634 non-null  int64  
 7   Customer Age        23634 non-null  int64  
 8   Customer Location   23634 non-null  object 
 9   Device Used         23634 non-null  object 
 10  IP Address          23634 non-null  object 
 11  Shipping Address    23634 non-null  object 
 12  Billing Address     23634 non-null  object 
 13  Is Fraudulent       23634 non-null  int64  
 14  Account Age Days    23634 non-null  int64  
 15  Transaction Hour    23634 non-null  int64  
dtypes: f

In [None]:
fraud["Is Fraudulent"] = fraud["Is Fraudulent"].map({0: "Legitimate", 1: "Fraudulent"}).astype("category")

In [None]:
fraud['Transaction Date'] = pd.to_datetime(fraud['Transaction Date'])
fraud.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23634 entries, 0 to 23633
Data columns (total 16 columns):
 #   Column              Non-Null Count  Dtype         
---  ------              --------------  -----         
 0   Transaction ID      23634 non-null  object        
 1   Customer ID         23634 non-null  object        
 2   Transaction Amount  23634 non-null  float64       
 3   Transaction Date    23634 non-null  datetime64[ns]
 4   Payment Method      23634 non-null  object        
 5   Product Category    23634 non-null  object        
 6   Quantity            23634 non-null  int64         
 7   Customer Age        23634 non-null  int64         
 8   Customer Location   23634 non-null  object        
 9   Device Used         23634 non-null  object        
 10  IP Address          23634 non-null  object        
 11  Shipping Address    23634 non-null  object        
 12  Billing Address     23634 non-null  object        
 13  Is Fraudulent       23634 non-null  category  

In [None]:
fraud.describe()

Unnamed: 0,Transaction Amount,Transaction Date,Quantity,Customer Age,Account Age Days,Transaction Hour
count,23634.0,23634,23634.0,23634.0,23634.0,23634.0
mean,229.367099,2024-02-18 15:17:19.427942912,3.00055,34.56021,178.660531,11.266015
min,10.0,2024-01-01 00:01:19,1.0,-2.0,1.0,0.0
25%,69.07,2024-01-24 21:05:15.750000128,2.0,28.0,84.0,5.0
50%,151.415,2024-02-18 21:18:54,3.0,35.0,178.0,11.0
75%,296.1275,2024-03-14 00:01:59.500000,4.0,41.0,272.0,17.0
max,9716.5,2024-04-07 08:54:03,5.0,73.0,365.0,23.0
std,282.046669,,1.419663,10.009471,107.388682,6.980659


In [None]:
fraud.describe(include=["object","category"])

Unnamed: 0,Transaction ID,Customer ID,Payment Method,Product Category,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent
count,23634,23634,23634,23634,23634,23634,23634,23634,23634,23634
unique,23634,23634,4,5,14868,3,23634,23634,23634,2
top,23e3c107-f2fc-48c2-abbc-7b809bf6f102,d8d7a64e-8419-4421-910a-a7cf709a900b,debit card,home & garden,North Michael,desktop,116.188.254.162,"289 Adams Wells\nWest Joeltown, LA 69190","289 Adams Wells\nWest Joeltown, LA 69190",Legitimate
freq,1,1,5952,4786,30,7923,1,1,1,22412


In [None]:
categorical_cols = ['Payment Method', 'Product Category', 'Customer Location', 'Device Used', 'Is Fraudulent']

for col in categorical_cols:
    print(f"--- {col} ---")
    print(fraud[col].value_counts(dropna=False))
    print()

--- Payment Method ---
Payment Method
debit card       5952
credit card      5923
PayPal           5899
bank transfer    5860
Name: count, dtype: int64

--- Product Category ---
Product Category
home & garden      4786
electronics        4748
toys & games       4730
clothing           4699
health & beauty    4671
Name: count, dtype: int64

--- Customer Location ---
Customer Location
North Michael        30
East Michael         24
West Christopher     21
East David           20
Lake Michael         20
                     ..
North Walterhaven     1
West Jeff             1
New Karen             1
Phillipville          1
Lake Lindaton         1
Name: count, Length: 14868, dtype: int64

--- Device Used ---
Device Used
desktop    7923
mobile     7881
tablet     7830
Name: count, dtype: int64

--- Is Fraudulent ---
Is Fraudulent
Legitimate    22412
Fraudulent     1222
Name: count, dtype: int64



## 📌 Observation


The dataset has 23,634 rows and 16 columns describing e-commerce transactions. It contains no duplicate rows and no missing values. Some columns contain outliers. Extreme values can be useful for fraud-risk analysis, but outliers that are data errors (implausible values) need attention.

- `Transaction Amount` -> `max = 9,716.5` is far from `mean = 229.37` and well above `Q3 = 296.13`. Large amounts are plausible in e-commerce and may indicate fraud.
- `Quantity` -> No apparent outliers: `mean = 3` and `median = 3`, with values between `min = 1` and `max = 5`.
- `Customer Age` -> Contains an implausible value, `min = -2`. **Must be handled.**
- `Account Age Days` -> No apparent outliers (`mean = 178.66`, `median = 178`), and the range is plausible (`min = 1`, `max = 365`).
- `Transaction Hour` -> No apparent outliers: the mean and median are close (`mean = 11.27`, `median = 11`), and the range is valid (`min = 0`, `max = 23`).
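
The observations above compare mean, median and quartiles by eye. One common way to make that check explicit is the 1.5 × IQR rule, sketched below; this rule is an assumption added for illustration, not a method used in the notebook. The quartiles and maximum are copied from the `describe()` output, and `iqr_fences` is a hypothetical helper.

```python
import pandas as pd

def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles and maximum of `Transaction Amount`, copied from describe().
lower, upper = iqr_fences(q1=69.07, q3=296.1275)
max_amount = 9716.5
print(f"fences: [{lower:.2f}, {upper:.2f}]")
print("max beyond upper fence:", max_amount > upper)

# The same rule applied to a column (toy values, not project data).
toy = pd.Series([10, 20, 30, 40, 1000])
t_lower, t_upper = iqr_fences(toy.quantile(0.25), toy.quantile(0.75))
outliers = toy[(toy < t_lower) | (toy > t_upper)]
print(outliers.tolist())
```

Values beyond the fences are candidates for review, not automatic deletions: for `Transaction Amount` they may be exactly the transactions worth studying.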

The columns fall into 5 categories:
- **Identity (Unique IDs)**
  - Transaction ID
  - Customer ID
- **Transaction Details**
  - Transaction Amount
  - Transaction Date
  - Transaction Hour
  - Payment Method
  - Product Category
  - Quantity
- **User Details**
  - Customer Age
  - Customer Location
  - Account Age Days
  - Device Used
  - IP Address
- **Address Details**
  - Shipping Address
  - Billing Address
- **Fraud Indicator**
  - Is Fraudulent

Data Dictionary:

| Column                | Description |
|-----------------------|-------------|
| Transaction ID        | Unique ID for each transaction. |
| Customer ID           | Unique ID for each customer. |
| Transaction Amount    | Total amount paid in the transaction. |
| Transaction Date      | Date and time of the transaction. |
| Payment Method        | Payment method used (e.g. credit card, PayPal). |
| Product Category      | Category of the product involved in the transaction. |
| Quantity              | Number of products purchased in the transaction. |
| Customer Age          | Age of the customer making the transaction. |
| Customer Location     | Geographic location of the customer. |
| Device Used           | Type of device used for the transaction (e.g. mobile, desktop). |
| IP Address            | IP address of the device used for the transaction. |
| Shipping Address      | Address the product is shipped to. |
| Billing Address       | Address associated with the payment method. |
| Is Fraudulent         | Whether the transaction is fraudulent (1 = fraudulent, 0 = legitimate in the raw data; mapped to `Fraudulent` / `Legitimate` in this notebook). |
| Account Age Days      | Age of the customer's account in days at the time of the transaction. |
| Transaction Hour      | Hour of the day at which the transaction occurred. |


## Data Cleaning

In [None]:
fraud.duplicated().sum()

np.int64(0)

In [None]:
fraud.isnull().sum()

Unnamed: 0,0
Transaction ID,0
Customer ID,0
Transaction Amount,0
Transaction Date,0
Payment Method,0
Product Category,0
Quantity,0
Customer Age,0
Customer Location,0
Device Used,0


In [None]:
for column in fraud.columns:
    print(f"============= {column} =================")
    display(fraud[column].value_counts())
    print()



Unnamed: 0_level_0,count
Transaction ID,Unnamed: 1_level_1
23e3c107-f2fc-48c2-abbc-7b809bf6f102,1
c12e07a0-8a06-4c0d-b5cc-04f3af688570,1
e39e6277-3906-43ef-8496-59c2a1d6ee15,1
86e9ef16-82a3-4728-8c44-08f649e49f32,1
d01496b3-ef2f-4b8f-9fc5-28bcb35d5d84,1
...,...
47b35c5d-d4c9-4a7d-a354-cd41596abf67,1
5da506fe-d4df-474a-b773-146333f06cfe,1
7362837c-7538-434e-8731-0df713f5f26d,1
e9949bfa-194d-486b-84da-9565fca9e5ce,1





Unnamed: 0_level_0,count
Customer ID,Unnamed: 1_level_1
d8d7a64e-8419-4421-910a-a7cf709a900b,1
8ca9f102-02a4-4207-ab63-484e83a1bdf0,1
2164f671-4e48-4301-9d28-6853428726f0,1
71db5799-e3bf-488a-907f-21f59baae5a9,1
6b0d3cac-9c92-49f6-a03d-c57c0a27df0a,1
...,...
6a5305a3-b47c-4bdb-91d7-3bf126530e01,1
03033baf-2bcc-4608-b5b8-9c86976f4948,1
de9d6351-b3a7-4bc7-9a55-8f013eb66928,1
b04960c0-aeee-4907-b1cd-4819016adcef,1





Unnamed: 0_level_0,count
Transaction Amount,Unnamed: 1_level_1
121.71,6
34.85,6
16.20,6
10.42,6
15.26,6
...,...
117.28,1
891.62,1
96.75,1
274.05,1





Unnamed: 0_level_0,count
Transaction Date,Unnamed: 1_level_1
2024-01-06 14:38:19,2
2024-01-15 21:55:37,2
2024-03-16 06:56:48,2
2024-01-05 14:57:33,2
2024-02-08 00:11:30,2
...,...
2024-03-19 04:28:35,1
2024-03-01 15:19:09,1
2024-01-22 04:16:31,1
2024-03-29 04:44:01,1





Unnamed: 0_level_0,count
Payment Method,Unnamed: 1_level_1
debit card,5952
credit card,5923
PayPal,5899
bank transfer,5860





Unnamed: 0_level_0,count
Product Category,Unnamed: 1_level_1
home & garden,4786
electronics,4748
toys & games,4730
clothing,4699
health & beauty,4671





Unnamed: 0_level_0,count
Quantity,Unnamed: 1_level_1
5,4816
2,4764
1,4743
3,4680
4,4631





Unnamed: 0_level_0,count
Customer Age,Unnamed: 1_level_1
33,954
32,936
37,924
36,916
39,913
...,...
68,2
69,2
-2,1
70,1





Unnamed: 0_level_0,count
Customer Location,Unnamed: 1_level_1
North Michael,30
East Michael,24
West Christopher,21
East David,20
Lake Michael,20
...,...
North Walterhaven,1
West Jeff,1
New Karen,1
Phillipville,1





Unnamed: 0_level_0,count
Device Used,Unnamed: 1_level_1
desktop,7923
mobile,7881
tablet,7830





Unnamed: 0_level_0,count
IP Address,Unnamed: 1_level_1
116.188.254.162,1
110.87.246.85,1
175.5.50.240,1
40.175.57.155,1
123.91.78.104,1
...,...
93.54.173.138,1
158.48.161.135,1
96.77.232.76,1
202.122.126.216,1





Unnamed: 0_level_0,count
Shipping Address,Unnamed: 1_level_1
"289 Adams Wells\nWest Joeltown, LA 69190",1
"5399 Rachel Stravenue Suite 718\nNorth Blakeburgh, IL 78600",1
Unit 2963 Box 5735\nDPO AP 53008,1
"63204 Kelly Squares\nAlexandermouth, HI 65282",1
"28515 Phillips Manor\nWest Jenniferfurt, MT 34058",1
...,...
"272 Tammy Isle Apt. 969\nNorth Michaelmouth, MH 02615",1
"PSC 3832, Box 5265\nAPO AE 85694",1
"2494 Robert Ramp Suite 313\nRobinsonport, AS 52039",1
"7609 Cynthia Square\nWest Brenda, NV 23016",1





Unnamed: 0_level_0,count
Billing Address,Unnamed: 1_level_1
"289 Adams Wells\nWest Joeltown, LA 69190",1
"5399 Rachel Stravenue Suite 718\nNorth Blakeburgh, IL 78600",1
Unit 2963 Box 5735\nDPO AP 53008,1
"63204 Kelly Squares\nAlexandermouth, HI 65282",1
"28515 Phillips Manor\nWest Jenniferfurt, MT 34058",1
...,...
"272 Tammy Isle Apt. 969\nNorth Michaelmouth, MH 02615",1
"PSC 3832, Box 5265\nAPO AE 85694",1
"2494 Robert Ramp Suite 313\nRobinsonport, AS 52039",1
"7609 Cynthia Square\nWest Brenda, NV 23016",1





Unnamed: 0_level_0,count
Is Fraudulent,Unnamed: 1_level_1
Legitimate,22412
Fraudulent,1222





Unnamed: 0_level_0,count
Account Age Days,Unnamed: 1_level_1
12,103
4,99
8,95
20,94
22,93
...,...
112,44
267,44
237,44
189,42





Unnamed: 0_level_0,count
Transaction Hour,Unnamed: 1_level_1
0,1102
4,1067
3,1048
1,1048
2,1037
5,1029
8,1017
19,1016
16,1014
13,1002





In [None]:
fraud['Customer Age'].unique()

array([40, 35, 29, 45, 42,  9, 41, 39, 19, 24, 53, 51, 15, 33, 37, 34, 18,
       27, 48, 22, 25, 14, 28, 44, 32, 46, 17, 60, 23, 30, 26, 31, 20, 43,
       38, 47, 36, 50, 49, 16, 61, 21, 58, 13, 54, 52,  8, 56, 12, 73, 55,
       57,  6,  5, 11, 65,  7,  3, 10,  4, 59, 62, 64, 63, 67,  0, 66,  2,
        1, 68, -2, 69, 70, 71])

In [None]:
fraud[fraud['Customer Age'] < 1]

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour
4157,99d127ce-995e-49a1-94ed-da7046822cf7,7a58a020-bbb0-40aa-851f-c7212058ead4,150.7,2024-01-16 15:10:45,debit card,toys & games,5,0,North Ariana,tablet,41.228.150.194,33082 Gallegos Stream Apt. 311\nLake Robertber...,33082 Gallegos Stream Apt. 311\nLake Robertber...,Legitimate,357,15
11200,c8d09263-fce5-42aa-be68-0267eb304b5b,b92b8b5e-46c0-4757-a99a-a7cbeeddb91c,103.83,2024-01-18 02:26:44,debit card,electronics,5,0,Lake Kimberly,tablet,173.210.217.148,"8560 Gregory Mountains\nWest Stephanie, AK 79134","8560 Gregory Mountains\nWest Stephanie, AK 79134",Legitimate,81,2
12919,7fab7db0-e4d2-4b51-98f3-bb17b5153224,e0345e5b-d91c-4851-b9ae-5170c9b8e7cc,190.76,2024-01-05 15:02:49,PayPal,health & beauty,3,-2,Cohenmouth,desktop,134.95.208.38,Unit 2353 Box 9474\nDPO AA 79121,Unit 2353 Box 9474\nDPO AA 79121,Legitimate,253,15
13728,840ef430-f43f-478b-8e7d-be28f2e9e07e,1b03b1d1-f7f1-4cd9-97a3-6edb1545fb65,10.44,2024-03-25 19:36:57,debit card,toys & games,4,0,Gibsonmouth,desktop,182.71.142.114,"764 Alex Ridge Apt. 785\nKarenchester, WA 65308","764 Alex Ridge Apt. 785\nKarenchester, WA 65308",Legitimate,33,19
14869,d7421b58-7184-499f-bc79-3ce923d0563d,dccee636-66ab-4128-98e2-37b6753c1ca4,798.07,2024-04-04 17:07:59,credit card,home & garden,1,0,Jasonmouth,desktop,77.212.47.73,"573 Angela Walk Apt. 802\nGomezchester, IN 10610","573 Angela Walk Apt. 802\nGomezchester, IN 10610",Legitimate,340,17
20224,c542eac6-5520-4c44-b895-4013f2e650d8,d557a548-d5c0-4a1b-9509-7fae9e560346,106.67,2024-03-17 13:00:27,credit card,health & beauty,4,0,Port Guyfurt,mobile,4.147.205.233,"21374 David Prairie\nDarrentown, NE 49917","21374 David Prairie\nDarrentown, NE 49917",Legitimate,222,13
20755,25adc05f-1048-4457-a66c-75029221cf7d,b02c67ab-e6bc-4987-a1f0-183ee93f8398,22.26,2024-02-20 17:38:48,bank transfer,health & beauty,2,0,New Laura,mobile,57.137.156.160,"2358 Thomas Skyway Suite 843\nBowenside, CO 92786","2358 Thomas Skyway Suite 843\nBowenside, CO 92786",Legitimate,324,17
22065,05c46572-1a4b-4038-b4f7-cf096b5ba274,544e5689-559f-4a6d-b16f-1c96347831fa,172.29,2024-03-13 21:31:30,debit card,clothing,3,0,East Hector,mobile,132.74.219.70,"1551 Thornton Forges\nRobersonchester, FM 32443","1551 Thornton Forges\nRobersonchester, FM 32443",Legitimate,222,21
22268,d5c01367-f97d-461f-b238-32a1cee8a2b6,ca6e300f-b084-4f14-8ed4-06db3ebf83b3,177.24,2024-03-15 15:21:30,debit card,toys & games,2,0,South Samanthaview,mobile,146.99.51.124,"6512 Shaw Cliff Apt. 079\nWest Joseph, MO 28555","6512 Shaw Cliff Apt. 079\nWest Joseph, MO 28555",Legitimate,330,15


### 📌 Observation
After checking for duplicates and missing values, the data shows neither.
> `Customer Age` contains implausible values: the filter output above shows one row with `-2` and eight rows with `0`. These rows should be dropped.
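
A minimal sketch of that clean-up step on made-up ages, assuming the same threshold of `Customer Age >= 1`. It counts the rows before dropping them, so the size of the removal is visible; the variable names are illustrative only.

```python
import pandas as pd

# Toy ages for illustration only; not project data.
toy = pd.DataFrame({"Customer Age": [40, 0, -2, 35, 18]})

valid_age = toy["Customer Age"] >= 1  # boolean mask, True = keep the row
n_dropped = int((~valid_age).sum())
print(f"dropping {n_dropped} of {len(toy)} rows")

# .copy() gives an independent frame, so adding columns later does not
# trigger pandas' SettingWithCopyWarning.
cleaned = toy.loc[valid_age].copy()
print(cleaned)
```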

In [None]:
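# Keep only rows with a plausible age; .copy() avoids SettingWithCopyWarning when columns are added below.
fraud = fraud[fraud['Customer Age'] >= 1].copy()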

In [None]:
fraud['Customer Age'].unique()

array([40, 35, 29, 45, 42,  9, 41, 39, 19, 24, 53, 51, 15, 33, 37, 34, 18,
       27, 48, 22, 25, 14, 28, 44, 32, 46, 17, 60, 23, 30, 26, 31, 20, 43,
       38, 47, 36, 50, 49, 16, 61, 21, 58, 13, 54, 52,  8, 56, 12, 73, 55,
       57,  6,  5, 11, 65,  7,  3, 10,  4, 59, 62, 64, 63, 67, 66,  2,  1,
       68, 69, 70, 71])

## Data Manipulation & Feature Engineering

### Age Segmentation Column

In [None]:
def customer_age_group(age):
    if age < 25:
        return 'Youth (<25)'
    elif age <= 40:
        return 'Adult (25–40)'
    else:
        return 'Senior (40+)'

fraud['Customer_Age_Group'] = fraud['Customer Age'].apply(customer_age_group)
fraud.head()

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour,Customer_Age_Group
0,c12e07a0-8a06-4c0d-b5cc-04f3af688570,8ca9f102-02a4-4207-ab63-484e83a1bdf0,42.32,2024-03-24 23:42:43,PayPal,electronics,1,40,East Jameshaven,desktop,110.87.246.85,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,Legitimate,282,23,Adult (25–40)
1,7d187603-7961-4fce-9827-9698e2b6a201,4d158416-caae-4b09-bd5b-15235deb9129,301.34,2024-01-22 00:53:31,credit card,electronics,3,35,Kingstad,tablet,14.73.104.153,"5230 Stephanie Forge\nCollinsbury, PR 81853","5230 Stephanie Forge\nCollinsbury, PR 81853",Legitimate,223,0,Adult (25–40)
2,f2c14f9d-92df-4aaf-8931-ceaf4e63ed72,ccae47b8-75c7-4f5a-aa9e-957deced2137,340.32,2024-01-22 08:06:03,debit card,toys & games,5,29,North Ryan,desktop,67.58.94.93,"195 Cole Oval\nPort Larry, IA 58422","4772 David Stravenue Apt. 447\nVelasquezside, ...",Legitimate,360,8,Adult (25–40)
3,e9949bfa-194d-486b-84da-9565fca9e5ce,b04960c0-aeee-4907-b1cd-4819016adcef,95.77,2024-01-16 20:34:53,credit card,electronics,5,45,Kaylaville,mobile,202.122.126.216,"7609 Cynthia Square\nWest Brenda, NV 23016","7609 Cynthia Square\nWest Brenda, NV 23016",Legitimate,325,20,Senior (40+)
4,7362837c-7538-434e-8731-0df713f5f26d,de9d6351-b3a7-4bc7-9a55-8f013eb66928,77.45,2024-01-16 15:47:23,credit card,clothing,5,42,North Edwardborough,desktop,96.77.232.76,"2494 Robert Ramp Suite 313\nRobinsonport, AS 5...","2494 Robert Ramp Suite 313\nRobinsonport, AS 5...",Legitimate,116,15,Senior (40+)
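
The same three age bands can also be produced without a row-wise `apply`. The sketch below uses `pd.cut` as an alternative, not as the notebook's method. It assumes integer ages (`Customer Age` is `int64` here), so the right-closed bin edges 24 and 40 reproduce the `< 25` and `<= 40` conditions; the sample ages are made up.

```python
import numpy as np
import pandas as pd

# Toy ages chosen to hit both sides of each boundary; not project data.
ages = pd.Series([9, 24, 25, 40, 41, 73], name="Customer Age")

# Bins are right-closed: (-inf, 24], (24, 40], (40, inf).
age_group = pd.cut(
    ages,
    bins=[-np.inf, 24, 40, np.inf],
    labels=["Youth (<25)", "Adult (25–40)", "Senior (40+)"],
)
print(age_group.tolist())
```

The result is an ordered categorical, which keeps the groups in band order when they are sorted or plotted.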


### Account Age Group Column

In [None]:
def account_age_group(account):
    if account < 30:
        return 'New Account (<30)'
    elif account <= 90:
        return 'Mid Age Account (30-90)'
    else:
        return 'Established Account (90+)'

fraud['Account_Age_Group'] = fraud['Account Age Days'].apply(account_age_group)
fraud.head()

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour,Customer_Age_Group,Account_Age_Group
0,c12e07a0-8a06-4c0d-b5cc-04f3af688570,8ca9f102-02a4-4207-ab63-484e83a1bdf0,42.32,2024-03-24 23:42:43,PayPal,electronics,1,40,East Jameshaven,desktop,110.87.246.85,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,Legitimate,282,23,Adult (25–40),Established Account (90+)
1,7d187603-7961-4fce-9827-9698e2b6a201,4d158416-caae-4b09-bd5b-15235deb9129,301.34,2024-01-22 00:53:31,credit card,electronics,3,35,Kingstad,tablet,14.73.104.153,"5230 Stephanie Forge\nCollinsbury, PR 81853","5230 Stephanie Forge\nCollinsbury, PR 81853",Legitimate,223,0,Adult (25–40),Established Account (90+)
2,f2c14f9d-92df-4aaf-8931-ceaf4e63ed72,ccae47b8-75c7-4f5a-aa9e-957deced2137,340.32,2024-01-22 08:06:03,debit card,toys & games,5,29,North Ryan,desktop,67.58.94.93,"195 Cole Oval\nPort Larry, IA 58422","4772 David Stravenue Apt. 447\nVelasquezside, ...",Legitimate,360,8,Adult (25–40),Established Account (90+)
3,e9949bfa-194d-486b-84da-9565fca9e5ce,b04960c0-aeee-4907-b1cd-4819016adcef,95.77,2024-01-16 20:34:53,credit card,electronics,5,45,Kaylaville,mobile,202.122.126.216,"7609 Cynthia Square\nWest Brenda, NV 23016","7609 Cynthia Square\nWest Brenda, NV 23016",Legitimate,325,20,Senior (40+),Established Account (90+)
4,7362837c-7538-434e-8731-0df713f5f26d,de9d6351-b3a7-4bc7-9a55-8f013eb66928,77.45,2024-01-16 15:47:23,credit card,clothing,5,42,North Edwardborough,desktop,96.77.232.76,"2494 Robert Ramp Suite 313\nRobinsonport, AS 5...","2494 Robert Ramp Suite 313\nRobinsonport, AS 5...",Legitimate,116,15,Senior (40+),Established Account (90+)


### Time Category

In [None]:
fraud['Transaction Hour'] = fraud['Transaction Date'].dt.hour
fraud.head()

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour,Customer_Age_Group,Account_Age_Group
0,c12e07a0-8a06-4c0d-b5cc-04f3af688570,8ca9f102-02a4-4207-ab63-484e83a1bdf0,42.32,2024-03-24 23:42:43,PayPal,electronics,1,40,East Jameshaven,desktop,110.87.246.85,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,Legitimate,282,23,Adult (25–40),Established Account (90+)
1,7d187603-7961-4fce-9827-9698e2b6a201,4d158416-caae-4b09-bd5b-15235deb9129,301.34,2024-01-22 00:53:31,credit card,electronics,3,35,Kingstad,tablet,14.73.104.153,"5230 Stephanie Forge\nCollinsbury, PR 81853","5230 Stephanie Forge\nCollinsbury, PR 81853",Legitimate,223,0,Adult (25–40),Established Account (90+)
2,f2c14f9d-92df-4aaf-8931-ceaf4e63ed72,ccae47b8-75c7-4f5a-aa9e-957deced2137,340.32,2024-01-22 08:06:03,debit card,toys & games,5,29,North Ryan,desktop,67.58.94.93,"195 Cole Oval\nPort Larry, IA 58422","4772 David Stravenue Apt. 447\nVelasquezside, ...",Legitimate,360,8,Adult (25–40),Established Account (90+)
3,e9949bfa-194d-486b-84da-9565fca9e5ce,b04960c0-aeee-4907-b1cd-4819016adcef,95.77,2024-01-16 20:34:53,credit card,electronics,5,45,Kaylaville,mobile,202.122.126.216,"7609 Cynthia Square\nWest Brenda, NV 23016","7609 Cynthia Square\nWest Brenda, NV 23016",Legitimate,325,20,Senior (40+),Established Account (90+)
4,7362837c-7538-434e-8731-0df713f5f26d,de9d6351-b3a7-4bc7-9a55-8f013eb66928,77.45,2024-01-16 15:47:23,credit card,clothing,5,42,North Edwardborough,desktop,96.77.232.76,"2494 Robert Ramp Suite 313\nRobinsonport, AS 5...","2494 Robert Ramp Suite 313\nRobinsonport, AS 5...",Legitimate,116,15,Senior (40+),Established Account (90+)


In [None]:
def categorize_time(hour):
    if 0 <= hour <= 4:
        return "Late Night"      # 00:00–04:59
    elif 5 <= hour <= 8:
        return "Early Morning"   # 05:00–08:59
    elif 9 <= hour <= 11:
        return "Morning"         # 09:00–11:59
    elif 12 <= hour <= 16:
        return "Afternoon"       # 12:00–16:59
    elif 17 <= hour <= 20:
        return "Evening"         # 17:00–20:59
    else:
        return "Night"           # 21:00–23:59

fraud["Time Category"] = fraud["Transaction Hour"].apply(categorize_time)
fraud.head()

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour,Customer_Age_Group,Account_Age_Group,Time Category
0,c12e07a0-8a06-4c0d-b5cc-04f3af688570,8ca9f102-02a4-4207-ab63-484e83a1bdf0,42.32,2024-03-24 23:42:43,PayPal,electronics,1,40,East Jameshaven,desktop,110.87.246.85,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,Legitimate,282,23,Adult (25–40),Established Account (90+),Night
1,7d187603-7961-4fce-9827-9698e2b6a201,4d158416-caae-4b09-bd5b-15235deb9129,301.34,2024-01-22 00:53:31,credit card,electronics,3,35,Kingstad,tablet,14.73.104.153,"5230 Stephanie Forge\nCollinsbury, PR 81853","5230 Stephanie Forge\nCollinsbury, PR 81853",Legitimate,223,0,Adult (25–40),Established Account (90+),Late Night
2,f2c14f9d-92df-4aaf-8931-ceaf4e63ed72,ccae47b8-75c7-4f5a-aa9e-957deced2137,340.32,2024-01-22 08:06:03,debit card,toys & games,5,29,North Ryan,desktop,67.58.94.93,"195 Cole Oval\nPort Larry, IA 58422","4772 David Stravenue Apt. 447\nVelasquezside, ...",Legitimate,360,8,Adult (25–40),Established Account (90+),Early Morning
3,e9949bfa-194d-486b-84da-9565fca9e5ce,b04960c0-aeee-4907-b1cd-4819016adcef,95.77,2024-01-16 20:34:53,credit card,electronics,5,45,Kaylaville,mobile,202.122.126.216,"7609 Cynthia Square\nWest Brenda, NV 23016","7609 Cynthia Square\nWest Brenda, NV 23016",Legitimate,325,20,Senior (40+),Established Account (90+),Evening
4,7362837c-7538-434e-8731-0df713f5f26d,de9d6351-b3a7-4bc7-9a55-8f013eb66928,77.45,2024-01-16 15:47:23,credit card,clothing,5,42,North Edwardborough,desktop,96.77.232.76,"2494 Robert Ramp Suite 313\nRobinsonport, AS 5...","2494 Robert Ramp Suite 313\nRobinsonport, AS 5...",Legitimate,116,15,Senior (40+),Established Account (90+),Afternoon


### Extract Day

In [None]:
fraud['DayOfWeek'] = fraud['Transaction Date'].dt.day_name()
fraud.head()

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour,Customer_Age_Group,Account_Age_Group,Time Category,DayOfWeek
0,c12e07a0-8a06-4c0d-b5cc-04f3af688570,8ca9f102-02a4-4207-ab63-484e83a1bdf0,42.32,2024-03-24 23:42:43,PayPal,electronics,1,40,East Jameshaven,desktop,110.87.246.85,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,5399 Rachel Stravenue Suite 718\nNorth Blakebu...,Legitimate,282,23,Adult (25–40),Established Account (90+),Night,Sunday
1,7d187603-7961-4fce-9827-9698e2b6a201,4d158416-caae-4b09-bd5b-15235deb9129,301.34,2024-01-22 00:53:31,credit card,electronics,3,35,Kingstad,tablet,14.73.104.153,"5230 Stephanie Forge\nCollinsbury, PR 81853","5230 Stephanie Forge\nCollinsbury, PR 81853",Legitimate,223,0,Adult (25–40),Established Account (90+),Late Night,Monday
2,f2c14f9d-92df-4aaf-8931-ceaf4e63ed72,ccae47b8-75c7-4f5a-aa9e-957deced2137,340.32,2024-01-22 08:06:03,debit card,toys & games,5,29,North Ryan,desktop,67.58.94.93,"195 Cole Oval\nPort Larry, IA 58422","4772 David Stravenue Apt. 447\nVelasquezside, ...",Legitimate,360,8,Adult (25–40),Established Account (90+),Early Morning,Monday
3,e9949bfa-194d-486b-84da-9565fca9e5ce,b04960c0-aeee-4907-b1cd-4819016adcef,95.77,2024-01-16 20:34:53,credit card,electronics,5,45,Kaylaville,mobile,202.122.126.216,"7609 Cynthia Square\nWest Brenda, NV 23016","7609 Cynthia Square\nWest Brenda, NV 23016",Legitimate,325,20,Senior (40+),Established Account (90+),Evening,Tuesday
4,7362837c-7538-434e-8731-0df713f5f26d,de9d6351-b3a7-4bc7-9a55-8f013eb66928,77.45,2024-01-16 15:47:23,credit card,clothing,5,42,North Edwardborough,desktop,96.77.232.76,"2494 Robert Ramp Suite 313\nRobinsonport, AS 5...","2494 Robert Ramp Suite 313\nRobinsonport, AS 5...",Legitimate,116,15,Senior (40+),Established Account (90+),Afternoon,Tuesday
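
`day_name()` returns plain strings, which sort alphabetically rather than by weekday. If a weekday order is needed in later summaries, one option is an ordered categorical, sketched below. The three timestamps are taken from the preview rows above; `day_order` and the other names are illustrative.

```python
import pandas as pd

# Three timestamps taken from the preview rows above.
dates = pd.Series(pd.to_datetime([
    "2024-03-24 23:42:43",
    "2024-01-22 00:53:31",
    "2024-01-16 20:34:53",
]))

day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# An ordered categorical makes sorting follow the weekday order, not the alphabet.
day_of_week = pd.Series(
    pd.Categorical(dates.dt.day_name(), categories=day_order, ordered=True)
)

print(day_of_week.sort_values().tolist())
counts = day_of_week.value_counts().sort_index()
print(counts)
```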


# Export to Excel

In [None]:
fraud.to_excel('fraud_featured.xlsx', index=False)

In [None]:
just_fraud = fraud[fraud['Is Fraudulent'] == 'Fraudulent']
just_fraud.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1222 entries, 36 to 23601
Data columns (total 20 columns):
 #   Column              Non-Null Count  Dtype         
---  ------              --------------  -----         
 0   Transaction ID      1222 non-null   object        
 1   Customer ID         1222 non-null   object        
 2   Transaction Amount  1222 non-null   float64       
 3   Transaction Date    1222 non-null   datetime64[ns]
 4   Payment Method      1222 non-null   object        
 5   Product Category    1222 non-null   object        
 6   Quantity            1222 non-null   int64         
 7   Customer Age        1222 non-null   int64         
 8   Customer Location   1222 non-null   object        
 9   Device Used         1222 non-null   object        
 10  IP Address          1222 non-null   object        
 11  Shipping Address    1222 non-null   object        
 12  Billing Address     1222 non-null   object        
 13  Is Fraudulent       1222 non-null   category      


In [None]:
just_fraud.head()

Unnamed: 0,Transaction ID,Customer ID,Transaction Amount,Transaction Date,Payment Method,Product Category,Quantity,Customer Age,Customer Location,Device Used,IP Address,Shipping Address,Billing Address,Is Fraudulent,Account Age Days,Transaction Hour,Customer_Age_Group,Account_Age_Group,Time Category,DayOfWeek
36,01784f08-338e-461f-9685-6bcb507ab1f4,d5010ab2-ceff-4617-b0bc-735409c64104,222.0,2024-03-25 19:30:56,bank transfer,home & garden,1,51,Robinton,tablet,138.3.124.205,"5531 Sharp Squares Apt. 982\nNew Rachel, GU 76203","5531 Sharp Squares Apt. 982\nNew Rachel, GU 76203",Fraudulent,194,19,Senior (40+),Established Account (90+),Evening,Monday
115,45fa1005-7ae4-4183-96c3-18e9a078bfed,ce489933-dd4f-455f-a2a2-fe576abf147c,307.37,2024-01-20 19:27:38,bank transfer,home & garden,2,32,North Hollystad,desktop,223.202.233.44,"52360 Bell Crossing\nEdwinchester, MH 89729","52360 Bell Crossing\nEdwinchester, MH 89729",Fraudulent,166,19,Adult (25–40),Established Account (90+),Evening,Saturday
169,59691090-4f05-416a-8c41-4ca68386cf19,1a83682f-3737-4487-ba76-6e547b7af139,94.23,2024-03-11 14:30:07,bank transfer,health & beauty,4,26,Scotthaven,tablet,52.12.114.19,"17598 Vanessa Shores Suite 221\nNorth Ashley, ...","17598 Vanessa Shores Suite 221\nNorth Ashley, ...",Fraudulent,27,14,Adult (25–40),New Account (<30),Afternoon,Monday
204,a858dc67-2d9a-48d9-b89a-1a14cefa26c6,516fc8a2-1acc-4af3-b75c-4f06608b50d8,70.93,2024-02-21 21:02:11,PayPal,home & garden,4,52,Stephenfort,tablet,172.110.189.188,"0899 Jonathan Islands\nKellermouth, CO 07405","0899 Jonathan Islands\nKellermouth, CO 07405",Fraudulent,182,21,Senior (40+),Established Account (90+),Night,Wednesday
206,710aef60-ff0b-4767-8f0a-fa4a72044a3a,519d9b03-5049-47fa-bce9-15d159bf856f,1465.65,2024-02-21 15:04:45,debit card,toys & games,5,58,East John,desktop,7.31.74.212,"8805 David Union Suite 461\nStaceyshire, MP 32513","8805 David Union Suite 461\nStaceyshire, MP 32513",Fraudulent,141,15,Senior (40+),Established Account (90+),Afternoon,Wednesday


In [None]:
just_fraud.to_excel('fraud table.xlsx', index=False)