In [1]:
import pandas as pd

#Loading Data
df = pd.read_csv('/content/FraudOrders_sample.csv')

In [7]:
#Understanding Data
df.head()
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 27930 entries, 1527 to 8599
Data columns (total 15 columns):
 #   Column                                Non-Null Count  Dtype  
---  ------                                --------------  -----  
 0   OrderID                               27930 non-null  object 
 1   DeliverymanID                         27930 non-null  object 
 2   TripID                                27930 non-null  object 
 3   AbuseType                             27930 non-null  int64  
 4   CreatedDate                           27930 non-null  object 
 5   PickingupTime                         27930 non-null  object 
 6   DeliveryTime                          27930 non-null  object 
 7   UpdatedDate                           27930 non-null  object 
 8   IsDeliverInZone                       27930 non-null  int64  
 9   IsAllowedDeliverOutOfZone             27930 non-null  int64  
 10  CustomerHasLocation                   27930 non-null  bool   
 11  DeliveryDistanceFr

In [8]:
df['AbuseType'].value_counts()  # count the rows for each AbuseType

AbuseType,count
0,10000
1,8992
8,4349
2,2010
3,1149
5,751
6,679


In [10]:
df = df.sort_values(by=['DeliverymanID', 'DeliveryTime'])

df['PrevDeliveryTime'] = df.groupby('DeliverymanID')['DeliveryTime'].shift(1) # gets the previous delivery time (from the previous row in the group)
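To make the `shift(1)` step concrete, here is a minimal sketch on a made-up four-row frame (the IDs and timestamps are illustrative, not taken from the dataset). It also parses `DeliveryTime` to datetimes before sorting, which is an assumption about what is safest when the column is stored as strings.

```python
import pandas as pd

# Toy data: hypothetical IDs and timestamps, not from FraudOrders_sample.csv
toy = pd.DataFrame({
    "DeliverymanID": ["A", "B", "A", "A"],
    "DeliveryTime": [
        "2024-01-01T10:07:00",
        "2024-01-01T09:00:00",
        "2024-01-01T10:00:00",
        "2024-01-01T10:03:00",
    ],
})

# Parse first so the sort is chronological rather than lexicographic
toy["DeliveryTime"] = pd.to_datetime(toy["DeliveryTime"])
toy = toy.sort_values(by=["DeliverymanID", "DeliveryTime"])

# Within each deliveryman, shift(1) pulls the previous row's delivery time;
# the first delivery of each deliveryman has no predecessor and gets NaT
toy["PrevDeliveryTime"] = toy.groupby("DeliverymanID")["DeliveryTime"].shift(1)

# Minutes since the same deliveryman's previous delivery (NaN for first deliveries)
toy["TimeDiff"] = (toy["DeliveryTime"] - toy["PrevDeliveryTime"]).dt.total_seconds() / 60

print(toy)
```

Rows with no previous delivery end up with a missing `TimeDiff`, so a filter such as `TimeDiff < 5` drops them automatically.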

In [4]:
# Compute the time difference between the current and previous delivery, in minutes
df['TimeDiff'] = (pd.to_datetime(df['DeliveryTime'], format='ISO8601')
 - pd.to_datetime(df['PrevDeliveryTime'], format='ISO8601')).dt.total_seconds() / 60

# Keep only rows where the time difference is under 5 minutes; these are the suspicious cases with an "unrealistically short time" between deliveries
short_diff = df[df['TimeDiff'] < 5]
short_diff['AbuseType'].value_counts(normalize=True)

AbuseType,proportion
1,0.701254
8,0.18206
2,0.054352
3,0.04637
0,0.009882
6,0.003421
5,0.002661


***Conclusion:***  
**AbuseType = 1 represents "Unrealistically short time between consecutive deliveries"**
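The table above gives the share of each `AbuseType` among short-interval deliveries. A complementary check, sketched here on made-up data, is the reverse: the share of each type's own rows that fall under the 5-minute threshold. The toy values are hypothetical; on the real frame the same `groupby` could presumably be run on `df` once `TimeDiff` exists.

```python
import pandas as pd

# Hypothetical rows: AbuseType labels with minutes since the previous delivery
toy = pd.DataFrame({
    "AbuseType": [1, 1, 1, 1, 8, 8, 0, 0],
    "TimeDiff":  [1.0, 2.5, 4.0, 30.0, 3.0, 45.0, 60.0, None],
})

# Boolean flag; a comparison with NaN evaluates to False,
# so rows without a previous delivery count as "not short"
toy["IsShort"] = toy["TimeDiff"] < 5

# Mean of a boolean column per group = fraction of that type's rows under 5 minutes
short_rate = toy.groupby("AbuseType")["IsShort"].mean()
print(short_rate)
```

If a type is really defined by short intervals, its rate here should be high as well, not only its share in the filtered table.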



In [13]:
df['DeliveryDistanceFromCustomerLocation'].describe()

far_deliveries = df[df['DeliveryDistanceFromCustomerLocation'] > 1000]
far_deliveries['AbuseType'].value_counts(normalize=True)


AbuseType,proportion
8,0.978731
1,0.011014
0,0.006077
2,0.003038
6,0.00076
3,0.00038


***Conclusion***

**AbuseType = 8 represents "Deliveryman delivered the order far from the customer's address"**
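The 1000 cutoff used above is one choice among many. As a rough sensitivity check, the sketch below recomputes the AbuseType 8 share among far deliveries at several cutoffs. The data and the list of cutoffs are made up for illustration.

```python
import pandas as pd

# Hypothetical distances and labels, not from the real dataset
toy = pd.DataFrame({
    "AbuseType": [8, 8, 8, 1, 0, 8, 1, 0],
    "DeliveryDistanceFromCustomerLocation": [1500, 2500, 1200, 300, 100, 800, 1100, 600],
})

col = "DeliveryDistanceFromCustomerLocation"
shares = {}
for cutoff in [500, 1000, 2000]:  # illustrative cutoffs
    far = toy[toy[col] > cutoff]
    # Fraction of the far deliveries at this cutoff that are labelled AbuseType 8
    shares[cutoff] = (far["AbuseType"] == 8).mean()

print(pd.Series(shares, name="share_of_type_8"))
```

A share that stays high across cutoffs would make the conclusion less dependent on the particular threshold.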

In [6]:
df[df['AbuseType'] != 0].groupby('AbuseType').agg({
    'DeliveryDistanceFromCustomerLocation': 'mean',
    'DeliveryDistanceFromBranch': 'mean',
    'IsDeliverInZone': 'mean',
    'IsAllowedDeliverOutOfZone': 'mean'
})


AbuseType,DeliveryDistanceFromCustomerLocation,DeliveryDistanceFromBranch,IsDeliverInZone,IsAllowedDeliverOutOfZone
1,786.367271,270.087406,0.937055,1.0
2,896.376487,596.511146,0.974129,1.0
3,801.116143,1459.960494,0.974761,1.0
5,,924.555346,0.998668,1.0
6,332.85725,1276.775148,0.0,0.001473
8,1467.027375,881.142388,0.954012,1.0


AbuseType 8: Deliveries are made far from the customer’s location (avg. 1467m), indicating possible falsification or incorrect reporting of delivery location.

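AbuseType 1: Consecutive deliveries by the same deliveryman occur within unrealistically short intervals (this type accounts for about 70% of deliveries made less than 5 minutes after the previous one), suggesting fake or rushed deliveries.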

AbuseType 6: Deliveries happen outside the allowed delivery zone (0% in zone, and not permitted), showing unauthorized zone breaches.

AbuseType 5: Missing customer location data, implying deliveries made without proper location recording.

AbuseType 3: Delivery distances from the branch are unusually long (avg. 1460m), which may indicate misuse or fake delivery addresses.

AbuseType 2: Mixed characteristics that require further investigation to determine the exact type of fraud.
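
The AbuseType 5 reading rests on the missing mean in the aggregate table. One way to check it more directly, sketched here on made-up data, is to compute the share of missing customer distances per type and set it next to `CustomerHasLocation`. Whether the two columns agree on the real data is not shown in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical rows, not from the real dataset
toy = pd.DataFrame({
    "AbuseType": [5, 5, 1, 1, 8],
    "CustomerHasLocation": [False, False, True, True, True],
    "DeliveryDistanceFromCustomerLocation": [np.nan, np.nan, 700.0, 850.0, 1500.0],
})

col = "DeliveryDistanceFromCustomerLocation"
check = toy.groupby("AbuseType").agg(
    # Fraction of rows in the group whose customer distance is missing
    missing_distance_share=(col, lambda s: s.isna().mean()),
    # Fraction of rows in the group where the customer has a recorded location
    has_location_share=("CustomerHasLocation", "mean"),
)
print(check)
```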




