# Real-Time Order Risk Prediction System


### BUSINESS CONTEXT:
<p> You work for "QuickBite Delivery", a food delivery platform operating in metro cities.</p>
The company is facing:
<ol>
<li>High cancellation rates (orders not reaching customers)</li>
<li>Inefficient rider allocation leading to delayed deliveries</li>
<li>Rider performance issues affecting customer satisfaction</li>
<li>High operational costs due to manual reassignments</li>
</ol>
<b><i>Your Mission: Build a predictive model to identify which orders will be cancelled.</i></b> <br>
Dataset: <a href="https://www.kaggle.com/datasets/cbhavik/swiggyzomato-order-information/data">https://www.kaggle.com/datasets/cbhavik/swiggyzomato-order-information/data</a><br><br>
Description:
<ul>
<li><b>order_id:</b> unique id for each order
<li><b>order_time:</b> time of the creation of order by the client
<li><b>order_date:</b> date of the order
<li><b>allot_time:</b> time of allocation of order to the rider
<li><b>accept_time:</b> time of acceptance of the order by the rider (if available)
<li><b>pickup_time:</b> time of pickup of the order (if available)
<li><b>delivered_time:</b> time of delivery of the order (if available)
<li><b>cancelled_time:</b> time of cancellation of order (if the order was cancelled)
<li><b>cancelled:</b> whether the order was cancelled
<li><b>rider_id:</b> unique id for each rider
<li><b>first_mile_distance:</b> road distance from rider’s location to the pickup location
<li><b>last_mile_distance:</b> road distance from pickup location to the delivery location
<li><b>allotted_orders:</b> total number of orders allotted to the rider in the 30 days before (not including) order_date
<li><b>delivered_orders:</b> total number of orders delivered by the rider in the 30 days before (not including) order_date
<li><b>undelivered_orders:</b> total number of orders allotted to but not delivered by the rider (i.e. cancelled) in the 30 days before (not including) order_date
<li><b>lifetime_ordercount:</b> total number of orders delivered by the rider at any time before order_date
<li><b>reassigned_order:</b> whether the order was reassigned to this rider
<li><b>reassignment_method:</b> if the order was reassigned, whether the reassignment was done
manually (by the ops team) or automatically
<li><b>reassignment_reason:</b> a more detailed reason for the reassignment
<li><b>session_time:</b> total time the rider had been online on order_date before order_time</li></ul>

## Data Loading & Basic Checks
<ul>
<li>Load the dataset into pandas
<li>Check for missing values in each column
<li>Identify data types and convert datetime columns
<li>Check for duplicates in order_id</ul>
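
The checks above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper names (`convert_datetimes`, `load_orders`, `basic_checks`) are hypothetical, the column names are taken from the description list, and the timestamp format in the CSV is assumed to be something `pd.to_datetime` can parse.

```python
import pandas as pd

# Timestamp columns as named in the dataset description (assumed to match the CSV).
DATETIME_COLS = [
    "order_time", "order_date", "allot_time", "accept_time",
    "pickup_time", "delivered_time", "cancelled_time",
]


def convert_datetimes(df):
    """Return a copy with the timestamp columns parsed; unparseable values become NaT."""
    out = df.copy()
    for col in DATETIME_COLS:
        if col in out.columns:
            out[col] = pd.to_datetime(out[col], errors="coerce")
    return out


def load_orders(path):
    """Read the CSV at `path` (a placeholder for wherever the file was saved)."""
    return convert_datetimes(pd.read_csv(path))


def basic_checks(df):
    """Collect missing-value counts, dtypes, and the number of repeated order_id values."""
    return {
        "missing_per_column": df.isna().sum(),
        "dtypes": df.dtypes,
        "duplicate_order_ids": int(df["order_id"].duplicated().sum()),
    }
```

Calling `basic_checks(load_orders("orders.csv"))` (with a file name of your choice) returns a dictionary that can be inspected key by key in a notebook.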

## Delivery Timeline Analysis
### Calculate time intervals:
<ul>
<li>order_to_allot: allot_time - order_time</li>
<li>allot_to_accept: accept_time - allot_time (if available)</li>
<li>accept_to_pickup: pickup_time - accept_time (if available)</li>
<li>pickup_to_delivery: delivered_time - pickup_time (if available)</li>
<li>total_delivery_time: delivered_time - order_time (for delivered orders)</ul>
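
One way to compute these intervals is sketched below. It assumes the timestamp columns have already been converted to datetimes, and it expresses every interval in minutes, which is a choice made here rather than something the brief specifies. Where a timestamp is missing, the interval is left as NaN.

```python
import pandas as pd

# Interval name -> (start column, end column), following the list above.
INTERVALS = {
    "order_to_allot": ("order_time", "allot_time"),
    "allot_to_accept": ("allot_time", "accept_time"),
    "accept_to_pickup": ("accept_time", "pickup_time"),
    "pickup_to_delivery": ("pickup_time", "delivered_time"),
    "total_delivery_time": ("order_time", "delivered_time"),
}


def add_time_intervals(df):
    """Return a copy of df with one column per interval, in minutes."""
    out = df.copy()
    for name, (start, end) in INTERVALS.items():
        # NaT in either timestamp propagates to NaN in the result.
        out[name] = (out[end] - out[start]).dt.total_seconds() / 60.0
    return out
```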

### Create visualizations:
<ul><li>Distribution of each time interval (histograms)</li>
<li>Average time intervals by hour of day</li>
<li>Weekday vs weekend comparisons</ul>
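
A possible starting point for these plots is sketched below. It assumes `order_time` is a datetime column and that the interval columns (for example `total_delivery_time`) already exist as numeric columns. The function names are hypothetical, and matplotlib is imported inside the plotting helper so the aggregation helpers can be used without it.

```python
import pandas as pd


def mean_by_hour(df, cols):
    """Average each interval column by the hour of day the order was placed."""
    hour = df["order_time"].dt.hour.rename("hour")
    return df.groupby(hour)[cols].mean()


def mean_weekday_vs_weekend(df, cols):
    """Average each interval column for weekdays (Mon-Fri) and weekends (Sat-Sun)."""
    is_weekend = df["order_time"].dt.dayofweek >= 5
    day_type = is_weekend.map({True: "weekend", False: "weekday"}).rename("day_type")
    return df.groupby(day_type)[cols].mean()


def plot_interval_histograms(df, cols, bins=50):
    """Draw one histogram per interval column and return the matplotlib axes."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    axes = df[cols].hist(bins=bins, figsize=(12, 8))
    plt.tight_layout()
    return axes
```

The two aggregation helpers return DataFrames, so `mean_by_hour(df, cols).plot(kind="bar")` is one way to turn them into charts.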

## Cancellation Analysis
<ul><li>What percentage of orders are cancelled?</li>
<li>At what stage do cancellations happen most? (before accept, after accept, after pickup)</li>
<li>Is cancellation rate higher for certain riders?</li></ul>
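
The three questions above could be approached roughly as follows. This sketch treats an order as cancelled when `cancelled_time` is present (the `cancelled` column could be used instead), and it infers the stage from which timestamps exist, assuming a missing `accept_time` or `pickup_time` means that stage was never reached. The function names and the `min_orders` cutoff are hypothetical.

```python
import numpy as np
import pandas as pd


def cancellation_rate(df):
    """Percentage of orders that have a cancellation timestamp."""
    return 100.0 * df["cancelled_time"].notna().mean()


def cancellation_stage(df):
    """Label each cancelled order by the last stage it reached before cancellation."""
    cancelled = df[df["cancelled_time"].notna()]
    conditions = [cancelled["pickup_time"].notna(), cancelled["accept_time"].notna()]
    choices = ["after pickup", "after accept"]
    stage = np.select(conditions, choices, default="before accept")
    return pd.Series(stage, index=cancelled.index, name="cancel_stage")


def rider_cancellation_rates(df, min_orders=20):
    """Per-rider order count and cancellation rate, highest rate first."""
    flag = df["cancelled_time"].notna()
    stats = flag.groupby(df["rider_id"]).agg(orders="count", cancel_rate="mean")
    # Riders with very few orders give noisy rates, so they are filtered out.
    stats = stats[stats["orders"] >= min_orders]
    return stats.sort_values("cancel_rate", ascending=False)
```

`cancellation_stage(df).value_counts(normalize=True)` then gives the share of cancellations at each stage.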

## Rider Performance Metrics
<ul>
<li>Delivery success rate = delivered_orders / (delivered_orders + undelivered_orders)</li>
<li>Average delivery time (for completed deliveries)</li>
<li>Identify: Top 10% performers vs Bottom 10% performers</li>
</ul>
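
A rough sketch of these metrics is shown below. Because `delivered_orders` and `undelivered_orders` are 30-day counts recorded on every order row, the sketch computes the success rate per row and then averages it per rider; that is one possible interpretation, not the only one. Delivery time is measured in minutes from `order_time` to `delivered_time`, and the function names are hypothetical.

```python
import pandas as pd


def rider_performance(df):
    """Per-rider mean success rate and mean delivery time in minutes."""
    total = df["delivered_orders"] + df["undelivered_orders"]
    work = pd.DataFrame({
        "rider_id": df["rider_id"],
        # Rows with no order history produce NaN instead of a division by zero.
        "success_rate": df["delivered_orders"] / total.where(total > 0),
        "delivery_min": (df["delivered_time"] - df["order_time"]).dt.total_seconds() / 60.0,
    })
    return work.groupby("rider_id").mean()


def top_and_bottom(perf, metric="success_rate", q=0.10):
    """Split riders into the top and bottom q fraction on the chosen metric."""
    scores = perf[metric].dropna()
    low, high = scores.quantile(q), scores.quantile(1 - q)
    top = perf.loc[scores[scores >= high].index]
    bottom = perf.loc[scores[scores <= low].index]
    return top, bottom
```

Passing `metric="delivery_min"` ranks riders by average delivery time instead, in which case the "bottom" group holds the fastest riders.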

## Distance Analysis
<ul><li>Analyze first_mile_distance vs last_mile_distance</li>
<li>Is there correlation between distance and cancellation?</li>
<li>Is there correlation between distance and delivery time?</li></ul>
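
The correlation questions above can be explored with a sketch like the one below. It uses Pearson correlation, which for the binary cancellation flag is the point-biserial correlation. The column names `first_mile_distance` and `last_mile_distance` are assumed to match the CSV, cancellation is inferred from `cancelled_time`, and the function name is hypothetical.

```python
import pandas as pd

DISTANCE_COLS = ["first_mile_distance", "last_mile_distance"]  # assumed column names


def distance_correlations(df):
    """Pearson correlation of each distance with cancellation and with delivery time."""
    cancelled = df["cancelled_time"].notna().astype(int)
    delivery_min = (df["delivered_time"] - df["order_time"]).dt.total_seconds() / 60.0
    rows = {}
    for col in DISTANCE_COLS:
        rows[col] = {
            "corr_with_cancellation": df[col].corr(cancelled),
            # Series.corr ignores pairs with NaN, so only delivered orders count here.
            "corr_with_delivery_min": df[col].corr(delivery_min),
        }
    return pd.DataFrame.from_dict(rows, orient="index")
```

A correlation near zero does not rule out a non-linear relationship, so comparing cancellation rates across distance bins is a useful complement.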

#### Build a model to predict whether an order will be cancelled

Target : will_cancel (binary)
- 1 if cancelled_time exists
- 0 if delivered_time exists
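
The target definition above, plus a simple baseline, might be sketched as follows. Everything here is an assumption made for illustration: the helper names, the choice of scikit-learn logistic regression, the median imputation, and the 75/25 split. Orders with neither timestamp are dropped, and an order with both is labelled as cancelled. Features should be limited to values known when the order is allotted (distances, rider history, `session_time`), since timestamps such as `accept_time` or `pickup_time` would leak the outcome.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_target(df):
    """Keep orders with a known outcome and add the binary will_cancel column."""
    has_cancel = df["cancelled_time"].notna()
    has_delivery = df["delivered_time"].notna()
    labelled = df[has_cancel | has_delivery].copy()
    labelled["will_cancel"] = has_cancel.loc[labelled.index].astype(int)
    return labelled


def fit_baseline(df, features, seed=42):
    """Fit a logistic-regression baseline and return it with its hold-out ROC AUC."""
    X, y = df[features], df["will_cancel"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = make_pipeline(
        SimpleImputer(strategy="median"),
        StandardScaler(),
        # class_weight="balanced" compensates for cancellations being the rarer class.
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    return model, auc
```

Because cancellations are usually a small share of orders, ROC AUC or precision and recall are more informative than plain accuracy when comparing this baseline with stronger models.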