# Data Exploration

Begin by loading the dataset and taking a look at what's in it.

In [None]:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import plotly.express as px
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_io as tfio

data = pd.read_csv("deliverytime-small.txt")
display(data.head())



To examine the relationship between delivery distance and the time it takes to deliver the food, first compute the distance between each restaurant and its delivery location using the haversine formula:

In [None]:
# Set the earth's radius (in kilometers)
R = 6371

# Convert degrees to radians
def deg_to_rad(degrees):
    return degrees * (np.pi/180)

# Function to calculate the distance between two points using the haversine formula
def distcalculate(lat1, lon1, lat2, lon2):
    d_lat = deg_to_rad(lat2-lat1)
    d_lon = deg_to_rad(lon2-lon1)
    a = np.sin(d_lat/2)**2 + np.cos(deg_to_rad(lat1)) * np.cos(deg_to_rad(lat2)) * np.sin(d_lon/2)**2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1-a))
    return R * c
  
# Calculate the distance between each restaurant and its delivery location.
# The NumPy functions above work element-wise, so whole columns can be passed
# at once instead of looping over rows (this also avoids relying on the index
# being 0..n-1).
data['distance'] = distcalculate(data['Restaurant_latitude'],
                                 data['Restaurant_longitude'],
                                 data['Delivery_location_latitude'],
                                 data['Delivery_location_longitude'])

display(data.head())
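
As a quick sanity check on the haversine formula, the sketch below re-implements it as a standalone scalar function and compares one degree of latitude against the closed-form arc length R·π/180. The names `haversine_km` and `EARTH_RADIUS_KM` are illustrative and not part of the notebook; the coordinates are arbitrary test points, not rows from the dataset.

```python
import math

EARTH_RADIUS_KM = 6371  # same mean Earth radius as R above

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return EARTH_RADIUS_KM * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

# One degree of latitude along a meridian should equal R * pi / 180 (about 111.19 km).
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
assert math.isclose(one_degree, EARTH_RADIUS_KM * math.pi / 180, rel_tol=1e-9)

# Two identical points (arbitrary coordinates) are zero kilometers apart.
assert haversine_km(12.97, 77.59, 12.97, 77.59) == 0.0

print(round(one_degree, 2))
```

If `distcalculate` returns clearly different values for the same inputs, the column order passed to it is the first thing to check.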

Let's turn our attention to exploring how the age of the delivery partner influences the time taken to deliver the food:

In [None]:
figure = px.scatter(data_frame=data,
                    x="Delivery_person_Age",
                    y="Time_taken(min)",
                    size="Time_taken(min)",
                    color="distance",
                    trendline="ols",
                    title="Relationship Between Time Taken and Age")
figure.show()

The trendline suggests a positive linear relationship between the delivery partner's age and the delivery time: younger partners tend to deliver food faster than their older counterparts.

Next, let's examine how the delivery partner's ratings affect the time taken to deliver the food:

In [None]:
figure = px.scatter(data_frame=data,
                    x="Delivery_person_Ratings",
                    y="Time_taken(min)",
                    size="Time_taken(min)",
                    color="distance",
                    trendline="ols",
                    title="Relationship Between Time Taken and Ratings")
figure.show()

The trendline suggests an inverse linear relationship between the delivery partner's ratings and the delivery time: partners with higher ratings tend to deliver food faster than those with lower ratings.

Now, let's explore whether the type of food ordered by the customer and the type of vehicle used by the delivery partner have any impact on the delivery time:

In [None]:
fig = px.box(data, 
             x="Type_of_vehicle",
             y="Time_taken(min)", 
             color="Type_of_order")
fig.show()

Based on the box plot, the type of vehicle used by the delivery partner and the type of food being delivered do not appear to have much effect on the delivery time.
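
One way to put numbers behind the box plot is to compare the median and spread of delivery times per group. The following is a minimal sketch, assuming the column names used above. `summarize_time_by_group` is a hypothetical helper, and the `toy` rows are made up solely so the snippet runs on its own; they are not taken from the dataset.

```python
import pandas as pd

def summarize_time_by_group(df, group_cols, time_col="Time_taken(min)"):
    """Return count, median and interquartile range of delivery time per group."""
    grouped = df.groupby(group_cols)[time_col]
    summary = grouped.agg(count="count", median="median")
    # Interquartile range: spread of the middle 50% of delivery times.
    summary["iqr"] = grouped.quantile(0.75) - grouped.quantile(0.25)
    return summary.reset_index()

# Made-up rows for illustration only (not from deliverytime-small.txt).
toy = pd.DataFrame({
    "Type_of_vehicle": ["motorcycle", "motorcycle", "scooter", "scooter"],
    "Type_of_order": ["Snack", "Snack", "Meal", "Meal"],
    "Time_taken(min)": [20, 30, 25, 35],
})

print(summarize_time_by_group(toy, ["Type_of_vehicle", "Type_of_order"]))
```

On the real data the call would be `summarize_time_by_group(data, ["Type_of_vehicle", "Type_of_order"])`. Medians that sit close together relative to the interquartile ranges would be consistent with the reading of the box plot above.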

Based on this exploration, the factors that appear to influence food delivery time the most are:

- The age of the delivery partner
- The ratings of the delivery partner
- The distance between the restaurant and the delivery location
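
A rough numerical check on the factors listed above is the Pearson correlation of each one with the delivery time. This is a sketch, assuming the three columns are numeric. `correlation_with_time` is a hypothetical helper, and the `toy` values are invented for illustration only. Pearson correlation captures linear association only, so it complements the scatter plots rather than replacing them.

```python
import pandas as pd

def correlation_with_time(df, feature_cols, time_col="Time_taken(min)"):
    """Pearson correlation of each feature with delivery time, strongest first."""
    corr = df[feature_cols].corrwith(df[time_col])
    # Sort by absolute value so strong negative correlations rank high too.
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)

# Invented values for illustration only (not from the dataset).
toy = pd.DataFrame({
    "Delivery_person_Age": [22, 27, 31, 36, 40],
    "Delivery_person_Ratings": [4.9, 4.8, 4.6, 4.5, 4.2],
    "distance": [3.0, 9.5, 6.1, 12.4, 15.0],
    "Time_taken(min)": [18, 26, 24, 35, 41],
})

features = ["Delivery_person_Age", "Delivery_person_Ratings", "distance"]
print(correlation_with_time(toy, features))
```

Applied to the notebook's `data` frame with the same `features` list, a positive value for age and distance and a negative value for ratings would match the trends seen in the plots.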