### Some background:
Food delivery services like Zomato and Swiggy need to show customers an accurate estimate of how long an order will take to arrive, which keeps the service transparent.

### Problem Statement:
Food Delivery Time Prediction
To predict food delivery time in real time, we first need the distance between the food preparation point and the point of food consumption.
Once we have the distance between the restaurant and the delivery location, we need to relate it to the time delivery partners have taken in the past to cover the same distance.

In [None]:
import os

current_directory = os.getcwd()
print(current_directory)

In [None]:
import pandas as pd
import numpy as np
import plotly.express as px

data = pd.read_csv("deliverytime.txt")
print(data.head())

In [None]:
data.info()

## Now let’s check for any null values in the dataset:

In [None]:
data.isnull().sum()

## Wow!! Dataset doesn't have any null values. Let's get going then!

### Calculating Distance Between Two Latitudes and Longitudes
The dataset doesn’t include the distance between the restaurant and the delivery location. All we have are the latitude and longitude of each restaurant and each delivery location.

### Hence, we can use the haversine formula to calculate the distance between two locations based on their latitudes and longitudes.
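
For reference, this is the form of the haversine formula that the code in the next cell follows. The symbols are introduced here for illustration: $\varphi_1, \varphi_2$ are the two latitudes and $\lambda_1, \lambda_2$ the two longitudes (all in radians), $\Delta\varphi = \varphi_2 - \varphi_1$, $\Delta\lambda = \lambda_2 - \lambda_1$, and $R$ is the Earth's radius.

```latex
a = \sin^2\!\left(\frac{\Delta\varphi}{2}\right)
    + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\Delta\lambda}{2}\right),
\qquad
c = 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1-a}\right),
\qquad
d = R\,c
```

Here $a$, $c$ and $R$ correspond to the variables of the same names in the code, and $d$ is the returned distance in kilometers.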

In [None]:
# Setting the Earth's radius (in kilometers)
R = 6371

# Converting degrees to radians
def deg_to_rad(degrees):
    return degrees * (np.pi/180)

# Function to calculate the distance between two points using the haversine formula
def distcalculate(lat1, lon1, lat2, lon2):
    d_lat = deg_to_rad(lat2-lat1)
    d_lon = deg_to_rad(lon2-lon1)
    a = np.sin(d_lat/2)**2 + np.cos(deg_to_rad(lat1)) * np.cos(deg_to_rad(lat2)) * np.sin(d_lon/2)**2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1-a))
    return R * c
  
# Calculate the distance for every row at once
# (the NumPy functions used above operate on whole columns, so no loop is needed)
data['distance'] = distcalculate(data['Restaurant_latitude'],
                                 data['Restaurant_longitude'],
                                 data['Delivery_location_latitude'],
                                 data['Delivery_location_longitude'])
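
A quick way to gain confidence in the distance calculation is to check it against cases with a known answer. The sketch below is a standalone re-implementation using only the standard `math` module. The name `haversine_km` and the test coordinates are illustrative and not part of the notebook. It checks that one degree of latitude along a meridian comes out as `R * pi / 180` (about 111.19 km) and that a point is at distance zero from itself.

```python
import math

R = 6371  # Earth's mean radius in kilometers, same value as in the notebook

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return R * c

# Known case 1: one degree of latitude along a meridian
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
# Known case 2: identical points (arbitrary illustrative coordinates)
same_point = haversine_km(12.9, 77.6, 12.9, 77.6)

print(round(one_degree, 2), same_point)
```

Note that the haversine distance is a straight-line distance over the Earth's surface, so it will generally be shorter than the road distance a delivery partner actually travels.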

## Let's have a look at the new dataset:

In [None]:
print(data.head())

## Let's explore the relationship between the distance and the time taken to deliver the food:

In [None]:
figure = px.scatter(data_frame = data, 
                    x="distance",
                    y="Time_taken(min)", 
                    size="Time_taken(min)", 
                    trendline="ols", 
                    title = "Relationship Between Distance and Time Taken")
figure.show()

The trendline shows a consistent relationship between the time taken and the distance travelled to deliver the food: most delivery partners deliver food within 25-30 minutes, regardless of distance.

## Is there any relationship between the age of delivery partner and time taken to deliver the food?

In [None]:
figure = px.scatter(data_frame = data, 
                    x="Delivery_person_Age",
                    y="Time_taken(min)", 
                    size="Time_taken(min)", 
                    color = "distance",
                    trendline="ols", 
                    title = "Relationship Between Time Taken and Age")
figure.show()

## Seems there is! The trendline shows a linear relationship: younger delivery partners tend to take less time to deliver the food than older ones.

## Next let's check the time taken for delivery vs ratings of delivery partners:

In [None]:
figure = px.scatter(data_frame = data, 
                    x="Delivery_person_Ratings",
                    y="Time_taken(min)", 
                    size="Time_taken(min)", 
                    color = "distance",
                    trendline="ols", 
                    title = "Relationship Between Time Taken and Ratings")
figure.show()

## So, the relationship here is inverse: delivery partners with higher ratings take less time to deliver the food than partners with low ratings.
### Pretty obvious!!

## Does the type of vehicle used or the type of food ordered affect the delivery time?

In [None]:
fig = px.box(data, 
             x="Type_of_vehicle",
             y="Time_taken(min)", 
             color="Type_of_order")
fig.show()

### Fortunately, there isn't much difference in delivery time across vehicle types or order types!
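
The visual impression from the box plot can be backed up with a small table of group medians. The sketch below uses a tiny made-up frame named `toy`. The category labels and times are placeholders for illustration, not values from the dataset. In the notebook the same `groupby` would be applied to `data`.

```python
import pandas as pd

# Toy rows only: labels and times are invented for illustration
toy = pd.DataFrame({
    "Type_of_vehicle": ["motorcycle", "motorcycle", "scooter",
                        "scooter", "scooter", "motorcycle"],
    "Type_of_order": ["Snack", "Meal", "Snack", "Meal", "Snack", "Meal"],
    "Time_taken(min)": [24, 30, 22, 28, 26, 32],
})

# Median delivery time for every (vehicle, order type) pair,
# pivoted so that vehicles are rows and order types are columns
medians = (
    toy.groupby(["Type_of_vehicle", "Type_of_order"])["Time_taken(min)"]
       .median()
       .unstack("Type_of_order")
)
print(medians)
```

If the medians in each row and column are close to one another, that supports reading the box plot as showing little effect.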

## Hence, based on the EDA, the features that contribute most to the food delivery time are:
## - Age of the delivery partner
## - Ratings of the delivery partner
## - Distance between the restaurant and the delivery location.
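
One way to put numbers on these three relationships is the Pearson correlation of each feature with the delivery time. The sketch below runs on synthetic data in which the signs of the relationships are built in by construction, so its output says nothing about the real dataset. The frame `toy`, the coefficients, and the noise level are all assumptions for illustration. In the notebook, the last step would be applied to `data` instead.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `data`: every value here is generated, not real
rng = np.random.default_rng(0)
n = 500
toy = pd.DataFrame({
    "Delivery_person_Age": rng.integers(20, 40, size=n),
    "Delivery_person_Ratings": rng.uniform(3.0, 5.0, size=n).round(1),
    "distance": rng.uniform(1.0, 20.0, size=n),
})
# Invented relationship: older -> slower, higher rating -> faster, farther -> slower
toy["Time_taken(min)"] = (
    20
    + 0.5 * toy["Delivery_person_Age"]
    - 3.0 * toy["Delivery_person_Ratings"]
    + 0.3 * toy["distance"]
    + rng.normal(0, 2, size=n)
)

# Pearson correlation of each feature with the target
corr = toy.corr()["Time_taken(min)"].drop("Time_taken(min)")
print(corr.round(2))
```

A correlation near zero for a feature would suggest the scatter plot trend is weaker than it looks, which is worth knowing before the feature goes into a model.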

## Let's predict the time to be taken for delivery:

### We will use an LSTM neural network model for predictions:

In [None]:
pip install keras

In [None]:
pip install tensorflow

In [None]:
from sklearn.model_selection import train_test_split
x = np.array(data[["Delivery_person_Age", 
                   "Delivery_person_Ratings", 
                   "distance"]])
y = np.array(data[["Time_taken(min)"]])
xtrain, xtest, ytrain, ytest = train_test_split(x, y, 
                                                test_size=0.20, 
                                                random_state=42)

# creating the LSTM neural network model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape= (xtrain.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.summary()
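
The `input_shape=(xtrain.shape[1], 1)` argument above tells the first LSTM layer to treat each row as a sequence of three timesteps with one value per step. LSTM layers take 3-D input of shape `(samples, timesteps, features_per_step)`, while `xtrain` is 2-D. The NumPy-only sketch below shows the explicit reshape on a made-up batch. The array values are hypothetical, and whether Keras adds the trailing axis automatically may depend on the installed version, so reshaping explicitly is the safer assumption.

```python
import numpy as np

# Hypothetical mini-batch: 4 deliveries x 3 features (age, rating, distance in km)
x_2d = np.array([
    [25, 4.6, 3.2],
    [31, 4.9, 7.8],
    [22, 4.1, 12.5],
    [36, 4.7, 5.0],
])

# Add a trailing axis: (samples, 3) -> (samples, 3, 1)
# Each of the 3 columns becomes one timestep carrying a single value
x_3d = x_2d.reshape(x_2d.shape[0], x_2d.shape[1], 1)

print(x_2d.shape, "->", x_3d.shape)
```

The same reshape would apply to `xtrain` and `xtest` before calling `fit` or `predict`.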

In [None]:
# training the model
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=1, epochs=9)

## It's time to test the model's performance:

In [None]:
print("Food Delivery Time Prediction")
a = int(input("Age of Delivery Partner: "))
b = float(input("Ratings of Previous Deliveries: "))
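c = float(input("Total Distance: "))  # distances are fractional kilometers, so parse as float

# Shape (1, 3, 1): one sample, three timesteps, one value each, matching the LSTM input_shape
features = np.array([[a, b, c]], dtype="float32").reshape(1, 3, 1)
print("Predicted Delivery Time in Minutes = ", model.predict(features))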
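
A single interactive prediction does not say how well the model does overall. A common complement is to score it on the held-out test split with error metrics such as MAE and RMSE. The sketch below is NumPy-only and uses made-up numbers. The helper name `regression_errors` is hypothetical. In the notebook, it would be called with `ytest` and the model's predictions for `xtest`.

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """Return (MAE, RMSE) in the same units as the target (minutes here)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_pred - y_true
    mae = np.mean(np.abs(errors))          # average absolute miss
    rmse = np.sqrt(np.mean(errors ** 2))   # penalizes large misses more heavily
    return mae, rmse

# Made-up delivery times in minutes, shaped like ytest / model.predict output
actual = [[24], [33], [26], [21]]
predicted = [[26], [30], [26], [25]]

mae, rmse = regression_errors(actual, predicted)
print(mae, rmse)
```

Both metrics are in minutes, so an MAE of 5 would mean the predictions are off by five minutes on average.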

# Conclusion:

## To predict food delivery time in real time, we first need the distance between the food preparation location and the delivery destination. Once that distance is known, the next step is to relate it to how long delivery partners have historically taken to cover the same distance. These patterns are what allow a model to predict future food delivery times.