# FSA Data Science Challenge

## Introduction
We at Just Eat Takeaway (JET) have prepared a case for you that draws on your experience at the intersection of finance and data science. The case uses mock data and is a simplified version of the kind of questions you might expect.

First, some background about JET: We currently operate in 20 countries and are one of the leading food delivery platforms in the world. Since the market operates on tight margins, it is crucial for us to be the market leader in a region to be profitable.

We have discovered that we are losing orders to our competitor in a particular region, and we want to find out the optimal payment fee for that region, given our growth and revenue goals. The revenue for an order consists of three components: the delivery fee, the payment fee (shown at the end of an order), and the commission fee, which is a percentage of the total food price.

## The data
Unfortunately, we do not have experimental data for this region. However, we do know that not every payment method carries a payment fee: bank card payments are charged the fee, while cash payments are not. Based on this observation, we collected weekly data for payment categories A and B. The payment fee for category A (card) changed three times, on `2021-11-8`, `2022-5-30` and `2023-1-9`, while category B (cash) had no changes. The code below also provides a visualization of these changes.
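The three change dates split the timeline into four fee regimes for category A. A minimal sketch of how each week could be labelled with its regime is shown here; the names `FEE_CHANGE_DATES` and `label_fee_regime` are hypothetical, and treating a week that starts exactly on a change date as part of the new regime is an assumption, not something the data description confirms.

```python
import numpy as np
import pandas as pd

# Fee-change dates for category A, taken from the text above (zero-padded ISO form).
FEE_CHANGE_DATES = pd.to_datetime(["2021-11-08", "2022-05-30", "2023-01-09"])


def label_fee_regime(week_starts: pd.Series) -> pd.Series:
    """Return 0 for weeks before the first change, 1 after it, and so on up to 3."""
    weeks = pd.to_datetime(week_starts)
    # Assumption: side="right" puts a week starting exactly on a change date
    # into the new regime.
    regime = np.searchsorted(
        FEE_CHANGE_DATES.to_numpy(), weeks.to_numpy(), side="right"
    )
    return pd.Series(regime, index=weeks.index, name="fee_regime")
```

Applied to the `orderdatetime` column, this gives a categorical variable that can be used to compare order counts within and across fee regimes.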

The dataset we give you consists of the following columns: 
- `payment_category`: The payment category as explained above (A/B). 
- `orderdatetime`: The week over which the numbers are determined.
- `order_count`: The number of orders in that week. 
- `mean_payment_fee`: The mean payment fee for that week per order.
- `mean_delivery_fee`: The mean delivery fee for that week per order. 
- `mean_commission_fee`: The mean commission fee for that week per order. 
- `mean_food_price`: The mean food price for that week per order.
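
As a starting point, the columns listed above could be loaded and checked roughly as follows. This is a sketch under assumptions: `load_weekly_data` and `EXPECTED_COLUMNS` are hypothetical names, and `index_col=0` assumes the first column of the CSV is an unnamed row index.

```python
import pandas as pd

# Columns as documented in the list above.
EXPECTED_COLUMNS = [
    "payment_category",
    "orderdatetime",
    "order_count",
    "mean_payment_fee",
    "mean_delivery_fee",
    "mean_commission_fee",
    "mean_food_price",
]


def load_weekly_data(source) -> pd.DataFrame:
    """Read the CSV, parse the week column as dates, and check the columns."""
    # Assumption: the first CSV column is an unnamed row index.
    df = pd.read_csv(source, index_col=0, parse_dates=["orderdatetime"])
    missing = set(EXPECTED_COLUMNS) - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Sort so that each category forms one contiguous weekly time series.
    return df.sort_values(["payment_category", "orderdatetime"]).reset_index(drop=True)
```

For example, `load_weekly_data('fsa_data.csv')` would return the weekly rows with `orderdatetime` as a proper datetime column, which makes date filtering and plotting more robust.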

It is important to note that A and B are not randomized groups; thus, this is a quasi-experiment. Moreover, A and B differ in the magnitude of their order counts, so their levels are not directly comparable.
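
One way to make the two groups more comparable is to look at the log ratio of A to B order counts, which removes shocks that hit both categories proportionally. The sketch below estimates the shift in that ratio around a change date, in the spirit of a difference-in-differences comparison. It is only an illustration: the function names and the `window_weeks` parameter are hypothetical, and it assumes B is a reasonable control for A, which this quasi-experimental setup does not guarantee.

```python
import numpy as np
import pandas as pd


def log_order_ratio(df: pd.DataFrame) -> pd.Series:
    """Weekly log(A orders / B orders), indexed by week."""
    wide = df.pivot(
        index="orderdatetime", columns="payment_category", values="order_count"
    )
    ratio = np.log(wide["A"] / wide["B"])
    ratio.index = pd.to_datetime(ratio.index)
    return ratio.sort_index()


def relative_change_around(df: pd.DataFrame, change_date, window_weeks: int = 8) -> float:
    """Mean log ratio in the weeks after a change minus the weeks before it."""
    ratio = log_order_ratio(df)
    change = pd.Timestamp(change_date)
    before = ratio[ratio.index < change].tail(window_weeks)
    after = ratio[ratio.index >= change].head(window_weeks)
    # Assumption: absent the fee change, the A/B ratio would have stayed flat.
    return float(after.mean() - before.mean())
```

A negative result means that A lost orders relative to B after the change; `np.exp` of the result gives the corresponding multiplicative change.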

## The challenge
The revenue per order is calculated by adding the delivery fee, payment fee, and commission fee:

`revenue = delivery_fee + payment_fee + commission_fee`
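
Applied to the weekly data, the formula above could be computed as in this sketch. The new column names `revenue_per_order` and `weekly_revenue` are hypothetical; multiplying the per-order mean by `order_count` gives the weekly total because the fee columns are per-order means for that week.

```python
import pandas as pd


def add_revenue_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with per-order and total weekly revenue added."""
    out = df.copy()
    # revenue = delivery_fee + payment_fee + commission_fee (per order)
    out["revenue_per_order"] = (
        out["mean_delivery_fee"]
        + out["mean_payment_fee"]
        + out["mean_commission_fee"]
    )
    # Total revenue for the week = mean revenue per order * number of orders
    out["weekly_revenue"] = out["revenue_per_order"] * out["order_count"]
    return out
```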

The question we want you to solve is whether you can recommend the optimal payment fee for us. Should we raise it from the current 1.99 or lower it?

A lower payment fee should lead to higher demand but lower revenue per order, while a higher payment fee should lead to lower demand but higher revenue per order. If we solely want to optimize revenue, what would be the optimum value? Note that we are focusing only on revenue, not on profit.

One possible plan of attack could be to create a [demand function](https://www.core-econ.org/the-economy/book/text/leibniz-07-08-01.html) that maps the payment fee (and possibly other variables) to the expected demand using the data. Once we have the function, we can transform it to a function that gives us the total expected revenue, given a payment fee. We can then optimize this function (find the maximum). However, there are multiple approaches that you could take, and other econometric or data science research methods might be viable.
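
The plan of attack described above can be sketched as follows. This is an illustration under strong assumptions, not a recommended solution: it assumes a log-linear demand curve, `log(orders) = intercept + slope * fee`, and it treats the delivery and commission revenue per order as a constant (`other_revenue_per_order`). All function and parameter names are hypothetical. A naive fit on the raw weekly data would also ignore trend, seasonality, and the non-randomized groups.

```python
import numpy as np


def fit_log_linear_demand(fees, order_counts):
    """Least-squares fit of log(orders) = intercept + slope * fee."""
    slope, intercept = np.polyfit(fees, np.log(order_counts), deg=1)
    return intercept, slope


def expected_revenue(fee, intercept, slope, other_revenue_per_order):
    """Expected total revenue: predicted orders times revenue per order."""
    demand = np.exp(intercept + slope * fee)
    return demand * (fee + other_revenue_per_order)


def best_fee_on_grid(intercept, slope, other_revenue_per_order, fee_grid):
    """Return the fee on the grid with the highest expected revenue."""
    revenue = expected_revenue(fee_grid, intercept, slope, other_revenue_per_order)
    return float(fee_grid[np.argmax(revenue)])
```

Under this particular functional form, the revenue curve has a single stationary point at `fee = -1 / slope - other_revenue_per_order` (for a negative slope), so the grid search mainly serves as a check that generalizes to other demand functions.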

## What we expect
We expect you to provide a presentation highlighting your analysis, methodology, conclusion for the business, possible next steps, and limitations and risks. We will judge your presentation based on whether your methodology makes sense, your conclusions make sense, and whether you correctly thought about the dangers and limitations.

Good luck, and feel free to ask individual questions.

In [None]:
import pandas as pd

data = pd.read_csv('fsa_data.csv')

data

Unnamed: 0,payment_category,orderdatetime,order_count,mean_payment_fee,mean_delivery_fee,mean_commission_fee,mean_food_price
0,A,2021-01-04,31925,0.0,0.42,4.42,29.25
1,A,2021-01-11,58812,0.0,0.45,4.45,28.80
2,A,2021-01-18,61065,0.0,0.44,4.43,28.86
3,A,2021-01-25,65067,0.0,0.45,4.48,29.03
4,A,2021-02-01,67950,0.0,0.44,4.52,29.50
...,...,...,...,...,...,...,...
231,B,2023-03-06,12240,0.0,0.23,3.88,27.92
232,B,2023-03-13,12138,0.0,0.22,3.88,27.84
233,B,2023-03-20,11520,0.0,0.21,3.93,28.17
234,B,2023-03-27,9881,0.0,0.22,3.91,28.07


In [None]:
import plotly.express as px

# Mean payment fee per week, one line per payment category
px.line(data, x='orderdatetime', y='mean_payment_fee', color='payment_category')

In the line graph below, the dates on which the payment fee was changed (`2021-11-8`, `2022-5-30` and `2023-1-9`) are shown as dashed vertical lines.

In [None]:
import plotly.express as px

# Weekly order counts, one line per payment category
fig = px.line(data, x='orderdatetime', y='order_count', color='payment_category')

# Mark the dates on which the payment fee for category A changed
for change_date in ['2021-11-8', '2022-5-30', '2023-1-9']:
    fig.add_vline(x=change_date, line_width=2, line_dash="dash", line_color="black")

fig.show()