## Importing Libraries

In [0]:
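# Note: the wildcard import shadows Python built-ins such as round, sum, min and max
# with the Spark column functions of the same name.
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql.window import Window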

**2686. Immediate Food Delivery III (Medium)**

**Table: Delivery**

| Column Name                 | Type    |
|-----------------------------|---------|
| delivery_id                 | int     |
| customer_id                 | int     |
| order_date                  | date    |
| customer_pref_delivery_date | date    |

delivery_id is the column with unique values of this table.
Each row contains information about food delivery to a customer that makes an order at some date and specifies a preferred delivery date (on the order date or after it).
If the customer's preferred delivery date is the same as the order date, the order is called immediate; otherwise, it is called scheduled.

**Write a solution to find the percentage of immediate orders on each unique order_date, rounded to 2 decimal places.** 

Return the result table ordered by order_date in ascending order.

The result format is in the following example.

**Example 1:**

**Input:** 

**Delivery table:**

| delivery_id | customer_id | order_date | customer_pref_delivery_date |
|-------------|-------------|------------|-----------------------------|
| 1           | 1           | 2019-08-01 | 2019-08-02                  |
| 2           | 2           | 2019-08-01 | 2019-08-01                  |
| 3           | 1           | 2019-08-01 | 2019-08-01                  |
| 4           | 3           | 2019-08-02 | 2019-08-13                  |
| 5           | 3           | 2019-08-02 | 2019-08-02                  |
| 6           | 2           | 2019-08-02 | 2019-08-02                  |
| 7           | 4           | 2019-08-03 | 2019-08-03                  |
| 8           | 1           | 2019-08-03 | 2019-08-03                  |
| 9           | 5           | 2019-08-04 | 2019-08-08                  |
| 10          | 2           | 2019-08-04 | 2019-08-18                  |

**Output:**

| order_date | immediate_percentage |
|------------|----------------------|
| 2019-08-01 | 66.67                |
| 2019-08-02 | 66.67                |
| 2019-08-03 | 100.00               |
| 2019-08-04 | 0.00                 |

**Explanation:** 
- On 2019-08-01 there were three orders: two immediate and one scheduled. The immediate percentage for that date was therefore 66.67.
- On 2019-08-02 there were three orders: two immediate and one scheduled. The immediate percentage for that date was therefore 66.67.
- On 2019-08-03 there were two orders, both immediate. The immediate percentage for that date was therefore 100.00.
- On 2019-08-04 there were two orders, both scheduled. The immediate percentage for that date was therefore 0.00.

The result is sorted by order_date in ascending order.
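
The arithmetic in the explanation above can be cross-checked without Spark. The following is a minimal sketch in plain Python, not part of the original solution: the helper name `immediate_percentage` and the `orders` list are illustrative assumptions, with the date pairs copied from the Example 1 input. It calls `builtins.round` explicitly because the wildcard import from `pyspark.sql.functions` at the top of this notebook shadows Python's built-in `round`.

```python
import builtins
from collections import defaultdict

# (order_date, customer_pref_delivery_date) pairs copied from the Example 1 input.
orders = [
    ("2019-08-01", "2019-08-02"),
    ("2019-08-01", "2019-08-01"),
    ("2019-08-01", "2019-08-01"),
    ("2019-08-02", "2019-08-13"),
    ("2019-08-02", "2019-08-02"),
    ("2019-08-02", "2019-08-02"),
    ("2019-08-03", "2019-08-03"),
    ("2019-08-03", "2019-08-03"),
    ("2019-08-04", "2019-08-08"),
    ("2019-08-04", "2019-08-18"),
]


def immediate_percentage(rows):
    """Hypothetical helper: percentage of immediate orders per order date."""
    totals = defaultdict(int)      # number of orders per order_date
    immediates = defaultdict(int)  # number of immediate orders per order_date
    for order_date, pref_date in rows:
        totals[order_date] += 1
        if order_date == pref_date:
            immediates[order_date] += 1
    # ISO-formatted date strings sort chronologically, so sorted() gives ascending dates.
    return {
        d: builtins.round(100 * immediates[d] / totals[d], 2)
        for d in sorted(totals)
    }


for order_date, pct in immediate_percentage(orders).items():
    print(order_date, pct)
```

The per-date values should agree with the expected output table, which gives a quick reference when validating the PySpark solution below.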

In [0]:
delivery_data_2686 = [
    (1, 1, "2019-08-01", "2019-08-02"),
    (2, 2, "2019-08-01", "2019-08-01"),
    (3, 1, "2019-08-01", "2019-08-01"),
    (4, 3, "2019-08-02", "2019-08-13"),
    (5, 3, "2019-08-02", "2019-08-02"),
    (6, 2, "2019-08-02", "2019-08-02"),
    (7, 4, "2019-08-03", "2019-08-03"),
    (8, 1, "2019-08-03", "2019-08-03"),
    (9, 5, "2019-08-04", "2019-08-08"),
    (10, 2, "2019-08-04", "2019-08-18"),
]

delivery_columns_2686 = ["delivery_id", "customer_id", "order_date", "customer_pref_delivery_date"]
# The dates are kept as ISO-formatted strings; equality comparison still identifies immediate orders.
delivery_df_2686 = spark.createDataFrame(delivery_data_2686, delivery_columns_2686)
delivery_df_2686.show()

+-----------+-----------+----------+---------------------------+
|delivery_id|customer_id|order_date|customer_pref_delivery_date|
+-----------+-----------+----------+---------------------------+
|          1|          1|2019-08-01|                 2019-08-02|
|          2|          2|2019-08-01|                 2019-08-01|
|          3|          1|2019-08-01|                 2019-08-01|
|          4|          3|2019-08-02|                 2019-08-13|
|          5|          3|2019-08-02|                 2019-08-02|
|          6|          2|2019-08-02|                 2019-08-02|
|          7|          4|2019-08-03|                 2019-08-03|
|          8|          1|2019-08-03|                 2019-08-03|
|          9|          5|2019-08-04|                 2019-08-08|
|         10|          2|2019-08-04|                 2019-08-18|
+-----------+-----------+----------+---------------------------+



In [0]:
# Flag each order: 1 if the preferred delivery date equals the order date, otherwise 0.
delivery_df_2686 = delivery_df_2686.withColumn(
    "is_immediate",
    when(col("order_date") == col("customer_pref_delivery_date"), 1).otherwise(0),
)

In [0]:
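# Percentage of immediate orders per order_date, rounded to 2 decimal places.
# display() is a Databricks notebook helper; use show() in plain PySpark.
delivery_df_2686.groupBy("order_date").agg(
    round(100 * count(when(col("is_immediate") == 1, True)) / count("*"), 2)
    .alias("immediate_percentage")
).orderBy("order_date").display()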

order_date,immediate_percentage
2019-08-01,66.67
2019-08-02,66.67
2019-08-03,100.0
2019-08-04,0.0
