# Part 4: The Early Bird

The neighbor passed the rug along to a woman he met on Tinder.

> Thankfully, this woman I met on Tinder came over at **5am** with her bike chain repair kit and some pastries from Noah’s. Apparently she liked to get up before dawn and **claim the first pastries that came out of the oven**.

So I need to find the customer who most often orders pastries before 5am.

In [1]:
import polars as pl

customers = pl.read_csv("data/noahs-customers.csv", try_parse_dates=True)
orders_items = pl.read_csv("data/noahs-orders_items.csv", try_parse_dates=True)
orders = pl.read_csv("data/noahs-orders.csv", try_parse_dates=True)
products = pl.read_csv("data/noahs-products.csv", try_parse_dates=True)

It seems that the prefix of each sku in the "products" table corresponds to the department of the store the item comes from.
In this case, bakery items start with 'BKY'.

In [8]:
bakery_items = products.filter(pl.col("sku").str.starts_with("BKY"))
print(bakery_items)

shape: (20, 4)
┌─────────┬─────────────────────────┬────────────────┬────────────────┐
│ sku     ┆ desc                    ┆ wholesale_cost ┆ dims_cm        │
│ ---     ┆ ---                     ┆ ---            ┆ ---            │
│ str     ┆ str                     ┆ f64            ┆ str            │
╞═════════╪═════════════════════════╪════════════════╪════════════════╡
│ BKY0303 ┆ Raspberry Sufganiah     ┆ 0.89           ┆ 14.4|13.1|1.9  │
│ BKY0542 ┆ Caraway Bagel           ┆ 1.1            ┆ 14.7|9.0|3.2   │
│ BKY1021 ┆ Raspberry Rugelach      ┆ 1.12           ┆ 14.5|9.6|2.6   │
│ BKY1115 ┆ Caraway Bialy           ┆ 1.13           ┆ 14.4|2.2|1.9   │
│ BKY3490 ┆ Raspberry Linzer Cookie ┆ 1.0            ┆ 18.6|13.1|10.4 │
│ …       ┆ …                       ┆ …              ┆ …              │
│ BKY8370 ┆ Sesame Puff             ┆ 0.92           ┆ 17.7|13.3|3.2  │
│ BKY8445 ┆ Poppyseed Hamentash     ┆ 0.96           ┆ 12.3|9.1|6.1   │
│ BKY9158 ┆ Poppyseed Rugelach      ┆ 1.03       

Now I'll find the customer who places bakery orders before 5am the most often.

In [17]:
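# Keep only the order lines that are bakery items, then attach the order details.
bakery_items_orders = orders_items.join(bakery_items, on="sku", how="inner")
bakery_orders = orders.join(bakery_items_orders, on="orderid", how="inner")

# Count the rows ordered before 5am per customer and take the customer with the most.
early_bird_customer_id = (
    bakery_orders.filter(pl.col("ordered").dt.hour() < 5)
    .group_by("customerid")
    .agg(pl.len())
    .top_k(1, by="len")
    .item(0, "customerid")
)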

print(
    customers.filter(pl.col("customerid") == early_bird_customer_id).select(
        "customerid", "name", "phone"
    )
)

shape: (1, 3)
┌────────────┬─────────────────┬──────────────┐
│ customerid ┆ name            ┆ phone        │
│ ---        ┆ ---             ┆ ---          │
│ i64        ┆ str             ┆ str          │
╞════════════╪═════════════════╪══════════════╡
│ 6455       ┆ Brittany Harmon ┆ 716-789-4433 │
└────────────┴─────────────────┴──────────────┘
