# Aggregate events from different indexes

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/google/temporian/blob/last-release/docs/src/recipes/aggregate_index.ipynb)

This recipe applies when you have events indexed by one or more features, and you want to drop some index levels and unify the events with the same timestamps.

In this example, we aggregate daily sales indexed by store and product into daily revenue per store (i.e., the total sales of all products in that store for each day).

## Example data

Let's define two stores, each with two products. Product IDs may or may not be shared across stores (here, `product_1` is sold in both stores).

For each store/product pair, we'll create the sales (in USD) for the same 3 days (January 1, 2 and 3, 2020).

In [None]:
import pandas as pd
import temporian as tp


sales_data = pd.DataFrame(
    data=[
        # Store 1: date, store ID, product ID, sales (USD)
        ["2020-01-01", "store_1", "product_1", 300.0],
        ["2020-01-02", "store_1", "product_1", 450.0],
        ["2020-01-03", "store_1", "product_1", 600.0],
        ["2020-01-01", "store_1", "product_2", 100.0],
        ["2020-01-02", "store_1", "product_2", 250.0],
        ["2020-01-03", "store_1", "product_2", 100.0],
        # Store 2: date, store ID, product ID, sales (USD)
        ["2020-01-01", "store_2", "product_1", 900.0],
        ["2020-01-02", "store_2", "product_1", 750.0],
        ["2020-01-03", "store_2", "product_1", 750.0],
        ["2020-01-01", "store_2", "product_3", 20.0],
        ["2020-01-02", "store_2", "product_3", 40.0],
        ["2020-01-03", "store_2", "product_3", 30.0],
    ],
    columns=[
        "timestamp",
        "store_id",
        "product_id",
        "sales_usd",
    ],
)

# Load data indexed by store/product
sales_evset = tp.from_pandas(sales_data, indexes=["store_id", "product_id"])
sales_evset.plot()

## Solution

We want to aggregate all product sales per store, which takes two steps:

1. Drop the `product_id` index level, since we don't need to distinguish between products.
2. Unify sales with the same timestamp and the same store by adding them up.
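
Before turning to Temporian, the two steps can be sketched in plain Python. This is only an illustration of the idea, not how Temporian implements it: the names `rows`, `dropped` and `totals` are made up for this sketch, and the data is the January 1 subset of the example above.

```python
from collections import defaultdict

# January 1 rows from the example data: (date, store, product, sales in USD).
rows = [
    ("2020-01-01", "store_1", "product_1", 300.0),
    ("2020-01-01", "store_1", "product_2", 100.0),
    ("2020-01-01", "store_2", "product_1", 900.0),
    ("2020-01-01", "store_2", "product_3", 20.0),
]

# Step 1: drop the product from the key. Each (store, date) key now
# appears once per product, so the keys are duplicated.
dropped = [((store, date), sales) for date, store, _product, sales in rows]

# Step 2: unify entries that share a store and date by adding them up.
totals = defaultdict(float)
for key, sales in dropped:
    totals[key] += sales

print(dict(totals))
# {('store_1', '2020-01-01'): 400.0, ('store_2', '2020-01-01'): 920.0}
```

The sections below do the same thing with Temporian operators, keeping `store_id` as the index.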

### 1. Drop index

We don't need to distinguish between the `product_id` values whose sales we're adding up in each store, so we drop that index level.

In [None]:
store_sales = sales_evset.drop_index("product_id")

store_sales["sales_usd"].plot()

As you can see, each timestamp now appears twice within each store, once for each product.

### 2. Unify events

We want to unify the events with the same timestamps, adding up their sales.

In [None]:
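# One sampling point per distinct timestamp in each store.
unique_days = store_sales.unique_timestamps()

# Sum the events in the 1-day window ending at each unique timestamp.
store_daily_sales = store_sales["sales_usd"].moving_sum(
    window_length=tp.duration.days(1), sampling=unique_days
)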

store_daily_sales.plot()
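
To see why a 1-day window sampled at the unique timestamps yields per-day totals, here is a minimal plain-Python sketch. It assumes the moving window covers the half-open interval `(t - window_length, t]`, which is worth checking against the Temporian documentation for your version. The helper `moving_sum_sketch` is hypothetical and stands in for the operator, and timestamps are simplified to day numbers.

```python
def moving_sum_sketch(events, sampling, window_length):
    """For each t in sampling, sum the values with timestamp in (t - window_length, t]."""
    return [
        sum(value for ts, value in events if t - window_length < ts <= t)
        for t in sampling
    ]


# store_1 after dropping product_id: (day number, sales in USD).
events = [(1, 300.0), (1, 100.0), (2, 450.0), (2, 250.0), (3, 600.0), (3, 100.0)]

# Unique timestamps, playing the role of unique_timestamps().
sampling = sorted({ts for ts, _ in events})

# A 1-day window excludes the previous day's events, giving per-day totals.
daily = moving_sum_sketch(events, sampling, window_length=1)
print(daily)  # [400.0, 700.0, 700.0]

# A 2-day window would mix each day with the one before it.
two_day = moving_sum_sketch(events, sampling, window_length=2)
print(two_day)  # [400.0, 1100.0, 1400.0]
```

Under that assumption, any window no longer than the spacing between consecutive timestamps gives the same per-timestamp totals, so the 1-day length suits daily data in particular.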