# Task: Calculate a running cumulative total

A SQLite database called `orders.db` is stored in the same directory as this notebook. The names of its tables and their columns are listed below. Your task is to calculate a running cumulative total of each customer's purchases.


**For example, given the following table:**

|  person  |  date | amount_spent |
|:--------:|:-----:|:------------:|
| person 1 | day 1 |       1      |
| person 2 | day 1 |       5      |
| person 1 | day 2 |       2      |
| person 2 | day 2 |       8      |
| person 1 | day 3 |       9      |
| person 2 | day 3 |       2      |


**A cumulative total for the above table would produce the following table:**


|  person  |  date | cumulative_total |
|:--------:|:-----:|:------------:|
| person 1 | day 1 |       1      |
| person 2 | day 1 |       5      |
| person 1 | day 2 |       3      |
| person 2 | day 2 |       13     |
| person 1 | day 3 |       12     |
| person 2 | day 3 |       15     |
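The arithmetic behind the example can be sketched in plain Python. This is only an illustration of what "cumulative total per person" means, not the SQL solution the task asks for; the names `rows`, `running`, and `cumulative` are made up for this sketch, and the data is the example table above typed in by hand.

```python
from collections import defaultdict

# The example table from above, as (person, date, amount_spent) tuples.
rows = [
    ("person 1", "day 1", 1),
    ("person 2", "day 1", 5),
    ("person 1", "day 2", 2),
    ("person 2", "day 2", 8),
    ("person 1", "day 3", 9),
    ("person 2", "day 3", 2),
]

# One running total per person, starting at 0.
running = defaultdict(int)
cumulative = []

# Walk the rows in date order; sorted() is stable, so rows that share
# a date keep their original relative order.
for person, date, amount in sorted(rows, key=lambda r: r[1]):
    running[person] += amount
    cumulative.append((person, date, running[person]))

for row in cumulative:
    print(row)
```

Each person's total only ever accumulates that person's own amounts, which is the behavior the query needs to reproduce for each customer.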


## SQL Tables

### Table name: `orders`

**Columns:**
- `order_id`
- `amount_spent`

### Table name: `customer_activity`

**Columns:**
- `date`
- `customer_id`
- `order_id`

**Starter code is provided below to set up a connection to the database and make querying it easier.**

The `run_query` helper defined below runs a query string against the database and returns the result as a pandas DataFrame.

In [1]:
# Run this code unchanged
import pandas as pd
import sqlite3
conn = sqlite3.connect('orders.db')

def run_query(query_string):
    
    return pd.read_sql(query_string, conn)

**Here is an example of writing a SQL query in a Jupyter notebook:**

In [3]:
# Run this code unchanged

## Triple quotations are used
## to allow for a multiline string
query = """

select *
from example_table
order by col3 desc

"""

## Pass the query into the `run_query` function
run_query(query)

Unnamed: 0,col1,col2,col3
0,4,5,6
1,1,2,3


# Write your query

Your query should produce a table with the following columns:
- date
- customer_id
- cumulative_total

**Sort the results by `date` in ascending order**

In [5]:
# YOUR CODE GOES HERE

query = """




"""

In [6]:

query = """

-- Running total of amount_spent per customer, accumulated in date order
select c.date
     , c.customer_id
     , sum(o.amount_spent)
       over(partition by c.customer_id
            order by c.date
            rows unbounded preceding) cumulative_total
from orders o
join customer_activity c
on o.order_id = c.order_id
order by c.date asc

"""
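To see how the window function behaves on data small enough to check by hand, here is a minimal sketch that runs the same query against an in-memory database. The table and column names follow the schema listed above, but the rows, the customer ids `'a'` and `'b'`, and the names `demo_conn`, `demo_query`, and `demo_result` are invented for illustration and are not taken from `orders.db`. It assumes the SQLite library bundled with your Python is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Hypothetical in-memory database; this does not touch orders.db.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
create table orders (order_id integer, amount_spent real);
create table customer_activity (date text, customer_id text, order_id integer);

insert into orders values (1, 1.0), (2, 5.0), (3, 2.0), (4, 8.0);
insert into customer_activity values
    ('2021-05-09', 'a', 1),
    ('2021-05-10', 'b', 2),
    ('2021-05-11', 'a', 3),
    ('2021-05-12', 'b', 4);
""")

# Same shape as the query above: partition by customer, accumulate by date.
demo_query = """
select date
     , customer_id
     , sum(amount_spent)
       over(partition by customer_id
            order by date
            rows unbounded preceding) cumulative_total
from orders o
join customer_activity c
on o.order_id = c.order_id
order by date asc
"""

demo_result = demo_conn.execute(demo_query).fetchall()
for row in demo_result:
    print(row)

demo_conn.close()
```

Customer `'a'` spends 1.0 and then 2.0, so its rows carry totals of 1.0 and 3.0; customer `'b'` spends 5.0 and then 8.0, giving 5.0 and 13.0. The `partition by` clause is what keeps the two customers' totals separate, while the final `order by` only controls how the finished rows are displayed.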

In [7]:
# Inspect the output of your query here
run_query(query)

Unnamed: 0,date,customer_id,cumulative_total
0,2021-05-09 20:36:10.930989,d945201f-e671-4404-9ed7-8cd5930909a6,2.052925
1,2021-05-10 20:36:10.930989,f1ef9d1e-75df-403f-ae2c-b3615248f3dc,26.615839
2,2021-05-11 20:36:10.930989,a2d9c4c3-1081-4a9c-b9af-ac162a87ef64,13.575465
3,2021-05-12 20:36:10.930989,306bed34-5643-412d-9bf8-80c446774419,3.836420
4,2021-05-13 20:36:10.930989,ba2216d6-4cbc-471d-b62d-4128af46099e,24.635933
...,...,...,...
295,2022-02-28 20:36:10.930989,d945201f-e671-4404-9ed7-8cd5930909a6,403.691951
296,2022-03-01 20:36:10.930989,d945201f-e671-4404-9ed7-8cd5930909a6,421.599754
297,2022-03-02 20:36:10.930989,f1ef9d1e-75df-403f-ae2c-b3615248f3dc,781.334980
298,2022-03-03 20:36:10.930989,ba2216d6-4cbc-471d-b62d-4128af46099e,616.290965


In [9]:
# Run this code to test the results of your query
from tests import test
a, r = test(query)

✅ Correct!

