## 2021: Week 37 - Re-looking at Phone Contract Revenue

The latest version of Tableau Prep was released this week! It includes a great new feature that lets you generate new rows that don't already exist in your data set. If you want to know more, check out Carl's blog post, where he covers it in detail!

As there's a new feature that makes our lives that little bit easier, I thought it would be a great opportunity to revisit one of the older challenges to see how things have changed. So this week we are going almost all the way back to where Preppin' Data began: Challenge 2019 Week 3!

If you haven't completed this challenge, you can find the original post here. For this challenge we are going to calculate the recurring revenue based on the length of a mobile phone contract. This can be completed in the new version of Prep (2021.3 onwards) or in an older version if you haven't had the opportunity to download it yet (better yet, why not try both!).

### Scenario
You work for a mobile / cell phone company. Your boss asks you to pull together the revenue report for your current batch of contracts (sadly there are only four contracts!). They need to know how much revenue is generated each month from these contracts whilst they are 'live' (i.e. from their start date until 'x' months in the future, when the contract runs out).

### Requirement
- Input the Data
- Calculate the End Date for each contract
- Create a Row for each month a person will hold the contract
- Calculate the monthly cumulative cost of each person's contract
- Output the Data
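
The steps above can be sketched in pandas roughly as follows. This is a minimal illustration rather than the notebook's own method: the four contracts are typed in by hand from the input table instead of being read from Excel, and `Month Number` is a helper column introduced only for this sketch.

```python
import pandas as pd

# Contract details typed in by hand from the challenge input (assumption:
# the real workflow reads these from the Excel file instead).
contracts = pd.DataFrame({
    "Name": ["Carl", "Jonathan", "Andy", "Sophie"],
    "Monthly Cost": [20, 15, 45, 30],
    "Contract Length (months)": [24, 6, 12, 12],
    "Start Date": pd.to_datetime(
        ["2018-12-13", "2019-02-22", "2018-10-17", "2018-11-19"]
    ),
})

# Repeat each contract once per month of its length.
rows = contracts.loc[
    contracts.index.repeat(contracts["Contract Length (months)"])
].reset_index(drop=True)

# 0 for the first payment, 1 for the second, and so on (helper column).
rows["Month Number"] = rows.groupby("Name").cumcount()

# Each payment date is the start date shifted by a whole number of months.
rows["Payment Date"] = [
    start + pd.DateOffset(months=int(k))
    for start, k in zip(rows["Start Date"], rows["Month Number"])
]

# Running total of the monthly cost within each person's contract.
rows["Cumulative Monthly Cost"] = rows.groupby("Name")["Monthly Cost"].cumsum()

print(rows[["Name", "Payment Date", "Monthly Cost", "Cumulative Monthly Cost"]])
```

Because every payment date is computed from the original start date, a contract starting late in a month would not drift to an earlier day after passing through a short month.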

### Output
![img](https://1.bp.blogspot.com/-QLr15tWCcxQ/YUBuX_t2CRI/AAAAAAAAHZM/loe_8PMYy8kcofoEYXeC6f5EquPDKJmqACLcBGAsYHQ/s320/Output.png)

In [213]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Input the Data

In [214]:
data = pd.read_excel("./data/2021 Week 37 Input.xlsx", sheet_name=[0, 1])
contract = data[0].copy()

In [215]:
contract

Unnamed: 0,Name,Monthly Cost,Contract Length (months),Start Date
0,Carl,20,24,2018-12-13
1,Jonathan,15,6,2019-02-22
2,Andy,45,12,2018-10-17
3,Sophie,30,12,2018-11-19


In [216]:
contract.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
 #   Column                    Non-Null Count  Dtype         
---  ------                    --------------  -----         
 0   Name                      4 non-null      object        
 1   Monthly Cost              4 non-null      int64         
 2   Contract Length (months)  4 non-null      int64         
 3   Start Date                4 non-null      datetime64[ns]
dtypes: datetime64[ns](1), int64(2), object(1)
memory usage: 256.0+ bytes


### Calculate the End Date for each contract

In [217]:
dt = pd.Timestamp(contract["Start Date"][0])
dt + pd.DateOffset(months=24)

Timestamp('2020-12-13 00:00:00')

In [218]:
contract["End Date"] = contract.apply(lambda x: x["Start Date"] + pd.DateOffset(months=x["Contract Length (months)"]), axis=1)
contract["End Date"]

0   2020-12-13
1   2019-08-22
2   2019-10-17
3   2019-11-19
Name: End Date, dtype: datetime64[ns]
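
One thing worth knowing about `pd.DateOffset(months=...)` is how it behaves when the target month is shorter than the start day. None of the four contracts here hit that case, so the dates below are made up purely for illustration:

```python
import pandas as pd

# Hypothetical start dates (not from the challenge data) that fall on the 31st.
jan_31 = pd.Timestamp("2019-01-31")
leap_jan_31 = pd.Timestamp("2020-01-31")

# Adding one month clamps to the last valid day of the target month.
feb_end = jan_31 + pd.DateOffset(months=1)
leap_feb_end = leap_jan_31 + pd.DateOffset(months=1)

print(feb_end)       # 2019-02-28 00:00:00
print(leap_feb_end)  # 2020-02-29 00:00:00
```

So an end date calculated this way is always a real calendar date, even for contracts that start on the 29th, 30th or 31st.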

### Create a Row for each month a person will hold the contract

In [219]:
contract = contract.melt(id_vars=["Name", "Monthly Cost", "Contract Length (months)"], 
                         var_name="Period Indication", value_name="Contract period")
contract

Unnamed: 0,Name,Monthly Cost,Contract Length (months),Period Indication,Contract period
0,Carl,20,24,Start Date,2018-12-13
1,Jonathan,15,6,Start Date,2019-02-22
2,Andy,45,12,Start Date,2018-10-17
3,Sophie,30,12,Start Date,2018-11-19
4,Carl,20,24,End Date,2020-12-13
5,Jonathan,15,6,End Date,2019-08-22
6,Andy,45,12,End Date,2019-10-17
7,Sophie,30,12,End Date,2019-11-19


In [220]:
contract["day"] = contract["Contract period"].dt.day
contract

Unnamed: 0,Name,Monthly Cost,Contract Length (months),Period Indication,Contract period,day
0,Carl,20,24,Start Date,2018-12-13,13
1,Jonathan,15,6,Start Date,2019-02-22,22
2,Andy,45,12,Start Date,2018-10-17,17
3,Sophie,30,12,Start Date,2018-11-19,19
4,Carl,20,24,End Date,2020-12-13,13
5,Jonathan,15,6,End Date,2019-08-22,22
6,Andy,45,12,End Date,2019-10-17,17
7,Sophie,30,12,End Date,2019-11-19,19


In [221]:
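# Resample each person's start/end rows into month-end bins, which creates one row per calendar month
new_rows = contract.set_index("Contract period").groupby(["Name"]).resample("M", origin="start").size().reset_index()
# The last bin per person is the month of the End Date, which is not a payment month, so drop it
drop_idx = new_rows.drop_duplicates(subset=["Name"], keep="last").index.tolist()
new_rows = new_rows.drop(drop_idx, axis=0)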
new_rows.shape

(54, 3)

In [222]:
new_rows["year"] = new_rows["Contract period"].dt.year
new_rows["month"] = new_rows["Contract period"].dt.month
new_rows = new_rows.merge(contract[["Name", "day"]].drop_duplicates(), how="left", on="Name")
new_rows

Unnamed: 0,Name,Contract period,0,year,month,day
0,Andy,2018-10-31,1,2018,10,17
1,Andy,2018-11-30,0,2018,11,17
2,Andy,2018-12-31,0,2018,12,17
3,Andy,2019-01-31,0,2019,1,17
4,Andy,2019-02-28,0,2019,2,17
5,Andy,2019-03-31,0,2019,3,17
6,Andy,2019-04-30,0,2019,4,17
7,Andy,2019-05-31,0,2019,5,17
8,Andy,2019-06-30,0,2019,6,17
9,Andy,2019-07-31,0,2019,7,17


In [223]:
new_rows[["year", "month", "day"]] = new_rows[["year", "month", "day"]].astype(str)
new_rows["Contract period"] = new_rows["day"] + "/" + new_rows["month"] + "/" + new_rows["year"]

In [224]:
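# The strings are built as day/month/year, so state the format explicitly rather than relying on inference
new_rows["Contract period"] = pd.to_datetime(new_rows["Contract period"], format="%d/%m/%Y")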
new_rows = new_rows.drop([0, "year", "month", "day"], axis=1)
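
The resample-then-rebuild approach above works for this data because every start day (13, 17, 19, 22) exists in every month. As an alternative sketch, the monthly rows can be generated directly with `pd.date_range` and a one-month `pd.DateOffset` as the frequency. The frame below is typed in by hand from the input table, and `payment_rows` and `month_step` are names made up for this example:

```python
import pandas as pd

# Contract details typed in by hand from the challenge input.
contracts = pd.DataFrame({
    "Name": ["Carl", "Jonathan", "Andy", "Sophie"],
    "Contract Length (months)": [24, 6, 12, 12],
    "Start Date": pd.to_datetime(
        ["2018-12-13", "2019-02-22", "2018-10-17", "2018-11-19"]
    ),
})

month_step = pd.DateOffset(months=1)

# One record per (person, payment date): start date, then one step per month.
payment_rows = pd.DataFrame([
    {"Name": name, "Payment Date": date}
    for name, start, length in zip(
        contracts["Name"],
        contracts["Start Date"],
        contracts["Contract Length (months)"],
    )
    for date in pd.date_range(start=start, periods=int(length), freq=month_step)
])

print(payment_rows.groupby("Name").size())
```

This skips the split into year, month and day strings entirely. One caveat: the range is built by stepping one month at a time, so a start date on the 29th to 31st could settle on an earlier day after a short month.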

### Calculate the monthly cumulative cost of each person's contract

In [225]:
cost_info = contract[["Name", "Monthly Cost"]].drop_duplicates()
new_rows = new_rows.merge(cost_info, how="left", on="Name")
new_rows.head()

Unnamed: 0,Name,Contract period,Monthly Cost
0,Andy,2018-10-17,45
1,Andy,2018-11-17,45
2,Andy,2018-12-17,45
3,Andy,2019-01-17,45
4,Andy,2019-02-17,45


In [231]:
# Running total of Monthly Cost within each person's contract
new_rows["Cumulative Monthly Cost"] = new_rows.groupby("Name")["Monthly Cost"].cumsum()
new_rows = new_rows.rename(columns={"Contract period": "Payment Date"})
new_rows

Unnamed: 0,Name,Payment Date,Monthly Cost,Cumulative Monthly Cost
0,Andy,2018-10-17,45,45
1,Andy,2018-11-17,45,90
2,Andy,2018-12-17,45,135
3,Andy,2019-01-17,45,180
4,Andy,2019-02-17,45,225
5,Andy,2019-03-17,45,270
6,Andy,2019-04-17,45,315
7,Andy,2019-05-17,45,360
8,Andy,2019-06-17,45,405
9,Andy,2019-07-17,45,450


### Output the Data

In [232]:
# index=False keeps the DataFrame's row index out of the output file
new_rows.to_csv("./output/Week37_output.csv", index=False)