## 2021: Week 33 Excelling at adding one more row

If you've spent as long as I have in the data world, you will inevitably have had moments when a challenge is a lot harder to solve with your sophisticated tools than with Excel. The people you work with are likely to describe challenges to you in Excel terms and expect your solution to follow the same steps as their logic. It's not always that easy, though.

Last week when working with some client data (I've converted this to an Allchains example), my team was challenged to look at Orders captured in a weekly snapshot that was then exported into Excel. 

Each week, the file shows every order that is still open, i.e. not yet fulfilled (delivered to the customer). The challenge is to classify each order as new (the first report it appears in), unfulfilled (any subsequent report it appears in) or fulfilled (the week after the order last appears in a report). The catch is that the snapshots never contain a row for that final week, so how do we show whether an order was fulfilled and when?

In Excel, we'd stack all of those rows of data on top of each other and just INSERT an extra row for each order after the last time it appears in a weekly snapshot. We don't have the ability to right-click and add an extra row in Prep, so we need to think of some alternate logic.

### Input
5 worksheets in one Excel file with the same format
![img](https://1.bp.blogspot.com/-ciSacUA9Css/YRLDuvVsyxI/AAAAAAAACPY/9htDlXATbpojtlX4mNHQmqgdBbaYvBKiwCLcBGAsYHQ/s320/Screenshot%2B2021-08-10%2Bat%2B19.18.54.png)

### Requirement
- Input the data
- Create one complete data set
- Use the Table Names field to create the Reporting Date
- Find the Minimum and Maximum date where an order appeared in the reports
- Add one week on to the maximum date to show when an order was fulfilled by
- Apply this logic:
    - The first time an order appears it should be classified as a 'New Order'
    - The week after the last time an order appears in a report (the maximum date) is when the order is classed as 'Fulfilled' 
    - Any week between 'New Order' and 'Fulfilled' status is classed as an 'Unfulfilled Order' 
- Pull all of the data sets together
- Remove any unnecessary fields
- Output the data
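
The min/max and classification steps above can be sketched in pandas. This is only an illustration under stated assumptions: the toy rows are made up rather than taken from the challenge file, `classify_orders` is a hypothetical helper name, and the `Order Status` column name is an assumed spelling of the output field.

```python
import pandas as pd


def classify_orders(snapshots: pd.DataFrame) -> pd.DataFrame:
    """Label each snapshot row and append one 'Fulfilled' row per order."""
    out = snapshots.copy()

    # The first report an order appears in marks it as new;
    # every later appearance is an unfulfilled order.
    first_seen = out.groupby("Orders")["Reporting Date"].transform("min")
    is_new = out["Reporting Date"] == first_seen
    out["Order Status"] = is_new.map({True: "New Order", False: "Unfulfilled Order"})

    # The "extra row": last report date per order, pushed on by one week.
    fulfilled = out.groupby(["Orders", "Sale Date"], as_index=False)["Reporting Date"].max()
    fulfilled["Reporting Date"] = fulfilled["Reporting Date"] + pd.Timedelta(weeks=1)
    fulfilled["Order Status"] = "Fulfilled"

    combined = pd.concat([out, fulfilled], ignore_index=True)
    return combined.sort_values(["Orders", "Reporting Date"]).reset_index(drop=True)


# Toy data (illustrative only): A appears once, B three times, C once.
toy = pd.DataFrame({
    "Orders": ["A", "B", "B", "C", "B"],
    "Sale Date": pd.to_datetime(
        ["2020-12-29", "2020-12-31", "2020-12-31", "2021-01-05", "2020-12-31"]
    ),
    "Reporting Date": pd.to_datetime(
        ["2021-01-01", "2021-01-01", "2021-01-08", "2021-01-08", "2021-01-15"]
    ),
})

result = classify_orders(toy)
print(result)
```

Building the fulfilled rows as a separate frame and concatenating them stands in for Excel's right-click INSERT: each order gains exactly one row dated a week after its last snapshot. Note that the dates must be real datetimes, not strings, for the minimum, maximum and one-week offset to behave.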

### Output
![img](https://1.bp.blogspot.com/-QAVqr4bOUQk/YRLTJYSNtyI/AAAAAAAACPg/uPmCoQale7cXWUbBzAdHveNsQ8Fxz4uQACLcBGAsYHQ/w640-h472/Screenshot%2B2021-08-10%2Bat%2B20.27.21.png)

4 data fields:
- Order status
- Orders
- Sales Date
- Reporting Date

35 rows (36 rows including headers)

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [7]:
data = pd.read_excel("./data/Allchains Weekly Orders.xlsx", sheet_name=[0, 1, 2, 3, 4])

In [9]:
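results = []
# Reporting dates for the five weekly snapshots, all written as MM/DD/YYYY
reporting_date = ["01/01/2021", "01/08/2021", "01/15/2021", "01/22/2021", "01/29/2021"]

# Tag each worksheet with its reporting date, then stack them into one data set
for i, sheet in data.items():
    df = sheet.copy()
    df["Reporting Date"] = reporting_date[i]
    results.append(df)
df = pd.concat(results, axis=0)
df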

Unnamed: 0,Orders,Sale Date,Reporting Date
0,A,2020-12-29,01/01/2021
1,B,2020-12-31,01/01/2021
2,C,2021-01-01,01/01/2021
0,B,2020-12-31,01/08/2021
1,C,2021-01-01,01/08/2021
2,D,2021-01-04,01/08/2021
3,E,2021-01-07,01/08/2021
4,F,2021-01-08,01/08/2021
0,B,2020-12-31,01/15/2021
1,D,2021-01-04,01/15/2021
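
The reporting dates above are typed in by hand, whereas the requirement asks for them to come from the table (worksheet) names. A minimal sketch of that idea follows, assuming each worksheet name parses as a date. The sheet names and the `%Y-%m-%d` format here are hypothetical, since the real names are not shown, and `stack_sheets` is an illustrative helper. With a real workbook, `pd.read_excel(path, sheet_name=None)` returns a dict of this shape, keyed by worksheet name.

```python
import pandas as pd


def stack_sheets(sheets, date_format="%Y-%m-%d"):
    """Stack a {sheet name: DataFrame} dict, using each name as the Reporting Date."""
    frames = []
    for name, frame in sheets.items():
        tagged = frame.copy()
        # Assumption: the worksheet name is a date in `date_format`.
        tagged["Reporting Date"] = pd.to_datetime(name, format=date_format)
        frames.append(tagged)
    return pd.concat(frames, ignore_index=True)


# Stand-in for the dict returned by pd.read_excel(..., sheet_name=None).
demo_sheets = {
    "2021-01-01": pd.DataFrame({"Orders": ["A", "B"]}),
    "2021-01-08": pd.DataFrame({"Orders": ["B"]}),
}

combined = stack_sheets(demo_sheets)
print(combined)
```

Parsing the name with an explicit format also avoids the ambiguity between day-first and month-first date strings.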


In [6]:
data[0]

Unnamed: 0,Orders,Sale Date
0,A,2020-12-29
1,B,2020-12-31
2,C,2021-01-01
