## 2021: Week 19 Prep Air Project Details

Continuing our fundamental skills theme in May, this week's challenge is all about string calculations. If you want to know more about what string data is and how you might work with it, check out our "How to... deal with strings" post. Pay particular attention to split, right, mid and find if you get stuck with this challenge! I've also written another guide to the common functions you might find useful when working with strings in Tableau.
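The functions named above are Tableau functions, but plain Python has rough equivalents. The sketch below is only an illustration, using one commentary entry in the shape seen later in this notebook; the variable names are made up for the example.

```python
# One entry in the shape of this week's commentary log.
comment = "[NLS/Mar-Sc] Scope completed. tom."

# split: break the text on a delimiter and keep one piece.
header = comment.split("]")[0].lstrip("[")            # "NLS/Mar-Sc"

# find: locate the position of a character (0-based in Python).
slash_at = header.find("/")
dash_at = header.find("-")

# mid: slice characters out of the middle of the string.
sub_project_code = header[slash_at + 1:dash_at]       # "Mar"

# right: slice the last n characters.
task_code = header[-2:]                               # "Sc"

print(header, sub_project_code, task_code)
```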

### The Challenge
This week we are trying to find out more about what is going on with the project overruns at Prep Air (every data prepper's favourite airline). To get more detail than was shared last week, we've uncovered the commentary log that sits behind our project management system. Like any log file that holds the detail shown on a programme's interface, it contains great detail but stores it in an unfriendly way. 

We need your help to get stuck into the messy data and extract out the useful details.

### Input
One Excel file with 5 sheets of data.

![img](https://1.bp.blogspot.com/-_RJxcg2g448/YJUmTjDio2I/AAAAAAAACKg/yS9RP61aW5kT_ns68lqarxS17Bc15u03QCLcBGAsYHQ/w640-h126/Screenshot%2B2021-05-07%2Bat%2B12.36.27.png)

There is one main page of data with four lookup tables that will help you change some abbreviations and codes to be full words.

### Requirement
- Input the data

There are lots of different ways you can do this challenge so rather than a step-by-step set of requirements, feel free to create each of these data fields in whatever order you like:

- 'Week' with the word week and week number together 'Week x' 
- 'Project' with the full project name
- 'Sub-Project' with the full sub-project name
- 'Task' with the full type of task
- 'Name' with the owner of the task's full name (Week 18's output can help you check these if needed) 
- 'Days Noted': some comments say how many days a task might take. This field should hold the number of days mentioned in the comment; otherwise leave it as a null. 
- 'Detail' the description from the system output with the project details in the [ ] 
- Output the file
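As one possible way to picture these fields, the sketch below pulls the bracketed codes and the description out of a single entry with a regular expression. It assumes every entry looks like `[Project/SubProject-Task] description`; the pattern and the `parse_entry` helper are illustrative names, not part of the challenge.

```python
import re

# Assumed entry layout: "[<project>/<sub-project>-<task>] <detail>"
ENTRY = re.compile(
    r"\[(?P<project>[^/\]]+)/(?P<sub_project>[^-\]]+)-(?P<task>[^\]]+)\]\s*(?P<detail>.*)"
)

def parse_entry(entry: str) -> dict:
    """Return the codes and the detail text of one commentary entry."""
    match = ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"Unrecognised entry: {entry!r}")
    return match.groupdict()

fields = parse_entry("[NTI/Mar-De] Delivery next week around 8 days. tom.")
print(fields)
```

The codes returned here still need the lookup tables to become full names.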

### Output

![img](https://1.bp.blogspot.com/-0kJkW2UFz6c/YJVDcvzWB_I/AAAAAAAACKo/7bPo-kni-agXtQhj5a7noeGeri2oDCo_wCLcBGAsYHQ/w640-h136/Screenshot%2B2021-05-07%2Bat%2B14.40.44.png)

One file
- 7 data fields:
    - Week
    - Project
    - Sub-Project
    - Task
    - Name
    - Days Needed
    - Detail

18 rows of data (19 including headers)


In [384]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [385]:
### Input the data
data = pd.read_excel("./data/PD 2021 Week 19 Input.xlsx", 
                     sheet_name=["Project Schedule Updates", "Project Lookup Table", "Sub-Project Lookup Table",
                                 "Task Lookup Table", "Owner Lookup Table"])

In [386]:
project_schedule = data["Project Schedule Updates"].copy()
project_schedule

Unnamed: 0,Week,Commentary
0,16.0,[NLS/Op-Sc] Delivered scope for the project. R...
1,17.0,[NLS/Op-Bu] Build kickoff but long project. je...
2,18.0,[NLS/Op-De] Long delivery process has begun at...
3,19.0,[NTI/Mar-Bu] Project build commences. Will be ...
4,20.0,[NTI/Mar-De] Delivery next week around 8 days....


In [387]:
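# Each cell can hold several "[...]" entries, so split on "[" and spread the pieces into columns
tmp = project_schedule["Commentary"].map(lambda x: x.split("[")).apply(pd.Series)
tmp = pd.concat([project_schedule["Week"], tmp], axis=1)
# Melt to one entry per row and drop missing pieces; rows 0-4 hold the empty text before each first "["
tmp = tmp.melt(id_vars="Week").dropna().loc[5:, ["Week", "value"]]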

project_schedule = tmp.copy()
project_schedule = project_schedule.rename(columns={"value": "Commentary"})
project_schedule

Unnamed: 0,Week,Commentary
5,16.0,NLS/Op-Sc] Delivered scope for the project. Re...
6,17.0,NLS/Op-Bu] Build kickoff but long project. jen.
7,18.0,NLS/Op-De] Long delivery process has begun at ...
8,19.0,NTI/Mar-Bu] Project build commences. Will be c...
9,20.0,NTI/Mar-De] Delivery next week around 8 days. ...
11,17.0,NLS/Mar-Sc] Scope completed. tom.
12,18.0,NLS/Mar-De] Similar to the operations team. 8 ...
13,19.0,NTI/Op-Bu] Longer build than the easy marketin...
14,20.0,NTI/Ops-De] Delivery also next week. Same as t...
16,17.0,NLS/Mar-Bu] Marketing Build complete. tom.
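An alternative to the melt-based reshaping above is `str.split` followed by `explode`, which avoids relying on row positions. This is a minimal sketch on a small hand-built frame: the two sample rows are assembled from entries shown in this notebook and are an assumption about how the raw cells look, not a copy of the input file.

```python
import pandas as pd

# Hand-built sample; the real sheet is assumed to hold several entries per cell.
raw = pd.DataFrame({
    "Week": [17.0, 20.0],
    "Commentary": [
        "[NLS/Op-Bu] Build kickoff but long project. jen. [NLS/Mar-Sc] Scope completed. tom.",
        "[NTI/Mar-De] Delivery next week around 8 days. tom.",
    ],
})

# Split each cell on "[" into a list, then give every list item its own row.
entries = raw.assign(Commentary=raw["Commentary"].str.split("[")).explode("Commentary")

# Tidy whitespace and drop the empty piece that precedes the first "[".
entries["Commentary"] = entries["Commentary"].str.strip()
entries = entries[entries["Commentary"] != ""].reset_index(drop=True)
print(entries)
```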


In [388]:
### 'Week' with the word week and week number together 'Week x' 

In [389]:
project_schedule["Week"] = project_schedule["Week"].astype(int).astype(str)
project_schedule["Week"] = "Week " + project_schedule["Week"]
project_schedule

Unnamed: 0,Week,Commentary
5,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...
6,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.
7,Week 18,NLS/Op-De] Long delivery process has begun at ...
8,Week 19,NTI/Mar-Bu] Project build commences. Will be c...
9,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...
11,Week 17,NLS/Mar-Sc] Scope completed. tom.
12,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...
13,Week 19,NTI/Op-Bu] Longer build than the easy marketin...
14,Week 20,NTI/Ops-De] Delivery also next week. Same as t...
16,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.


In [390]:
### 'Project' with the full project name

In [391]:
project_code = data["Project Lookup Table"].copy()

In [392]:
project_schedule["project_sub_task"] = project_schedule["Commentary"].map(lambda x: x.split(" ")[0])
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task
5,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc]
6,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu]
7,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De]
8,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu]
9,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De]
11,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc]
12,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De]
13,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu]
14,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De]
16,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu]


In [393]:
project_schedule["Project"] = project_schedule["project_sub_task"].map(lambda x: x.split("/")[0])

In [394]:
project_schedule = (project_schedule
                        .merge(project_code, how="left", left_on="Project", right_on="Project Code")
                        .drop(["Project_x"], axis=1)
                        .rename(columns={"Project_y" : "Project"}))
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task,Project Code,Project
0,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc],NLS,New Loyalty Scheme
1,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu],NLS,New Loyalty Scheme
2,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De],NLS,New Loyalty Scheme
3,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu],NTI,New Trolley Inventory
4,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De],NTI,New Trolley Inventory
5,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc],NLS,New Loyalty Scheme
6,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De],NLS,New Loyalty Scheme
7,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu],NTI,New Trolley Inventory
8,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De],NTI,New Trolley Inventory
9,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu],NLS,New Loyalty Scheme
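For a simple code-to-name lookup like this one, `Series.map` can stand in for the merge and avoids the `_x`/`_y` column juggling. The sketch below rebuilds the project lookup from the values shown above; the `codes` series and its unmatched "XYZ" value are made up for illustration.

```python
import pandas as pd

# Lookup values as they appear in the merged output above.
project_code = pd.DataFrame({
    "Project Code": ["NLS", "NTI"],
    "Project": ["New Loyalty Scheme", "New Trolley Inventory"],
})

# Illustrative extracted codes; "XYZ" is a made-up code with no match.
codes = pd.Series(["NLS", "NTI", "NLS", "XYZ"])

# Index the lookup by code so map() can translate each value; misses become NaN.
lookup = project_code.set_index("Project Code")["Project"]
full_names = codes.map(lookup)
print(full_names)
```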


In [395]:
### 'Sub-Project' with the full sub-project name

In [396]:
project_schedule["Sub-Project"] = project_schedule["project_sub_task"].map(lambda x: x.split("/")[1].split("-")[0])

In [397]:
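# Normalise the one inconsistent code ("Ops") so it matches the lookup table, without chained assignment
project_schedule["Sub-Project"] = project_schedule["Sub-Project"].replace({"Ops": "Op"})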
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task,Project Code,Project,Sub-Project
0,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc],NLS,New Loyalty Scheme,Op
1,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu],NLS,New Loyalty Scheme,Op
2,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De],NLS,New Loyalty Scheme,Op
3,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu],NTI,New Trolley Inventory,Mar
4,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De],NTI,New Trolley Inventory,Mar
5,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc],NLS,New Loyalty Scheme,Mar
6,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De],NLS,New Loyalty Scheme,Mar
7,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu],NTI,New Trolley Inventory,Op
8,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De],NTI,New Trolley Inventory,Op
9,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu],NLS,New Loyalty Scheme,Mar


In [398]:
sub_project_code = data["Sub-Project Lookup Table"].copy()
sub_project_code["Sub-Project Code"] = sub_project_code["Sub-Project Code"].str.capitalize()
sub_project_code

Unnamed: 0,Sub-Project Code,Sub-Project
0,Mar,Marketing
1,Op,Operations


In [399]:
project_schedule = (project_schedule
                        .merge(sub_project_code, how="left", left_on="Sub-Project", right_on="Sub-Project Code")
                        .drop(["Sub-Project_x"], axis=1)
                        .rename(columns={"Sub-Project_y" : "Sub-Project"})
                   )
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task,Project Code,Project,Sub-Project Code,Sub-Project
0,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc],NLS,New Loyalty Scheme,Op,Operations
1,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu],NLS,New Loyalty Scheme,Op,Operations
2,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De],NLS,New Loyalty Scheme,Op,Operations
3,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu],NTI,New Trolley Inventory,Mar,Marketing
4,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De],NTI,New Trolley Inventory,Mar,Marketing
5,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc],NLS,New Loyalty Scheme,Mar,Marketing
6,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De],NLS,New Loyalty Scheme,Mar,Marketing
7,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu],NTI,New Trolley Inventory,Op,Operations
8,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De],NTI,New Trolley Inventory,Op,Operations
9,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu],NLS,New Loyalty Scheme,Mar,Marketing
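Inconsistent codes such as "Ops" are easy to miss by eye. One way to surface them, sketched here on a small made-up frame, is a left merge with `indicator=True`, which labels rows that found no partner in the lookup table. The `extracted` frame and its `code` column are illustrative names.

```python
import pandas as pd

# Lookup values as shown above (after capitalising the codes).
sub_project_code = pd.DataFrame({
    "Sub-Project Code": ["Mar", "Op"],
    "Sub-Project": ["Marketing", "Operations"],
})

# Illustrative extracted codes, including the inconsistent "Ops".
extracted = pd.DataFrame({"code": ["Op", "Mar", "Ops"]})

# indicator=True adds a "_merge" column: "both" or "left_only".
check = extracted.merge(
    sub_project_code, how="left",
    left_on="code", right_on="Sub-Project Code", indicator=True,
)
unmatched = check.loc[check["_merge"] == "left_only", "code"].tolist()
print(unmatched)

# Once the odd values are known, map them explicitly to the expected code.
extracted["code"] = extracted["code"].replace({"Ops": "Op"})
```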


In [400]:
### 'Task' with the full type of task

In [401]:
task_code = data["Task Lookup Table"].copy()
task_code

Unnamed: 0,Task Code,Task
0,Sc,Scope
1,Bu,Build
2,De,Deliver


In [402]:
project_schedule["Task"] = project_schedule["project_sub_task"].map(lambda x: x.split("-")[1].replace("]", ""))
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task,Project Code,Project,Sub-Project Code,Sub-Project,Task
0,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc],NLS,New Loyalty Scheme,Op,Operations,Sc
1,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu],NLS,New Loyalty Scheme,Op,Operations,Bu
2,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De],NLS,New Loyalty Scheme,Op,Operations,De
3,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu],NTI,New Trolley Inventory,Mar,Marketing,Bu
4,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De],NTI,New Trolley Inventory,Mar,Marketing,De
5,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc],NLS,New Loyalty Scheme,Mar,Marketing,Sc
6,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De],NLS,New Loyalty Scheme,Mar,Marketing,De
7,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu],NTI,New Trolley Inventory,Op,Operations,Bu
8,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De],NTI,New Trolley Inventory,Op,Operations,De
9,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu],NLS,New Loyalty Scheme,Mar,Marketing,Bu


In [403]:
project_schedule = (project_schedule
                        .merge(task_code, how="left", left_on="Task", right_on="Task Code")
                        .drop(["Task_x"], axis=1)
                        .rename(columns={"Task_y" : "Task"})
                   )
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task,Project Code,Project,Sub-Project Code,Sub-Project,Task Code,Task
0,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc],NLS,New Loyalty Scheme,Op,Operations,Sc,Scope
1,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu],NLS,New Loyalty Scheme,Op,Operations,Bu,Build
2,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De],NLS,New Loyalty Scheme,Op,Operations,De,Deliver
3,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu],NTI,New Trolley Inventory,Mar,Marketing,Bu,Build
4,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De],NTI,New Trolley Inventory,Mar,Marketing,De,Deliver
5,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc],NLS,New Loyalty Scheme,Mar,Marketing,Sc,Scope
6,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De],NLS,New Loyalty Scheme,Mar,Marketing,De,Deliver
7,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu],NTI,New Trolley Inventory,Op,Operations,Bu,Build
8,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De],NTI,New Trolley Inventory,Op,Operations,De,Deliver
9,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu],NLS,New Loyalty Scheme,Mar,Marketing,Bu,Build


In [404]:
### 'Name' with the owner of the task's full name (Week 18's output can help you check these if needed) 

In [405]:
owner_code = data["Owner Lookup Table"].copy()
owner_code

Unnamed: 0,Abbreviation,Name
0,Tom,Tom
1,Jen,Jenny
2,Jon,Jonathan
3,Car,Carl


In [406]:
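# The owner's abbreviation is the last word before the final full stop, e.g. "... tom."
project_schedule["Name"] = project_schedule["Commentary"].map(lambda x: x.split(".")[-2].strip())
# Capitalise so the value matches the 'Abbreviation' column of the owner lookup table
project_schedule["Name"] = project_schedule["Name"].map(lambda x: x.capitalize())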
project_schedule["Name"]

0     Jen
1     Jen
2     Jen
3     Tom
4     Tom
5     Tom
6     Tom
7     Jen
8     Jen
9     Tom
10    Jen
11    Car
12    Car
13    Tom
14    Jon
15    Jon
16    Car
17    Jon
Name: Name, dtype: object
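Splitting on "." works when every comment ends with the owner's abbreviation and a full stop. A slightly more defensive sketch, assuming the same layout, anchors a regular expression to the end of the comment and returns `None` when nothing matches. The `owner_name` helper and the `OWNERS` dictionary are illustrative; the dictionary simply restates the owner lookup table above.

```python
import re

# Owner lookup table from above, restated as a dictionary.
OWNERS = {"Tom": "Tom", "Jen": "Jenny", "Jon": "Jonathan", "Car": "Carl"}

# Last word of the comment, immediately followed by the closing full stop.
OWNER_AT_END = re.compile(r"(\w+)\.\s*$")

def owner_name(comment):
    """Return the owner's full name, or None if no known abbreviation is found."""
    match = OWNER_AT_END.search(comment)
    if match is None:
        return None
    return OWNERS.get(match.group(1).capitalize())

print(owner_name("NLS/Op-Bu] Build kickoff but long project. jen."))
```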

### 'Days Noted' some fields have comments that say how many days tasks might take
- This field should note the number of days mentioned if said in the comment otherwise leave as a null

In [407]:
project_schedule = (project_schedule
                        .merge(owner_code, how="left", left_on="Name", right_on="Abbreviation")
                        .drop(["Name_x"], axis=1)
                        .rename(columns={"Name_y": "Name"}))
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task,Project Code,Project,Sub-Project Code,Sub-Project,Task Code,Task,Abbreviation,Name
0,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc],NLS,New Loyalty Scheme,Op,Operations,Sc,Scope,Jen,Jenny
1,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu],NLS,New Loyalty Scheme,Op,Operations,Bu,Build,Jen,Jenny
2,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De],NLS,New Loyalty Scheme,Op,Operations,De,Deliver,Jen,Jenny
3,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu],NTI,New Trolley Inventory,Mar,Marketing,Bu,Build,Tom,Tom
4,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De],NTI,New Trolley Inventory,Mar,Marketing,De,Deliver,Tom,Tom
5,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc],NLS,New Loyalty Scheme,Mar,Marketing,Sc,Scope,Tom,Tom
6,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De],NLS,New Loyalty Scheme,Mar,Marketing,De,Deliver,Tom,Tom
7,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu],NTI,New Trolley Inventory,Op,Operations,Bu,Build,Jen,Jenny
8,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De],NTI,New Trolley Inventory,Op,Operations,De,Deliver,Jen,Jenny
9,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu],NLS,New Loyalty Scheme,Mar,Marketing,Bu,Build,Tom,Tom


In [408]:
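import re
# Collect every run of digits in each comment, then keep the first one (NaN when the comment has no number)
days = project_schedule["Commentary"].map(lambda x: re.findall('[0-9]+', x))
days = days.str[0]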
days

0     NaN
1     NaN
2      10
3       5
4       8
5     NaN
6       8
7       2
8       8
9     NaN
10      4
11    NaN
12      3
13    NaN
14    NaN
15    NaN
16      5
17    NaN
Name: Commentary, dtype: object
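Taking the first number in a comment works for this data, but it would pick up any unrelated number that happened to appear earlier in the text. A stricter sketch, assuming the day count is always written as a number followed by "day" or "days", is shown below. The `days_noted` helper is an illustrative name, and the "Phase 2" comment is a made-up example of the edge case.

```python
import re
from typing import Optional

# A number directly followed by "day" or "days".
DAYS_PATTERN = re.compile(r"(\d+)\s*days?\b", flags=re.IGNORECASE)

def days_noted(comment: str) -> Optional[int]:
    """Return the number of days mentioned in a comment, or None if absent."""
    match = DAYS_PATTERN.search(comment)
    return int(match.group(1)) if match else None

print(days_noted("Delivery next week around 8 days. tom."))
print(days_noted("Scope completed. tom."))
# Made-up comment: the first number is not a day count.
print(days_noted("Phase 2 needs 5 days. jen."))
```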

In [409]:
project_schedule["Days Needed"] = days
project_schedule

Unnamed: 0,Week,Commentary,project_sub_task,Project Code,Project,Sub-Project Code,Sub-Project,Task Code,Task,Abbreviation,Name,Days Needed
0,Week 16,NLS/Op-Sc] Delivered scope for the project. Re...,NLS/Op-Sc],NLS,New Loyalty Scheme,Op,Operations,Sc,Scope,Jen,Jenny,
1,Week 17,NLS/Op-Bu] Build kickoff but long project. jen.,NLS/Op-Bu],NLS,New Loyalty Scheme,Op,Operations,Bu,Build,Jen,Jenny,
2,Week 18,NLS/Op-De] Long delivery process has begun at ...,NLS/Op-De],NLS,New Loyalty Scheme,Op,Operations,De,Deliver,Jen,Jenny,10.0
3,Week 19,NTI/Mar-Bu] Project build commences. Will be c...,NTI/Mar-Bu],NTI,New Trolley Inventory,Mar,Marketing,Bu,Build,Tom,Tom,5.0
4,Week 20,NTI/Mar-De] Delivery next week around 8 days. ...,NTI/Mar-De],NTI,New Trolley Inventory,Mar,Marketing,De,Deliver,Tom,Tom,8.0
5,Week 17,NLS/Mar-Sc] Scope completed. tom.,NLS/Mar-Sc],NLS,New Loyalty Scheme,Mar,Marketing,Sc,Scope,Tom,Tom,
6,Week 18,NLS/Mar-De] Similar to the operations team. 8 ...,NLS/Mar-De],NLS,New Loyalty Scheme,Mar,Marketing,De,Deliver,Tom,Tom,8.0
7,Week 19,NTI/Op-Bu] Longer build than the easy marketin...,NTI/Op-Bu],NTI,New Trolley Inventory,Op,Operations,Bu,Build,Jen,Jenny,2.0
8,Week 20,NTI/Ops-De] Delivery also next week. Same as t...,NTI/Ops-De],NTI,New Trolley Inventory,Op,Operations,De,Deliver,Jen,Jenny,8.0
9,Week 17,NLS/Mar-Bu] Marketing Build complete. tom.,NLS/Mar-Bu],NLS,New Loyalty Scheme,Mar,Marketing,Bu,Build,Tom,Tom,


### 'Detail' the description from the system output with the project details in the [ ] 

In [419]:
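# Keep everything after the first "]" and trim the leading space left behind by the split
project_schedule["Detail"] = project_schedule["Commentary"].map(lambda x: x.split("]", 1)[1].strip())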
project_schedule = project_schedule.drop(["Commentary", "project_sub_task", "Project Code", "Sub-Project Code",
                               "Task Code", "Abbreviation"], axis=1).sort_values(by=["Project", "Sub-Project"])
project_schedule

Unnamed: 0,Week,Project,Sub-Project,Task,Name,Days Needed,Detail
5,Week 17,New Loyalty Scheme,Marketing,Scope,Tom,,Scope completed. tom.
6,Week 18,New Loyalty Scheme,Marketing,Deliver,Tom,8.0,Similar to the operations team. 8 days effort...
9,Week 17,New Loyalty Scheme,Marketing,Build,Tom,,Marketing Build complete. tom.
0,Week 16,New Loyalty Scheme,Operations,Scope,Jenny,,Delivered scope for the project. Resourcing f...
1,Week 17,New Loyalty Scheme,Operations,Build,Jenny,,Build kickoff but long project. jen.
2,Week 18,New Loyalty Scheme,Operations,Deliver,Jenny,10.0,Long delivery process has begun at least 10 d...
3,Week 19,New Trolley Inventory,Marketing,Build,Tom,5.0,Project build commences. Will be completed in...
4,Week 20,New Trolley Inventory,Marketing,Deliver,Tom,8.0,Delivery next week around 8 days. tom.
13,Week 18,New Trolley Inventory,Marketing,Scope,Tom,,Need to balance resourcing carefully with two...
7,Week 19,New Trolley Inventory,Operations,Build,Jenny,2.0,Longer build than the easy marketing project ...


In [421]:
# index=False keeps the file to the 7 required fields (no extra unnamed index column)
project_schedule.to_csv("./output/Week19_output.csv", index=False)