## 2021: Week 30 - Lift Your Spirits

Inspiration for Preppin' challenges can come from anywhere. I've recently moved into a block of flats and, let me tell you, I spend a lot of time waiting for a lift (or elevator, if you're across the pond). It got me pondering whether the lift is operating optimally. Is it better for the lift to stay on the floor where it drops off its passengers until the next call, or should it return to the most common starting floor?

### Input
There is one input this week, detailing the time of each trip the lift takes, the floor where the passengers enter the lift, and the floor where they leave it.

![img](https://1.bp.blogspot.com/-D_G-q9Ae5k8/YO3bDCzspiI/AAAAAAAAA2I/lYKXw8bKQ1g3Dg8o2h0i1rOTNtRudAnbwCLcBGAsYHQ/s0/2021W29%2BInput.png)

For simplicity, assume that the lift does not stop mid-journey to pick up new passengers, but completes its current trip before starting a new one.

### Requirements
- Input the data
- Create a TripID field based on the time of day
    - Assume all trips took place on 12th July 2021
- Calculate how many floors the lift has to travel between trips
    - The order of floors is B, G, 1, 2, 3, etc.
- Calculate which floor the majority of trips begin at - call this the Default Position
- If every trip began from the same floor, how many floors would the lift need to travel to begin each journey?
    - e.g. if the default position of the lift were floor 2 and the trip was starting from the 4th floor, this would be 2 floors that the lift would need to travel
- How does the average floors travelled between trips compare to the average travel from the default position?
- Output the data

### Output
![img](https://1.bp.blogspot.com/-SENWuTcDDMk/YO3d9WmSZ-I/AAAAAAAAA2U/23jgflhHGZMquBdFqRdTO4Fggrrys1hmgCLcBGAsYHQ/w400-h40/2021W29%2BOutput.png)

- 4 fields
    - Default Position
    - Avg travel from default position
    - Avg travel between trips currently
    - Difference
- 1 row (2 rows including headers)

In [284]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [285]:
# Input the data
df = pd.read_csv("./data/2021W30.csv")
df.shape

(1978, 4)

In [286]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1978 entries, 0 to 1977
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Hour    1978 non-null   int64 
 1   Minute  1978 non-null   int64 
 2   From    1978 non-null   object
 3   To      1978 non-null   object
dtypes: int64(2), object(2)
memory usage: 61.9+ KB


In [287]:
df.head()

Unnamed: 0,Hour,Minute,From,To
0,0,1,G,8
1,0,2,4,G
2,0,2,11,G
3,0,3,B,G
4,0,4,1,G


In [288]:
df.columns

Index(['Hour', 'Minute', 'From', 'To'], dtype='object')

### Create a TripID field based on the time of day

In [289]:
pd.Timestamp(year=2021, month=7, day=12, hour=df.loc[0, "Hour"], minute=df.loc[0, "Minute"])

Timestamp('2021-07-12 00:01:00')

In [290]:
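def create_tripID(hour, minute):
    # All trips are assumed to take place on 12th July 2021
    return pd.Timestamp(year=2021, month=7, day=12, hour=hour, minute=minute)

# Build one timestamp per row from its Hour and Minute values
df["TripID"] = df.apply(lambda x: create_tripID(x["Hour"], x["Minute"]), axis=1)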
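
The row-wise `apply` above is fine for about 2,000 rows. As a minimal sketch, the same timestamps can also be built in one vectorized step. The toy frame `trips` and the `TripSeq` column are hypothetical names used only for illustration. `TripSeq` rests on the assumption that the rows are already in chronological order, which matters because several trips in the input share the same minute, so the timestamp alone does not identify a trip uniquely.

```python
import pandas as pd

# Toy rows mirroring the first few Hour/Minute values of the input.
trips = pd.DataFrame({"Hour": [0, 0, 0, 0], "Minute": [1, 2, 2, 3]})

# Build the timestamp column in one vectorized step: a fixed date plus
# hour and minute offsets.
base_date = pd.Timestamp("2021-07-12")
trips["TripID"] = (
    base_date
    + pd.to_timedelta(trips["Hour"], unit="h")
    + pd.to_timedelta(trips["Minute"], unit="min")
)

# Assumption: rows are already in chronological order, so the row position
# can serve as a unique sequence number when two trips share a minute.
trips["TripSeq"] = range(1, len(trips) + 1)
```

Whether a timestamp or a sequence number is the better `TripID` depends on how the field is used later; this notebook only relies on row order.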

### Calculate how many floors the lift has to travel between trips

In [292]:
# Encode the floor labels numerically so they follow the order B, G, 1, 2, ...
# (B -> -1, G -> 0)
df["From"] = df["From"].str.replace("G", "0").str.replace("B", "-1")
df["To"] = df["To"].str.replace("G", "0").str.replace("B", "-1")

In [293]:
df["From"] = df["From"].astype(int)
df["To"] = df["To"].astype(int)
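
The chained `str.replace` calls work here because the only non-numeric labels are `G` and `B`. An explicit mapping makes the floor order easier to read. The helper `floor_to_int` and the `SPECIAL_FLOORS` dictionary below are hypothetical names, shown as a sketch on a toy series instead of the real input.

```python
import pandas as pd

# Hypothetical lookup: position of the lettered floors in the order
# B, G, 1, 2, 3, ... (B -> -1, G -> 0).
SPECIAL_FLOORS = {"B": -1, "G": 0}

def floor_to_int(label) -> int:
    """Convert a floor label such as 'B', 'G' or '11' to an integer."""
    label = str(label).strip()
    if label in SPECIAL_FLOORS:
        return SPECIAL_FLOORS[label]
    return int(label)

# Labels taken from the From column of the first five input rows.
floors = pd.Series(["G", "4", "11", "B", "1"])
floor_numbers = floors.map(floor_to_int)
```

An unexpected label (say `M` for a mezzanine) would raise a `ValueError` here instead of being passed through silently.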

In [294]:
# df = df.drop_duplicates(subset=["TripID"], keep="last")
# df = df.reset_index(drop=True)

In [295]:
# Floors travelled from the end of each trip to the start of the next one
differences = []
last_row = df.index[-1]
for i in range(len(df)):
    if i == last_row:
        # The final trip has no following trip, so record 0
        differences.append(0)
    else:
        diff = abs(df.loc[i, "To"] - df.loc[i + 1, "From"])
        differences.append(diff)
df["Difference"] = differences
df.head()

Unnamed: 0,Hour,Minute,From,To,TripID,Difference
0,0,1,0,8,2021-07-12 00:01:00,4
1,0,2,4,0,2021-07-12 00:02:00,11
2,0,2,11,0,2021-07-12 00:02:00,1
3,0,3,-1,0,2021-07-12 00:03:00,1
4,0,4,1,0,2021-07-12 00:04:00,10


### Calculate which floor the majority of trips begin at - call this the Default Position

In [297]:
df["From"].value_counts()  # floor 0 is the ground floor ("G")

 0     665
 8     120
 4     119
 1     119
 11    118
 2     115
 7     108
 9     108
 10    107
 6     107
 5     104
-1     100
 3      88
Name: From, dtype: int64

In [298]:
df.loc[:, "Default Position"] = "G"
df

Unnamed: 0,Hour,Minute,From,To,TripID,Difference,Default Position
0,0,1,0,8,2021-07-12 00:01:00,4,G
1,0,2,4,0,2021-07-12 00:02:00,11,G
2,0,2,11,0,2021-07-12 00:02:00,1,G
3,0,3,-1,0,2021-07-12 00:03:00,1,G
4,0,4,1,0,2021-07-12 00:04:00,10,G
...,...,...,...,...,...,...,...
1973,23,56,9,0,2021-07-12 23:56:00,0,G
1974,23,56,0,1,2021-07-12 23:56:00,1,G
1975,23,58,2,7,2021-07-12 23:58:00,3,G
1976,23,58,4,0,2021-07-12 23:58:00,0,G
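
The default position is typed in by hand above after reading the counts. As a sketch, it could also be derived from the data. The toy series `start_floors` and the `labels` lookup are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Toy start floors, numeric as in the notebook (B = -1, G = 0).
start_floors = pd.Series([0, 4, 11, -1, 1, 0, 0, 4])

# value_counts() counts trips per floor; idxmax() returns the floor with
# the highest count.
default_floor = start_floors.value_counts().idxmax()

# Map the numeric floor back to its label for the output table.
labels = {-1: "B", 0: "G"}
default_label = labels.get(default_floor, str(default_floor))
```

If two floors were tied for the highest count, `idxmax()` would return only one of them, so a tie would need a rule of its own.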


### If every trip began from the same floor, how many floors would the lift need to travel to begin each journey?
- e.g. if the default position of the lift were floor 2 and the trip was starting from the 4th floor, this would be 2 floors that the lift would need to travel

In [299]:
# Avg travel from default position
# The default position G is floor 0, so the distance to each start is |From - 0|
avg_travel_default = abs(df["From"]).mean()
avg_travel_default

3.74469160768453
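
Taking `abs(df["From"])` works only because the default position is floor 0. A sketch of the general case, using a hypothetical helper `travel_from_default` on toy data, covers the example from the requirements (default floor 2, trip starting on floor 4):

```python
import pandas as pd

# Toy start floors, numeric as in the notebook (B = -1, G = 0).
start_floors = pd.Series([4, 0, -1, 2, 7])

def travel_from_default(starts: pd.Series, default_floor: int) -> pd.Series:
    """Floors between the resting floor and each trip's starting floor."""
    return (starts - default_floor).abs()

# Example from the requirements: default floor 2, trip starting on floor 4.
from_floor_2 = travel_from_default(start_floors, default_floor=2)

# With the ground floor (0) as default this reduces to abs(From).
from_ground = travel_from_default(start_floors, default_floor=0)
```

The first element of `from_floor_2` is the 2 floors described in the requirement.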

In [300]:
# Avg travel between trips currently (the mean includes the 0 recorded for the final trip)
avg_travel_current = df["Difference"].mean()
avg_travel_current

4.361981799797776

In [301]:
df.tail()

Unnamed: 0,Hour,Minute,From,To,TripID,Difference,Default Position
1973,23,56,9,0,2021-07-12 23:56:00,0,G
1974,23,56,0,1,2021-07-12 23:56:00,1,G
1975,23,58,2,7,2021-07-12 23:58:00,3,G
1976,23,58,4,0,2021-07-12 23:58:00,0,G
1977,23,59,0,5,2021-07-12 23:59:00,0,G


### How does the average floors travelled between trips compare to the average travel from the default position?

In [302]:
# Difference between default and current
difference = avg_travel_default - avg_travel_current
difference

-0.6172901921132459
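
The negative sign means the lift would travel fewer floors on average if it returned to the default position. To put that in proportion, the sketch below expresses the difference as a share of the current average. The two means are copied from the outputs above, and `relative_change` is a hypothetical name.

```python
# Means copied from the notebook outputs above.
avg_travel_default = 3.74469160768453
avg_travel_current = 4.361981799797776

difference = avg_travel_default - avg_travel_current

# Express the difference as a fraction of the current average travel.
relative_change = difference / avg_travel_current
print(f"{difference:.2f} floors per trip ({relative_change:.1%})")
```

On these figures the saving is roughly 14% of the current travel between trips.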

In [303]:
# One-row summary; column names follow the Output section
result = pd.DataFrame({
    "Default Position": ["G"],
    "Avg travel from default position": [avg_travel_default],
    "Avg travel between trips currently": [avg_travel_current],
    "Difference": [difference],
})
result

Unnamed: 0,Default Position,Avg travel from default position,Avg travel between trips currently,Difference
0,G,3.744692,4.361982,-0.61729


### Output the data

In [304]:
# index=False keeps the output to the 4 required fields (no index column)
result.to_csv("./output/Week30_output.csv", index=False)
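
A quick way to check that a CSV has the shape the challenge asks for (4 fields, 1 row) is to write it and read it back. The sketch below does this with an in-memory buffer so that it does not touch the `./output` folder. The values are rounded copies of the notebook outputs, and `round_trip` is a hypothetical name.

```python
import io
import pandas as pd

# Values copied from the notebook output; the column names follow the
# Output section of the challenge.
result = pd.DataFrame({
    "Default Position": ["G"],
    "Avg travel from default position": [3.744692],
    "Avg travel between trips currently": [4.361982],
    "Difference": [-0.61729],
})

# Write to an in-memory buffer instead of a file, without the index column.
buffer = io.StringIO()
result.to_csv(buffer, index=False)

# Read it back and inspect the shape: 1 row, 4 fields.
buffer.seek(0)
round_trip = pd.read_csv(buffer)
```

Writing the same frame without `index=False` would add an unnamed index column and give 5 fields instead of 4.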