## 2021: Week 35 - Picture Perfect

Data preppin' ideas really are all around! I was hanging some pictures the other week and I had some weird and wonderful sizes, so working out the perfect frames was a bit tiresome. If only I could use a data preppin' tool to speed up the process... 

### Input
We have 2 inputs this week:

1. Picture sizes 
![img](https://lh3.googleusercontent.com/-G8_8TINw7TY/YS591OZq5aI/AAAAAAAAA6o/n2-yPvhv2jUs30JKS4f2IZZ0NgHUmiarACLcBGAsYHQ/image.png)

2. Frame sizes
![img](https://lh3.googleusercontent.com/-AvDNsXSd9Bk/YS9tMox-bbI/AAAAAAAAA7I/9AE5n6GB91QV5xJuafIV6-NBdq67hymegCLcBGAsYHQ/image.png)

### Requirement
- Input the data
- Split up the sizes of the pictures and the frames into lengths and widths
    - Remember an inch is 2.54cm
- Frames can always be rotated, so make sure you know which is the min/max side
- See which pictures fit into which frames
- Work out the area of the frame vs the area of the picture and choose the frame with the smallest excess
- Output the data
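
The rotation and excess-area requirements above can be sketched in a few lines of plain Python. This is only an illustration: the helper names `sorted_sides`, `fits` and `excess_area` are made up for this sketch, and it assumes pictures are always hung square-on to the frame (never diagonally).

```python
def sorted_sides(length, width):
    """Return (min_side, max_side) so orientation no longer matters."""
    return (min(length, width), max(length, width))


def fits(picture, frame):
    """True if a picture (length, width) fits a frame (length, width), allowing rotation."""
    pic_min, pic_max = sorted_sides(*picture)
    frame_min, frame_max = sorted_sides(*frame)
    return pic_min <= frame_min and pic_max <= frame_max


def excess_area(picture, frame):
    """Frame area minus picture area, in the squared unit of the inputs."""
    return frame[0] * frame[1] - picture[0] * picture[1]


# Picture E (22cm x 19cm) against the 20cm x 25cm frame, both taken from the inputs
print(fits((22, 19), (20, 25)))         # True
print(excess_area((22, 19), (20, 25)))  # 82
```

Comparing the sorted sides is what makes the rotation rule work: once both shapes are expressed as (min, max), a picture fits exactly when each of its sides is no longer than the matching frame side.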

### Output
![img](https://lh3.googleusercontent.com/-ccVraoOGt-I/YS6L0ikCBLI/AAAAAAAAA64/6NunkLPc7ywjecilgrlCmf_TFhzREMtWgCLcBGAsYHQ/image.png)

- 4 fields
    - Picture
    - Frame
    - Max Side
    - Min Side
- 14 rows (15 including headers)

In [339]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Input the data

In [340]:
data = pd.read_excel("./data/Pictures Input.xlsx", sheet_name=[0, 1])

In [341]:
pictures = data[0].copy()
frames = data[1].copy()

### Split up the sizes of the pictures and the frames into lengths and widths
- Remember an inch is 2.54cm
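
Before the notebook's own approach, here is one possible way to parse all three size formats seen in the inputs (`26cm x 23cm`, `24cm2` and `8" x 10"`) with a single function. The name `parse_size` is hypothetical, and reading `24cm2` as a 24cm x 24cm square is an assumption that matches how the cells below treat it.

```python
import re

CM_PER_INCH = 2.54  # conversion factor from the requirement


def parse_size(size):
    """Parse '26cm x 23cm', '24cm2' or '8" x 10"' into (length_cm, width_cm)."""
    # A double quote marks inches; everything else is already in centimetres
    factor = CM_PER_INCH if '"' in size else 1.0
    # Drop the trailing "2" of "cm2" so it is not mistaken for a measurement
    cleaned = size.replace("cm2", "cm")
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", cleaned)]
    if len(numbers) == 1:
        # Assumption: a single number describes a square
        numbers = numbers * 2
    length, width = numbers
    return (round(length * factor, 2), round(width * factor, 2))


print(parse_size("26cm x 23cm"))  # (26.0, 23.0)
print(parse_size("24cm2"))        # (24.0, 24.0)
print(parse_size('8" x 10"'))     # (20.32, 25.4)
```

Something like `pictures["Size"].map(parse_size)` would then give one tuple per row, which avoids slicing out the inch rows by position.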

In [342]:
pictures

| | Picture | Size |
|---|---|---|
| 0 | A | 26cm x 23cm |
| 1 | B | 30cm x 26cm |
| 2 | C | 24cm2 |
| 3 | D | 25cm x 23cm |
| 4 | E | 22cm x 19cm |
| 5 | F | 28cm x 20cm |
| 6 | G | 33cm x 23cm |
| 7 | H | 23cm x 21cm |
| 8 | I | 36cm x 25cm |
| 9 | J | 26cm x 20cm |


In [343]:
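# Split "26cm x 23cm" into two columns. A square size such as "24cm2" has no "x",
# so its single value is forward-filled into the width column instead of being hardcoded.
leng_wid = pictures["Size"].str.split("x", expand=True).ffill(axis=1).rename(columns={0: "lengths", 1: "widths"})
# Keep only the digits in front of "cm"
leng_wid["lengths"] = leng_wid["lengths"].map(lambda x: x.split("c")[0].strip())
leng_wid["widths"] = leng_wid["widths"].map(lambda x: x.split("c")[0].strip())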

In [344]:
pictures = pd.concat([pictures, leng_wid], axis=1)
pictures

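| | Picture | Size | lengths | widths |
|---|---|---|---|---|
| 0 | A | 26cm x 23cm | 26 | 23 |
| 1 | B | 30cm x 26cm | 30 | 26 |
| 2 | C | 24cm2 | 24 | 24 |
| 3 | D | 25cm x 23cm | 25 | 23 |
| 4 | E | 22cm x 19cm | 22 | 19 |
| 5 | F | 28cm x 20cm | 28 | 20 |
| 6 | G | 33cm x 23cm | 33 | 23 |
| 7 | H | 23cm x 21cm | 23 | 21 |
| 8 | I | 36cm x 25cm | 36 | 25 |
| 9 | J | 26cm x 20cm | 26 | 20 |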


In [345]:
# Same split as for the pictures; sizes without an "x" (e.g. "30cm2") are squares
leng_wid = frames["Size"].str.split("x", expand=True).ffill(axis=1).rename(columns={0: "lengths", 1: "widths"})
# The first three frames are in inches. Take an explicit copy so the assignments
# below do not raise SettingWithCopyWarning.
inches = leng_wid.iloc[:3].copy()
inches["lengths"] = inches["lengths"].map(lambda x: x.replace("\"", ""))
inches["widths"] = inches["widths"].map(lambda x: x.replace("\"", ""))
inches = inches.astype(int) * 2.54  # an inch is 2.54cm


In [346]:
leng_wid["lengths"] = leng_wid["lengths"].map(lambda x: x.split("c")[0])
leng_wid["widths"] = leng_wid["widths"].map(lambda x: x.split("c")[0])

In [347]:
leng_wid.iloc[:3] = inches

In [348]:
leng_wid = leng_wid.astype(float)

In [349]:
frames = pd.concat([frames, leng_wid], axis=1)
frames

| | Size | lengths | widths |
|---|---|---|---|
| 0 | 8" x 10" | 20.32 | 25.4 |
| 1 | 6" x 4" | 15.24 | 10.16 |
| 2 | 8" x 6" | 20.32 | 15.24 |
| 3 | 30cm x 21cm | 30.0 | 21.0 |
| 4 | 31cm x 25cm | 31.0 | 25.0 |
| 5 | 30cm2 | 30.0 | 30.0 |
| 6 | 25cm2 | 25.0 | 25.0 |
| 7 | 20cm x 25cm | 20.0 | 25.0 |
| 8 | 28cm x 36cm | 28.0 | 36.0 |


In [350]:
pictures["lengths"] = pictures["lengths"].astype(int)
pictures["widths"] = pictures["widths"].astype(int)

In [351]:
frames["Area"] = frames["lengths"] * frames["widths"]
pictures["Area"] = pictures["lengths"] * pictures["widths"]

In [352]:
def max_slide(lengths_, widths_):
    # Longer of the two sides
    return max(lengths_, widths_)

In [353]:
def min_slide(lengths_, widths_):
    # Shorter of the two sides
    return min(lengths_, widths_)

In [354]:
pictures["Max Slide"] = pictures.apply(lambda x: max_slide(x["lengths"], x["widths"]), axis=1)
pictures["Min Slide"] = pictures.apply(lambda x: min_slide(x["lengths"], x["widths"]), axis=1)

In [355]:
frames["Max Slide"] = frames.apply(lambda x: max_slide(x["lengths"], x["widths"]), axis=1)
frames["Min Slide"] = frames.apply(lambda x: min_slide(x["lengths"], x["widths"]), axis=1)
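
As a side note, the same min/max columns can be computed without a row-wise `apply`. The sketch below uses a small stand-in table (`demo`, a hypothetical name) built from three frame sizes in the input, and keeps the column names used in the cells above.

```python
import pandas as pd

# Hypothetical stand-in for the frames table, using three sizes from the input
demo = pd.DataFrame({"lengths": [20.32, 30.0, 28.0], "widths": [25.4, 21.0, 36.0]})

sides = demo[["lengths", "widths"]]
demo["Max Slide"] = sides.max(axis=1)  # longer side of each row
demo["Min Slide"] = sides.min(axis=1)  # shorter side of each row

print(demo)
```

`DataFrame.max(axis=1)` and `DataFrame.min(axis=1)` work across the columns of each row, so no helper functions are needed for this step.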

In [356]:
def check_excess():
    # Map each frame's index to its shorter side
    frames_min = frames["Min Slide"].tolist()
    frames_min_idx = frames["Min Slide"].index.tolist()
    frames_dict = dict(zip(frames_min_idx, frames_min))
    return frames_dict

In [357]:
frames_min = frames["Min Slide"].tolist()
frames_min_idx = frames["Min Slide"].index.tolist()

frames_max = frames["Max Slide"].tolist()
frames_max_idx = frames["Max Slide"].index.tolist()

In [358]:
def check_min_smallest_excess(x):
    # Smallest frame min side that is at least x (1000 means "no frame found")
    result = 1000
    for i in frames_min:
        if x <= i <= result:
            result = i
    return result

In [359]:
def check_max_smallest_excess(x):
    # Smallest frame max side that is at least x (1000 means "no frame found")
    result = 1000
    for i in frames_max:
        if x <= i <= result:
            result = i
    return result
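
Both helpers above look for the smallest frame side that is at least as long as the picture side. A compact way to express that idea is `min` with a `default`, sketched below. The function name `smallest_at_least` is made up for this example, and it assumes no real frame side exceeds the 1000 sentinel.

```python
def smallest_at_least(x, candidates, default=1000):
    """Return the smallest candidate >= x, or `default` if none qualifies."""
    return min((c for c in candidates if c >= x), default=default)


# Frame max sides, read off the frames table above
frame_max_sides = [25.4, 15.24, 20.32, 30.0, 31.0, 30.0, 25.0, 25.0, 36.0]

print(smallest_at_least(26, frame_max_sides))  # 30.0
print(smallest_at_least(40, frame_max_sides))  # 1000
```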

In [360]:
pictures["Max Slide_compare"] = pictures["Max Slide"].map(lambda x: check_max_smallest_excess(x))
pictures["Min Slide_compare"] = pictures["Min Slide"].map(lambda x: check_min_smallest_excess(x))

In [361]:
pictures = pictures.drop(["lengths", "widths", "Size"], axis=1)
frames = frames.drop(["lengths", "widths"], axis=1)

In [362]:
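def key_for_merge(pic_max, pic_min):
    # Index of a frame that fits the picture and is no larger on either side than
    # the best frame found so far (None if nothing fits).
    # Parameters are named pic_* so they do not shadow the max_slide/min_slide functions.
    idx_ = None
    max_slide_frame = 100
    min_slide_frame = 100
    for i, max_value, min_value in zip(frames_max_idx, frames_max, frames_min):
        if pic_max <= max_value and pic_min <= min_value:
            if max_value <= max_slide_frame and min_value <= min_slide_frame:
                idx_ = i
                max_slide_frame = max_value
                min_slide_frame = min_value
    return idx_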

In [363]:
frames = frames.reset_index()

In [364]:
pictures["Merge_idx"] = pictures.apply(lambda x: key_for_merge(x["Max Slide_compare"], x["Min Slide_compare"]), axis=1)

In [365]:
pictures = pictures.merge(frames, how="left", left_on="Merge_idx", right_on="index")
pictures = pictures.drop(["Area_x", "Max Slide_compare", "Min Slide_compare",
                          "Merge_idx", "index", "Area_y", "Max Slide_y", "Min Slide_y"], axis=1)
pictures.columns = ["Picture", "Max Side", "Min Side", "Frame"]
pictures = pictures.loc[:, ["Picture", "Frame", "Max Side", "Min Side"]]
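
For comparison, the "smallest excess area" requirement can also be tackled with a cross join. This is not what the cells above do; it is an alternative sketch on hypothetical miniature tables (`pics`, `frms`) whose sides are assumed to be already parsed to centimetres and sorted into min/max.

```python
import pandas as pd

# Hypothetical miniature inputs, already parsed to centimetres
pics = pd.DataFrame({"Picture": ["E", "H"], "p_min": [19, 21], "p_max": [22, 23]})
frms = pd.DataFrame({
    "Frame": ["20cm x 25cm", "25cm2", "30cm x 21cm"],
    "f_min": [20.0, 25.0, 21.0],
    "f_max": [25.0, 25.0, 30.0],
})

# Pair every picture with every frame (how="cross" needs pandas >= 1.2)
pairs = pics.merge(frms, how="cross")

# Keep only the pairs where the picture fits, allowing for rotation
fit_mask = (pairs["p_min"] <= pairs["f_min"]) & (pairs["p_max"] <= pairs["f_max"])
pairs = pairs[fit_mask].copy()

# Excess = frame area minus picture area; keep the smallest excess per picture
pairs["excess"] = pairs["f_min"] * pairs["f_max"] - pairs["p_min"] * pairs["p_max"]
best_rows = pairs.groupby("Picture")["excess"].idxmin()
best = pairs.loc[best_rows, ["Picture", "Frame", "f_max", "f_min"]]

print(best)
```

With these toy inputs, picture E ends up in the 20cm x 25cm frame and picture H in the 25cm square one. The trade-off is that a cross join builds every picture/frame pair, which is fine for a handful of frames but grows quickly with larger inputs.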

In [367]:
# index=False keeps the output to the 4 required fields
pictures.to_csv("./output/Week35_output.csv", index=False)