## 2021: Week 35 - Picture Perfect

Data preppin' ideas really are all around! I was hanging some pictures the other week and I had some weird and wonderful sizes, so working out the perfect frames was a bit tiresome. If only I could use a data preppin' tool to speed up the process... 

### Input
We have 2 inputs this week:

1. Picture sizes 
![img](https://lh3.googleusercontent.com/-G8_8TINw7TY/YS591OZq5aI/AAAAAAAAA6o/n2-yPvhv2jUs30JKS4f2IZZ0NgHUmiarACLcBGAsYHQ/image.png)

2. Frame sizes
![img](https://lh3.googleusercontent.com/-AvDNsXSd9Bk/YS9tMox-bbI/AAAAAAAAA7I/9AE5n6GB91QV5xJuafIV6-NBdq67hymegCLcBGAsYHQ/image.png)

### Requirement
- Input the data
- Split up the sizes of the pictures and the frames into lengths and widths
    - Remember an inch is 2.54cm
- Frames can always be rotated, so make sure you know which is the min/max side
- See which pictures fit into which frames
- Work out the area of the frame vs the area of the picture and choose the frame with the smallest excess
- Output the data

### Output
![img](https://lh3.googleusercontent.com/-ccVraoOGt-I/YS6L0ikCBLI/AAAAAAAAA64/6NunkLPc7ywjecilgrlCmf_TFhzREMtWgCLcBGAsYHQ/image.png)

- 4 fields
    - Picture
    - Frame
    - Max Side
    - Min Side
- 14 rows (15 including headers)

In [146]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Input the data

In [147]:
data = pd.read_excel("./data/Pictures Input.xlsx", sheet_name=[0, 1])

In [148]:
pictures = data[0].copy()
frames = data[1].copy()

### Split up the sizes of the pictures and the frames into lengths and widths
- Remember an inch is 2.54cm
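
The three notations in the inputs (`26cm x 23cm`, `24cm2` and `8" x 10"`) can also be handled by one small parser. The sketch below is only an illustration: `parse_size` and `INCH_TO_CM` are hypothetical names, and it assumes, as the cells in this notebook do, that a `cm2` suffix denotes a square with that side length.

```python
import re

INCH_TO_CM = 2.54  # 1 inch = 2.54 cm, as stated in the requirements


def parse_size(size):
    """Return (length_cm, width_cm) for '26cm x 23cm', '24cm2' or '8" x 10"'."""
    # A double quote marks a size given in inches
    factor = INCH_TO_CM if '"' in size else 1.0
    # Drop the square marker so the trailing "2" is not read as a number
    cleaned = size.replace("cm2", "cm")
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", cleaned)]
    if len(numbers) == 1:
        # Assumption: a single number means a square, so repeat the side
        numbers = numbers * 2
    length, width = numbers
    return length * factor, width * factor


print(parse_size("26cm x 23cm"))
print(parse_size("24cm2"))
print(parse_size('8" x 10"'))
```

Applied with `pictures["Size"].map(parse_size)`, this would give one tuple per row, without relying on the row positions of the inch-based frames.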

In [149]:
pictures

Unnamed: 0,Picture,Size
0,A,26cm x 23cm
1,B,30cm x 26cm
2,C,24cm2
3,D,25cm x 23cm
4,E,22cm x 19cm
5,F,28cm x 20cm
6,G,33cm x 23cm
7,H,23cm x 21cm
8,I,36cm x 25cm
9,J,26cm x 20cm


In [150]:
# Squares such as "24cm2" have no "x"; forward-fill the single value into the second column
leng_wid = pictures["Size"].str.split("x", expand=True).ffill(axis=1).rename(columns={0: "lengths", 1: "widths"})
# Keep only the number in front of "cm"
leng_wid["lengths"] = leng_wid["lengths"].map(lambda x: x.split("c")[0])
leng_wid["widths"] = leng_wid["widths"].map(lambda x: x.split("c")[0])

In [151]:
pictures = pd.concat([pictures, leng_wid], axis=1)
pictures

Unnamed: 0,Picture,Size,lengths,widths
0,A,26cm x 23cm,26,23
1,B,30cm x 26cm,30,26
2,C,24cm2,24,24
3,D,25cm x 23cm,25,23
4,E,22cm x 19cm,22,19
5,F,28cm x 20cm,28,20
6,G,33cm x 23cm,33,23
7,H,23cm x 21cm,23,21
8,I,36cm x 25cm,36,25
9,J,26cm x 20cm,26,20


In [152]:
# Squares such as "30cm2" have no "x"; forward-fill the single value into the second column
leng_wid = frames["Size"].str.split("x", expand=True).ffill(axis=1).rename(columns={0: "lengths", 1: "widths"})
# The first three frames are in inches; take a copy so the edits below don't write to a slice of leng_wid
inches = leng_wid.iloc[:3].copy()
inches["lengths"] = inches["lengths"].map(lambda x: x.replace("\"", ""))
inches["widths"] = inches["widths"].map(lambda x: x.replace("\"", ""))
# Convert inches to centimetres (1 inch = 2.54cm)
inches = inches.astype(int) * 2.54


In [153]:
leng_wid["lengths"] = leng_wid["lengths"].map(lambda x: x.split("c")[0])
leng_wid["widths"] = leng_wid["widths"].map(lambda x: x.split("c")[0])

In [154]:
leng_wid.iloc[:3] = inches

In [155]:
leng_wid = leng_wid.astype(float)

In [156]:
frames = pd.concat([frames, leng_wid], axis=1)
frames

Unnamed: 0,Size,lengths,widths
0,"8"" x 10""",20.32,25.4
1,"6"" x 4""",15.24,10.16
2,"8"" x 6""",20.32,15.24
3,30cm x 21cm,30.0,21.0
4,31cm x 25cm,31.0,25.0
5,30cm2,30.0,30.0
6,25cm2,25.0,25.0
7,20cm x 25cm,20.0,25.0
8,28cm x 36cm,28.0,36.0


In [157]:
pictures["lengths"] = pictures["lengths"].astype(int)
pictures["widths"] = pictures["widths"].astype(int)

In [158]:
frames["Area"] = frames["lengths"] * frames["widths"]
pictures["Area"] = pictures["lengths"] * pictures["widths"]

In [159]:
pictures

Unnamed: 0,Picture,Size,lengths,widths,Area
0,A,26cm x 23cm,26,23,598
1,B,30cm x 26cm,30,26,780
2,C,24cm2,24,24,576
3,D,25cm x 23cm,25,23,575
4,E,22cm x 19cm,22,19,418
5,F,28cm x 20cm,28,20,560
6,G,33cm x 23cm,33,23,759
7,H,23cm x 21cm,23,21,483
8,I,36cm x 25cm,36,25,900
9,J,26cm x 20cm,26,20,520


In [160]:
np.max([1, 2])

2

In [161]:
def max_slide(lengths_, widths_):
    # Longer of the two sides (the "Max Side" in the requirements)
    return np.max([lengths_, widths_])

In [162]:
def min_slide(lengths_, widths_):
    # Shorter of the two sides (the "Min Side" in the requirements)
    return np.min([lengths_, widths_])

In [163]:
pictures["Max Slide"] = pictures.apply(lambda x: max_slide(x["lengths"], x["widths"]), axis=1)
pictures["Min Slide"] = pictures.apply(lambda x: min_slide(x["lengths"], x["widths"]), axis=1)

In [164]:
frames["Max Slide"] = frames.apply(lambda x: max_slide(x["lengths"], x["widths"]), axis=1)
frames["Min Slide"] = frames.apply(lambda x: min_slide(x["lengths"], x["widths"]), axis=1)
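
The same two columns can be computed without a row-wise `apply`. This is a minimal sketch on a small stand-in frame (`demo` is a hypothetical name; the values are taken from the tables above), using `DataFrame.max` and `DataFrame.min` across columns.

```python
import pandas as pd

# Stand-in for a table that already has numeric "lengths" and "widths" columns
demo = pd.DataFrame({"lengths": [26, 24, 20.32], "widths": [23, 24, 25.4]})

# Select the two side columns once, then reduce across them row by row
sides = demo[["lengths", "widths"]]
demo["Max Slide"] = sides.max(axis=1)
demo["Min Slide"] = sides.min(axis=1)

print(demo)
```

Because a frame can be rotated, only the longer and shorter sides matter from here on, not which one was originally called the length.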

In [165]:
def check_excess():
    # Map each frame's index to its shorter side
    frames_min = frames["Min Slide"].tolist()
    frames_min_idx = frames["Min Slide"].index.tolist()
    frames_dict = dict(zip(frames_min_idx, frames_min))
    return frames_dict

In [166]:
frames_min = frames["Min Slide"].tolist()
frames_min_idx = frames["Min Slide"].index.tolist()

frames_max = frames["Max Slide"].tolist()
frames_max_idx = frames["Max Slide"].index.tolist()

In [167]:
def check_min_smallest_excess(x):
    # Smallest frame min side that is still >= x; 1000 is a sentinel meaning "no frame found"
    result = 1000
    for i in frames_min:
        if x <= i <= result:
            result = i
    return result

In [168]:
def check_max_smallest_excess(x):
    # Smallest frame max side that is still >= x; 1000 is a sentinel meaning "no frame found"
    result = 1000
    for i in frames_max:
        if x <= i <= result:
            result = i
    return result

In [169]:
pictures["Max Slide_compare"] = pictures["Max Slide"].map(lambda x: check_max_smallest_excess(x))
pictures["Min Slide_compare"] = pictures["Min Slide"].map(lambda x: check_min_smallest_excess(x))

In [170]:
pictures.merge(frames, how="left", left_on=["Min Slide_compare", "Max Slide_compare"],
               right_on=["Min Slide", "Max Slide"])

Unnamed: 0,Picture,Size_x,lengths_x,widths_x,Area_x,Max Slide_x,Min Slide_x,Max Slide_compare,Min Slide_compare,Size_y,lengths_y,widths_y,Area_y,Max Slide_y,Min Slide_y
0,A,26cm x 23cm,26,23,598,26,23,30.0,25.0,,,,,,
1,B,30cm x 26cm,30,26,780,30,26,30.0,28.0,,,,,,
2,C,24cm2,24,24,576,24,24,25.0,25.0,25cm2,25.0,25.0,625.0,25.0,25.0
3,D,25cm x 23cm,25,23,575,25,23,25.0,25.0,25cm2,25.0,25.0,625.0,25.0,25.0
4,E,22cm x 19cm,22,19,418,22,19,25.0,20.0,20cm x 25cm,20.0,25.0,500.0,25.0,20.0
5,F,28cm x 20cm,28,20,560,28,20,30.0,20.0,,,,,,
6,G,33cm x 23cm,33,23,759,33,23,36.0,25.0,,,,,,
7,H,23cm x 21cm,23,21,483,23,21,25.0,21.0,,,,,,
8,I,36cm x 25cm,36,25,900,36,25,36.0,25.0,,,,,,
9,J,26cm x 20cm,26,20,520,26,20,30.0,20.0,,,,,,
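
Most rows come back empty because the max side and the min side were each matched against the frames on their own, so the pair can point at a frame that doesn't exist (picture A gets 30 and 25, and there is no 30 x 25 frame). One possible alternative, sketched here on a few rows copied from the tables above, is to pair every picture with every frame, keep the pairs that fit, and pick the smallest area excess per picture. The names `pictures_demo`, `frames_demo`, `Frame` and `Excess` are illustrative, and `how="cross"` assumes pandas 1.2 or later.

```python
import pandas as pd

pictures_demo = pd.DataFrame({
    "Picture": ["A", "C", "E"],
    "Max Slide": [26, 24, 22],
    "Min Slide": [23, 24, 19],
})
frames_demo = pd.DataFrame({
    "Frame": ["30cm x 21cm", "31cm x 25cm", "25cm2", "20cm x 25cm"],
    "Max Slide": [30.0, 31.0, 25.0, 25.0],
    "Min Slide": [21.0, 25.0, 25.0, 20.0],
})

# Every picture paired with every frame
pairs = pictures_demo.merge(frames_demo, how="cross", suffixes=("_pic", "_frame"))

# A picture fits when both of its sides fit the same frame
fits = pairs[
    (pairs["Max Slide_pic"] <= pairs["Max Slide_frame"])
    & (pairs["Min Slide_pic"] <= pairs["Min Slide_frame"])
].copy()

# Area of the frame minus area of the picture
fits["Excess"] = (
    fits["Max Slide_frame"] * fits["Min Slide_frame"]
    - fits["Max Slide_pic"] * fits["Min Slide_pic"]
)

# Keep the row with the smallest excess for each picture
best = fits.loc[
    fits.groupby("Picture")["Excess"].idxmin(),
    ["Picture", "Frame", "Max Slide_pic", "Min Slide_pic"],
]
print(best)
```

On these sample rows, A lands in the 31cm x 25cm frame, C in 25cm2 and E in 20cm x 25cm. Pictures with no fitting frame simply drop out of `fits`, so they would need separate handling.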


In [171]:
pictures = pictures.drop(["lengths", "widths", "Size"], axis=1)
frames = frames.drop(["lengths", "widths"], axis=1)

In [180]:
pd.merge_asof(pictures[["Max Slide_compare", "Min Slide_compare"]], 
              frames[["Max Slide", "Min Slide"]], left_on=["Max Slide_compare", "Min Slide_compare"],
              right_on=["Max Slide", "Min Slide"], direction="nearest")

MergeError: can only asof on a key for left
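
The error comes from passing two columns as the asof key: `pd.merge_asof` accepts a single ordered key in `left_on`/`right_on` (extra exact-match columns go in `by`). A minimal sketch with one key, on sample values from the tables above; `left`, `right` and `Frame Max` are illustrative names, both sides are sorted on the key, and both keys are floats because the key dtypes have to match.

```python
import pandas as pd

left = pd.DataFrame({"Picture": ["E", "C", "A"], "Max Slide": [22.0, 24.0, 26.0]})
right = pd.DataFrame({
    "Frame": ["25cm2", "30cm x 21cm", "31cm x 25cm"],
    "Frame Max": [25.0, 30.0, 31.0],
})

# direction="forward" picks the first frame whose max side is >= the picture's
matched = pd.merge_asof(
    left.sort_values("Max Slide"),
    right.sort_values("Frame Max"),
    left_on="Max Slide",
    right_on="Frame Max",
    direction="forward",
)
print(matched)
```

This only looks at one side, so it cannot guarantee that the other side fits as well: picture A (26 x 23) is matched to the 30cm x 21cm frame here, which is too narrow.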

In [173]:
frames

Unnamed: 0,Size,Area,Max Slide,Min Slide
0,"8"" x 10""",516.128,25.4,20.32
1,"6"" x 4""",154.8384,15.24,10.16
2,"8"" x 6""",309.6768,20.32,15.24
3,30cm x 21cm,630.0,30.0,21.0
4,31cm x 25cm,775.0,31.0,25.0
5,30cm2,900.0,30.0,30.0
6,25cm2,625.0,25.0,25.0
7,20cm x 25cm,500.0,25.0,20.0
8,28cm x 36cm,1008.0,36.0,28.0
