## 2021: Week 22 - Answer Smash

Recently, my family and I have become quite invested in the TV quiz show Richard Osman's House of Games. The final round is always Answer Smash. In this round you are shown a picture and a question, and you have to "smash" the name of whatever is pictured together with the answer to the question, as in the example below.

![img](https://lh3.googleusercontent.com/-x16G24LFxik/YJzZMfUFdVI/AAAAAAAAAzQ/oDI1lOo9RA4ut-jVWFs7CfbS_z_-tSg4wCLcBGAsYHQ/w400-h244/image.png)

I thought it would be fun to work backwards and have a list of answer smashes from which we have to extract the Preppin' participant and the answer to the question!

### Input
1. Answer Smash list
![img](https://lh3.googleusercontent.com/-KPicq8d9jII/YJzbkRtRfOI/AAAAAAAAAzs/vZIaiRcWhhw1Lt8StK1zMc1cQ1IWbZvggCLcBGAsYHQ/image.png)

2. Names
![img](https://lh3.googleusercontent.com/-VZX54rgmlfM/YJzbPjM8y7I/AAAAAAAAAzc/LrY4ELLbV8MBJwGlhisTzsAQLEzpc9PwQCLcBGAsYHQ/image.png)

3. Questions
![img](https://lh3.googleusercontent.com/--6TUf8nijQE/YJzbZpziTsI/AAAAAAAAAzk/49dxzz-tQVE5KvaVY1C1wiZ18ogqHI3bQCLcBGAsYHQ/image.png)

4. Categories
![img](https://lh3.googleusercontent.com/-MfV11JywIiQ/YJzbx-Y8knI/AAAAAAAAAz0/V6VOAAY7IYs0JeV-TOsJMRtVgTcYcy75gCLcBGAsYHQ/image.png)

### Requirement
- Input the data
- The category dataset requires some cleaning so that Category and Answer are 2 separate fields
- Join the datasets together, making sure to keep an eye on row counts
- Filter the data so that each answer smash is matched with the corresponding name and answer
- Remove unnecessary columns
- Output the data
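
The join-then-filter steps above can be sketched on toy data. This is a minimal illustration rather than the solution used in the cells that follow: the three small frames are hypothetical stand-ins for the real sheets, `is_match` is a helper name invented here, and `how="cross"` assumes pandas 1.2 or later.

```python
import pandas as pd

# Toy stand-ins for the real sheets (hypothetical rows, not the full challenge data).
smashes = pd.DataFrame({"Q No": [1, 2],
                        "Answer Smash": ["Donna Coleslaw", "Owen Barnes & Noble"]})
names = pd.DataFrame({"Name": ["Owen Barnes", "Donna Coles"]})
answers = pd.DataFrame({"Answer": ["Coleslaw", "Barnes & Noble", "Brain"]})

# Cross join: pair every smash with every name and every answer.
candidates = smashes.merge(names, how="cross").merge(answers, how="cross")

# Keep an eye on row counts: a cross join multiplies them.
assert len(candidates) == len(smashes) * len(names) * len(answers)

def is_match(row):
    """A smash starts with the name and ends with the answer (case-insensitive)."""
    smash = row["Answer Smash"].lower()
    return smash.startswith(row["Name"].lower()) and smash.endswith(row["Answer"].lower())

# Filter down to the one valid (name, answer) pair per smash.
matched = candidates[candidates.apply(is_match, axis=1)].reset_index(drop=True)
```

Because the filter compares the start and end of each smash, it does not depend on the names and smashes being listed in the same order.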

### Output
![img](https://lh3.googleusercontent.com/-e14km_tdnKs/YJzec7yMlsI/AAAAAAAAA0A/LrCd7oXn318CwVXWEj3S0W4aRyr65P5QwCLcBGAsYHQ/w640-h130/image.png)

- 5 fields
    - Q No
    - Name
    - Question
    - Answer
    - Answer Smash
- 20 rows (21 including headers)

In [208]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [209]:
### Input the data

In [210]:
data = pd.read_excel("./data/Answer Smash Input.xlsx", sheet_name=["Answer Smash", "Names",
                                                                   "Questions", "Category"])

In [211]:
answers = data["Answer Smash"].copy()
names = data["Names"].copy()
questions = data["Questions"].copy()
categories = data["Category"].copy()

In [212]:
### The category dataset requires some cleaning so that Category and Answer are 2 separate fields

In [213]:
categories["Category"] = categories["Category: Answer"].map(lambda x: x.split(":")[0].strip())
categories["Answer"] = categories["Category: Answer"].map(lambda x: x.split(":")[1].strip())
categories = categories.drop("Category: Answer", axis=1)
categories

Unnamed: 0,Category,Answer
0,Animals,Aardvark
1,Companies,Amazon
2,Companies,Annies Burger Shack
3,Science,Astrophysics
4,Companies,Barnes & Noble
5,Characters,Bert and Ernie
6,Characters,Big bird
7,Science,Brain
8,Animals,Brown Bear
9,Companies,Byron Burgers
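
As an alternative to the two `map` calls, the same split can be sketched with the vectorised `str.split`. The sample rows below are made up for illustration; `n=1` is used on the assumption that only the first colon separates category from answer, so an answer that itself contains a colon would stay intact.

```python
import pandas as pd

# Hypothetical sample in the same "Category: Answer" shape as the input sheet.
raw = pd.DataFrame({"Category: Answer": ["Animals: Aardvark",
                                         "Companies : Barnes & Noble"]})

# Split on the first colon only, expanding the result into two columns.
parts = raw["Category: Answer"].str.split(":", n=1, expand=True)

# Strip stray whitespace left around the separator.
clean = pd.DataFrame({"Category": parts[0].str.strip(),
                      "Answer": parts[1].str.strip()})
```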


In [214]:
### Join the datasets together, making sure to keep an eye on row counts

In [215]:
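# Note: concat with axis=1 pairs rows by index position, so this relies on the
# Names sheet being in the same order as the Answer Smash sheet.
answer_name = pd.concat([answers, names], axis=1)
answer_name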

Unnamed: 0,Q No,Answer Smash,Name
0,1,Mo Hassnow leopard,Mo Hassn
1,2,Kelly Gilbert and ernie,Kelly Gilbert
2,3,Arsenergy units,Arsene
3,4,Nicolas Mieszalymph nodes,Nicolas Mieszaly
4,5,Amalia García-Vellido Santíastrophysics,Amalia García-Vellido Santías
5,6,Owen Barnes & Noble,Owen Barnes
6,7,Simon Evans Cycles,Simon Evans
7,8,Donna Coleslaw,Donna Coles
8,9,Will Suttony Stark,Will Sutton
9,10,Shahzad Ziaardvark,Shahzad Zia


In [216]:
df = answer_name.merge(questions, how="inner", on="Q No")
df

Unnamed: 0,Q No,Answer Smash,Name,Category,Question
0,1,Mo Hassnow leopard,Mo Hassn,Animals,Which mammal has the latin name panthera uncia?
1,2,Kelly Gilbert and ernie,Kelly Gilbert,Characters,Name the famous Sesame Street duo.
2,3,Arsenergy units,Arsene,Science,What are joules or therms an example of?
3,4,Nicolas Mieszalymph nodes,Nicolas Mieszaly,Science,What parts of the body contain immune cells to...
4,5,Amalia García-Vellido Santíastrophysics,Amalia García-Vellido Santías,Science,Which branch of space science applies the laws...
5,6,Owen Barnes & Noble,Owen Barnes,Companies,What's the name of the American bookseller fou...
6,7,Simon Evans Cycles,Simon Evans,Companies,Which British cycle retailer was founded in 1921?
7,8,Donna Coleslaw,Donna Coles,Food,Which side dish consists mainly of shredded ra...
8,9,Will Suttony Stark,Will Sutton,Characters,"Who is the a genius, billionaire, playboy, phi..."
9,10,Shahzad Ziaardvark,Shahzad Zia,Animals,Which animal is frequently confused with the a...
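
One way to keep an eye on row counts during a merge like this is to let pandas check the key relationship and then compare lengths. The sketch below uses small hypothetical frames, not the notebook's data; `validate="one_to_one"` raises a `MergeError` if `Q No` is duplicated on either side.

```python
import pandas as pd

# Hypothetical frames keyed on "Q No".
left = pd.DataFrame({"Q No": [1, 2, 3], "Answer Smash": ["a", "b", "c"]})
right = pd.DataFrame({"Q No": [1, 2, 3], "Question": ["q1", "q2", "q3"]})

# validate guards against accidental row duplication from repeated keys.
merged = left.merge(right, how="inner", on="Q No", validate="one_to_one")

# An inner join can also drop rows silently, so compare against the input size.
assert len(merged) == len(left)
```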


In [217]:
### Filter the data so that each answer smash is matched with the corresponding name and answer 

In [218]:
categories

Unnamed: 0,Category,Answer
0,Animals,Aardvark
1,Companies,Amazon
2,Companies,Annies Burger Shack
3,Science,Astrophysics
4,Companies,Barnes & Noble
5,Characters,Bert and Ernie
6,Characters,Big bird
7,Science,Brain
8,Animals,Brown Bear
9,Companies,Byron Burgers


In [220]:
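import re

# `list_of_answers` is not defined in any cell shown here; it is assumed
# to be the cleaned answers from the categories table.
list_of_answers = categories["Answer"].tolist()

results = []
for answer in list_of_answers:
    # Escape the answer so that any regex metacharacters are matched literally.
    pattern = re.escape(answer)
    matches = df["Answer Smash"].map(lambda x: re.findall(pattern, x, flags=re.IGNORECASE))
    # Treat "no match" as missing so unmatched answers can be dropped below.
    matches = matches.map(lambda x: np.nan if len(x) == 0 else x)
    results.append(matches)
# Keep only the answers that matched at least one answer smash.
results = pd.concat(results, axis=1).dropna(how="all", axis=1)
results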

Unnamed: 0,Answer Smash,Answer Smash.1,Answer Smash.2,Answer Smash.3,Answer Smash.4,Answer Smash.5,Answer Smash.6,Answer Smash.7,Answer Smash.8,Answer Smash.9,Answer Smash.10,Answer Smash.11,Answer Smash.12,Answer Smash.13,Answer Smash.14,Answer Smash.15,Answer Smash.16,Answer Smash.17,Answer Smash.18,Answer Smash.19
0,,,,,,,,,,,,,,,,,[snow leopard],,,
1,,,,[bert and ernie],,,,,,,,,,,,,,,,
2,,,,,,,,,,[energy units],,,,,,,,,,
3,,,,,,,,,,,,,,[lymph nodes],,,,,,
4,,[astrophysics],,,,,,,,,,,,,,,,,,
5,,,[Barnes & Noble],,,,,,,,,,,,,,,,,
6,,,,,,,,,,,[Evans Cycles],,,,,,,,,
7,,,,,,,,,[Coleslaw],,,,,,,,,,,
8,,,,,,,,,,,,,,,,,,[tony Stark],,
9,[aardvark],,,,,,,,,,,,,,,,,,,
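
The wide table above has one column per answer, most of it empty. A more compact way to do the same lookup, sketched here with a hypothetical helper `find_answer`, is to return the first answer that appears at the end of a smash. It assumes the answer always forms the tail of the smash, which holds for the rows shown.

```python
import re

def find_answer(smash, answers):
    """Return the first answer that the smash ends with, ignoring case, else None."""
    for answer in answers:
        # re.escape makes the answer a literal pattern; "$" anchors it to the end.
        if re.search(re.escape(answer) + r"$", smash, flags=re.IGNORECASE):
            return answer
    return None

# Usage with two smashes from the table above and a short candidate list.
candidates = ["Brain", "Bert and Ernie", "Coleslaw"]
first = find_answer("Kelly Gilbert and ernie", candidates)
second = find_answer("Donna Coleslaw", candidates)
```

Returning the answer from the lookup list, rather than the matched substring, keeps the original capitalisation.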


In [223]:
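# Collapse the sparse match table into one Series of answers, keyed by row index.
test = pd.concat([results.iloc[:, i].dropna() for i in range(len(results.columns))])
# Each cell holds a one-element list of matches; take the matched text.
test = test.map(lambda x: x[0])
# Note: capitalize() also lowercases later words, e.g. "Barnes & noble".
test = test.str.capitalize()
test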

9                      Aardvark
4                  Astrophysics
5                Barnes & noble
1                Bert and ernie
18                   Brown bear
14                Byron burgers
16    Casper the friendly ghost
13                   Chinchilla
7                      Coleslaw
2                  Energy units
6                  Evans cycles
10                      Gherkin
19                    Hot sauce
3                   Lymph nodes
11                 Norman bates
12                     Seahorse
0                  Snow leopard
8                    Tony stark
15                        Unagi
17                      Wayfair
Name: Answer Smash, dtype: object

In [232]:
df.join(test, how="inner", rsuffix="_y").rename(columns={"Answer Smash_y": "Answer"})

Unnamed: 0,Q No,Answer Smash,Name,Category,Question,Answer
0,1,Mo Hassnow leopard,Mo Hassn,Animals,Which mammal has the latin name panthera uncia?,Snow leopard
1,2,Kelly Gilbert and ernie,Kelly Gilbert,Characters,Name the famous Sesame Street duo.,Bert and ernie
2,3,Arsenergy units,Arsene,Science,What are joules or therms an example of?,Energy units
3,4,Nicolas Mieszalymph nodes,Nicolas Mieszaly,Science,What parts of the body contain immune cells to...,Lymph nodes
4,5,Amalia García-Vellido Santíastrophysics,Amalia García-Vellido Santías,Science,Which branch of space science applies the laws...,Astrophysics
5,6,Owen Barnes & Noble,Owen Barnes,Companies,What's the name of the American bookseller fou...,Barnes & noble
6,7,Simon Evans Cycles,Simon Evans,Companies,Which British cycle retailer was founded in 1921?,Evans cycles
7,8,Donna Coleslaw,Donna Coles,Food,Which side dish consists mainly of shredded ra...,Coleslaw
8,9,Will Suttony Stark,Will Sutton,Characters,"Who is the a genius, billionaire, playboy, phi...",Tony stark
9,10,Shahzad Ziaardvark,Shahzad Zia,Animals,Which animal is frequently confused with the a...,Aardvark
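
The requirements also ask for unnecessary columns to be removed and the data to be output. A minimal sketch of that last step follows. It uses a one-row hypothetical frame with placeholder question text, and writes to an in-memory buffer instead of a file so that no output path has to be assumed.

```python
import io
import pandas as pd

# One-row stand-in for the joined result (the question text is a placeholder).
final = pd.DataFrame({"Q No": [8],
                      "Answer Smash": ["Donna Coleslaw"],
                      "Name": ["Donna Coles"],
                      "Category": ["Food"],
                      "Question": ["placeholder question text"],
                      "Answer": ["Coleslaw"]})

# Keep only the five fields listed under Output, in that order.
output = final[["Q No", "Name", "Question", "Answer", "Answer Smash"]]

# Write CSV without the index column; pass a file path instead to save to disk.
buffer = io.StringIO()
output.to_csv(buffer, index=False)
```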
