# How do you transpose a Question/Answer dataset?
Recently, a friend came to me with an interesting challenge.  He had a dataset of questions and answers in which each record contained a single question and its answer.  Arguably, this dataset was already in a [tidy](https://vita.had.co.nz/papers/tidy-data.pdf) format, but my friend wanted to transpose the data so that each unique question became a column of its own, with the answers as values.

Before I could come to his aid, my friend had already found a [great answer](https://medium.com/@enricobergamini/creating-non-numeric-pivot-tables-with-python-pandas-7aa9dfd788a7) at [Medium.com](https://medium.com/) using the pandas function [pivot_table](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pivot_table.html).

Here's what he did:

### Step 1: Import pandas

In [17]:
import pandas as pd

### Step 2: Load the dataset into a pandas dataframe

In [18]:
df = pd.read_csv('questions.csv', sep='\t')
df.head()

| | person | question | answer |
|---|---|---|---|
| 0 | Sir Robin | What is your name? | Sir Robin of Camelot |
| 1 | Sir Robin | What is your quest? | To seek the Holy Grail |
| 2 | Sir Robin | What is the capital of Assyria? | I don't know that |
| 3 | Sir Lancelot | What is your name? | Sir Lancelot of Camelot |
| 4 | Sir Lancelot | What is your quest? | To seek the Holy Grail |


### Step 3: Run pivot_table
The trick here is the *aggfunc* parameter.  *aggfunc* is normally used to sum, average, or perform some other numeric operation on your *values* columns.  Interestingly, though, you can supply your own custom function instead.  Here, the Medium.com author found that he could simply loop through every answer in each person/question group and join them with spaces.  Since each group in this dataset holds exactly one answer, the function effectively returns the original answer.

In [19]:
df_pivotted = df.pivot_table(index='person', values=['answer'], 
                             columns=['question'], aggfunc=lambda x: ' '.join(str(v) for v in x))
df_pivotted.head()

| | answer | answer | answer | answer | answer |
|---|---|---|---|---|---|
| question | What is the air speed of an unladened swallow? | What is the capital of Assyria? | What is your favorite colour? | What is your name? | What is your quest? |
| person | | | | | |
| King Arthur | What do you mean? An African or European swal... | | | Arthur, King of the Britons | I seek the Holy Grail |
| Sir Galahad | | | Blue, no Yellow | Sir Galahad of Camelot | I seek the Grail |
| Sir Lancelot | | | Blue | Sir Lancelot of Camelot | To seek the Holy Grail |
| Sir Robin | | I don't know that | | Sir Robin of Camelot | To seek the Holy Grail |
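
To make it clearer what the custom *aggfunc* actually receives, here is a minimal sketch on a tiny stand-in dataframe.  The data and the `join_answers` helper are my own illustration, not part of the original dataset: I deliberately gave Sir Galahad two rows for the same question so the join has something to combine.

```python
import pandas as pd

# Hypothetical stand-in data (not the real questions.csv). The last two rows
# repeat the same (person, question) pair on purpose.
df_demo = pd.DataFrame({
    "person": ["Sir Robin", "Sir Robin", "Sir Galahad", "Sir Galahad"],
    "question": ["What is your name?", "What is your quest?",
                 "What is your favorite colour?", "What is your favorite colour?"],
    "answer": ["Sir Robin of Camelot", "To seek the Holy Grail", "Blue", "no Yellow"],
})

def join_answers(answers):
    # Called once per (person, question) group with that group's answers.
    return " ".join(str(a) for a in answers)

# Passing values as a plain string (not a list) keeps the columns single-level.
pivoted_demo = df_demo.pivot_table(index="person", columns="question",
                                   values="answer", aggfunc=join_answers)
print(pivoted_demo)
```

With one answer per group the function hands back that answer unchanged; with two, as for Sir Galahad here, they are glued together with a space.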


### That seems pretty complicated
The use of [pivot_table](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pivot_table.html) certainly works in this example, and it's pretty sweet to see that you can pass your own custom function to it.  However, pandas also has a simpler [pivot](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pivot.html) function that reshapes the data without aggregating it.  Could that have worked here?

The answer is: yes.  When you google [pandas pivot vs pivot_table](https://www.google.com/search?q=pandas+pivot+vs+pivot_table), one of the top responses is [this Stack Overflow post](https://stackoverflow.com/questions/30960338/pandas-difference-between-pivot-and-pivot-table-why-is-only-pivot-table-workin) that suggests *pivot_table* only allows numerically typed columns in the *values* parameter while *pivot* will take strings.  I don't think this is quite true, since the above example passed a string column to the *values* parameter; the real limitation is that the default *aggfunc* is a mean, which makes no sense for strings.  Still, it does suggest that *pivot* might be more disposed to working with strings than *pivot_table*.  Let's give it a try:

In [20]:
df.pivot(index='person', values='answer', columns='question')

| question | What is the air speed of an unladened swallow? | What is the capital of Assyria? | What is your favorite colour? | What is your name? | What is your quest? |
|---|---|---|---|---|---|
| person | | | | | |
| King Arthur | What do you mean? An African or European swal... | | | Arthur, King of the Britons | I seek the Holy Grail |
| Sir Galahad | | | Blue, no Yellow | Sir Galahad of Camelot | I seek the Grail |
| Sir Lancelot | | | Blue | Sir Lancelot of Camelot | To seek the Holy Grail |
| Sir Robin | | I don't know that | | Sir Robin of Camelot | To seek the Holy Grail |


Whaddya know?!  Looks like we can do the same transformation somewhat more easily with *pivot*.
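
One caveat worth knowing: *pivot* only works when every person/question pair appears at most once, because it has no *aggfunc* to fall back on.  The sketch below uses made-up data (the `dupes` dataframe is hypothetical, with Sir Galahad answering the same question twice) to show *pivot* raising an error where *pivot_table* with a custom function still succeeds.

```python
import pandas as pd

# Hypothetical data: Sir Galahad answers the same question twice.
dupes = pd.DataFrame({
    "person": ["Sir Galahad", "Sir Galahad", "Sir Lancelot"],
    "question": ["What is your favorite colour?"] * 3,
    "answer": ["Blue", "no Yellow", "Blue"],
})

# pivot cannot decide which of the two answers belongs in the cell.
try:
    dupes.pivot(index="person", columns="question", values="answer")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print(f"pivot refused the duplicates: {err}")

# pivot_table copes because aggfunc collapses each group to a single value.
combined = dupes.pivot_table(index="person", columns="question", values="answer",
                             aggfunc=lambda answers: ", ".join(answers))
print(combined)
```

So for a dataset like my friend's, where each person answers each question once, *pivot* is the simpler choice; if duplicates can occur, *pivot_table* with a joining function is the safer one.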