## Reshaping and Pivot Tables

### DataFrame.pivot

In [None]:
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/flights.csv')
df.head()

This dataset contains 144 rows, which is too many to take in at a glance. We can reshape it with the `.pivot()` method:

In [None]:
df.pivot(index='year', columns='month', values='passengers')


Now we can see all the values together and get some insights. For example, summer flights were popular from the very beginning, and their numbers grew faster than those of other months.
As you can see, a few things have changed:

- The index and the columns now have names: "year" and "month".
- The "year" column is now the index.
- The values of the "month" column have become the column labels.
- Each passenger value is now located where its year (row) and its month (column) intersect.
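
The same changes can be checked on a much smaller table. The sketch below uses a hypothetical four-row dataframe (the names `long_df` and `wide` and the sample numbers are made up for illustration, not taken from the flights dataset):

```python
import pandas as pd

# Hypothetical toy data in long format: one row per (year, month) pair.
long_df = pd.DataFrame({
    'year': [1949, 1949, 1950, 1950],
    'month': ['Jan', 'Feb', 'Jan', 'Feb'],
    'passengers': [112, 118, 115, 126],
})

wide = long_df.pivot(index='year', columns='month', values='passengers')

# Both axes are named after the columns they were built from.
print(wide.index.name, wide.columns.name)

# Each value sits where its year (row) and its month (column) intersect.
print(wide.loc[1950, 'Feb'])

# The new column labels come out sorted, so string months are ordered
# alphabetically rather than by calendar.
print(list(wide.columns))
```

Because the month labels here are plain strings, the pivoted columns follow alphabetical order; this is worth keeping in mind when reading the larger table.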

Note: all the methods in this section are also available as top-level pandas functions. In that case, the dataframe to process is passed as the first argument (the `data` argument). The following call produces the same result:

In [None]:
pd.pivot(df, index='year', columns='month', values='passengers')
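
One way to confirm that the two spellings agree is to compare their results with `.equals()`. This is a minimal sketch on a hypothetical toy dataframe (`long_df` and its values are assumptions made for illustration):

```python
import pandas as pd

# Hypothetical toy data, used only to compare the two call styles.
long_df = pd.DataFrame({
    'year': [1949, 1949, 1950, 1950],
    'month': ['Jan', 'Feb', 'Jan', 'Feb'],
    'passengers': [112, 118, 115, 126],
})

# Method form: the dataframe is the object the method is called on.
method_result = long_df.pivot(index='year', columns='month', values='passengers')

# Function form: the dataframe is passed as the first argument.
function_result = pd.pivot(long_df, index='year', columns='month', values='passengers')

# True when both results have the same shape, labels, and values.
print(method_result.equals(function_result))
```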


### DataFrame.pivot_table


In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/mpg.csv')
df.head()

Suppose we need to find the mean horsepower of the cars, broken down by country of origin and model year. The `.pivot_table()` method aggregates the values that share the same index and column labels using the function passed in the `aggfunc` argument, which defaults to the mean. We specify the index, the columns, and the values, and round the results to 1 decimal place for neatness:

In [None]:
df.pivot_table(index='origin', columns='model_year', values='horsepower').round(1)
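
To see why an aggregator is needed, it may help to look at a table where one (origin, year) pair occurs more than once. The sketch below uses a hypothetical four-row dataframe named `cars` (the numbers are invented); it shows that `.pivot()` refuses to reshape such data, while `.pivot_table()` combines the duplicates with the chosen `aggfunc`:

```python
import pandas as pd

# Hypothetical toy data: ('usa', 70) appears twice.
cars = pd.DataFrame({
    'origin': ['usa', 'usa', 'japan', 'japan'],
    'model_year': [70, 70, 70, 71],
    'horsepower': [130.0, 150.0, 95.0, 88.0],
})

# pivot() cannot decide which of the duplicate values to keep,
# so it raises a ValueError.
try:
    cars.pivot(index='origin', columns='model_year', values='horsepower')
    pivot_failed = False
except ValueError:
    pivot_failed = True
print(pivot_failed)

# pivot_table() aggregates the duplicates; 'mean' is the default,
# spelled out here for clarity.
mean_hp = cars.pivot_table(index='origin', columns='model_year',
                           values='horsepower', aggfunc='mean')

# Any other aggregator can be passed instead, for example 'max'.
max_hp = cars.pivot_table(index='origin', columns='model_year',
                          values='horsepower', aggfunc='max')

print(mean_hp)
print(max_hp)
```

Cells with no matching rows (here, "usa" in year 71) are filled with `NaN`.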


### DataFrame.melt

If our data is pivoted and we want to make it flat again, we can use the `.melt()` method. Let's create a sample `wide_df`:

In [None]:
wide_df = df.pivot_table(index='origin', columns='model_year', values='horsepower').round(2)
wide_df.reset_index(inplace=True)

Let's collapse all the year columns into one by calling `.melt()` with suitable parameters. The `id_vars` argument takes the name of the column that identifies each row; in our case, it is "origin". The `value_vars` argument takes the list of columns to unpivot. By default, `.melt()` uses every column not listed in `id_vars`, so we can simply omit this argument. We then output the first 10 rows of the resulting dataframe:



In [None]:
wide_df.melt(id_vars='origin').head(10)


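To melt only some of the columns, pass them explicitly in `value_vars`. Here we keep only the last three year columns:

In [None]:
wide_df.melt(id_vars='origin', value_vars=wide_df.columns[-3:])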
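
The effect of `id_vars` and `value_vars` is easier to follow on a tiny table. The sketch below builds a hypothetical wide dataframe by hand (the name `wide` and its values are assumptions for illustration) and also uses the `var_name` and `value_name` arguments of `.melt()` to label the two new columns explicitly:

```python
import pandas as pd

# Hypothetical wide table: one row per origin, one column per model year.
wide = pd.DataFrame({
    'origin': ['japan', 'usa'],
    70: [95.0, 140.0],
    71: [88.0, 120.0],
    72: [90.0, 135.0],
})

# Melt every non-identifier column: 2 rows x 3 year columns -> 6 rows.
long_all = wide.melt(id_vars='origin',
                     var_name='model_year', value_name='horsepower')

# Melt only the listed columns: 2 rows x 2 year columns -> 4 rows.
long_subset = wide.melt(id_vars='origin', value_vars=[71, 72],
                        var_name='model_year', value_name='horsepower')

print(long_all)
print(long_subset)
```

Each melted row keeps its identifier and gains one column holding the former column label and one holding the value.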
