# Faceting

Seaborn lets us build a grid of charts, with one panel for each value of one or more categorical variables. This is called faceting.

In [None]:


import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)


import os
print(os.listdir("../input"))



In [None]:
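# Use the full option name; the bare 'max_columns' pattern can match more than one option in recent pandas
pd.set_option('display.max_columns', None)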
df = pd.read_csv("../input/fifa-18-demo-player-dataset/CompleteDataset.csv", index_col=0)

import re
footballers = df.copy()
footballers['Unit'] = df['Value'].str[-1]
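# Strip the currency symbol and unit suffix; .str.replace with regex=True removes any stray letters
footballers['Value (M)'] = np.where(footballers['Unit'] == '0', 0,
                                   footballers['Value'].str[1:-1].str.replace(r'[a-zA-Z]', '', regex=True))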
footballers['Value (M)'] = footballers['Value (M)'].astype(float)
footballers['Value (M)'] = np.where(footballers['Unit'] =='M',
                                   footballers['Value (M)'],
                                   footballers['Value (M)']/1000)
footballers = footballers.assign(Value=footballers['Value (M)'],
                                Postion=footballers['Preferred Positions'].str.split().str[0])
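
The cell above turns strings such as `€95.5M` or `€500K` into a number of millions. The same idea can be sketched on a toy Series with a single regular expression. The sample values, the `to_millions` mapping and the treatment of an unsuffixed value as plain euros are assumptions made for illustration, not part of the dataset's documentation.

```python
import pandas as pd

# Toy values in the same "€<number><unit>" shape as the Value column (made up for this sketch).
raw = pd.Series(["€95.5M", "€500K", "€0"])

# Split each string into its numeric part and an optional K/M suffix.
parts = raw.str.extract(r"€(?P<number>[\d.]+)(?P<unit>[KM]?)")

# Assumed multipliers that convert each unit to millions; no suffix is read as plain euros.
to_millions = {"M": 1.0, "K": 1e-3, "": 1e-6}

value_m = parts["number"].astype(float) * parts["unit"].map(to_millions)
print(value_m.tolist())
```

Keeping the unit handling in one dictionary makes it easy to see, and to change, how each suffix is scaled.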

In [None]:
footballers.head(100)

In [None]:
footballers.info()

In [None]:
footballers.rename(columns={'Postion': 'Position'}, inplace=True)

In [None]:
footballers.info()

In [None]:
import seaborn as sns

Let's break down the Overall ratings by position.

In [None]:
df = footballers[footballers['Position'].isin(['ST','GK'])]
g = sns.FacetGrid(df, col="Position")
g.map(sns.kdeplot, "Overall")


What if we need to show more positions than the two above? `col_wrap` wraps the column facets onto several rows, with at most `col_wrap` plots per row.

In [None]:
df = footballers

g = sns.FacetGrid(df, col="Position", col_wrap=6)
g.map(sns.kdeplot, "Overall")

Seaborn can facet over rows and columns at the same time. Above, the column variable was Position. In another dataset we might want to compare "Standard of Living" against "Education".

Below we compare Strikers and Goalkeepers at the two Madrid clubs and Barcelona.

In [None]:
df = footballers[footballers['Position'].isin(['ST','GK'])]
df = df[df['Club'].isin(['Real Madrid CF', 'FC Barcelona','Atlético Madrid'])]

g = sns.FacetGrid(df, row="Position", col="Club")
g.map(sns.violinplot, "Overall")

Who has the weakest defence among the Premier League's top six?

In [None]:
footballers['Club']

In [None]:
df = footballers[footballers['Position'].isin(['CB'])]
df = df[df['Club'].isin(['Arsenal', 'Tottenham Hotspur','Manchester City', 'Chelsea','Liverpool', 'Manchester United'])]

g = sns.FacetGrid(df, row="Position", col="Club")
g.map(sns.violinplot, "Overall")

We can order the subplots with `row_order` and `col_order` respectively.

In [None]:
df = footballers[footballers['Position'].isin(['ST','GK'])]
df = df[df['Club'].isin(['Real Madrid CF', 'FC Barcelona','Atlético Madrid'])]

g = sns.FacetGrid(df, row="Position", col="Club",
                 row_order=['GK','ST'],
                 col_order=['Atlético Madrid','FC Barcelona','Real Madrid CF' ])
g.map(sns.violinplot, "Overall")

Faceting has limits: it is best suited to one or two categorical variables with low cardinality (few distinct values). You should not add more than five dimensions to the grid.
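
One way to check this before plotting is to count how many panels a grid would contain. The helper below, `panel_count`, is a hypothetical function written for this sketch (it is not part of seaborn), and the toy frame is made up.

```python
import pandas as pd

def panel_count(data, row=None, col=None):
    """Number of panels a row/col facet grid would need for this data."""
    n_rows = data[row].nunique() if row else 1
    n_cols = data[col].nunique() if col else 1
    return n_rows * n_cols

# Made-up frame with 3 positions and 2 clubs.
toy = pd.DataFrame({
    "Position": ["ST", "GK", "CB", "ST", "GK", "CB"],
    "Club": ["A", "A", "A", "B", "B", "B"],
})

print(panel_count(toy, row="Position", col="Club"))
```

If the count runs into the dozens, filtering to a few categories first (as the cells above do with `isin`) usually gives a more readable grid.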

# Pairplot

A pairplot facets by variable rather than by variable value: each panel plots one column against another.

In [None]:
sns.pairplot(footballers[['Overall','Potential','Value']])

# Exercises

In [None]:
pokemon = pd.read_csv("../input/pokemon/Pokemon.csv", index_col=0)


In [None]:
g = sns.FacetGrid(pokemon, row="Legendary")
g.map(sns.kdeplot,"Attack")

In [None]:
g = sns.FacetGrid(pokemon, col="Legendary", row="Generation")
g.map(sns.kdeplot,"Attack")

In [None]:
sns.pairplot(pokemon[['HP','Attack','Defense']])