**find blog here**: [link](https://towardsdatascience.com/3-awesome-visualization-techniques-for-every-dataset-9737eecacbe8)

In this post, we will see 3 cool visual tools:

- Categorical Correlation with Graphs,
- Pairplots,
- Swarmplots and Graph Annotations using Seaborn.

In [3]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
# We probably don't need the gridlines. If you do want them, comment out this line.
sns.set(style = "ticks")
player_df = pd.read_csv("fifa19data.csv")
numcols = ['Overall','Potential','Crossing','Finishing','ShortPassing','Dribbling','LongPassing','BallControl','Acceleration','SprintSpeed','Agility','Stamina','Value','Wage']
catcols = ['Name','Club','Nationality','Preferred Foot','Position','Body Type']
# keep only the columns we need
player_df = player_df[numcols + catcols]
# view the first few rows of data
player_df.head()

Unnamed: 0,Overall,Potential,Crossing,Finishing,ShortPassing,Dribbling,LongPassing,BallControl,Acceleration,SprintSpeed,Agility,Stamina,Value,Wage,Name,Club,Nationality,Preferred Foot,Position,Body Type
0,94,94,84.0,95.0,90.0,97.0,87.0,96.0,91.0,86.0,91.0,72.0,€110.5M,€565K,L. Messi,FC Barcelona,Argentina,Left,RF,Messi
1,94,94,84.0,94.0,81.0,88.0,77.0,94.0,89.0,91.0,87.0,88.0,€77M,€405K,Cristiano Ronaldo,Juventus,Portugal,Right,ST,C. Ronaldo
2,92,93,79.0,87.0,84.0,96.0,78.0,95.0,94.0,90.0,96.0,81.0,€118.5M,€290K,Neymar Jr,Paris Saint-Germain,Brazil,Right,LW,Neymar
3,91,93,17.0,13.0,50.0,18.0,51.0,42.0,57.0,58.0,60.0,43.0,€72M,€260K,De Gea,Manchester United,Spain,Right,GK,Lean
4,91,92,93.0,82.0,92.0,86.0,91.0,91.0,78.0,76.0,79.0,90.0,€102M,€355K,K. De Bruyne,Manchester City,Belgium,Right,RCM,Normal


This is nicely formatted data, but the Wage and Value columns are stored as strings (a Euro sign, a number, and a "K" or "M" suffix), so we need to convert them to numbers before the analysis that follows.

In [4]:
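def wage_split(x):
    # Wages look like "€565K": drop the leading "€" and the trailing "K" (result in thousands of euros).
    try:
        return int(x.split("K")[0][1:])
    except (AttributeError, ValueError):
        return 0
player_df['Wage'] = player_df['Wage'].apply(wage_split)
def value_split(x):
    # Values carry an "M" or "K" suffix, e.g. "€110.5M": convert everything to millions of euros.
    try:
        if 'M' in x:
            return float(x.split("M")[0][1:])
        elif 'K' in x:
            return float(x.split("K")[0][1:])/1000
        return 0  # no recognised suffix
    except (TypeError, ValueError):
        return 0
player_df['Value'] = player_df['Value'].apply(value_split)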
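
To sanity-check this kind of parsing without loading the full dataset, here is a minimal standalone sketch. The helper name `parse_euro` and the `DIVISORS` table are hypothetical (they are not part of the original notebook), and the sample strings simply mirror the format shown in the table above.

```python
# Hypothetical divisors that express every amount in millions of euros.
DIVISORS = {"M": 1, "K": 1000}

def parse_euro(text, divisors=DIVISORS):
    """Convert strings such as '€110.5M' or '€565K' to a float in millions.

    Anything that cannot be parsed (missing values, empty strings) becomes 0.0.
    """
    try:
        cleaned = text.strip().lstrip("€")
        suffix = cleaned[-1].upper()
        if suffix in divisors:
            return float(cleaned[:-1]) / divisors[suffix]
        return float(cleaned)  # plain number with no suffix, e.g. "€0"
    except (AttributeError, IndexError, ValueError):
        return 0.0

samples = ["€110.5M", "€565K", "€0", None]
parsed = [parse_euro(s) for s in samples]
print(parsed)
```

Checking a handful of representative strings like this, including a missing value, is a cheap way to catch problems such as a wrong suffix case before applying the function to a whole column.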

#### Categorical Correlation with Graphs

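Correlation is a quick way to judge how useful a predictor might be: if a predictor variable is strongly correlated with the target variable, whether positively or negatively, it carries information about the target and is worth keeping.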
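
As a small illustration of that idea, the sketch below builds a toy table and reads off each column's correlation with a target. The numbers and column names are invented for the example and are not taken from the FIFA data.

```python
import pandas as pd

# Invented illustrative data: three candidate predictors and one target.
toy = pd.DataFrame({
    "target":     [1, 2, 3, 4, 5],
    "rises_with": [2, 4, 6, 8, 10],   # moves in step with the target
    "falls_with": [10, 8, 6, 4, 2],   # moves opposite to the target
    "unrelated":  [3, 1, 3, 1, 3],    # no linear relationship with the target
})

# Pearson correlation of every column with the target column.
corr_with_target = toy.corr()["target"]
print(corr_with_target.round(2))
```

Here `rises_with` has a correlation of 1 with the target, `falls_with` has -1, and `unrelated` has 0. The first two are equally informative despite the opposite signs, which is why the magnitude matters more than the direction.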

In [None]:
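# numeric_only=True (pandas >= 1.5) skips the string columns; older versions drop them automatically
corr = player_df.corr(numeric_only=True)
g = sns.heatmap(corr, vmax=.3, center=0, square=True, linewidths=.5, cbar_kws={'shrink': .5}, annot=True, fmt='.2f', cmap='coolwarm')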