<a id="top"></a>
<h1 style="font-family:verdana;"> <center>🏆 FIFA : Exploratory Data Analysis 🏆</center> </h1>
<p><center style="color:#159364; font-family:cursive;">Let's start with a little intro</center></p>

***
<div class="alert alert-block alert-info" style="font-size:14px; font-family:verdana; line-height: 1.7em;">
    📌 &nbsp;FIFA 21 is a football simulation video game published by Electronic Arts as part of the FIFA series. It is the 28th installment in the series and was released on 9 October 2020 for Microsoft Windows, PlayStation 4, Xbox One, and Nintendo Switch.
</div>

![images.jpg](https://images.livemint.com/img/2020/06/19/600x338/Fifa_1592566574619_1592566698613.png)

# **Table of Contents**

<a id = 'max 10'></a>

* [Importing Libraries and Exploring Data](#1)


* [Data Manipulation and Cleaning](#2)
  * Deleted Useless Columns
  * Formed a new BMI column from the Weight and Height columns
  * Encoded Player positions as indicator columns using Get Dummies
  * Filled Null values in columns like Dribbling, Defending, Pace etc. with the Median value


* [Data Analysis](#3)
   * Relationship of Overall Rating and Value in Euros using Scatter Plot
   * Relationship of Potential Rating and Wage in Euros using Scatter Plot
   * Count of Preferred Foot and International Reputation using Count Plot
   * Distribution of Age and BMI using Histogram
   * Relationship of BMI vs Pace and Weight vs Pace using Reg Plot
   
   
   
* [Top 20 Analysis](#4)
   * Which Country has Max players in Top 20?
   * Which Club has Max players in Top 20?
   * Which type of player comes most in Top 20?
   * Which age belongs to max players in Top 20?
   
   
   
* [The Biggest Rivalry - Messi and Ronaldo](#5)
   * Comparison of different attributes using **Radar Plot**
   
   
* [Other Top Class Comparisons](#6)
   * Comparison of Kevin De Bruyne and Luka Modric using Radar Plot
   * Comparison of Van Dijk and Sergio Ramos using Radar Plot
   * Comparison of Harry Kane and Sergio Aguero using Radar Plot
  

# **1. Importing Libraries & Data Exploration** <a class="anchor" id="1"></a>

In [None]:
import pandas as pd
pd.set_option('display.max_columns', None)
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from math import pi #Used in Radar Plots
import seaborn as sns
sns.set_style(style='darkgrid')
import warnings
warnings.filterwarnings('ignore')

In [None]:
fifa_21 = pd.read_csv('../input/fifa-21-complete-player-dataset/players_21.csv')

# **2. Data Manipulation & Cleaning** <a class="anchor" id="2"></a>

In [None]:
useless_column = ['dob','sofifa_id','player_url','long_name','body_type','real_face','nation_position','loaned_from','nation_jersey_number']

In [None]:
fifa_21 = fifa_21.drop(useless_column, axis = 1)

In [None]:
# BMI = weight in kg divided by the square of height in metres
fifa_21['BMI'] = fifa_21['weight_kg'] / (fifa_21['height_cm'] / 100) ** 2
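
As a quick sanity check on the formula, here is a minimal sketch that computes BMI for a single hypothetical player. The helper name `bmi` and the 72 kg / 170 cm values are illustrative assumptions, not taken from the dataset.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    height_m = height_cm / 100  # the dataset stores height in centimetres
    return weight_kg / height_m ** 2


# Hypothetical player (illustrative values only)
example_bmi = bmi(72, 170)
print(round(example_bmi, 2))
```

The column-wise expression in the cell above applies the same arithmetic to every row at once.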

In [None]:
new_player_position = fifa_21['player_positions'].str.get_dummies(sep=',').add_prefix('Position')

In [None]:
fifa_21 = pd.concat([fifa_21, new_player_position], axis=1)
fifa_21 = fifa_21.drop('player_positions', axis=1)
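
One thing worth checking here is the separator passed to `get_dummies`. The sketch below assumes that positions are written with a comma followed by a space (for example `"RW, ST, CF"`); that is an assumption about the raw file, and the toy series is made up. Under that assumption, splitting on a bare comma keeps the leading space and produces two different columns for the same position.

```python
import pandas as pd

# Toy stand-in for the `player_positions` column (illustrative values)
positions = pd.Series(["RW, ST, CF", "ST", "GK"])

# Bare comma: the leading space survives, so " ST" and "ST" are separate columns
comma_only = positions.str.get_dummies(sep=",")

# Comma plus space: one indicator column per position
comma_space = positions.str.get_dummies(sep=", ")

print(list(comma_only.columns))
print(list(comma_space.columns))
```

If the real column behaves like the toy one, `sep=', '` (or stripping whitespace first) would avoid the duplicated indicators.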

In [None]:
columns = ['ls','st','rs','lw','lf','cf','rf','rw','lam','cam','ram','lm','lcm','cm','rcm','rm','lwb','ldm', 'cdm','rdm','rwb','lb','lcb','cb','rcb','rb']

In [None]:
# Keep only the part of each rating before the first '+'
for col in columns:
    fifa_21[col] = fifa_21[col].str.split('+', n=1, expand=True)[0]

In [None]:
fifa_21[columns] = fifa_21[columns].fillna(0)
fifa_21[columns] = fifa_21[columns].astype(int)
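
To make the two cells above concrete, here is a small sketch on made-up values. It assumes the position-rating columns hold strings of the form `"89+3"`, and it uses `pd.to_numeric` before filling, which is a slight variation on the notebook's `fillna(0)` followed by `astype(int)`.

```python
import pandas as pd

# Illustrative ratings in "base+boost" form; None marks a missing rating
ratings = pd.Series(["89+3", "75+2", None, "64"])

# Keep the text before the first "+", convert to numbers, treat missing as 0
base = ratings.str.split("+", n=1, expand=True)[0]
base = pd.to_numeric(base).fillna(0).astype(int)

print(base.tolist())
```

Note that filling with 0 makes a missing rating indistinguishable from a genuine rating of 0, which matters if these columns are later averaged or plotted.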

In [None]:
columns = ['dribbling','defending','physic','passing','shooting','pace']

In [None]:
fifa_21[columns].isnull().sum()

dribbling    2083
defending    2083
physic       2083
passing      2083
shooting     2083
pace         2083
dtype: int64

In [None]:
for col in columns:
  fifa_21[col] = fifa_21[col].fillna(fifa_21[col].median())
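
The same median fill can be written without a loop. A minimal sketch on a made-up frame (the column names mirror the notebook, the numbers do not come from the dataset):

```python
import numpy as np
import pandas as pd

# Illustrative frame: both attributes are missing for the second row
toy = pd.DataFrame({
    "pace": [70.0, np.nan, 90.0, 80.0],
    "shooting": [60.0, np.nan, 85.0, 50.0],
})

# fillna with a Series fills each column with its own median (NaN is skipped)
filled = toy.fillna(toy.median())

print(filled)
```

Since all six columns are missing for the same number of rows (2083), it may be worth checking whether those rows share a common trait before imputing them with a single median.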

In [None]:
fifa_21 = fifa_21.fillna(0)

In [None]:
fifa_21.isnull().sum()  # Verify: every count should be zero

short_name     0
age            0
height_cm      0
weight_kg      0
nationality    0
              ..
PositionRB     0
PositionRM     0
PositionRW     0
PositionRWB    0
PositionST     0
Length: 126, dtype: int64

# **Predicting Wage with Regression Models**

In [None]:
input_parameters = ['overall', 'potential', 'passing', 'pace', 'defending']
columns = input_parameters + ['wage_eur']
# Work on a copy holding only the model inputs and the target
fifa = fifa_21[columns].copy()
print(fifa.shape)

(18944, 6)


In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
fifa = fifa.dropna()
print(fifa.shape)

(18944, 6)


In [None]:
from sklearn.model_selection import train_test_split

train, test = train_test_split(fifa, test_size=0.2)
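
The split above is random, so the scores reported further down will vary from run to run. A minimal sketch of a reproducible split, using a toy array and an arbitrary seed (both are assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the `fifa` frame
data = np.arange(20).reshape(10, 2)

# The same random_state yields the same partition on every call
a_train, a_test = train_test_split(data, test_size=0.2, random_state=0)
b_train, b_test = train_test_split(data, test_size=0.2, random_state=0)

print(a_test.shape, np.array_equal(a_test, b_test))
```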

In [None]:
X_train = train[input_parameters]
y_train = train[['wage_eur']]

X_test = test[input_parameters]
y_test = test[['wage_eur']]

print(X_train.shape)
print(y_train.shape)

(15155, 5)
(15155, 1)


In [None]:
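from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
# Fit the scalers on the training data only
X_train = sc_X.fit_transform(X_train)
y_train = sc_y.fit_transform(y_train)

# Reuse the training statistics; refitting on the test set would leak its mean and variance
X_test = sc_X.transform(X_test)
y_test = sc_y.transform(y_test)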
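
The usual convention is to learn the scaling statistics from the training split and apply them unchanged to the test split. A minimal sketch on toy arrays (the variable names with the `_toy` suffix are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrices standing in for the train and test splits
X_train_toy = np.array([[1.0], [2.0], [3.0]])
X_test_toy = np.array([[2.0], [4.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_toy)  # learn mean and std from training data
X_test_scaled = scaler.transform(X_test_toy)        # reuse them; no refit on test data

print(scaler.mean_, X_test_scaled.ravel())
```

With this convention the test values are expressed on the training scale, so a test value equal to the training mean maps to 0 regardless of what else is in the test set.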

In [None]:
# numpy, pandas and matplotlib were already imported in section 1
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.svm import SVR

In [None]:
regressor = LinearRegression()
regressor.fit(X_train, y_train)

LinearRegression()

In [None]:
y_pred = regressor.predict(X_test)
print(r2_score(y_test, y_pred))

0.3574597978464925


In [None]:
print(mean_squared_error(y_test, y_pred))

0.6425402021535073


In [None]:
regressor = SVR(kernel = 'rbf')
regressor.fit(X_train, y_train.ravel())  # SVR expects a 1-D target array

SVR()

In [None]:
y_pred = regressor.predict(X_test)
print(r2_score(y_test, y_pred))

0.7483311568500739


In [None]:
print(mean_squared_error(y_test, y_pred))

0.251668843149926


In [None]:
# Variance of the squared errors of the most recently fitted model (the SVR) on the test set
preds = regressor.predict(X_test)
squared_errors = (np.ravel(y_test) - preds) ** 2
v = np.var(squared_errors)
print("variance", v)