# FIFA 20 overall prediction

Using the player stats from FIFA 19, we'll try to predict each player's overall rating in FIFA 20. Later on, the same model could be used to predict overall ratings in future FIFA editions.
We'll use Pandas and Matplotlib to analyze the data, and machine learning models from scikit-learn to build the prediction models.

## Importing the datasets

Let's start by opening the player datasets from FIFA 19 and FIFA 20. We'll use Pandas to load each dataset into its own dataframe, selecting only a handful of columns to keep the data small.

In [5]:
import pandas as pd
import numpy as np

cols_to_use = ['sofifa_id','short_name', 'age', 'height_cm',
               'weight_kg', 'nationality', 'club', 'overall',
               'potential', 'value_eur', 'weak_foot', 'skill_moves',
               'team_position', 
               'pace', 'shooting', 'passing', 'dribbling', 'defending', 'physic'] 
fifa19 = pd.read_csv('fifa-19.csv', usecols=cols_to_use)
fifa20 = pd.read_csv('fifa-20.csv', usecols=cols_to_use)

fifa19.head(10)

Unnamed: 0,sofifa_id,short_name,age,height_cm,weight_kg,nationality,club,overall,potential,value_eur,weak_foot,skill_moves,team_position,pace,shooting,passing,dribbling,defending,physic
0,20801,Cristiano Ronaldo,33,187,83,Portugal,Juventus,94,94,77000000,4,5,LW,90.0,93.0,81.0,89.0,35.0,79.0
1,158023,L. Messi,31,170,72,Argentina,FC Barcelona,94,94,110500000,4,4,RW,88.0,91.0,88.0,96.0,32.0,61.0
2,190871,Neymar Jr,26,175,68,Brazil,Paris Saint-Germain,92,93,118500000,5,5,CAM,92.0,84.0,83.0,95.0,32.0,59.0
3,193080,De Gea,27,193,76,Spain,Manchester United,91,93,72000000,3,1,GK,,,,,,
4,192985,K. De Bruyne,27,181,70,Belgium,Manchester City,91,92,102000000,5,4,RCM,77.0,86.0,92.0,87.0,60.0,78.0
5,155862,Sergio Ramos,32,184,82,Spain,Real Madrid,91,91,51000000,3,3,LCB,75.0,63.0,71.0,71.0,91.0,84.0
6,176580,L. Suárez,31,182,86,Uruguay,FC Barcelona,91,91,80000000,4,3,ST,80.0,90.0,79.0,88.0,52.0,85.0
7,177003,L. Modrić,32,172,66,Croatia,Real Madrid,91,91,67000000,4,4,RCM,76.0,76.0,90.0,91.0,70.0,67.0
8,183277,E. Hazard,27,173,74,Belgium,Chelsea,91,91,93000000,4,4,LW,91.0,82.0,86.0,94.0,35.0,67.0
9,200389,J. Oblak,25,188,87,Slovenia,Atlético Madrid,90,93,68000000,3,1,GK,,,,,,
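One caveat with a fixed column list: `read_csv` raises an error when a name in `usecols` is missing from the file. If the two editions might not share every column, a callable can be passed instead. The sketch below uses a made-up in-memory CSV rather than the real files, and the names `wanted`, `sample`, and `missing` are purely illustrative.

```python
import io
import pandas as pd

# Toy CSV standing in for 'fifa-19.csv'; the rows are made up for illustration.
csv_text = """sofifa_id,short_name,overall,player_url
1,Player A,80,https://example.com/a
2,Player B,75,https://example.com/b
"""

# 'pace' is deliberately absent from the toy file.
wanted = ['sofifa_id', 'short_name', 'overall', 'pace']

# A callable keeps only the wanted columns that actually exist in the file,
# instead of raising when one of them is missing.
sample = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name in wanted)

# Report requested columns that were not found, so the gap is not silent.
missing = [name for name in wanted if name not in sample.columns]
print(list(sample.columns))
print(missing)
```

The trade-off is that a misspelled column name no longer fails loudly, which is why the sketch collects the missing names explicitly.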


In [2]:
fifa20.head(10)

Unnamed: 0,sofifa_id,player_url,short_name,long_name,age,dob,height_cm,weight_kg,nationality,club,...,lwb,ldm,cdm,rdm,rwb,lb,lcb,cb,rcb,rb
0,158023,https://sofifa.com/player/158023/lionel-messi/...,L. Messi,Lionel Andrés Messi Cuccittini,32,1987-06-24,170,72,Argentina,FC Barcelona,...,68+2,66+2,66+2,66+2,68+2,63+2,52+2,52+2,52+2,63+2
1,20801,https://sofifa.com/player/20801/c-ronaldo-dos-...,Cristiano Ronaldo,Cristiano Ronaldo dos Santos Aveiro,34,1985-02-05,187,83,Portugal,Juventus,...,65+3,61+3,61+3,61+3,65+3,61+3,53+3,53+3,53+3,61+3
2,190871,https://sofifa.com/player/190871/neymar-da-sil...,Neymar Jr,Neymar da Silva Santos Junior,27,1992-02-05,175,68,Brazil,Paris Saint-Germain,...,66+3,61+3,61+3,61+3,66+3,61+3,46+3,46+3,46+3,61+3
3,200389,https://sofifa.com/player/200389/jan-oblak/20/...,J. Oblak,Jan Oblak,26,1993-01-07,188,87,Slovenia,Atlético Madrid,...,,,,,,,,,,
4,183277,https://sofifa.com/player/183277/eden-hazard/2...,E. Hazard,Eden Hazard,28,1991-01-07,175,74,Belgium,Real Madrid,...,66+3,63+3,63+3,63+3,66+3,61+3,49+3,49+3,49+3,61+3
5,192985,https://sofifa.com/player/192985/kevin-de-bruy...,K. De Bruyne,Kevin De Bruyne,28,1991-06-28,181,70,Belgium,Manchester City,...,77+3,77+3,77+3,77+3,77+3,73+3,66+3,66+3,66+3,73+3
6,192448,https://sofifa.com/player/192448/marc-andre-te...,M. ter Stegen,Marc-André ter Stegen,27,1992-04-30,187,85,Germany,FC Barcelona,...,,,,,,,,,,
7,203376,https://sofifa.com/player/203376/virgil-van-di...,V. van Dijk,Virgil van Dijk,27,1991-07-08,193,92,Netherlands,Liverpool,...,79+3,83+3,83+3,83+3,79+3,81+3,87+3,87+3,87+3,81+3
8,177003,https://sofifa.com/player/177003/luka-modric/2...,L. Modrić,Luka Modrić,33,1985-09-09,172,66,Croatia,Real Madrid,...,81+3,81+3,81+3,81+3,81+3,79+3,72+3,72+3,72+3,79+3
9,209331,https://sofifa.com/player/209331/mohamed-salah...,M. Salah,Mohamed Salah Ghaly,27,1992-06-15,175,71,Egypt,Liverpool,...,70+3,67+3,67+3,67+3,70+3,66+3,57+3,57+3,57+3,66+3


## Analyzing and editing the datasets

Now, let's check the dimensions of the two dataframes.

In [6]:
print(fifa19.shape)
print(fifa20.shape)

(17770, 19)
(18278, 19)


The FIFA 19 dataframe has 17,770 rows, whereas the FIFA 20 one has 18,278, so the FIFA 20 database contains more players.
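Row counts alone don't say how many players appear in both editions, and only those players can have a FIFA 20 value attached to their FIFA 19 row. A minimal sketch of that check, using toy frames with made-up IDs in place of the real dataframes and assuming each `sofifa_id` appears once per edition:

```python
import pandas as pd

# Toy frames standing in for fifa19 and fifa20; the IDs are made up.
old = pd.DataFrame({'sofifa_id': [1, 2, 3, 4]})
new = pd.DataFrame({'sofifa_id': [2, 3, 4, 5, 6]})

# Boolean mask: which players of the old edition also exist in the new one.
in_both = old['sofifa_id'].isin(new['sofifa_id'])

n_shared = int(in_both.sum())       # present in both editions
n_dropped = int((~in_both).sum())   # present only in the old edition
n_added = int((~new['sofifa_id'].isin(old['sofifa_id'])).sum())  # new arrivals

print(n_shared, n_dropped, n_added)
```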

Now, let's add each player's FIFA 20 overall to our FIFA 19 dataframe, matching players on sofifa_id, so we can use it to train our model.

In [9]:
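def get20overall(row):
    # Look up this player's FIFA 20 overall by sofifa_id.
    try:
        return fifa20[fifa20['sofifa_id'] == row.sofifa_id].overall.values[0]
    except IndexError:
        # No matching row: the player is not in the FIFA 20 dataset.
        return np.nan

fifa19['20_overall'] = fifa19.apply(get20overall, axis=1)
# dropna() removes every row with a missing value: players absent from FIFA 20
# as well as rows with missing stats (goalkeepers have no pace, shooting, etc.).
fifa19 = fifa19.dropna()
fifa19.info()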

<class 'pandas.core.frame.DataFrame'>
Int64Index: 11592 entries, 0 to 17769
Data columns (total 20 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   sofifa_id      11592 non-null  int64  
 1   short_name     11592 non-null  object 
 2   age            11592 non-null  int64  
 3   height_cm      11592 non-null  int64  
 4   weight_kg      11592 non-null  int64  
 5   nationality    11592 non-null  object 
 6   club           11592 non-null  object 
 7   overall        11592 non-null  int64  
 8   potential      11592 non-null  int64  
 9   value_eur      11592 non-null  int64  
 10  weak_foot      11592 non-null  int64  
 11  skill_moves    11592 non-null  int64  
 12  team_position  11592 non-null  object 
 13  pace           11592 non-null  float64
 14  shooting       11592 non-null  float64
 15  passing        11592 non-null  float64
 16  dribbling      11592 non-null  float64
 17  defending      11592 non-null  float64
 18  physic
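The row-by-row lookup filters the whole FIFA 20 dataframe once per FIFA 19 player, which gets slow on larger datasets. A left merge on `sofifa_id` is one possible alternative that performs the same join in a single call. The sketch below runs on toy frames with made-up values, and `old`, `new`, `targets`, `merged`, and `matched` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Toy frames standing in for fifa19 and fifa20; the values are made up.
old = pd.DataFrame({'sofifa_id': [1, 2, 3], 'overall': [80, 75, 70]})
new = pd.DataFrame({'sofifa_id': [1, 3, 4], 'overall': [82, 69, 65]})

# Keep one row per player and rename the column that will be attached.
targets = (new[['sofifa_id', 'overall']]
           .drop_duplicates(subset='sofifa_id')
           .rename(columns={'overall': '20_overall'}))

# A left merge keeps every row of the old edition; players without a
# match in the new edition get NaN in '20_overall'.
merged = old.merge(targets, on='sofifa_id', how='left')

# Drop only the rows that lack a FIFA 20 value.
matched = merged.dropna(subset=['20_overall'])
print(matched)
```

Passing `subset=['20_overall']` to `dropna` limits the removal to unmatched players; a bare `dropna()` would also discard rows with any other missing stat.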

Now, for simplicity's sake, let's omit all the goalkeepers from our dataframe.

In [10]:
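# Keep only outfield players: drop every row whose team position is "GK".
fifa19 = fifa19[fifa19.team_position != "GK"]
# Remove exact duplicate rows, assigning the result instead of mutating in place.
fifa19 = fifa19.drop_duplicates()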
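To see how many rows each of these two steps removes, the counts can be compared before and after. This is a small sketch on a toy frame with made-up players; `players`, `no_keepers`, and `outfield` are hypothetical names used only here.

```python
import pandas as pd

# Toy frame standing in for fifa19; one goalkeeper and one duplicated row.
players = pd.DataFrame({
    'short_name': ['Keeper A', 'Striker B', 'Striker B', 'Defender C'],
    'team_position': ['GK', 'ST', 'ST', 'LCB'],
    'overall': [85, 88, 88, 80],
})

# Step 1: boolean mask that excludes goalkeepers.
no_keepers = players[players['team_position'] != 'GK']

# Step 2: drop exact duplicate rows and renumber the index.
outfield = no_keepers.drop_duplicates().reset_index(drop=True)

removed_keepers = len(players) - len(no_keepers)
removed_duplicates = len(no_keepers) - len(outfield)
print(removed_keepers, removed_duplicates)
```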