# Soccer Player Salary Prediction 2023
#### Python, Machine Learning
Aaron Xie
___


<a id="problem"></a>
# 1. The Problem

One of the biggest sporting events of 2022 was Lionel Messi and the Argentina national football team winning the World Cup. The win solidifies Messi's standing as one of the greatest soccer players of all time. It also raises the main questions:
* Should Messi have the highest salary because of his honors and rich experience?
* Messi is already 35; can his age and other factors drag down his salary?
* When football managers hire a soccer player, how can they determine the player's salary?

### 1.1. Goals
* Conduct EDA on this data set.
* Use machine learning algorithms to predict a player's salary.
* Find the most accurate machine learning algorithm in this case.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy
from sklearn.model_selection import train_test_split
#import warnings
#warnings.filterwarnings('ignore')

<a id="preparation"></a>
# 2. Data Preparation

### Import the Dataset
Since real-world salary data for soccer players is hard or impossible to obtain, this project uses a fictional data set from the game FIFA 2023. The data set was published on [Kaggle](https://www.kaggle.com/datasets/cashncarry/fifa-23-complete-player-dataset) by ALEX.

In [37]:
df = pd.read_csv(r"C:\Users\zong0\OneDrive\Documents\Data_Science\My_Projects\GitHub\Soccer-Player-Salary-Prediction\datasets\players_fifa23.csv")
train, test = train_test_split(df, test_size = 0.2)

In [38]:
# Create copies
train_df = train.copy(deep = True)
test_df = test.copy(deep = True)
# Create a list holding both datasets so they can be cleaned in the same loop
cleaner = [train_df, test_df]

* **Split the data before cleaning it, to avoid data leakage from the test set.**
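
The split in the cell above is random, so each run of the notebook puts different rows into the training and test sets. A minimal sketch of how the `random_state` argument of `train_test_split` makes the split repeatable is shown here. The toy frame and the seed value 42 are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame standing in for the player table (values are made up for illustration)
toy = pd.DataFrame({"Age": range(20, 30), "WageEUR": range(1000, 11000, 1000)})

# random_state fixes the shuffle, so the same rows land in each split on every run
train_a, test_a = train_test_split(toy, test_size=0.2, random_state=42)
train_b, test_b = train_test_split(toy, test_size=0.2, random_state=42)

print(train_a.index.equals(train_b.index))  # the two splits contain the same rows
print(len(train_a), len(test_a))            # 80% / 20% of the 10 toy rows
```

With a fixed seed, outputs such as sampled rows and missing-value counts stay the same between runs.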

### Describe the Data

In [39]:
train_df.shape

(14831, 90)

The training set has 14,831 rows and 90 columns: 89 candidate features plus the target. Checking the column names:

In [40]:
print(train_df.columns.values)

['ID' 'Name' 'FullName' 'Age' 'Height' 'Weight' 'PhotoUrl' 'Nationality'
 'Overall' 'Potential' 'Growth' 'TotalStats' 'BaseStats' 'Positions'
 'BestPosition' 'Club' 'ValueEUR' 'WageEUR' 'ReleaseClause' 'ClubPosition'
 'ContractUntil' 'ClubNumber' 'ClubJoined' 'OnLoad' 'NationalTeam'
 'NationalPosition' 'NationalNumber' 'PreferredFoot' 'IntReputation'
 'WeakFoot' 'SkillMoves' 'AttackingWorkRate' 'DefensiveWorkRate'
 'PaceTotal' 'ShootingTotal' 'PassingTotal' 'DribblingTotal'
 'DefendingTotal' 'PhysicalityTotal' 'Crossing' 'Finishing'
 'HeadingAccuracy' 'ShortPassing' 'Volleys' 'Dribbling' 'Curve'
 'FKAccuracy' 'LongPassing' 'BallControl' 'Acceleration' 'SprintSpeed'
 'Agility' 'Reactions' 'Balance' 'ShotPower' 'Jumping' 'Stamina'
 'Strength' 'LongShots' 'Aggression' 'Interceptions' 'Positioning'
 'Vision' 'Penalties' 'Composure' 'Marking' 'StandingTackle'
 'SlidingTackle' 'GKDiving' 'GKHandling' 'GKKicking' 'GKPositioning'
 'GKReflexes' 'STRating' 'LWRating' 'LFRating' 'CFRating' 'RFRat

In [41]:
pd.set_option('display.max_columns', None) # default is 20
train_df.sample(5)

Unnamed: 0,ID,Name,FullName,Age,Height,Weight,PhotoUrl,Nationality,Overall,Potential,Growth,TotalStats,BaseStats,Positions,BestPosition,Club,ValueEUR,WageEUR,ReleaseClause,ClubPosition,ContractUntil,ClubNumber,ClubJoined,OnLoad,NationalTeam,NationalPosition,NationalNumber,PreferredFoot,IntReputation,WeakFoot,SkillMoves,AttackingWorkRate,DefensiveWorkRate,PaceTotal,ShootingTotal,PassingTotal,DribblingTotal,DefendingTotal,PhysicalityTotal,Crossing,Finishing,HeadingAccuracy,ShortPassing,Volleys,Dribbling,Curve,FKAccuracy,LongPassing,BallControl,Acceleration,SprintSpeed,Agility,Reactions,Balance,ShotPower,Jumping,Stamina,Strength,LongShots,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,STRating,LWRating,LFRating,CFRating,RFRating,RWRating,CAMRating,LMRating,CMRating,RMRating,LWBRating,CDMRating,RWBRating,LBRating,CBRating,RBRating,GKRating
12134,266009,B. Egüez,Brahian Egüez,30,175,77,https://cdn.sofifa.net/players/266/009/23_60.png,Bolivia,63,63,0,1695,365,"RM,RW",RM,Club Deportivo Guabirá,475000,500,1000000,SUB,2022.0,18.0,2022,False,Not in team,,,Right,1,3,3,High,Medium,70,63,60,61,53,58,67,64,55,60,55,60,54,60,55,65,70,70,65,50,59,63,55,60,58,65,55,57,58,60,65,55,55,49,48,9,6,9,9,5,63,62,61,61,61,62,63,63,61,63,61,58,61,60,57,60,14
9553,110606,A. Mannus,Alan Mannus,40,188,94,https://cdn.sofifa.net/players/110/606/23_60.png,Northern Ireland,65,65,0,1127,354,GK,GK,Shamrock Rovers,80000,500,124000,GK,2022.0,1.0,2018,False,Not in team,,,Right,1,3,1,Medium,Medium,63,66,60,64,36,65,13,11,11,11,14,13,14,16,30,22,30,45,50,64,60,45,58,30,64,12,34,17,19,54,17,58,22,20,13,63,66,60,65,64,28,25,27,27,27,25,29,28,29,28,26,29,26,26,29,26,65
9415,271421,D. Doué,Désiré Doué,17,181,79,https://cdn.sofifa.net/players/271/421/23_60.png,France,66,84,18,1634,352,CAM,CAM,Stade Rennais FC,2100000,2000,5300000,CM,2024.0,33.0,2022,False,Not in team,,,Right,1,3,2,Medium,Medium,78,59,60,72,33,50,50,60,38,64,56,74,60,59,58,72,77,79,77,58,75,66,56,53,54,54,37,21,56,64,56,61,32,37,45,9,12,7,5,13,63,66,65,65,65,66,68,67,61,67,54,50,54,51,44,51,17
13412,263289,D. Antyukh,Denys Antyukh,24,183,78,https://cdn.sofifa.net/players/263/289/23_60.png,Ukraine,62,65,3,1585,351,LM,LM,Dynamo Kyiv,550000,500,1300000,SUB,2025.0,99.0,2021,False,Not in team,,,Left,1,3,3,Medium,Medium,70,56,61,58,44,62,66,56,38,64,46,58,53,37,56,60,71,70,63,55,59,54,43,63,64,61,61,62,56,63,49,44,38,45,30,12,7,8,5,12,59,61,59,59,59,61,63,63,61,63,58,57,58,56,52,56,16
7170,220895,A. Al Sulayhim,Abdulmajeed Al Sulayhim,28,171,61,https://cdn.sofifa.net/players/220/895/23_60.png,Saudi Arabia,68,68,0,1891,395,"CM,CDM",CAM,Al Nassr,1200000,15000,2000000,CM,2023.0,8.0,2020,False,Not in team,,,Right,1,3,3,High,Medium,73,60,67,73,58,64,56,61,53,72,46,72,70,62,69,70,78,68,88,68,89,62,79,76,58,59,59,60,66,68,50,66,56,60,59,8,15,11,9,14,66,68,68,68,68,68,68,68,68,68,67,67,67,66,63,66,20


In [42]:
train_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 14831 entries, 1416 to 2325
Data columns (total 90 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   ID                 14831 non-null  int64  
 1   Name               14831 non-null  object 
 2   FullName           14831 non-null  object 
 3   Age                14831 non-null  int64  
 4   Height             14831 non-null  int64  
 5   Weight             14831 non-null  int64  
 6   PhotoUrl           14831 non-null  object 
 7   Nationality        14831 non-null  object 
 8   Overall            14831 non-null  int64  
 9   Potential          14831 non-null  int64  
 10  Growth             14831 non-null  int64  
 11  TotalStats         14831 non-null  int64  
 12  BaseStats          14831 non-null  int64  
 13  Positions          14831 non-null  object 
 14  BestPosition       14831 non-null  object 
 15  Club               14831 non-null  object 
 16  ValueEUR           1

In [43]:
df.describe()

Unnamed: 0,ID,Age,Height,Weight,Overall,Potential,Growth,TotalStats,BaseStats,ValueEUR,WageEUR,ReleaseClause,ContractUntil,ClubNumber,ClubJoined,NationalNumber,IntReputation,WeakFoot,SkillMoves,PaceTotal,ShootingTotal,PassingTotal,DribblingTotal,DefendingTotal,PhysicalityTotal,Crossing,Finishing,HeadingAccuracy,ShortPassing,Volleys,Dribbling,Curve,FKAccuracy,LongPassing,BallControl,Acceleration,SprintSpeed,Agility,Reactions,Balance,ShotPower,Jumping,Stamina,Strength,LongShots,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,STRating,LWRating,LFRating,CFRating,RFRating,RWRating,CAMRating,LMRating,CMRating,RMRating,LWBRating,CDMRating,RWBRating,LBRating,CBRating,RBRating,GKRating
count,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18447.0,18447.0,18539.0,817.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0,18539.0
mean,237011.899725,25.240412,181.550839,75.173904,65.852042,71.016668,5.164626,1602.114569,357.946221,2875461.0,8824.537462,5081688.0,2023.772266,21.105329,2020.367442,12.29743,1.086305,2.946437,2.366147,68.017746,53.777874,58.024327,63.109553,50.241383,64.775338,49.476833,46.2553,51.846755,59.072226,42.513944,55.898754,47.695129,43.006689,53.568423,58.516263,64.725336,64.838341,63.518906,61.542586,64.05955,57.827661,64.795566,63.040455,65.152004,46.825719,55.668537,46.853282,50.520362,54.191542,47.994444,58.036625,46.747505,48.399159,46.264146,16.402125,16.157182,16.061007,16.205944,16.472895,56.725929,55.819138,55.714925,55.714925,55.714925,55.819138,57.950267,58.451319,57.374076,58.451319,56.281569,55.928583,56.281569,55.650251,54.528184,55.650251,23.257134
std,26714.860353,4.718163,6.858097,7.013593,6.788353,6.192866,5.374795,273.160237,39.628259,7635129.0,19460.531154,14672030.0,1.258955,18.129864,2.056372,6.810884,0.358753,0.673778,0.772428,10.649511,13.619867,9.71795,9.336566,16.392532,9.577715,17.887405,19.623881,17.318947,14.287698,17.635249,18.751691,17.910205,16.997758,14.633838,16.590051,15.280849,15.108259,14.90533,8.900297,14.483193,12.94987,12.293523,16.26933,12.622774,19.362064,16.905505,20.666647,19.660034,13.478006,15.730026,12.036272,20.350228,21.191644,20.701146,17.589457,16.924266,16.680839,17.089109,17.927602,13.475267,14.632018,14.2165,14.2165,14.2165,14.632018,13.905442,13.987122,13.171194,13.987122,13.903836,13.87219,13.903836,14.159466,14.743929,14.159466,15.108925
min,1179.0,16.0,155.0,49.0,47.0,48.0,0.0,759.0,224.0,0.0,0.0,0.0,2022.0,1.0,2002.0,1.0,1.0,1.0,1.0,28.0,16.0,25.0,28.0,15.0,30.0,6.0,3.0,5.0,10.0,3.0,3.0,6.0,4.0,9.0,5.0,14.0,15.0,18.0,30.0,20.0,18.0,22.0,14.0,25.0,4.0,10.0,3.0,2.0,10.0,6.0,13.0,3.0,6.0,6.0,2.0,2.0,2.0,2.0,2.0,19.0,14.0,15.0,15.0,15.0,14.0,17.0,18.0,18.0,18.0,17.0,19.0,17.0,17.0,18.0,17.0,10.0
25%,221663.0,21.0,177.0,70.0,62.0,67.0,0.0,1470.0,331.0,475000.0,1000.0,665000.0,2023.0,9.0,2020.0,6.0,1.0,3.0,2.0,62.0,44.0,52.0,58.0,36.0,58.0,39.0,31.0,44.0,55.0,30.0,51.0,36.0,32.0,45.0,55.0,57.0,57.0,55.0,56.0,56.0,48.0,57.0,56.0,57.0,32.0,45.0,26.0,40.0,45.0,39.0,51.0,29.0,28.0,26.0,8.0,8.0,8.0,8.0,8.0,51.0,50.0,50.0,50.0,50.0,50.0,52.0,54.0,53.0,54.0,51.0,48.0,51.0,49.0,45.0,49.0,17.0
50%,241195.0,25.0,182.0,75.0,66.0,71.0,4.0,1640.0,358.0,1000000.0,3000.0,1500000.0,2024.0,18.0,2021.0,12.0,1.0,3.0,2.0,69.0,56.0,59.0,64.0,54.0,66.0,54.0,50.0,55.0,62.0,44.0,61.0,49.0,42.0,56.0,63.0,68.0,68.0,66.0,62.0,66.0,59.0,65.0,66.0,66.0,51.0,58.0,54.0,56.0,56.0,49.0,59.0,53.0,56.0,53.0,11.0,11.0,11.0,11.0,11.0,60.0,59.0,59.0,59.0,59.0,59.0,61.0,62.0,60.0,62.0,59.0,59.0,59.0,59.0,58.0,59.0,18.0
75%,259059.0,29.0,186.0,80.0,70.0,75.0,9.0,1786.0,385.0,2000000.0,8000.0,3400000.0,2025.0,27.0,2022.0,18.0,1.0,3.0,3.0,75.0,64.0,65.0,69.0,64.0,72.0,63.0,62.0,64.0,68.0,56.0,68.0,61.0,55.0,64.0,69.0,75.0,75.0,74.0,67.0,74.0,68.0,73.0,74.0,74.0,62.0,68.0,64.0,64.0,64.0,60.0,66.0,63.0,66.0,63.0,14.0,14.0,14.0,14.0,14.0,66.0,65.0,65.0,65.0,65.0,65.0,67.0,67.0,66.0,67.0,66.0,66.0,66.0,65.0,66.0,65.0,20.0
max,271817.0,44.0,206.0,105.0,91.0,95.0,26.0,2312.0,502.0,190500000.0,450000.0,366700000.0,2032.0,99.0,2022.0,28.0,5.0,5.0,5.0,97.0,92.0,93.0,94.0,91.0,91.0,94.0,94.0,93.0,93.0,90.0,95.0,93.0,94.0,93.0,94.0,97.0,97.0,94.0,94.0,95.0,94.0,95.0,95.0,96.0,91.0,95.0,91.0,96.0,94.0,92.0,96.0,92.0,93.0,90.0,90.0,90.0,93.0,91.0,90.0,92.0,90.0,91.0,91.0,91.0,90.0,92.0,92.0,91.0,92.0,88.0,89.0,88.0,87.0,90.0,87.0,90.0


In [44]:
train_df.describe(include=['O'])

Unnamed: 0,Name,FullName,PhotoUrl,Nationality,Positions,BestPosition,Club,ClubPosition,NationalTeam,NationalPosition,PreferredFoot,AttackingWorkRate,DefensiveWorkRate
count,14831,14831,14831,14831,14831,14831,14831,14757,14831,660,14831,14831,14831
unique,14172,14700,14754,155,645,15,679,19,36,17,2,3,3
top,J. Taylor,Adama Traoré,https://cdn.sofifa.net/players/238/305/23_60.png,England,CB,CB,Free agent,SUB,Not in team,SUB,Right,Medium,Medium
freq,7,3,2,1307,1932,2919,74,6443,14171,355,11195,9711,10852


### Variables Overview
1. **Target/dependent/outcome Variable**: 'WageEUR'
2. **Variables with no impact on target** (should be excluded): 'ID', 'Name', 'FullName', 'PhotoUrl', 'ClubNumber' (jersey number), 'NationalNumber'
3. **Categorical Variables**: 'Nationality', 'Positions', 'BestPosition', 'Club', 'ClubPosition', 'OnLoad', 'NationalTeam', 'NationalPosition', 'PreferredFoot'
4. **Ordinal (Categorical) Variables**: 'AttackingWorkRate', 'DefensiveWorkRate'
5. **Numerical Variables**: The others
6. For 'ClubJoined' and 'ContractUntil', the raw calendar year is not a useful feature because it does not carry over to future seasons. For example, every 'ClubJoined' value is 2022 or earlier, so a model trained on it cannot be applied to salaries in 2024. Therefore, I will use feature engineering to transform these two features into the number of years a player has been at the club and the number of years left on the contract.
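
The ordinal variables in item 4 have a natural order that a model can use if the levels are mapped to integers. The sketch below shows one way to do that with a plain dictionary. The ordering Low < Medium < High and the new column name `AttackingWorkRateCode` are assumptions made for illustration; the notebook may encode these columns differently.

```python
import pandas as pd

# Toy rows; the column name matches the dataset, the values are illustrative
toy = pd.DataFrame({"AttackingWorkRate": ["High", "Low", "Medium", "High"]})

# Assumed ordering of the three levels: Low < Medium < High
work_rate_order = {"Low": 0, "Medium": 1, "High": 2}

# Map each label to its rank; labels missing from the dictionary would become NaN
toy["AttackingWorkRateCode"] = toy["AttackingWorkRate"].map(work_rate_order)
print(toy)
```

Unlike one-hot encoding, this keeps a single column and preserves the order of the levels, at the cost of assuming the levels are evenly spaced.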

# 3. Data Cleaning

### Correcting I

#### Removing Duplicates

In [45]:
# Check duplicate rows
train_df[train_df.duplicated(keep=False)].sort_values(by='ID').head()

Unnamed: 0,ID,Name,FullName,Age,Height,Weight,PhotoUrl,Nationality,Overall,Potential,Growth,TotalStats,BaseStats,Positions,BestPosition,Club,ValueEUR,WageEUR,ReleaseClause,ClubPosition,ContractUntil,ClubNumber,ClubJoined,OnLoad,NationalTeam,NationalPosition,NationalNumber,PreferredFoot,IntReputation,WeakFoot,SkillMoves,AttackingWorkRate,DefensiveWorkRate,PaceTotal,ShootingTotal,PassingTotal,DribblingTotal,DefendingTotal,PhysicalityTotal,Crossing,Finishing,HeadingAccuracy,ShortPassing,Volleys,Dribbling,Curve,FKAccuracy,LongPassing,BallControl,Acceleration,SprintSpeed,Agility,Reactions,Balance,ShotPower,Jumping,Stamina,Strength,LongShots,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,STRating,LWRating,LFRating,CFRating,RFRating,RWRating,CAMRating,LMRating,CMRating,RMRating,LWBRating,CDMRating,RWBRating,LBRating,CBRating,RBRating,GKRating
1901,226045,J. Gallardo,Jesús Gallardo,27,176,73,https://cdn.sofifa.net/players/226/045/23_60.png,Mexico,75,75,0,2012,438,"LB,LM,LW",LB,Free agent,0,0,0,,,,2018,False,Mexico,LB,23.0,Left,1,3,3,High,Low,83,67,70,75,69,74,75,69,64,73,49,76,73,47,66,74,84,82,80,67,71,75,77,88,71,59,65,65,72,69,65,68,72,70,67,9,7,8,12,11,73,74,72,72,72,74,74,75,73,75,75,72,75,75,71,75,18
1657,226045,J. Gallardo,Jesús Gallardo,27,176,73,https://cdn.sofifa.net/players/226/045/23_60.png,Mexico,75,75,0,2012,438,"LB,LM,LW",LB,Free agent,0,0,0,,,,2018,False,Mexico,LB,23.0,Left,1,3,3,High,Low,83,67,70,75,69,74,75,69,64,73,49,76,73,47,66,74,84,82,80,67,71,75,77,88,71,59,65,65,72,69,65,68,72,70,67,9,7,8,12,11,73,74,72,72,72,74,74,75,73,75,75,72,75,75,71,75,18
1660,226536,O. Colley,Omar Colley,29,191,87,https://cdn.sofifa.net/players/226/536/23_60.png,Gambia,75,75,0,1590,355,CB,CB,U.C. Sampdoria,4700000,18000,8500000,CB,2025.0,15.0,2018,False,Not in team,,,Left,1,3,2,Medium,High,65,31,48,54,74,83,44,28,79,62,24,44,38,37,52,62,52,75,59,72,44,41,62,76,88,31,84,73,28,32,33,70,66,79,76,8,7,7,14,13,52,49,48,48,48,49,51,54,57,54,68,70,68,71,75,71,19
1430,226536,O. Colley,Omar Colley,29,191,87,https://cdn.sofifa.net/players/226/536/23_60.png,Gambia,75,75,0,1590,355,CB,CB,U.C. Sampdoria,4700000,18000,8500000,CB,2025.0,15.0,2018,False,Not in team,,,Left,1,3,2,Medium,High,65,31,48,54,74,83,44,28,79,62,24,44,38,37,52,62,52,75,59,72,44,41,62,76,88,31,84,73,28,32,33,70,66,79,76,8,7,7,14,13,52,49,48,48,48,49,51,54,57,54,68,70,68,71,75,71,19
1739,227232,F. Sotoca,Florian Sotoca,31,187,77,https://cdn.sofifa.net/players/227/232/23_60.png,France,75,75,0,1948,409,ST,ST,Racing Club de Lens,4800000,31000,9100000,RW,2024.0,7.0,2019,False,Not in team,,,Right,1,3,3,High,High,74,72,69,71,43,80,70,74,77,72,70,69,61,63,65,74,70,77,71,79,52,74,88,86,78,66,73,38,78,71,68,75,34,43,42,13,15,12,11,14,75,73,74,74,74,73,74,75,71,75,64,62,64,62,59,62,22


In [46]:
# Remove all duplicates and check the results
train_df = train_df.drop_duplicates()
test_df = test_df.drop_duplicates()
train_df.duplicated(keep = False).sum()

0
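
To make the two calls above concrete, here is a small sketch on a made-up three-row frame. `duplicated(keep=False)` flags every member of a duplicate group, while `drop_duplicates()` keeps the first copy of each group by default.

```python
import pandas as pd

# Toy frame with one fully duplicated row (values are made up for illustration)
toy = pd.DataFrame({"ID": [1, 1, 2], "Name": ["A", "A", "B"]})

# keep=False marks every member of a duplicate group, not only the later copies
flags = toy.duplicated(keep=False)
print(flags.tolist())

# drop_duplicates keeps the first occurrence of each duplicated row by default
deduped = toy.drop_duplicates()
print(len(deduped))
```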

#### Handling Missing Values

In [47]:
# Check which columns contain missing values
print('Train Dataframe:\n', train_df.isnull().sum()[train_df.isnull().sum() > 0])
print('Test Dataframe:\n', test_df.isnull().sum()[test_df.isnull().sum()> 0])

Train Dataframe:
 ClubPosition           73
ContractUntil          73
ClubNumber             73
NationalPosition    14107
NationalNumber      14107
dtype: int64
Test Dataframe:
 ClubPosition          18
ContractUntil         18
ClubNumber            18
NationalPosition    3549
NationalNumber      3549
dtype: int64


* 'ClubNumber' and 'NationalNumber' are dropped because they have no impact on the target variable.
* 'NationalPosition' has too many null values, so it is dropped as well.
* For 'ClubPosition', I will fill missing values with a new category "NP" (No Position).
* For 'ContractUntil', I will fill missing values with the median.
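
When filling with the median, one option that stays consistent with the leakage concern raised earlier is to compute the median on the training split only and reuse that value for the test split. The sketch below illustrates the idea on made-up toy frames. The frame names and values are assumptions and are not taken from the dataset.

```python
import pandas as pd

# Toy splits with missing contract years (values are made up for illustration)
train_toy = pd.DataFrame({"ContractUntil": [2023.0, 2024.0, None, 2025.0]})
test_toy = pd.DataFrame({"ContractUntil": [None, 2030.0]})

# Learn the fill value from the training split only
train_median = train_toy["ContractUntil"].median()

# Apply the same training-derived value to both splits
train_toy["ContractUntil"] = train_toy["ContractUntil"].fillna(train_median)
test_toy["ContractUntil"] = test_toy["ContractUntil"].fillna(train_median)

print(train_median)
print(test_toy["ContractUntil"].tolist())
```

This way no statistic computed from the test rows enters the preprocessing step.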

In [48]:
# Dropping columns
column_todrop = ['ID', 'Name', 'FullName', 'PhotoUrl', 'ClubNumber', 'NationalNumber', 'NationalPosition']
train_df = train_df.drop(column_todrop, axis=1)
test_df = test_df.drop(column_todrop, axis=1)
cleaner = [train_df, test_df]

### Creating

#### Feature Engineering

In [49]:
# Convert 'ClubJoined' to years at the club and 'ContractUntil' to years left on the contract (relative to 2022)
for dataset in cleaner:
    dataset['ClubJoined'] = 2022 - dataset['ClubJoined']
    dataset['ContractUntil'] = dataset['ContractUntil'] - 2022

In [50]:
train_df[['ClubJoined','ContractUntil']].sample(2)

Unnamed: 0,ClubJoined,ContractUntil
4668,3,2.0
10955,1,3.0
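
After this transformation the two columns keep their original names even though they now hold year counts. A possible variant, sketched below on a toy frame, writes the results to new, descriptive columns and names the reference year once. `REFERENCE_YEAR`, `YearsAtClub`, and `ContractYearsLeft` are hypothetical names introduced here for illustration.

```python
import pandas as pd

REFERENCE_YEAR = 2022  # same reference year as in the transformation above

# Toy rows (values are made up for illustration)
toy = pd.DataFrame({"ClubJoined": [2019, 2021], "ContractUntil": [2024.0, 2025.0]})

# Years already spent at the club and years remaining on the contract
toy["YearsAtClub"] = REFERENCE_YEAR - toy["ClubJoined"]
toy["ContractYearsLeft"] = toy["ContractUntil"] - REFERENCE_YEAR

print(toy[["YearsAtClub", "ContractYearsLeft"]])
```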


### Completing

In [51]:
# Filling Null Values
for dataset in cleaner:
    dataset['ClubPosition'] = dataset['ClubPosition'].fillna('NP')
    dataset['ContractUntil'] = dataset['ContractUntil'].fillna(dataset['ContractUntil'].median())

In [52]:
# Checking Null Values again
train_df.isnull().sum()[train_df.isnull().sum() > 0]

Series([], dtype: int64)