## **Data Cleaning and Transformation - EA Sports FIFA 21 Dataset**

### Table of Contents
- Introduction
- Importing Data
- Understanding Data
- Columns to Clean
- Exporting Clean Data
- Conclusion

## <span style="color: green;">**Introduction**</span>

EA Sports FIFA 21 is a popular video game that simulates soccer matches. Data collected from this game is often messy, with inconsistencies, missing values, and formatting issues. In this project, I clean and prepare raw FIFA 21 data for analysis using Python and pandas.

## <span style="color: green;">**Importing Data**</span>

In [1]:
# Importing libraries
import pandas as pd 
import numpy as np

In [2]:
# Read in the dataset
pd.set_option('display.max_columns', None) # displays all columns in the data frame
data = pd.read_csv('D:/Data Cleaning Project/data/fifa21_raw_data.csv', low_memory=False)
data #Checking data

Unnamed: 0,ID,Name,LongName,photoUrl,playerUrl,Nationality,Age,↓OVA,POT,Club,Contract,Positions,Height,Weight,Preferred Foot,BOV,Best Position,Joined,Loan Date End,Value,Wage,Release Clause,Attacking,Crossing,Finishing,Heading Accuracy,Short Passing,Volleys,Skill,Dribbling,Curve,FK Accuracy,Long Passing,Ball Control,Movement,Acceleration,Sprint Speed,Agility,Reactions,Balance,Power,Shot Power,Jumping,Stamina,Strength,Long Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing Tackle,Sliding Tackle,Goalkeeping,GK Diving,GK Handling,GK Kicking,GK Positioning,GK Reflexes,Total Stats,Base Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
0,158023,L. Messi,Lionel Messi,https://cdn.sofifa.com/players/158/023/21_60.png,http://sofifa.com/player/158023/lionel-messi/2...,Argentina,33,93,93,\n\n\n\nFC Barcelona,2004 ~ 2021,"RW, ST, CF",170cm,72kg,Left,93,RW,"Jul 1, 2004",,€103.5M,€560K,€138.4M,429,85,95,70,91,88,470,96,93,94,91,96,451,91,80,91,94,95,389,86,68,72,69,94,347,44,40,93,95,75,96,91,32,35,24,54,6,11,15,14,8,2231,466,4 ★,4★,Medium,Low,5 ★,85,92,91,95,38,65,771
1,20801,Cristiano Ronaldo,C. Ronaldo dos Santos Aveiro,https://cdn.sofifa.com/players/020/801/21_60.png,http://sofifa.com/player/20801/c-ronaldo-dos-s...,Portugal,35,92,92,\n\n\n\nJuventus,2018 ~ 2022,"ST, LW",187cm,83kg,Right,92,ST,"Jul 10, 2018",,€63M,€220K,€75.9M,437,84,95,90,82,86,414,88,81,76,77,92,431,87,91,87,95,71,444,94,95,84,78,93,353,63,29,95,82,84,95,84,28,32,24,58,7,11,15,14,11,2221,464,4 ★,5★,High,Low,5 ★,89,93,81,89,35,77,562
2,200389,J. Oblak,Jan Oblak,https://cdn.sofifa.com/players/200/389/21_60.png,http://sofifa.com/player/200389/jan-oblak/210006/,Slovenia,27,91,93,\n\n\n\nAtlético Madrid,2014 ~ 2023,GK,188cm,87kg,Right,91,GK,"Jul 16, 2014",,€120M,€125K,€159.4M,95,13,11,15,43,13,109,12,13,14,40,30,307,43,60,67,88,49,268,59,78,41,78,12,140,34,19,11,65,11,68,57,27,12,18,437,87,92,78,90,90,1413,489,3 ★,1★,Medium,Medium,3 ★,87,92,78,90,52,90,150
3,192985,K. De Bruyne,Kevin De Bruyne,https://cdn.sofifa.com/players/192/985/21_60.png,http://sofifa.com/player/192985/kevin-de-bruyn...,Belgium,29,91,91,\n\n\n\nManchester City,2015 ~ 2023,"CAM, CM",181cm,70kg,Right,91,CAM,"Aug 30, 2015",,€129M,€370K,€161M,407,94,82,55,94,82,441,88,85,83,93,92,398,77,76,78,91,76,408,91,63,89,74,91,408,76,66,88,94,84,91,186,68,65,53,56,15,13,5,10,13,2304,485,5 ★,4★,High,High,4 ★,76,86,93,88,64,78,207
4,190871,Neymar Jr,Neymar da Silva Santos Jr.,https://cdn.sofifa.com/players/190/871/21_60.png,http://sofifa.com/player/190871/neymar-da-silv...,Brazil,28,91,91,\n\n\n\nParis Saint-Germain,2017 ~ 2022,"LW, CAM",175cm,68kg,Right,91,LW,"Aug 3, 2017",,€132M,€270K,€166.5M,408,85,87,62,87,87,448,95,88,89,81,95,453,94,89,96,91,83,357,80,62,81,50,84,356,51,36,87,90,92,93,94,35,30,29,59,9,9,15,15,11,2175,451,5 ★,5★,High,Medium,5 ★,91,85,86,94,36,59,595
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
18974,247223,Xia Ao,Ao Xia,https://cdn.sofifa.com/players/247/223/21_60.png,http://sofifa.com/player/247223/ao-xia/210006/,China PR,21,47,55,\n\n\n\nWuhan Zall,2018 ~ 2022,CB,178cm,66kg,Right,49,CB,"Jul 13, 2018",,€100K,€1K,€70K,145,23,26,43,26,27,142,27,23,21,29,42,294,68,60,69,46,51,221,36,57,54,50,24,192,48,50,28,28,38,44,147,45,52,50,45,7,8,5,14,11,1186,255,2 ★,2★,Medium,Medium,1 ★,64,28,26,38,48,51,
18975,258760,B. Hough,Ben Hough,https://cdn.sofifa.com/players/258/760/21_60.png,http://sofifa.com/player/258760/ben-hough/210006/,England,17,47,67,\n\n\n\nOldham Athletic,2020 ~ 2021,CM,175cm,65kg,Right,51,CAM,"Aug 1, 2020",,€130K,€500,€165K,211,38,42,40,56,35,219,46,40,35,50,48,305,63,64,61,51,66,226,48,58,43,47,30,193,40,23,47,47,36,38,116,32,44,40,45,12,10,9,6,8,1315,281,2 ★,2★,Medium,Medium,1 ★,64,40,48,49,35,45,
18976,252757,R. McKinley,Ronan McKinley,https://cdn.sofifa.com/players/252/757/21_60.png,http://sofifa.com/player/252757/ronan-mckinley...,England,18,47,65,\n\n\n\nDerry City,2019 ~ 2020,CM,179cm,74kg,Right,49,CAM,"Mar 8, 2019",,€120K,€500,€131K,200,30,34,43,54,39,207,43,39,31,47,47,290,59,66,51,47,67,242,45,52,50,54,41,230,56,42,47,43,42,43,121,33,43,45,48,13,12,6,6,11,1338,285,2 ★,2★,Medium,Medium,1 ★,63,39,44,46,40,53,
18977,243790,Wang Zhen'ao,Zhen'ao Wang,https://cdn.sofifa.com/players/243/790/21_60.png,http://sofifa.com/player/243790/zhenao-wang/21...,China PR,20,47,57,\n\n\n\nDalian YiFang FC,2020 ~ 2022,RW,175cm,69kg,Right,48,ST,"Sep 22, 2020",,€100K,€2K,€88K,215,45,52,34,42,42,194,51,35,31,31,46,254,62,55,50,33,54,235,56,45,46,48,40,190,31,25,42,46,46,45,100,26,32,42,55,14,12,9,8,12,1243,271,3 ★,2★,Medium,Medium,1 ★,58,49,41,49,30,44,


## <span style="color: green;">**Understanding Data**</span>

In [3]:
data.columns

Index(['ID', 'Name', 'LongName', 'photoUrl', 'playerUrl', 'Nationality', 'Age',
       '↓OVA', 'POT', 'Club', 'Contract', 'Positions', 'Height', 'Weight',
       'Preferred Foot', 'BOV', 'Best Position', 'Joined', 'Loan Date End',
       'Value', 'Wage', 'Release Clause', 'Attacking', 'Crossing', 'Finishing',
       'Heading Accuracy', 'Short Passing', 'Volleys', 'Skill', 'Dribbling',
       'Curve', 'FK Accuracy', 'Long Passing', 'Ball Control', 'Movement',
       'Acceleration', 'Sprint Speed', 'Agility', 'Reactions', 'Balance',
       'Power', 'Shot Power', 'Jumping', 'Stamina', 'Strength', 'Long Shots',
       'Mentality', 'Aggression', 'Interceptions', 'Positioning', 'Vision',
       'Penalties', 'Composure', 'Defending', 'Marking', 'Standing Tackle',
       'Sliding Tackle', 'Goalkeeping', 'GK Diving', 'GK Handling',
       'GK Kicking', 'GK Positioning', 'GK Reflexes', 'Total Stats',
       'Base Stats', 'W/F', 'SM', 'A/W', 'D/W', 'IR', 'PAC', 'SHO', 'PAS',
       'DRI', 'DEF', 

In [4]:
data.info() # View data frame information

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18979 entries, 0 to 18978
Data columns (total 77 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   ID                18979 non-null  int64 
 1   Name              18979 non-null  object
 2   LongName          18979 non-null  object
 3   photoUrl          18979 non-null  object
 4   playerUrl         18979 non-null  object
 5   Nationality       18979 non-null  object
 6   Age               18979 non-null  int64 
 7   ↓OVA              18979 non-null  int64 
 8   POT               18979 non-null  int64 
 9   Club              18979 non-null  object
 10  Contract          18979 non-null  object
 11  Positions         18979 non-null  object
 12  Height            18979 non-null  object
 13  Weight            18979 non-null  object
 14  Preferred Foot    18979 non-null  object
 15  BOV               18979 non-null  int64 
 16  Best Position     18979 non-null  object
 17  Joined      

In [5]:
# Replacing spaces in column names with underscores
data.columns=data.columns.str.replace(' ','_')

# Removing photoUrl and playerUrl columns, as these are not relevant for any analysis.
data.drop(['photoUrl', 'playerUrl'], axis=1, inplace=True)

# Sample data of objects columns
data.select_dtypes(include=['object']).sample(5)

Unnamed: 0,Name,LongName,Nationality,Club,Contract,Positions,Height,Weight,Preferred_Foot,Best_Position,Joined,Loan_Date_End,Value,Wage,Release_Clause,W/F,SM,A/W,D/W,IR,Hits
18341,P. Johansen,Peder Meen Johansen,Norway,\n\n\n\nSandefjord Fotball,2020 ~ 2023,CM,167cm,60kg,Right,CAM,"Jun 28, 2020",,€300K,€500,€293K,3 ★,3★,Medium,Medium,1 ★,
18941,J. Arthur,Jack Arthur,England,\n\n\n\nExeter City,2020 ~ 2021,GK,175cm,70kg,Right,GK,"Jul 1, 2020",,€100K,€500,€119K,3 ★,1★,Medium,Medium,1 ★,
6532,Jeong Woo Yeong,Woo Yeong Jeong,Korea Republic,\n\n\n\nSC Freiburg,2019 ~ 2023,"RM, LM, CAM",179cm,69kg,Right,RM,"Jul 1, 2019",,€2.9M,€7K,€3.2M,4 ★,4★,Medium,Medium,1 ★,50.0
18357,C. Finch,Cristóbal Finch,Chile,\n\n\n\nUniversidad Católica,2020 ~ 2022,CB,182cm,77kg,Right,CB,"Jan 1, 2020",,€240K,€500,€264K,3 ★,2★,Medium,Medium,1 ★,
3977,I. Franco,Iván Franco,Paraguay,\n\n\n\nClub Libertad,2018 ~ 2023,CF,165cm,63kg,Right,CF,"Jul 1, 2018",,€4.1M,€700,€8.9M,3 ★,4★,High,Medium,1 ★,22.0


## <span style="color: green;">**Columns to Clean**</span>

- Club
- Contract
- Height
- Weight
- Joined
- Loan_Date_End
- Value
- Wage
- Release_Clause
- W/F
- SM
- IR
- Hits

### <span style="color: green;">Create a copy dataframe</span>

Create a copy so that edits do not affect the original data frame.

In [6]:
fifa = data.copy() 
fifa.sample(3) 

Unnamed: 0,ID,Name,LongName,Nationality,Age,↓OVA,POT,Club,Contract,Positions,Height,Weight,Preferred_Foot,BOV,Best_Position,Joined,Loan_Date_End,Value,Wage,Release_Clause,Attacking,Crossing,Finishing,Heading_Accuracy,Short_Passing,Volleys,Skill,Dribbling,Curve,FK_Accuracy,Long_Passing,Ball_Control,Movement,Acceleration,Sprint_Speed,Agility,Reactions,Balance,Power,Shot_Power,Jumping,Stamina,Strength,Long_Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing_Tackle,Sliding_Tackle,Goalkeeping,GK_Diving,GK_Handling,GK_Kicking,GK_Positioning,GK_Reflexes,Total_Stats,Base_Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
6717,210527,P. Paye,Pape Paye,France,30,68,68,\n\n\n\nFC Sochaux-Montbéliard,2019 ~ 2021,RB,169cm,65kg,Right,68,RB,"Jul 12, 2019",,€1M,€3K,€1.2M,233,54,31,49,65,34,241,45,33,42,56,65,390,80,85,76,65,84,294,52,66,80,63,33,280,70,66,53,44,47,60,188,57,66,65,58,11,11,10,13,13,1684,364,3 ★,2★,Medium,High,1 ★,83,38,55,57,62,69,1
14979,257076,L. Morante,Leandro Morante,France,23,60,68,\n\n\n\nLa Berrichonne de Châteauroux,2020 ~ 2022,CB,194cm,88kg,Right,62,CB,"Jun 24, 2020",,€500K,€900,€536K,190,32,12,66,57,23,177,24,35,23,49,46,240,54,54,48,50,34,276,55,67,56,78,20,210,58,59,20,41,32,44,169,53,62,54,50,6,14,10,6,14,1312,284,3 ★,2★,Medium,Medium,1 ★,54,24,45,35,58,68,2
13133,224793,B. Wembangomo,Brice Wembangomo,Norway,23,62,70,\n\n\n\nSandefjord Fotball,2019 ~ 2021,RB,181cm,75kg,Right,63,RWB,"Jan 18, 2019",,€850K,€700,€580K,244,56,45,48,57,38,267,64,54,40,46,63,366,81,83,74,53,75,284,45,73,66,58,42,252,63,56,45,48,40,49,170,54,59,57,61,16,15,14,9,7,1644,361,3 ★,3★,Medium,Medium,1 ★,82,44,52,65,56,62,3


### <span style="color: green;">Column: Club (Cleaning String)</span> 

Every value in the Club column starts with newline characters (\n\n\n\n).
The solution is to remove this leading whitespace with str.strip().
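
As a minimal sketch of what str.strip() does here, the snippet below uses a small hypothetical Series (not the real dataset) whose values mimic the raw Club strings:

```python
import pandas as pd

# Hypothetical sample mimicking the raw Club values
clubs = pd.Series(['\n\n\n\nFC Barcelona', '\n\n\n\nJuventus'])

# With no arguments, str.strip() removes leading and trailing whitespace,
# which includes newline characters
cleaned = clubs.str.strip()
print(cleaned.tolist())
```

Spaces inside a name, such as the one in 'FC Barcelona', are left untouched because only the ends of each string are trimmed.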

View and check data types:

In [7]:
fifa['Club'].dtype

dtype('O')

In [8]:
fifa['Club'].unique()

array(['\n\n\n\nFC Barcelona', '\n\n\n\nJuventus',
       '\n\n\n\nAtlético Madrid', '\n\n\n\nManchester City',
       '\n\n\n\nParis Saint-Germain', '\n\n\n\nFC Bayern München',
       '\n\n\n\nLiverpool', '\n\n\n\nReal Madrid', '\n\n\n\nChelsea',
       '\n\n\n\nTottenham Hotspur', '\n\n\n\nInter', '\n\n\n\nNapoli',
       '\n\n\n\nBorussia Dortmund', '\n\n\n\nManchester United',
       '\n\n\n\nArsenal', '\n\n\n\nLazio', '\n\n\n\nLeicester City',
       '\n\n\n\nBorussia Mönchengladbach', '\n\n\n\nReal Sociedad',
       '\n\n\n\nAtalanta', '\n\n\n\nOlympique Lyonnais', '\n\n\n\nMilan',
       '\n\n\n\nVillarreal CF', '\n\n\n\nRB Leipzig', '\n\n\n\nCagliari',
       '\n\n\n\nAjax', '\n\n\n\nSL Benfica', '\n\n\n\nAS Monaco',
       '\n\n\n\nWolverhampton Wanderers', '\n\n\n\nEverton',
       '\n\n\n\nFiorentina', '\n\n\n\nFC Porto', '\n\n\n\nRC Celta',
       '\n\n\n\nTorino', '\n\n\n\nSevilla FC', '\n\n\n\nGrêmio',
       '\n\n\n\nReal Betis', '\n\n\n\nRoma', '\n\n\n\nNewcastle Unite

Strip the whitespace from the Club column:

In [9]:
fifa['Club'] = fifa['Club'].str.strip()
fifa['Club'].unique()

array(['FC Barcelona', 'Juventus', 'Atlético Madrid', 'Manchester City',
       'Paris Saint-Germain', 'FC Bayern München', 'Liverpool',
       'Real Madrid', 'Chelsea', 'Tottenham Hotspur', 'Inter', 'Napoli',
       'Borussia Dortmund', 'Manchester United', 'Arsenal', 'Lazio',
       'Leicester City', 'Borussia Mönchengladbach', 'Real Sociedad',
       'Atalanta', 'Olympique Lyonnais', 'Milan', 'Villarreal CF',
       'RB Leipzig', 'Cagliari', 'Ajax', 'SL Benfica', 'AS Monaco',
       'Wolverhampton Wanderers', 'Everton', 'Fiorentina', 'FC Porto',
       'RC Celta', 'Torino', 'Sevilla FC', 'Grêmio', 'Real Betis', 'Roma',
       'Newcastle United', 'Eintracht Frankfurt', 'Valencia CF',
       'Medipol Başakşehir FK', 'Inter Miami', 'Bayer 04 Leverkusen',
       'Levante UD', 'Crystal Palace', 'Athletic Club de Bilbao',
       'Shanghai SIPG FC', 'VfL Wolfsburg',
       'Guangzhou Evergrande Taobao FC', 'Al Shabab',
       'Olympique de Marseille', 'Los Angeles FC',
       'Beijing Sino

### <span style="color: green;">Column: Contract (Cleaning/Transforming Date Column)</span>

View and check data types:

In [10]:
fifa['Contract'].dtype

dtype('O')

In [11]:
fifa['Contract'].unique()

array(['2004 ~ 2021', '2018 ~ 2022', '2014 ~ 2023', '2015 ~ 2023',
       '2017 ~ 2022', '2017 ~ 2023', '2018 ~ 2024', '2014 ~ 2022',
       '2018 ~ 2023', '2016 ~ 2023', '2013 ~ 2023', '2011 ~ 2023',
       '2009 ~ 2022', '2005 ~ 2021', '2011 ~ 2021', '2015 ~ 2022',
       '2017 ~ 2024', '2010 ~ 2024', '2012 ~ 2021', '2019 ~ 2024',
       '2015 ~ 2024', '2017 ~ 2025', '2020 ~ 2025', '2019 ~ 2023',
       '2008 ~ 2023', '2015 ~ 2021', '2020 ~ 2022', '2012 ~ 2022',
       '2016 ~ 2025', '2013 ~ 2022', '2011 ~ 2022', '2012 ~ 2024',
       '2016 ~ 2021', '2012 ~ 2023', '2008 ~ 2022', '2019 ~ 2022',
       '2017 ~ 2021', '2013 ~ 2024', '2020 ~ 2024', '2010 ~ 2022',
       '2020 ~ 2021', '2011 ~ 2024', '2020 ~ 2023', '2014 ~ 2024',
       '2013 ~ 2026', '2016 ~ 2022', '2010 ~ 2021', '2013 ~ 2021',
       '2019 ~ 2025', '2018 ~ 2025', '2016 ~ 2024', '2018 ~ 2021',
       '2009 ~ 2024', '2007 ~ 2022', 'Jun 30, 2021 On Loan',
       '2009 ~ 2021', '2019 ~ 2021', '2019 ~ 2026', 'Free', '2012 ~ 

Loop over the rows with DataFrame.iterrows() and print every Contract value that contains 'On Loan' or 'Free':

In [12]:
for index, row in fifa.iterrows():
    if 'On Loan' in row['Contract'] or 'Free' in row['Contract']:
        print(row['Contract'])

Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Free
Free
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Free
Free
Free
Free
Free
Free
Free
Free
Free
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Free
Free
Free
Free
Jun 30, 2021 On Loan
Dec 31, 2020 On Loan
Jun 30, 2021 On Loan
Free
Jun 30, 2021 On Loan
Free
Free
Free
Free
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Free
Jun 30, 2021 On Loan
Free
Jun 30, 2021 On Loan
Jan 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Free
Dec 31, 2020 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Free
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Jun 30, 2021 On Loan
Free
Jun 30, 2021 On Loan
Free
Free
Free
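
The same check can likely be done without a row loop. The sketch below assumes a small hypothetical Series of contract strings and uses a boolean mask from str.contains instead of iterrows():

```python
import pandas as pd

# Hypothetical sample of Contract values
contracts = pd.Series(['2004 ~ 2021', 'Jun 30, 2021 On Loan', 'Free', '2018 ~ 2022'])

# str.contains treats the pattern as a regular expression by default,
# so 'On Loan|Free' matches either phrase
mask = contracts.str.contains('On Loan|Free')
special = contracts[mask]
print(special.tolist())
```

On the full data frame, value_counts() on the filtered column would summarise the result instead of printing one line per player.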

Create the extract_contract_info function, then apply it to create three new columns: 'Contract Start', 'Contract End', and 'Contract Length(years)':

In [13]:
def extract_contract_info(contract):
    if contract == 'Free' or 'On Loan' in contract:
        start_date = np.nan
        end_date = np.nan
        contract_length = 0
    else:
        start_date, end_date = contract.split(' ~ ')
        start_year = int(start_date[:4])
        end_year = int(end_date[:4])
        contract_length = end_year - start_year
    return start_date, end_date, contract_length

#Apply fn to Contract column & create new columns

new_cols = ['Contract Start', 'Contract End', 'Contract Length(years)']
new_data = fifa['Contract'].apply(lambda x: pd.Series(extract_contract_info(x)))

for i in range(len(new_cols)):
    try:
        fifa.insert(loc=fifa.columns.get_loc('Contract')+1+i, column=new_cols[i], value=new_data[i])
    except ValueError as e:
        print(f"Error inserting column '{new_cols[i]}': {str(e)}")
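
An alternative sketch of the same extraction, using str.extract with a regular expression on a hypothetical sample. Note one difference from the function above, chosen for this sketch only: rows that are 'Free' or 'On Loan' get a missing length (NaN) rather than 0.

```python
import pandas as pd

# Hypothetical sample of Contract values
contracts = pd.Series(['2004 ~ 2021', 'Jun 30, 2021 On Loan', 'Free'])

# Capture the two four-digit years; rows that do not match become NaN
years = contracts.str.extract(r'^(\d{4}) ~ (\d{4})$')
start = pd.to_numeric(years[0])
end = pd.to_numeric(years[1])

# Arithmetic on NaN stays NaN, so non-matching rows have no length
length = end - start
print(length.tolist())
```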

In [14]:
fifa.sample(3)

Unnamed: 0,ID,Name,LongName,Nationality,Age,↓OVA,POT,Club,Contract,Contract Start,Contract End,Contract Length(years),Positions,Height,Weight,Preferred_Foot,BOV,Best_Position,Joined,Loan_Date_End,Value,Wage,Release_Clause,Attacking,Crossing,Finishing,Heading_Accuracy,Short_Passing,Volleys,Skill,Dribbling,Curve,FK_Accuracy,Long_Passing,Ball_Control,Movement,Acceleration,Sprint_Speed,Agility,Reactions,Balance,Power,Shot_Power,Jumping,Stamina,Strength,Long_Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing_Tackle,Sliding_Tackle,Goalkeeping,GK_Diving,GK_Handling,GK_Kicking,GK_Positioning,GK_Reflexes,Total_Stats,Base_Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
17835,242375,A. Eldeen Andejani,Ahmed Emad Eldeen Andejani,Saudi Arabia,27,55,57,Al Ittihad,2020 ~ 2023,2020,2023,3.0,"CB, LB",187cm,72kg,Left,57,CB,"Feb 5, 2020",,€130K,€5K,€142K,198,36,34,55,43,30,168,35,31,24,39,39,269,50,54,46,53,66,262,37,73,46,67,39,202,52,54,27,33,36,41,167,56,56,55,49,13,5,13,10,8,1315,278,2 ★,2★,Low,Medium,1 ★,52,35,37,40,55,59,
18535,242211,H. Woods,Henry Woods,England,20,52,63,Gillingham,2018 ~ 2021,2018,2021,3.0,"CM, CAM, RB",187cm,73kg,Right,56,RM,"Jan 16, 2018",,€170K,€1K,€176K,224,49,40,49,55,31,243,54,58,37,46,48,313,69,67,51,56,70,333,70,69,67,68,59,245,56,48,57,49,35,40,143,55,47,41,55,10,12,7,12,14,1556,336,3 ★,2★,High,High,1 ★,68,50,51,53,49,65,
5866,168317,H. Goitom,Henok Goitom,Eritrea,35,69,69,AIK,2017 ~ 2020,2017,2020,3.0,ST,189cm,85kg,Right,69,ST,"Mar 7, 2017",,€600K,€4K,€594K,330,55,67,68,67,73,322,68,63,58,61,72,267,49,51,52,70,45,322,70,48,55,89,60,328,63,49,78,73,65,76,109,55,34,20,57,16,10,14,6,11,1735,366,2 ★,3★,Medium,Medium,1 ★,50,67,64,67,45,73,2.0


View data with 3 columns 'Contract', 'Contract Start', 'Contract Length(years)':

In [15]:
fifa[['Contract', 'Contract Start', 'Contract Length(years)']].sample(5)

Unnamed: 0,Contract,Contract Start,Contract Length(years)
5726,2016 ~ 2021,2016,5.0
2239,2018 ~ 2023,2018,5.0
14695,2020 ~ 2021,2020,1.0
12410,2020 ~ 2023,2020,3.0
16515,2020 ~ 2022,2020,2.0


Contract categories:

In [16]:
def categorize_contract_status(contract):
    if contract == 'Free':
        return 'Free'
    elif 'On Loan' in contract:
        return 'On Loan'
    else:
        return 'Contract'

#Add contract status column
fifa.insert(fifa.columns.get_loc('Contract Length(years)')+1, 'Contract Status', fifa['Contract'].apply(categorize_contract_status))
fifa.sample(3)

Unnamed: 0,ID,Name,LongName,Nationality,Age,↓OVA,POT,Club,Contract,Contract Start,Contract End,Contract Length(years),Contract Status,Positions,Height,Weight,Preferred_Foot,BOV,Best_Position,Joined,Loan_Date_End,Value,Wage,Release_Clause,Attacking,Crossing,Finishing,Heading_Accuracy,Short_Passing,Volleys,Skill,Dribbling,Curve,FK_Accuracy,Long_Passing,Ball_Control,Movement,Acceleration,Sprint_Speed,Agility,Reactions,Balance,Power,Shot_Power,Jumping,Stamina,Strength,Long_Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing_Tackle,Sliding_Tackle,Goalkeeping,GK_Diving,GK_Handling,GK_Kicking,GK_Positioning,GK_Reflexes,Total_Stats,Base_Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
10849,242434,C. Jones,Curtis Jones,England,19,64,84,Liverpool,2018 ~ 2025,2018,2025,7.0,Contract,"CM, CAM, LM",185cm,68kg,Right,69,CAM,"Feb 1, 2018",,€1.6M,€9K,€2.7M,285,54,60,50,68,53,310,71,61,47,61,70,337,67,65,74,61,70,309,66,57,64,57,65,301,65,44,62,70,60,64,146,47,52,47,44,6,13,7,11,7,1732,369,4 ★,4★,High,Medium,1 ★,66,62,63,70,48,60,227
4564,199689,A. Sepúlveda,Ángel Sepúlveda,Mexico,29,70,70,Querétaro,2020 ~ 2024,2020,2024,4.0,Contract,"ST, LM, RM",180cm,75kg,Right,70,ST,"Jul 1, 2020",,€1.6M,€8K,€3M,335,67,65,70,68,65,333,68,66,65,63,71,357,76,78,76,69,58,365,71,79,73,77,65,303,54,48,69,66,66,66,109,30,41,38,64,16,14,11,12,11,1866,392,3 ★,3★,High,High,1 ★,77,66,66,69,42,72,6
13631,250786,B. Jackson,Ben Jackson,England,19,62,77,Huddersfield Town,2019 ~ 2022,2019,2022,3.0,Contract,"LB, RB",178cm,73kg,Left,62,LB,"Jul 1, 2019",,€900K,€2K,€1.5M,216,58,33,52,41,32,220,64,37,41,29,49,329,82,82,65,54,46,235,34,64,55,46,36,251,58,58,50,40,45,45,191,56,70,65,54,14,11,10,10,9,1496,330,3 ★,2★,High,Medium,1 ★,82,35,42,58,61,52,8


View data with 4 columns 'Contract', 'Contract Start', 'Contract Length(years)', 'Contract Status':

In [17]:
fifa[['Contract', 'Contract Start', 'Contract Length(years)', 'Contract Status']].sample(5)

Unnamed: 0,Contract,Contract Start,Contract Length(years),Contract Status
12386,2019 ~ 2022,2019,3.0,Contract
10892,2019 ~ 2024,2019,5.0,Contract
12038,2018 ~ 2024,2018,6.0,Contract
17041,2020 ~ 2024,2020,4.0,Contract
1456,2017 ~ 2020,2017,3.0,Contract
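
The categorisation can also be sketched with numpy.select, which picks the first matching condition per row. The sample values are hypothetical, and the labels match those returned by categorize_contract_status:

```python
import numpy as np
import pandas as pd

# Hypothetical sample of Contract values
contracts = pd.Series(['2004 ~ 2021', 'Jun 30, 2021 On Loan', 'Free'])

# Conditions are checked in order; the first True one decides the label
conditions = [
    contracts.eq('Free').to_numpy(),
    contracts.str.contains('On Loan', regex=False).to_numpy(),
]
labels = ['Free', 'On Loan']

status = pd.Series(np.select(conditions, labels, default='Contract'),
                   index=contracts.index)
print(status.tolist())
```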


### <span style="color: green;">Column: Joined (Transforming Date Column)</span>

In [18]:
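# Converting Joined and Loan_Date_End columns from object to datetime.
# Note: these calls modify data (the original frame), not the working copy fifa,
# so fifa keeps both columns as strings, as the later outputs show.
data['Joined']=pd.to_datetime(data['Joined'])
data['Loan_Date_End']=pd.to_datetime(data['Loan_Date_End'])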
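
A minimal sketch of the same conversion with an explicit format string, applied to a small hypothetical frame. The format '%b %d, %Y' is an assumption based on values such as 'Jul 1, 2004' and relies on English month abbreviations:

```python
import pandas as pd

# Hypothetical frame with the same date style as the dataset
frame = pd.DataFrame({
    'Joined': ['Jul 1, 2004', 'Aug 30, 2015'],
    'Loan_Date_End': [None, 'Jun 30, 2021'],
})

# An explicit format avoids per-value format guessing; missing values become NaT
for col in ['Joined', 'Loan_Date_End']:
    frame[col] = pd.to_datetime(frame[col], format='%b %d, %Y')

print(frame.dtypes)
```

Running the same loop on a working copy would convert the columns that the later cleaning steps actually use.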

### <span style="color: green;">Column: Height</span>

View and check data types

In [19]:
fifa['Height'].dtype

dtype('O')

In [20]:
fifa['Height'].unique()

array(['170cm', '187cm', '188cm', '181cm', '175cm', '184cm', '191cm',
       '178cm', '193cm', '185cm', '199cm', '173cm', '168cm', '176cm',
       '177cm', '183cm', '180cm', '189cm', '179cm', '195cm', '172cm',
       '182cm', '186cm', '192cm', '165cm', '194cm', '167cm', '196cm',
       '163cm', '190cm', '174cm', '169cm', '171cm', '197cm', '200cm',
       '166cm', '6\'2"', '164cm', '198cm', '6\'3"', '6\'5"', '5\'11"',
       '6\'4"', '6\'1"', '6\'0"', '5\'10"', '5\'9"', '5\'6"', '5\'7"',
       '5\'4"', '201cm', '158cm', '162cm', '161cm', '160cm', '203cm',
       '157cm', '156cm', '202cm', '159cm', '206cm', '155cm'], dtype=object)

Create a function to convert the data to the player's height in centimeters.

Use strip("cm") to remove the "cm" suffix. For values in feet and inches, split("'") divides the height into feet and inches, which are then converted to centimeters (1 inch = 2.54 cm).

In [21]:
def convert_height(height):
    if "cm" in height:
        return int(height.strip("cm"))
    else:
        feet, inches = height.split("'")
        total_inches = int(feet)*12 + int(inches.strip('"'))
        return round(total_inches * 2.54) 
    
# Apply fn to height column
fifa['Height'] = fifa['Height'].apply(convert_height)

In [22]:
fifa['Height'].unique()

array([170, 187, 188, 181, 175, 184, 191, 178, 193, 185, 199, 173, 168,
       176, 177, 183, 180, 189, 179, 195, 172, 182, 186, 192, 165, 194,
       167, 196, 163, 190, 174, 169, 171, 197, 200, 166, 164, 198, 201,
       158, 162, 161, 160, 203, 157, 156, 202, 159, 206, 155], dtype=int64)
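
To check the arithmetic, here is a small standalone sketch of the conversion. The name height_to_cm is hypothetical. It slices off the suffix instead of calling strip("cm"), since strip removes any leading or trailing 'c' and 'm' characters rather than the exact suffix; the result is the same for these values.

```python
def height_to_cm(height: str) -> int:
    """Convert '170cm' or a feet/inches string such as 6'2" to whole centimeters."""
    if height.endswith('cm'):
        return int(height[:-2])
    # Drop the trailing inch mark, then split feet from inches
    feet, inches = height.rstrip('"').split("'")
    total_inches = int(feet) * 12 + int(inches)
    return round(total_inches * 2.54)


# 6'2" is 74 inches, and 74 * 2.54 = 187.96, which rounds to 188
print(height_to_cm('6\'2"'))
```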

Rename the Height column to Height(cm)

In [23]:
fifa = fifa.rename(columns={'Height':"Height(cm)"})
fifa.sample(3)

Unnamed: 0,ID,Name,LongName,Nationality,Age,↓OVA,POT,Club,Contract,Contract Start,Contract End,Contract Length(years),Contract Status,Positions,Height(cm),Weight,Preferred_Foot,BOV,Best_Position,Joined,Loan_Date_End,Value,Wage,Release_Clause,Attacking,Crossing,Finishing,Heading_Accuracy,Short_Passing,Volleys,Skill,Dribbling,Curve,FK_Accuracy,Long_Passing,Ball_Control,Movement,Acceleration,Sprint_Speed,Agility,Reactions,Balance,Power,Shot_Power,Jumping,Stamina,Strength,Long_Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing_Tackle,Sliding_Tackle,Goalkeeping,GK_Diving,GK_Handling,GK_Kicking,GK_Positioning,GK_Reflexes,Total_Stats,Base_Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
16682,252374,N. Ntolla Thio,Natanael Ntolla Thio,France,21,58,70,FC Sochaux-Montbéliard,2019 ~ 2023,2019.0,2023.0,4.0,Contract,ST,185,71kg,Left,60,ST,"Aug 2, 2019",,€500K,€950,€449K,246,30,68,59,48,41,221,65,38,33,34,51,311,67,61,62,57,64,281,57,64,55,46,59,222,40,11,55,48,68,52,54,16,18,20,47,6,9,10,11,11,1382,295,2 ★,2★,Medium,Medium,1 ★,64,62,41,60,20,48,
3577,224428,J. Otero,Juan Ferney Otero,Colombia,25,72,75,Amiens SC,2018 ~ 2021,2018.0,2021.0,3.0,Contract,"RM, ST, RW",182,70kg,Right,75,ST,"Jul 1, 2018",,€3.1M,€5K,€7.3M,322,60,70,65,63,64,337,72,51,75,69,70,384,88,88,78,62,68,391,87,79,74,77,74,274,36,35,73,69,61,73,76,21,30,25,59,14,8,15,12,10,1843,397,3 ★,3★,Medium,Low,1 ★,88,74,65,71,31,68,17.0
9935,247883,S. Torres,Saúl Torres,Bolivia,30,65,65,No Club,Free,,,0.0,Free,RB,180,68kg,Right,65,RB,"Jul 5, 2019",,€0,€0,€0,228,68,28,45,59,28,220,56,30,31,63,40,322,70,71,54,51,76,268,25,64,87,64,28,234,60,61,46,32,35,57,190,55,68,67,55,13,9,15,10,8,1517,334,3 ★,2★,High,Medium,1 ★,71,29,53,52,60,69,7.0


### <span style="color: green;">Column: Weight</span>

View and check data types

In [24]:
fifa['Weight'].dtype

dtype('O')

In [25]:
fifa['Weight'].unique()

array(['72kg', '83kg', '87kg', '70kg', '68kg', '80kg', '71kg', '91kg',
       '73kg', '85kg', '92kg', '69kg', '84kg', '96kg', '81kg', '82kg',
       '75kg', '86kg', '89kg', '74kg', '76kg', '64kg', '78kg', '90kg',
       '66kg', '60kg', '94kg', '79kg', '67kg', '65kg', '59kg', '61kg',
       '93kg', '88kg', '97kg', '77kg', '62kg', '63kg', '95kg', '100kg',
       '58kg', '183lbs', '179lbs', '172lbs', '196lbs', '176lbs', '185lbs',
       '170lbs', '203lbs', '168lbs', '161lbs', '146lbs', '130lbs',
       '190lbs', '174lbs', '148lbs', '165lbs', '159lbs', '192lbs',
       '181lbs', '139lbs', '154lbs', '157lbs', '163lbs', '98kg', '103kg',
       '99kg', '102kg', '56kg', '101kg', '57kg', '55kg', '104kg', '107kg',
       '110kg', '53kg', '50kg', '54kg', '52kg'], dtype=object)

Create a function to convert the data to the player's weight in kilograms.

Use strip("kg") to remove the "kg" suffix and strip("lbs") to remove the "lbs" suffix. Every value is converted to an integer so that the column has a single type, and pounds are converted using 1 kg ≈ 2.20462 lbs.

In [26]:
def convert_weight(weight):
    if "kg" in weight:
        # Cast to int so kg values are not left as strings
        return int(weight.strip("kg"))
    else:
        total_kg = int(weight.strip("lbs"))/2.20462
        return round(total_kg)
#Apply fn to Weight
fifa['Weight'] = fifa['Weight'].apply(convert_weight)
fifa['Weight'].unique()

array([ 72,  83,  87,  70,  68,  80,  71,  91,  73,  85,  92,  69,  84,
        96,  81,  82,  75,  86,  89,  74,  76,  64,  78,  90,  66,  60,
        94,  79,  67,  65,  59,  61,  93,  88,  97,  77,  62,  63,  95,
       100,  58,  98, 103,  99, 102,  56, 101,  57,  55, 104, 107, 110,
        53,  50,  54,  52], dtype=int64)
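
A vectorised sketch of the weight conversion on a hypothetical sample, followed by a check that the result holds a single integer dtype. The constant LBS_PER_KG is a name introduced here for illustration:

```python
import numpy as np
import pandas as pd

LBS_PER_KG = 2.20462  # pounds in one kilogram

# Hypothetical sample of raw Weight values
weights = pd.Series(['72kg', '183lbs', '130lbs'])

# Pull out the digits, then convert only the rows given in pounds
number = weights.str.extract(r'(\d+)')[0].astype(int)
is_lbs = weights.str.endswith('lbs').to_numpy()
kg = pd.Series(
    np.where(is_lbs, (number / LBS_PER_KG).round(), number),
    index=weights.index,
).astype(int)

print(kg.tolist())
print(pd.api.types.is_integer_dtype(kg))
```

A column that mixes strings and integers keeps the object dtype, so checking the dtype after conversion is a quick way to catch a branch that forgot to cast.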

Rename the Weight column to Weight(kg)

In [27]:
fifa = fifa.rename(columns={'Weight': 'Weight(kg)'})
fifa.sample(1)

Unnamed: 0,ID,Name,LongName,Nationality,Age,↓OVA,POT,Club,Contract,Contract Start,Contract End,Contract Length(years),Contract Status,Positions,Height(cm),Weight(kg),Preferred_Foot,BOV,Best_Position,Joined,Loan_Date_End,Value,Wage,Release_Clause,Attacking,Crossing,Finishing,Heading_Accuracy,Short_Passing,Volleys,Skill,Dribbling,Curve,FK_Accuracy,Long_Passing,Ball_Control,Movement,Acceleration,Sprint_Speed,Agility,Reactions,Balance,Power,Shot_Power,Jumping,Stamina,Strength,Long_Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing_Tackle,Sliding_Tackle,Goalkeeping,GK_Diving,GK_Handling,GK_Kicking,GK_Positioning,GK_Reflexes,Total_Stats,Base_Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
17636,256088,M. Carcelén,Michael Carcelén,Ecuador,23,55,63,El Nacional,2020 ~ 2024,2020,2024,4.0,Contract,CM,175,75,Right,57,CM,"Jan 1, 2020",,€275K,€500,€282K,229,38,49,48,64,30,233,41,37,39,58,58,316,68,63,58,55,72,288,47,62,70,54,55,233,50,49,47,48,39,36,117,35,36,46,53,6,6,14,14,13,1469,314,3 ★,2★,Medium,High,1 ★,65,48,52,50,41,58,


### <span style="color: green;">Column: Loan Date End (Cleaning/Transforming Date Column)</span>

View and check data types:

In [28]:
fifa['Loan_Date_End'].dtype

dtype('O')

In [29]:
fifa['Loan_Date_End'].unique()

array([nan, 'Jun 30, 2021', 'Dec 31, 2020', 'Jan 30, 2021',
       'Jun 30, 2022', 'May 31, 2021', 'Jul 5, 2021', 'Dec 31, 2021',
       'Jul 1, 2021', 'Jan 1, 2021', 'Aug 31, 2021', 'Jan 31, 2021',
       'Dec 30, 2021', 'Jun 23, 2021', 'Jan 3, 2021', 'Nov 27, 2021',
       'Jan 17, 2021', 'Jun 30, 2023', 'Jul 31, 2021', 'Nov 22, 2020',
       'May 31, 2022', 'Dec 30, 2020', 'Jan 4, 2021', 'Nov 30, 2020',
       'Aug 1, 2021'], dtype=object)

Filter the rows whose 'Contract Status' is 'On Loan' into on_loan, then compare Contract with Loan_Date_End:

In [30]:
on_loan = fifa[fifa['Contract Status'] == 'On Loan']
on_loan[['Contract', 'Contract Status', 'Loan_Date_End']]

Unnamed: 0,Contract,Contract Status,Loan_Date_End
205,"Jun 30, 2021 On Loan",On Loan,"Jun 30, 2021"
248,"Jun 30, 2021 On Loan",On Loan,"Jun 30, 2021"
254,"Jun 30, 2021 On Loan",On Loan,"Jun 30, 2021"
302,"Jun 30, 2021 On Loan",On Loan,"Jun 30, 2021"
306,"Jun 30, 2021 On Loan",On Loan,"Jun 30, 2021"
...,...,...,...
18472,"Aug 31, 2021 On Loan",On Loan,"Aug 31, 2021"
18571,"Jun 30, 2021 On Loan",On Loan,"Jun 30, 2021"
18600,"Dec 31, 2020 On Loan",On Loan,"Dec 31, 2020"
18622,"Dec 31, 2020 On Loan",On Loan,"Dec 31, 2020"
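
The table suggests that, for loaned players, Loan_Date_End repeats the date already present in Contract. A hedged sketch of how that could be verified, using a hypothetical two-row frame:

```python
import pandas as pd

# Hypothetical sample of loaned players
loans = pd.DataFrame({
    'Contract': ['Jun 30, 2021 On Loan', 'Dec 31, 2020 On Loan'],
    'Loan_Date_End': ['Jun 30, 2021', 'Dec 31, 2020'],
})

# Remove the ' On Loan' suffix and compare with Loan_Date_End row by row
date_in_contract = loans['Contract'].str.replace(' On Loan', '', regex=False)
all_match = bool((date_in_contract == loans['Loan_Date_End']).all())
print(all_match)
```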


### <span style="color: green;">Column: Value, Wage, Release_Clause </span>

In [31]:
# Examining the Value, Wage and Release_Clause columns together
fifa.loc[:,['Value', 'Wage', 'Release_Clause']]

Unnamed: 0,Value,Wage,Release_Clause
0,€103.5M,€560K,€138.4M
1,€63M,€220K,€75.9M
2,€120M,€125K,€159.4M
3,€129M,€370K,€161M
4,€132M,€270K,€166.5M
...,...,...,...
18974,€100K,€1K,€70K
18975,€130K,€500,€165K
18976,€120K,€500,€131K
18977,€100K,€2K,€88K


In [32]:
# Getting currency symbol and financial suffix from Value column
symbol=[]
suffix=[]
for x in range(len(data['Value'])):
    symbol.append(data['Value'][x][:1])
    suffix.append(data['Value'][x][-1:])
    
print(list(set(symbol)))
print(list(set(suffix)))

['€']
['M', '0', 'K']


In [33]:
# Getting currency symbol and financial suffix from Wage column
symbol=[]
suffix=[]
for x in range(len(data['Wage'])):
    symbol.append(data['Wage'][x][:1])
    suffix.append(data['Wage'][x][-1:])
    
print(list(set(symbol)))
print(list(set(suffix)))

['€']
['0', 'K']


In [34]:
# Getting currency symbol and financial suffix from Release_Clause column
symbol=[]
suffix=[]
for x in range(len(data['Release_Clause'])):
    symbol.append(data['Release_Clause'][x][:1])
    suffix.append(data['Release_Clause'][x][-1:])
    
print(list(set(symbol)))
print(list(set(suffix)))

['€']
['M', '0', 'K']
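
The three loops above repeat the same logic. A compact sketch with the .str accessor, run on a hypothetical frame, collects the first and last characters of each column in one pass:

```python
import pandas as pd

# Hypothetical sample of monetary columns
money = pd.DataFrame({
    'Value': ['€103.5M', '€100K', '€0'],
    'Wage': ['€560K', '€500', '€0'],
})

# For each column, gather the distinct first and last characters
summary = {}
for col in money.columns:
    symbols = set(money[col].str[0])
    suffixes = set(money[col].str[-1])
    summary[col] = (symbols, suffixes)

print(summary)
```

A suffix of '0' simply means the value ends in a digit, as in '€500' or '€0', so only 'K' and 'M' need special handling.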


In [35]:
def convert(value):
    if isinstance(value, str):
        if value.find('K')!=-1:
            value=value.replace('K','').replace('€','')
            return float(value)*1000
        elif value.find('M')!=-1:
            value=value.replace('M','').replace('€','')
            return float(value)*1e6  # 'M' means millions (1e6, not 10e6)
        elif value.find('€')!=-1:
            value=value.replace('€','')
            return float(value)
        else:
            return value
    else:
        return value
    
# Applying the function to the Value column: store the result in a renamed column, then drop the original
fifa['Value(€)']=fifa['Value'].apply(convert)
fifa.drop(['Value'], axis=1, inplace=True) 

# Applying the function to the Wage column: store the result in a renamed column, then drop the original
fifa['Wage(€)']=fifa['Wage'].apply(convert)
fifa.drop(['Wage'], axis=1, inplace=True)

# Applying the function to the Release_Clause column: store the result in a renamed column, then drop the original
fifa['Release_Clause(€)']=fifa['Release_Clause'].apply(convert)
fifa.drop(['Release_Clause'], axis=1, inplace=True)
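
The same conversion can also be written without a row-by-row `apply`. The sketch below is one possible vectorized variant, not the method used in this notebook; `to_euros` and `sample` are hypothetical names introduced for illustration. It assumes every entry is a string made of `€`, a number, and an optional `K` or `M` suffix.

```python
import pandas as pd

def to_euros(series):
    """Convert strings like '€103.5M', '€560K' or '€500' to floats in euros."""
    cleaned = series.str.replace('€', '', regex=False)
    # Map the trailing suffix to its multiplier; entries without K/M get 1
    multiplier = cleaned.str[-1].map({'K': 1_000, 'M': 1_000_000}).fillna(1)
    number = cleaned.str.rstrip('KM').astype(float)
    return number * multiplier

# Hypothetical sample covering each suffix case
sample = pd.Series(['€103.5M', '€560K', '€500', '€0'])
converted = to_euros(sample)
print(converted.tolist())
```

Checking a few known values this way (for example that `€103.5M` becomes 103.5 million) is a quick sanity test for the multipliers.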

### <span style="color: green;">Column: W/F, SM, IR</span>

In [36]:
# Examining the W/F, SM and IR columns together
data.loc[:,['W/F', 'SM', 'IR']]

Unnamed: 0,W/F,SM,IR
0,4 ★,4★,5 ★
1,4 ★,5★,5 ★
2,3 ★,1★,3 ★
3,5 ★,4★,4 ★
4,5 ★,5★,5 ★
...,...,...,...
18974,2 ★,2★,1 ★
18975,2 ★,2★,1 ★
18976,2 ★,2★,1 ★
18977,3 ★,2★,1 ★


In [37]:
fifa['W/F'] = fifa['W/F'].str.replace('★','')
fifa['W/F'].unique()

array(['4 ', '3 ', '5 ', '2 ', '1 '], dtype=object)

In [38]:
fifa['SM'] = fifa['SM'].str.replace('★','')
fifa['SM'].unique()

array(['4', '5', '1', '2', '3'], dtype=object)

In [39]:
fifa['IR'] = fifa['IR'].str.replace('★','')
fifa['IR'].unique()

array(['5 ', '3 ', '4 ', '2 ', '1 '], dtype=object)
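
After the star is removed, `W/F` and `IR` still carry a trailing space (`'4 '`), and all three columns remain strings. One possible follow-up is to strip the whitespace and cast to integers. This is a minimal sketch on a hypothetical frame (`sample`) that mirrors the raw formats shown above.

```python
import pandas as pd

# Hypothetical sample mirroring the raw formats shown above
sample = pd.DataFrame({
    'W/F': ['4 ★', '3 ★', '5 ★'],
    'SM': ['4★', '1★', '5★'],
    'IR': ['5 ★', '3 ★', '4 ★'],
})

for col in ['W/F', 'SM', 'IR']:
    sample[col] = (
        sample[col]
        .str.replace('★', '', regex=False)  # drop the star symbol
        .str.strip()                          # remove the leftover space
        .astype(int)                          # store the rating as an integer
    )

print(sample.dtypes)
```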

### <span style="color: green;">Column: Hits</span>

In [40]:
fifa['Hits'].dtype

dtype('O')

In [41]:
fifa['Hits'].unique()

array(['771', '562', '150', '207', '595', '248', '246', '120', '1.6K',
       '130', '321', '189', '175', '96', '118', '216', '212', '154',
       '205', '202', '339', '408', '103', '332', '86', '173', '161',
       '396', '1.1K', '433', '242', '206', '177', '1.5K', '198', '459',
       '117', '119', '209', '84', '187', '165', '203', '65', '336', '126',
       '313', '124', '145', '538', '182', '101', '45', '377', '99', '194',
       '403', '414', '593', '374', '245', '3.2K', '266', '299', '309',
       '215', '265', '211', '112', '337', '70', '159', '688', '116', '63',
       '144', '123', '71', '224', '113', '168', '61', '89', '137', '278',
       '75', '148', '176', '197', '264', '214', '247', '402', '440',
       '1.7K', '2.3K', '171', '320', '657', '87', '259', '200', '255',
       '253', '196', '60', '97', '85', '169', '256', '132', '239', '166',
       '121', '109', '32', '46', '122', '48', '527', '199', '282', '51',
       '1.9K', '642', '155', '323', '288', '497', '509', '79',

Conversion of the thousands suffix (K) is needed.


In [43]:
def convert_hits(hits):
    if isinstance(hits, str):
        if hits.find('K')!=-1:
            hits=hits.replace('K','')
            return float(hits)*1000
        else:
            return hits
    else:
        return hits
    
# Apply the function to the Hits column
fifa['Hits'] = fifa['Hits'].apply(convert_hits)
fifa['Hits'].unique()

array(['771', '562', '150', '207', '595', '248', '246', '120', 1600.0,
       '130', '321', '189', '175', '96', '118', '216', '212', '154',
       '205', '202', '339', '408', '103', '332', '86', '173', '161',
       '396', 1100.0, '433', '242', '206', '177', 1500.0, '198', '459',
       '117', '119', '209', '84', '187', '165', '203', '65', '336', '126',
       '313', '124', '145', '538', '182', '101', '45', '377', '99', '194',
       '403', '414', '593', '374', '245', 3200.0, '266', '299', '309',
       '215', '265', '211', '112', '337', '70', '159', '688', '116', '63',
       '144', '123', '71', '224', '113', '168', '61', '89', '137', '278',
       '75', '148', '176', '197', '264', '214', '247', '402', '440',
       1700.0, 2300.0, '171', '320', '657', '87', '259', '200', '255',
       '253', '196', '60', '97', '85', '169', '256', '132', '239', '166',
       '121', '109', '32', '46', '122', '48', '527', '199', '282', '51',
       1900.0, '642', '155', '323', '288', '497', '509', '79',
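
The output above still mixes strings (`'771'`) with floats (`1600.0`), so the column is not yet numeric. A possible variant that returns an integer in both branches is sketched below; `hits_to_int` and `sample` are hypothetical names, and the sketch assumes each string entry is either a whole number or a number followed by `K`.

```python
import pandas as pd

def hits_to_int(hits):
    """Convert a hit count such as '771' or '1.6K' to an integer."""
    if isinstance(hits, str):
        hits = hits.strip()
        if hits.endswith('K'):
            return int(round(float(hits[:-1]) * 1000))
        return int(hits)
    return hits  # leave non-string entries (e.g. missing values) untouched

# Hypothetical sample taken from values shown above
sample = pd.Series(['771', '1.6K', '96', '3.2K'])
converted = sample.apply(hits_to_int)
print(converted.tolist())
```

If the column contains missing values, they pass through unchanged and the resulting column will be float rather than integer.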

## <span style="color: green;">**Exporting Clean Data**</span>

In [44]:
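# Export the cleaned frame (fifa), not the raw one (data); index=False avoids writing the row index as an extra column
fifa.to_csv('data/fifa21_cleaned_data.csv', index=False)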
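
Whether the row index is written affects what a later `read_csv` sees. The sketch below illustrates the difference on a hypothetical one-row frame (`sample`), using in-memory buffers instead of files.

```python
import io
import pandas as pd

# Hypothetical one-row frame standing in for the cleaned data
sample = pd.DataFrame({'Name': ['L. Messi'], 'Value(€)': [103500000.0]})

with_index = io.StringIO()
sample.to_csv(with_index)                   # default: index written as the first column
with_index.seek(0)
cols_with = pd.read_csv(with_index).columns.tolist()

without_index = io.StringIO()
sample.to_csv(without_index, index=False)   # index omitted
without_index.seek(0)
cols_without = pd.read_csv(without_index).columns.tolist()

print(cols_with)
print(cols_without)
```

With the default settings the reloaded frame gains an `Unnamed: 0` column; passing `index=False` keeps the column list identical to the original.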


## <span style="color: green;">**Conclusion**</span>

The cleaned dataset is now ready for more advanced analysis, such as exploring player statistics, team performance, or other insights that can provide a deeper understanding of the FIFA 21 game.