# 0. Intro

The analysis below uses FIFA 2021 player data published on Kaggle (www.kaggle.com).

The goal is to clean and transform the data.
This step, known as data pre-processing, is crucial in any data analysis or modeling project: columns stored as text cannot be compared, aggregated, or passed to a model until they are converted to proper numeric and date types.

Questions:
- Do the Height and Weight columns have the proper data types?
- Can you separate the Joined column into year, month and day columns?
- Can you clean and transform the Value, Wage and Release Clause columns into integer columns?
- How can you remove the newline characters from the Hits column?
- Should you separate the Team & Contract column into separate team and contract columns?

Source dataset: https://www.kaggle.com/datasets/yagunnersya/fifa-21-messy-raw-dataset-for-cleaning-exploring?select=fifa21_raw_data.csv

# 1. Loading Libraries

In [1]:
import numpy as np
import pandas as pd

import datetime as dt

In [2]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# 2. Loading Dataset

In [3]:
df_project1 = pd.read_csv("C:/Users/renan/Downloads/fifa21_raw_data.csv", low_memory = False)

In [4]:
df_project1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18979 entries, 0 to 18978
Data columns (total 77 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   photoUrl          18979 non-null  object
 1   LongName          18979 non-null  object
 2   playerUrl         18979 non-null  object
 3   Nationality       18979 non-null  object
 4   Positions         18979 non-null  object
 5   Name              18979 non-null  object
 6   Age               18979 non-null  int64 
 7   ↓OVA              18979 non-null  int64 
 8   POT               18979 non-null  int64 
 9   Team & Contract   18979 non-null  object
 10  ID                18979 non-null  int64 
 11  Height            18979 non-null  object
 12  Weight            18979 non-null  object
 13  foot              18979 non-null  object
 14  BOV               18979 non-null  int64 
 15  BP                18979 non-null  object
 16  Growth            18979 non-null  int64 
 17  Joined      

In [5]:
df_project1.shape

(18979, 77)

In [6]:
df_project1.columns

Index(['photoUrl', 'LongName', 'playerUrl', 'Nationality', 'Positions', 'Name',
       'Age', '↓OVA', 'POT', 'Team & Contract', 'ID', 'Height', 'Weight',
       'foot', 'BOV', 'BP', 'Growth', 'Joined', 'Loan Date End', 'Value',
       'Wage', 'Release Clause', 'Attacking', 'Crossing', 'Finishing',
       'Heading Accuracy', 'Short Passing', 'Volleys', 'Skill', 'Dribbling',
       'Curve', 'FK Accuracy', 'Long Passing', 'Ball Control', 'Movement',
       'Acceleration', 'Sprint Speed', 'Agility', 'Reactions', 'Balance',
       'Power', 'Shot Power', 'Jumping', 'Stamina', 'Strength', 'Long Shots',
       'Mentality', 'Aggression', 'Interceptions', 'Positioning', 'Vision',
       'Penalties', 'Composure', 'Defending', 'Marking', 'Standing Tackle',
       'Sliding Tackle', 'Goalkeeping', 'GK Diving', 'GK Handling',
       'GK Kicking', 'GK Positioning', 'GK Reflexes', 'Total Stats',
       'Base Stats', 'W/F', 'SM', 'A/W', 'D/W', 'IR', 'PAC', 'SHO', 'PAS',
       'DRI', 'DEF', 'PHY', 'Hits

In [7]:
df_project1.head()

Unnamed: 0,photoUrl,LongName,playerUrl,Nationality,Positions,Name,Age,↓OVA,POT,Team & Contract,ID,Height,Weight,foot,BOV,BP,Growth,Joined,Loan Date End,Value,Wage,Release Clause,Attacking,Crossing,Finishing,Heading Accuracy,Short Passing,Volleys,Skill,Dribbling,Curve,FK Accuracy,Long Passing,Ball Control,Movement,Acceleration,Sprint Speed,Agility,Reactions,Balance,Power,Shot Power,Jumping,Stamina,Strength,Long Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing Tackle,Sliding Tackle,Goalkeeping,GK Diving,GK Handling,GK Kicking,GK Positioning,GK Reflexes,Total Stats,Base Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
0,https://cdn.sofifa.com/players/158/023/21_60.png,Lionel Messi,http://sofifa.com/player/158023/lionel-messi/2...,Argentina,RW ST CF,L. Messi,33,93,93,\n\n\n\nFC Barcelona\n2004 ~ 2021\n\n,158023,"5'7""",159lbs,Left,93,RW,0,"Jul 1, 2004",,€67.5M,€560K,€138.4M,429,85,95,70,91,88,470,96,93,94,91,96,451,91,80,91,94,95,389,86,68,72,69,94,347,44,40,93,95,75,96,91,32,35,24,54,6,11,15,14,8,2231,466,4 ★,4★,Medium,Low,5 ★,85,92,91,95,38,65,\n372
1,https://cdn.sofifa.com/players/020/801/21_60.png,C. Ronaldo dos Santos Aveiro,http://sofifa.com/player/20801/c-ronaldo-dos-s...,Portugal,ST LW,Cristiano Ronaldo,35,92,92,\n\n\n\nJuventus\n2018 ~ 2022\n\n,20801,"6'2""",183lbs,Right,92,ST,0,"Jul 10, 2018",,€46M,€220K,€75.9M,437,84,95,90,82,86,414,88,81,76,77,92,431,87,91,87,95,71,444,94,95,84,78,93,353,63,29,95,82,84,95,84,28,32,24,58,7,11,15,14,11,2221,464,4 ★,5★,High,Low,5 ★,89,93,81,89,35,77,\n344
2,https://cdn.sofifa.com/players/200/389/21_60.png,Jan Oblak,http://sofifa.com/player/200389/jan-oblak/210005/,Slovenia,GK,J. Oblak,27,91,93,\n\n\n\nAtlético Madrid\n2014 ~ 2023\n\n,200389,"6'2""",192lbs,Right,91,GK,2,"Jul 16, 2014",,€75M,€125K,€159.4M,95,13,11,15,43,13,109,12,13,14,40,30,307,43,60,67,88,49,268,59,78,41,78,12,140,34,19,11,65,11,68,57,27,12,18,437,87,92,78,90,90,1413,489,3 ★,1★,Medium,Medium,3 ★,87,92,78,90,52,90,\n86
3,https://cdn.sofifa.com/players/192/985/21_60.png,Kevin De Bruyne,http://sofifa.com/player/192985/kevin-de-bruyn...,Belgium,CAM CM,K. De Bruyne,29,91,91,\n\n\n\nManchester City\n2015 ~ 2023\n\n,192985,"5'11""",154lbs,Right,91,CAM,0,"Aug 30, 2015",,€87M,€370K,€161M,407,94,82,55,94,82,441,88,85,83,93,92,398,77,76,78,91,76,408,91,63,89,74,91,408,76,66,88,94,84,91,186,68,65,53,56,15,13,5,10,13,2304,485,5 ★,4★,High,High,4 ★,76,86,93,88,64,78,\n163
4,https://cdn.sofifa.com/players/190/871/21_60.png,Neymar da Silva Santos Jr.,http://sofifa.com/player/190871/neymar-da-silv...,Brazil,LW CAM,Neymar Jr,28,91,91,\n\n\n\nParis Saint-Germain\n2017 ~ 2022\n\n,190871,"5'9""",150lbs,Right,91,LW,0,"Aug 3, 2017",,€90M,€270K,€166.5M,408,85,87,62,87,87,448,95,88,89,81,95,453,94,89,96,91,83,357,80,62,81,50,84,356,51,36,87,90,92,93,94,35,30,29,59,9,9,15,15,11,2175,451,5 ★,5★,High,Medium,5 ★,91,85,86,94,36,59,\n273


In [8]:
df_project1.tail()

Unnamed: 0,photoUrl,LongName,playerUrl,Nationality,Positions,Name,Age,↓OVA,POT,Team & Contract,ID,Height,Weight,foot,BOV,BP,Growth,Joined,Loan Date End,Value,Wage,Release Clause,Attacking,Crossing,Finishing,Heading Accuracy,Short Passing,Volleys,Skill,Dribbling,Curve,FK Accuracy,Long Passing,Ball Control,Movement,Acceleration,Sprint Speed,Agility,Reactions,Balance,Power,Shot Power,Jumping,Stamina,Strength,Long Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing Tackle,Sliding Tackle,Goalkeeping,GK Diving,GK Handling,GK Kicking,GK Positioning,GK Reflexes,Total Stats,Base Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
18974,https://cdn.sofifa.com/players/257/710/21_60.png,Mengxuan Zhang,http://sofifa.com/player/257710/mengxuan-zhang...,China PR,CB,Zhang Mengxuan,21,47,52,\n\n\n\nChongqing Dangdai Lifan FC SWM Team\n2...,257710,"5'10""",154lbs,Right,49,CB,5,"Aug 1, 2020",,€35K,€1K,€57K,141,23,16,45,31,26,111,19,21,22,25,24,279,60,56,45,48,70,223,35,65,55,45,23,179,44,50,25,25,35,40,148,45,56,47,53,12,13,8,14,6,1134,232,2 ★,2★,Low,Low,1 ★,58,23,26,27,50,48,\n2
18975,https://cdn.sofifa.com/players/258/736/21_60.png,Vani Da Silva,http://sofifa.com/player/258736/vani-da-silva/...,England,ST,V. Da Silva,17,47,67,\n\n\n\nOldham Athletic\n2020 ~ 2021\n\n,258736,"5'7""",128lbs,Right,51,CAM,20,"Aug 1, 2020",,€60K,€500,€165K,202,27,47,36,45,47,208,53,43,31,37,44,339,69,70,67,53,80,231,50,50,55,35,41,186,33,11,43,49,50,45,42,18,11,13,48,11,13,9,9,6,1256,265,2 ★,2★,Medium,Medium,1 ★,70,46,40,53,16,40,\n3
18976,https://cdn.sofifa.com/players/247/223/21_60.png,Ao Xia,http://sofifa.com/player/247223/ao-xia/210005/,China PR,CB,Xia Ao,21,47,55,\n\n\n\nWuhan Zall\n2018 ~ 2022\n\n,247223,"5'10""",146lbs,Right,49,CB,8,"Jul 13, 2018",,€40K,€1K,€70K,145,23,26,43,26,27,142,27,23,21,29,42,294,68,60,69,46,51,221,36,57,54,50,24,192,48,50,28,28,38,44,147,45,52,50,45,7,8,5,14,11,1186,255,2 ★,2★,Medium,Medium,1 ★,64,28,26,38,48,51,\n3
18977,https://cdn.sofifa.com/players/258/760/21_60.png,Ben Hough,http://sofifa.com/player/258760/ben-hough/210005/,England,CM,B. Hough,17,47,67,\n\n\n\nOldham Athletic\n2020 ~ 2021\n\n,258760,"5'9""",143lbs,Right,51,CAM,20,"Aug 1, 2020",,€60K,€500,€165K,211,38,42,40,56,35,219,46,40,35,50,48,305,63,64,61,51,66,226,48,58,43,47,30,193,40,23,47,47,36,38,116,32,44,40,45,12,10,9,6,8,1315,281,2 ★,2★,Medium,Medium,1 ★,64,40,48,49,35,45,\n5
18978,https://cdn.sofifa.com/players/255/958/21_60.png,Mateo Flores,http://sofifa.com/player/255958/mateo-flores/2...,Bolivia,CDM,M. Flores,19,47,63,\n\n\n\nClub Bolívar\n2020 ~ 2024\n\n,255958,"5'9""",150lbs,Right,48,CDM,16,"Jan 1, 2020",,€60K,€500,€167K,191,32,27,48,52,32,198,45,31,32,47,43,298,55,59,59,50,75,223,41,55,47,47,33,206,57,45,30,39,35,36,129,43,41,45,56,10,8,15,9,14,1301,273,3 ★,2★,Medium,Medium,1 ★,57,32,43,48,44,49,\n2


In [9]:
df_project1.dtypes

photoUrl            object
LongName            object
playerUrl           object
Nationality         object
Positions           object
Name                object
Age                  int64
↓OVA                 int64
POT                  int64
Team & Contract     object
ID                   int64
Height              object
Weight              object
foot                object
BOV                  int64
BP                  object
Growth               int64
Joined              object
Loan Date End       object
Value               object
Wage                object
Release Clause      object
Attacking            int64
Crossing             int64
Finishing            int64
Heading Accuracy     int64
Short Passing        int64
Volleys              int64
Skill                int64
Dribbling            int64
Curve                int64
FK Accuracy          int64
Long Passing         int64
Ball Control         int64
Movement             int64
Acceleration         int64
Sprint Speed         int64
A

# 3. Cleaning Height and Weight

In [10]:
df_project1['Weight'].head()

0    159lbs
1    183lbs
2    192lbs
3    154lbs
4    150lbs
Name: Weight, dtype: object

In [11]:
df_project1['Weight'] = df_project1['Weight'].replace(r'lbs', '', regex=True)

In [12]:
df_project1['Weight'].head()

0    159
1    183
2    192
3    154
4    150
Name: Weight, dtype: object

In [13]:
df_project1['Weight'] = df_project1['Weight'].astype(float)

In [14]:
df_project1['Weight'].head()

0    159.0
1    183.0
2    192.0
3    154.0
4    150.0
Name: Weight, dtype: float64
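
The two steps above (strip the unit, then cast) can also be written as a single vectorized expression. The sketch below assumes every entry ends in `lbs`, as in the rows shown. The sample Series and the names `weight_raw`, `weight_lbs` and `weight_kg` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Illustrative sample mirroring the first rows of the Weight column
weight_raw = pd.Series(["159lbs", "183lbs", "192lbs"])

# Strip the unit suffix, then convert to a numeric dtype.
# errors="coerce" turns any unexpected value into NaN instead of raising.
weight_lbs = pd.to_numeric(
    weight_raw.str.replace("lbs", "", regex=False), errors="coerce"
)

# Optional: convert pounds to kilograms (1 lb = 0.45359237 kg)
weight_kg = (weight_lbs * 0.45359237).round(1)

print(weight_lbs.tolist())
print(weight_kg.tolist())
```

Using `errors="coerce"` is a design choice: it keeps the conversion from failing on a stray value, at the cost of having to check for NaN afterwards.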

In [15]:
df_project1['Height'].head()

0     5'7"
1     6'2"
2     6'2"
3    5'11"
4     5'9"
Name: Height, dtype: object

In [16]:
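def formatar_altura(altura):
    # Strip the inch mark and swap the foot mark for a dot: 5'11" -> 5.11
    # Note: the result is a feet.inches label, not a decimal measurement
    # (5.11 sorts below 5.9 even though 5'11" is taller than 5'9")
    altura_sem_aspas = altura.replace('"', '')
    altura_formatada = altura_sem_aspas.replace("'", '.')
    return altura_formatada

df_project1['Height'] = df_project1['Height'].apply(formatar_altura)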

In [17]:
df_project1['Height'].head()

0     5.7
1     6.2
2     6.2
3    5.11
4     5.9
Name: Height, dtype: object

In [18]:
df_project1['Height'] = df_project1['Height'].astype(float)

In [19]:
df_project1['Height'].dtypes

dtype('float64')
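
The float produced above keeps the feet and inches side by side (5'11" becomes 5.11), so it does not order or average correctly. If a true measurement is needed, one option is to convert the original strings, before the replacement in In [16], to total inches and then to centimetres. The helper name `height_to_cm` is hypothetical, and the sketch assumes every value follows the `feet'inches"` pattern shown in the output of In [15].

```python
def height_to_cm(height: str) -> float:
    """Convert a feet-and-inches string such as 5'11" to centimetres."""
    # Drop the trailing inch mark, then split on the foot mark
    feet, inches = height.strip().rstrip('"').split("'")
    total_inches = int(feet) * 12 + int(inches)
    # 1 inch = 2.54 cm
    return round(total_inches * 2.54, 1)


print(height_to_cm("5'11\""))  # 180.3
print(height_to_cm("5'9\""))   # 175.3
```

Applied to the raw column, this would look like `df_project1['Height'].apply(height_to_cm)`. Unlike the 5.11 versus 5.9 case, the converted values keep the taller player larger.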

In [20]:
df_project1.head()

Unnamed: 0,photoUrl,LongName,playerUrl,Nationality,Positions,Name,Age,↓OVA,POT,Team & Contract,ID,Height,Weight,foot,BOV,BP,Growth,Joined,Loan Date End,Value,Wage,Release Clause,Attacking,Crossing,Finishing,Heading Accuracy,Short Passing,Volleys,Skill,Dribbling,Curve,FK Accuracy,Long Passing,Ball Control,Movement,Acceleration,Sprint Speed,Agility,Reactions,Balance,Power,Shot Power,Jumping,Stamina,Strength,Long Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing Tackle,Sliding Tackle,Goalkeeping,GK Diving,GK Handling,GK Kicking,GK Positioning,GK Reflexes,Total Stats,Base Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits
0,https://cdn.sofifa.com/players/158/023/21_60.png,Lionel Messi,http://sofifa.com/player/158023/lionel-messi/2...,Argentina,RW ST CF,L. Messi,33,93,93,\n\n\n\nFC Barcelona\n2004 ~ 2021\n\n,158023,5.7,159.0,Left,93,RW,0,"Jul 1, 2004",,€67.5M,€560K,€138.4M,429,85,95,70,91,88,470,96,93,94,91,96,451,91,80,91,94,95,389,86,68,72,69,94,347,44,40,93,95,75,96,91,32,35,24,54,6,11,15,14,8,2231,466,4 ★,4★,Medium,Low,5 ★,85,92,91,95,38,65,\n372
1,https://cdn.sofifa.com/players/020/801/21_60.png,C. Ronaldo dos Santos Aveiro,http://sofifa.com/player/20801/c-ronaldo-dos-s...,Portugal,ST LW,Cristiano Ronaldo,35,92,92,\n\n\n\nJuventus\n2018 ~ 2022\n\n,20801,6.2,183.0,Right,92,ST,0,"Jul 10, 2018",,€46M,€220K,€75.9M,437,84,95,90,82,86,414,88,81,76,77,92,431,87,91,87,95,71,444,94,95,84,78,93,353,63,29,95,82,84,95,84,28,32,24,58,7,11,15,14,11,2221,464,4 ★,5★,High,Low,5 ★,89,93,81,89,35,77,\n344
2,https://cdn.sofifa.com/players/200/389/21_60.png,Jan Oblak,http://sofifa.com/player/200389/jan-oblak/210005/,Slovenia,GK,J. Oblak,27,91,93,\n\n\n\nAtlético Madrid\n2014 ~ 2023\n\n,200389,6.2,192.0,Right,91,GK,2,"Jul 16, 2014",,€75M,€125K,€159.4M,95,13,11,15,43,13,109,12,13,14,40,30,307,43,60,67,88,49,268,59,78,41,78,12,140,34,19,11,65,11,68,57,27,12,18,437,87,92,78,90,90,1413,489,3 ★,1★,Medium,Medium,3 ★,87,92,78,90,52,90,\n86
3,https://cdn.sofifa.com/players/192/985/21_60.png,Kevin De Bruyne,http://sofifa.com/player/192985/kevin-de-bruyn...,Belgium,CAM CM,K. De Bruyne,29,91,91,\n\n\n\nManchester City\n2015 ~ 2023\n\n,192985,5.11,154.0,Right,91,CAM,0,"Aug 30, 2015",,€87M,€370K,€161M,407,94,82,55,94,82,441,88,85,83,93,92,398,77,76,78,91,76,408,91,63,89,74,91,408,76,66,88,94,84,91,186,68,65,53,56,15,13,5,10,13,2304,485,5 ★,4★,High,High,4 ★,76,86,93,88,64,78,\n163
4,https://cdn.sofifa.com/players/190/871/21_60.png,Neymar da Silva Santos Jr.,http://sofifa.com/player/190871/neymar-da-silv...,Brazil,LW CAM,Neymar Jr,28,91,91,\n\n\n\nParis Saint-Germain\n2017 ~ 2022\n\n,190871,5.9,150.0,Right,91,LW,0,"Aug 3, 2017",,€90M,€270K,€166.5M,408,85,87,62,87,87,448,95,88,89,81,95,453,94,89,96,91,83,357,80,62,81,50,84,356,51,36,87,90,92,93,94,35,30,29,59,9,9,15,15,11,2175,451,5 ★,5★,High,Medium,5 ★,91,85,86,94,36,59,\n273


In [21]:
df_project1['Wage'].head()

0    €560K
1    €220K
2    €125K
3    €370K
4    €270K
Name: Wage, dtype: object

In [22]:
# Define a function to clean currency strings and convert them to integers.
def clean_value(value):
    value = value.replace('€', '')  # drop the currency symbol
    if 'M' in value:  # millions, e.g. 67.5M -> 67500000
        value = float(value.replace('M', '')) * 1000000
    elif 'K' in value:  # thousands, e.g. 560K -> 560000
        value = float(value.replace('K', '')) * 1000
    return int(value)

In [23]:
# Applying the function to the columns
df_project1['Value'] = df_project1['Value'].apply(clean_value)
df_project1['Wage'] = df_project1['Wage'].apply(clean_value)


In [24]:
df_project1['Value'].head()

0    67500000
1    46000000
2    75000000
3    87000000
4    90000000
Name: Value, dtype: int64

In [25]:
df_project1['Wage'].head()

0    560000
1    220000
2    125000
3    370000
4    270000
Name: Wage, dtype: int64

In [26]:
df_project1['Release Clause'].head()

0    €138.4M
1     €75.9M
2    €159.4M
3      €161M
4    €166.5M
Name: Release Clause, dtype: object

In [28]:
df_project1['Release Clause'] = df_project1['Release Clause'].apply(clean_value)

In [29]:
df_project1['Release Clause'].head()

0    138400000
1     75900000
2    159400000
3    161000000
4    166500000
Name: Release Clause, dtype: int64
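
The same conversion can be written a little more defensively. The sketch below is a hypothetical variant, named `parse_euro` here, that looks only at the final character for the suffix and rounds before casting, so a floating-point artefact cannot truncate a value by one euro. It assumes the only suffixes are `M` and `K`, as in the rows shown.

```python
def parse_euro(amount: str) -> int:
    """Convert strings like '€67.5M', '€560K' or '€500' to whole euros."""
    text = amount.strip().replace("€", "")
    multipliers = {"M": 1_000_000, "K": 1_000}
    suffix = text[-1:].upper()
    if suffix in multipliers:
        # round() guards against float artefacts before casting to int
        return int(round(float(text[:-1]) * multipliers[suffix]))
    # No suffix: the value is already expressed in euros
    return int(round(float(text)))


for sample in ["€67.5M", "€560K", "€500", "€138.4M"]:
    print(sample, parse_euro(sample))
```

It could be applied to all three columns in a loop, for example `for col in ['Value', 'Wage', 'Release Clause']: df_project1[col] = df_project1[col].apply(parse_euro)`.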

# 4. Removing newline characters from the Hits column

In [30]:
df_project1['Hits'].head()

0    \n372
1    \n344
2     \n86
3    \n163
4    \n273
Name: Hits, dtype: object

In [31]:
# Define a function to strip newlines from Hits and convert to int (a 'K' suffix means thousands)
def clean_and_convert_to_int(hits):
    hits = hits.replace('\n', '').strip()
    multiplier = 1
    if 'K' in hits:
        multiplier = 1000
        hits = hits.replace('K', '')
    return int(float(hits) * multiplier)

In [32]:
df_project1['Hits'] = df_project1['Hits'].apply(clean_and_convert_to_int)

In [33]:
df_project1['Hits'].head()

0    372
1    344
2     86
3    163
4    273
Name: Hits, dtype: int64
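
To make the `K` branch concrete, here is a small worked sketch. The function name `parse_hits` and the sample strings are illustrative. Returning `None` for missing or empty input is an assumption about how such rows might be handled, not something the original notebook does.

```python
def parse_hits(hits):
    """Return the hit count as an int, or None for missing/empty input."""
    # Non-string input (e.g. NaN) or blank strings are treated as missing
    if not isinstance(hits, str) or not hits.strip():
        return None
    # strip() removes the leading newline along with any other whitespace
    text = hits.strip().upper()
    if text.endswith("K"):
        # '1.6K' means 1.6 thousand
        return int(round(float(text[:-1]) * 1000))
    return int(text)


print(parse_hits("\n372"))   # 372
print(parse_hits("\n1.6K"))  # 1600
```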

# 5. Separating the Joined column into Year, Month and Day columns

In [34]:
df_project1['Joined'].head(10)

0     Jul 1, 2004
1    Jul 10, 2018
2    Jul 16, 2014
3    Aug 30, 2015
4     Aug 3, 2017
5     Jul 1, 2014
6     Jul 1, 2018
7    Jul 19, 2018
8     Jul 1, 2017
9     Jul 1, 2016
Name: Joined, dtype: object

In [35]:
# Converting the 'Joined' column to datetime type
df_project1['Joined'] = pd.to_datetime(df_project1['Joined'])

In [36]:
# Creating month, day and year columns
df_project1['Month'] = df_project1['Joined'].dt.month
df_project1['Day'] = df_project1['Joined'].dt.day
df_project1['Year'] = df_project1['Joined'].dt.year
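
The conversion above lets pandas infer the date format. A more explicit variant passes the format string, which documents the expected layout and, with `errors="coerce"`, turns anything that does not match into `NaT`. The sample Series below is illustrative. The third entry is deliberately invalid to show the coercion.

```python
import pandas as pd

# Illustrative sample: two values in the dataset's layout, one invalid entry
joined_raw = pd.Series(["Jul 1, 2004", "Aug 30, 2015", "not a date"])

# %b = abbreviated month name, %d = day of month, %Y = four-digit year
joined = pd.to_datetime(joined_raw, format="%b %d, %Y", errors="coerce")

# Derive the three component columns from the parsed dates
parts = pd.DataFrame({
    "Year": joined.dt.year,
    "Month": joined.dt.month,
    "Day": joined.dt.day,
})
print(parts)
```

Rows that fail to parse end up with missing Year, Month and Day values, so they can be located afterwards with `joined.isna()`.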

In [37]:
df_project1.head()

Unnamed: 0,photoUrl,LongName,playerUrl,Nationality,Positions,Name,Age,↓OVA,POT,Team & Contract,ID,Height,Weight,foot,BOV,BP,Growth,Joined,Loan Date End,Value,Wage,Release Clause,Attacking,Crossing,Finishing,Heading Accuracy,Short Passing,Volleys,Skill,Dribbling,Curve,FK Accuracy,Long Passing,Ball Control,Movement,Acceleration,Sprint Speed,Agility,Reactions,Balance,Power,Shot Power,Jumping,Stamina,Strength,Long Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing Tackle,Sliding Tackle,Goalkeeping,GK Diving,GK Handling,GK Kicking,GK Positioning,GK Reflexes,Total Stats,Base Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits,Month,Day,Year
0,https://cdn.sofifa.com/players/158/023/21_60.png,Lionel Messi,http://sofifa.com/player/158023/lionel-messi/2...,Argentina,RW ST CF,L. Messi,33,93,93,\n\n\n\nFC Barcelona\n2004 ~ 2021\n\n,158023,5.7,159.0,Left,93,RW,0,2004-07-01,,67500000,560000,138400000,429,85,95,70,91,88,470,96,93,94,91,96,451,91,80,91,94,95,389,86,68,72,69,94,347,44,40,93,95,75,96,91,32,35,24,54,6,11,15,14,8,2231,466,4 ★,4★,Medium,Low,5 ★,85,92,91,95,38,65,372,7,1,2004
1,https://cdn.sofifa.com/players/020/801/21_60.png,C. Ronaldo dos Santos Aveiro,http://sofifa.com/player/20801/c-ronaldo-dos-s...,Portugal,ST LW,Cristiano Ronaldo,35,92,92,\n\n\n\nJuventus\n2018 ~ 2022\n\n,20801,6.2,183.0,Right,92,ST,0,2018-07-10,,46000000,220000,75900000,437,84,95,90,82,86,414,88,81,76,77,92,431,87,91,87,95,71,444,94,95,84,78,93,353,63,29,95,82,84,95,84,28,32,24,58,7,11,15,14,11,2221,464,4 ★,5★,High,Low,5 ★,89,93,81,89,35,77,344,7,10,2018
2,https://cdn.sofifa.com/players/200/389/21_60.png,Jan Oblak,http://sofifa.com/player/200389/jan-oblak/210005/,Slovenia,GK,J. Oblak,27,91,93,\n\n\n\nAtlético Madrid\n2014 ~ 2023\n\n,200389,6.2,192.0,Right,91,GK,2,2014-07-16,,75000000,125000,159400000,95,13,11,15,43,13,109,12,13,14,40,30,307,43,60,67,88,49,268,59,78,41,78,12,140,34,19,11,65,11,68,57,27,12,18,437,87,92,78,90,90,1413,489,3 ★,1★,Medium,Medium,3 ★,87,92,78,90,52,90,86,7,16,2014
3,https://cdn.sofifa.com/players/192/985/21_60.png,Kevin De Bruyne,http://sofifa.com/player/192985/kevin-de-bruyn...,Belgium,CAM CM,K. De Bruyne,29,91,91,\n\n\n\nManchester City\n2015 ~ 2023\n\n,192985,5.11,154.0,Right,91,CAM,0,2015-08-30,,87000000,370000,161000000,407,94,82,55,94,82,441,88,85,83,93,92,398,77,76,78,91,76,408,91,63,89,74,91,408,76,66,88,94,84,91,186,68,65,53,56,15,13,5,10,13,2304,485,5 ★,4★,High,High,4 ★,76,86,93,88,64,78,163,8,30,2015
4,https://cdn.sofifa.com/players/190/871/21_60.png,Neymar da Silva Santos Jr.,http://sofifa.com/player/190871/neymar-da-silv...,Brazil,LW CAM,Neymar Jr,28,91,91,\n\n\n\nParis Saint-Germain\n2017 ~ 2022\n\n,190871,5.9,150.0,Right,91,LW,0,2017-08-03,,90000000,270000,166500000,408,85,87,62,87,87,448,95,88,89,81,95,453,94,89,96,91,83,357,80,62,81,50,84,356,51,36,87,90,92,93,94,35,30,29,59,9,9,15,15,11,2175,451,5 ★,5★,High,Medium,5 ★,91,85,86,94,36,59,273,8,3,2017


# 6. Separating Team & Contract column

In [38]:
df_project1['Team & Contract'].head(10)

0           \n\n\n\nFC Barcelona\n2004 ~ 2021\n\n
1               \n\n\n\nJuventus\n2018 ~ 2022\n\n
2        \n\n\n\nAtlético Madrid\n2014 ~ 2023\n\n
3        \n\n\n\nManchester City\n2015 ~ 2023\n\n
4    \n\n\n\nParis Saint-Germain\n2017 ~ 2022\n\n
5      \n\n\n\nFC Bayern München\n2014 ~ 2023\n\n
6    \n\n\n\nParis Saint-Germain\n2018 ~ 2022\n\n
7              \n\n\n\nLiverpool\n2018 ~ 2024\n\n
8              \n\n\n\nLiverpool\n2017 ~ 2023\n\n
9              \n\n\n\nLiverpool\n2016 ~ 2023\n\n
Name: Team & Contract, dtype: object

In [39]:
def clean_and_split_team_contract(data):
    data = data.strip()  # Remove leading and trailing whitespace, including newlines
    team, contract = data.split('\n')[-2:]  # Split on '\n' and keep the last two parts
    return pd.Series([team, contract], index=['Team', 'Contract'])

In [40]:
df_project1[['Team', 'Contract']] = df_project1['Team & Contract'].apply(clean_and_split_team_contract)

In [41]:
df_project1 = df_project1.drop('Team & Contract', axis = 1)
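
After the split, the contract is still a string such as `2004 ~ 2021`. If numeric start and end years are wanted, a regular expression can pull them out in the same pass. The sketch below is one possible approach. The names `CONTRACT_RE` and `split_team_contract` are hypothetical, and the fallback branch assumes that some entries may lack a `YYYY ~ YYYY` line.

```python
import re

# Team name on one line, followed by a 'YYYY ~ YYYY' contract line
CONTRACT_RE = re.compile(r"^(?P<team>.+)\n(?P<start>\d{4}) ~ (?P<end>\d{4})$")


def split_team_contract(raw: str):
    """Return (team, start_year, end_year); years are None if absent."""
    text = raw.strip()
    match = CONTRACT_RE.match(text)
    if match is None:
        # Fallback for entries without a 'YYYY ~ YYYY' line (assumed possible)
        return text.split("\n")[0], None, None
    return match["team"], int(match["start"]), int(match["end"])


print(split_team_contract("\n\n\n\nFC Barcelona\n2004 ~ 2021\n\n"))
# ('FC Barcelona', 2004, 2021)
```

Since the original column is dropped in In [41], a function like this would need to run on `Team & Contract` before that cell.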

In [42]:
df_project1.head()

Unnamed: 0,photoUrl,LongName,playerUrl,Nationality,Positions,Name,Age,↓OVA,POT,ID,Height,Weight,foot,BOV,BP,Growth,Joined,Loan Date End,Value,Wage,Release Clause,Attacking,Crossing,Finishing,Heading Accuracy,Short Passing,Volleys,Skill,Dribbling,Curve,FK Accuracy,Long Passing,Ball Control,Movement,Acceleration,Sprint Speed,Agility,Reactions,Balance,Power,Shot Power,Jumping,Stamina,Strength,Long Shots,Mentality,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Defending,Marking,Standing Tackle,Sliding Tackle,Goalkeeping,GK Diving,GK Handling,GK Kicking,GK Positioning,GK Reflexes,Total Stats,Base Stats,W/F,SM,A/W,D/W,IR,PAC,SHO,PAS,DRI,DEF,PHY,Hits,Month,Day,Year,Team,Contract
0,https://cdn.sofifa.com/players/158/023/21_60.png,Lionel Messi,http://sofifa.com/player/158023/lionel-messi/2...,Argentina,RW ST CF,L. Messi,33,93,93,158023,5.7,159.0,Left,93,RW,0,2004-07-01,,67500000,560000,138400000,429,85,95,70,91,88,470,96,93,94,91,96,451,91,80,91,94,95,389,86,68,72,69,94,347,44,40,93,95,75,96,91,32,35,24,54,6,11,15,14,8,2231,466,4 ★,4★,Medium,Low,5 ★,85,92,91,95,38,65,372,7,1,2004,FC Barcelona,2004 ~ 2021
1,https://cdn.sofifa.com/players/020/801/21_60.png,C. Ronaldo dos Santos Aveiro,http://sofifa.com/player/20801/c-ronaldo-dos-s...,Portugal,ST LW,Cristiano Ronaldo,35,92,92,20801,6.2,183.0,Right,92,ST,0,2018-07-10,,46000000,220000,75900000,437,84,95,90,82,86,414,88,81,76,77,92,431,87,91,87,95,71,444,94,95,84,78,93,353,63,29,95,82,84,95,84,28,32,24,58,7,11,15,14,11,2221,464,4 ★,5★,High,Low,5 ★,89,93,81,89,35,77,344,7,10,2018,Juventus,2018 ~ 2022
2,https://cdn.sofifa.com/players/200/389/21_60.png,Jan Oblak,http://sofifa.com/player/200389/jan-oblak/210005/,Slovenia,GK,J. Oblak,27,91,93,200389,6.2,192.0,Right,91,GK,2,2014-07-16,,75000000,125000,159400000,95,13,11,15,43,13,109,12,13,14,40,30,307,43,60,67,88,49,268,59,78,41,78,12,140,34,19,11,65,11,68,57,27,12,18,437,87,92,78,90,90,1413,489,3 ★,1★,Medium,Medium,3 ★,87,92,78,90,52,90,86,7,16,2014,Atlético Madrid,2014 ~ 2023
3,https://cdn.sofifa.com/players/192/985/21_60.png,Kevin De Bruyne,http://sofifa.com/player/192985/kevin-de-bruyn...,Belgium,CAM CM,K. De Bruyne,29,91,91,192985,5.11,154.0,Right,91,CAM,0,2015-08-30,,87000000,370000,161000000,407,94,82,55,94,82,441,88,85,83,93,92,398,77,76,78,91,76,408,91,63,89,74,91,408,76,66,88,94,84,91,186,68,65,53,56,15,13,5,10,13,2304,485,5 ★,4★,High,High,4 ★,76,86,93,88,64,78,163,8,30,2015,Manchester City,2015 ~ 2023
4,https://cdn.sofifa.com/players/190/871/21_60.png,Neymar da Silva Santos Jr.,http://sofifa.com/player/190871/neymar-da-silv...,Brazil,LW CAM,Neymar Jr,28,91,91,190871,5.9,150.0,Right,91,LW,0,2017-08-03,,90000000,270000,166500000,408,85,87,62,87,87,448,95,88,89,81,95,453,94,89,96,91,83,357,80,62,81,50,84,356,51,36,87,90,92,93,94,35,30,29,59,9,9,15,15,11,2175,451,5 ★,5★,High,Medium,5 ★,91,85,86,94,36,59,273,8,3,2017,Paris Saint-Germain,2017 ~ 2022
