# Financial data about players and club (Everton FC)

We are going to scrape Transfermarkt for player market values, how much the team spent on new signings, and how much it received for players sold.

Market value website: https://www.transfermarkt.co.uk/transfers/transferrekorde/statistik?saison_id=2017&land_id=0&ausrichtung=&spielerposition_id=&altersklasse=&leihe=&w_s=&plus=1

Transfer info website: https://www.transfermarkt.com/premier-league/transfers/wettbewerb/GB1/saison_id/2017 

Everton squad market value: https://www.transfermarkt.com/everton-fc/kader/verein/29/plus/0/galerie/0?saison_id=2017 

In [88]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

In [89]:
# Extract player name, position, club left, club joined, fee and market value
player_names = []
player_positions = []
player_left_clubs = []
player_joined_clubs = []
player_fees = []
player_values = []

headers = {'User-Agent': 
           'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}

base_url = "https://www.transfermarkt.co.uk/transfers/transferrekorde/statistik?saison_id=2017&land_id=0&ausrichtung=&spielerposition_id=&altersklasse=&leihe=&w_s=&plus=1&page="

# Loop through multiple pages
for page_number in range(1, 11):
    page = base_url + str(page_number)
    pageTree = requests.get(page, headers=headers)
    pageSoup = BeautifulSoup(pageTree.content, 'html.parser')
    player_rows = pageSoup.find_all("tr", {"class": ["odd", "even"]})

    for player_row in player_rows:
        player_name = player_row.select("a")[0].text
        player_position = player_row.select("td")[4].text
        player_left_club = player_row.select("td")[11].text
        player_joined_club = player_row.select("td")[15].text
        player_fee = player_row.select("td")[17].text
        player_value = player_row.select("td")[6].text

        player_names.append(player_name)
        player_positions.append(player_position)
        player_left_clubs.append(player_left_club)
        player_joined_clubs.append(player_joined_club)
        player_fees.append(player_fee)
        player_values.append(player_value)

df = pd.DataFrame({
    "Player Name": player_names,
    "Player Position": player_positions,
    "Left Club": player_left_clubs,
    "Current Club": player_joined_clubs,
    "Market Value (m€)": player_values,
    "Fee (m€)": player_fees
})

df

Unnamed: 0,Player Name,Player Position,Left Club,Current Club,Market Value (m€),Fee (m€)
0,Neymar,Left Winger,\nBarcelona,\nParis SG,€100.00m,€222.00m
1,Philippe Coutinho,Attacking Midfield,\nLiverpool,\nBarcelona,€90.00m,€135.00m
2,Ousmane Dembélé,Right Winger,\nBor. Dortmund,\nBarcelona,€33.00m,€135.00m
3,Romelu Lukaku,Centre-Forward,\nEverton,\nMan Utd,€50.00m,€84.70m
4,Virgil van Dijk,Centre-Back,\nSouthampton,\nLiverpool,€30.00m,€84.65m
...,...,...,...,...,...,...
245,Avilés Hurtado,Right Winger,\nClub Tijuana,\nMonterrey,€2.50m,€7.13m
246,Éver Banega,Central Midfield,\nInter,\nSevilla FC,€16.00m,€7.00m
247,Nolito,Left Winger,\nMan City,\nSevilla FC,€12.00m,€7.00m
248,Ryad Boudebouz,Attacking Midfield,\nMontpellier,\nReal Betis,€10.00m,€7.00m
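
The fixed cell positions used above (for example `td[11]` for the selling club) depend on the page layout, so a row with fewer cells would raise an `IndexError`. As a minimal sketch, assuming the cell texts of a row have already been collected into a list, a hypothetical helper `cell_text` (not part of the original notebook) could return a default value instead:

```python
def cell_text(cells, index, default=None):
    """Return the stripped text at cells[index], or default if the row is too short.

    `cells` is assumed to be a list of strings, e.g.
    [td.text for td in player_row.select("td")].
    """
    try:
        return cells[index].strip()
    except IndexError:
        return default


# Made-up cell texts standing in for one table row (not the real page layout)
row_cells = ["1", "Neymar", "\nBarcelona "]
print(cell_text(row_cells, 2))   # Barcelona
print(cell_text(row_cells, 11))  # None
```

Rows that come back with `None` can then be inspected or skipped rather than stopping the whole scrape.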


In [90]:
# Clean up the data
df["Left Club"] = df["Left Club"].str.replace("\n", "").str.strip()
df["Current Club"] = df["Current Club"].str.replace("\n", "").str.strip()
df['Fee (m€)'] = df['Fee (m€)'].str.replace('€', '').str.replace('m', '')
df['Market Value (m€)'] = df['Market Value (m€)'].str.replace('€', '').str.replace('m', '')

df['Market Value (m€)'] = pd.to_numeric(df['Market Value (m€)'], errors='coerce')
df['Fee (m€)'] = pd.to_numeric(df['Fee (m€)'], errors='coerce')

df

Unnamed: 0,Player Name,Player Position,Left Club,Current Club,Market Value (m€),Fee (m€)
0,Neymar,Left Winger,Barcelona,Paris SG,100.0,222.00
1,Philippe Coutinho,Attacking Midfield,Liverpool,Barcelona,90.0,135.00
2,Ousmane Dembélé,Right Winger,Bor. Dortmund,Barcelona,33.0,135.00
3,Romelu Lukaku,Centre-Forward,Everton,Man Utd,50.0,84.70
4,Virgil van Dijk,Centre-Back,Southampton,Liverpool,30.0,84.65
...,...,...,...,...,...,...
245,Avilés Hurtado,Right Winger,Club Tijuana,Monterrey,2.5,7.13
246,Éver Banega,Central Midfield,Inter,Sevilla FC,16.0,7.00
247,Nolito,Left Winger,Man City,Sevilla FC,12.0,7.00
248,Ryad Boudebouz,Attacking Midfield,Montpellier,Real Betis,10.0,7.00
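
Because `pd.to_numeric(..., errors='coerce')` silently turns anything it cannot parse into `NaN`, it can be worth checking which raw strings were lost. The following is a small sketch on made-up strings (the series `raw` is illustrative, not scraped data):

```python
import pandas as pd

# Made-up raw strings: two parse cleanly, two do not
raw = pd.Series(["222.00", "84.70", "Loan fee", "?"])
parsed = pd.to_numeric(raw, errors="coerce")

# Original text of the rows that could not be converted to a number
unparsed = raw[parsed.isna()]
print(unparsed.tolist())
```

Applied to the real columns, the same check would need a copy of the text kept before conversion.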


In [91]:
# Select the rows where the player either left or joined Everton
everton_df = df[(df['Left Club'] == 'Everton') | (df['Current Club'] == 'Everton')]
everton_df

Unnamed: 0,Player Name,Player Position,Left Club,Current Club,Market Value (m€),Fee (m€)
3,Romelu Lukaku,Centre-Forward,Everton,Man Utd,50.0,84.7
13,Gylfi Sigurdsson,Attacking Midfield,Swansea,Everton,25.0,49.4
39,Michael Keane,Centre-Back,Burnley,Everton,18.0,28.5
40,Jordan Pickford,Goalkeeper,Sunderland,Everton,15.0,28.5
44,Davy Klaassen,Central Midfield,Ajax,Everton,18.0,27.0
58,Theo Walcott,Right Winger,Arsenal,Everton,20.0,22.5
59,Cenk Tosun,Centre-Forward,Besiktas,Everton,10.5,22.5
97,Ross Barkley,Central Midfield,Everton,Chelsea,25.0,16.8
149,Gerard Deulofeu,Centre-Forward,Everton,Barcelona,12.0,12.0
172,Nikola Vlašić,Attacking Midfield,Hajduk Split,Everton,4.0,10.8
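
One way to make the direction of each transfer explicit is to add a label column and group on it. This is only a sketch on three rows copied from the table above; the column name `Direction` is an assumption introduced for illustration:

```python
import pandas as pd

# Three rows copied from the Everton table above
sample = pd.DataFrame({
    "Player Name": ["Romelu Lukaku", "Gylfi Sigurdsson", "Ross Barkley"],
    "Left Club": ["Everton", "Swansea", "Everton"],
    "Current Club": ["Man Utd", "Everton", "Chelsea"],
    "Fee (m€)": [84.7, 49.4, 16.8],
})

# Label each transfer from Everton's point of view
sample["Direction"] = (
    sample["Current Club"].eq("Everton").map({True: "In", False: "Out"})
)

# Total fee per direction
fees_by_direction = sample.groupby("Direction")["Fee (m€)"].sum()
print(fees_by_direction)
```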


In [92]:
# This shows the Everton transfers that appear in the scraped list, with the fee paid for each player
# and each player's market value at the time of the transfer

# Let's calculate the total fee spent by Everton and how much they received for players sold
total_spent = everton_df[everton_df['Current Club'] == 'Everton']['Fee (m€)'].sum()
total_received = everton_df[everton_df['Left Club'] == 'Everton']['Fee (m€)'].sum()
net_profit = total_received - total_spent

print("Total Spent: " + str(total_spent) + " million €")
print("Total Received: " + str(total_received) + " million €")
print("Net profit: " + str(net_profit) + " million €")

Total Spent: 197.2 million €
Total Received: 122.8 million €
Net profit: -74.39999999999999 million €
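
The long decimal in the last line is ordinary binary floating-point noise. A minimal sketch of formatting the result to one decimal place, reusing the two totals printed above (the variable name `net_text` is illustrative):

```python
# Totals as printed by the cell above
total_spent = 197.2
total_received = 122.8
net = total_received - total_spent

# Format to one decimal place so the floating-point noise is not shown
net_text = f"{net:.1f}"
print("Net profit: " + net_text + " million €")
```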


In [93]:
# Let's see what the squad they had at the start of the season was worth
# We need to scrape another page to get the squad at the start of the season

# Extract player name, position and market value
player_names = []
player_positions = []
player_values = []

headers = {'User-Agent': 
           'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}

page = "https://www.transfermarkt.com/everton-fc/kader/verein/29/saison_id/2017"
pageTree = requests.get(page, headers=headers)
pageSoup = BeautifulSoup(pageTree.content, 'html.parser')

player_rows = pageSoup.find_all("tr", {"class": ["odd", "even"]})

for player_row in player_rows:
    player_name = player_row.select("a")[0].text
    player_position = player_row.select("td")[4].text
    player_value = player_row.select("td")[8].text

    player_names.append(player_name)
    player_positions.append(player_position)
    player_values.append(player_value)

everton_squad = pd.DataFrame({
    "Player Name": player_names,
    "Player Position": player_positions,
    "Left Club": None,
    "Current Club": None,
    "Market Value (m€)": player_values,
    "Fee (m€)": None
})

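# Strip whitespace, the currency symbol and the unit suffixes, then convert to numbers
# Note: removing 'k' does not rescale thousands, so such a value is read as e.g. 500.0 instead of 0.5 (corrected by hand further down)
everton_squad['Player Name'] = everton_squad['Player Name'].str.replace("\n", "").str.strip()
everton_squad['Player Position'] = everton_squad['Player Position'].str.replace("\n", "").str.strip()
everton_squad['Market Value (m€)'] = everton_squad['Market Value (m€)'].str.replace('€', '').str.replace('m', '').str.replace('k', '')
everton_squad['Market Value (m€)'] = pd.to_numeric(everton_squad['Market Value (m€)'], errors='coerce')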

everton_squad

Unnamed: 0,Player Name,Player Position,Left Club,Current Club,Market Value (m€),Fee (m€)
0,Jordan Pickford,Goalkeeper,,,30.0,
1,Joel Robles,Goalkeeper,,,3.5,
2,Maarten Stekelenburg,Goalkeeper,,,1.0,
3,Mateusz Hewelt,Goalkeeper,,,,
4,Michael Keane,Centre-Back,,,20.0,
5,Eliaquim Mangala,Centre-Back,,,15.0,
6,Ramiro Funes Mori,Centre-Back,,,10.0,
7,Mason Holgate,Centre-Back,,,7.0,
8,Ashley Williams,Centre-Back,,,5.0,
9,Cuco Martina,Centre-Back,,,2.5,
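
Stripping both `m` and `k` leaves values in thousands unscaled. As a sketch only, a hypothetical helper `parse_value_millions` (not in the original notebook) could read the unit suffix and always return millions; the text format `€<number>m` / `€<number>k` is an assumption about how the page renders values:

```python
import re

# Assumed format: a euro sign, a number, then 'm' (millions) or 'k' (thousands)
_VALUE_PATTERN = re.compile(r"€\s*(\d+(?:\.\d+)?)\s*([mk])")


def parse_value_millions(text):
    """Convert text such as '€30.00m' or '€500k' to a float in millions.

    Returns None when the text does not match the assumed format.
    """
    if text is None:
        return None
    match = _VALUE_PATTERN.search(text.strip())
    if match is None:
        return None
    amount = float(match.group(1))
    # Thousands are divided by 1000 so every result is in millions
    return amount if match.group(2) == "m" else amount / 1000


print(parse_value_millions("€30.00m"))  # 30.0
print(parse_value_millions("€500k"))    # 0.5
print(parse_value_millions("-"))        # None
```

A helper like this could be applied with `Series.map` before building the numeric column, which would make a later manual correction unnecessary.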


In [94]:
# Put the two dataframes together (DataFrame.append is no longer available in recent pandas versions, so use pd.concat)
everton_df = pd.concat([everton_df, everton_squad])



In [95]:
# sort by player name
everton_df = everton_df.sort_values(by=['Player Name'])
everton_df = everton_df.reset_index(drop=True)

# Remove duplicate players, keeping the last row for each name (after the sort this is not always the transfer row)
everton_df = everton_df.drop_duplicates(subset=['Player Name'], keep='last')
everton_df = everton_df.reset_index(drop=True)

# Set everyone's current club to Everton
everton_df['Current Club'] = 'Everton'

# Sort by market value
everton_df = everton_df.sort_values(by=['Market Value (m€)'], ascending=False)
everton_df = everton_df.reset_index(drop=True)

# The first two players' market values are incorrect: 500.0 should be 0.5 and 250.0 should be 0.25
# Use .loc so the values are set on the DataFrame itself rather than on a temporary copy
everton_df.loc[0, 'Market Value (m€)'] = 0.5
everton_df.loc[1, 'Market Value (m€)'] = 0.25

# Sort by market value (again)
everton_df = everton_df.sort_values(by=['Market Value (m€)'], ascending=False)
everton_df = everton_df.reset_index(drop=True)

everton_df


Unnamed: 0,Player Name,Player Position,Left Club,Current Club,Market Value (m€),Fee (m€)
0,Romelu Lukaku,Centre-Forward,Everton,Everton,50.0,84.7
1,Ross Barkley,Central Midfield,Everton,Everton,25.0,16.8
2,Gylfi Sigurdsson,Attacking Midfield,Swansea,Everton,25.0,49.4
3,Morgan Schneiderlin,Defensive Midfield,,Everton,20.0,
4,Michael Keane,Centre-Back,,Everton,20.0,
5,Idrissa Gueye,Central Midfield,,Everton,20.0,
6,Theo Walcott,Right Winger,Arsenal,Everton,20.0,22.5
7,Yannick Bolasie,Left Winger,,Everton,18.0,
8,Davy Klaassen,Central Midfield,Ajax,Everton,18.0,27.0
9,Dominic Calvert-Lewin,Centre-Forward,,Everton,15.0,
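
In the output above, some duplicated players keep their transfer row (Gylfi Sigurdsson still shows Swansea) while others keep their squad row (Michael Keane has no selling club). A possible way to make the choice deterministic, sketched here on a few values copied from the earlier tables, is to skip the sort by name and rely on the concatenation order, since `drop_duplicates` preserves row order:

```python
import pandas as pd

# Values copied from the tables above; only the columns needed for the demo
transfers = pd.DataFrame({
    "Player Name": ["Michael Keane", "Theo Walcott"],
    "Left Club": ["Burnley", "Arsenal"],
    "Market Value (m€)": [18.0, 20.0],
})
squad = pd.DataFrame({
    "Player Name": ["Michael Keane", "Mason Holgate"],
    "Market Value (m€)": [20.0, 7.0],
})

# Squad rows go first, so keep="last" always prefers the transfer row
combined = pd.concat([squad, transfers], ignore_index=True)
deduped = (
    combined.drop_duplicates(subset=["Player Name"], keep="last")
    .reset_index(drop=True)
)
print(deduped)
```

Whether the transfer-time value or the squad-page value is the right one to keep is a modelling choice; swapping the order of the two frames in `pd.concat` would prefer the squad row instead.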


In [96]:
# Let's calculate the total value of the squad at the start of the season
total_value = everton_df['Market Value (m€)'].sum()
print("Total Value: " + str(total_value) + " million €")

Total Value: 449.75 million €
