# BUSINESS CASE: EUROLEAGUE CALENDAR

### Welcome to Euroleague Data!

In this scenario, our main goal is to extract useful information from the Euroleague Basketball calendar.

![](https://i.ytimg.com/vi/V9m_mrusffc/hqdefault.jpg)

Instead of opening a CSV or an Excel file, we will start by opening a JSON file: 

In [1]:
# import the recommended libraries
import pandas as pd
import json
import numpy as np

import warnings
warnings.filterwarnings("ignore")

In [2]:
with open("datasets/el2023.json") as f:
    data=json.load(f)

#### First, let's explore the data: 

In [3]:
type(data)

dict

Looking good! It is a dictionary (JSON style!). We will now try to load it as a DataFrame using pandas: 

In [4]:
df=pd.DataFrame(data)

In [5]:
df.head()

Unnamed: 0,data,total
0,"{'gameCode': 304, 'season': {'name': 'EuroLeag...",306
1,"{'gameCode': 306, 'season': {'name': 'EuroLeag...",306
2,"{'gameCode': 303, 'season': {'name': 'EuroLeag...",306
3,"{'gameCode': 305, 'season': {'name': 'EuroLeag...",306
4,"{'gameCode': 301, 'season': {'name': 'EuroLeag...",306


![](https://media.tenor.com/O2Tz9B1UEMsAAAAM/sxv-wtf.gif)

WTF is going on? Let's explore more about this json file: 

In [6]:
data["data"][300].keys()

dict_keys(['gameCode', 'season', 'group', 'phaseType', 'round', 'roundAlias', 'roundName', 'played', 'date', 'confirmedDate', 'confirmedHour', 'localTimeZone', 'localDate', 'utcDate', 'local', 'road', 'audience', 'audienceConfirmed', 'socialFeed', 'operationsCode', 'referee1', 'referee2', 'referee3', 'referee4', 'venue', 'isNeutralVenue', 'gameStatus', 'winner'])

In [7]:
data["data"][300]["road"]

{'club': {'code': 'MIL',
  'name': 'EA7 Emporio Armani Milan',
  'abbreviatedName': 'Milan',
  'editorialName': 'Milan',
  'tvCode': 'EA7',
  'isVirtual': False,
  'images': {'crest': 'https://media-cdn.incrowdsports.com/8154f184-c61a-4e7f-b14d-9d802e35cb95.png'}},
 'score': 82,
 'standingsScore': 76,
 'partials': {'partials1': 23,
  'partials2': 12,
  'partials3': 19,
  'partials4': 22,
  'extraPeriods': {'1': 6}}}

In [8]:
data["data"][0].keys()

dict_keys(['gameCode', 'season', 'group', 'phaseType', 'round', 'roundAlias', 'roundName', 'played', 'date', 'confirmedDate', 'confirmedHour', 'localTimeZone', 'localDate', 'utcDate', 'local', 'road', 'audience', 'audienceConfirmed', 'socialFeed', 'operationsCode', 'referee1', 'referee2', 'referee3', 'referee4', 'venue', 'isNeutralVenue', 'gameStatus', 'winner'])
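The odd table returned by `df.head()` makes sense once the top level of the file is considered: it holds only two keys, `data` (the list of games) and `total`, so pandas places one whole game dictionary in each cell. A minimal sketch with a made-up payload of the same shape (the `payload` values are illustrative and not taken from the file):

```python
import pandas as pd

# Hypothetical payload mimicking the top-level shape of the file: a list of games plus a total.
payload = {"data": [{"gameCode": 1, "round": 1}, {"gameCode": 2, "round": 1}], "total": 2}

# Passing the whole dictionary keeps each game as a single dict inside the "data" column.
wrapped = pd.DataFrame(payload)
print(list(wrapped.columns))    # ['data', 'total']

# Passing only the list of games gives one column per top-level key of a game.
unwrapped = pd.DataFrame(payload["data"])
print(list(unwrapped.columns))  # ['gameCode', 'round']
```

Nested values such as `local` or `road` would still be dictionaries after this step, which is why the notebook extracts the fields one by one.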

### We will need to extract our information step by step!
#### Instead of extracting everything and then filtering the data, we will do it the other way around: what are the basic columns that we must obtain from this JSON file?
- fecha (date)
- ronda (round)
- local (home team)
- localscore
- road (away team)
- roadscore
- arbitros (referees, as a list)

In [9]:
# Explore the df to know exactly where the useful info is
data["data"][300]["referee3"]["name"]

'NIKOLIC, UROS'

#### Now that we have everything we need, it is time to extract it and save it in lists!

In [10]:
fecha=[]
ronda=[]
local=[]
visitante=[]
localscore=[]
roadscore=[]
arbitros=[[] for i in range(len(data["data"]))]

In [11]:
data["data"][0]["referee1"]

In [12]:
# enumerate gives the position of each game, so no manual counter is needed
for count, i in enumerate(data["data"]):
    fecha.append(i["date"])
    ronda.append(i["round"])
    local.append(i["local"]["club"]["name"])
    visitante.append(i["road"]["club"]["name"])
    localscore.append(i["local"]["score"])
    roadscore.append(i["road"]["score"])
    # Only games with a home score above 0 (already played) get referee names
    if i["local"]["score"]>0:
        for k in range(3):
            arbitros[count].append(i[f"referee{k+1}"]["name"])
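As an alternative to filling lists by hand as in the loop above, `pandas.json_normalize` can flatten nested dictionaries into dotted column names. The sketch below uses two made-up games shaped like the entries of `data["data"]`; the team names, scores and the `wanted` list are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical games shaped like the entries of data["data"] (names and scores are made up).
games = [
    {"date": "2023-10-05T19:00:00", "round": 1,
     "local": {"club": {"name": "Team A"}, "score": 94},
     "road": {"club": {"name": "Team B"}, "score": 73}},
    {"date": "2023-10-05T20:30:00", "round": 1,
     "local": {"club": {"name": "Team C"}, "score": 79},
     "road": {"club": {"name": "Team D"}, "score": 82}},
]

# json_normalize turns nested keys into columns such as "local.club.name".
flat = pd.json_normalize(games)

# Keep only the columns of interest, in a readable order.
wanted = ["date", "round", "local.club.name", "road.club.name", "local.score", "road.score"]
flat = flat[wanted]
print(flat)
```

The manual loop remains useful when some fields need custom handling, as the referees do here.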

In [13]:
arbitros[::-1]

[['ROCHA, FERNANDO', 'PATERNICO, CARMELO', 'PASTUSIAK, PIOTR'],
 ['PEREZ, MIGUEL ANGEL', 'VILIUS, GYTIS', 'KARDUM, LUKA'],
 ['RADOVIC, SRETEN', 'DIFALLAH, MEHDI', 'JOVCIC, MILIVOJE'],
 ['GARCIA, JUAN CARLOS', 'LATISEVS, OLEGS', 'SUKYS, ARTURAS'],
 ['BOLTAUZER, MATEJ', 'PANTHER, ANNE', 'SILVA, SERGIO'],
 ['PUKL,SASA', 'PERUGA, CARLOS', 'NIKOLIC, UROS'],
 ['LOTTERMOSER, ROBERT', 'PEREZ, EMILIO', 'NEDOVIC, MILAN'],
 ['BELOSEVIC, ILIJA', 'MOGULKOC, EMIN', 'RACYS, SAULIUS'],
 ['JAVOR, DAMIR', 'HORDOV, TOMISLAV ', 'PEERANDI, RAIN'],
 ['BOLTAUZER, MATEJ', 'NIKOLIC, UROS', 'BISSANG, JOSEPH'],
 ['ROCHA, FERNANDO', 'KOLJENSIC, MILOS ', 'TRAWICKI, TOMASZ'],
 ['LOTTERMOSER, ROBERT', 'MOGULKOC, EMIN', 'RADOJKOVIC, JOSIP '],
 ['BELOSEVIC, ILIJA', 'PERUGA, CARLOS', 'MAJKIC, MARIO '],
 ['RADOVIC, SRETEN', 'LATISEVS, OLEGS', 'ALIAGA, JORDI'],
 ['DIFALLAH, MEHDI', 'NEDOVIC, MILAN', 'CORTES, CARLOS'],
 ['PUKL,SASA', 'VILIUS, GYTIS', 'SILVA, SERGIO'],
 ['JOVCIC, MILIVOJE', 'PANTHER, ANNE', 'ZAMOJSKI, JAKU

#### Let's now save it as a DataFrame

In [14]:
tabla=pd.DataFrame({"Fecha": fecha, 
                    "Ronda": ronda, 
                    "Local": local, 
                    "Visitante": visitante, 
                    "Arbitros": arbitros, 
                    "Localscore": localscore,
                    "Roadscore": roadscore})

In [15]:
tabla.tail()

Unnamed: 0,Fecha,Ronda,Local,Visitante,Arbitros,Localscore,Roadscore
301,2023-10-05T20:30:00,1,Virtus Segafredo Bologna,Zalgiris Kaunas,"[BOLTAUZER, MATEJ, PANTHER, ANNE, SILVA, SERGIO]",79,82
302,2023-10-05T20:30:00,1,FC Bayern Munich,ALBA Berlin,"[GARCIA, JUAN CARLOS, LATISEVS, OLEGS, SUKYS, ...",80,68
303,2023-10-05T20:30:00,1,FC Barcelona,Anadolu Efes Istanbul,"[RADOVIC, SRETEN, DIFALLAH, MEHDI, JOVCIC, MIL...",91,74
304,2023-10-05T20:05:00,1,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,"[PEREZ, MIGUEL ANGEL, VILIUS, GYTIS, KARDUM, L...",96,81
305,2023-10-05T19:00:00,1,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,"[ROCHA, FERNANDO, PATERNICO, CARMELO, PASTUSIA...",94,73


#### Is there any problem when looking at the .head of the data? Is it normal?

In [16]:
tabla.head()

Unnamed: 0,Fecha,Ronda,Local,Visitante,Arbitros,Localscore,Roadscore
0,2024-04-12T21:00:00,34,LDLC ASVEL Villeurbanne,FC Barcelona,[],0,0
1,2024-04-12T20:30:00,34,Partizan Mozzart Bet Belgrade,Valencia Basket,[],0,0
2,2024-04-12T20:30:00,34,Virtus Segafredo Bologna,Baskonia Vitoria-Gasteiz,[],0,0
3,2024-04-12T20:15:00,34,Olympiacos Piraeus,Fenerbahce Beko Istanbul,[],0,0
4,2024-04-11T20:15:00,34,Panathinaikos AKTOR Athens,ALBA Berlin,[],0,0


#### Try the `info` and `describe` methods. Does either of them show something we did not expect? 
Tick, tock...

In [17]:
tabla.Fecha=pd.to_datetime(tabla.Fecha)
tabla.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 306 entries, 0 to 305
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   Fecha       306 non-null    datetime64[ns]
 1   Ronda       306 non-null    int64         
 2   Local       306 non-null    object        
 3   Visitante   306 non-null    object        
 4   Arbitros    306 non-null    object        
 5   Localscore  306 non-null    int64         
 6   Roadscore   306 non-null    int64         
dtypes: datetime64[ns](1), int64(3), object(3)
memory usage: 16.9+ KB


In [18]:
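# Option 1: slice the string representation of each timestamp
tabla['Dia'] = tabla['Fecha'].map(lambda x: str(x)[:10])
tabla['Hora'] = tabla['Fecha'].map(lambda x: str(x)[11:])

# Option 2: format through the .dt accessor (overwrites option 1 with an equivalent result)
tabla['Dia'] = tabla['Fecha'].dt.strftime('%Y-%m-%d')
tabla['Hora'] = tabla['Fecha'].dt.strftime('%H:%M:%S')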
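The cell above computes the day and the hour in two ways, by slicing the text of each timestamp and by formatting through `.dt.strftime`. A small sketch on two sample timestamps (the `fechas` Series is an assumption, built from dates that appear in the calendar) suggests that both routes give the same strings:

```python
import pandas as pd

# Two sample timestamps in the same ISO format as the "date" field.
fechas = pd.to_datetime(pd.Series(["2023-10-05T19:00:00", "2024-04-12T21:00:00"]))

# Route 1: slice the text form "YYYY-MM-DD HH:MM:SS".
dia_slice = fechas.map(lambda x: str(x)[:10])
hora_slice = fechas.map(lambda x: str(x)[11:])

# Route 2: ask pandas to format each part explicitly.
dia_fmt = fechas.dt.strftime("%Y-%m-%d")
hora_fmt = fechas.dt.strftime("%H:%M:%S")

print(dia_slice.tolist() == dia_fmt.tolist())
print(hora_slice.tolist() == hora_fmt.tolist())
```

The `strftime` route states the intended format explicitly, so it does not depend on how a timestamp happens to print.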

#### Columns Addition: Competition and Phase

In [19]:
tabla["Competition"]="Euroleague"
tabla["Phase"]="Regular Season"
tabla.sample(2)

Unnamed: 0,Fecha,Ronda,Local,Visitante,Arbitros,Localscore,Roadscore,Dia,Hora,Competition,Phase
299,2023-10-06 20:15:00,1,Panathinaikos AKTOR Athens,Olympiacos Piraeus,"[LOTTERMOSER, ROBERT, PEREZ, EMILIO, NEDOVIC, ...",78,88,2023-10-06,20:15:00,Euroleague,Regular Season
73,2024-02-09 20:30:00,26,FC Barcelona,ALBA Berlin,[],0,0,2024-02-09,20:30:00,Euroleague,Regular Season


#### Column Drop: Any useless information?

In [20]:
tabla.drop(columns="Fecha", inplace=True)

### Rearranging the Columns: Any ideas?

In [21]:
tabla.columns

Index(['Ronda', 'Local', 'Visitante', 'Arbitros', 'Localscore', 'Roadscore',
       'Dia', 'Hora', 'Competition', 'Phase'],
      dtype='object')

In [22]:
tabla=tabla.loc[:, ['Competition', 'Phase', 'Ronda', 'Dia', 'Hora', 'Local', 'Visitante', 'Localscore', 'Roadscore','Arbitros' ]]

In [23]:
tabla

Unnamed: 0,Competition,Phase,Ronda,Dia,Hora,Local,Visitante,Localscore,Roadscore,Arbitros
0,Euroleague,Regular Season,34,2024-04-12,21:00:00,LDLC ASVEL Villeurbanne,FC Barcelona,0,0,[]
1,Euroleague,Regular Season,34,2024-04-12,20:30:00,Partizan Mozzart Bet Belgrade,Valencia Basket,0,0,[]
2,Euroleague,Regular Season,34,2024-04-12,20:30:00,Virtus Segafredo Bologna,Baskonia Vitoria-Gasteiz,0,0,[]
3,Euroleague,Regular Season,34,2024-04-12,20:15:00,Olympiacos Piraeus,Fenerbahce Beko Istanbul,0,0,[]
4,Euroleague,Regular Season,34,2024-04-11,20:15:00,Panathinaikos AKTOR Athens,ALBA Berlin,0,0,[]
...,...,...,...,...,...,...,...,...,...,...
301,Euroleague,Regular Season,1,2023-10-05,20:30:00,Virtus Segafredo Bologna,Zalgiris Kaunas,79,82,"[BOLTAUZER, MATEJ, PANTHER, ANNE, SILVA, SERGIO]"
302,Euroleague,Regular Season,1,2023-10-05,20:30:00,FC Bayern Munich,ALBA Berlin,80,68,"[GARCIA, JUAN CARLOS, LATISEVS, OLEGS, SUKYS, ..."
303,Euroleague,Regular Season,1,2023-10-05,20:30:00,FC Barcelona,Anadolu Efes Istanbul,91,74,"[RADOVIC, SRETEN, DIFALLAH, MEHDI, JOVCIC, MIL..."
304,Euroleague,Regular Season,1,2023-10-05,20:05:00,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,96,81,"[PEREZ, MIGUEL ANGEL, VILIUS, GYTIS, KARDUM, L..."


### Now let's try to get rid of the matches without information: BRAINSTORMING

In [24]:
# .copy() makes an independent DataFrame, so new columns can be added to it safely later
actual=tabla[tabla["Localscore"]>0].copy()

In [25]:
actual.reset_index(inplace=True, drop=True)

In [26]:
actual.head()

Unnamed: 0,Competition,Phase,Ronda,Dia,Hora,Local,Visitante,Localscore,Roadscore,Arbitros
0,Euroleague,Regular Season,25,2024-02-02,20:30:00,Virtus Segafredo Bologna,Partizan Mozzart Bet Belgrade,88,84,"[LOTTERMOSER, ROBERT, PUKL,SASA, TRAWICKI, TOM..."
1,Euroleague,Regular Season,25,2024-02-02,20:30:00,AS Monaco,Fenerbahce Beko Istanbul,76,69,"[JAVOR, DAMIR, FOUFIS, IOANNIS, BALAK, AMIT]"
2,Euroleague,Regular Season,25,2024-02-02,20:00:00,Crvena Zvezda Meridianbet Belgrade,FC Barcelona,76,85,"[ROCHA, FERNANDO, RACYS, SAULIUS, KONSTANTINOV..."
3,Euroleague,Regular Season,25,2024-02-02,19:00:00,Zalgiris Kaunas,Panathinaikos AKTOR Athens,80,68,"[RADOVIC, SRETEN, PASTUSIAK, PIOTR, CORTES, CA..."
4,Euroleague,Regular Season,25,2024-02-02,18:30:00,Anadolu Efes Istanbul,EA7 Emporio Armani Milan,79,73,"[PEREZ, MIGUEL ANGEL, KARDUM, LUKA, BAENA, ALB..."


In [27]:
actual["Plusminus"]=actual["Localscore"]-actual["Roadscore"]

In [28]:
# Two equivalent ways to get the winning club: a row-wise apply and a vectorized np.where
actual["Winner"]=actual.apply(lambda row: row["Local"] if row["Plusminus"]>0 else row["Visitante"], axis=1)
actual["winner"]=np.where(actual["Plusminus"]>0, actual.Local, actual.Visitante)

In [29]:
actual.rename(columns={"winner": "Ganador"}, inplace=True)

In [30]:
# Assigning to the existing "Ganador" column overwrites it: club names become "Local"/"Visitante"
actual["Ganador"]=np.where(actual.Plusminus>0, "Local", "Visitante")

In [31]:
actual.head()

Unnamed: 0,Competition,Phase,Ronda,Dia,Hora,Local,Visitante,Localscore,Roadscore,Arbitros,Plusminus,Winner,Ganador
0,Euroleague,Regular Season,25,2024-02-02,20:30:00,Virtus Segafredo Bologna,Partizan Mozzart Bet Belgrade,88,84,"[LOTTERMOSER, ROBERT, PUKL,SASA, TRAWICKI, TOM...",4,Virtus Segafredo Bologna,Local
1,Euroleague,Regular Season,25,2024-02-02,20:30:00,AS Monaco,Fenerbahce Beko Istanbul,76,69,"[JAVOR, DAMIR, FOUFIS, IOANNIS, BALAK, AMIT]",7,AS Monaco,Local
2,Euroleague,Regular Season,25,2024-02-02,20:00:00,Crvena Zvezda Meridianbet Belgrade,FC Barcelona,76,85,"[ROCHA, FERNANDO, RACYS, SAULIUS, KONSTANTINOV...",-9,FC Barcelona,Visitante
3,Euroleague,Regular Season,25,2024-02-02,19:00:00,Zalgiris Kaunas,Panathinaikos AKTOR Athens,80,68,"[RADOVIC, SRETEN, PASTUSIAK, PIOTR, CORTES, CA...",12,Zalgiris Kaunas,Local
4,Euroleague,Regular Season,25,2024-02-02,18:30:00,Anadolu Efes Istanbul,EA7 Emporio Armani Milan,79,73,"[PEREZ, MIGUEL ANGEL, KARDUM, LUKA, BAENA, ALB...",6,Anadolu Efes Istanbul,Local


### Now, let's sort the DataFrame: 
- Should we order it depending on more than one column?
- Will it affect the index?

In [32]:
actual.sort_values(["Ronda", "Dia", "Hora"], inplace=True)

In [33]:
actual.reset_index(inplace=True, drop=True)

In [34]:
actual.head(4)

Unnamed: 0,Competition,Phase,Ronda,Dia,Hora,Local,Visitante,Localscore,Roadscore,Arbitros,Plusminus,Winner,Ganador
0,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"[ROCHA, FERNANDO, PATERNICO, CARMELO, PASTUSIA...",21,Crvena Zvezda Meridianbet Belgrade,Local
1,Euroleague,Regular Season,1,2023-10-05,20:05:00,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,96,81,"[PEREZ, MIGUEL ANGEL, VILIUS, GYTIS, KARDUM, L...",15,Maccabi Playtika Tel Aviv,Local
2,Euroleague,Regular Season,1,2023-10-05,20:30:00,Virtus Segafredo Bologna,Zalgiris Kaunas,79,82,"[BOLTAUZER, MATEJ, PANTHER, ANNE, SILVA, SERGIO]",-3,Zalgiris Kaunas,Visitante
3,Euroleague,Regular Season,1,2023-10-05,20:30:00,FC Bayern Munich,ALBA Berlin,80,68,"[GARCIA, JUAN CARLOS, LATISEVS, OLEGS, SUKYS, ...",12,FC Bayern Munich,Local
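To make the two questions above concrete: sorting by several columns uses the first as the main key and the following ones to break ties, and the rows keep their old index labels until the index is reset. A minimal sketch on made-up rows (the `games` frame is hypothetical):

```python
import pandas as pd

# Made-up rows, deliberately out of order.
games = pd.DataFrame({
    "Ronda": [2, 1, 1],
    "Dia": ["2023-10-12", "2023-10-06", "2023-10-05"],
    "Hora": ["20:30:00", "19:00:00", "20:05:00"],
})

# Sort by round first, then by day, then by hour.
ordered = games.sort_values(["Ronda", "Dia", "Hora"])
print(list(ordered.index))  # [2, 1, 0]: the rows moved and their labels travelled with them

# Resetting the index renumbers the rows from 0 in the new order.
ordered = ordered.reset_index(drop=True)
print(list(ordered.index))  # [0, 1, 2]
```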


### Addition of more columns (based on conditions):

- Plus/minus (home score minus road score)
- Winners of the matches

#### Rename of a particular column

#### What happens if we apply the np.where to a column that has already been created?
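A short sketch of the answer, on a hypothetical two-game frame: assigning the result of `np.where` to a column name that already exists replaces the old values, and no extra column is created.

```python
import numpy as np
import pandas as pd

# Made-up games: one home win and one road win.
df = pd.DataFrame({
    "Local": ["Team A", "Team C"],
    "Visitante": ["Team B", "Team D"],
    "Plusminus": [21, -3],
})

# First assignment creates the column with club names.
df["Ganador"] = np.where(df["Plusminus"] > 0, df["Local"], df["Visitante"])
first = df["Ganador"].tolist()

# Second assignment to the same name replaces the values in place.
df["Ganador"] = np.where(df["Plusminus"] > 0, "Local", "Visitante")
second = df["Ganador"].tolist()

print(first)             # ['Team A', 'Team D']
print(second)            # ['Local', 'Visitante']
print(list(df.columns))  # ['Local', 'Visitante', 'Plusminus', 'Ganador']
```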

## Questions about this basic DF
- Have there been more home or road winners?
- Which club has won the most matches?

In [35]:
actual.Ganador.value_counts()

Local        150
Visitante     75
Name: Ganador, dtype: int64
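With 150 home wins out of 225 played games, home teams have won two thirds of the time. When a share is easier to read than a raw count, `value_counts(normalize=True)` returns proportions directly. A minimal sketch on a made-up Series:

```python
import pandas as pd

# Hypothetical results of four games.
ganador = pd.Series(["Local", "Local", "Visitante", "Local"], name="Ganador")

# normalize=True divides each count by the number of rows.
shares = ganador.value_counts(normalize=True)
print(shares)
```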

In [36]:
actual.Winner.value_counts()

Real Madrid                           22
FC Barcelona                          17
Virtus Segafredo Bologna              16
AS Monaco                             15
Fenerbahce Beko Istanbul              15
Panathinaikos AKTOR Athens            15
Olympiacos Piraeus                    14
Valencia Basket                       13
Maccabi Playtika Tel Aviv             13
Baskonia Vitoria-Gasteiz              13
Partizan Mozzart Bet Belgrade         12
Anadolu Efes Istanbul                 10
EA7 Emporio Armani Milan              10
Crvena Zvezda Meridianbet Belgrade    10
FC Bayern Munich                      10
Zalgiris Kaunas                       10
ALBA Berlin                            5
LDLC ASVEL Villeurbanne                5
Name: Winner, dtype: int64

## Referees

This column can cause some trouble because each value is not a string but a `list`.

We want to analyze: 

- How many different referees have already officiated a match?
- Which referee has officiated the most home wins?

In [37]:
arbitros=actual.copy()

In [38]:
arbitros_duplicated = arbitros.loc[arbitros.index.repeat(3)].reset_index(drop=True)

In [39]:
arbitros_duplicated.head()

Unnamed: 0,Competition,Phase,Ronda,Dia,Hora,Local,Visitante,Localscore,Roadscore,Arbitros,Plusminus,Winner,Ganador
0,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"[ROCHA, FERNANDO, PATERNICO, CARMELO, PASTUSIA...",21,Crvena Zvezda Meridianbet Belgrade,Local
1,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"[ROCHA, FERNANDO, PATERNICO, CARMELO, PASTUSIA...",21,Crvena Zvezda Meridianbet Belgrade,Local
2,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"[ROCHA, FERNANDO, PATERNICO, CARMELO, PASTUSIA...",21,Crvena Zvezda Meridianbet Belgrade,Local
3,Euroleague,Regular Season,1,2023-10-05,20:05:00,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,96,81,"[PEREZ, MIGUEL ANGEL, VILIUS, GYTIS, KARDUM, L...",15,Maccabi Playtika Tel Aviv,Local
4,Euroleague,Regular Season,1,2023-10-05,20:05:00,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,96,81,"[PEREZ, MIGUEL ANGEL, VILIUS, GYTIS, KARDUM, L...",15,Maccabi Playtika Tel Aviv,Local


In [40]:
referee=[]
count=0
for i in arbitros_duplicated.Arbitros:
    #count=0 why not here?
    referee.append(i[count])
    count+=1
    if count==3:
        count=0

In [41]:
arbitros_duplicated.Arbitros=referee

In [42]:
arbitros_duplicated.head()

Unnamed: 0,Competition,Phase,Ronda,Dia,Hora,Local,Visitante,Localscore,Roadscore,Arbitros,Plusminus,Winner,Ganador
0,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"ROCHA, FERNANDO",21,Crvena Zvezda Meridianbet Belgrade,Local
1,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"PATERNICO, CARMELO",21,Crvena Zvezda Meridianbet Belgrade,Local
2,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"PASTUSIAK, PIOTR",21,Crvena Zvezda Meridianbet Belgrade,Local
3,Euroleague,Regular Season,1,2023-10-05,20:05:00,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,96,81,"PEREZ, MIGUEL ANGEL",15,Maccabi Playtika Tel Aviv,Local
4,Euroleague,Regular Season,1,2023-10-05,20:05:00,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,96,81,"VILIUS, GYTIS",15,Maccabi Playtika Tel Aviv,Local
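The repeat-and-loop steps above could also be sketched with `DataFrame.explode`, which turns each element of a list column into its own row and copies the other columns along. The `games` frame and the referee names below are made up for illustration:

```python
import pandas as pd

# Hypothetical games, each with a list of three referees.
games = pd.DataFrame({
    "Local": ["Team A", "Team C"],
    "Ganador": ["Local", "Visitante"],
    "Arbitros": [["REF 1", "REF 2", "REF 3"], ["REF 4", "REF 5", "REF 1"]],
})

# One row per (game, referee); ignore_index renumbers the rows from 0.
per_referee = games.explode("Arbitros", ignore_index=True)
print(per_referee)
print(per_referee["Arbitros"].nunique())  # 5
```

Unlike the manual loop, this does not assume exactly three referees per game.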


In [43]:
arbitros_duplicated.Arbitros.nunique()

66

In [44]:
pd.crosstab(arbitros_duplicated["Arbitros"], arbitros_duplicated["Ganador"]).sort_values(["Local"], ascending=False)

Arbitros,Local,Visitante
"PERUGA, CARLOS",17,1
"JAVOR, DAMIR",16,3
"VILIUS, GYTIS",14,2
"ROCHA, FERNANDO",14,5
"LOTTERMOSER, ROBERT",13,5
...,...,...
"LAURINAVICIUS, JURGIS",1,3
"HALLIKO, AARE",1,3
"CELIK, HUSEYIN",1,4
"CLIVAZ, SEBASTIEN",0,2


## Geolocalization: Map function

Using the list of countries defined in the next cell, we want to assign each team its country in every row, so we know where each one plays.

Then, we want to analyze: 

- The average home and road points per country
- The maximum home and road score per country
- The number of times home teams have won on their own soil, when playing at home

In [45]:
# Order matters: each country must match the team in the same position of actual.Local.unique()
countries=["Serbia", "Israel", "Italy", "Germany", "Spain", "Turkey", "Greece", "Spain", "Spain", "Turkey","France", "Germany", "Lithuania", "France", "Greece", "Italy", "Serbia", "Spain"]

In [46]:
teams=list(actual.Local.unique())

In [47]:
geolocate={}
for team, country in zip(teams, countries):
    geolocate[team]=country

In [48]:
geolocate

{'Crvena Zvezda Meridianbet Belgrade': 'Serbia',
 'Maccabi Playtika Tel Aviv': 'Israel',
 'Virtus Segafredo Bologna': 'Italy',
 'FC Bayern Munich': 'Germany',
 'FC Barcelona': 'Spain',
 'Fenerbahce Beko Istanbul': 'Turkey',
 'Panathinaikos AKTOR Athens': 'Greece',
 'Valencia Basket': 'Spain',
 'Baskonia Vitoria-Gasteiz': 'Spain',
 'Anadolu Efes Istanbul': 'Turkey',
 'LDLC ASVEL Villeurbanne': 'France',
 'ALBA Berlin': 'Germany',
 'Zalgiris Kaunas': 'Lithuania',
 'AS Monaco': 'France',
 'Olympiacos Piraeus': 'Greece',
 'EA7 Emporio Armani Milan': 'Italy',
 'Partizan Mozzart Bet Belgrade': 'Serbia',
 'Real Madrid': 'Spain'}

In [49]:
paises=actual.copy()

In [50]:
paises["Country"]=paises["Local"].map(geolocate)

In [51]:
paises.head()

Unnamed: 0,Competition,Phase,Ronda,Dia,Hora,Local,Visitante,Localscore,Roadscore,Arbitros,Plusminus,Winner,Ganador,Country
0,Euroleague,Regular Season,1,2023-10-05,19:00:00,Crvena Zvezda Meridianbet Belgrade,LDLC ASVEL Villeurbanne,94,73,"[ROCHA, FERNANDO, PATERNICO, CARMELO, PASTUSIA...",21,Crvena Zvezda Meridianbet Belgrade,Local,Serbia
1,Euroleague,Regular Season,1,2023-10-05,20:05:00,Maccabi Playtika Tel Aviv,Partizan Mozzart Bet Belgrade,96,81,"[PEREZ, MIGUEL ANGEL, VILIUS, GYTIS, KARDUM, L...",15,Maccabi Playtika Tel Aviv,Local,Israel
2,Euroleague,Regular Season,1,2023-10-05,20:30:00,Virtus Segafredo Bologna,Zalgiris Kaunas,79,82,"[BOLTAUZER, MATEJ, PANTHER, ANNE, SILVA, SERGIO]",-3,Zalgiris Kaunas,Visitante,Italy
3,Euroleague,Regular Season,1,2023-10-05,20:30:00,FC Bayern Munich,ALBA Berlin,80,68,"[GARCIA, JUAN CARLOS, LATISEVS, OLEGS, SUKYS, ...",12,FC Bayern Munich,Local,Germany
4,Euroleague,Regular Season,1,2023-10-05,20:30:00,FC Barcelona,Anadolu Efes Istanbul,91,74,"[RADOVIC, SRETEN, DIFALLAH, MEHDI, JOVCIC, MIL...",17,FC Barcelona,Local,Spain
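Because the dictionary was built by pairing two lists by position, a team missing from it (or a changed team order) would go unnoticed: `Series.map` returns `NaN` for any key it cannot find. A small sketch of a sanity check, using a hypothetical dictionary and team names:

```python
import pandas as pd

# Hypothetical mapping that does not cover every team.
geolocate = {"Team A": "Spain", "Team B": "Greece"}
games = pd.DataFrame({"Local": ["Team A", "Team B", "Team Z"]})

# Teams absent from the dictionary get NaN instead of raising an error.
games["Country"] = games["Local"].map(geolocate)

# List the teams that were left without a country.
missing = games.loc[games["Country"].isna(), "Local"].tolist()
print(missing)  # ['Team Z']
```

An empty `missing` list is a quick sign that every home team received a country.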


In [52]:
paises[["Country", "Ganador"]].groupby("Country").value_counts()
paises.groupby(by=["Country", "Local"]).value_counts(["Ganador"])

Country    Local                               Ganador  
France     AS Monaco                           Local         9
                                               Visitante     3
           LDLC ASVEL Villeurbanne             Visitante    11
                                               Local         2
Germany    ALBA Berlin                         Visitante     9
                                               Local         4
           FC Bayern Munich                    Local         7
                                               Visitante     5
Greece     Olympiacos Piraeus                  Local         9
                                               Visitante     4
           Panathinaikos AKTOR Athens          Local        10
                                               Visitante     3
Israel     Maccabi Playtika Tel Aviv           Local         9
                                               Visitante     4
Italy      EA7 Emporio Armani Milan            Local         
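The first two questions of this section (average and maximum scores per country) could be sketched with a named aggregation. The frame below is made up; since `Country` comes from the home team, the road columns describe what visitors scored in that country.

```python
import pandas as pd

# Hypothetical games already labelled with the home team's country.
paises = pd.DataFrame({
    "Country": ["Spain", "Spain", "Greece"],
    "Localscore": [90, 80, 75],
    "Roadscore": [70, 84, 77],
})

# One output column per (source column, function) pair.
summary = paises.groupby("Country").agg(
    MeanLocal=("Localscore", "mean"),
    MeanRoad=("Roadscore", "mean"),
    MaxLocal=("Localscore", "max"),
    MaxRoad=("Roadscore", "max"),
)
print(summary)
```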

# Creation of the classification

Now we are going to create the final classification (the league standings): 

- What are the columns that we are going to use?
- Is any transformation needed?
- How can we handle both home and road teams?

In [53]:
local_df=actual[["Ronda", "Local", "Localscore", "Plusminus", "Winner"]].copy()

In [54]:
road_df=actual[["Ronda", "Visitante", "Roadscore", "Plusminus", "Winner"]].copy()

In [55]:
# Flip the sign so that Plusminus is seen from the road team's point of view
road_df["Plusminus"]=-road_df["Plusminus"]

In [56]:
clasificacion=pd.concat([local_df, road_df], axis=0, ignore_index=True)

In [57]:
clasificacion

Unnamed: 0,Ronda,Local,Localscore,Plusminus,Winner,Visitante,Roadscore
0,1,Crvena Zvezda Meridianbet Belgrade,94.0,21,Crvena Zvezda Meridianbet Belgrade,,
1,1,Maccabi Playtika Tel Aviv,96.0,15,Maccabi Playtika Tel Aviv,,
2,1,Virtus Segafredo Bologna,79.0,-3,Zalgiris Kaunas,,
3,1,FC Bayern Munich,80.0,12,FC Bayern Munich,,
4,1,FC Barcelona,91.0,17,FC Barcelona,,
...,...,...,...,...,...,...,...
445,25,,,-6,Anadolu Efes Istanbul,EA7 Emporio Armani Milan,73.0
446,25,,,-12,Zalgiris Kaunas,Panathinaikos AKTOR Athens,68.0
447,25,,,9,FC Barcelona,FC Barcelona,85.0
448,25,,,-4,Virtus Segafredo Bologna,Partizan Mozzart Bet Belgrade,84.0
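The `NaN` values in the table above appear because `pd.concat` aligns columns by name: home rows have no `Visitante` or `Roadscore`, and road rows have no `Local` or `Localscore`. A minimal sketch with made-up one-row frames shows the problem and how renaming to shared names avoids it:

```python
import pandas as pd

# Hypothetical one-row frames with different column names.
local_df = pd.DataFrame({"Local": ["Team A"], "Localscore": [94]})
road_df = pd.DataFrame({"Visitante": ["Team B"], "Roadscore": [73]})

# Different names: four columns, and each row is missing two of them.
mismatched = pd.concat([local_df, road_df], ignore_index=True)
print(mismatched.isna().sum().sum())  # 4

# Same names: the rows stack cleanly into two columns.
aligned = pd.concat([
    local_df.rename(columns={"Local": "Team", "Localscore": "Score"}),
    road_df.rename(columns={"Visitante": "Team", "Roadscore": "Score"}),
], ignore_index=True)
print(aligned)
```

This also explains why the scores show as floats (`94.0`) in the mismatched version: a column holding `NaN` cannot keep the plain integer type.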


In [58]:
local_df.rename(columns={"Local":"Team", "Localscore":"Score"}, inplace=True)
road_df.rename(columns={"Visitante":"Team", "Roadscore":"Score"}, inplace=True)

In [59]:
class_df=pd.concat([local_df, road_df], axis=0, ignore_index=True)

In [60]:
class_df

Unnamed: 0,Ronda,Team,Score,Plusminus,Winner
0,1,Crvena Zvezda Meridianbet Belgrade,94,21,Crvena Zvezda Meridianbet Belgrade
1,1,Maccabi Playtika Tel Aviv,96,15,Maccabi Playtika Tel Aviv
2,1,Virtus Segafredo Bologna,79,-3,Zalgiris Kaunas
3,1,FC Bayern Munich,80,12,FC Bayern Munich
4,1,FC Barcelona,91,17,FC Barcelona
...,...,...,...,...,...
445,25,EA7 Emporio Armani Milan,73,-6,Anadolu Efes Istanbul
446,25,Panathinaikos AKTOR Athens,68,-12,Zalgiris Kaunas
447,25,FC Barcelona,85,9,FC Barcelona
448,25,Partizan Mozzart Bet Belgrade,84,-4,Virtus Segafredo Bologna


In [61]:
class_df["Win"]=np.where(class_df["Team"]==class_df["Winner"], 1, 0)

In [62]:
class_df

Unnamed: 0,Ronda,Team,Score,Plusminus,Winner,Win
0,1,Crvena Zvezda Meridianbet Belgrade,94,21,Crvena Zvezda Meridianbet Belgrade,1
1,1,Maccabi Playtika Tel Aviv,96,15,Maccabi Playtika Tel Aviv,1
2,1,Virtus Segafredo Bologna,79,-3,Zalgiris Kaunas,0
3,1,FC Bayern Munich,80,12,FC Bayern Munich,1
4,1,FC Barcelona,91,17,FC Barcelona,1
...,...,...,...,...,...,...
445,25,EA7 Emporio Armani Milan,73,-6,Anadolu Efes Istanbul,0
446,25,Panathinaikos AKTOR Athens,68,-12,Zalgiris Kaunas,0
447,25,FC Barcelona,85,9,FC Barcelona,1
448,25,Partizan Mozzart Bet Belgrade,84,-4,Virtus Segafredo Bologna,0


In [63]:
final=class_df.groupby(["Team"]).agg(
    Appearances=("Ronda", "count"),
    TotalPoints=("Score", "sum"),
    Plusminus=("Plusminus", "sum"),
    Wins=("Win", "sum")
    )

In [64]:
final.sort_values(["Wins", "Plusminus", "TotalPoints"], ascending=False, inplace=True)

In [65]:
final

Team,Appearances,TotalPoints,Plusminus,Wins
Real Madrid,25,2239,243,22
FC Barcelona,25,2064,101,17
Virtus Segafredo Bologna,25,2015,-15,16
Panathinaikos AKTOR Athens,25,2028,79,15
Fenerbahce Beko Istanbul,25,2085,56,15
AS Monaco,25,2033,56,15
Olympiacos Piraeus,25,1968,77,14
Valencia Basket,25,1914,16,13
Baskonia Vitoria-Gasteiz,25,2073,-17,13
Maccabi Playtika Tel Aviv,25,2156,-38,13


In [66]:
final["Losses"]=final["Appearances"]-final["Wins"]
final["Points"]=final["Wins"]*2

In [67]:
final.reset_index(inplace=True)

In [68]:
final

Unnamed: 0,Team,Appearances,TotalPoints,Plusminus,Wins,Losses,Points
0,Real Madrid,25,2239,243,22,3,44
1,FC Barcelona,25,2064,101,17,8,34
2,Virtus Segafredo Bologna,25,2015,-15,16,9,32
3,Panathinaikos AKTOR Athens,25,2028,79,15,10,30
4,Fenerbahce Beko Istanbul,25,2085,56,15,10,30
5,AS Monaco,25,2033,56,15,10,30
6,Olympiacos Piraeus,25,1968,77,14,11,28
7,Valencia Basket,25,1914,16,13,12,26
8,Baskonia Vitoria-Gasteiz,25,2073,-17,13,12,26
9,Maccabi Playtika Tel Aviv,25,2156,-38,13,12,26


In [70]:
actual.to_csv("actual.csv", index=False)
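One thing to keep in mind for the lab: CSV stores plain text, so a column of lists such as `Arbitros` comes back as strings when the file is read again. The sketch below uses an in-memory buffer instead of a real file and a made-up row; `ast.literal_eval` is one possible way to rebuild the lists, assuming the text was produced by pandas from simple Python lists.

```python
import ast
import io

import pandas as pd

# Hypothetical one-row frame with a list column.
df = pd.DataFrame({"Local": ["Team A"], "Arbitros": [["REF 1", "REF 2", "REF 3"]]})

# Write to an in-memory buffer and read it back, as a stand-in for a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
back = pd.read_csv(buffer)

# The list is now its text form, not a list.
raw_type = type(back.loc[0, "Arbitros"])
print(raw_type)  # <class 'str'>

# Parse the text back into a Python list.
back["Arbitros"] = back["Arbitros"].map(ast.literal_eval)
print(back.loc[0, "Arbitros"])  # ['REF 1', 'REF 2', 'REF 3']
```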

# WE WILL SEE THIS IN THE LAB
## Let's compare the competitions!

![](https://eurospects.com/wp-content/uploads/2018/10/eurocupeuroleague.png)