# Transfermarkt League Data

This table maps Transfermarkt leagues/competitions (name and country) to their competition page links and identifiers.  
Main purpose: support competition-level joins and resolve league identity across seasons and pages.

We will verify:
- uniqueness of league IDs / URLs,
- country and season fields (if present),
- whether links are stable across seasons.

## 0) Imports & Setup

In [1]:
import pandas as pd
import numpy as np

pd.set_option("display.max_columns", 200)
pd.set_option("display.width", 180)

TM_PATH = "../../raw_data_agust_tm/"
TW_PATH = "../../raw_data_agust_12/"
WY_PATH = "../../raw_data_agust_wy/"


df_leagues = pd.read_parquet(f"{TM_PATH}tm_league_links.parquet")
df_comp = pd.read_parquet(f"{WY_PATH}competitions_wyscout.parquet")

## 1) Quick Data Snapshot

In [2]:
df_leagues.shape

(399, 3)

In [3]:
df_leagues.dtypes.value_counts()

object    3
Name: count, dtype: int64

In [4]:
df_leagues.head()

| | league_name | country | tm_link |
|---|---|---|---|
| 0 | Premier League | England | /premier-league/startseite/wettbewerb/GB1 |
| 1 | LaLiga | Spain | /laliga/startseite/wettbewerb/ES1 |
| 2 | Serie A | Italy | /serie-a/startseite/wettbewerb/IT1 |
| 3 | Bundesliga | Germany | /bundesliga/startseite/wettbewerb/L1 |
| 4 | Ligue 1 | France | /ligue-1/startseite/wettbewerb/FR1 |


In [5]:
df_leagues.columns.tolist()

['league_name', 'country', 'tm_link']

In [6]:
df_leagues.isna().mean().sort_values(ascending=False).head(15)

league_name    0.0
country        0.0
tm_link        0.0
dtype: float64

In [8]:
df_leagues.duplicated().sum()

np.int64(0)
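
`duplicated()` compares entire rows, so a zero here does not by itself show that `tm_link` or the `(league_name, country)` pair is unique. The following is a minimal sketch of key-level checks on a few rows shaped like `df_leagues`. The `tm_code` column, and the assumption that the competition code is the last path segment after `/wettbewerb/`, are illustrative and not confirmed by the source.

```python
import pandas as pd

# Toy rows in the shape of df_leagues (values taken from the outputs in this notebook).
sample = pd.DataFrame(
    {
        "league_name": ["Premier League", "Bundesliga", "Bundesliga"],
        "country": ["England", "Germany", "Austria"],
        "tm_link": [
            "/premier-league/startseite/wettbewerb/GB1",
            "/bundesliga/startseite/wettbewerb/L1",
            "/bundesliga/startseite/wettbewerb/A1",
        ],
    }
)

# Assumption: the competition code is the final path segment after "/wettbewerb/".
sample["tm_code"] = sample["tm_link"].str.extract(
    r"/wettbewerb/([^/]+)/?$", expand=False
)

# Uniqueness of each candidate key, checked separately from whole-row duplicates.
key_checks = {
    "tm_link": sample["tm_link"].is_unique,
    "tm_code": sample["tm_code"].is_unique,
    "league_name": sample["league_name"].is_unique,
    "league_name+country": not sample.duplicated(["league_name", "country"]).any(),
}
print(key_checks)
```

In this toy sample `league_name` alone is not unique (two leagues are called Bundesliga), which is why the join further down uses name and country together.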

In [9]:
league_overlap = (
    df_comp[["competition_id", "name", "country"]]
    .drop_duplicates()
    .merge(
        df_leagues[["league_name", "country", "tm_link"]],
        left_on=["name", "country"],
        right_on=["league_name", "country"],
        how="inner"
    )
)

league_overlap.shape
league_overlap.head(10)

| | competition_id | name | country | league_name | tm_link |
|---|---|---|---|---|---|
| 0 | 127 | Abissnet Superiore | Albania | Abissnet Superiore | /abissnet-superiore/startseite/wettbewerb/ALB1 |
| 1 | 137 | Girabola | Angola | Girabola | /girabola/startseite/wettbewerb/AN1L |
| 2 | 143 | Primera Nacional | Argentina | Primera Nacional | /primera-nacional/startseite/wettbewerb/ARG2 |
| 3 | 166 | 2. Liga | Austria | 2. Liga | /2-liga/startseite/wettbewerb/A2 |
| 4 | 168 | Bundesliga | Austria | Bundesliga | /bundesliga/startseite/wettbewerb/A1 |
| 5 | 177 | Premyer Liqa | Azerbaijan | Premyer Liqa | /premyer-liqa/startseite/wettbewerb/AZ1 |
| 6 | 202 | Challenger Pro League | Belgium | Challenger Pro League | /challenger-pro-league/startseite/wettbewerb/BE2 |
| 7 | 213 | Premijer Liga | Bosnia-Herzegovina | Premijer Liga | /premijer-liga/startseite/wettbewerb/BOS1 |
| 8 | 271 | Cambodian Premier League | Cambodia | Cambodian Premier League | /cambodian-premier-league/startseite/wettbewer... |
| 9 | 283 | Primera B | Chile | Primera B | /primera-b/startseite/wettbewerb/CL2B |


In [10]:
league_overlap["competition_id"].nunique(), df_comp["competition_id"].nunique()

(53, 269)
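
An inner join only shows what matched. To see which competitions did not match, one option is a left join with `indicator=True`, sketched below on toy frames. The competition IDs and the "La Liga" spelling on the Wyscout side are invented for illustration and are not taken from the data.

```python
import pandas as pd

# Toy stand-in for df_comp; IDs and the "La Liga" spelling are hypothetical.
comp = pd.DataFrame(
    {
        "competition_id": [1, 2, 3],
        "name": ["Bundesliga", "La Liga", "Serie A"],
        "country": ["Austria", "Spain", "Italy"],
    }
)

# Toy stand-in for df_leagues, using links that appear in this notebook.
leagues = pd.DataFrame(
    {
        "league_name": ["Bundesliga", "LaLiga", "Serie A"],
        "country": ["Austria", "Spain", "Italy"],
        "tm_link": [
            "/bundesliga/startseite/wettbewerb/A1",
            "/laliga/startseite/wettbewerb/ES1",
            "/serie-a/startseite/wettbewerb/IT1",
        ],
    }
)

# Left join keeps every competition; "_merge" records whether a league was found.
merged = comp.merge(
    leagues,
    left_on=["name", "country"],
    right_on=["league_name", "country"],
    how="left",
    indicator=True,
)

matched = merged["_merge"] == "both"
match_rate = matched.mean()
unmatched = merged.loc[~matched, ["competition_id", "name", "country"]]
print(f"match rate: {match_rate:.1%}")
print(unmatched)
```

Listing the unmatched rows makes it easier to tell naming differences (as with the invented "La Liga" row) apart from competitions that have no Transfermarkt counterpart at all.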

## Transfermarkt League Links – Final Assessment

- Only 53 out of 269 Wyscout competitions (~20%) match a Transfermarkt league by exact name and country.
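- The matched set is not limited to top-tier European leagues: the first ten rows alone include second-tier competitions (Primera Nacional, 2. Liga, Challenger Pro League, Primera B) and leagues in Angola, Argentina, Cambodia, and Chile.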
- Differences in naming conventions, league structure, and regional competitions prevent reliable automated matching.
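
Some of the naming differences may be purely cosmetic (spacing, case, accents). As a rough sketch, names could be normalized before joining. The helper `normalize_name`, the `name_key` column, and the toy rows are all hypothetical, and this approach does not address renamed or sponsored leagues or structural differences between the two sources.

```python
import unicodedata

import pandas as pd


def normalize_name(name: str) -> str:
    """Strip accents, casefold, and keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(ch for ch in no_accents.casefold() if ch.isalnum())


# Toy frames; the spellings are invented to illustrate cosmetic differences.
comp = pd.DataFrame(
    {
        "competition_id": [1, 2],
        "name": ["La Liga", "Süper Lig"],
        "country": ["Spain", "Turkey"],
    }
)
leagues = pd.DataFrame(
    {
        "league_name": ["LaLiga", "Super Lig"],
        "country": ["Spain", "Turkey"],
    }
)

# Join on the normalized key plus country instead of the raw name.
comp["name_key"] = comp["name"].map(normalize_name)
leagues["name_key"] = leagues["league_name"].map(normalize_name)

fuzzy_overlap = comp.merge(leagues, on=["name_key", "country"], how="inner")
print(fuzzy_overlap[["competition_id", "name", "league_name", "country"]])
```

Aggressive normalization can also create false matches, so any pairs recovered this way would still need manual review before being used as join keys.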