# Transfermarkt Teams Data

This table contains Transfermarkt team identifiers and team metadata.  
Main purpose: standardize team IDs and support joins from player histories or league links.

We will verify:
- unique team IDs,
- naming consistency,
- country / competition fields if available.
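
The checks listed above can be sketched as one small helper. This is a minimal illustration on a toy frame, not the notebook's actual validation: the function name `check_teams` and the sample rows are assumptions, while the column names `team_name`, `tm_id`, and `country` are the ones this table exposes.

```python
import pandas as pd

def check_teams(df: pd.DataFrame) -> dict:
    """Summarise basic integrity checks for a teams table (hypothetical helper)."""
    names = df["team_name"].astype(str)
    normalised = names.str.strip().str.lower()
    return {
        # rows whose tm_id already appeared on an earlier row
        "duplicate_ids": int(df["tm_id"].duplicated().sum()),
        # names carrying leading or trailing whitespace
        "untrimmed_names": int((names != names.str.strip()).sum()),
        # distinct spellings that collapse once case and whitespace are ignored
        "case_variant_names": int(names.nunique() - normalised.nunique()),
        # rows without a country value
        "missing_country": int(df["country"].isna().sum()),
    }

# Toy data (illustrative only, not taken from tm_teams.parquet)
toy = pd.DataFrame({
    "team_name": ["Arsenal FC", "arsenal fc ", "Aston Villa"],
    "tm_id": [11.0, 11.0, 405.0],
    "country": ["England", None, "England"],
})
report = check_teams(toy)
```

On the real table, every value in the report should be zero if the IDs are unique and the names are clean.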

## 0) Imports & Setup

In [1]:
import pandas as pd
import numpy as np

pd.set_option("display.max_columns", 200)
pd.set_option("display.width", 180)

TM_PATH = "../../raw_data_agust_tm/"
TW_PATH = "../../raw_data_agust_12/"
WY_PATH = "../../raw_data_agust_wy/"

df_teams = pd.read_parquet(f"{TM_PATH}tm_teams.parquet")
df_transfers = pd.read_parquet(f"{TW_PATH}male_transfers_data.parquet")

## 1) Quick Data Snapshot

In [2]:
df_teams.shape

(5890, 3)

In [3]:
df_teams.dtypes.value_counts()

object     2
float64    1
Name: count, dtype: int64

In [4]:
df_teams.head()

| | team_name | tm_id | country |
|---|---|---|---|
| 0 | Arsenal FC | 11.0 | England |
| 1 | Aston Villa | 405.0 | England |
| 2 | AFC Bournemouth | 989.0 | England |
| 3 | Brentford FC | 1148.0 | England |
| 4 | Brighton & Hove Albion | 1237.0 | England |
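
The `tm_id` values print as floats (`11.0`, `405.0`), which matches the single `float64` column in the dtype summary. Whole-number floats still compare equal to integers in Python sets, but casting to an integer dtype makes joins and displays less error-prone. A minimal sketch, assuming every ID is a whole number; `to_int_ids` is a hypothetical helper name, not part of the notebook.

```python
import pandas as pd

def to_int_ids(ids: pd.Series) -> pd.Series:
    """Cast float-encoded IDs to pandas' nullable Int64, refusing lossy casts."""
    non_null = ids.dropna()
    # Guard: a fractional value would mean the column is not really an ID
    if not (non_null % 1 == 0).all():
        raise ValueError("ID column contains non-integer values")
    return ids.astype("Int64")

# Toy data (illustrative only); None becomes a missing value
toy_ids = pd.Series([11.0, 405.0, 989.0, None])
clean_ids = to_int_ids(toy_ids)
```

The nullable `Int64` dtype is used here rather than plain `int64` so that missing IDs, if any appear, survive the cast.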


In [5]:
df_teams.columns.tolist()

['team_name', 'tm_id', 'country']

In [6]:
df_teams.isna().mean().sort_values(ascending=False).head(15)

team_name    0.0
tm_id        0.0
country      0.0
dtype: float64

In [7]:
df_teams.duplicated().sum()

np.int64(0)

## 2) Team ID Check

In [8]:
tm_team_ids = set(df_teams["tm_id"].unique())

twelve_team_ids = set(
    pd.concat([
        df_transfers["from_team_id"],
        df_transfers["to_team_id"],
    ]).dropna().unique()
)

len(tm_team_ids), len(twelve_team_ids)

(5890, 5739)

In [9]:
n_match = len(tm_team_ids & twelve_team_ids)
coverage = n_match / len(twelve_team_ids)

n_match, coverage

(647, 0.112737410698728)
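
The overlap computation in the two cells above can be wrapped in a small function that reports matches, misses, and coverage together. This is a sketch using plain Python sets; `id_coverage` is a hypothetical name and the sample IDs are illustrative.

```python
def id_coverage(reference_ids, candidate_ids):
    """Share of candidate IDs that also appear among the reference IDs."""
    reference, candidate = set(reference_ids), set(candidate_ids)
    matched = candidate & reference
    missing = candidate - reference
    # Avoid dividing by zero when there are no candidate IDs at all
    coverage = len(matched) / len(candidate) if candidate else float("nan")
    return {
        "n_match": len(matched),
        "n_missing": len(missing),
        "coverage": coverage,
    }

# Float and int IDs with the same value hash and compare equal (11.0 == 11)
result = id_coverage([11.0, 405.0, 989.0], [11, 405, 8714, 8682])
```

In the notebook's terms, `reference_ids` would be the Transfermarkt `tm_id` values and `candidate_ids` the team IDs collected from the transfers table.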

In [10]:
missing_teams = twelve_team_ids - tm_team_ids
len(missing_teams)

5092

In [11]:
df_transfers.loc[
    df_transfers["from_team_id"].isin(list(missing_teams)) |
    df_transfers["to_team_id"].isin(list(missing_teams)),
    ["from_team_id","to_team_id","from_competition","to_competition","from_season","to_season"]
].drop_duplicates().head(10)

| | from_team_id | to_team_id | from_competition | to_competition | from_season | to_season |
|---|---|---|---|---|---|---|
| 3541 | 8714 | 10985 | 127 | 546 | 2018 | 2019 |
| 3542 | 8682 | 10989 | 127 | 546 | 2018 | 2019 |
| 3544 | 8714 | 63764 | 127 | 43114 | 2018 | 2019 |
| 3546 | 8682 | 10032 | 127 | 591 | 2018 | 2018 |
| 3547 | 8682 | 10014 | 127 | 591 | 2018 | 2019 |
| 3548 | 8677 | 10014 | 127 | 591 | 2018 | 2018 |
| 3549 | 8673 | 10014 | 127 | 591 | 2018 | 2018 |
| 3550 | 8688 | 14035 | 127 | 723 | 2018 | 2018 |
| 3551 | 8714 | 30817 | 127 | 720 | 2018 | 2019 |
| 3552 | 8714 | 15267 | 127 | 607 | 2018 | 2019 |


## Transfermarkt Teams ↔ Twelve Football Teams Coverage

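- Only 647 of the 5,739 team IDs in the Twelve Football transfers data (about 11.3%) also appear as a Transfermarkt `tm_id`.
- The remaining 5,092 Twelve team IDs have no corresponding Transfermarkt identifier.
- This suggests that Twelve Football and Transfermarkt use different ID namespaces. The 647 overlapping values may be coincidental numeric collisions rather than the same clubs, so the two tables should not be joined on these IDs without a separate mapping.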
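
One way to test whether the overlapping IDs actually refer to the same clubs is to compare team names for the IDs that match. The sketch below assumes the Twelve side offers a lookup of team IDs to names, which this notebook does not show: the frame `other`, its columns `team_id` and `other_name`, and the function `name_agreement` are all hypothetical.

```python
import pandas as pd

def name_agreement(tm: pd.DataFrame, other: pd.DataFrame) -> float:
    """Fraction of ID matches whose names also agree (case/whitespace-insensitive)."""
    merged = tm.merge(other, left_on="tm_id", right_on="team_id", how="inner")
    if merged.empty:
        return float("nan")
    tm_names = merged["team_name"].str.strip().str.lower()
    other_names = merged["other_name"].str.strip().str.lower()
    return float((tm_names == other_names).mean())

# Toy data (illustrative only): two IDs overlap, one of them with the same name
tm_toy = pd.DataFrame({
    "tm_id": [11.0, 405.0, 989.0],
    "team_name": ["Arsenal FC", "Aston Villa", "AFC Bournemouth"],
})
other_toy = pd.DataFrame({
    "team_id": [11, 405, 8714],              # hypothetical Twelve IDs
    "other_name": ["Arsenal FC", "Some Other Club", "Another Club"],
})
agreement = name_agreement(tm_toy, other_toy)
```

A low agreement rate on real data would support the different-namespace reading, while a high rate would suggest that part of the ID space is shared.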