# Data Enrichment and Derived Tables

The basic dataframes provide a good start for analysis, but there are a few gaps in the data.

For example, the withdrawals data does not directly say which vehicle class a withdrawal belongs to: the class appears only as an opaque identifier in the `team.clazz` column.

To find the class, we need to merge in data from the `clazz` dataset.

## Enriching the Withdrawal Data

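Let's enrich the base withdrawal dataframe by adding a column that identifies the vehicle class. To do this, we can match the `team.clazz` identifier in the withdrawn teams dataframe against the `_id` column of the `clazz` dataset.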

In [1]:
from dakar_rallydj.getter import get_withdrawals

withdrawals_df, withdrawn_competitors_df, withdrawn_teams_df = get_withdrawals()

withdrawn_teams_df.head()

Unnamed: 0,team.bib,team.brand,team.model,team.vehicle,team.vehicleImg,team.clazz,team.w2rc
9,202,MINI,JCW RALLY 3.0I,X-RAID MINI JCW TEAM,https://img.aso.fr/core_app/img-motorSports-da...,96c0869600e0013dbf5f86f60e5c4da4,False
10,206,TOYOTA,HILUX IMT EVO,TOYOTA GAZOO RACING,https://img.aso.fr/core_app/img-motorSports-da...,96c0869600e0013dbf5f86f60e5c4da4,False
17,208,TOYOTA,HILUX,GURTAM TOYOTA GAZOO RACING BALTICS,https://img.aso.fr/core_app/img-motorSports-da...,96c0869600e0013dbf5f86f60e5c4da4,False
28,213,MD,OPTIMUS,MD RALLYE SPORT,https://img.aso.fr/core_app/img-motorSports-da...,f00d7ec8d2d96e9cf11aa515109376cf,False
0,219,DACIA,SANDRIDER,THE DACIA SANDRIDERS,https://img.aso.fr/core_app/img-motorSports-da...,96c0869600e0013dbf5f86f60e5c4da4,True


In [2]:
from dakar_rallydj.getter import get_clazz

clazz_df = get_clazz()
clazz_df.head()

Unnamed: 0,label,position,reference,refueling,promotionalDisplay,shortLabel,_bind,_id,_parent,$group,color,tinyLabel,ar,en,es,fr,categoryClazz
10,+,4,2025-A-T1-+,0,True,cat.name.A_T1_+,allClazz-2025-A,96c0869600e0013dbf5f86f60e5c4da4,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,,,T1+: Prototype Cross-Country Cars 4x4,T1+: Prototype Cross-Country Cars 4x4,T1+: Prototype Cross-Country Cars 4x4,T1+ : Voitures Tout-terrain Prototypes 4x4,2025-A-T1
13,1,0,2025-A-T1-1,0,True,cat.name.A_T1_1,allClazz-2025-A,f666973e89db183ecfefc75c3af8ffb1,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,,,T1.1 Prototype Cross-Country Cars 4x4,T1.1 Prototype Cross-Country Cars 4x4,T1.1 Prototype Cross-Country Cars 4x4,T1.1 : Voitures Tout-terrain Prototypes 4x4,2025-A-T1
11,2,1,2025-A-T1-2,0,True,cat.name.A_T1_2,allClazz-2025-A,f00d7ec8d2d96e9cf11aa515109376cf,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,,,T1.2 Prototype Cross-Country Cars 4x2,T1.2 Prototype Cross-Country Cars 4x2,T1.2 Prototype Cross-Country Cars 4x2,T1.2 : Voitures Tout-terrain Prototypes 4x2,2025-A-T1
12,3,2,2025-A-T1-3,0,True,cat.name.A_T1_3,allClazz-2025-A,f071b5dbfd586a4ba46100196a98a9c4,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,,,T1.3 FIA: النتيجة,T1.3: SCORE,T1.3 : SCORE,T1.3 : SCORE,2025-A-T1
9,U,3,2025-A-T1-U,0,True,cat.name.A_T1_U,allClazz-2025-A,1501ebcbaf3ad27e72aecfba7faa8037,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,categoryGroup-2025-A:b49155b3f5670d2a907aa01e3...,,,"T1.U: ""Ultimate"" Prototype Cross-Country Cars","T1.U: ""Ultimate"" Prototype Cross-Country Cars","T1.U: ""Ultimate"" Prototype Cross-Country Cars","T1.U : Voitures Tout-Terrain Prototypes ""Ultim...",2025-A-T1


In [4]:
import pandas as pd

clazz_map = (
    pd.merge(
        withdrawn_teams_df[["team.bib", "team.clazz"]],
        clazz_df[["_id", "reference", "categoryClazz", "en"]],
        left_on="team.clazz",
        right_on="_id",
    )
    .drop(columns=["team.clazz", "_id"])
    .rename(columns={"en": "clazz_label"})
)

clazz_map.head()

Unnamed: 0,team.bib,reference,categoryClazz,clazz_label
0,202,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4
1,206,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4
2,208,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4
3,213,2025-A-T1-2,2025-A-T1,T1.2 Prototype Cross-Country Cars 4x2
4,219,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4
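
Note that `pd.merge` performs an inner join by default, so a team whose `team.clazz` value has no matching `_id` would be silently dropped from the result. The following is a minimal sketch of one way to check for that, using small hand-made frames that mimic the columns shown above. The identifiers `id-a`, `id-b` and `unknown-id`, the bib `999`, and the names `teams`, `classes`, `checked` and `unmatched` are illustrative assumptions, not values from the dataset.

```python
import pandas as pd

# Toy stand-ins for withdrawn_teams_df and clazz_df (illustrative values only)
teams = pd.DataFrame({
    "team.bib": [202, 213, 999],
    "team.clazz": ["id-a", "id-b", "unknown-id"],
})
classes = pd.DataFrame({
    "_id": ["id-a", "id-b"],
    "reference": ["2025-A-T1-+", "2025-A-T1-2"],
})

# A left join keeps every team row; indicator=True adds a "_merge" column,
# and validate="many_to_one" raises if "_id" is not unique in classes
checked = pd.merge(
    teams,
    classes,
    left_on="team.clazz",
    right_on="_id",
    how="left",
    indicator=True,
    validate="many_to_one",
)

# Bib numbers of teams with no matching class record
unmatched = checked.loc[checked["_merge"] == "left_only", "team.bib"].tolist()
print(unmatched)
```

If the list of unmatched bibs is empty for the real data, the inner join used above loses nothing.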


In [4]:
clazz_map['reference'].unique()

array(['2025-A-T1-+', '2025-A-T1-2', '2025-A-T1-1', '2025-A-T3-1',
       '2025-A-T4-SSV1', '2025-A-T4-T4', '2025-A-T5-1', '2025-A-T5-2'],
      dtype=object)
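
Beyond listing the distinct class references, it can be useful to see how many rows fall into each class. A minimal sketch, using a toy frame with the same `team.bib` and `reference` columns as `clazz_map` (the rows and the names `toy_map` and `counts` are illustrative assumptions, not values from the dataset):

```python
import pandas as pd

# Toy stand-in for clazz_map (illustrative rows only)
toy_map = pd.DataFrame({
    "team.bib": [202, 206, 213, 301],
    "reference": ["2025-A-T1-+", "2025-A-T1-+", "2025-A-T1-2", "2025-A-T3-1"],
})

# Number of rows per class reference, most frequent first
counts = toy_map["reference"].value_counts()
print(counts)
```

On the real data, the same call on `clazz_map["reference"]` should give one count per class, assuming each withdrawn team appears once in the map.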

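We can also pull in further information from the `groups` data, which describes the parent group that each class belongs to (the `categoryClazz` value of a class matches the `reference` value of its group).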

In [6]:
from dakar_rallydj.getter import get_groups

groups_df = get_groups()
groups_df.head()

Unnamed: 0,promotionalDisplay,shortLabel,reference,tinyLabel,position,label,_bind,_origin,_id,_parent,color,ar,en,es,fr
8,True,cat.name.A_T1,2025-A-T1,ULT,0,T1,allGroups-2025,categoryGroup-2025-A,b49155b3f5670d2a907aa01e319876b8,category-2025:63b4f5da4591200d0a4cc239245eb03a,#EBBC4E,Ultimate,Ultimate,Ultimate,Ultimate
7,True,cat.name.A_T2,2025-A-T2,STK,1,T2,allGroups-2025,categoryGroup-2025-A,4dac064bf100bc806b91e7f2e7758297,category-2025:63b4f5da4591200d0a4cc239245eb03a,#C7C9C7,Stock,Stock,Stock,Stock
5,True,cat.name.A_T3,2025-A-T3,CHG,4,T3,allGroups-2025,categoryGroup-2025-A,15f329900afa29e3e6b099ae681ebe12,category-2025:63b4f5da4591200d0a4cc239245eb03a,#E04E39,Challenger,Challenger,Challenger,Challenger
6,True,cat.name.A_T4,2025-A-T4,SSV,5,T4,allGroups-2025,categoryGroup-2025-A,423ea731fdcba5cda62c8334985889b0,category-2025:63b4f5da4591200d0a4cc239245eb03a,#A7C6ED,SSV,SSV,SSV,SSV
9,True,cat.name.A_T5,2025-A-T5,TRK,6,T5,allGroups-2025,categoryGroup-2025-A,f1a437ac1135c9d9a5e33f5096f95259,category-2025:63b4f5da4591200d0a4cc239245eb03a,#2D2926,شاحنة,Truck,Camión,Camion


For convenience, let's merge the group's short label, label, colour and English name into the class map, joining on `categoryClazz`:

In [9]:
clazz_map = pd.merge(
    clazz_map,
    groups_df[["reference", "tinyLabel", "label", "color", "en"]].rename(
        columns={"reference": "categoryClazz"}
    ),
    on="categoryClazz",
).rename(columns={"en": "group_label"})

clazz_map.head()

Unnamed: 0,team.bib,reference,categoryClazz,clazz_label,tinyLabel,label,color,group_label
0,202,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4,ULT,T1,#EBBC4E,Ultimate
1,206,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4,ULT,T1,#EBBC4E,Ultimate
2,208,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4,ULT,T1,#EBBC4E,Ultimate
3,213,2025-A-T1-2,2025-A-T1,T1.2 Prototype Cross-Country Cars 4x2,ULT,T1,#EBBC4E,Ultimate
4,219,2025-A-T1-+,2025-A-T1,T1+: Prototype Cross-Country Cars 4x4,ULT,T1,#EBBC4E,Ultimate
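
With the class map keyed on `team.bib`, the labels can be attached back onto any table that carries the bib number. The following is a minimal sketch under that assumption, using toy frames whose columns mirror `withdrawn_teams_df` and `clazz_map`. The names `toy_teams`, `toy_map` and `enriched` are illustrative, and only a subset of the columns is shown.

```python
import pandas as pd

# Toy stand-ins for withdrawn_teams_df and clazz_map (subset of columns)
toy_teams = pd.DataFrame({
    "team.bib": [202, 213],
    "team.brand": ["MINI", "MD"],
})
toy_map = pd.DataFrame({
    "team.bib": [202, 213],
    "clazz_label": [
        "T1+: Prototype Cross-Country Cars 4x4",
        "T1.2 Prototype Cross-Country Cars 4x2",
    ],
    "group_label": ["Ultimate", "Ultimate"],
})

# Left join on the bib number so that every team row is kept,
# even if a bib were missing from the class map
enriched = pd.merge(toy_teams, toy_map, on="team.bib", how="left")
print(enriched.columns.tolist())
```

A left join is used here so that a team missing from the map would show up with empty labels rather than disappear.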
