MaxenceJacquemin/WTA-Protected-Ranking-Analysis

Protected Ranking in Women’s Professional Tennis: Epidemiological Trends and Career Sustainability

This repository contains the R scripts and dataset required to reproduce the analyses presented in our manuscript evaluating the Women's Tennis Association (WTA) Protected Ranking (PR) mechanism.

Overview

The Protected Ranking mechanism is a regulatory tool facilitating the reintegration of elite athletes following prolonged absences, such as injury or maternity leave. This project analyzes 21,779 Grand Slam matches (2000–2024) to evaluate the epidemiological incidence of PR usage and assess its competitive validity using Bradley-Terry models and natural spline-based logistic regression.

Repository Contents

  • Data/ : Contains the reconstructed and cross-validated dataset of Grand Slam matches (Protected Ranking Entries.xlsx, datos_combinados.rds and datos_combinados.RData). Protected Ranking Entries.xlsx is used to fill in missing PR entry data. datos_combinados.rds is the database on which the analysis is run.
  • Study/ : Contains the R code used for the whole analysis:
    • WTA_Analysis.Rmd : Contains all the code used for the analysis, along with some commentary on it. The commentary was drafted while preparing the final paper and may contain mistakes. The main steps in the code are:
      • Reconstruction of the database due to missing data.
      • Descriptive statistics and epidemiological calculations (Incidence Proportion, CIR, RD).
      • Bradley-Terry modeling for entry category comparisons.
      • Linear and non-linear (cubic spline) logistic regression models.
      • Bootstrapped stepwise selection for predictors of PR victories.
      • Analysis on withdrawal of female athletes.
    • WTA_Analysis.html : The HTML version of WTA_Analysis.Rmd.
    • matrix_sel.RData : Pre-calculated bootstrap selection matrix to reduce compilation time.
    • libraries.R: All the libraries used.
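The epidemiological measures listed above can be illustrated with a minimal base-R sketch. It assumes that CIR stands for cumulative incidence ratio and RD for risk difference, which the README does not spell out. The function name `epi_measures` and the counts are hypothetical and are not taken from the study or from WTA_Analysis.Rmd.

```r
# Minimal sketch: incidence proportion, cumulative incidence ratio (CIR)
# and risk difference (RD) for two groups or periods.
# Assumption: CIR = ratio of incidence proportions, RD = their difference.
epi_measures <- function(events_a, n_a, events_b, n_b) {
  ip_a <- events_a / n_a  # incidence proportion in group A
  ip_b <- events_b / n_b  # incidence proportion in group B (reference)
  list(
    ip_a = ip_a,
    ip_b = ip_b,
    cir  = ip_a / ip_b,   # relative measure
    rd   = ip_a - ip_b    # absolute measure
  )
}

# Illustrative counts only; these are NOT figures from the study.
res <- epi_measures(events_a = 30, n_a = 1000, events_b = 10, n_b = 1000)
print(res)
```

The ratio and the difference answer different questions: the first is a relative measure, the second an absolute one, so both are usually worth reporting together.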

Prerequisites

The analysis was conducted in R (version 4.4.2). To run the scripts, ensure you have the following packages installed: binom, BradleyTerry2, broom, car, caret, compareGroups, corrplot, data.table, DHARMa, dplyr, effects, emmeans, fmsb, forcats, ggplot2, knitr, lme4, MASS, mgcv, mgcViz, party, pheatmap, readxl, reshape2, splines, stringr, tidyr.
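One way to check which of these packages are missing before running the analysis is sketched below. The helper `missing_packages` is a hypothetical name and is not part of the repository. The installation line is left commented out and assumes the missing packages are available from CRAN (splines ships with R itself).

```r
# Packages listed in the Prerequisites section.
required <- c(
  "binom", "BradleyTerry2", "broom", "car", "caret", "compareGroups",
  "corrplot", "data.table", "DHARMa", "dplyr", "effects", "emmeans",
  "fmsb", "forcats", "ggplot2", "knitr", "lme4", "MASS", "mgcv",
  "mgcViz", "party", "pheatmap", "readxl", "reshape2", "splines",
  "stringr", "tidyr"
)

# Hypothetical helper: return the packages that are not installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(utils::installed.packages()))
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install (assumes CRAN)
} else {
  message("All required packages are installed.")
}
```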

Usage

To reproduce the analysis, clone this repository and open Study/WTA_Analysis.Rmd in RStudio. Ensure that your working directory is set to the root of the cloned repository so that relative file paths (e.g., loading the Excel file) resolve correctly.
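As a quick check that the working directory is set correctly, the sketch below tries to load the analysis database from the repository root. The path is assembled from the folder and file names given under Repository Contents. The object name `matches` is hypothetical, and the way WTA_Analysis.Rmd itself loads the data may differ.

```r
# Path relative to the repository root (assumed layout: Data/datos_combinados.rds).
rds_path <- file.path("Data", "datos_combinados.rds")

if (file.exists(rds_path)) {
  matches <- readRDS(rds_path)    # hypothetical object name
  str(matches, max.level = 1)     # brief overview of the loaded object
} else {
  message("File not found: ", rds_path,
          " (current working directory: ", getwd(), ")")
}
```

If the file is not found, the message reports the current working directory, which usually shows whether the session was started outside the repository root.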

Citation

(Note: Citation details will be updated upon publication).

About

Code and analytical dataset for the epidemiological and multivariable statistical analysis of the Protected Ranking (PR) mechanism in WTA Grand Slam tournaments (2000–2024).
