Nested Data-Frame Flattening with better name repair #284

@pbulsink

There are lots of examples (nearly every function calling Jolpica data) that have a tidyr::unnest() call. These sort out things like the Fastest Lap grouped data in race results, which is normally stored as a data.frame nested inside a data.frame:

'data.frame':	20 obs. of  11 variables:
 $ number      : chr  "81" "4" "63" "1" ...
 $ position    : chr  "1" "2" "3" "4" ...
 $ positionText: chr  "1" "2" "3" "4" ...
 $ points      : chr  "25" "18" "15" "12" ...
 $ Driver      :'data.frame':	20 obs. of  8 variables:
  ..$ driverId       : chr  "piastri" "norris" "russell" "max_verstappen" ...
  ..$ permanentNumber: chr  "81" "4" "63" "33" ...
  ..$ code           : chr  "PIA" "NOR" "RUS" "VER" ...
  ..$ url            : chr  "http://en.wikipedia.org/wiki/Oscar_Piastri" "http://en.wikipedia.org/wiki/Lando_Norris" "http://en.wikipedia.org/wiki/George_Russell_(racing_driver)" "http://en.wikipedia.org/wiki/Max_Verstappen" ...
  ..$ givenName      : chr  "Oscar" "Lando" "George" "Max" ...
  ..$ familyName     : chr  "Piastri" "Norris" "Russell" "Verstappen" ...
  ..$ dateOfBirth    : chr  "2001-04-06" "1999-11-13" "1998-02-15" "1997-09-30" ...
  ..$ nationality    : chr  "Australian" "British" "British" "Dutch" ...
 $ Constructor :'data.frame':	20 obs. of  4 variables:
  ..$ constructorId: chr  "mclaren" "mclaren" "mercedes" "red_bull" ...
  ..$ url          : chr  "http://en.wikipedia.org/wiki/McLaren" "http://en.wikipedia.org/wiki/McLaren" "http://en.wikipedia.org/wiki/Mercedes-Benz_in_Formula_One" "http://en.wikipedia.org/wiki/Red_Bull_Racing" ...
  ..$ name         : chr  "McLaren" "McLaren" "Mercedes" "Red Bull" ...
  ..$ nationality  : chr  "British" "British" "German" "Austrian" ...
 $ grid        : chr  "1" "3" "2" "4" ...
 $ laps        : chr  "56" "56" "56" "56" ...
 $ status      : chr  "Finished" "Finished" "Finished" "Finished" ...
 $ Time        :'data.frame':	20 obs. of  2 variables:
  ..$ millis: chr  "5455026" "5464774" "5466123" "5471682" ...
  ..$ time  : chr  "1:30:55.026" "+9.748" "+11.097" "+16.656" ...
 $ FastestLap  :'data.frame':	20 obs. of  3 variables:
  ..$ rank: chr  "3" "1" "5" "2" ...
  ..$ lap : chr  "53" "53" "55" "56" ...
  ..$ Time:'data.frame':	20 obs. of  1 variable:
  .. ..$ time: chr  "1:35.520" "1:35.454" "1:35.816" "1:35.488" ...

Using tidyr::unnest() flattens these data.frames, but some of the resulting columns end up with names like time...21. If the data structure changes anywhere else (for example, an additional column is added), the name becomes time...22.
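
To illustrate why the suffix moves, here is a minimal sketch. It assumes the time...N names come from vctrs-style "unique" name repair (tidyr's names_repair argument is documented in terms of vctrs::vec_as_names()); the column vectors are shortened, made-up stand-ins rather than the real Jolpica layout.

```r
# Assumption: duplicated names are repaired by position, as in
# vctrs::vec_as_names(repair = "unique"). Column sets are illustrative only.
library(vctrs)

before <- c("number", "time", "rank", "time")
after  <- c("number", "position", "time", "rank", "time")  # one column added

repaired_before <- vec_as_names(before, repair = "unique", quiet = TRUE)
repaired_after  <- vec_as_names(after,  repair = "unique", quiet = TRUE)

print(repaired_before)  # "number" "time...2" "rank" "time...4"
print(repaired_after)   # "number" "position" "time...3" "rank" "time...5"
```

The fastest-lap column goes from time...4 to time...5 even though nothing about that column itself changed.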

In the code, we rename by match (e.g. select("driver", "grid", "laps", ..., "fastest_lap" = "time...21")), but this breaks whenever the positional suffix shifts (see #272, #281, #277 and more).

I intend (at some point) to refactor out the tidyr::unnest() calls and replace them with a manual flattening function whose output names remain consistent (i.e. allow us to call "fastest" = "Fastest.time").
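
A rough sketch of what such a function could look like, in base R only. The name flatten_df and the "." separator are hypothetical, and the names it produces are full paths (e.g. FastestLap.Time.time) rather than the shorter Fastest.time used as an example above; the final naming scheme is still to be decided.

```r
# Hypothetical helper, not existing package code: recursively flatten
# data.frame columns, prefixing each child name with its parent's name.
flatten_df <- function(df, sep = ".") {
  cols <- list()
  for (nm in names(df)) {
    col <- df[[nm]]
    if (is.data.frame(col)) {
      child <- flatten_df(col, sep = sep)
      names(child) <- paste(nm, names(child), sep = sep)
      cols <- c(cols, as.list(child))
    } else {
      cols[[nm]] <- col
    }
  }
  as.data.frame(cols, stringsAsFactors = FALSE, check.names = FALSE)
}

# Small stand-in for the race results structure shown above.
results <- data.frame(number = c("81", "4"), stringsAsFactors = FALSE)
results$Time <- data.frame(millis = c("5455026", "5464774"),
                           time = c("1:30:55.026", "+9.748"),
                           stringsAsFactors = FALSE)
fastest <- data.frame(rank = c("3", "1"), lap = c("53", "53"),
                      stringsAsFactors = FALSE)
fastest$Time <- data.frame(time = c("1:35.520", "1:35.454"),
                           stringsAsFactors = FALSE)
results$FastestLap <- fastest

flat <- flatten_df(results)
print(names(flat))

# Adding an unrelated column does not change the existing names.
results$grid <- c("1", "3")
flat2 <- flatten_df(results)
```

Because each name is built from the column's path instead of its position, a selection such as "fastest_lap" = "FastestLap.Time.time" would keep working when other columns are added.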
