pangoRo

COVID-19 lineage names can be confusing to navigate: a single lineage may appear under several aliases, and if you want to catch them all for further analysis, it helps to have some additional tools.

{pangoRo} is an R package for working with PANGO lineage information. The core functionality was inspired by pango_aliasor, a similar package for Python created by Cornelius Roemer.

Installation

You can install {pangoRo} from GitHub:

# Requires the {remotes} package: install.packages('remotes')
remotes::install_github('al-obrien/pangoRo')

Examples

The basic usage of {pangoRo} is to expand, collapse, and sort COVID-19 lineages. Start by creating a pangoro object, which links to the latest (or cached) PANGO reference. This object is then passed to subsequent operations as the reference.

library(pangoRo)

# Create pangoro object
my_pangoro <- pangoro()
#> Loading alias table from PANGO webiste...

Collapse

Given a vector of PANGO lineages, return each one in its fully collapsed form.

# Vector of COVID-19 lineages to collapse
cov_lin <- c('B.1.617.2', 'BL.2', 'B.1.1.529.2.75.1.2', 'BA.2.75.1.2', 'XD.1')

# Collapse lineage names as far as possible
collapse_pangoro(my_pangoro, cov_lin)
#> [1] "B.1.617.2" "BL.2"      "BL.2"      "BL.2"      "XD.1"

You can also define how far to collapse each input with max_level.

collapse_pangoro(my_pangoro, cov_lin, max_level = 1)
#> [1] "B.1.617.2"   "BL.2"        "BA.2.75.1.2" "BL.2"        "XD.1"
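
To make the collapsing logic concrete, here is a minimal base-R sketch that does not call {pangoRo}. It assumes a tiny hand-written alias table (toy_step) and a hypothetical helper (toy_collapse); neither is part of the package, and the package's actual implementation may differ. Each pass replaces one parent prefix with its alias, so max_level simply caps the number of passes.

```r
# Hypothetical one-step alias table (alias -> parent in its own shortest form).
# Hand-written subset for illustration only, not the table pangoRo downloads.
toy_step <- c(AY = 'B.1.617.2', BA = 'B.1.1.529', BL = 'BA.2.75.1')

# Collapse a single lineage name, one alias level per pass
toy_collapse <- function(lineage, max_level = Inf, step = toy_step) {
  level <- 0
  while (level < max_level) {
    # The trailing '.' ensures only true descendants of a parent match
    hit <- names(step)[startsWith(lineage, paste0(step, '.'))]
    if (length(hit) == 0) break
    parent <- step[[hit[1]]]
    lineage <- paste0(hit[1], substring(lineage, nchar(parent) + 1))
    level <- level + 1
  }
  lineage
}

toy_lin <- c('B.1.617.2', 'BL.2', 'B.1.1.529.2.75.1.2', 'BA.2.75.1.2', 'XD.1')

vapply(toy_lin, toy_collapse, character(1), USE.NAMES = FALSE)
#> [1] "B.1.617.2" "BL.2"      "BL.2"      "BL.2"      "XD.1"

vapply(toy_lin, toy_collapse, character(1), max_level = 1, USE.NAMES = FALSE)
#> [1] "B.1.617.2"   "BL.2"        "BA.2.75.1.2" "BL.2"        "XD.1"
```

Note that 'B.1.617.2' is left alone in this sketch because it is the parent itself rather than a descendant of it.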

Expand

# Vector of COVID-19 lineages to expand
cov_lin <- c('B.1.617.2', 'B.1.617.2.6', 'AY.4', 'AY.39', 'BL.2', 'BA.1', 'AY.2', 'XD.1')

# Expand lineage names as far as possible
exp_lin <- expand_pangoro(my_pangoro, cov_lin)
exp_lin
#>            B.1.617.2          B.1.617.2.6                 AY.4 
#>          "B.1.617.2"        "B.1.617.2.6"        "B.1.617.2.4" 
#>                AY.39                 BL.2                 BA.1 
#>       "B.1.617.2.39" "B.1.1.529.2.75.1.2"        "B.1.1.529.1" 
#>                 AY.2                 XD.1 
#>        "B.1.617.2.2"               "XD.1"
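
The idea behind expansion can be sketched in base R without the package. This example assumes a small hand-written alias table (toy_alias) and a hypothetical helper (toy_expand), both invented here for illustration; the real function works from the full PANGO alias table and may handle more cases.

```r
# Hypothetical alias table (alias -> fully expanded parent lineage).
# Hand-written subset for illustration only.
toy_alias <- c(AY = 'B.1.617.2', BA = 'B.1.1.529', BL = 'B.1.1.529.2.75.1')

toy_expand <- function(lineages, alias = toy_alias) {
  # Text before the first '.' is the candidate alias
  prefix <- sub('\\..*$', '', lineages)
  # Everything after the prefix, including the leading '.'
  suffix <- substring(lineages, nchar(prefix) + 1)
  known <- prefix %in% names(alias)
  out <- lineages
  # Swap each known alias for its expanded parent; leave other names as they are
  out[known] <- paste0(alias[prefix[known]], suffix[known])
  # Name the result by its input, mirroring the output shown above
  stats::setNames(out, lineages)
}

res <- toy_expand(c('AY.4', 'BL.2', 'BA.1', 'XD.1'))
unname(res)
# "B.1.617.2.4" "B.1.1.529.2.75.1.2" "B.1.1.529.1" "XD.1"
```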

Sort

Perform a pseudo-sort on the lineage names.

# Sort lineages
sort_pangoro(my_pangoro, exp_lin)
#>                 BA.1                 BL.2            B.1.617.2 
#>        "B.1.1.529.1" "B.1.1.529.2.75.1.2"          "B.1.617.2" 
#>                 AY.2                 AY.4          B.1.617.2.6 
#>        "B.1.617.2.2"        "B.1.617.2.4"        "B.1.617.2.6" 
#>                AY.39                 XD.1 
#>       "B.1.617.2.39"               "XD.1"
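
A plain sort() would place "B.1.617.2.39" before "B.1.617.2.4" because it compares character by character. One way to get the ordering shown above is to compare the dot-separated parts numerically. The sketch below does this in base R with a hypothetical helper (toy_sort); it is an assumption about how such a sort could work, not a description of what sort_pangoro() does internally.

```r
toy_sort <- function(lineages) {
  parts <- strsplit(lineages, '.', fixed = TRUE)
  # Build a sort key where numeric parts are zero-padded to a fixed width,
  # so that string ordering of the key matches numeric ordering of the parts
  key <- vapply(parts, function(p) {
    is_num <- grepl('^[0-9]+$', p)
    p[is_num] <- sprintf('%08d', as.integer(p[is_num]))
    paste(p, collapse = '.')
  }, character(1))
  # The radix method orders strings in the C locale, independent of the session locale
  lineages[order(key, method = 'radix')]
}

toy_sort(c('B.1.617.2.39', 'B.1.617.2.4', 'B.1.1.529.1', 'XD.1'))
# "B.1.1.529.1" "B.1.617.2.4" "B.1.617.2.39" "XD.1"
```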

Split the lineages by their lowest alias codes and sort within each grouping:

collapsed_full <- collapse_pangoro(my_pangoro, cov_lin, aliase_parent = TRUE)
grps <- split(collapsed_full, sapply(strsplit(collapsed_full, split = '\\.'), `[[`, 1))
lapply(grps, function(x) sort_pangoro(my_pangoro, x))
#> $AY
#> [1] "AY"    "AY.2"  "AY.4"  "AY.6"  "AY.39"
#> 
#> $BA
#> [1] "BA.1"
#> 
#> $BL
#> [1] "BL.2"
#> 
#> $XD
#> [1] "XD.1"

Detect recombinant lineages

Although initial recombinant variants are typically obvious based upon their X prefix, their children may not be (e.g. EG.1).

is_recombinant(my_pangoro,
               c('EG.1', 'EC.1', 'BA.1', 'XBB.1.9.1.1.5.1', 'B.1.529.1'))
#> [1]  TRUE FALSE FALSE  TRUE FALSE
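
The reasoning above can be sketched in base R: expand the alias to find the root of the lineage, then check whether that root starts with X. The alias table (toy_alias) and helper (toy_is_recombinant) are hypothetical and hand-written for illustration; is_recombinant() relies on the full PANGO reference and may apply additional rules.

```r
# Hypothetical alias table (alias -> parent lineage); illustration only
toy_alias <- c(EG = 'XBB.1.9.2', BA = 'B.1.1.529')

toy_is_recombinant <- function(lineages, alias = toy_alias) {
  # Text before the first '.' is either a root or an alias
  prefix <- sub('\\..*$', '', lineages)
  known <- prefix %in% names(alias)
  root <- prefix
  # For known aliases, take the root of the parent lineage instead
  root[known] <- sub('\\..*$', '', alias[prefix[known]])
  startsWith(root, 'X')
}

toy_is_recombinant(c('EG.1', 'BA.1', 'XBB.1.5'))
# TRUE FALSE TRUE
```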