Assign categorical Lazarin 2014 Tiers #4

bschilder · 2024-04-16T14:35:37Z

@NathanSkene suggested we should use the Lazarin 2014 Tier system. But I pointed out that we switched to a continuous severity score because it provides a quantitative way of sorting the phenotypes. Also, we don't exactly recapitulate the Lazarin criteria with the GPT annotations; it's more that we were inspired by Lazarin 2014 to generate our own, somewhat similar criteria.

One middle ground might be to create a rule-based function that attempts to approximate the Lazarin 2014 Tiers. It won't be exactly the same, but it might be useful for grouping our phenotypes into discrete severity categories.

bschilder · 2024-05-14T10:06:50Z

So after rereading Lazarin 2014, my understanding of Tiers is a bit different. Basically, each clinical characteristic is assigned a tier (1-4). The tiers are then mapped onto severity categories (Mild, Moderate, Severe, Profound) like so:

So perhaps it would make more sense to map our phenotypes onto these severity categories instead of the tiers.

bschilder · 2024-05-14T10:11:59Z

Lazarin 2014 also struggled with the same ambiguity we're facing regarding the role of available treatments:

Availability of treatment is not a measure of the severity of an untreated disease. However, it was rated as highly important (more so than any sensory deficit); thus, while it is not sensible to include it in an assessment of untreated severity, it is reasonable to consider it in conjunction with severity when considering disease inclusion criteria. Unfortunately, the survey's design makes it difficult to interpret responses to this characteristic: it is not clear whether respondents believed that the presence or absence of treatment was of importance.

One thing we do improve upon relative to Lazarin is the issue of "expressivity". We capture a rough approximation of this with the never/rarely/often/always classifications.
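
As a rough illustration of how the never/rarely/often/always classifications can be turned into something a rule can act on, here is a minimal sketch in base R. The 0-3 numeric coding and the `responses` vector are assumptions made for this example (chosen to be consistent with the `inclusion_values=c(2,3)`, "often, always", default that appears in `map_severity_class` in this thread); they are not taken from the package itself.

```r
# Hypothetical frequency labels for one metric across several phenotypes.
responses <- c("never", "often", "always", "rarely", "often")

# Ordered levels; subtracting 1 gives never=0, rarely=1, often=2, always=3
# (this numeric coding is an assumption for illustration).
freq_levels <- c("never", "rarely", "often", "always")
coded <- as.integer(factor(responses, levels = freq_levels)) - 1L

# A metric counts towards severity when it occurs often or always.
meets_threshold <- coded %in% c(2L, 3L)
print(data.frame(responses, coded, meets_threshold))
```

Keeping the four frequency levels, rather than a yes/no flag, is what lets the threshold be moved later (e.g. to "always" only) without re-annotating.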

bschilder · 2024-05-14T10:29:31Z

Mapping our metrics onto Tiers is a bit challenging since they're quite different:
(from Table 1 on Lazarin 2014)

Here's my closest approximation. Notable issues:

Our metric "congenital onset" doesn't directly map onto any of the Lazarin criteria. Something can be congenital and not necessarily cause death at an early age. That said, it's still an important feature and thus perhaps worth adding to our tier assignments.
Our metric "physical malformation" maps onto multiple Lazarin criteria: internal physical malformation (Tier 2) and dysmorphic features (Tier 3). We currently can't distinguish between these two situations, even though internal malformations are more likely to be severe since they affect organ systems. Assigning our "physical malformation" to Tier 2 only for now.

tiers_dict <- list(
  ## Tier 1
  death=1, 
  intellectual_disability=1,
  # congenital_onset=1,
  ## Tier 2
  impaired_mobility=2, 
  physical_malformations=2,
  ## Tier 3
  blindness=3,  
  sensory_impairments=3,
  immunodeficiency=3, 
  cancer=3, 
  ## Tier 4
  reduced_fertility=4
)
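
For reference, the same mapping can be inverted to list the metrics that fall under each tier. This is only a small base R sketch; `metrics_by_tier` is a name introduced here for illustration.

```r
# Metric -> tier mapping, copied from the proposal above
# (congenital_onset left out, as in the original).
tiers_dict <- list(
  death = 1, intellectual_disability = 1,
  impaired_mobility = 2, physical_malformations = 2,
  blindness = 3, sensory_impairments = 3,
  immunodeficiency = 3, cancer = 3,
  reduced_fertility = 4
)

# Invert it: one character vector of metric names per tier.
metrics_by_tier <- split(names(tiers_dict),
                         paste0("tier", unlist(tiers_dict)))

# Number of metrics per tier.
print(lengths(metrics_by_tier))
```

Note that the tiers are unevenly populated (Tier 3 has four metrics, Tier 4 only one), which matters for any count-based rule built on top of them.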

NathanSkene · 2024-05-14T10:40:47Z

Good spot, hadn't noted that flow chart before. Makes sense to use their system (maybe with ranking within e.g. profound, based on how many other tier 1 or 2 there are). I agree with this:

tiers_dict <- list(
  ## Tier 1
  death=1,
  intellectual_disability=1,
  ## Tier 2
  impaired_mobility=2,
  physical_malformations=2,
  ## Tier 3
  blindness=3,
  sensory_impairments=3,
  immunodeficiency=3,
  cancer=3,
  ## Tier 4
  reduced_fertility=4
)
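
One way to read the "ranking within e.g. profound" suggestion is to sort first by severity class and then by the number of Tier 1 and Tier 2 metrics that apply. The sketch below uses base R on made-up placeholder rows; the column names (`tier1_hits`, `tier2_hits`, `tier12_hits`) and the phenotype labels are hypothetical and exist only for this example.

```r
# Placeholder phenotypes with an already-assigned class and per-tier hit counts.
pheno <- data.frame(
  phenotype = c("A", "B", "C", "D"),
  severity_class = c("severe", "profound", "profound", "mild"),
  tier1_hits = c(1, 2, 2, 0),
  tier2_hits = c(0, 0, 2, 0),
  stringsAsFactors = FALSE
)

# Order classes from most to least severe.
class_levels <- c("profound", "severe", "moderate", "mild")
pheno$severity_class <- factor(pheno$severity_class, levels = class_levels)

# Within a class, rank by how many Tier 1 or Tier 2 metrics apply.
pheno$tier12_hits <- pheno$tier1_hits + pheno$tier2_hits
ranked <- pheno[order(pheno$severity_class, -pheno$tier12_hits), ]
print(ranked$phenotype)
```

Any other within-class score could be substituted for `tier12_hits` in the `order()` call.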

bschilder · 2024-05-14T11:27:50Z

Severity class can be Mild, Moderate, Severe, or Profound.
I've also generated a severity class score, which is just the proportion of metrics that meet our threshold of often/always. This provides a way to rank phenotypes within each severity class as well.
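
To make the score concrete, here is a small worked example in base R. Note that `map_severity_class` in this comment averages the per-tier proportions rather than pooling all metrics, and the two can differ because the tiers contain different numbers of metrics. The `coded_row` values are invented for illustration and assume a 0-3 coding where 2 and 3 mean often and always.

```r
# Metric -> tier mapping (same as tiers_dict, as a named vector).
tiers <- c(death = 1, intellectual_disability = 1,
           impaired_mobility = 2, physical_malformations = 2,
           blindness = 3, sensory_impairments = 3,
           immunodeficiency = 3, cancer = 3,
           reduced_fertility = 4)

# Hypothetical coded responses for a single phenotype.
coded_row <- c(death = 3, intellectual_disability = 0,
               impaired_mobility = 2, physical_malformations = 1,
               blindness = 0, sensory_impairments = 0,
               immunodeficiency = 0, cancer = 0,
               reduced_fertility = 0)

# Which metrics meet the often/always threshold?
hit <- coded_row %in% c(2, 3)

# Pooled proportion over all nine metrics: 2/9.
pooled <- mean(hit)

# Per-tier proportions (0.5, 0.5, 0, 0), then their mean: 0.25.
per_tier <- tapply(hit, tiers, mean)
tier_averaged <- mean(per_tier)

print(c(pooled = pooled, tier_averaged = tier_averaged))
```

Averaging per tier gives the single Tier 4 metric as much weight as all four Tier 3 metrics combined, which may or may not be desirable.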

res_coded <- HPOExplorer::gpt_annot_codify()

map_severity_class <- function(r,
                               tiers_dict = list(
                                ## Tier 1
                                death=1,
                                intellectual_disability=1,
                                # congenital_onset=1,
                                ## Tier 2
                                impaired_mobility=2,
                                physical_malformations=2,
                                ## Tier 3
                                blindness=3,
                                sensory_impairments=3,
                                immunodeficiency=3,
                                cancer=3,
                                ## Tier 4
                                reduced_fertility=4
                               ),
                               inclusion_values=c(2,3), # i.e. often, always
                               return_score=FALSE){
  ## r: a single-row data.table of coded annotations (one phenotype).
  tiers <- unique(unlist(tiers_dict))
  ## For each tier, flag which of its metrics meet the inclusion threshold.
  tier_scores <- lapply(stats::setNames(tiers,paste0("tier",tiers)),
                        function(x){
    tx <- tiers_dict[unname(unlist(tiers_dict)==x)]
    counts <- r[,sapply(.SD, function(v){v %in% inclusion_values}),
               .SDcols = names(tx)]
    list(
      counts=counts,
      proportion=sum(counts)/length(tx)
    )
  })
  ## Score: mean across tiers of the per-tier proportion of flagged metrics.
  mean_proportion <- mean(sapply(tier_scores, function(x) x$proportion))
  ## Profound: more than one Tier 1 metric.
  assigned_class <- if(sum(tier_scores$tier1$counts)>1){
    c("profound"=mean_proportion)
  ## Severe: one Tier 1 metric, or more than three Tier 2/3 metrics combined.
  } else if (sum(tier_scores$tier1$counts)>0 ||
             sum(c(tier_scores$tier2$counts,tier_scores$tier3$counts))>3){
    c("severe"=mean_proportion)
  ## Moderate: at least one Tier 3 metric.
  } else if(sum(tier_scores$tier3$counts)>0){
    c("moderate"=mean_proportion)
  ## Mild: everything else.
  } else{
    c("mild"=mean_proportion)
  }
  ## Return the named score, or just the class label.
  if(return_score){
    return(assigned_class)
  } else{
    return(names(assigned_class))
  }
}

res_coded$annot_coded[,severity_class:=map_severity_class(.SD), by=.I]
res_coded$annot_coded[,severity_class_score:=map_severity_class(.SD, return_score = TRUE), by=.I]

I checked that there's a correspondence between our severity scores and the severity classes assigned in this way, and indeed there is:

bschilder · 2024-05-14T15:10:06Z

Now described in Results and Methods under the new section "Severity classes".

Added the violin plot to the supp as well.

bschilder self-assigned this Apr 16, 2024

bschilder closed this as completed May 14, 2024

bschilder added a commit that referenced this issue May 14, 2024

Update manuscript to address issues: #1 #2 #3 #4 #5 #6 #7 #8 #9

8dbf16c

bschilder mentioned this issue May 16, 2024

Rework the phenotype severity heatmap figure #12

Closed

2 tasks
