This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Updated analysis: What is the right disease label/grouping to use for interaction plots? #917

Closed
jaclyn-taroni opened this issue Jan 20, 2021 · 6 comments
Labels: question, updated analysis

Comments

@jaclyn-taroni
Member

What analysis module should be updated and why?

In #915, the interaction plots module is being updated to use broad_histology because integrated_diagnosis is less complete in v18 and harmonized_diagnosis has many different values. I don't think this is quite right.

The question is: What is the right disease label/grouping to use for the interaction plots module for the co-occurrence information to be useful? I suspect that the "right" grouping might come from dropping harmonized_diagnosis values with small sample sizes and combining others. If that's the case, we may need to rethink how that module is structured, e.g., by adding a notebook that explains how labels are combined, unless combining labels is a more general step that other modules would also use.
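For illustration, the "combine some labels, then drop small groups" idea could be sketched as follows. This is only a minimal sketch: the label values, the `combine_map` mapping, and the `min_n` threshold are all hypothetical and are not taken from the OpenPBTA histology file or the interaction plots module.

```python
from collections import Counter

def group_labels(diagnoses, combine_map, min_n):
    """Combine labels via combine_map, then mask groups smaller than min_n."""
    # Labels that are absent from the map keep their original value.
    combined = [combine_map.get(d, d) for d in diagnoses]
    counts = Counter(combined)
    # None marks samples whose (combined) group is too small to use.
    return [label if counts[label] >= min_n else None for label in combined]

# Hypothetical per-sample labels, for illustration only.
diagnoses = (
    ["glioma_type_a"] * 3
    + ["glioma_type_b"] * 2
    + ["rare_tumor"] * 1
    + ["other_tumor"] * 4
)
# Hypothetical mapping that merges two narrow labels into one group.
combine_map = {"glioma_type_a": "glioma", "glioma_type_b": "glioma"}

grouped = group_labels(diagnoses, combine_map, min_n=3)
group_sizes = Counter(g for g in grouped if g is not None)
print(group_sizes)
```

Combining before thresholding matters here: `glioma_type_b` alone would fall below the cutoff, but it survives once merged into `glioma`.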

What input data should be used? Which data were used in the version being updated?

SNV data and broad_histology labels from v18 are being used in #915.

@jaclyn-taroni added the question and updated analysis labels on Jan 20, 2021
@jharenza
Collaborator

jharenza commented Jul 16, 2021

Regarding this question, @jaclyn-taroni: we have since created some logic to make a cancer_group, which is narrower than broad_histology but broader than integrated_diagnosis or harmonized_diagnosis, so this could be good to use. We can add this to the histology file for #1048. In these, we removed benign and non-tumors, leaving 22 groups. Below is the number of unique patients and samples per cancer_group. This may change in OpenPBTA due to slight consensus/subtyping differences. Thoughts?

                                       Var1 Freq
1        Adamantinomatous Craniopharyngioma   22
4          Atypical Teratoid Rhabdoid Tumor   32
7                                  Chordoma    6
10                 Choroid plexus papilloma   16
11                      CNS Embryonal tumor   14
13                        Craniopharyngioma   16
14         Diffuse intrinsic pontine glioma   10
15                   Diffuse midline glioma   82
16   Dysembryoplastic neuroepithelial tumor   26
18 Embryonal tumor with multilayer rosettes    7
19                               Ependymoma   97
21                            Ewing sarcoma   10
29                 Glial-neuronal tumor NOS    7
31            High-grade glioma/astrocytoma  103
33             Low-grade glioma/astrocytoma  310
35                          Medulloblastoma  127
37                               Meningioma   32
38              Metastatic secondary tumors    5
43                   Neurofibroma/Plexiform   23
50                                  Sarcoma    6
51                               Schwannoma   19
53                                 Teratoma   10
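As a rough sketch of how per-group counts like these could be tallied from sample-level rows, the snippet below counts distinct patients and distinct samples per cancer_group. The row layout, the `participant_id` and `sample_id` field names, and the example rows are assumptions made for illustration; this is not the logic that produced the table above.

```python
from collections import defaultdict

def count_per_group(rows):
    """Return {cancer_group: (n_unique_patients, n_unique_samples)}."""
    patients = defaultdict(set)
    samples = defaultdict(set)
    for row in rows:
        group = row["cancer_group"]
        if not group:
            # Skip rows that have no cancer_group assigned.
            continue
        patients[group].add(row["participant_id"])
        samples[group].add(row["sample_id"])
    return {g: (len(patients[g]), len(samples[g])) for g in patients}

# Hypothetical rows; one patient (P1) contributes two samples.
rows = [
    {"participant_id": "P1", "sample_id": "S1", "cancer_group": "Ependymoma"},
    {"participant_id": "P1", "sample_id": "S2", "cancer_group": "Ependymoma"},
    {"participant_id": "P2", "sample_id": "S3", "cancer_group": "Ependymoma"},
    {"participant_id": "P3", "sample_id": "S4", "cancer_group": "Chordoma"},
    {"participant_id": "P4", "sample_id": "S5", "cancer_group": None},
]

counts = count_per_group(rows)
print(counts)
```

Using sets keeps a patient with several samples from being counted more than once in the patient tally.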

@jaclyn-taroni
Member Author

Adding this to the histology file for v20 for this purpose sounds good 👍🏻

@kgaonkar6
Collaborator

Just wanted to add the original ticket for assigning cancer_group here for reference.

@kgaonkar6
Collaborator

Should I close this since cancer_group was added in #1128? I'll add another ticket to update figures/mapping-histology-labels.Rmd to generate the new display_group hex_codes.

@jharenza
Collaborator

Sure! We probably also only want to plot cancer groups with N >= 5. display_group will stay in some figures, but we will need cancer_group for others.
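A minimal sketch of the N >= 5 cutoff, using a handful of the counts posted earlier in this thread. The `keep_groups` helper is hypothetical, and the counts may differ in the final OpenPBTA histology file:

```python
def keep_groups(freq, min_n=5):
    """Return the cancer groups whose count is at least min_n, sorted by name."""
    return sorted(group for group, n in freq.items() if n >= min_n)

# A subset of the cancer_group counts posted earlier in this thread.
freq = {
    "Chordoma": 6,
    "Embryonal tumor with multilayer rosettes": 7,
    "Low-grade glioma/astrocytoma": 310,
    "Medulloblastoma": 127,
    "Metastatic secondary tumors": 5,
}

plotted = keep_groups(freq)            # default cutoff of N >= 5
stricter = keep_groups(freq, min_n=10)  # a stricter cutoff, for comparison
print(plotted)
print(stricter)
```

With the counts as posted, every listed group passes N >= 5; Metastatic secondary tumors sits exactly at the cutoff.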

@kgaonkar6
Collaborator

I believe this is closed with #1142 and #1159. Please reopen if required.
