nonhuman genetic disease pattern #7794

Merged
merged 3 commits into from
Aug 17, 2024

sabrinatoro
Collaborator

Addresses #6085

I also updated the existing patterns for human genetic diseases to point to the specific "human disease" class rather than the general "disease" class.

- gene

equivalentTo:
  text: '%s and ''has material basis in germline mutation in'' some %s'
Collaborator Author

@cmungall and @matentzn
I did not add the taxon here because the taxon is included in the gene record.
Do you think we should add taxon information here anyway? Is there a reason to add this redundancy?
Thank you!

Member

You need either

  1. a property chain 'has material basis in germline mutation in' o 'in taxon' -> 'in taxon', in RO
  2. to ensure the genus is already species-specific. Presumably this is not always the case, as some PS analogs will be cross-mammal, cross-vertebrate, etc.
  3. just make a var for taxon, even if it duplicates the genus, the differentia, or both

But going back to your Q from slack about infectious disease classification:

  1. I would in general try to keep the ladders parallel. If some order/class/etc.-level virus is a useful grouping for human (H) diseases, it should also be one for non-human (NH) diseases? But I appreciate this may not always be the case, especially for agents that never infect humans, in which case
  2. create a non_human_disease_by_infectious_agent DP
    • this would only use NH terms as genus
    • the taxon of the host would need to go in as a variable
    • classification would follow NCBITaxonomy
    • we can worry about making the xp-analog relations complete via rules later

If you do go with 2, it points to doing the same thing for this PR, e.g. including a taxon var.
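The taxon-variable approach mentioned above (option 3 in the first list) might look roughly like the following DOSDP fragment. This is only a sketch under stated assumptions, not the merged pattern: the pattern name is hypothetical, the relation IDs and the NCBITaxon root class are assumptions that are not given in this thread, and the disease and gene classes are the ones discussed in this PR's review.

```yaml
# Hypothetical sketch: non-human genetic disease pattern with an explicit taxon variable.
pattern_name: nonhuman_genetic_disease_sketch  # hypothetical name, not from the PR

classes:
  disease: MONDO:0005583  # genus class taken from this PR's diff
  gene: SO:0000704        # generic gene class suggested later in this review
  taxon: NCBITaxon:1      # assumption: placeholder root taxon class

relations:
  has material basis in germline mutation in: RO:0004003  # assumption: ID not stated in the thread
  in taxon: RO:0002162                                    # assumption: ID not stated in the thread

vars:
  disease: "'disease'"
  gene: "'gene'"
  taxon: "'taxon'"

equivalentTo:
  # Genus plus two differentiae: the causal gene and the (redundant) host taxon.
  text: "%s and ('has material basis in germline mutation in' some %s) and ('in taxon' some %s)"
  vars:
    - disease
    - gene
    - taxon
```

Carrying the taxon as its own variable keeps the logical definition usable for grouping by species even when the gene identifier alone would already imply the taxon.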

@matentzn
Member

Best add to Tech call agenda

@twhetzel
Collaborator

Added, but the next Tech call is not until Friday, June 21

@sabrinatoro
Collaborator Author

@katiermullen @matentzn @cmungall
I updated the pattern to include the taxon information (decided to go the redundancy route).
Could one of you please review and approve? Thank you!

Collaborator

@katiermullen katiermullen left a comment

@sabrinatoro I have one tiny comment. Can you please take a look? Otherwise, it looks great to me and can be merged.

@@ -22,38 +24,45 @@ annotationProperties:
vars:
  disease: '''disease'''
  gene: '''gene'''
  taxon: "'taxon'"
Collaborator

For my education - is this supposed to be '''taxon''' and not "'taxon'"?

Member

Probably both are OK, but it should definitely always be the same. Good eyes!
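For reference, both quoting styles are valid YAML and parse to a string wrapped in literal single quotes, so the difference is stylistic. A small illustration, with the parsed values shown in comments:

```yaml
vars:
  # Single-quoted YAML scalar: a doubled '' inside it is an escaped single quote.
  disease: '''disease'''  # parsed value: 'disease'
  # Double-quoted YAML scalar: single quotes inside it need no escaping.
  taxon: "'taxon'"        # parsed value: 'taxon'
```

Using one style throughout a pattern file keeps diffs easier to read.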

Member

@matentzn matentzn left a comment

LGTM!


classes:
  disease: MONDO:0005583
  gene: SO:0001217
Member

This is correct but maybe an unnecessary complication for the matching system. Consider using SO:0000704 instead, so that DOSDP matching always works even if the source does not distinguish between protein-coding and non-protein-coding genes.

Collaborator Author

Good call. I will also update the human pattern

- gene

equivalentTo:
  text: '%s and (''has material basis in germline mutation in'' some %s) and (''in taxon'' some %s)'
Member

Note here that there are two ways you can model this:

  1. You can push the taxon into the gene. Assuming you always use HGNC identifiers for humans and NCBIGene identifiers for animals, and you always use the correct gene ID for the taxon-specific gene, this is enough; there is no need to record the taxon separately to distinguish from human disease.
  2. However, you likely want the taxon to separate grouping classes such as "dog disease", which means having the taxon can be helpful (unless the genes in our ingest have them already!). You might also need this for non-gene non-human-animal diseases, so maybe just curate redundantly. If you do, leave it as you have it, but note that technically you may want to add the human taxon to the human diseases.

Long story short: it's probably best to leave it as you have it; I am just adding some considerations for future decision making.

Collaborator Author

After a lot of discussion, we decided to keep the redundancy. But I agree, we might decide to remove this redundancy at a later time

@sabrinatoro sabrinatoro merged commit dff7a44 into master Aug 17, 2024
1 check passed
@sabrinatoro sabrinatoro deleted the issue-6085-240611 branch August 17, 2024 00:40