Query Selector Duplicates #120

Closed
sstemann opened this issue Feb 16, 2023 · 17 comments
Labels: autocomplete; UI - feature request (future enhancement identified by user); UI - term selection (identification of the specific node and context to be selected for a query)

Comments

@sierra-moxon
Member

potentially related issues:

@gaurav

gaurav commented Feb 17, 2023

I'm planning to generate some kind of report of identically named concepts from the Name Resolver for issue TranslatorSRI/NameResolution#40
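
As a rough illustration of what such a report might compute (this is only a sketch, not the Name Resolver's actual code): group concepts by a case-insensitive preferred name and keep the names shared by more than one CURIE. The function name `find_duplicate_names`, the input shape, and the `EXAMPLE:` identifier are assumptions made for this example; the two Triclopyr CURIEs are the ones reported in this thread.

```python
from collections import defaultdict


def find_duplicate_names(concepts):
    """Return {normalized name: sorted CURIEs} for names used by 2+ concepts.

    `concepts` is an iterable of (curie, preferred_label) pairs. This input
    shape is hypothetical and is not the Name Resolver's data format.
    """
    by_name = defaultdict(list)
    for curie, label in concepts:
        # Compare names case-insensitively and ignore surrounding whitespace.
        by_name[label.strip().lower()].append(curie)
    return {
        name: sorted(curies)
        for name, curies in by_name.items()
        if len(curies) > 1
    }


# Illustrative input: the EXAMPLE: entry is a placeholder, not a real concept.
concepts = [
    ("HMDB:HMDB0259186", "TRICLOPYR"),
    ("PUBCHEM.COMPOUND:41428", "Triclopyr"),
    ("EXAMPLE:0000001", "some other concept"),
]

for name, curies in find_duplicate_names(concepts).items():
    print(name, curies)
```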

@sierra-moxon
Member

@gaurav - ok to assign this to you for a bit while you generate the report? That might help us figure out the problem or how widespread the problem could be?

@gaurav

gaurav commented Feb 24, 2023

That is A-okay!

@sstemann
Author

sstemann commented Mar 9, 2023

UI team is also working on fixes for this

@sstemann added the "UI - term selection" and "UI - feature request" labels on Mar 9, 2023
@sierra-moxon added this to the July 31 milestone on Jun 1, 2023
@sierra-moxon
Member

Screen Shot 2023-06-05 at 12 57 21 PM

@sierra-moxon modified the milestones: B: July 31, A: June 30 on Jun 6, 2023
@gaurav

gaurav commented Jun 23, 2023

Andy:

  • Turning off everything that isn't MONDO helps, because we would expect a single ontology to have consistent names.
  • Adding in HP might cause this problem to return.
  • Therefore, identifying cliques with identical preferred names isn't a high priority for us right now.
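
A minimal sketch of the kind of prefix filter described in the first bullet, assuming autocomplete results arrive as dictionaries with a `curie` key. The result shape, the function name `filter_by_prefix`, and the placeholder identifiers are all hypothetical; this is not the deployed implementation.

```python
def filter_by_prefix(results, allowed_prefixes=("MONDO",)):
    """Keep only results whose CURIE prefix is in `allowed_prefixes`."""
    allowed = {prefix.upper() for prefix in allowed_prefixes}
    return [
        result
        for result in results
        # The CURIE prefix is everything before the first colon.
        if result["curie"].split(":", 1)[0].upper() in allowed
    ]


# Placeholder identifiers and labels, for illustration only.
results = [
    {"curie": "MONDO:0000000", "label": "example disease"},
    {"curie": "HP:0000000", "label": "example disease"},
]

print(filter_by_prefix(results))
```

Passing `allowed_prefixes=("MONDO", "HP")` would re-admit HP results, which is the case the second bullet warns could bring the duplicates back.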

@Genomewide

Genomewide commented Jun 23, 2023

Screen shot showing the fix for the above description by @gaurav
image

this is currently in CI.

@sstemann
Author

The disease and chemical query inputs are improved, but I'm not sure about the "What genes are up/down regulated by: [chemical]?" search.

image

Looks like one is HMDB:HMDB0259186 (https://ui.test.transltr.io/results?l=Triclopyr&i=HMDB:HMDB0259186&t=3&q=938bddae-c95f-4a88-adf3-3712a5a3674e) and the other is PUBCHEM.COMPOUND:41428 (https://ui.test.transltr.io/results?l=Triclopyr&i=PUBCHEM.COMPOUND:41428&t=3&q=b5a43e9a-6e26-4c22-b7ee-231df4c1e645)

From node norm it looks like HMDB:HMDB0259186 isn't matching to anything. Can we tell if this is a systematic issue with HMDB CURIEs? Does HMDB have chemicals that other sources don't that need to be included in MVP2?

{
  "HMDB:HMDB0259186": {
    "id": {
      "identifier": "HMDB:HMDB0259186",
      "label": "TRICLOPYR"
    },
    "equivalent_identifiers": [
      {
        "identifier": "HMDB:HMDB0259186",
        "label": "TRICLOPYR"
      }
    ],
    "type": [
      "biolink:SmallMolecule",
      "biolink:MolecularEntity",
      "biolink:ChemicalEntity",
      "biolink:NamedThing",
      "biolink:Entity",
      "biolink:PhysicalEssence",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProteinOrPolypeptide",
      "biolink:PhysicalEssenceOrOccurrent"
    ]
  }
}
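
One way to check whether this is systematic would be to scan node norm responses for cliques whose only member is the queried CURIE itself. The sketch below assumes responses shaped like the one above; the helper name `singleton_cliques` is hypothetical, and skipping null entries for unrecognized CURIEs is an assumption about the service's behaviour.

```python
import json


def singleton_cliques(response):
    """Return the CURIEs whose only equivalent identifier is the CURIE itself."""
    singles = []
    for curie, entry in response.items():
        if entry is None:
            # Assumption: CURIEs the service does not recognize map to null.
            continue
        equivalents = {
            item["identifier"]
            for item in entry.get("equivalent_identifiers", [])
        }
        # A clique is a singleton when it contains nothing besides the query.
        if equivalents <= {curie}:
            singles.append(curie)
    return singles


# Abbreviated copy of the response quoted above (the "type" list is trimmed).
response = json.loads("""
{
  "HMDB:HMDB0259186": {
    "id": {"identifier": "HMDB:HMDB0259186", "label": "TRICLOPYR"},
    "equivalent_identifiers": [
      {"identifier": "HMDB:HMDB0259186", "label": "TRICLOPYR"}
    ],
    "type": ["biolink:SmallMolecule"]
  }
}
""")

print(singleton_cliques(response))  # ['HMDB:HMDB0259186']
```

Running the same check over a larger batch of HMDB CURIEs would show how many of them fail to merge with identifiers from other sources.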

@sstemann
Author

@dnsmith124 i'm still seeing this use case in CI - should this go back to name resolver?

image

@dnsmith124
Collaborator

I think so @sstemann

Paging @gaurav! See above

@gaurav

gaurav commented Jul 28, 2023

@cbizon is working on Drug Conflation that should help with this. We're hoping to be able to report on this late next week. Thanks for reporting this!

I think it would also be helpful if we could add links to let people look up these concepts in an ontology: what do you think, @dnsmith124?

@dnsmith124
Collaborator

@gaurav I've actually already implemented linkouts for NCBIGENE, PUBCHEM, HMDB and CHEMBL, should be live on CI by next week!

@sstemann
Author

sstemann commented Aug 10, 2023

@gaurav I'm back on the Disease selection. When I type "develop", the first responses I get are "Developmental and Epileptic encephalopathy, 40", but when I type out to "development" the list updates. I'm also wondering how the numbering is ordered when a disease has numbered variations. This might be more related to #85.

image

image

@sstemann
Author

for drugs the link out seems to be more spotty
image

@sstemann
Author

seems improved in prod 9/6
image

@Genomewide

@dnsmith124 I think we can work on the ordering of the autocomplete a bit. One tune-up, as suggested above, is to have the numbered subclasses come back in numeric order. Another is to have any exact match come to the top. I am thinking of a full exact match to the string. So, 'Alz' would exactly match all the numbered Alzheimer diseases but 'Alzheimer diseases' may only match one?

Related is also having the human gene show up at the top of the list.
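
A possible shape for that ordering, as a sketch only: sort full exact matches first, then group labels by their text with any trailing number removed, and order each group numerically so that "2" comes before "10". The function name `rank_suggestions` and the sample labels are illustrative assumptions, not the UI's current code, and the human-gene preference is not covered here.

```python
import re


def rank_suggestions(query, labels):
    """Order labels: exact matches first, numbered variants in numeric order."""
    normalized_query = query.strip().lower()

    def sort_key(label):
        text = label.strip().lower()
        exact_rank = 0 if text == normalized_query else 1
        # Split off a trailing number, e.g. "..., 40" -> stem "..." and 40.
        match = re.search(r"(\d+)\s*$", text)
        if match:
            stem = text[:match.start()].rstrip(" ,")
            number = int(match.group(1))
        else:
            stem, number = text, -1  # the unnumbered parent sorts first
        return (exact_rank, stem, number)

    return sorted(labels, key=sort_key)


# Sample labels chosen for illustration only.
labels = [
    "Alzheimer disease 10",
    "Alzheimer disease 2",
    "Alzheimer disease",
    "Alzheimer disease 1",
]

print(rank_suggestions("alzheimer disease", labels))
```

Comparing the trailing number as an integer rather than as text is what keeps "Alzheimer disease 2" ahead of "Alzheimer disease 10".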


6 participants