Promiscuous compounds #617

Open
sandrine-muller-research opened this issue Oct 24, 2023 · 11 comments
Labels: needs review, retest needed

Comments

@sandrine-muller-research
Collaborator

sandrine-muller-research commented Oct 24, 2023

Looking more closely at the results (the first 5 pages of roughly 50 compounds) for several queries (~10), I have started to see some ChemicalEntities that seem to be promiscuous (they come back as answers for a lot of queries):
sirolimus
valproic acid
gadolinium
curcumin
Ergocalciferol
Ozone
Nicotine
Tacrolimus
Mercury
propofol
Potassium chloride
aspirin
tramadol
Sodium chloride
Calcium
codeine
Methylene blue
Propranolol
Ethanol
Oxygen
Analgesics
Ibuprofen
They seem to come up as results because they are linked to a lot of diseases, and they score too high for my taste. I wonder whether @MarkDWilliams's cardinality could help rank them lower?

@sandrine-muller-research
Collaborator Author

sandrine-muller-research commented Nov 20, 2023

Hydrogen
Bisphenol a
Aspirin
Nitrogen
Hydrogen peroxide
Metformin
Estradiol
Morphine
Zinc
Imatinib
Formaldehyde

@sierra-moxon sierra-moxon added needs review this ticket needs a broad group of people to review and assign next steps because it crosses teams group2 labels Jan 23, 2024
@cbizon
Collaborator

cbizon commented Jan 23, 2024

Needs investigation to determine the extent and cause. Perhaps this would be a good thing to define some tests around.

@cbizon cbizon added deferred/will not fix this release and removed needs review this ticket needs a broad group of people to review and assign next steps because it crosses teams labels Jan 23, 2024
@khanspers

Additional compounds that are frequently returned as results for various queries:

  • Lipopolysaccharide
  • Cobalt chloride

@sierra-moxon
Member

Adding "needs review" so that we can assign a volunteer or group of volunteers to design tests and/or otherwise determine next steps.

@sierra-moxon sierra-moxon added the needs review this ticket needs a broad group of people to review and assign next steps because it crosses teams label May 17, 2024
@sierra-moxon
Member

From TAQA:
dexamethasone
vitamin A
chemotherapeutic agents
are further examples (transcription-bombs seem to link everything together).
TNF is a gene that shows up all over the place (but the evidence is sound).

What we need to do:

  • Go through a series of queries programmatically and answer these questions:
      • Is it always one ARA that is returning one of these, or a set of them? Is it from one source or many sources? (This could change what we want to do about it.)
      • Is there evidence for each of these, or are they in the middle of hops (highly connected nodes, etc.)?
      • Do ARAs check the inverse? Do we also check what this drug can treat, rather than only the other way around? It might affect the results that come back. Where a drug comes back for a disease, query by the drug to see if there are still results.
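
A minimal sketch of how that programmatic pass could look, assuming each answer has already been flattened into one record per (query, ARA, result, source). The field names, the sample records, and the `summarize` and `promiscuous` helpers are all hypothetical and do not reflect the actual response format.

```python
from collections import defaultdict

# Hypothetical flattened records: one row per (query, ARA, result, source).
# Field names and values are assumptions for illustration only.
records = [
    {"query": "asthma", "ara": "ara_a", "result": "valproic acid", "source": "source_x"},
    {"query": "asthma", "ara": "ara_b", "result": "valproic acid", "source": "source_y"},
    {"query": "migraine with aura", "ara": "ara_a", "result": "valproic acid", "source": "source_x"},
    {"query": "asthma", "ara": "ara_a", "result": "compound_z", "source": "source_x"},
]


def summarize(records):
    """For each result, collect the distinct queries, ARAs and sources behind it."""
    summary = defaultdict(lambda: {"queries": set(), "aras": set(), "sources": set()})
    for r in records:
        entry = summary[r["result"]]
        entry["queries"].add(r["query"])
        entry["aras"].add(r["ara"])
        entry["sources"].add(r["source"])
    return summary


def promiscuous(summary, min_queries):
    """Names of results seen in at least `min_queries` distinct queries, sorted."""
    return sorted(name for name, e in summary.items() if len(e["queries"]) >= min_queries)


summary = summarize(records)
for name in promiscuous(summary, min_queries=2):
    e = summary[name]
    print(name, len(e["queries"]), sorted(e["aras"]), sorted(e["sources"]))
```

Grouping by result first makes the "one ARA or many, one source or many" questions a matter of reading the set sizes. The inverse check would need a second set of queries and is not covered here.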

@sierra-moxon sierra-moxon removed O&O issue ordering & organizing issue feature request (not UI) needs review this ticket needs a broad group of people to review and assign next steps because it crosses teams labels May 17, 2024
@Genomewide

Genomewide commented May 22, 2024

I ran 37 diseases through "what may treat". I attempted to have a diversity of disease types (blindness, constipation, AD, webbed toes, PTSD, etc.). See the table below with diseases and PKs.
Spreadsheet with a summary of the results that appeared in 10 or more of the 37: https://docs.google.com/spreadsheets/d/1-6JyZeqEKQbDDh-gl5cygFIWVLqJTUCK4fYxYTVQNIQ/edit?usp=sharing

I included a breakout by ARA, along with the number of queries for which each ARA gave 1 or more answers. I am not sure how to count things when an ARA did not provide any results.

For the screenshot below:
Aragorn returned valproic acid 3 times and had results for 13 of the queries.
Improving agent returned valproic acid 6 times and had results for 28 of the queries. So, twice as many instances of the common result but also about twice as many queries with results.
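
One way to make that comparison concrete is to divide the number of appearances by the number of queries each ARA answered. This is only a sketch: the counts are the two quoted above, and the `hit_rate` helper and its handling of ARAs with no results are assumptions, not an agreed convention.

```python
# Counts quoted above: (times valproic acid was returned, queries with results).
counts = {
    "Aragorn": (3, 13),
    "Improving Agent": (6, 28),
}


def hit_rate(hits, answered):
    """Fraction of answered queries in which the compound appeared."""
    if answered == 0:
        return None  # undefined when the ARA returned no results at all
    return hits / answered


rates = {ara: hit_rate(hits, answered) for ara, (hits, answered) in counts.items()}
for ara, rate in rates.items():
    print(f"{ara}: {rate:.2f}")
# Aragorn: 0.23
# Improving Agent: 0.21
```

Normalized this way, the two ARAs return valproic acid at a similar rate, which is consistent with the "twice as many, but also twice as many queries" reading.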

I don't know if this is bad or good or neither? However, if you are a diabetic on metformin being given cyclosporin to suppress your immune system while controlling your seizures with valproic acid, you may be immortal!

image

Here are the diseases and the PKs

resultPK  disease
87d4eb93-8e4b-4955-8337-7333661e2be9 Alzheimer disease
0ac4941a-31d2-43ef-8e03-ad0d5255dae0 amyotrophic lateral sclerosis
7d586a9d-0ec7-4995-988b-a69508079327 antisocial personality disorder
c78f91cf-f49a-4c3d-ad31-924cefc2e255 asthma
07dd4bc7-f15e-43e5-b9ea-042c85e0e1f0 atopic eczema
700932eb-ca88-4fc3-9613-d43884d334ec attention deficit-hyperactivity disorder
98bf2629-1773-4b67-b3d9-03a4d9ae0bb8 autism spectrum disorder
2f91fc50-c72f-41ae-ba35-726f42202e19 blindness (disorder)
d4ee111c-5239-4ff9-88d1-b5e5db443ac7 breast cancer
7914d1fd-4598-4c48-afe9-28cd6eaefbcd Chronic constipation
31f0701e-ab18-4ee4-bc80-1d15f7e4649b chronic obstructive pulmonary disease
582d695f-a038-4c16-b3c3-173cda1b007a common cold
7bf839cd-b1a4-4038-8b41-30ba7fde4c1d Cystic acne
6c535d3b-f7db-403e-bb82-f6e3ade32aa7 cystic fibrosis
77bcb34c-b259-4985-b958-d705fd7b10d3 dental caries
a3f1baa6-e770-4f7e-9719-a1d3c102f9a7 depressive disorder
59104990-cc3c-4ae6-982b-db23fb5b4b46 diabetes insipidus
8901a072-9329-4172-b15b-7ede64f561cc diabetes mellitus
faa841f6-bda7-4302-bd9b-b6c73c1f9c59 dissociative amnesia
d66328fb-4942-482d-8b8e-dfe4de348861 gastroesophageal reflux disease
65e48ec5-665c-4cfe-9248-de7c6318bd67 generalized anxiety disorder
3aa82c04-b1e2-4d28-a114-59220a629fea Heart murmur
0fdf5581-96da-494d-be4b-ff7e961723f5 hemorrhoid
08398164-dbdf-4e97-b589-fdd75968e1db Huntington disease
24f4069c-2893-437e-b218-53e666cd08b6 impetigo
68e3944c-d06b-4f8d-8779-0cce684da7c2 Ischemic stroke
81879c43-e025-4729-982b-179d42b43e2c lung cancer
92501dbd-7117-4ef7-a5c2-de64874fc08e manic bipolar affective disorder
d79d5108-6538-4814-95ba-d60f1e653f29 meningitis
9f75caee-1c13-4a56-bbfc-574ae6115cda migraine with aura
698b6f1a-ed82-4ea8-a6f7-3719ded3c453 muscular dystrophy
77679e30-e435-454b-b756-83dc72fdc03a pleurisy
75df6e52-8c6b-450b-8502-cf0a62f1bd20 post-traumatic stress disorder
2f138a27-9498-46dd-ab83-f6c9232affa4 pulmonary fibrosis
6ac5109b-8fe2-49ba-a1b2-7f664624029d social phobia
00c6b6ca-b5f3-4850-971d-075f29ee414b sunburn
84446e96-4fda-420f-9023-a8d0d8ba6db5 Toe syndactyly

@Genomewide

I did not do this for MVP2. Do we need that? I can redo this process for that as well, but will wait for feedback on what mistakes I made here.

@sierra-moxon sierra-moxon added the needs review this ticket needs a broad group of people to review and assign next steps because it crosses teams label Jun 1, 2024
@sandrine-muller-research
Collaborator Author

I think those nodes have a high degree (or perhaps a high first degree). Perhaps the node "cardinal" (not sure that was the correct name, sorry, I have a blank) that @MarkDWilliams created could help? There are ways to correct for node degree effects, but I do not think it could be done easily. At least informing the user about the "connectivity" of the node or of its close neighbors could be helpful? (For example, an icon or something that says it is potentially a false positive.)
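
To illustrate what a degree correction and a "connectivity" flag might look like, here is a rough sketch. It is not the cardinality approach mentioned above: the log-based damping, the threshold, the function names, and the degrees and scores are all made up for illustration.

```python
import math


def degree_adjusted(score, degree):
    """Damp a raw score by the log of the node's degree (an illustrative choice)."""
    return score / math.log2(2 + degree)


def flag_high_degree(degrees, threshold):
    """Names of nodes whose degree is at or above `threshold`."""
    return {name for name, d in degrees.items() if d >= threshold}


# Made-up degrees and raw scores, purely for illustration.
degrees = {"compound_a": 5000, "compound_b": 12}
raw = {"compound_a": 0.9, "compound_b": 0.8}

adjusted = {name: degree_adjusted(raw[name], degrees[name]) for name in raw}
ranked = sorted(adjusted, key=adjusted.get, reverse=True)
flagged = flag_high_degree(degrees, threshold=1000)

print(ranked)   # highly connected compound_a drops below compound_b
print(flagged)  # nodes that could carry a "potentially a false positive" icon
```

The flag alone needs no change to scoring, so it may be the easier of the two to try first.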

@sierra-moxon
Member

from TAQA:

  • [Sandrine] - it would be nice to know how connected these compounds are to the result - the hypothesis here is that we are going through highly connected nodes.
  • [Sui] - common graph issue - if we didn't already decrease the highly connected nodes we would see water, etc. all the time.
  • [Sandrine] - maybe there is a secondary issue where the chemicals aren't just promiscuous in biology, but specifically in disease treatment (see Andy's note)
  • [Sui] - lots of times these come from SEMMED or possibly text mining in general.
  • [Chris] - I think that this is the right approach - run a broad set of tests and aggregate. Could this be pulled out of the tests that are already being run? The breadth of the diseases Andy tested is nice here. For discussion in the testing team. @maximusunc ?

@dkoslicki
Member

Not at all arguing against the fact that this really should be addressed, but is this an actual deployment showstopper? I ask because such a pervasive issue (but a subtle one, as these "promiscuous" results are mixed in with better-quality ones), and one that many ARAs are experiencing, leads me to believe that this will not have a quick fix. I.e., instead of a showstopper, what about making it one of the sprint priorities?

@sandrine-muller-research
Collaborator Author

I agree with your analysis. I added this tag so that it gets a chance to be discussed at TAQA and we can collectively choose a proper action (one of the sprints or the next phase?). I am happy to remove showstopper.

7 participants