Current "related compounds" is ambiguous #2484

Open
Adafede opened this issue Apr 20, 2024 · 2 comments
Labels
enhancement some suggestions to improve Scholia

Comments

Adafede commented Apr 20, 2024

Is your feature request related to a problem? Please describe.
Currently, the list of compounds with the same connectivity lumps together quite different kinds of relationships, including isotopomers.

Describe the solution you'd like
Either clarify the heading or split the list into subcategories.
If we split it, I am happy to rewrite the respective queries to use the InChI rather than the InChIKey, so that the individual layers can be stripped.
We could also make use of P3364 and P6185.
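As a rough illustration of what "stripping layers" could mean, here is a minimal Python sketch. It assumes standard InChI strings in which each layer starts with "/" followed by a one-letter prefix; the helper name `strip_layers` and the two alanine strings are only illustrative and are not part of Scholia.

```python
import re


def strip_layers(inchi, prefixes):
    """Remove every InChI layer whose one-letter prefix is in `prefixes`.

    A layer is taken to run from "/<letter>" up to the next "/" or the end
    of the string. This is a simplification: it does not distinguish
    main-layer stereo from stereo sublayers nested under the isotopic layer.
    """
    pattern = re.compile(r"/[" + re.escape(prefixes) + r"][^/]*")
    return pattern.sub("", inchi)


# Two alanine enantiomers: they differ only in the /m layer.
ENANTIOMER_A = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
ENANTIOMER_B = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

# With the stereo layers (/b, /t, /m, /s) removed, both collapse to one string.
flat_a = strip_layers(ENANTIOMER_A, "btms")
flat_b = strip_layers(ENANTIOMER_B, "btms")
print(flat_a == flat_b)
```

The same idea carries over to SPARQL via `REPLACE`, which is not possible with the hashed InChIKey blocks.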

Describe alternatives you've considered
Leaving things as they are right now (but removing the "including the compound itself" wording; see a52e3e7).

Additional context
Trying to improve the chemistry side of Scholia.

Adafede added the enhancement (some suggestions to improve Scholia) label Apr 20, 2024
egonw self-assigned this Apr 24, 2024

egonw commented Apr 24, 2024

I need to think about this a bit more. I like the idea, but need to think through the implications.

Adafede commented Apr 24, 2024

I also thought it over again, and here is what came to mind (WIP):

The idea is to keep the same table but add a column that classifies each compound ("stereoisomer", "isotopomer", etc.) based on which InChI layers match:

PREFIX target: <http://www.wikidata.org/entity/Q41576>

# title: related chemical structures
SELECT ?mol ?molLabel ?InChI ?InChIKey ?CAS ?ChemSpider ?PubChem_CID ?layer_b ?layer_t ?layer_m ?layer_s WITH {
  SELECT ?queryKey ?srsearch ?filter WHERE {
    target: wdt:P235 ?queryKey .
    BIND(CONCAT(SUBSTR(?queryKey,1,14), " haswbstatement:P235") AS ?srsearch)
    BIND(CONCAT("^", SUBSTR(?queryKey,1,14)) AS ?filter)
  }
} AS %MOLS WITH {
  SELECT ?mol ?InChIKey WHERE {
    INCLUDE %MOLS
    SERVICE wikibase:mwapi {
      bd:serviceParam wikibase:endpoint "www.wikidata.org";
      wikibase:api "Search";
      mwapi:srsearch ?srsearch;
      mwapi:srlimit "max".
      ?mol wikibase:apiOutputItem mwapi:title.
    }
    ?mol wdt:P235 ?InChIKey .
    FILTER (REGEX(STR(?InChIKey), ?filter))
    FILTER (?InChIKey != ?queryKey)
  }
} AS %MOLS2 {
  INCLUDE %MOLS2
  ?mol wdt:P234 ?InChI .
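  # WIP: extract the first /b, /t, /m and /s layer of the InChI (empty string if absent)
  BIND(IF(CONTAINS(?InChI, "/b"), REPLACE(?InChI, "^.*?/b([^/]*).*$", "$1"), "") AS ?layer_b)
  BIND(IF(CONTAINS(?InChI, "/t"), REPLACE(?InChI, "^.*?/t([^/]*).*$", "$1"), "") AS ?layer_t)
  BIND(IF(CONTAINS(?InChI, "/m"), REPLACE(?InChI, "^.*?/m([^/]*).*$", "$1"), "") AS ?layer_m)
  BIND(IF(CONTAINS(?InChI, "/s"), REPLACE(?InChI, "^.*?/s([^/]*).*$", "$1"), "") AS ?layer_s)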
  OPTIONAL { ?mol wdt:P231 ?CAS }
  OPTIONAL { ?mol wdt:P661 ?ChemSpider }
  OPTIONAL { ?mol wdt:P662 ?PubChem_CID }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
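
To make the proposed extra column more concrete, the following Python sketch shows one possible way to derive a label from the matching layers. It is only an assumption about how the classification might work, not the query's actual logic: the function names, the label strings, and the example InChIs are illustrative, and the split at "/i" ignores the finer structure of the isotopic layer.

```python
STEREO_PREFIXES = ("b", "t", "m", "s")


def describe(inchi):
    """Split an InChI into core, main-layer stereo and isotope parts."""
    # Everything after the first "/i" is treated as the isotopic layer.
    main, sep, iso = inchi.partition("/i")
    parts = main.split("/")[1:]  # drop the "InChI=1S" version prefix
    return {
        "core": tuple(p for p in parts if p[:1] not in STEREO_PREFIXES),
        "stereo": tuple(p for p in parts if p[:1] in STEREO_PREFIXES),
        "isotope": iso.split("/")[0] if sep else "",
    }


def relation(query, other):
    """Label how `other` relates to `query`, based on which layers differ."""
    q, o = describe(query), describe(other)
    if q["core"] != o["core"]:
        return "different connectivity"
    tags = []
    if q["stereo"] != o["stereo"]:
        tags.append("stereoisomer")
    if q["isotope"] != o["isotope"]:
        tags.append("isotopic variant")
    return " + ".join(tags) or "same structure"


# Illustrative inputs: two alanine enantiomers, methane and a deuterated methane.
ALANINE_A = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
ALANINE_B = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"
METHANE = "InChI=1S/CH4/h1H4"
METHANE_D = "InChI=1S/CH4/h1H4/i1D"

print(relation(ALANINE_A, ALANINE_B))
print(relation(METHANE, METHANE_D))
```

Note that a compound with undefined stereochemistry would also be labelled "stereoisomer" here, so the final wording of the column probably needs more categories.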
