data portal - environmental local scale search issue #1173

Open
aclum opened this issue Mar 7, 2024 · 5 comments
Labels
type: bug Something isn't working

Comments

@aclum
aclum commented Mar 7, 2024

The data portal is showing counts for a parent class without showing the child classes they belong to. Yesterday I was using the data portal to find the environmental local context terms for NEON soil samples from Colorado (search filter).
When I then try to navigate to the term value, I see that astronomical body part has a count of 517 but none of its children have any counts. Compare fiat object part, which has a child, environmental zone, that can be used to navigate down to the actual terms. One of the expected values, several levels down, is 'area of gramanoid or herbaceous vegetation'.
See the attached image for a view of the missing counts for the children of astronomical body part.
@turbomam confirmed that when he looks at the ontology JSON independently, the terms should be navigable via astronomical body part. Based on this, we believe the issue is on the nmdc-server side rather than in the nmdco-classes.json file.

Mark's comments:
Ontology Access Kit can be used to check nmdco-classes.json independently of the DataPortal

runoak --input pronto:nmdco-classes.json tree 'area of gramanoid or herbaceous vegetation'
  • [] http://www.w3.org/2002/07/owl#Thing
    • [i] BFO:0000001 ! entity
      • [i] BFO:0000002 ! continuant
        • [i] BFO:0000004 ! independent continuant
          • [i] BFO:0000040 ! material entity
            • [i] BFO:0000024 ! fiat object part
              • [i] ENVO:01000408 ! environmental zone
                • [i] ENVO:01001199 ! terrestrial environmental zone
                  • [i] ENVO:01001305 ! vegetated area
                    • [i] ENVO:00000106 ! grassland area
                    • [i] ENVO:01000888 ! area of gramanoid or herbaceous vegetation
            • [i] ENVO:01000813 ! astronomical body part
              • [i] ENVO:01001199 ! terrestrial environmental zone
                • [i] ENVO:01001305 ! vegetated area
                  • [i] ENVO:00000106 ! grassland area
                    • [i] ENVO:01000888 ! area of gramanoid or herbaceous vegetation
aclum added the "type: bug" (Something isn't working) label Mar 7, 2024
@jeffbaumes
Collaborator

I believe what is happening is that the repeated subtree starting at "ENVO:01001199 ! terrestrial environmental zone" is (arbitrarily) only placed under "ENVO:01000408 ! environmental zone" and not "ENVO:01000813 ! astronomical body part". The current implementation never repeats subtrees. The real data structure for the ontology is a complex inter-dependent directed acyclic graph and simplifying assumptions were made for the UI.

This is the relevant issue describing the complexity and tradeoffs.

This notebook shows the general approach we decided to take to "treeify" a complex directed acyclic graph. Note that it states:

The first step is to make the directed acyclic graph into a tree. We do this by arbitrarily taking the first parent node from each node as the true parent, and discarding the rest of the parent links
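The quoted "first parent" step can be sketched roughly as below. This is only an illustration, not the actual nmdc-server code: the `PARENTS` mapping is a hypothetical, simplified copy of the runoak output earlier in this thread, and the order in which each node's parents are listed is an assumption.

```python
# Hypothetical child -> ordered list of parents, hand-copied (and simplified)
# from the runoak tree output above. The parent order is an assumption.
PARENTS = {
    "BFO:0000024": ["BFO:0000040"],                      # fiat object part
    "ENVO:01000813": ["BFO:0000040"],                    # astronomical body part
    "ENVO:01000408": ["BFO:0000024"],                    # environmental zone
    "ENVO:01001199": ["ENVO:01000408", "ENVO:01000813"], # terrestrial environmental zone
    "ENVO:01001305": ["ENVO:01001199"],                  # vegetated area
    "ENVO:01000888": ["ENVO:01001305"],                  # area of gramanoid or herbaceous vegetation
}


def treeify(parents):
    """Keep only the first listed parent of each node and return parent -> children."""
    children = {}
    for node, node_parents in parents.items():
        if node_parents:
            # Every parent after the first is discarded here.
            children.setdefault(node_parents[0], []).append(node)
    return children


tree = treeify(PARENTS)
print(tree.get("ENVO:01000408", []))  # environmental zone keeps the shared subtree
print(tree.get("ENVO:01000813", []))  # astronomical body part ends up with no children
```

Under these assumptions, terrestrial environmental zone is attached only to environmental zone, so astronomical body part has no children to expand in the tree even though it is still an ancestor of the terms in the underlying graph.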

Note the comment from @cmungall in the linked issue:

ontology group defines initial exclusion sets (e.g. astronomical body part)

This makes me think that "ENVO:01000813 ! astronomical body part" should not be part of the tree presented to the user, perhaps for exactly this reason: including it would lead to a duplicated ontology subtree. That would be my proposed solution to this particular inconsistency.
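One way an exclusion set could be applied before building the tree is sketched below. The function name `apply_exclusions`, the `parents` mapping, and the contents of `EXCLUDED` are all hypothetical; how nmdc-server would actually implement this is not stated in this thread.

```python
# Hypothetical exclusion set; the ontology group would define the real one.
EXCLUDED = {"ENVO:01000813"}  # astronomical body part


def apply_exclusions(parents, excluded):
    """Drop excluded nodes, both as entries and as parents of other nodes."""
    return {
        node: [p for p in node_parents if p not in excluded]
        for node, node_parents in parents.items()
        if node not in excluded
    }


# Hypothetical child -> parents fragment based on the runoak output above.
parents = {
    "ENVO:01000813": ["BFO:0000040"],
    "ENVO:01000408": ["BFO:0000024"],
    "ENVO:01001199": ["ENVO:01000408", "ENVO:01000813"],
}

pruned = apply_exclusions(parents, EXCLUDED)
print(pruned["ENVO:01001199"])  # only environmental zone remains as a parent
```

With the excluded node removed, terrestrial environmental zone has a single parent in this fragment, so no arbitrary choice between parents is needed for it.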

@aclum
Author

aclum commented Mar 7, 2024

@turbomam
turbomam commented Mar 7, 2024

Good comments about DAGs vs trees, @jeffbaumes . I misspoke in today's meeting.

@turbomam
turbomam commented Mar 7, 2024

I think this should be driven by user stories. What are the searching or browsing patterns we expect?

Would anybody ever search for an intermediate node that has been left out of the exposed hierarchy? If so, then maybe we should omit intermediate nodes.

Would anybody skip the searching step and just browse through the subclasses? If so, then we should probably include all paths to a leaf.
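To make "all paths to a leaf" concrete, here is a minimal sketch that enumerates every root-to-term path in a small DAG. The `PARENTS` mapping is a hypothetical, simplified copy of the runoak output earlier in this thread, and `all_paths_to` is an illustrative helper, not an existing function in nmdc-server or the Ontology Access Kit.

```python
# Hypothetical child -> parents mapping, simplified from the runoak output above.
PARENTS = {
    "BFO:0000024": ["BFO:0000040"],
    "ENVO:01000813": ["BFO:0000040"],
    "ENVO:01000408": ["BFO:0000024"],
    "ENVO:01001199": ["ENVO:01000408", "ENVO:01000813"],
    "ENVO:01001305": ["ENVO:01001199"],
    "ENVO:01000888": ["ENVO:01001305"],
}


def all_paths_to(term, parents):
    """Return every path from a root (a node with no parents) down to term."""
    node_parents = parents.get(term, [])
    if not node_parents:
        return [[term]]
    paths = []
    for parent in node_parents:
        for path in all_paths_to(parent, parents):
            paths.append(path + [term])
    return paths


for path in all_paths_to("ENVO:01000888", PARENTS):
    print(" > ".join(path))
```

In this fragment the term is reachable by two paths, one through fiat object part and one through astronomical body part. Showing both in a browsable hierarchy would mean repeating the subtree that starts at terrestrial environmental zone.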

The OBO Foundry community is very active in developing tools for this kind of thing, and the obo-community Slack workspace is busy. If none of you want to join it, I can pass on any questions or requests you share with me.

@aclum
Author

aclum commented Mar 9, 2024

My use case from this week was browsing through sub-classes.
