Protein Forms table for PR:000050215 shows siblings, but shouldn't #164

Open
nataled opened this issue Sep 27, 2019 · 5 comments
@nataled nataled commented Sep 27, 2019

Look into why https://proconsortium.org/app/entry/PR%3A000050215/ lists sibling terms in the Protein Forms table. The same problem shows up in the top table, which reports '3' organism-gene terms included where it should report '1'.

@nataled nataled commented Sep 25, 2020

In this case the issue arises because the parent of these org-gene terms is not a gene term but a family term. The pipeline goes up to the parent and collects all of its children, which works only if the parent of an org-gene term is a gene term. Perhaps we need a restriction so that siblings are shown only if the parent is a gene term. In general, the Protein Forms table should show the term of interest and its children (from that same species). That's it.
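A minimal sketch of the restriction described above, not the actual pipeline code. The `Term` record, the function names, and the toy IDs (`X:og1`, `taxon:A`, and so on) are all hypothetical; the real pipeline reads from the database and may model categories and taxa differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    # Hypothetical record; the real schema is not described in this issue.
    id: str
    category: str                      # e.g. "family", "gene", "organism-gene"
    taxon: Optional[str] = None
    parents: List[str] = field(default_factory=list)


def children_in_same_taxon(term_id: str, terms: Dict[str, Term]) -> List[str]:
    """Direct children of the term that share its taxon."""
    term = terms[term_id]
    return sorted(
        t.id for t in terms.values()
        if term_id in t.parents and t.taxon == term.taxon
    )


def siblings_to_show(term_id: str, terms: Dict[str, Term]) -> List[str]:
    """Siblings reached through a gene-level parent only; family parents are ignored."""
    term = terms[term_id]
    gene_parents = {p for p in term.parents if terms[p].category == "gene"}
    return sorted(
        t.id for t in terms.values()
        if t.id != term_id and gene_parents.intersection(t.parents)
    )


def protein_forms_rows(term_id: str, terms: Dict[str, Term]) -> List[str]:
    """The term of interest followed by its same-taxon children, and nothing else."""
    return [term_id] + children_in_same_taxon(term_id, terms)


# Toy data: og1/og2 hang directly off a family term, og3/og4 off a gene term.
TERMS = {
    t.id: t for t in [
        Term("X:family", "family"),
        Term("X:gene", "gene", parents=["X:family"]),
        Term("X:og1", "organism-gene", "taxon:A", ["X:family"]),
        Term("X:og2", "organism-gene", "taxon:B", ["X:family"]),
        Term("X:og3", "organism-gene", "taxon:A", ["X:gene"]),
        Term("X:og4", "organism-gene", "taxon:B", ["X:gene"]),
        Term("X:form1", "sequence", "taxon:A", ["X:og1"]),
    ]
}
```

With this toy data, `siblings_to_show("X:og1", TERMS)` is empty because the only parent is a family term, while `siblings_to_show("X:og3", TERMS)` still returns the sibling under the shared gene parent.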

Problematic cases would include:

  1. org-gene term with a family parent (this example)
  2. org-gene terms that have siblings from the same taxon
  3. org-gene terms where strains, cultivars, etc. don't have distinct taxa indicated (this example)

2 and 3 might be the same.

@nataled find examples of these

@nataled nataled added Discuss and removed Discuss labels Dec 4, 2020
@nataled nataled commented Dec 18, 2020

Per @Julie-Cowart: the issue exists because the proper display depends on the reasoned version, but the web page doesn't run off the reasoned version (it runs off the database, which is based on the unreasoned version).

One suggestion is to run a SPARQL query to get the required sibling information instead of using the database.
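As a rough illustration of the SPARQL suggestion, the sketch below only builds a query string for the parents and siblings of one term. It assumes the standard OBO PURL pattern for term IRIs and plain `rdfs:subClassOf` links; the helper names are hypothetical, and which endpoint to send the query to (and whether it serves the reasoned version) is left open, since the thread doesn't say. It also does not filter by term category, because how categories are encoded isn't stated here.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_iri(curie: str) -> str:
    """Turn a CURIE such as 'PR:000050215' into an OBO PURL."""
    prefix, local = curie.split(":", 1)
    return f"{OBO_BASE}{prefix}_{local}"


def sibling_query(curie: str) -> str:
    """Build a SPARQL query for the named parents of a term and their other children."""
    iri = curie_to_iri(curie)
    return f"""PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?parent ?sibling WHERE {{
  <{iri}> rdfs:subClassOf ?parent .
  ?sibling rdfs:subClassOf ?parent .
  FILTER (isIRI(?parent) && isIRI(?sibling) && ?sibling != <{iri}>)
}}"""


if __name__ == "__main__":
    print(sibling_query("PR:000050215"))
```

The `isIRI` filter drops anonymous class expressions (restrictions) that also appear as `rdfs:subClassOf` targets in OWL files.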

@nataled nataled commented Feb 26, 2021

Karen: Can we use the modularizer for this?
Julie: The module would require loading all of PRO each time, and would thus be quite slow.

@nataled nataled commented Jun 4, 2021

For @nataled: Find the number of cases where this happens.
For @karenross: Evaluate how something like S. pombe (NCBITaxon:4896) vs. S. pombe 972h- (NCBITaxon:284812) should look. Should the child org-gene term be shown here if the parent is an org-gene term in a different taxon? Would it make sense to include child taxa?

@nataled nataled commented Jun 4, 2021

The number of cases where there's an organism-gene child with a family parent is 573 in the latest release (v63).
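For reference, a count like this could be produced along the following lines. This is a sketch over toy data, not the query that produced the figure above; the function name and the `categories` / `is_a` inputs are hypothetical stand-ins for whatever the release files or database provide.

```python
from typing import Dict, Iterable, Tuple


def count_org_gene_with_family_parent(
    categories: Dict[str, str],
    is_a: Iterable[Tuple[str, str]],
) -> int:
    """Count distinct organism-gene terms with at least one direct family parent."""
    affected = {
        child
        for child, parent in is_a
        if categories.get(child) == "organism-gene"
        and categories.get(parent) == "family"
    }
    return len(affected)


# Toy input: (child, parent) pairs plus a category per term.
CATEGORIES = {
    "X:family": "family",
    "X:gene": "gene",
    "X:og1": "organism-gene",
    "X:og2": "organism-gene",
    "X:og3": "organism-gene",
}
IS_A = [
    ("X:gene", "X:family"),
    ("X:og1", "X:family"),
    ("X:og2", "X:family"),
    ("X:og2", "X:gene"),    # a term with both kinds of parent is counted once
    ("X:og3", "X:gene"),
]
```

Collecting the children in a set means a term with several family parents is counted only once.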
