Enhancement: Improve Networkx MultiDiGraph Metadata #62

Closed
1 task done
callahantiff opened this issue Oct 11, 2020 · 4 comments
@callahantiff
Owner

callahantiff commented Oct 11, 2020

TASK

Task Type: CODEBASE

Improve the node and edge metadata when outputting the NetworkX MultiDiGraph version of each build. Thanks to @rkboyce, who suggested that very small changes to the current NetworkX graph could drastically improve the usability of the output structure.

TODO

Impacted Scripts:

  • knowledge_graph.py
  • converts_rdflib_to_networkx() in utils/kg_utils.py

Needed Functionality:

  • Add a helper function to utils/kg_utils.py that can be called by converts_rdflib_to_networkx(). The helper function will set the following graph attributes for edges:
    • key: a value that is unique to each predicate with respect to the triple it appears in (for example, a hash of the triple); the only requirement is uniqueness
    • weight: defaults to 0

@rkboyce, can you please verify that I have covered the needed changes that we discussed this week correctly above?

I will also be implementing a few changes to the OWL-NETS architecture (issue #56) and will be storing the collapsed semantic information from the full graph as attributes of the transformed OWL-NETS graph, likely in the form of edge and node dictionary entries.

@callahantiff callahantiff added enhancement New feature or request knowledge graph release v2.0.0 noting work and issues related to release v2.0.0 labels Oct 11, 2020
@callahantiff callahantiff added this to the Methods Manuscript milestone Oct 11, 2020
@callahantiff callahantiff added this to Needs to be Done in Coding Tasks via automation Oct 11, 2020
@callahantiff callahantiff self-assigned this Oct 11, 2020
@callahantiff callahantiff added ISMB challenge and removed enhancement New feature or request knowledge graph labels Dec 21, 2020
@callahantiff callahantiff added this to To do in ISMB Bio-Ontologies Challenge via automation Dec 21, 2020
@rkboyce

rkboyce commented Jan 6, 2021

Hi @callahantiff - I agree with the summary for the most part. My suggestion is to make the key some identifier unique across the knowledge graph. Could be just an incremented integer unique to each relation with respect to the triple that it occurs in. I like to use 'predicate' for the URIRef that represents the edge relationship (which will likely be from an ontology e.g. RO and not unique), and weight should be 0.0 as you indicated.
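
For illustration, the incremented-integer option could look roughly like the sketch below. It is not the project's code: `build_edge_records` and the record layout are hypothetical, and it assumes triples arrive as plain `(subject, predicate, object)` string tuples. Each distinct triple gets the next integer as its `key`, the relation URI is stored under `predicate`, and `weight` defaults to 0.0.

```python
from itertools import count


def build_edge_records(triples):
    """Return (subject, object, attributes) records, one per distinct triple.

    Hypothetical helper: each distinct triple receives the next integer
    from a counter as its key, so keys are unique across the whole graph.
    """
    next_key = count()      # yields 0, 1, 2, ...
    key_by_triple = {}      # remembers triples that already have a key
    records = []
    for subj, pred, obj in triples:
        triple = (subj, pred, obj)
        if triple in key_by_triple:
            continue        # repeated triple: keep the key it already has
        key_by_triple[triple] = next(next_key)
        attributes = {
            "key": key_by_triple[triple],
            "predicate": pred,   # relation URI; not unique on its own
            "weight": 0.0,
        }
        records.append((subj, obj, attributes))
    return records


example_triples = [
    ("http://purl.obolibrary.org/obo/CHEBI_35406",
     "http://www.w3.org/2000/01/rdf-schema#subClassOf",
     "http://purl.obolibrary.org/obo/CHEBI_29067"),
    # Hypothetical second relation between the same two nodes.
    ("http://purl.obolibrary.org/obo/CHEBI_35406",
     "http://example.org/relatedTo",
     "http://purl.obolibrary.org/obo/CHEBI_29067"),
]
edge_records = build_edge_records(example_triples)
```

Each record can then be handed to NetworkX as `graph.add_edge(subj, obj, **attributes)`, where the `key` entry becomes the MultiDiGraph edge key and the remaining entries become edge attributes.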

@callahantiff
Owner Author

Thanks so much @rkboyce, that's exactly what I needed to know!

@callahantiff callahantiff mentioned this issue Jan 10, 2021
4 tasks
@callahantiff
Owner Author

Done! Note that this representation now includes keys for nodes and edges and has a default weight of 0.0:

  • Node key: str(http://purl.obolibrary.org/obo/CHEBI_35406)
  • Relation key: an MD5 hash of the triple, so that each key is unique with respect to the triple it occurs in:
     hash(
          'http://purl.obolibrary.org/obo/CHEBI_35406' +
          'http://www.w3.org/2000/01/rdf-schema#subClassOf' +
          'http://purl.obolibrary.org/obo/CHEBI_29067'
     )
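
A minimal sketch of how such a relation key could be computed with Python's standard `hashlib` module is shown here. The function name `triple_key` is hypothetical, and the UTF-8 encoding and hexadecimal digest format are assumptions; the comment above only states that the key is an MD5 hash of the concatenated triple.

```python
import hashlib


def triple_key(subject, predicate, obj):
    """Return an MD5 hex digest of the concatenated triple (hypothetical helper)."""
    joined = subject + predicate + obj
    # hashlib works on bytes, so the string is encoded first (UTF-8 assumed).
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


relation_key = triple_key(
    "http://purl.obolibrary.org/obo/CHEBI_35406",
    "http://www.w3.org/2000/01/rdf-schema#subClassOf",
    "http://purl.obolibrary.org/obo/CHEBI_29067",
)
```

The same triple always produces the same key, so rebuilding the graph gives stable edge keys. Plain concatenation without a separator could in principle map two different triples to the same string; that seems unlikely for full URIs, but joining the parts with a delimiter would rule it out.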
    

callahantiff added a commit that referenced this issue Jan 13, 2021
callahantiff added a commit that referenced this issue Jan 13, 2021
@callahantiff
Owner Author

Completed as part of #84

Coding Tasks automation moved this from Needs to be Done to Completed Jan 19, 2021
ISMB Bio-Ontologies Challenge automation moved this from To do to Done Jan 19, 2021