This repository has been archived by the owner on Sep 24, 2019. It is now read-only.

Configure annotation/namespace definitions #92

Closed
abargnesi opened this issue Jan 21, 2016 · 3 comments

Comments

@abargnesi
Member

Background

Annotations and namespaces are catalogs of biological entities. They are generally modeled one-to-one with life science databases (Entrez Gene, HGNC, GO, NCBI Taxonomy, etc.). When you express knowledge in BEL, you will want to use standard names to increase connectedness within your BKN (BEL Knowledge Network). Annotations and namespaces are defined in the header of BEL script (or XBEL) files. Each definition provides a document-local keyword, which is used to refer to the namespace within BEL terms, and a URL from which the values can be downloaded. Here is an example definition:

DEFINE ANNOTATION Anatomy AS URL "http://host/anatomy.belanno"
DEFINE NAMESPACE EGID AS URL "http://host/entrez-gene-ids.belns"

Or for a full example, see the small corpus.
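
To make the header syntax concrete, here is a minimal Ruby sketch that pulls the kind, keyword, and URL out of `DEFINE ... AS URL` lines like the two above. The `DEFINE_PATTERN` constant and `parse_define` helper are hypothetical names used only for illustration; they are not the bel.rb parser API, and the sketch handles only the URL form shown in this issue.

```ruby
# Hypothetical helper for illustration; not part of bel.rb.
# Matches only the "AS URL" form of a definition shown above.
DEFINE_PATTERN =
  /\ADEFINE\s+(ANNOTATION|NAMESPACE)\s+(\w+)\s+AS\s+URL\s+"([^"]+)"\z/

# Returns a hash describing the definition, or nil if the line is not one.
def parse_define(line)
  match = DEFINE_PATTERN.match(line.strip)
  return nil unless match

  { kind: match[1].downcase.to_sym, keyword: match[2], url: match[3] }
end

header = <<~BEL
  DEFINE ANNOTATION Anatomy AS URL "http://host/anatomy.belanno"
  DEFINE NAMESPACE EGID AS URL "http://host/entrez-gene-ids.belns"
BEL

definitions = header.each_line.map { |line| parse_define(line) }.compact
definitions.each { |d| puts "#{d[:kind]} #{d[:keyword]} -> #{d[:url]}" }
```

The keyword (`Anatomy`, `EGID`) is the only handle the rest of the document has on the resource, which is why the proposal below maps on keyword plus URL.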

OpenBEL is moving to RDF representations for biological entities and BEL nanopubs. With RDF we are encouraged to use well-known URIs, and for biological entities those are likely to come from identifiers.org. See #65 for more information on identifiers.org. Defining annotations and namespaces only by URL makes this transition difficult.

Proposal

In a discussion (notes) with @juliakozlovsky and @sanea we talked through an intermediate solution that allows annotations and namespaces to be configured with both a URL, for OpenBEL framework compatibility, and an RDF URI.

This is a general solution to #65.

We proposed the following:

  • Support configurable annotations and namespaces when creating a BEL::Script::Parser object.
  • Define a file format that captures the keyword, URL, and RDF URI prefix (e.g. http://identifiers.org/hgnc/ for HGNC).
  • Provide a command-line option for the bin/bel2rdf.rb command that takes a file in this format. I propose -r and --resource-override for the name.
  • Convert the resource override file to objects that the parser can use to set annotations and namespaces.
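
As a rough sketch of the command-line part of this proposal, the option could be read with Ruby's standard `OptionParser`. Everything here other than the `-r` / `--resource-override` names is an assumption for illustration: the `parse_options` helper, the `Resource` struct, and the help text are hypothetical, and the actual bin/bel2rdf.rb may be organized differently.

```ruby
require 'optparse'

# Hypothetical value object holding the three fields named in the proposal.
Resource = Struct.new(:keyword, :url, :rdf_uri)

# Hypothetical sketch of option handling; the real bin/bel2rdf.rb may differ.
def parse_options(argv)
  options = { resource_override: nil }
  OptionParser.new do |opts|
    opts.banner = 'Usage: bel2rdf.rb [options]'
    opts.on('-r', '--resource-override FILE',
            'File with annotation/namespace overrides') do |file|
      options[:resource_override] = file
    end
  end.parse!(argv)
  options
end

options = parse_options(['--resource-override', 'resources.yml'])
puts options[:resource_override]

# One object per entry in the override file (values taken from the example).
hgnc = Resource.new('HGNC', 'http://host/hgnc.belns', 'http://identifiers.org/hgnc/')
puts hgnc.rdf_uri
```

Keeping the override in a separate file, rather than in the BEL document, means existing documents can be translated without being edited.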
@abargnesi
Member Author

We also need to map source annotation/namespace references (i.e. in the source document) to chosen ones in the output (e.g. identifiers.org). This will allow the translator to rewrite the references from old to new as it translates to RDF.

I'm dealing with this problem as well in #111. I should be able to use your file format directly if it contained this mapping of old to new annotation/namespace references.

Thoughts, @sanea, @juliakozlovsky?

@abargnesi
Member Author

Regarding the resource mapping file format I think we need the following:

  • two categories, one for annotations, one for namespaces
  • map namespace from Name/URL to Name/URL/RDF URI
  • map annotation from Name/Type/Domain to Name/Type/Domain/RDF URI

I suggest using the YAML format since it's human readable/writable and supported in the Ruby standard library.

Example remapping OpenBEL namespaces to include identifiers.org RDF URIs:

namespaces:
  - remap:
      from:
        prefix:  "HGNC"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
      to:
        prefix:  "HGNC"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
        rdf_uri: "http://identifiers.org/hgnc/"
  - remap:
      from:
        prefix:  "EGID"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/entrez-gene-ids.belns"
      to:
        prefix:  "EGID"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/entrez-gene-ids.belns"
        rdf_uri: "http://identifiers.org/ncbigene/"

Example remapping annotations:

annotations:
  - remap:
      from:
        keyword:  "Species"
        type:     "url"
        domain:   "http://resource.belframework.org/belframework/20150611/annotation/species-taxonomy-id.belanno"
      to:
        keyword:  "Species"
        type:     "url"
        domain:   "http://resource.belframework.org/belframework/20150611/annotation/species-taxonomy-id.belanno"
        rdf_uri:  "http://identifiers.org/taxonomy/"
  - remap:
      from:
        keyword:  "TextLocation"
        type:     "list"
        domain:
          - Value1
          - Value2
          - Value3
      to:
        keyword: "TextLocation"
        type:    "pattern"
        domain:  "Value[0-9]+"

Annotations and namespaces can be combined in one file, e.g.:

annotations:
  # Annotations to remap.
namespaces:
  # Namespaces to remap.
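
For illustration, a file in this format could be read with the standard `yaml` library and indexed by the source prefix and URL, so that a translator can look up the replacement for each namespace it meets. This is only a sketch under stated assumptions: the `namespace_index` helper and the inlined `REMAP_YAML` sample are hypothetical, and the implementation referenced in #118 may look quite different.

```ruby
require 'yaml'

# Sample in the proposed format, inlined so the sketch is self-contained.
REMAP_YAML = <<~YAML
  namespaces:
    - remap:
        from:
          prefix:  "HGNC"
          url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
        to:
          prefix:  "HGNC"
          url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
          rdf_uri: "http://identifiers.org/hgnc/"
YAML

# Hypothetical helper: index namespace remaps by the source [prefix, url] pair.
# A missing or empty "namespaces" section yields an empty index.
def namespace_index(config)
  (config['namespaces'] || []).each_with_object({}) do |entry, index|
    from = entry['remap']['from']
    index[[from['prefix'], from['url']]] = entry['remap']['to']
  end
end

config = YAML.safe_load(REMAP_YAML)
index  = namespace_index(config)

hgnc_url = 'http://resource.belframework.org/belframework/20150611/' \
           'namespace/hgnc-human-genes.belns'
target = index[['HGNC', hgnc_url]]
puts target['rdf_uri']
```

Keying on both prefix and URL, rather than on prefix alone, keeps two documents that reuse a keyword for different resources from being remapped to the same target.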

Thoughts, @sanea, @rumilbaybikov, @juliakozlovsky?

@abargnesi
Member Author

Work completed in #118.
