Skip to content

Latest commit

 

History

History
38 lines (36 loc) · 4.15 KB

bitstore.org

File metadata and controls

38 lines (36 loc) · 4.15 KB

Bitstore

1 Introduction

Bitstore is a new approach to terminology development that aspires to support collaborative federated terminologies at web scale. It departs somewhat from traditional description logic based systems in that it makes a clear separation between definitions (syntax) and inferencing (semantics), taking fully into account that terminology construction is as much about document and content management as it is about building taxonomies, semantic navigation, and making inferences about ontologies. It is fairly accurate to say that the focus to date has always been on data structures motivated by the use of description logics.

1.1 Motivation

Over the years, as we’ve used description logic based systems for terminology work at NCI, many of the issues and problems we’ve encountered have little or nothing to do with description logic. Certainly building ontologies with consistent semantics that enable modeling in a principled and rigorous fashion is critical. However there are many other aspects of the process, collaborative workflow, capturing history and auditing trails of edits, encoding business rules specific to individual ontologies, for example; A class must have a preferred-name, display-name, and a full-syn complex property.

Much of this has little to do with description logic, yet to date the approaches we’ve taken have all made use of data structures, database schemas and so forth that are designed to support primarily the inferencing capabilities of the particular logic.

1.2 Overview

These notes are intended to document the prototyping and implementation of bitstore. The next section provides a high level description of the architecture and is followed by detailed notes on the implementation. As each approach to the major pieces is tried, metrics will be employed to determine the efficacy and adherence to the stated goals. This project is somewhat ambitious in that the goal is to take ontylog up an order of magnitude in scalability whilst also reworking the fundamentals of terminology development to better support schema evolution, federation and collaboration.

2 Architecture

2.1 Concepts as Documents

The basic design of Bitstore is to use CouchDB to store concepts, each concept becomes a distinct couchdb document with a unique identifier, in a couchdb database. So a database roughly corresponds to a terminology. In the language of OWL one might consider a database a namespace. Concepts relate to one another through relations such as isa,*subsumes*,*gene-encodes*,*is-located-in*, etc. For each relation there is a unique document that stores metalevel information, for example the provenance of the relation or mathematical properties such as transitivity or whether it has an inverse. For example one might have treats, which has an an inverse is_treated_by, or part_of, which is declared to be transitive. This set of documents that describe the relations used in the terminology can be viewed as a schema of sorts.

2.2 Ontylog

An additional store is used for the actual instances of the relations between concepts. This is a triple store where each element in a triple is a unique id corresponding to a document. These represent RDF triples, more or less, and are used to encode relationships such as Aspirin isa drug or gene encodes protein.

2.3 FTI

3 Implementation

3.1 enhancements to CouchDB

3.1.1 couch httpd

3.1.2 couch server supervisor

I should be able to specify new daemons in the ini file and have the sup load them as secondary_services. For some reason I think they need to be built into the app spec for this to work. Otherwise they don’t seem to able to see the couch_app.

  • For now I’m just loading them directly as primaries and have couch start them and mnesia

3.1.3 jquery couch enhancements

3.2 Bitstore

3.2.1 ontylog

  • triple store for dags
  • classification and role inheritance as graph operations
  • separation of ontylog from graphs

3.2.2 concepts as documents

3.2.3 FTI