Ieuan Scanlon edited this page Oct 11, 2023 · 9 revisions

Concept Library Documentation

The Concept Library is a system for storing, managing, sharing, and documenting clinical code lists in health research.

The specific goals of this work are:

  • Store code lists along with metadata that captures important information about quality, author, etc.
  • Store version history and provide a way to unambiguously reference a particular version of a code list.
  • Allow programmatic interaction with code lists via an API, so that they can be directly used in queries, statistical scripts, etc.
  • Provide a mechanism for sharing code lists between projects and organizations.
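To make the versioning goal concrete, the following is a minimal sketch of how a code list might be stored with metadata and an immutable version history, so that a (list id, version) pair always refers to the same set of codes. All names here (`CodeListVersion`, `CodeListStore`, the `asthma` list and its codes) are hypothetical illustrations and do not reflect the Concept Library's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeListVersion:
    """One immutable snapshot of a code list plus basic metadata (hypothetical model)."""
    list_id: str
    version: int
    author: str
    codes: frozenset


class CodeListStore:
    """In-memory store keyed by (list_id, version); illustration only."""

    def __init__(self):
        self._versions = {}

    def save(self, list_id, author, codes):
        """Store a new version and return its (list_id, version) reference."""
        latest = max((v for (lid, v) in self._versions if lid == list_id), default=0)
        version = latest + 1
        self._versions[(list_id, version)] = CodeListVersion(
            list_id, version, author, frozenset(codes)
        )
        return (list_id, version)

    def get(self, list_id, version):
        """Look up exactly the version that was referenced."""
        return self._versions[(list_id, version)]


store = CodeListStore()
ref_v1 = store.save("asthma", "A. Researcher", {"C1", "C2"})        # placeholder codes
ref_v2 = store.save("asthma", "A. Researcher", {"C1", "C2", "C3"})  # later revision
```

Because earlier versions are never overwritten, a study that cites `ref_v1` keeps resolving to the original codes even after the list is revised.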

Our live website is available here


The Concept Library is available to users of the SAIL Databank by default. Log in using your SAIL credentials, either on the website outside of the gateway or inside the gateway at

Non-SAIL users can request access by using the contact form.


Our goal is to create a system that describes research study designs in a machine-readable format, to facilitate rapid study development, higher-quality research, easier replication, and sharing of methods between researchers, institutions, and countries.


A significant aspect of research using routinely collected health records is defining how concepts of interest (including conditions, treatments, symptoms, etc.) will be measured. This typically involves identifying sets of clinical codes that map to a variable the researcher wants to measure, and sometimes a set of rules as well (e.g. a person with a disease may be defined as someone who has a diagnosis code from list A and a medication from list B, excluding anyone who has a code from list C). A large part of the analysis work may involve consulting clinicians, investigating the data, and creating and testing definitions of the clinical concepts to be used.
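The rule in the example above (a code from list A and a code from list B, excluding anyone with a code from list C) can be sketched with plain set operations. This is a minimal illustration only: the list contents are placeholder strings rather than real clinical codes, and the function name `has_condition` and the patient records are assumptions made for the example.

```python
from __future__ import annotations

# Hypothetical code lists; the values are placeholders, not real clinical codes.
LIST_A_DIAGNOSES = {"DX001", "DX002"}
LIST_B_MEDICATIONS = {"RX100"}
LIST_C_EXCLUSIONS = {"EX900"}


def has_condition(patient_codes: set[str]) -> bool:
    """Apply the rule: at least one code from A AND one from B, AND none from C."""
    has_a = not patient_codes.isdisjoint(LIST_A_DIAGNOSES)
    has_b = not patient_codes.isdisjoint(LIST_B_MEDICATIONS)
    has_c = not patient_codes.isdisjoint(LIST_C_EXCLUSIONS)
    return has_a and has_b and not has_c


# Hypothetical patient records: patient id -> set of codes seen in their history.
patients = {
    "p1": {"DX001", "RX100"},           # diagnosis + medication: included
    "p2": {"DX002"},                    # diagnosis only: not included
    "p3": {"DX001", "RX100", "EX900"},  # has an exclusion code: not included
}

cases = sorted(pid for pid, codes in patients.items() if has_condition(codes))
print(cases)  # ['p1']
```

Keeping the three lists separate from the rule that combines them is what allows each list to be documented, versioned, and reused independently of any one study.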

Definitions created for one study are often of interest to researchers working on many others, but there are barriers to sharing them easily. The definitions may be embedded within study-specific scripts, so that it is not easy to extract the part that may be of general interest. Researchers also often do not fully document how a concept was created, its precise meaning, its limitations, etc., so crucial information may be lost when a definition is passed to other researchers, resulting in mistakes. Often there is simply no mechanism to discover and share work that has been done previously, leading researchers to waste time and resources reinventing the wheel. In theory, published research should include information on the precise methods used, but in reality this is often inadequate.