Bio2RDF is an open-source project that uses Semantic Web technologies to build and provide the largest network of Linked Data for the Life Sciences. Bio2RDF defines a set of simple conventions to create RDF(S)-compatible Linked Data from a diverse set of heterogeneously formatted sources obtained from multiple data providers. The online version of Bio2RDF is meant to demonstrate Semantic Web technologies, and is not currently kept up to date. If you want the latest version, you can check out the code and build it yourself!

Bio2RDF Release 3 (July 2014) Release Notes:

  • ~11 billion triples across 35 datasets
    • new datasets include: clinicaltrials.gov, dbSNP, GenAge, GenDR, LSR, OrphaNet, PubMed, SIDER, WormBase
    • locally hosted endpoints: chembl, linkedSPL, pathwaycommons, reactome, wikipathways
  • more complete dataset statistics
  • hundreds of bug fixes to improve the overall representation of datasets
  • every URI is typed as an instance of an owl:Class, owl:ObjectProperty, or owl:DatatypeProperty, as well as typed as an instance of a Resource in the dataset and linked to a description from the Life Science Registry (LSR)
  • CORS-enabled, text-indexed SPARQL 1.1 endpoints using Virtuoso 7.2.0, accessible at http://bio2rdf.org/sparql
  • and as always, open source scripts and downloadable content:
  • check out the latest linked open data diagram
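
As a rough illustration of how the SPARQL endpoint listed above might be queried, here is a minimal Python sketch using only the standard library. It relies on the standard SPARQL 1.1 Protocol (a `query` parameter over HTTP GET and an `Accept` header for JSON results). The helper names `build_request` and `run_query` and the sample query are assumptions made for this example, not part of the Bio2RDF code base. Since the online version is not kept up to date, the endpoint may be slow or unavailable.

```python
import json
import urllib.parse
import urllib.request

# Endpoint URL taken from the release notes above.
ENDPOINT = "http://bio2rdf.org/sparql"

# Illustrative query (an assumption for this sketch): list a few resources
# typed as owl:Class, as described in the release notes.
QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?cls
WHERE { ?cls a owl:Class }
LIMIT 10
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return the list of result bindings.

    This performs a network call and will raise an exception if the
    endpoint is unreachable or returns an error.
    """
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(QUERY)` should return a list of dictionaries, one per result row, each keyed by the variable name (`cls` here). Keeping a `LIMIT` clause in exploratory queries is a sensible precaution on a store of this size.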