A tool that extracts metadata from DICOM files and converts it to RDF so that the metadata can easily be made searchable.

Author: Michael Brunnbauer, Bonubase GmbH (www.bonubase.com)
The author's email address is brunni@netestate.de
This is free software, see the file LICENCE for details

dicom2rdf.py generates RDF/XML files from metadata found in DICOM files (.dcm)
DICOM = Digital Imaging and Communications in Medicine http://dicom.nema.org/

dicom2rdf.py needs Python 2.x and the following Python modules:

pydicom: http://code.google.com/p/pydicom/

rdflib: https://github.com/RDFLib


 $ ./dicom2rdf.py file1.dcm file2.dcm file3.dcm ...

Will generate file1.rdf file2.rdf file3.rdf ...
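
The naming rule above can be sketched as follows. This is only an
illustration, not code taken from dicom2rdf.py: the helper name
rdf_output_path is hypothetical, and how the real script treats files
without a .dcm extension is not stated here.

```python
import os


def rdf_output_path(dicom_path):
    """Return the assumed output path: same name, extension replaced by .rdf."""
    base, _ext = os.path.splitext(dicom_path)
    return base + ".rdf"


for name in ["file1.dcm", "file2.dcm", "file3.dcm"]:
    print("%s -> %s" % (name, rdf_output_path(name)))
```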

We strongly recommend that you:

1) De-identify DICOM files before conversion.

2) Do not make the resulting triples public (for example by publishing the 
   RDF/XML files on the web or operating an open SPARQL endpoint).

Ontology / Namespace

The ontology used can be found at http://purl.org/healthcarevocab/v1
It has been generated by the program found in the directory gen_ontology.
See http://purl.org/healthcarevocab/v1help for an explanation of the basic 
concepts of the ontology.

The source files containing dictionaries have been generated from the 
DICOM standard by the parsers found in gen_source.


A warning of the form

 SOP Class 1.2.840.10008... not found

means that dicom2rdf does not know which IOD is assigned to the given SOP 
Class UID, because the list defined in sopclasses.py is still incomplete. 
All attributes in this information object will then be related to the 
information object itself instead of the correct information entities. 
Please search for the UID in PS 3.4 of the DICOM standard:


Then search for the IOD name in iods.py, add the assignment to sopclasses.py
and let us know.
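
The lookup behind this warning might look roughly like the sketch below.
The real layout of sopclasses.py and iods.py is not shown in this README,
so the dictionary SOP_CLASS_TO_IOD, the function iod_for_sop_class and the
IOD name strings are assumptions made for illustration only. The two UIDs
are the standard CT and MR Image Storage SOP Class UIDs.

```python
import warnings

# Hypothetical excerpt of a SOP Class UID -> IOD name table.
# The real table in sopclasses.py may be structured differently.
SOP_CLASS_TO_IOD = {
    "1.2.840.10008.5.1.4.1.1.2": "CT Image IOD",
    "1.2.840.10008.5.1.4.1.1.4": "MR Image IOD",
}


def iod_for_sop_class(uid):
    """Return the IOD name for a SOP Class UID, or None if it is unknown."""
    iod = SOP_CLASS_TO_IOD.get(uid)
    if iod is None:
        # Same wording as the warning described above.
        warnings.warn("SOP Class %s not found" % uid)
    return iod
```

Under these assumptions, adding a missing assignment means adding one more
UID / IOD name pair to the table.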

A warning of the form

 IE ... not in IOD for ...

means that an information object uses an attribute that is not defined
in the corresponding information object definition. This warning is
caused either by DICOM files that do not conform to the standard or by
an irregularity in our parsing of PS 3.3.

URI creation

-Every dataset having a SOP Instance UID will get a urn:oid: URI with that UID

-Every Study, Series or Frame of Reference with a UID will also get a
 urn:oid: URI

-Every other entity will get a generated hash URI relative to the created
 RDF/XML document
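
The rules above can be sketched as a single helper. This is a minimal
illustration, not the code used by dicom2rdf.py: the function name
entity_uri, the prefix argument and the counter-based fragment are
assumptions, since this README does not say how the hash fragments are
actually generated.

```python
import itertools

# Hypothetical source of unique fragments for entities without a UID.
_fragment_counter = itertools.count(1)


def entity_uri(uid=None, prefix="entity"):
    """Return a urn:oid: URI if a UID is given, else a relative hash URI."""
    if uid:
        return "urn:oid:" + uid
    # A relative reference; it resolves against the RDF/XML document URI.
    return "#%s%d" % (prefix, next(_fragment_counter))


print(entity_uri("1.2.840.10008.5.1.4.1.1.2"))
print(entity_uri(prefix="patient"))
```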

Triples generated

-A URI will be created for the DICOM file with the filename as rdfs:label
 and dcterms:format http://purl.org/NET/mediatypes/application/dicom
 Triples will be added for the Attributes that pydicom has identified as 
 relating to the DICOM file.

-A URI will be determined or created for the main information object 
 which is asserted to be in the class corresponding to its SOP Class UID
 (a urn:oid URI for the class is generated from the SOP Class UID).

-For every attribute in the current information object, the information 
 entity it relates to according to the information object definition is 
 determined. A URI is determined or created for this IE and membership 
 in the corresponding IE class is asserted. If the information entity cannot be 
 determined unambiguously, the subject of the triple will be the information 
 object itself. The information object will be connected to the information 
 entities with dcterms:subject.

-For every item in a sequence, a URI will be determined or created and the
 attributes for that item relate to the item by default. If the item is an
 information object, attributes can relate to an information entity instead,
 as with the main information object.

-Extra triples may be generated using other ontologies, e.g. with
 foaf:familyName and foaf:givenName for components of the PN VR.
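
As an illustration of the last point, a PN (Person Name) value such as
"Doe^John" could be split into FOAF triples as sketched below. This is not
the code used by dicom2rdf.py: the function name person_name_triples is
hypothetical, triples are shown as plain tuples instead of rdflib objects,
and only the family and given name components are handled.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def person_name_triples(subject, pn_value):
    """Return (subject, predicate, literal) tuples for a DICOM PN value."""
    # PN may hold several component groups separated by "=";
    # this sketch only uses the first (alphabetic) group.
    alphabetic = pn_value.split("=")[0]
    # Components are family^given^middle^prefix^suffix.
    parts = alphabetic.split("^")
    triples = []
    if len(parts) > 0 and parts[0]:
        triples.append((subject, FOAF + "familyName", parts[0]))
    if len(parts) > 1 and parts[1]:
        triples.append((subject, FOAF + "givenName", parts[1]))
    return triples


for triple in person_name_triples("#patient1", "Doe^John"):
    print(triple)
```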