Roadmap for pacsanini #46
Comments
I have attached here an ER diagram of the database that I have in mind. Note that it is not fully denormalized. The rationale for this is that it will allow the database to serve as a sort of data warehouse in which the images table is the central table on which all types of queries will be possible. The other tables serve as a way to structure the DICOM data in a "DICOM-friendly" way. The reason why the studies_find and the images tables are linked is that when the storescp server receives new DICOM images, they will be persisted in the images collection. That will be the confirmation that the data has been obtained and that the studies_find table can be updated. I am open to changing this and linking the studies_find table to the studies table instead.
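The linkage described above can be sketched with stdlib sqlite3. This is an illustrative sketch, not the actual pacsanini schema: the table and column names are assumptions, and the idea is only that an incoming stored image confirms (via the shared StudyInstanceUID) that a previously found study was retrieved.

```python
import sqlite3

# Illustrative sketch (not pacsanini's real schema): a central `images`
# table linked to `studies_find` through the StudyInstanceUID.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE images (
        id INTEGER PRIMARY KEY,
        sop_instance_uid TEXT UNIQUE,
        study_instance_uid TEXT
    );
    CREATE TABLE studies_find (
        id INTEGER PRIMARY KEY,
        study_instance_uid TEXT UNIQUE,
        found_on TEXT,
        retrieved_on TEXT
    );
    """
)

# A C-FIND response registers the study...
conn.execute(
    "INSERT INTO studies_find (study_instance_uid, found_on) "
    "VALUES ('1.2.3', '2021-01-01')"
)
# ...and when storescp persists an image of that study, the find result
# can be confirmed as retrieved.
conn.execute(
    "INSERT INTO images (sop_instance_uid, study_instance_uid) "
    "VALUES ('1.2.3.4', '1.2.3')"
)
conn.execute(
    "UPDATE studies_find SET retrieved_on = '2021-01-02' "
    "WHERE study_instance_uid IN (SELECT study_instance_uid FROM images)"
)
retrieved = conn.execute(
    "SELECT retrieved_on FROM studies_find WHERE study_instance_uid = '1.2.3'"
).fetchone()[0]
```

The same confirmation step would work if studies_find were instead linked to the studies table, as suggested above; only the subquery's source table would change.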
After some reflection, I think that the following changes should be made to the database schema:
The implementation of the database should therefore be done in the following way:
With the release of 0.2.0, this issue can be closed.
Roadmap/features
The pacsanini project now feels somewhat mature in terms of the functionalities it offers:
The functionalities that are most interesting, and that have the most room for improvement, are probably the DICOM parsing, C-FIND, and storescp functionalities.
DICOM parsing
The parsing of DICOM files currently outputs results mainly in CSV format (sqlite too). CSV output serves research purposes well, but the next step would be the ability to parse DICOM files into a database. In this way, structured data could be accessed from multiple servers simultaneously, without the need to copy data from/to servers or set up an NFS share.
The tricky part of having a database is choosing the right type. I think that a SQL database (probably postgres) would be best. SQL is a mature standard, and given the multiple engines that exist (postgres, mariadb, mysql, ...), users should have the choice to use whichever tool they prefer. Furthermore, libraries such as sqlalchemy provide great abstractions for implementing all of this.
Another tricky part will be defining a general schema for the database. This is tricky because DICOM attributes can vary between modalities. A general solution would therefore be to expose mainly mandatory attributes as columns and store the DICOM file's full metadata as a JSON object (BLOB, JSON, or preferably JSONB).
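The "mandatory columns plus JSON metadata" idea can be sketched with stdlib sqlite3 (on postgres the meta column would be JSONB, as preferred above). The table and column names here are illustrative assumptions, not a final schema:

```python
import json
import sqlite3

# Sketch: mandatory DICOM attributes as typed columns, everything else
# serialized as JSON in a `meta` column (JSONB on PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE images (
        sop_instance_uid TEXT PRIMARY KEY,  -- mandatory attribute
        study_instance_uid TEXT NOT NULL,   -- mandatory attribute
        modality TEXT,
        meta TEXT  -- full DICOM header as a JSON document
    )
    """
)
header = {"PatientPosition": "HFS", "Manufacturer": "ACME"}
conn.execute(
    "INSERT INTO images VALUES (?, ?, ?, ?)",
    ("1.2.3.4", "1.2.3", "CT", json.dumps(header)),
)
meta = json.loads(
    conn.execute(
        "SELECT meta FROM images WHERE sop_instance_uid = '1.2.3.4'"
    ).fetchone()[0]
)
```

This keeps modality-specific attributes queryable (via JSON operators on engines that support them) without forcing them into a rigid column set.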
The database structure should also make sense for storing data received from the storescp server (more on that later).
Overall, the database would have the following tables, which would fare well with the DICOM data representation model:
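The table list itself is not reproduced here, but the standard DICOM hierarchy (patient → study → series → image) suggests a shape along these lines. All names below are illustrative assumptions, not the final schema:

```python
import sqlite3

# Hypothetical DDL following the DICOM patient/study/series/image
# hierarchy; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patients (
        patient_id TEXT PRIMARY KEY
    );
    CREATE TABLE studies (
        study_instance_uid TEXT PRIMARY KEY,
        patient_id TEXT REFERENCES patients(patient_id)
    );
    CREATE TABLE series (
        series_instance_uid TEXT PRIMARY KEY,
        study_instance_uid TEXT REFERENCES studies(study_instance_uid)
    );
    CREATE TABLE images (
        sop_instance_uid TEXT PRIMARY KEY,
        series_instance_uid TEXT REFERENCES series(series_instance_uid),
        meta TEXT  -- full header as JSON (JSONB on postgres)
    );
    """
)
tables = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
}
```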
C-FIND operations
To be able to give a bird's eye view of the data pipeline status, C-FIND results should be persisted in the database as well. A table named studies_find would be put into place to store basic DICOM attributes of resources. In addition, two columns would be introduced: found_on and retrieved_on. found_on corresponds to the date at which the data was returned from a C-FIND request, and retrieved_on to the date at which the data was retrieved from the PACS with a C-MOVE request.
storescp server
The storescp server was originally conceived to accept callbacks/plugins. This means that, in addition to persisting data, users can pass callbacks to perform additional actions on the data. One such action that should be facilitated is the parsing of DICOM metadata into the database. A system that updates the studies_find table and parses the DICOM file at the same time would be good. Furthermore, the storescp server should be able to accept callbacks that will run before and after the data is persisted on disk.
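The pre/post-persist callback idea can be sketched as follows. This is not the pacsanini API; the handler class, hook names, and dict-based dataset are all illustrative assumptions:

```python
from typing import Callable, Dict, List, Optional

# Sketch of a store handler that runs `before` hooks, persists the
# dataset, then runs `after` hooks (e.g. parsing metadata into the DB).
# Names are hypothetical, not pacsanini's actual interface.
class StoreSCPHandler:
    def __init__(
        self,
        before: Optional[List[Callable[[Dict], None]]] = None,
        after: Optional[List[Callable[[Dict], None]]] = None,
    ):
        self.before = before or []
        self.after = after or []
        self.persisted: List[Dict] = []

    def on_store(self, dataset: Dict) -> None:
        for callback in self.before:   # e.g. validation, anonymization
            callback(dataset)
        self.persisted.append(dataset)  # stand-in for writing to disk
        for callback in self.after:    # e.g. update studies_find, parse to DB
            callback(dataset)

events = []
handler = StoreSCPHandler(
    before=[lambda ds: events.append("validate")],
    after=[lambda ds: events.append("parse-to-db")],
)
handler.on_store({"SOPInstanceUID": "1.2.3.4"})
```

An "after" callback is the natural place for the combined studies_find update and DICOM parsing described above, since it only fires once the file is safely on disk.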