Format dicom #52
…for the gain. ENH: combined loop to reduce file opens.
Thanks @tcpan. I think we should go ahead and merge, but we will have to do some restructuring afterwards because our current framework expects each module in the @bemoody @briangow my suggestion would be either:
@tcpan , thanks so much for this - preparing the DICOM format for benchmarking clearly took a lot of work!
I've left a couple of comments in the code. One other thought is that it might be cleaner to move the dcm_* files under a separate folder (perhaps formats/dcm_utils/dcm_*) such that the only files at the formats/ level are the ones that get called for benchmarking. This is completely up to you though, no worries if you prefer to leave them where they are. @tompollard beat me to this suggestion above.
waveform_benchmark/formats/dicom.py (Outdated)
# import json

# ======== organize the waveform chunks (tested)
print("INPUT pleth", waveforms['Pleth'])
Perhaps this line should get commented out.
done.
# MultichannelRespiratoryWaveform: RESP, >1, , unconstrained, DCID 3005 “Respiration Waveform” , SS/SL
#

CHANNEL_TO_DICOM_IOD = {
Do we need to add the other channel types from the proposed suite of waveforms for benchmarking? See the waveform suite characterization here: #11 (comment).
yes, this needs to be updated.
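One way to see which channel types still lack a mapping is to compare the channel names in the benchmark waveforms against the keys of the lookup table. The sketch below is illustrative only: the table contents, the channel names, and the helper `unmapped_channels` are hypothetical and are not taken from this PR.

```python
# Hypothetical stand-in for CHANNEL_TO_DICOM_IOD; the real table in
# waveform_benchmark/formats/dicom.py may have different keys and values.
EXAMPLE_CHANNEL_TO_IOD = {
    "II": "GeneralECGWaveform",
    "Pleth": "ArterialPulseWaveform",
    "Resp": "RespiratoryWaveform",
}


def unmapped_channels(waveforms, channel_to_iod):
    """Return the sorted channel names that have no entry in the mapping."""
    return sorted(name for name in waveforms if name not in channel_to_iod)


# Assumed input shape: a dict keyed by channel name, as passed to
# write_waveforms(self, path, waveforms).
example_waveforms = {"II": {}, "Pleth": {}, "ABP": {}, "EEG Fp1": {}}
missing = unmapped_channels(example_waveforms, EXAMPLE_CHANNEL_TO_IOD)
print(missing)
```

Running a check like this against the full benchmarking suite would list the channels that need an entry before the table is considered complete.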
Hi, Brian, sorry for the delay. Was working on house stuff all day. Will do re subfolder. Is formats the right place to put that?
@briangow commented on this pull request.
In waveform_benchmark/formats/dicom.py:
- # dicom.save_as("/mnt/c/Users/tcp19/Downloads/Compressed/test_waveform.dcm", write_like_original=False)
+ def write_waveforms(self, path, waveforms):
+ fs = FileSet()
+
+ # one series per modality
+ # as many multiplexed groups as allowed by modality
+ # one instance per chunk
+
+ # one dicomdir per study? or series?
+ studyInstanceUID = uid.generate_uid()
+ seriesInstanceUID = uid.generate_uid()
+ prefix, ext = os.path.splitext(path)
+ # import json
+
+ # ======== organize the waveform chunks (tested)
+ print("INPUT pleth", waveforms['Pleth'])
Perhaps this line should get commented out.
________________________________
In waveform_benchmark/formats/dicom.py:
+# name: modality name, maximum sequences, grouping, SOPUID, max samples, sampling frequency, source, datatype,
+# TwelveLeadECGWaveform: ECG, 1-5, {1: I,II,III; 2: aVR, aVL, aVF; 3: V1, V2, V3; 4: V4, V5, V6; 5: II}, 16384, 200-1000, DCID 3001 “ECG Lead”, SS
+# GeneralECGWaveform: ECG, 1-4, 1-24 per sequence, ?, 200-1000, DCID 3001 “ECG Lead”, SS
+# General32BitECGWaveform: ECG, 1-4, 1-24 per sequence, ?, by conformance statement, DCID 3001 “ECG Lead”, SL
+# AmbulatoryECGWaveform: ECG, 1, 1-12, maxsize of waveform data attribute, 50-1000, DCID 3001 “ECG Lead”, SB/SS
+# HemodynamicWaveform: HD, 1-4, 1-8, maxsize of waveform data attribute, <400, , SS
+# CardiacElectrophysiologyWaveform: EPS, 1-4, , <=20000, DCID 3011 “Electrophysiology Anatomic Location” , SS
+# ArterialPulseWaveform: HD, 1, 1, ? , <600, DCID 3004 “Arterial Pulse Waveform” , SB/SS
+# RespiratoryWaveform: RESP, 1, 1, ? , <100, DCID 3005 “Respiration Waveform”, SB/SS
+# ScalpEEGWaveform: EEG, 1 (interruption as separate instances), 1-64, , unconstrained, DCID 3030 “EEG Lead”, SS/SL
+# ElectromyogramWaveform: EMG, unconstrained, 1-64, , unconstrained, DCID 3031 “Lead Location Near or in Muscle” or DCID 3032 “Lead Location Near Peripheral Nerve”, SS/SL
+# SleepEEGWaveform: EEG, unconstrained, 1-64, , unconstrained, DCID 3030 “EEG Lead” , SS/SL
+# MultichannelRespiratoryWaveform: RESP, >1, , unconstrained, DCID 3005 “Respiration Waveform” , SS/SL
+#
+
+CHANNEL_TO_DICOM_IOD = {
Do we need to add the other channel types from the proposed suite of waveforms for benchmarking? See the waveform suite characterization here: #11 (comment).
@tcpan, it's up to you. If you feel that these files are useful beyond this repository, perhaps they deserve a dedicated place, as @tompollard suggested in his second thought on the topic. It would probably make sense to create a general utils module in this repository at some point. If you prefer that option, feel free to set it up, or let me know and I'd be happy to do so. As far as I'm concerned, having them in a subfolder under formats/ is also fine.
…sing of specified tags only. FIX: input chunks of the same channels were recorded as separate channels in the channel definition. ENH: optimization with deferred dataset reading, and avoid using pydicom's reading functions where possible. ENH: moved support files to format/dcm_utils
I have checked in some updates, with some fixes and some new tests for a different chunking strategy (which is unlikely to make a huge difference). Note that to get the channel metadata, each file is opened and scanned. A more efficient approach would be to add the channel metadata as a private tag on the DICOMDIR file. This would still be standards compliant, but I have not implemented it.
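The idea of a single consolidated index can be sketched without any DICOM library. In the sketch below, per-file channel metadata is gathered once at write time and serialized to one JSON string; that string is the kind of payload that could be stored in a private element on the DICOMDIR. The field names, file names, and both helper functions are assumptions for illustration, not the PR's implementation, and the step of actually writing the private element is omitted.

```python
import json


def build_channel_index(per_file_channels):
    """Serialize {file_name: {channel: metadata}} into one JSON payload."""
    return json.dumps(per_file_channels, sort_keys=True)


def files_for_channel(index_payload, channel):
    """List files containing `channel` using only the index payload."""
    index = json.loads(index_payload)
    return sorted(name for name, chans in index.items() if channel in chans)


# Hypothetical metadata collected while writing each chunk file.
per_file = {
    "chunk_0.dcm": {"II": {"fs": 250.0, "units": "mV"}},
    "chunk_1.dcm": {
        "II": {"fs": 250.0, "units": "mV"},
        "Pleth": {"fs": 125.0, "units": "NU"},
    },
}
payload = build_channel_index(per_file)
print(files_for_channel(payload, "Pleth"))
```

With an index like this, a reader parses one payload to learn which files hold a channel, instead of opening and scanning every chunk file.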
Thanks @tcpan!
I think the cleanest option is to store If we want to go for the quick and easy option, my preference would be to move them to a submodule of the
Updated read and write. Better standard compliance for write. Reading now allows random data access.
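As a rough illustration of what random access means for multiplexed waveform samples: when samples are stored as fixed-width integers interleaved by channel, the byte offset of any sample range can be computed directly, so only the requested bytes are read. The sketch below assumes 16-bit little-endian signed samples in a plain binary buffer; it is not the reader implemented in this PR, and `read_samples` is a hypothetical helper.

```python
import io
import struct

BYTES_PER_SAMPLE = 2  # assumption: 16-bit signed samples


def read_samples(stream, n_channels, channel, start, count):
    """Read `count` samples of one channel, starting at sample index `start`."""
    frame_size = n_channels * BYTES_PER_SAMPLE
    # Jump straight to the first requested frame; earlier data is never read.
    stream.seek(start * frame_size)
    raw = stream.read(count * frame_size)
    n_frames = len(raw) // frame_size
    fmt = "<%dh" % (n_frames * n_channels)
    values = struct.unpack(fmt, raw[: n_frames * frame_size])
    # Samples are interleaved, so take every n_channels-th value.
    return list(values[channel::n_channels])


# Two channels, five frames: channel 0 holds 0..4, channel 1 holds 100..104.
frames = [(i, 100 + i) for i in range(5)]
buffer = io.BytesIO(b"".join(struct.pack("<2h", a, b) for a, b in frames))
print(read_samples(buffer, n_channels=2, channel=1, start=2, count=2))
```

The cost of a read then depends on the size of the requested range and not on its position in the file.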