New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Add RLE encoding #730
Conversation
Hello @scaramallion! Thanks for updating the PR.
Comment last updated on September 23, 2018 at 05:24 Hours UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had a quick look through. Do you think you could add some docstring lines (to rle_encode() probably) to lay out the logic it is going through? -- e.g. through frames-> planes -> segments ->rows. The would help me follow a little better for sure.
And very naive question -- is it because of the dicom standard flavor of RLE that it needs to be broken down like this? I don't know how dicom rle is encoded, so wondering where it differs from simple direct rle like on text, etc.
Essentially each sample of each plane with a frame needs to be split into segments (one for each byte of sample data), and each row in the segment "shall" be encoded separately. I've started out by splitting it all up to make it easier to test but may end up consolidating a bit once I'm happy with it (this is still very much a WIP). There are other requirements (max replicate/literal run is 128 bytes long, anything longer needs to be split) which make things complicated. I'll probably remove rle_encode in favour of rle_encode_frame, too. |
Conversion script for native to RLE: #!/usr/bin/env python
import argparse
import os
from pydicom import dcmread
from pydicom.encaps import encapsulate
from pydicom.pixel_data_handlers.rle_handler import rle_encode_frame
from pydicom.uid import RLELossless
parser = argparse.ArgumentParser(description="RLE encode an unencoded DICOM dataset")
parser.add_argument('dcm_in', type=str, help="The unencoded DICOM dataset")
args = parser.parse_args()
with open(args.dcm_in, 'rb') as fp:
fpath = os.path.abspath(args.dcm_in)
fdir, fname = os.path.split(fpath)
ds = dcmread(fp, force=True)
arr = ds.pixel_array
nr_frames = getattr(ds, 'NumberOfFrames', 1)
if nr_frames == 1:
encoded = rle_encode_frame(arr)
encapsulated = encapsulate([encoded])
else:
frames = []
for frame in arr:
frames.append(rle_encode_frame(frame))
encapsulated = encapsulate(frames)
ds.PixelData = encapsulated
ds[0x7fe00010].is_undefined_length = True
ds.file_meta.TransferSyntaxUID = RLELossless
# RLE Lossless is always Planar Configuration 1
ds.PlanarConfiguration = 1
# Encapsulated transfer syntaxes are explicit VR little endian encoding
ds.is_implicit_VR = False
ds.is_little_endian = True
ds.save_as(os.path.join(fdir, 'rle_{}'.format(fname)),
write_like_original=False) |
Reference Issue
Related to #713
What does this implement/fix? Explain your changes.
Benchmarks
Python 3.6, time per 100 runs
Not particularly fast, but I suppose we have to start somewhere. The 8-bit 1 sample/pixel data is probably most representative of real-world data.