PyDicom Encapsulated Files #814

Closed
Neuroforge opened this issue Mar 7, 2019 · 6 comments

@Neuroforge

Description

Hello,

I wish to replicate the functionality of dcmtk's pdf2dcm tool. Are there samples or tests showing how to make an encapsulated file with Pydicom?

Docs:
https://support.dcmtk.org/docs/pdf2dcm.html
Source:
https://github.com/InsightSoftwareConsortium/DCMTK/blob/master/dcmdata/apps/pdf2dcm.cc

Versions

Darwin-18.2.0-x86_64-i386-64bit
Python 3.6.1 |Anaconda custom (64-bit)| (default, May 11 2017, 13:04:09)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
pydicom 1.2.2

@scaramallion
Member

I'm not aware of any examples or tests, but it should just be a matter of following the IOD for Encapsulated PDF Storage, most notably by including elements from the Encapsulated Document Series and Encapsulated Document modules (plus, of course, the other mandatory modules).

@rohithkumar31

rohithkumar31 commented Jun 5, 2019

Hi Neuroforge,
I had a similar use case recently where I wanted to convert a PDF to an encapsulated DICOM file. It's possible in pydicom by setting the 'EncapsulatedDocument' and 'MIMETypeOfEncapsulatedDocument' elements.

Attaching the sample code (pdf2dicom.zip), where I create a new encapsulated document from a PDF file.

Update:

I created a sample repository in case people need a usage example:
https://github.com/rohithkumar31/pdf2dicom

@darcymason
Member

@Neuroforge, were you able to work with the example given? Perhaps this issue can be closed and people can find this by search engines?

@Neuroforge
Author

Neuroforge commented Jun 30, 2019

Hello,

I used a similar approach to @rohithkumar31: I saved the file to a temp file, converted it to DICOM, and then uploaded it to our PACS server.

Please feel free to close the issue.

@a-parida12

I am currently creating a Python package that does this and can also convert a PDF to DICOM RGB format. Maybe you can take a look at the project too.

github: https://github.com/a-parida12/pdf2dcm

pypi: https://pypi.org/project/pdf2dcm/

@janukarhisa

janukarhisa commented Aug 16, 2022

What worked for me is the "dcm2pdf" command (https://support.dcmtk.org/docs/dcm2pdf.html / https://command-not-found.com/dcm2pdf).

In Python, you can run the command with os.system(f'dcm2pdf "{dicom_path}" "{pdf_saving_path}"').
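
If the paths may contain spaces or quotes, a `subprocess` call with an argument list is a bit safer than building a shell string. This is only a sketch: `dcm2pdf_command` and `run_dcm2pdf` are hypothetical helper names, and it assumes DCMTK's `dcm2pdf` is on the PATH and takes the input DICOM file followed by the output PDF file.

```python
import shutil
import subprocess


def dcm2pdf_command(dicom_path, pdf_path):
    """Build the argument list for dcm2pdf: input DICOM file, then output PDF."""
    return ["dcm2pdf", str(dicom_path), str(pdf_path)]


def run_dcm2pdf(dicom_path, pdf_path):
    """Run dcm2pdf, raising if the tool is missing or exits with an error."""
    if shutil.which("dcm2pdf") is None:
        raise FileNotFoundError("dcm2pdf not found on PATH; is DCMTK installed?")
    # No shell is involved, so the paths are passed through unmodified.
    subprocess.run(dcm2pdf_command(dicom_path, pdf_path), check=True)
```

Calling `run_dcm2pdf("report.dcm", "report.pdf")` (placeholder file names) would then raise `subprocess.CalledProcessError` if the conversion fails, instead of silently returning a non-zero status.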

After that, you will be able to extract the text using "pdfplumber":

        import pdfplumber

        with pdfplumber.open(r"path/to/pdf") as pdf:
            first_page = pdf.pages[0]  # loop over pdf.pages if there are several pages
            print(first_page.extract_text(x_tolerance=1))
