Replacing Ghostscript with PDFBox #346

Closed
wants to merge 3 commits into from

fakabbir

Replacing Ghostscript with alternative opensource package #342

PDFBox can serve as an alternative to Ghostscript. It is currently usable from Python through the python-pdfbox wrapper, which recently added support for converting PDFs to images.

With python-pdfbox, a PDF can be converted to images sequentially.

Also, what about letting users choose between the two libraries, in case they have already purchased a Ghostscript license?

@vinayak-mehta
Contributor

What advantages does pdfbox have over ghostscript?

@fakabbir
Author

fakabbir commented Jul 6, 2019

PDFBox would bring camelot closer to the MIT license. Ghostscript is available as an AGPL/commercially licensed product. Anyone who wants to use camelot at present needs to download and install Ghostscript separately, which may or may not be feasible in certain cases.

If we shift to PDFBox, which is an Apache-licensed package, users would not have to install dependencies separately. Doing pip install would fetch all the dependencies.

Also, are you sure that using an AGPL-licensed package the way you did keeps camelot under MIT rather than AGPL? I mean, using an AGPL package by default means that you need to distribute under the AGPL license only.

PS: The concerns with Ghostscript are

  • the licensing terms
  • installing the dependencies separately.

@vinayak-mehta
Contributor

I understand the part about licensing; we want to remove ghostscript altogether: camelot-dev/camelot#13

I just went through python-pdfbox. It automatically downloads and caches the pdfbox jar file, which should make installation easier for users, as installing ghostscript has been a pain on Windows. But then again, users would need Java to use the library. It's an interesting tradeoff, and we should definitely discuss it here: camelot-dev/camelot#27.
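To make the Java requirement less surprising for users, a check along these lines could run before the PDFBox path is used. This is only a sketch: `java_available` and `require_java` are hypothetical helpers, and looking for `java` on PATH is an assumption that may not match how python-pdfbox itself locates the runtime.

```python
import shutil


def java_available() -> bool:
    """Return True if a `java` executable can be found on PATH."""
    return shutil.which("java") is not None


def require_java() -> None:
    """Raise a readable error early instead of failing inside the wrapper."""
    if not java_available():
        raise RuntimeError(
            "The PDFBox backend needs a Java runtime, but no `java` "
            "executable was found on PATH."
        )
```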

Can you please raise this PR here so that we can see if the tests pass? You'll also have to edit setup.py to install python-pdfbox.
