Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Replace gs subprocess call #133

Merged
merged 13 commits into from Jan 5, 2019
Merged

[MRG] Replace gs subprocess call #133

merged 13 commits into from Jan 5, 2019

Conversation

vinayak-mehta
Copy link
Contributor

@vinayak-mehta vinayak-mehta commented Oct 7, 2018

This is a cleaner way to interact with the ghostscript, need to see if this PyPI package is available in anaconda or not. Also need to suppress the unnecessary logs generated by ghostscript.

EDIT: If it isn't available, maybe we could vendorize it.

Closes #152.

Replace gs subprocess call

Update requirements.txt
@vinayak-mehta vinayak-mehta changed the title Replace gs subprocess call [WIP] Replace gs subprocess call Oct 7, 2018
@KOLANICH
Copy link
Contributor

KOLANICH commented Oct 17, 2018

There is a problem: when we use the API the file descriptor remains open, even if we delete the resulting object. Seems to be a descriptor leak. https://gitlab.com/pdftools/python-ghostscript/issues/13

@vinayak-mehta
Copy link
Contributor Author

@KOLANICH Thanks for finding and reporting the python-ghostscript issue! Looks like python-ghostscript isn't being maintained actively, the last commit was 9 months ago. Maybe we can start a fork and try to fix the issue ourselves. Do you have any other recommendations?

@vinayak-mehta
Copy link
Contributor Author

Another fix (like I mentioned in the #152) would be to remove ghostscript altogether, looking for pure Python packages that can convert a PDF into an image, without any tests failing.

@KOLANICH
Copy link
Contributor

KOLANICH commented Oct 18, 2018

Looks like python-ghostscript isn't being maintained actively, the last commit was 9 months ago.

The last public commit of the author (in another repo) is on July, 2018.

Maybe we can start a fork and try to fix the issue ourselves. Do you have any other recommendations?

1 I guess it makes no sense to fork (in the sense of hitting of a fork button) untill you have a patch ready to be merged back.
2 doing a separate fork may make sense only if the author is uncooperative or unresponsive or having another vision. He has an org on GL, if you are going to develop the lib on permanent basis, I guess you can ask for write privileges.

@vinayak-mehta
Copy link
Contributor Author

1 I guess it makes no sense to fork (in the sense of hitting of a fork button) untill you have a patch ready to be merged back.
2 doing a separate fork may make sense only if the author is uncooperative or unresponsive or having another vision. He has an org on GL, if you are going to develop the lib on permanent basis, I guess you can ask for write privileges.

Makes sense.

Did you try with ghostscript.Ghostscript(*args) as gs:? There's an example in their README which shows the context manager usage.

@vinayak-mehta
Copy link
Contributor Author

There is a problem: when we use the API the file descriptor remains open, even if we delete the resulting object. Seems to be a descriptor leak. https://gitlab.com/pdftools/python-ghostscript/issues/13

@KOLANICH How do you check if the file descriptor is still open after you delete the gs instance as shown in the issue?

@KOLANICH
Copy link
Contributor

How do you check if the file descriptor is still open after you delete the gs instance as shown in the issue?

The file cannot be deleted untill the interpreter is closed.

@codecov-io
Copy link

codecov-io commented Dec 18, 2018

Codecov Report

Merging #133 into master will increase coverage by 0.5%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #133     +/-   ##
=========================================
+ Coverage   86.48%   86.98%   +0.5%     
=========================================
  Files          13       13             
  Lines        1509     1491     -18     
  Branches      349      343      -6     
=========================================
- Hits         1305     1297      -8     
+ Misses        144      137      -7     
+ Partials       60       57      -3
Impacted Files Coverage Δ
camelot/parsers/lattice.py 94.11% <100%> (+4.36%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 73498a9...aad800b. Read the comment docs.

@vinayak-mehta
Copy link
Contributor Author

The file cannot be deleted untill the interpreter is closed.

@KOLANICH I don't face any error while deleting the output file while the interpreter is open. I ran the code from your issue in one terminal and tried to delete the output file from another terminal (which was successful). Am I missing any step?

I've vendorized the ghostscript code within an ext module. Only kept the parts that were absolutely required. I tried running the code from your issue (with the from camelot.ext.ghostscript import Ghostscript import) and was able to delete the output file.

On a separate note, I use the Ghostscript class as a context manager inside the _generate_image function.

@vinayak-mehta
Copy link
Contributor Author

I've vendorized this library since the latest stable version isn't available in Anaconda.

@vinayak-mehta vinayak-mehta changed the title [WIP] Replace gs subprocess call [MRG] Replace gs subprocess call Dec 18, 2018
@KOLANICH
Copy link
Contributor

KOLANICH commented Dec 18, 2018

I don't face any error while deleting the output file while the interpreter is open.

Do you use Windows or Linux? On Linux you can delete an opened file, it disappears from fs, but its space is freed only after the descriptor is freed. On Windows an opened (without taking special measures) file simply cannot be deleted.

@KOLANICH
Copy link
Contributor

KOLANICH commented Dec 18, 2018

I've vendorized this library since the latest stable version isn't available in Anaconda.

Why not to create an own package? Vendorizing is bloat.

@vinayak-mehta
Copy link
Contributor Author

Do you use Windows or Linux? On Linux you can delete an opened file, it disappears from fs, but its space is freed only after the descriptor is freed. On Windows an opened (without taking special measures) file simply cannot be deleted.

I'm on WSL, which is essentially Linux. I didn't take the space being freed into account. Will try on Windows without any special measures.

Why not to create an own package? Vendorizing is bloat.

That is why I kept only the code that is absolutely needed. Will submit a conda recipe to conda-forge when I get time, and remove this vendorized code when it gets accepted.

@vinayak-mehta
Copy link
Contributor Author

I was able to delete the PDF file in use from the file explorer, anaconda prompt and powershell on Windows when using the camelot.ext.ghostscript.Ghostscript as a context manager.

@vinayak-mehta vinayak-mehta merged commit efff61e into master Jan 5, 2019
@vinayak-mehta vinayak-mehta deleted the replace-gs-c-api branch January 5, 2019 06:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants