Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow PDBIO to write to an open IO object without closing it #4200

Merged
merged 3 commits into from
Dec 15, 2022

Conversation

jgreener64
Copy link
Contributor

  • I hereby agree to dual licence this and any previous contributions under both
    the Biopython License Agreement AND the BSD 3-Clause License.

  • I have read the CONTRIBUTING.rst file, have run pre-commit
    locally, and understand that continuous integration checks will be used to
    confirm the Biopython unit tests and style checks pass with these changes.

  • I have added my name to the alphabetical contributors listings in the files
    NEWS.rst and CONTRIB.rst as part of this pull request, am listed
    already, or do not wish to be listed. (This acknowledgement is optional.)

Recent changes to PDBIO in be77131 switched to using a with fhandle block for the open file handle. This is fine in the case of the file being opened by PDBIO, but another useful situation is having an open file handle or string object that you can read from or write to again.

This PR only closes the file handle if PDBIO opened it, allowing for example:

from Bio.PDB import *
from io import StringIO

parser = PDBParser()
structure = parser.get_structure("6YYT", "6YYT.pdb")
res = structure[0]["A"][31]

io_pdb = PDBIO()
io_pdb.set_structure(res)
io_str = StringIO()
io_pdb.save(io_str)
print(io_str.getvalue())
ATOM      1  N   VAL A  31     125.110  92.109  68.636  1.00 52.46           N  
ATOM      2  CA  VAL A  31     124.229  91.412  69.563  1.00 52.46           C  
ATOM      3  C   VAL A  31     122.990  92.264  69.821  1.00 52.46           C  
ATOM      4  O   VAL A  31     122.766  93.266  69.142  1.00 52.46           O  
ATOM      5  CB  VAL A  31     124.958  91.076  70.879  1.00 52.46           C  
ATOM      6  CG1 VAL A  31     126.011  90.005  70.641  1.00 52.46           C  
ATOM      7  CG2 VAL A  31     125.585  92.327  71.479  1.00 52.46           C  
TER       8      VAL A  31                                                       
END   

whereas on Biopython 1.79 the last line errors as io_str has been closed by save:

ValueError: I/O operation on closed file

Most of the code change is white space, it just adds a flag that closes the handle depending on whether open was called.

This initially came to my attention due to a downstream error in NGLView (https://bioinformatics.stackexchange.com/questions/20151/file-i-o-error-using-nglview-show-biopythonstructure) which uses this behaviour to access the string of a written PDB file without writing a file (https://github.com/nglviewer/nglview/blob/master/nglview/adaptor.py#L193-L200).

Bio/PDB/PDBIO.py Outdated
if write_end:
fhandle.write("END \n")

if should_close:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be able to use isinstance(file, str) here instead of a flag variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

Copy link
Member

@peterjc peterjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Tip: append ?w=1 on the diff view in GitHub to hide the indentation change which makes this look more dramatic than it really is.

@JoaoRodrigues
Copy link
Member

Looks good to me as well. I'll merge it. Thanks Joe!

@peterjc Nice trick with the ?w=1 to hide indentation, it took me a while to figure out what exactly had changed!

@JoaoRodrigues JoaoRodrigues merged commit 7ef5655 into biopython:master Dec 15, 2022
@peterjc
Copy link
Member

peterjc commented Dec 16, 2022

To help remember this, it's a w for whitespace, just like the diff -w ... command line option.

@jgreener64 jgreener64 deleted the file-handle branch December 16, 2022 11:22
@jgreener64
Copy link
Contributor Author

There is also a checkbox to hide white space under the settings dropdown in the diff viewer.

@peterjc
Copy link
Member

peterjc commented Dec 16, 2022

I see, via the cog icon - I wonder how long that's been there? Thanks!

@pippo1990
Copy link

pippo1990 commented Dec 16, 2022

Could you check if solution could be like PDBParser [PDBParser.py] handles it:

from Bio.File import as_handle ##github.com/biopython/biopython/blob/…

and in PDBIO.py save method #[https://github.com/biopython/biopython/blob/7ef56557bdca66eabd62334e61c4dc3153c997ff/Bio/PDB/PDBIO.py#L299]

change with fhandle to with as_handle(fhandle) , was referring to old PDBIO 1.80 that was using with fhandle

or betterr remove "if isinstance else ..." and change "with fhandle:" to "with as_handle(file, mode = 'w') as fhandle:"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants