
Bio.PDB - deepcopy() causes RecursionError on Disordered Entity #787

Closed · andrewguy opened this issue Mar 18, 2016 · 4 comments


@andrewguy (Contributor) commented Mar 18, 2016

I'm currently using the Python subprocess module to perform parallel computations on a Bio.PDB chain object. With some structures, a RecursionError is raised, because the subprocess module performs a deepcopy of the Bio.PDB chain object.

When I simply try to deepcopy the chain, I get the following traceback (with the first two lines repeated many times):

 File "/home/user/Programs/anaconda3/lib/python3.5/site-packages/Bio/PDB/Entity.py", line 206, in __getattr__
    if not hasattr(self, 'selected_child'):
RecursionError: maximum recursion depth exceeded while calling a Python object

The particular pdb file I was working with at the time was 1zro.pdb, although this isn't the only file that has this issue.

Any suggestions as to a suitable workaround/fix?

Cheers,

Andrew

@peterjc (Member) commented Mar 18, 2016

Suggested workaround: Pass the filename and chain identifier to each child process instead, and re-parse the PDB file.

Can you cut down your code to make a short self contained example script to help reproduce the problem?
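This workaround can be sketched with multiprocessing (the function name, job list, and file names below are hypothetical; the Bio.PDB calls are shown only in a comment so the sketch runs without Biopython installed). The point is that each worker receives only cheap, picklable identifiers and re-parses locally, so the Chain object itself never has to be copied between processes:

```python
import multiprocessing


def analyse_chain(args):
    """Worker: receives (filename, chain_id) and rebuilds the heavy
    Bio.PDB object locally.  With Biopython installed, the body would
    be roughly:

        from Bio.PDB import PDBParser
        structure = PDBParser().get_structure("struct", filename)
        chain = structure[0][chain_id]
        # ... compute on chain ...

    A stub result stands in here so the sketch is self-contained."""
    filename, chain_id = args
    return "processed chain %s of %s" % (chain_id, filename)


if __name__ == "__main__":
    # Ship identifiers, not parsed objects, to the workers.
    jobs = [("1zro.pdb", "A"), ("1zro.pdb", "B")]
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(analyse_chain, jobs)
    print(results)
```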

@andrewguy (Contributor, Author)

Hi Peter,

That's a good suggestion, I will reconstruct the PDB object for each child process.

This code should minimally reproduce the problem:

import urllib.request
from io import StringIO
from copy import deepcopy
import Bio.PDB

pdb_name = '1zro'

# Download the file
pdb_url = "http://www.rcsb.org/pdb/files/" + pdb_name + ".pdb"
pdb_file = urllib.request.urlopen(pdb_url)

# Create a string stream for Bio.PDB to read
pdb_input_stream = StringIO(pdb_file.read().decode('utf-8'))

# Make a Bio.PDB structure object
parser = Bio.PDB.PDBParser()
structure = parser.get_structure(pdb_name, pdb_input_stream)

# Attempt to deepcopy a Bio.PDB Chain:
structure_copy = deepcopy(structure[0]['A'])

(Apologies, the file was 1zro, not 1zrl...)

@JoaoRodrigues (Member)

This has been referenced before in #302. It seems to occur only when disordered residues are present. A Python bug report was also linked in that thread at the time.

As Peter suggested, a simple workaround is to re-parse the file in each subprocess. I'll have a look in the coming days and see if we can close this for good without a major code overhaul.

@andrewguy (Contributor, Author)

Just coming back to this issue with fresh eyes. The deepcopy failure arises under Python 3 but not Python 2. It seems to stem from the line `if hasattr(y, '__setstate__'):` in the deepcopy code, which triggers infinite recursion when disordered atoms are present. I'm not sure why, though: simply calling `hasattr(atom, '__setstate__')` on a disordered atom doesn't raise any errors.

This has been seen in another project as well: cloudtools/troposphere#648

The solution proposed there is a bit hacky, but it works: raise an AttributeError whenever `__getattr__('__setstate__')` is called. I have created a pull request with the relevant tests to illustrate this: #1075
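The recursion can be reproduced without Biopython at all. Below is a minimal stand-in (the class names are made up, not Biopython's actual Entity code) for a wrapper that forwards unknown attribute lookups to a selected child, as Bio.PDB's disordered entities do. copy.deepcopy first builds an empty instance, then probes it for '__setstate__'; the hasattr() guard inside __getattr__ then fails normal lookup on the empty copy and re-enters __getattr__ forever:

```python
import copy


class Child:
    name = "CA"


class DisorderedWrapper:
    """Forwards unknown attribute lookups to a selected child,
    mimicking the delegation in Bio.PDB's disordered entities."""

    def __init__(self):
        self.selected_child = Child()

    def __getattr__(self, attr):
        # On a half-constructed deepcopy (empty __dict__), this hasattr
        # call fails normal lookup, re-enters __getattr__, which calls
        # hasattr again, and so on until the recursion limit is hit.
        if not hasattr(self, "selected_child"):
            raise AttributeError(attr)
        return getattr(self.selected_child, attr)


obj = DisorderedWrapper()
assert obj.name == "CA"  # normal delegation works on a full instance

try:
    copy.deepcopy(obj)
    outcome = "copied"
except RecursionError:
    outcome = "RecursionError"
print(outcome)
```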

peterjc closed this as completed in f071a78 on Mar 7, 2017
MarkusPiotrowski pushed a commit to MarkusPiotrowski/biopython that referenced this issue Oct 31, 2017
Squashed commit of pull request biopython#1075, closes biopython#787.

Raise AttributeError when `__getattr__('__setstate__')` is called to fix `copy.deepcopy` recursing under Python 3.6
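The shape of that fix can be sketched on a minimal stand-in class (names made up; this is not Biopython's actual Entity code): __getattr__ refuses to forward '__setstate__', so copy.deepcopy's hasattr probe on the half-built copy gets a clean AttributeError instead of recursing:

```python
import copy


class Child:
    name = "CA"


class FixedWrapper:
    """Attribute-forwarding wrapper with the #1075-style fix applied."""

    def __init__(self):
        self.selected_child = Child()

    def __getattr__(self, attr):
        # The fix: never forward '__setstate__'.  deepcopy's
        # hasattr(y, '__setstate__') probe on a half-built copy now
        # fails cleanly, and deepcopy falls back to updating __dict__.
        if attr == "__setstate__":
            raise AttributeError(attr)
        if not hasattr(self, "selected_child"):
            raise AttributeError(attr)
        return getattr(self.selected_child, attr)


obj = FixedWrapper()
dup = copy.deepcopy(obj)  # no longer raises RecursionError
print(dup.name)           # delegation still works on the copy
```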