Leak in tarfile.py #43436

Closed
jensj mannequin opened this issue May 31, 2006 · 4 comments
Comments

@jensj
Mannequin

jensj mannequin commented May 31, 2006

BPO 1497962
Nosy @tim-one

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2006-06-19.02:21:32.000>
created_at = <Date 2006-05-31.06:42:30.000>
labels = []
title = 'Leak in tarfile.py'
updated_at = <Date 2006-06-19.02:21:32.000>
user = 'https://bugs.python.org/jensj'

bugs.python.org fields:

activity = <Date 2006-06-19.02:21:32.000>
actor = 'sf-robot'
assignee = 'none'
closed = True
closed_date = None
closer = None
components = ['None']
creation = <Date 2006-05-31.06:42:30.000>
creator = 'jensj'
dependencies = []
files = []
hgrepos = []
issue_num = 1497962
keywords = []
message_count = 4.0
messages = ['28687', '28688', '28689', '28690']
nosy_count = 3.0
nosy_names = ['tim.peters', 'sf-robot', 'jensj']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1497962'
versions = []

@jensj
Mannequin Author

jensj mannequin commented May 31, 2006

There is a leak when using the tarfile module and the
extractfile method.  Here is a simple example:
 
$ echo "grrr" > x.txt
$ tar cf x.tar x.txt
$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gc
>>> import tarfile
>>> tar = tarfile.open('x.tar', 'r')
>>> f = tar.extractfile('x.txt')
>>> f.read()
'grrr\n'
>>> del f
>>> gc.set_debug(gc.DEBUG_LEAK)
>>> print gc.collect()
gc: collectable <ExFileObject 0xb73d4acc>
gc: collectable <dict 0xb73dcf0c>
gc: collectable <instancemethod 0xb7d2daf4>
3
>>> print gc.garbage
[<tarfile.ExFileObject object at 0xb73d4acc>, {'name':
'x.txt', 'read': <bound method ExFileObject._readnormal
of <tarfile.ExFileObject object at 0xb73d4acc>>, 'pos':
0L, 'fileobj': <open file 'x.tar', mode 'rb' at
0xb73e67b8>, 'mode': 'r', 'closed': False, 'offset':
512L, 'linebuffer': '', 'size': 5L}, <bound method
ExFileObject._readnormal of <tarfile.ExFileObject
object at 0xb73d4acc>>]
>>>

@jensj jensj mannequin closed this as completed May 31, 2006
@jensj
Mannequin Author

jensj mannequin commented Jun 1, 2006

The problem is that the ExFileObject has an attribute
(self.read) that is a method bound to itself
(self._readsparse or self._readnormal). One solution is to
add "del self.read" to the close method, but someone might
forget to close the object and still get the leak. Another
solution is to change the end of __init__ to:

  if tarinfo.issparse():
      self.sparse = tarinfo.sparse
  else:
      self.sparse = None

and add a read method:

  def read(self, size=None):
      if self.sparse is None:
          return self._readnormal(size)
      else:
          return self._readsparse(size)
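
To make the cycle concrete, here is a minimal sketch in modern Python 3 (not the Python 2.4 tarfile source). BoundMethodReader and DispatchingReader are hypothetical stand-ins for ExFileObject before and after the change suggested above, and freed_without_gc is a helper invented for this illustration. It assumes CPython, where reference counting frees acyclic objects as soon as the last reference goes away.

```python
import gc
import weakref


class BoundMethodReader:
    """Hypothetical stand-in for the original pattern."""

    def __init__(self, sparse=None):
        # Storing a bound method on the instance creates a cycle:
        # self -> self.read (bound method) -> self.
        self.read = self._readsparse if sparse is not None else self._readnormal

    def _readnormal(self, size=None):
        return "normal"

    def _readsparse(self, size=None):
        return "sparse"


class DispatchingReader:
    """Hypothetical stand-in for the proposed fix: dispatch at call time."""

    def __init__(self, sparse=None):
        # Only plain data is stored, so the instance does not refer to itself.
        self.sparse = sparse

    def read(self, size=None):
        if self.sparse is None:
            return self._readnormal(size)
        return self._readsparse(size)

    def _readnormal(self, size=None):
        return "normal"

    def _readsparse(self, size=None):
        return "sparse"


def freed_without_gc(cls):
    """Return True if an instance is reclaimed by refcounting alone."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector out of the picture
    try:
        obj = cls()
        ref = weakref.ref(obj)
        del obj
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind by the experiment
```

On CPython, freed_without_gc(DispatchingReader) should return True and freed_without_gc(BoundMethodReader) should return False. The second instance is not lost; it simply waits for the cyclic collector.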

@tim-one
Member

tim-one commented Jun 1, 2006

There's no evidence of a leak here -- quite the contrary.
As the docs say, DEBUG_LEAK implies DEBUG_SAVEALL, and
DEBUG_SAVEALL results in _all_ cyclic trash getting
appended to gc.garbage. If you don't mess with
gc.set_debug(), you'll discover that gc.garbage is empty at
the end.
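
The effect of DEBUG_SAVEALL can be reproduced without tarfile at all. The following is a small Python 3 sketch, not taken from the report: Node and collect_cycle are hypothetical names, and gc.DEBUG_SAVEALL is used on its own (rather than DEBUG_LEAK) only to avoid the debug messages printed to stderr.

```python
import gc


class Node:
    """Tiny object that refers to itself, forming a reference cycle."""

    def __init__(self):
        self.me = self


def collect_cycle(debug_flags):
    """Create one cycle, collect it, and report (found, saved_in_garbage)."""
    old_flags = gc.get_debug()
    gc.collect()        # flush unrelated cyclic trash first
    del gc.garbage[:]   # start from an empty gc.garbage
    gc.set_debug(debug_flags)
    try:
        Node()          # unreachable immediately, but kept alive by its cycle
        found = gc.collect()
        saved = [o for o in gc.garbage if isinstance(o, Node)]
        return found, len(saved)
    finally:
        gc.set_debug(old_flags)
        del gc.garbage[:]
```

Calling collect_cycle(0) should report that the collector found the cycle while saving nothing in gc.garbage, whereas collect_cycle(gc.DEBUG_SAVEALL) should report the same Node sitting in gc.garbage. In both cases the object was collectable.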

In addition, note that the DEBUG_LEAK output plainly says:

gc: collectable ...

That's also telling you that it found collectable cyclic
trash (which it would have reclaimed had you not forced it
to get appended to gc.garbage instead). If gc had found
uncollectable cycles, these msgs would have started with

gc: uncollectable ...

instead.

Most directly, if I run your tarfile open() and file
extraction in an infinite loop (without messing with
gc.set_debug()), the process memory use does not grow over time.
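
A bounded version of that loop can be sketched as follows, in Python 3 and with some assumptions: the archive is built in memory instead of reading x.tar from disk, the count of gc-tracked objects is used as a rough proxy for process memory, and make_tar_bytes, open_and_extract and object_growth are hypothetical helper names.

```python
import gc
import io
import tarfile


def make_tar_bytes():
    """Build a one-member archive in memory, mirroring x.tar from the report."""
    payload = b"grrr\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="x.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def open_and_extract(data):
    """Open the archive, read x.txt through extractfile(), drop the file object."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        f = tar.extractfile("x.txt")
        content = f.read()
        del f
    return content


def object_growth(iterations=200, warmup=20):
    """Return the change in the number of gc-tracked objects over the loop."""
    data = make_tar_bytes()
    for _ in range(warmup):      # let lazy imports and caches settle
        open_and_extract(data)
    gc.collect()
    before = len(gc.get_objects())
    for _ in range(iterations):
        open_and_extract(data)
    gc.collect()
    after = len(gc.get_objects())
    return after - before
```

If nothing leaks, object_growth() should stay close to zero no matter how many iterations are requested. A value that scales with the iteration count would be the kind of evidence an actual leak report needs.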

Unless you have other evidence of an actual leak, this
report should be closed without action. Yes, there are
reference cycles here, but they're of kinds that cyclic gc reclaims.

@sf-robot
Mannequin

sf-robot mannequin commented Jun 19, 2006

This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022