Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accessing histogram leaks resources #397

Closed
brunokim opened this Issue Feb 28, 2019 · 4 comments

Comments

Projects
None yet
2 participants
@brunokim
Copy link

brunokim commented Feb 28, 2019

Hey, folks. I've tried using the histogram function for a picture and hit some rough edges. First, simply accessing it leaks resources, which is reproducible with the following program

n = 8
with Image(filename='tests/assets/beach.jpg') as img:
  for _ in range(n):
    _ = img.histogram

I've also tried swapping the order of "with" and "for". The issue seems to be that a 'PixelWand' object is initialized with the object and never properly handled at

self.pixels = library.MagickGetImageHistogram(
.

The other issue is that I can't even iterate over the values of this histogram:

from collections import Counter
with Image(filename='tests/assets/beach.jpg') as img:
  _ = Counter(img.histogram)

/home/bkim/Projects/wand/wand/color.py:113: OptionWarning: unrecognized color `0,0.00784314,0.0117647' @ warning/color.c/GetColorCompliance/1052
self.raise_exception()
Traceback (most recent call last):
File "consume-beach-histogram.py", line 22, in
print(Counter(img.histogram.values()))
File "/usr/lib/python3.6/collections/init.py", line 535, in init
self.update(*args, **kwds)
File "/usr/lib/python3.6/collections/init.py", line 622, in update
_count_elements(self, iterable)
File "/usr/lib/python3.6/_collections_abc.py", line 761, in iter
for key in self._mapping:
File "/home/bkim/Projects/wand/wand/image.py", line 5601, in
for i in xrange(self.size.value))
File "/home/bkim/Projects/wand/wand/color.py", line 116, in init
raise ValueError(msg)
ValueError: Unrecognized color string "0,0.00784314,0.0117647"

My guess is that the function used to build the dict keys in __iter__ (PixelGetColorAsString) is not the same as the one used in __getitem__ (PixelGetColorAsNormalizedString).

@emcconville

This comment has been minimized.

Copy link
Owner

emcconville commented Feb 28, 2019

Hey, thanks for the research!

Looks like the memory leak is caused by python deleting the ctypes container, and not allowing ImageMagick to deallocate the resources. Should be something closer too...

self.pixels = library.DestroyPixelWands(self.pixels, self.size.value)  # <= this is missing
del self.size, self.pixels

As far as the ValueError: Unrecognized color string "0,0.00784314,0.0117647" error. That would be caused by attempting to match PixelGetColorAsNormalizedString between the __iter__ and __getitem__ method. In ImageMagick "normalized" means values between 0 and 1.0, and in this instance, a Color object can not be allocated from those values alone. The PixelGetColorAsString follows color compliance (colorspace, channel count, part-size) to ensure that color-strings can be reconstituted. Perhaps you were adjusting some code? Either way, being able to use Counter(hist). most_common(nth) is very helpful, so I'll spend some time to set-up a test case.

I believe the whole HistogramDict class can be refactored to reduce some overhead. I also believe that the current implementation can have orphaned color values due to rounding errors with HDRI systems.

@brunokim

This comment has been minimized.

Copy link
Author

brunokim commented Feb 28, 2019

Sorry, I was indeed trying to fix HistogramDict in a local copy.

I've added the following line to test_histogram in test_image.py and can reproduce the error:

assert list(h.values()) == [5000, 5000]

raises 'ValueError: Unrecognized color string "0,1,0"'

It seems that this code predates having Color as hashable objects, so it may make sense to forego the string conversion and use the object itself as key, WDYT?

@emcconville

This comment has been minimized.

Copy link
Owner

emcconville commented Mar 1, 2019

Spent a little time on this. Checkout branch issue_397, and see if there's any improvements.

It seems that this code predates having Color as hashable objects, so it may make sense to forego the string conversion and use the object itself as key, WDYT?

Color hashing was apart of the same release as HistogramDict, but I can't speak about the original authors intent & implementation. The whole string conversion is related to the Color internal buffer mapping to ImageMagick-6's MagickPixelPacket structure, and the lack of a color-count. In ImageMagick-7, this switched to PixelInfo structure that does map color-counts. The refactor on issue_397 branch fixes the allocation & value errors, but still needs more testing against the ValueError test cases. Also it's painfully slow for ImageMagick-6, but lightning quick on ImageMagick-7, so a bit more adjustments might be needed.

Tagging #295 & #166 as possible fix might address previously reported issues.

@emcconville emcconville added this to the 0.5.2 milestone Mar 15, 2019

@emcconville

This comment has been minimized.

Copy link
Owner

emcconville commented Mar 15, 2019

Fix applied for memory leak.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.