Potential Memory leak with concurrent.futures.ThreadPoolExecutor's map #85754

Open
or12 mannequin opened this issue Aug 19, 2020 · 6 comments
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir topic-multiprocessing type-bug An unexpected behavior, bug, or error

Comments

@or12
Mannequin

or12 mannequin commented Aug 19, 2020

BPO 41588
Nosy @brianquinlan, @pitrou, @aeros

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2020-08-19.13:55:15.731>
labels = ['extension-modules', '3.7', 'performance']
title = "Potential Memory leak with concurrent.futures.ThreadPoolExecutor's map"
updated_at = <Date 2020-10-29.20:55:16.766>
user = 'https://bugs.python.org/or12'

bugs.python.org fields:

activity = <Date 2020-10-29.20:55:16.766>
actor = 'aeros'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Extension Modules']
creation = <Date 2020-08-19.13:55:15.731>
creator = 'or12'
dependencies = []
files = []
hgrepos = []
issue_num = 41588
keywords = []
message_count = 1.0
messages = ['375647']
nosy_count = 4.0
nosy_names = ['bquinlan', 'pitrou', 'aeros', 'or12']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'resource usage'
url = 'https://bugs.python.org/issue41588'
versions = ['Python 3.7']

@or12
Mannequin Author

or12 mannequin commented Aug 19, 2020

I've been debugging high memory consumption in one of my scripts and traced it back to concurrent.futures.ThreadPoolExecutor.

Investigating further, I found that when using concurrent.futures.ThreadPoolExecutor with the map function and passing a dictionary to the map's function as an argument, the memory used by the pool is not freed, so total memory consumption keeps rising. (It seems to happen when passing a list as well, and maybe even other types.)

Here is example code to reproduce the issue:

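#!/usr/bin/env python3

import os
import time
import psutil
import random
import concurrent.futures

from memory_profiler import profile as mem_profile

p = psutil.Process(os.getpid())

def do_magic(values):
    # Trivial worker: the task body itself allocates nothing.
    return None

@mem_profile
def foo():
    a = {i: chr(i) for i in range(1024)}
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
        # Iterating a dict yields its keys, so do_magic receives one key per call.
        # The iterator returned by map() is never consumed.
        processed_data = pool.map(do_magic, a)

def fooer():
    # Call foo() repeatedly so the per-call memory report can be compared over time.
    while True:
        foo()
        time.sleep(1)

fooer()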
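
For readers without psutil or memory_profiler installed, the measurement can be approximated with the standard library. The following is only a sketch under assumptions: `run_once` and `measure` are hypothetical names, and `tracemalloc` tracks allocations made through Python's allocator rather than process RSS, so its numbers may not match what memory_profiler reports above.

```python
import concurrent.futures
import gc
import tracemalloc


def do_magic(value):
    # Same trivial worker as in the report.
    return None


def run_once():
    # Mirrors foo() from the report: the map() result is deliberately unused.
    data = {i: chr(i) for i in range(1024)}
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
        pool.map(do_magic, data)


def measure(iterations=5):
    # Record the traced Python-level memory (in bytes) after each iteration.
    tracemalloc.start()
    samples = []
    for _ in range(iterations):
        run_once()
        gc.collect()  # rule out garbage that is merely awaiting collection
        current, _peak = tracemalloc.get_traced_memory()
        samples.append(current)
    tracemalloc.stop()
    return samples


if __name__ == "__main__":
    for index, size in enumerate(measure()):
        print(f"iteration {index}: {size} bytes traced")
```

If the samples level off after the first few iterations, the growth seen in RSS may come from the allocator or thread stacks rather than from live Python objects; if they keep climbing, something is still holding references.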

@or12 or12 mannequin added 3.7 (EOL) end of life extension-modules C modules in the Modules dir performance Performance or resource usage labels Aug 19, 2020
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@Robert-Lebedeu

Still no fix? :(
I had the same issue and was forced to work around it by using the submit method and then waiting for the result of each future.
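
The commenter did not post their code, so the following is an assumed sketch of what such a submit-based workaround might look like, reusing `do_magic` from the original report. `foo_with_submit` is a hypothetical name.

```python
import concurrent.futures


def do_magic(value):
    # Same trivial worker as in the original report.
    return None


def foo_with_submit():
    data = {i: chr(i) for i in range(1024)}
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
        # Submit one task per dict key and keep the Future objects.
        futures = [pool.submit(do_magic, key) for key in data]
        # Explicitly wait for every future, in submission order.
        results = [future.result() for future in futures]
    return results


if __name__ == "__main__":
    print(len(foo_with_submit()))
```

Because every future is resolved before the function returns, no pending results or unconsumed iterators outlive the call.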

@iritkatriel iritkatriel added type-bug An unexpected behavior, bug, or error and removed performance Performance or resource usage labels Sep 12, 2022
@Sxderp

Sxderp commented Oct 19, 2022

I ran into this same issue (with a ProcessPool). And after a bit of probing it seems that it's more a generator issue. Unused generators (what .map() returns) are not being garbage collected. If I do list(executor.map(..)) which exhausts the generator I am no longer seeing unbounded memory growth. Of course I'm uselessly slowing down my code, but it's better than having to recreate the ProcessPool / use submit.


Edit:
I think my issue may not have been an actual memory leak, but a misunderstanding of the .map function. I realized that since it's a generator it doesn't actually wait for all the tasks to finish unless it's been iterated on. Thus I was submitting more tasks before they could get completed.
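
A small sketch of the behavior described in this edit, assuming a long-lived executor that is not used as a context manager: `map()` submits every task up front and returns a lazy iterator right away, and only consuming that iterator (for example with `list(...)`) blocks until the work is done. `slow_task` and `demo` are hypothetical names chosen for illustration.

```python
import concurrent.futures
import time


def slow_task(x):
    # Simulate a task that takes a noticeable amount of time.
    time.sleep(0.05)
    return x * 2


def demo():
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
    start = time.monotonic()

    # All four tasks are submitted here, but map() does not wait for them.
    lazy_results = pool.map(slow_task, range(4))
    submit_time = time.monotonic() - start

    # Consuming the iterator blocks until every result is available.
    results = list(lazy_results)
    total_time = time.monotonic() - start

    pool.shutdown()
    return submit_time, total_time, results


if __name__ == "__main__":
    submit_time, total_time, results = demo()
    print(f"map() returned after {submit_time:.4f}s")
    print(f"results ready after {total_time:.4f}s: {results}")
```

Calling `map()` in a loop on such a pool without consuming the results can queue tasks faster than the workers finish them, which looks like unbounded memory growth even though nothing is leaked.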

@CaledoniaProject

I can't believe this remains unfixed. Any workarounds?

@Robert-Lebedeu

@vstinner @iritkatriel

Any ideas on how to fix this issue?

@vstinner
Member

I don't have the bandwidth to dig into multiprocessing issues.
