PEP MemoryError with a lot of available memory gc not called #43690

Closed

markmat mannequin opened this issue Jul 19, 2006 · 20 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments


markmat mannequin commented Jul 19, 2006

BPO 1524938
Nosy @loewis, @pitrou, @briancurtin
Files
  • unnamed
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2010-08-26.21:21:29.235>
    created_at = <Date 2006-07-19.02:46:19.000>
    labels = ['interpreter-core', 'performance']
    title = 'PEP MemoryError with a lot of available memory gc not called'
    updated_at = <Date 2010-08-26.21:21:29.234>
    user = 'https://bugs.python.org/markmat'

    bugs.python.org fields:

    activity = <Date 2010-08-26.21:21:29.234>
    actor = 'loewis'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-08-26.21:21:29.235>
    closer = 'loewis'
    components = ['Interpreter Core']
    creation = <Date 2006-07-19.02:46:19.000>
    creator = 'markmat'
    dependencies = []
    files = ['18583']
    hgrepos = []
    issue_num = 1524938
    keywords = []
    message_count = 20.0
    messages = ['29202', '29203', '29204', '29205', '29206', '29207', '29208', '29209', '86769', '90581', '91312', '114225', '114230', '114237', '114262', '114416', '114424', '114425', '114987', '115026']
    nosy_count = 9.0
    nosy_names = ['loewis', 'jimjjewett', 'pitrou', 'illume', 'markmat', 'brian.curtin', 'swapnil', 'ysj.ray', 'Itai.i']
    pr_nums = []
    priority = 'low'
    resolution = 'wont fix'
    stage = 'needs patch'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue1524938'
    versions = []


    markmat mannequin commented Jul 19, 2006

    Although the gc behavior is consistent with the
    documentation, I believe it is wrong. I think that GC
    should be called automatically before any MemoryError
    is raised.

    Example 1:
    for i in range(700):
        a = [range(5000000)]
        a.append(a)
        print i

    This example will crash on any PC with less than
    20 GB of RAM. On my PC (Windows 2000, 256 MB) it crashes at
    i==7.
    This example can be fixed by adding a call
    to gc.collect() in the loop (as shown below), but in real cases
    that may be unreasonable.
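
    As a minimal sketch of that workaround (nothing here beyond the original
    example plus an explicit collection each iteration; print(i) keeps the
    snippet valid on both Python 2 and 3):

    import gc

    for i in range(700):
        a = [range(5000000)]
        a.append(a)       # the list now contains itself: a reference cycle
        print(i)
        gc.collect()      # reclaim the cycle orphaned when `a` was rebound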

    @markmat markmat mannequin added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jul 19, 2006

    illume mannequin commented Jul 19, 2006


    Perhaps better than checking before every memory allocation
    would be to check once a memory error happens in an allocation.

    That way the GC hit happens only when memory is actually low.

    So...

    res = malloc(size);
    if (res == NULL) {
        PyGC_Collect();              /* run a full collection pass */
        res = malloc(size);          /* retry the allocation once */
        if (res == NULL) {
            return PyErr_NoMemory(); /* raise MemoryError */
        }
    }


    loewis mannequin commented Jul 23, 2006


    This is very difficult to implement. The best way might be
    to introduce yet another allocation function, one that
    invokes gc before failing, and call that function in all
    interesting places (of which there are many).

    Contributions are welcome and should probably start with a
    PEP first.
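
    As a rough Python-level illustration only (the real change would have to
    live in the C allocator; gc_retry_alloc and factory are invented names),
    the proposed collect-before-failing allocation function might behave like:

    import gc

    def gc_retry_alloc(factory, *args):
        # Try the allocation; on MemoryError, collect cycles and retry once.
        try:
            return factory(*args)
        except MemoryError:
            gc.collect()            # last-ditch pass over cyclic garbage
            return factory(*args)   # a second failure propagates to the caller

    # e.g. data = gc_retry_alloc(list, range(5000000))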


    markmat mannequin commented Jul 23, 2006


    This is exactly what I meant.
    To my recollection, this is the policy of the Java GC. I never
    had to handle Java's out-of-memory error, because I knew that I
    really did not have any more memory.


    markmat mannequin commented Jul 23, 2006


    Sorry, my last comment was addressed to illume (I am a slow typist :( )


    jimjjewett mannequin commented Aug 2, 2006


    Doing it everywhere would be a lot of painful changes.

    Adding the "oops, failed, call gc and try again" behavior to
    the PyMem_* functions (currently PyMem_Malloc, PyMem_Realloc, PyMem_New,
    and PyMem_Resize, but Brett may be changing that) is far
    more reasonable.

    Whether it is safe to call gc from there is a different
    question.


    markmat mannequin commented Aug 3, 2006


    Another problem related to the above example: time is
    wasted on memory swapping before the MemoryError is raised.
    A possible solution is to use a dynamic memory limit: GC is
    called when the limit is reached, and the limit is then adjusted
    according to the memory left.
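
    A sketch of that dynamic-limit policy, assuming invented names (GcBudget,
    note_allocation) and an arbitrary starting budget, just to make the idea
    concrete:

    import gc

    class GcBudget:
        def __init__(self, limit=64 * 1024 * 1024):
            self.limit = limit      # bytes allowed before a forced collection
            self.allocated = 0

        def note_allocation(self, nbytes):
            self.allocated += nbytes
            if self.allocated >= self.limit:
                gc.collect()        # collect when the soft limit is reached
                self.allocated = 0
                # here the limit could be re-adjusted according to the
                # memory actually left free, as suggested above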


    loewis mannequin commented Aug 3, 2006


    The example is highly contrived, and it is pointless to
    optimize for a boundary case. In the average application,
    garbage collection is invoked often enough to reclaim memory
    before swapping occurs.


    pitrou commented Apr 28, 2009

    Lowering priority since, as Martin said, it shouldn't be needed in
    real-life situations.

    @pitrou pitrou added performance Performance or resource usage and removed type-feature A feature request or enhancement labels Apr 28, 2009

    markmat mannequin commented Jul 16, 2009

    It looks like the severity of this problem is underestimated here.

    A programmer who works with significant amounts of data (e.g. a SciPy
    user) and uses OOP will face this problem. Most OOP designs result in
    the existence of some loops (e.g. two-way connections), and if the
    program deals with algorithmic work (signal processing, image
    processing, or even 3D games), some objects in those loops will hold
    huge amounts of data allocated by a single operation.

    I apologize that my example is artificial. I had a real-life program of
    8000 lines which was going into swap for no apparent reason and then
    crashing, but instead of posting those 8000 lines, I posted a simple
    example illustrating the problem.
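
    For illustration, a minimal sketch (Node is an invented class) of the
    kind of structure described above: two objects holding large payloads
    and referencing each other, so reference counting alone never frees them:

    import gc

    class Node(object):
        def __init__(self, payload):
            self.payload = payload   # e.g. a large array of samples
            self.peer = None         # two-way connection -> reference cycle

    a = Node([0] * 5000000)
    b = Node([0] * 5000000)
    a.peer, b.peer = b, a            # each node now references the other

    del a, b                         # refcounts stay nonzero; memory is stranded
    gc.collect()                     # only the cycle collector reclaims it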


    pitrou commented Aug 5, 2009

    I'm not sure what we should do anyway. Your program will first swap out
    and thrash before the MemoryError is raised. Invoking the GC when memory
    allocation fails would avoid the MemoryError, but not the massive
    slowdown due to swapping.


    Itaii mannequin commented Aug 18, 2010

    Hi all,

    I'm joining Mark's assertion: this is a real issue for me too, and I've stumbled into this problem as well.
    I have a numpy/scipy kind of application (about 6000+ lines so far) which needs to allocate a lot of memory for statistics derived from "real life data", which is then transformed a few times by different algorithms (which means allocating more memory but dumping the previous objects).

    Currently I'm getting a MemoryError when I try to use the entire dataset, both on Linux and on Windows, with Python 2.5 on a 64-bit machine with 4 GB of memory. (The Windows Python is a 32-bit version, though, because it needs to be compatible with some DLLs; that is also the reason I use Python 2.5.)


    ysjray mannequin commented Aug 18, 2010

    How about calling gc.collect() explicitly in the loop?


    Itaii mannequin commented Aug 18, 2010

    Sure, that's what I'll do for now. It's an OK workaround for me; I was
    just posting to support the notion that it's a bug (let's call it a
    usability bug) and something that people out there do run into.

    There's also a scenario where you couldn't use this workaround, for
    example when the allocations happen inside a library precompiled in
    a .pyd.



    loewis mannequin commented Aug 18, 2010

    Anybody *really* interested in this issue: somebody will need to write a PEP, get it accepted, and provide an implementation. Open source is about scratching your own itches: the ones affected by a problem are the ones who are also expected to provide solutions.


    Itaii mannequin commented Aug 20, 2010

    You are right, of course... I haven't got the time to do the right thing,
    but I've found another workaround that helped me and might be helpful
    to others.

    (Not sure it's for this thread, but...) Windows by default limits the
    amount of memory for 32-bit processes to 2 GB. There's a bit in the PE
    image, called IMAGE_FILE_LARGE_ADDRESS_AWARE, which tells 64-bit Windows
    to give the process 4 GB (on 32-bit Windows, PAE needs to be enabled
    too). There's a post-build way to enable it with the editbin.exe utility
    which comes with Visual Studio, like this:
    editbin.exe /LARGEADDRESSAWARE python.exe

    It works for me since it gives me 2x the memory on my 64-bit OS.
    I have to say it could be dangerous, since it essentially asserts that
    nowhere in the Python code are pointers treated as negative numbers.
    I figured this should be right, since there's a 64-bit
    version of python...
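
    For anyone wanting to verify the result, a small sketch
    (is_large_address_aware is an invented helper, not part of any library)
    that reads the Characteristics field of the standard PE/COFF header to
    check whether the flag is set:

    import struct

    IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020

    def is_large_address_aware(path):
        with open(path, "rb") as f:
            header = f.read(4096)    # PE headers sit at the start of the file
        pe_offset = struct.unpack_from("<I", header, 0x3C)[0]  # e_lfanew
        # Characteristics: PE signature (4 bytes) + 18 bytes of COFF fields
        flags = struct.unpack_from("<H", header, pe_offset + 22)[0]
        return bool(flags & IMAGE_FILE_LARGE_ADDRESS_AWARE)

    # e.g. print(is_large_address_aware(r"C:\Python25\python.exe"))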




    swapnil mannequin commented Aug 20, 2010

    Mark, are you sure that the above program will cause a crash? I had absolutely no problem running it with Python 3.1.2. With Python 2.6.5 the PC became terribly slow, but the program managed to run until i==14 without crashing. I did not wait to see if it reached 700. I'm running it on XP.

    briancurtin commented:

    Regarding the IMAGE_FILE_LARGE_ADDRESS_AWARE / editbin.exe workaround
    quoted above: see bpo-1449496 if you are interested in that.


    markmat mannequin commented Aug 26, 2010

    This is what I got on computer with 512 MB RAM:

    Mandriva Linux 2009.1
    =============================
    Python 2.6.1 (r261:67515, Jul 14 2010, 09:23:11) [GCC 4.3.2]
    -----> Python process killed by operating system after 14

    Microsoft Windows XP Professional
    Version 5.1.2600 Service Pack 2 Build 2600
    =============================================
    Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
    -----> MemoryError after 10

    Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)]
    -----> MemoryError after 10

    Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)]
    -----> MemoryError after 10

    Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)]
    -----> Successful finish in no time!!!

    Unfortunately I cannot test the original program I had the problem with, because since the original post (2006) I have changed employers. Now I use Matlab :(


    loewis mannequin commented Aug 26, 2010

    OK, I'm closing this as "won't fix". The OP doesn't have the issue anymore; anybody else having such an issue, please report it separately (taking into account that you will likely be asked to provide a patch as well).

    @loewis loewis mannequin closed this as completed Aug 26, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022