
Intermittent failure: src/arena.c:351: Failed assertion: "p[i] == 0" #766

Closed · WesleyHartTX opened this issue Apr 13, 2017 · 6 comments · Fixed by #780
@WesleyHartTX commented Apr 13, 2017

Reproduced at b973ec7 on rc-4.5.0; also confirmed as failing on 4.4.0.

Environment: Ubuntu 16.04 LTS, kernel 4.4.0-57-generic, glibc 2.23-0ubuntu7, binutils 2.26.1-1ubuntu1~16.04.3.

Configure flags: --with-jemalloc-prefix=je_ --with-malloc-conf=lg_chunk:20 --enable-stats --enable-debug

The assertion failures only seem to occur when doing aligned allocations in an arena with custom chunk_hooks. Passing MALLOCX_TCACHE_NONE to je_mallocx (our default) makes the failures intermittent; omitting it seems to make them much more likely.

aligntest.c.txt
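
The failing pattern is roughly the following (a hedged sketch; sizes, alignment, and the arena setup are placeholders, and the actual reproducer is in the attached aligntest.c.txt):

```c
/*
 * Hedged sketch of the failing pattern; placeholder sizes and alignment.
 * Assumes arena_ind was created via the "arenas.extend" mallctl and that
 * chunk_hooks with a custom commit function were installed on it first.
 */
#include <stddef.h>
#include <jemalloc/jemalloc.h>

static void
alloc_free_cycle(unsigned arena_ind)
{
	int flags = MALLOCX_ALIGN(4096)
	    | MALLOCX_ARENA(arena_ind)
	    | MALLOCX_TCACHE_NONE;	/* our default; omitting this makes
					 * the failures more likely */
	void *p = je_mallocx(64 * 1024, flags);

	/*
	 * With --enable-debug, the allocation path asserts that the pages
	 * backing the allocation are zeroed; a commit hook that returns
	 * dirty memory trips src/arena.c:351: "p[i] == 0".
	 */
	if (p != NULL)
		je_dallocx(p, flags);
}
```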

jasone added the notabug label Apr 14, 2017
@jasone (member) commented Apr 14, 2017

The mprotect() call in my_chunk_commit() does not zero memory, but the commit hook is required to provide zeroed memory. From the manual:

A chunk commit function conforms to the chunk_commit_t type and commits zeroed physical memory to back pages within a chunk of given size at offset bytes, extending for length on behalf of arena arena_ind, returning false upon success.

I added memset(addr, 0, length); just before the my_chunk_commit() return, and it resolves the issue.
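
A minimal sketch of a hook patched this way, assuming an mprotect()-based commit over a pre-reserved PROT_NONE pool (only the chunk_commit_t signature comes from jemalloc; the body is illustrative, not the reproducer's actual code):

```c
/*
 * Hedged sketch of the patched commit hook; assumes the chunk memory is
 * a PROT_NONE reservation.
 */
#include <stdbool.h>
#include <string.h>
#include <sys/mman.h>

static bool
my_chunk_commit(void *chunk, size_t size, size_t offset, size_t length,
    unsigned arena_ind)
{
	void *addr = (char *)chunk + offset;

	(void)size; (void)arena_ind;
	if (mprotect(addr, length, PROT_READ | PROT_WRITE) != 0)
		return (true);	/* true == failure */
	/*
	 * mprotect() leaves previously populated pages dirty, but the
	 * chunk_commit_t contract requires zeroed memory.
	 */
	memset(addr, 0, length);
	return (false);		/* false == success */
}
```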

@WesleyHartTX (author) commented:

Jason,

Thanks for the quick response!

I've actually toyed with putting that memset() call in our production code, but I neglected to include it when I was assembling the reproducer test. Apologies.

With the memset() in place, I see a (probably) separate problem where the chunk allocated for the first je_mallocx() call isn't always reused after the allocation in it is freed. The result is that after a few hundred thousand alloc/free pairs, I exhaust my limited memory pool. The perplexing thing is that on some runs I get perfect reuse and all allocations fit in that initial chunk, and on some runs I don't.

Do you want full details here, or shall I open a separate issue?

@jasone (member) commented Apr 14, 2017

We can use this issue to track the exhaustion symptom. Incidentally, you're probably better off using mmap() than memset() to zero the newly committed memory.

Please capture the output of malloc_stats_print() at the time memory is exhausted, to get some visibility into memory usage, fragmentation, and retained virtual memory.
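
A hedged sketch of that mmap() variant: remapping the range anonymously with MAP_FIXED both commits it and yields fresh zero pages, so no explicit memset() is needed (this assumes the pool is itself an anonymous mapping, since MAP_FIXED clobbers whatever was mapped there):

```c
/*
 * Hedged sketch of the mmap()-based commit hook suggested above.
 * MAP_FIXED | MAP_ANONYMOUS over the range replaces it with fresh zero
 * pages, committing and zeroing in one step.
 */
#define _DEFAULT_SOURCE
#include <stdbool.h>
#include <sys/mman.h>

static bool
my_chunk_commit(void *chunk, size_t size, size_t offset, size_t length,
    unsigned arena_ind)
{
	void *addr = (char *)chunk + offset;

	(void)size; (void)arena_ind;
	/* false == success, per the chunk_commit_t contract. */
	return (mmap(addr, length, PROT_READ | PROT_WRITE,
	    MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0) == MAP_FAILED);
}
```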

@WesleyHartTX (author) commented:

Memory exhaustion reproducer:

exhaust.c.txt

It may take multiple runs to trigger the out-of-memory issue.

Output of malloc_stats_print():

stats.txt
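
For reference, output like the attached stats.txt can be captured with jemalloc's documented malloc_stats_print() API; the file-writing callback below is a hypothetical sketch:

```c
/*
 * Hedged sketch of capturing malloc_stats_print() output to a file.
 * The callback and function names here are hypothetical; the
 * je_malloc_stats_print() API itself is jemalloc's documented one.
 */
#include <stdio.h>
#include <jemalloc/jemalloc.h>

static void
write_cb(void *opaque, const char *str)
{
	fputs(str, (FILE *)opaque);
}

static void
dump_stats(const char *path)
{
	FILE *f = fopen(path, "w");

	if (f == NULL)
		return;
	je_malloc_stats_print(write_cb, f, NULL);	/* NULL opts: full stats */
	fclose(f);
}
```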

jasone added this to the 5.0.0 milestone Apr 18, 2017
jasone added a commit to jasone/jemalloc that referenced this issue Apr 18, 2017
When allocating runs with alignment stricter than one page, commit after
trimming the head/tail from the initial over-sized allocation, rather
than before trimming.  This avoids creating clean-but-committed runs;
such runs do not get purged (and decommitted as a side effect), so they
can cause unnecessary long-term run fragmentation.

Do not commit decommitted memory in chunk_recycle() unless asked to by
the caller.  This allows recycled arena chunks to start in the
decommitted state, and therefore increases the likelihood that purging
after run deallocation will allow the arena chunk to become a single
unused run, thus allowing the chunk as a whole to be discarded.

This resolves jemalloc#766.
jasone mentioned this issue Apr 18, 2017
@jasone (member) commented Apr 18, 2017

#775 fixes this for the master branch. I'll try to fix it on the dev branch tomorrow.

@jasone (member) commented Apr 18, 2017

Actually, #775 breaks both macOS and Windows, so I'll have to dig into that first tomorrow.

jasone added a commit to jasone/jemalloc that referenced this issue Apr 19, 2017
This avoids creating clean committed pages as a side effect of aligned
allocation.  For configurations that decommit memory, purged pages are
decommitted, and decommitted extents cannot be coalesced with committed
extents.  Unless the clean committed pages happen to be selected during
allocation, they cause unnecessary permanent extent fragmentation.

This resolves jemalloc#766.
jasone added a commit that referenced this issue Apr 20, 2017