aggressive decommit mode alloc may trigger core in 2.8.0 #1227
Comments
This reverts commit be3da70. There are reports of crashes and false-positive OOMs from this patch. Crashes under aggressive decommit mode are understood, but I have yet to get confirmation of whether the false-positive OOMs were seen under aggressive decommit. Thus let's revert for now. Updates issue #1227 and issue #1204.
Many thanks for the high-quality bug report, and thanks for the patch as well. This is actually a bit trickier, and I think your patch doesn't fully fix the problem. The problem is that after we decommit the "main" span that is being deleted, we mark it as sitting on the free list when in fact it isn't properly inserted there yet. Then, when we run decommit for adjacent spans, we drop the lock. While the lock is dropped, other operations may "see" our original span and merge with it, corrupting memory in the process. I think a better fix would be to not bother decommitting adjacent spans at all. We will rarely face adjacent spans that are not on the returned free list anyway, and all we need from aggressive decommit is to deal with the memory we're freeing. All this "adjacent spans" business is unimportant and tricky to get right. There are also other reports of issues with the problematic commit, which may be caused by the same bug or by some other bug. I looked through the code and I think there should be no other bugs. In any case, I've asked whether those other reports also involve aggressive decommit. For now I have reverted the problematic commit.
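To make the ordering problem in this comment concrete, here is a small single-threaded model. It is only a sketch under stated assumptions, not gperftools code: `ModelSpan`, `ModelHeap`, `OtherThreadLooksAt` and both `Release...` functions are hypothetical names, and the "other thread" is simulated by a plain function call at the point where the real code would drop the lock.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of a span; not the real gperftools Span type.
struct ModelSpan {
    int id;
    bool flagged_free = false;  // what a neighbour consults before merging
};

// Hypothetical model of the page heap's free list bookkeeping.
struct ModelHeap {
    std::set<ModelSpan*> free_set;      // spans that are really on the free list
    int inconsistent_observations = 0;  // flag said "free" but span was not in free_set

    // Stands in for work done by other threads while the heap lock is dropped.
    void OtherThreadLooksAt(ModelSpan* s) {
        if (s->flagged_free && free_set.count(s) == 0) {
            ++inconsistent_observations;
        }
    }

    // Ordering described as problematic: flag first, drop the lock to
    // decommit neighbours, insert into the free list only afterwards.
    void ReleaseFlagFirst(ModelSpan* s) {
        s->flagged_free = true;
        OtherThreadLooksAt(s);  // simulated window where the lock is dropped
        free_set.insert(s);
    }

    // Ordering in the spirit of the suggested fix: no neighbour decommit,
    // so there is no window between insertion and flagging.
    void ReleaseNoNeighbourDecommit(ModelSpan* s) {
        free_set.insert(s);
        s->flagged_free = true;
        OtherThreadLooksAt(s);  // any later observer sees a consistent span
    }
};

// Returns how many times an observer saw a flagged but uninserted span.
int CountInconsistent(bool flag_first) {
    ModelHeap heap;
    ModelSpan span{1};
    if (flag_first) {
        heap.ReleaseFlagFirst(&span);
    } else {
        heap.ReleaseNoNeighbourDecommit(&span);
    }
    return heap.inconsistent_observations;
}
```

In this model `CountInconsistent(true)` records one inconsistent observation and `CountInconsistent(false)` records none. The real allocator has more state than a flag and a set, so this only illustrates why removing the unlocked window removes the hazard.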
Got it. I didn't fully understand how span release works and wrote a somewhat tricky patch. I will close this issue.
Related issues in gperftools: - gperftools/gperftools#1204 - gperftools/gperftools#1227
Summary: Downgrade gperftools to 2.7 to avoid hitting tcmalloc bugs (gperftools/gperftools#1204, gperftools/gperftools#1227) and because we have already extensively tested tcmalloc from gperftools 2.7. We will still use the old Linuxbrew-based third-party archive for ASAN/TSAN builds because the new Linuxbrew-based GCC 5 third-party archive does not have a Clang 7 toolchain anymore (Linuxbrew is being removed from an increasing number of build types). But ASAN/TSAN builds are non-production and do not use tcmalloc anyway so it is OK to use an archive that has gperftools 2.8. Also fix find_or_download_thirdparty.sh to take BUILD_ROOT into account. Test Plan: Jenkins Reviewers: bogdan, tvesely, steve.varnau Reviewed By: steve.varnau Subscribers: ybase Differential Revision: https://phabricator.dev.yugabyte.com/D11633
Summary: Original differential revision: https://phabricator.dev.yugabyte.com/D11633 Original commit: e5d4a27 Downgrade gperftools to 2.7 to avoid hitting tcmalloc bugs (gperftools/gperftools#1204, gperftools/gperftools#1227) and because we have already extensively tested tcmalloc from gperftools 2.7. We will still use the old Linuxbrew-based third-party archive for ASAN/TSAN builds because the new Linuxbrew-based GCC 5 third-party archive does not have a Clang 7 toolchain anymore (Linuxbrew is being removed from an increasing number of build types). But ASAN/TSAN builds are non-production and do not use tcmalloc anyway so it is OK to use an archive that has gperftools 2.8. Also fix find_or_download_thirdparty.sh to take BUILD_ROOT into account. Test Plan: Jenkins: urgent, rebase: 2.4 Reviewers: bogdan, tvesely, steve.varnau Reviewed By: steve.varnau Subscribers: ybase Differential Revision: https://phabricator.dev.yugabyte.com/D11655
Summary: Original revision: https://phabricator.dev.yugabyte.com/D11633 Original commit: e5d4a27 Downgrade gperftools to 2.7 to avoid hitting tcmalloc bugs (gperftools/gperftools#1204, gperftools/gperftools#1227) and because we have already extensively tested tcmalloc from gperftools 2.7. We will still use the old Linuxbrew-based third-party archive for ASAN/TSAN builds because the new Linuxbrew-based GCC 5 third-party archive does not have a Clang 7 toolchain anymore (Linuxbrew is being removed from an increasing number of build types). But ASAN/TSAN builds are non-production and do not use tcmalloc anyway so it is OK to use an archive that has gperftools 2.8. Also fix find_or_download_thirdparty.sh to take BUILD_ROOT into account. Note that we are using the updated third-party URL built for the 2.4 branch here, because the previous yugabyte-db-thirdparty commit we used in the 2.5.3 branch was yugabyte/yugabyte-db-thirdparty@45c97f4, which was also used in the 2.4 branch, and had the problematic gperftools version 2.8.0. The new commit we are using is https://github.com/yugabyte/yugabyte-db-thirdparty/commits/07aad696773b3db7976568a3c827e96d8c3d24c9, with gperftools downgraded to 2.7.0. Test Plan: Jenkins: urgent, rebase: 2.5.3 Reviewers: bogdan, tvesely, steve.varnau Reviewed By: steve.varnau Subscribers: ybase Differential Revision: https://phabricator.dev.yugabyte.com/D11665
Summary: Original revision: https://phabricator.dev.yugabyte.com/D11633 Original commit: e5d4a27 Downgrade gperftools to 2.7 to avoid hitting tcmalloc bugs (gperftools/gperftools#1204, gperftools/gperftools#1227) and because we have already extensively tested tcmalloc from gperftools 2.7. We will still use the old Linuxbrew-based third-party archive for ASAN/TSAN builds because the new Linuxbrew-based GCC 5 third-party archive does not have a Clang 7 toolchain anymore (Linuxbrew is being removed from an increasing number of build types). But ASAN/TSAN builds are non-production and do not use tcmalloc anyway so it is OK to use an archive that has gperftools 2.8. Also fix find_or_download_thirdparty.sh to take BUILD_ROOT into account. Test Plan: Jenkins: rebase: 2.7.1 Reviewers: bogdan, tvesely, steve.varnau Reviewed By: steve.varnau Subscribers: ybase Differential Revision: https://phabricator.dev.yugabyte.com/D11680
Summary: Original differential revision: https://phabricator.dev.yugabyte.com/D11633 Original commit: e5d4a27 Downgrade gperftools to 2.7 to avoid hitting tcmalloc bugs (gperftools/gperftools#1204, gperftools/gperftools#1227) and because we have already extensively tested tcmalloc from gperftools 2.7. We will still use the old Linuxbrew-based third-party archive for ASAN/TSAN builds because the new Linuxbrew-based GCC 5 third-party archive does not have a Clang 7 toolchain anymore (Linuxbrew is being removed from an increasing number of build types). But ASAN/TSAN builds are non-production and do not use tcmalloc anyway so it is OK to use an archive that has gperftools 2.8. Also fix find_or_download_thirdparty.sh to take BUILD_ROOT into account. We are using the updated third-party URL specifically built for the 2.6 branch here. The previous yugabyte-db-thirdparty commit we used in the 2.6 branch was yugabyte/yugabyte-db-thirdparty@ee4e2e4, and it had the problematic gperftools version 2.8. The new commit we are using is https://github.com/yugabyte/yugabyte-db-thirdparty/commits/d83a2e241523b48e9cd8b7bd5dd248e74bf0132c, with gperftools downgraded to 2.7. Test Plan: Jenkins: rebase: 2.6 Reviewers: bogdan, tvesely, steve.varnau Reviewed By: steve.varnau Subscribers: ybase Differential Revision: https://phabricator.dev.yugabyte.com/D11682
I want to allocate a large block of memory in aggressive decommit mode, but it triggers a core dump:

tcmalloc allocates span 0x84f1e0 and then deletes it. In aggressive decommit mode it tries to merge the previous span 0x84f1b0, so it calls ReleaseSpan(0x84f1b0). But ReleaseSpan(0x84f1b0) calls RemoveFromFreeList(0x84f1e0); that is a new span and its iterator is empty, so erasing it may crash.

This is my test source code:
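The sketch below is not the reporter's test program and does not use gperftools. It is a hypothetical standalone model of the failure mode described above: a span that is flagged as free but was never inserted has no valid set iterator, and erasing through such an iterator is undefined behaviour. `ToySpan`, `ToyFreeList` and `DemoRemoveNeverInsertedSpan` are names invented for illustration, and the addresses are used only as labels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <set>

// Hypothetical stand-in for a span; not the real gperftools type.
struct ToySpan {
    std::uintptr_t start;      // label for the span (address from the report)
    std::size_t length;        // length in pages
    bool marked_free = false;  // what the span's flag claims
    // Position in the free set; stays empty until the span is really inserted.
    std::optional<std::set<ToySpan*>::iterator> pos;
};

// Hypothetical free list that stores an iterator back into each span.
class ToyFreeList {
public:
    void Insert(ToySpan* s) {
        s->pos = spans_.insert(s).first;
        s->marked_free = true;
    }

    // Refuses to erase when the flag and the stored iterator disagree,
    // instead of erasing through an empty iterator.
    bool Remove(ToySpan* s) {
        if (!s->marked_free || !s->pos.has_value()) {
            return false;
        }
        spans_.erase(*s->pos);
        s->pos.reset();
        s->marked_free = false;
        return true;
    }

    std::size_t size() const { return spans_.size(); }

private:
    std::set<ToySpan*> spans_;
};

// Reproduces the shape of the report: a new span is flagged free without
// being inserted, then something tries to remove it from the free list.
bool DemoRemoveNeverInsertedSpan() {
    ToyFreeList list;
    ToySpan prev{0x84f1b0, 1};
    ToySpan fresh{0x84f1e0, 1};
    list.Insert(&prev);
    fresh.marked_free = true;                  // flagged, never inserted
    bool removed_fresh = list.Remove(&fresh);  // guarded: nothing is erased
    bool removed_prev = list.Remove(&prev);    // properly inserted: erased
    return !removed_fresh && removed_prev && list.size() == 0;
}
```

A guard like the one in `Remove` only hides the symptom in this toy model. As the maintainer's comment notes, the underlying issue is that the span is marked as free before it is actually on the free list.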