Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit liballoc for leaks in `Drop` impls when user destructor panics #67290

Merged
merged 11 commits into from Feb 26, 2020

Conversation

@jonas-schievink
Copy link
Member

jonas-schievink commented Dec 14, 2019

Inspired by #67243 and #67235, this audits and hopefully fixes the remaining Drop impls in liballoc for resource leaks in the presence of panics in destructors called by the affected Drop impl.

This does not touch Hash{Map,Set} since they live in hashbrown. They have similar issues though.

r? @KodrAus

@jonas-schievink

This comment was marked as resolved.

Copy link
Member Author

jonas-schievink commented Dec 14, 2019

vec_deque.rs is now too long for tidy. Should I split something out or just add // ignore-tidy-filelength?

@jonas-schievink jonas-schievink changed the title Audit liballoc for leaks in `Drop` impls Audit liballoc for leaks in `Drop` impls when user destructor panics Dec 14, 2019
@rust-highfive

This comment was marked as resolved.

Copy link
Collaborator

rust-highfive commented Dec 14, 2019

The job x86_64-gnu-llvm-7 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2019-12-14T00:10:43.7112891Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-12-14T00:10:43.7310291Z ##[command]git config gc.auto 0
2019-12-14T00:10:43.7381639Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-12-14T00:10:43.7426752Z ##[command]git config --get-all http.proxy
2019-12-14T00:10:43.7559980Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/67290/merge:refs/remotes/pull/67290/merge
---
2019-12-14T00:16:24.7916637Z    Compiling serde_json v1.0.40
2019-12-14T00:16:27.0868042Z    Compiling tidy v0.1.0 (/checkout/src/tools/tidy)
2019-12-14T00:16:36.3711405Z     Finished release [optimized] target(s) in 1m 17s
2019-12-14T00:16:36.3804205Z tidy check
2019-12-14T00:16:36.7004675Z tidy error: /checkout/src/liballoc/collections/vec_deque.rs: too many lines (3018) (add `// ignore-tidy-filelength` to the file to suppress this error)
2019-12-14T00:16:38.8452169Z some tidy checks failed
2019-12-14T00:16:38.8452989Z Found 485 error codes
2019-12-14T00:16:38.8453294Z Found 0 error codes with no tests
2019-12-14T00:16:38.8453496Z Done!
2019-12-14T00:16:38.8453496Z Done!
2019-12-14T00:16:38.8453779Z 
2019-12-14T00:16:38.8453979Z 
2019-12-14T00:16:38.8455240Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor"
2019-12-14T00:16:38.8455846Z 
2019-12-14T00:16:38.8456238Z 
2019-12-14T00:16:38.8459035Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
2019-12-14T00:16:38.8459547Z Build completed unsuccessfully in 0:01:21
2019-12-14T00:16:38.8459547Z Build completed unsuccessfully in 0:01:21
2019-12-14T00:16:38.8509429Z == clock drift check ==
2019-12-14T00:16:38.8520316Z   local time: Sat Dec 14 00:16:38 UTC 2019
2019-12-14T00:16:38.8948448Z   network time: Sat, 14 Dec 2019 00:16:38 GMT
2019-12-14T00:16:38.8953945Z == end clock drift check ==
2019-12-14T00:16:40.2591988Z 
2019-12-14T00:16:40.2686065Z ##[error]Bash exited with code '1'.
2019-12-14T00:16:40.2713703Z ##[section]Starting: Checkout
2019-12-14T00:16:40.2715169Z ==============================================================================
2019-12-14T00:16:40.2715340Z Task         : Get sources
2019-12-14T00:16:40.2715383Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Centril

This comment was marked as resolved.

Copy link
Member

Centril commented Dec 14, 2019

vec_deque.rs is now too long for tidy. Should I split something out or just add // ignore-tidy-filelength?

If you do ignore, please file an issue for yourself to split it later.

Copy link
Contributor

KodrAus left a comment

Thanks for working on this @jonas-schievink! I need to spend more time digging through the changes, especially to Vec. In general, I think we need comments on all of these drop guards that document what they're doing.

cc @rust-lang/libs do you have any thoughts on this? This PR makes collections try a bit harder to drop their contents in the presence of panics, so they behave in the same as ptr::drop_in_place and multiple locals when one of them panics during drop.

We've already done this for LinkedList and VecDeque. This PR includes:

  • BinaryHeap::drain_sorted (which is unstable)
  • BTreeMap::into_iter (used by BTreeMap during drop)
  • LinkedList::drain_filter
  • VecDeque::truncate
  • VecDeque::drain
  • Vec::into_iter
  • Vec::drain
src/liballoc/vec.rs Outdated Show resolved Hide resolved
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Dec 16, 2019

I've historically hoped that it would never come to this myself, maintaining this sort of code is really hard and provides very little benefit in practice. In some sense though this was inevitable, so beyond auditing for correctness I don't think there's much we can do but merge this.

@Amanieu

This comment has been minimized.

Copy link
Contributor

Amanieu commented Dec 16, 2019

I would personally like to go the way C++ did and always abort on any unwind from a destructor (even if we weren't already unwinding). However I'm not quite sure whether that is a breaking change.

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Dec 16, 2019

I haven't reviewed the implementation but if it does what it says (matching the behavior of ptr::drop_in_place in the case of panics) then I am on board with accepting this.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Dec 22, 2019

What does ptr::drop_in_place do, btw? In particular if a second panic happens after the first one was caught.

@jonas-schievink

This comment has been minimized.

Copy link
Member Author

jonas-schievink commented Dec 22, 2019

Double panics always abort

@bors

This comment was marked as resolved.

Copy link
Contributor

bors commented Dec 23, 2019

☔️ The latest upstream changes (presumably #67540) made this pull request unmergeable. Please resolve the merge conflicts.

@jonas-schievink jonas-schievink force-pushed the jonas-schievink:leak-audit branch from d9da5f6 to 9e5578e Dec 23, 2019
@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

Mark-Simulacrum commented Jan 19, 2020

It sounds like we need to come to consensus about whether to accept this patch, or instead aim to opt for the behavior C++ has, i.e., aborting on panic in destructors unconditionally.

I think we should explore making destructors abort on panic, but in a separate PR (I would be interested in code size and possible performance benefits). For now I would land this PR -- it seems to be consistent with what we have been doing elsewhere (I have not verified the implementation in too much detail, but it seems to be correct -- I think there are tests for every new added impl, but they appear to be in different files so it's a bit hard to be certain without comparing one by one).


In terms of the PR itself, it seems like it would be nice to have some primitive for what amounts to Iterator::for_each(drop), but continuing on panic, rather than duplicating similar code across different Drop impls.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 19, 2020

☔️ The latest upstream changes (presumably #67758) made this pull request unmergeable. Please resolve the merge conflicts.

@jonas-schievink jonas-schievink force-pushed the jonas-schievink:leak-audit branch from 9e5578e to 52d6c90 Jan 19, 2020
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Jan 19, 2020

The job x86_64-gnu-llvm-7 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2020-01-19T19:31:08.9563979Z ========================== Starting Command Output ===========================
2020-01-19T19:31:08.9565475Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/9f4e992f-892f-4f96-a74e-ca851a57db42.sh
2020-01-19T19:31:08.9565508Z 
2020-01-19T19:31:08.9567941Z ##[section]Finishing: Disable git automatic line ending conversion
2020-01-19T19:31:08.9573874Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/67290/merge to s
2020-01-19T19:31:08.9575504Z Task         : Get sources
2020-01-19T19:31:08.9575536Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-01-19T19:31:08.9575568Z Version      : 1.0.0
2020-01-19T19:31:08.9575601Z Author       : Microsoft
---
2020-01-19T19:31:09.9569875Z ##[command]git remote add origin https://github.com/rust-lang/rust
2020-01-19T19:31:09.9581477Z ##[command]git config gc.auto 0
2020-01-19T19:31:09.9583779Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2020-01-19T19:31:09.9585690Z ##[command]git config --get-all http.proxy
2020-01-19T19:31:09.9592121Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/67290/merge:refs/remotes/pull/67290/merge

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@KodrAus

This comment has been minimized.

Copy link
Contributor

KodrAus commented Jan 27, 2020

Thanks for your patience @jonas-schievink! It looks like we're generally on-board with doing this, so I've got some time this week to give it a thorough 👀

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Feb 9, 2020

I agree with #67290 (comment) -- let's land this change once the implementation is reviewed, and then look into what it would take to make all panic inside Drop behave like a double panic. I think it should be doable. We'd be turning currently panicking programs into aborting programs which is no different from what this PR does.

@ssomers

This comment has been minimized.

Copy link
Contributor

ssomers commented Feb 10, 2020

Am I right that in the case of drain_filter, the fix and tests are not about leaks, but about the desired stamina of the DrainFilter iterators. The way it has been and is in master, inside a DrainFilter's own Drop,

  • When a panic occurs in the predicate, DrainFilter quits abruptly. It does not drain the element that freaked out the predicate, it does not do anything for any further element in the container. That behaviour stays (and is intuitive to me: you asked us to drain with a predicate, the predicate acts like a clown, so we're taking the easy way out).
  • When the predicate said "yes, drain", but a panic occurs in the drop of a drained element, DrainFilter also quits. That's not a leak, because the remaining elements are still happily owned by the container. That behaviour is about to change: DrainFilter will invoke the predicate on all further elements, and drain and drop them if the predicate says to.
Copy link
Contributor

KodrAus left a comment

From a safety perspective this looks valid to me. We're mostly just shifting existing logic into guards, and I can't spot any invalid behavior when those guards are triggered partway through a dropping operation, since they remain internally consistent up to that point.

@Dylan-DPC

This comment has been minimized.

Copy link
Member

Dylan-DPC commented Feb 25, 2020

@KodrAus what's blocking this from getting merged?

@KodrAus

This comment has been minimized.

Copy link
Contributor

KodrAus commented Feb 25, 2020

@Dylan-DPC I wanted to look at @ssomers comment about DrainFilter more closely, but am happy to push forward now. The path we're on in this PR is:

  • Panics during drop in all standard containers will try to continue dropping elements. If a subsequent element panics then, because we're already unwinding, we abort.
  • Chances are if one element panicked then the next one will too, so a panicky drop in a collection will just end up aborting in practice.
  • Panics in collections outside of drop don't do anything special.
  • The next step will be considering making panic in drop always abort. I imagine that would need an RFC.

@bors r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 25, 2020

📌 Commit e5987a0 has been approved by KodrAus

@ssomers

This comment has been minimized.

Copy link
Contributor

ssomers commented Feb 26, 2020

My comments were really confined to linked_list::DrainFilter. I now know that part of this PR is mimicking the behaviour of vec::DrainFilter, which is a good thing regardless of the ideal behaviour. For its implementation, vec::DrainFilter even has a panic_flag (which means "panic in predicate", not in drop). I don't doubt that vec::Drain (which takes a range argument) probably is easier to understand and to implement, if the specified range is always completely removed, rather than leaving half of it in the vector after an element's drop panicked.

But unlike Drain, the DrainFilters have this predicate to invoke, which could also panic. Everyone seems to agree that it's better they don't try to finish the job by invoking the predicate on the remaining elements. So you've already lost an equivalent of Drains guarantee. Why add code to provide some guarantee in the case of drop panic, other than container invariants and avoiding actual leaks? It's stating that:

v.drain_filter(|_| true); 

cannot leave any element behind because that would be as bad as a leak, while it's normal to see elements left behind after:

v.drain_filter(|_| {
    assert!(something_mostly_true());
    true
}); 

And if the guarantee is considered important enough to provide code for, it should be important enough to document (for vec and linked_list and probably more alike). Right now, drain_filter's documentation doesn't even say that a drop of the iterator will continue the drain, which I genuinely wondered about and which the documentation of the stable vec::drain does clarify.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 26, 2020

⌛️ Testing commit e5987a0 with merge 892cb14...

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 26, 2020

☀️ Test successful - checks-azure
Approved by: KodrAus
Pushing 892cb14 to master...

@bors bors added the merged-by-bors label Feb 26, 2020
@bors bors merged commit 892cb14 into rust-lang:master Feb 26, 2020
5 checks passed
5 checks passed
homu Test successful
Details
pr Build #20200119.34 succeeded
Details
pr (Linux mingw-check) Linux mingw-check succeeded
Details
pr (Linux x86_64-gnu-llvm-7) Linux x86_64-gnu-llvm-7 succeeded
Details
pr (Linux x86_64-gnu-tools) Linux x86_64-gnu-tools succeeded
Details
drop(pair);
mem::forget(guard);
}

unsafe {

This comment has been minimized.

Copy link
@programmerjake

programmerjake Mar 4, 2020

Should the stuff in this unsafe block also run when unwinding from user drop? It looks like it does more deallocation.

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Mar 5, 2020

FWIW, Miri can also detect memory leaks and I am running it on the liballoc test suite. It seems you are covered in terms of tests here, but if you think Miri could be helpful, please let me know.

Also, the latest run of Miri actually failed with a memory leak -- something landed during the last 9 days that made that happen. I'll have some fun figuring out the details.^^

@jonas-schievink jonas-schievink deleted the jonas-schievink:leak-audit branch Mar 6, 2020
}

let mut map = BTreeMap::new();
map.insert("a", D);

This comment has been minimized.

Copy link
@RalfJung

RalfJung Mar 6, 2020

Member

According to Miri, an allocation made during this call leaks:

note: created allocation with id 143756
    --> /home/r/src/rust/rustc/src/liballoc/alloc.rs:81:47
     |
81   |     __rust_alloc(layout.size(), layout.align())
     |                                               ^ created allocation with id 143756
     |
     = note: inside call to `std::alloc::alloc` at /home/r/src/rust/rustc/src/liballoc/alloc.rs:169:22
     = note: inside call to `<std::alloc::Global as std::alloc::AllocRef>::alloc` at /home/r/src/rust/rustc/src/liballoc/alloc.rs:203:15
     = note: inside call to `alloc::alloc::exchange_malloc` at /home/r/src/rust/rustc/src/liballoc/boxed.rs:175:9
     = note: inside call to `std::boxed::Box::<alloc::collections::btree::node::LeafNode<&str, btree::map::test_into_iter_drop_leak::D>>::new` at /home/r/src/rust/rustc/src/liballoc/collections/btree/node.rs:211:43
     = note: inside call to `alloc::collections::btree::node::Root::<&str, btree::map::test_into_iter_drop_leak::D>::new_leaf` at /home/r/src/rust/rustc/src/liballoc/collections/btree/map.rs:1333:25
     = note: inside call to `std::collections::BTreeMap::<&str, btree::map::test_into_iter_drop_leak::D>::ensure_root_is_owned` at /home/r/src/rust/rustc/src/liballoc/collections/btree/map.rs:1067:9
     = note: inside call to `std::collections::BTreeMap::<&str, btree::map::test_into_iter_drop_leak::D>::entry` at /home/r/src/rust/rustc/src/liballoc/collections/btree/map.rs:838:15
note: inside call to `std::collections::BTreeMap::<&str, btree::map::test_into_iter_drop_leak::D>::insert` at alloc_miri_test/../liballoc/tests/btree/map.rs:1038:5
    --> alloc_miri_test/../liballoc/tests/btree/map.rs:1038:5
     |
1038 |     map.insert("a", D);
     |     ^^^^^^^^^^^^^^^^^^

### LEAK REPORT ###
Alloc 143756: 00 00 00 00 00 00 00 00 __ __ 05 00 __ __ __ __ 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ (192 bytes, alignment 8) (Rust)
                                                              └──────(143732)───────┘                         └──────(143850)───────┘                         └──────(143963)───────┘                         └──────(144092)───────┘                         └──────(144237)───────┘ 
Alloc 143732: 61 (1 bytes, alignment 1) (Static)
Alloc 143850: 62 (1 bytes, alignment 1) (Static)
Alloc 143963: 63 (1 bytes, alignment 1) (Static)
Alloc 144092: 64 (1 bytes, alignment 1) (Static)
Alloc 144237: 65 (1 bytes, alignment 1) (Static)

Is that expected, or should all memory be properly freed?

This comment has been minimized.

Copy link
@ssomers

ssomers Mar 6, 2020

Contributor

The root node of the tree in this simple test case is not freed. If the intention is to free all BTreeMap nodes as well, then the DropGuard in IntoIter::drop should include the rest of the normal body that frees the last pile of nodes of the tree (but doesn't need the is_shared_root test because the shared root doesn't have any elements to drop).

Most nodes of the tree have already been freed rather secretly in next calls, and as far as I see, these nodes have already been freed before the last key-value pair in them is dropped (which sounds peculiar, but the key and value dropped are in fact last minute copies). But to be sure of this, some test should cover trees with more than just a root node.

This comment has been minimized.

Copy link
@RalfJung

RalfJung Mar 6, 2020

Member

I'll open an issue around this

This comment has been minimized.

Copy link
@RalfJung

RalfJung Mar 6, 2020

Member

Issue is at #69769

This comment has been minimized.

Copy link
@jonas-schievink

jonas-schievink Mar 6, 2020

Author Member

Yeah looks like this is what @programmerjake pointed out above

This comment has been minimized.

Copy link
@RalfJung

RalfJung Mar 6, 2020

Member

You mean this comment, I presume? Yeah that seems related.

}
}

let v = vec![D(false), D(true), D(false)];

This comment has been minimized.

Copy link
@RalfJung

RalfJung Mar 6, 2020

Member

There's another memory leak in this test: the Vec backing store does not get deallocated. See #69770.

@ssomers ssomers mentioned this pull request Mar 29, 2020
0 of 5 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

You can’t perform that action at this time.