Make it possible to temporarily store RefCounted object in a Ref/RefPtr inside its destructor #8748
…tr inside its destructor
https://bugs.webkit.org/show_bug.cgi?id=250746

Reviewed by NOBODY (OOPS!).

This patch removes debug / security assertion that m_deletionHasBegun is not set in ref() / deref() to allow temporarily storing a RefCounted object in a Ref/RefPtr inside its destructor. Instead, assert that m_refCount is 1 at the end of destructor to make sure there is no outstanding external references at the end of its destruction.

Apply the same fix to ThreadSafeRefCounted as well as Node.

* Source/WTF/wtf/RefCounted.h:
(WTF::RefCountedBase::ref const):
(WTF::RefCountedBase::hasOneRef const):
(WTF::RefCountedBase::~RefCountedBase):
(WTF::RefCountedBase::derefBase const):
* Source/WTF/wtf/ThreadSafeRefCounted.h:
(WTF::ThreadSafeRefCountedBase::~ThreadSafeRefCountedBase):
(WTF::ThreadSafeRefCountedBase::ref const):
(WTF::ThreadSafeRefCountedBase::hasOneRef const):
(WTF::ThreadSafeRefCountedBase::derefBase const):
* Source/WebCore/dom/Node.cpp:
(WebCore::Node::~Node):
* Source/WebCore/dom/Node.h:
(WebCore::Node::ref const):
(WebCore::Node::deref const):
* Tools/TestWebKitAPI/Tests/WTF/RefPtr.cpp:
(TestWebKitAPI::StoreRefInDestructor::create):
(TestWebKitAPI::StoreRefInDestructor::~StoreRefInDestructor):
(TestWebKitAPI::StoreRefInDestructor::helperFunction):
(TestWebKitAPI::WTF_RefPtr.RefInDestructor):
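For readers who have not looked at the diff, here is a minimal sketch of the behavior the commit message describes. It is not the actual WebKit code: the classes are simplified stand-ins for `WTF::RefCountedBase` / `WTF::RefCounted`, and `Balanced`, `Leaky`, `destroyAndCheckForOutstandingRef` and the global flag are hypothetical names. The flag stands in for the release assertion so that the failure path can be observed without crashing.

```cpp
#include <cassert>

// Stands in for the release assertion; a real build would crash instead.
static bool g_outstandingRefAtEndOfDestruction = false;

// Hypothetical, simplified stand-in for WTF::RefCountedBase.
class RefCountedBase {
public:
    void ref() const { ++m_refCount; } // no check of a "deletion has begun" flag
    bool hasOneRef() const { return m_refCount == 1; }

protected:
    RefCountedBase() = default;
    ~RefCountedBase()
    {
        // Runs after all derived destructors have finished, so any ref taken
        // during destruction must have been released by now.
        if (m_refCount != 1)
            g_outstandingRefAtEndOfDestruction = true;
    }

    // Returns true when the caller should delete the object.
    bool derefBase() const
    {
        if (m_refCount == 1)
            return true; // the count stays at 1 while the destructor runs
        --m_refCount;
        return false;
    }

private:
    mutable unsigned m_refCount { 1 };
};

template<typename T>
class RefCounted : public RefCountedBase {
public:
    void deref() const
    {
        if (derefBase())
            delete static_cast<const T*>(this);
    }
};

// Takes and releases a temporary ref inside its destructor: allowed.
struct Balanced : RefCounted<Balanced> {
    ~Balanced() { ref(); deref(); } // 1 -> 2 -> 1, no second delete
};

// Takes a ref inside its destructor and never releases it: detected.
struct Leaky : RefCounted<Leaky> {
    ~Leaky() { ref(); }
};

template<typename T>
bool destroyAndCheckForOutstandingRef()
{
    g_outstandingRefAtEndOfDestruction = false;
    T* object = new T; // starts with a ref count of 1
    object->deref();   // drops the last ref and runs the destructors
    return g_outstandingRefAtEndOfDestruction;
}
```

Because the count is left at 1 for the whole destructor, a balanced ref/deref pair never reaches the deleting branch again, which is what keeps the temporary ref from causing a double delete.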
EWS run on current version of this PR (hash fdc168d)
|
Normally, the idea is that objects can be half-destroyed in the destructor (because subclasses). So it is unsafe to Ref them back, or to use them in general. Are RefCounted objects never subclassed? Or is there some other reason why this scenario is meaningful? |
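To make the "half-destroyed" concern concrete, here is a small sketch in plain C++ (none of these names come from WebKit). By the time the base class destructor runs, the derived part is already gone, so a helper that receives the object sees it behave as a `Base`:

```cpp
#include <cassert>
#include <string>

struct Base;
std::string describe(const Base&);

static std::string g_nameSeenInBaseDestructor;

struct Base {
    virtual ~Base()
    {
        // The Derived part (including its members) has already been destroyed.
        g_nameSeenInBaseDestructor = describe(*this);
    }
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string label { "Derived" };
    std::string name() const override { return label; }
};

// An ordinary helper that knows nothing about destruction.
std::string describe(const Base& object)
{
    return object.name(); // dispatches to Base::name() when called from ~Base()
}

std::string nameSeenWhileDestroyingDerived()
{
    {
        Derived d;
    } // ~Derived() runs first, then ~Base()
    return g_nameSeenInBaseDestructor;
}
```

The language rule protects the virtual call here, but any raw pointer the helper had kept into `Derived`'s members would be dangling at this point.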
As we deploy mechanical rules that require use of smart pointers, with local verifiability, we have started triggering cases where we ref and deref inside a destructor. Once our transition to smart pointers is complete, it will be impossible to call any non-self-member function that puts 'this' inside a local variable without triggering such a case. So, our options are:
Of those options, removing this assertion seems to be the least bad option. To the extent that the assertion may sometimes catch use after partial or complete destruction, probably ASan, Guard Malloc, and fuzzing are better testing options for discovering that error, and smart pointers are a better implementation option for preventing that error. |
r=me
Assuming that the "non-self-member functions" you are thinking about are virtual function calls, my implication is that this is a must regardless of whether we hit an assertion or not. It's a bug to do this, so eliminating that is necessary, not a "bad option". |
Overall, this patch is a bad idea, as it eliminates a critically important safety check. Please find a better solution. |
I verified that this assertion has caught serious bugs in the past, perhaps more than any other single runtime check in our code. I don't see how it makes sense to eliminate it.
Since you're standing in the way of progress on smart pointers, it's not sufficient to prove that this assertion has caught serious bugs in the past. You need to prove that this assertion has caught more serious bugs than smart pointers have avoided -- and that smart pointers would not also prevent those bugs. |
I'm not talking about virtual function calls. I'm talking about calling any function, including a stand-alone function or a member function on another object. It's not a bug to call a function inside a destructor. |
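A sketch of the situation being described, under stated assumptions: `Counted`, `Ref`, `Element` and `describe` are hypothetical stand-ins, and the old debug assertion is modeled as a counter rather than a crash. A stand-alone function that follows the "hold it in a smart pointer" rule ends up calling `ref()` on an object whose destructor is already running:

```cpp
#include <cassert>

// Counts how often the old "deletion has begun" assertion would have fired.
static int g_refsAfterDeletionBegan = 0;
static int g_lastDescribedId = 0;

class Counted {
public:
    virtual ~Counted() = default;
    void ref()
    {
        if (m_deletionHasBegun)
            ++g_refsAfterDeletionBegan; // the old assertion would fire here
        ++m_refCount;
    }
    void deref()
    {
        if (m_refCount == 1) {
            m_deletionHasBegun = true;
            delete this;
            return;
        }
        --m_refCount;
    }

private:
    unsigned m_refCount { 1 };
    bool m_deletionHasBegun { false };
};

// Minimal non-null smart pointer, loosely modeled on the idea of WTF::Ref.
template<typename T>
class Ref {
public:
    explicit Ref(T& object) : m_ptr(&object) { m_ptr->ref(); }
    ~Ref() { m_ptr->deref(); }
    Ref(const Ref&) = delete;
    Ref& operator=(const Ref&) = delete;
    T* operator->() const { return m_ptr; }

private:
    T* m_ptr;
};

class Element : public Counted {
public:
    ~Element() override;
    int id { 7 };
};

// Not a virtual function and not a member of Element; it simply keeps its
// argument alive in a local smart pointer, as the style rule asks.
int describe(Element& element)
{
    Ref<Element> protectedElement(element);
    return protectedElement->id;
}

Element::~Element()
{
    g_lastDescribedId = describe(*this);
}

int destroyElementAndCountLateRefs()
{
    Element* element = new Element;
    element->deref(); // last ref: runs ~Element(), which calls describe()
    return g_refsAfterDeletionBegan;
}
```

The helper itself is harmless; the only thing that trips the old check is that it was reached from a destructor.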
If you have a suggestion for how to require smart pointers everywhere without triggering a ref inside a destructor, please do write it down (so I can submit it to a journal and apply you for a Turing award). Hand-waving at "better" is not a meaningful form of patch review. |
I have a lot of words to say in response to "standing in the way" and "Turing award", which I will keep to myself for now. Perhaps a way to make this discussion constructive is to present an example where you think this is necessary. You said "calling any function", but the proposal in webkit-dev doesn't involve wrapping function arguments in RefPtrs. The latest proposal was "Use smart pointers in all local variables and heap allocated values." Can you demonstrate an example where this requires Ref'ing half-deleted objects? |
Wrapping each function argument in a Ref / RefPtr is the eventual goal.
For example, A whole bunch of code inside |
Why do we need to make this change now in support of an eventual goal?
Thank you, this is a good concrete example. I can't mentally trace the execution chain that gets us there, but the existing code smells bad:
Adding To be clear, I agree with the desire to stop using a raw pointer here. Do you envision next steps that would prevent these use-after-frees caused by half-destructed objects?
If I understand it correctly, changing that is also only an eventual goal. |
Because we want to deploy such code changes now as a way of evaluating either approach. Without this change being landed, doing that is rather difficult. We'd basically split the code base into two categories: one that gets called in all but inside destructors. One that gets used and should only be used inside destructors. That kind of code duplication seems rather counterproductive and may lead to new kinds of bugs.
No. The element in question isn't anywhere near the state of destruction but Our previous thought was that this is bad code, as you say, and that we can mitigate this by avoiding "interesting" work inside destructors. The problem, we find, is that there are just too many cases where this comes up.
It would definitely make this code safe, and that matters because
That's not really a use-after-free of memory region per se but rather use of a destructed object that's yet to be reclaimed by the heap allocator.
No, we want to make this change now, not in the future. |
This is not making sense to me. That's what branches are for, not trunk.
The destructed object can/will have pointers to deallocated memory. This is where the use after free is coming from. |
What do you mean by that? Could you elaborate?
Not with our smart pointers because we clear the pointer values inside destructors. Our experience with iso-heap proves that there is a clear advantage in not allowing re-use of deallocated memory. The benefit of systematically preventing the use of destructed but not yet deallocated memory is not yet clear. |
I'm not sure what kind of evaluation requires making changes on trunk, and not even behind a disabled feature flag. Typically, that's not how we evaluate. Starting with a change that disables an important security check is particularly suspect, and is not business as usual.
A lot of classes don't use our smart pointers for data members. They use std::unique_ptr, or HashTable, or other classes that don't zero out pointers in their destructors. This is just a plain UAF, nothing novel. |
We're talking about the deployment of smart pointers here, not an addition of new web facing API or behavior changes (e.g. live range selection). I don't think we could evaluate the value of deploying smart pointers without deploying them. Our prior experience with introducing more smart pointers in local variables proved to be a useful mitigation strategy for security bugs because we later found dozens of use-after-frees that were fixed by those deployment patches.
I'm not certain I'd agree with your characterization of "an important security check". We're at best checking that we're not storing an object that has started its destruction in a
That seems like an argument for re-introducing our own WTF::UniquePtr. In the case of HashTable, I'm failing to see why not landing this patch will mitigate any security bugs thereof since, again, the current assertion is only enabled in debug / ASAN builds. I'm adding new release assertions to catch cases where we were previously doing use-after-free so landing this patch seems decidedly in favor of preventing security bugs. |
So... What is the evaluation plan? Here is what I'd expect:
Progressing beyond stage 1 would be a big achievement. If "smart pointers everywhere" is applied to a hard case (e.g. affecting the Document destructor), it will likely cause enough code flow changes to require refactoring that's bigger than what you previously considered too complicated.
If making all destructed subclass members safe to use is a requirement for success here, that needs to be part of a plan, with a review of what it actually takes to get there. It's worth a separate discussion that isn't buried in review comments, but zeroing out the pointers is not a workable solution, in my opinion. Relying on zero pointer dereference as a security defense doesn't work in general, because the compiler will optimize out undefined behavior. It's particularly unlikely to work here, as much of the code gets inlined, thus being visible to the optimizer. |
Indeed, the assertion is fairly weak, and mostly tells one that they got too happy with overburdening their destructors. Yet, it caught actual security bugs enough times for us to be serious about what we replace it with. In general, those huge destructors like |
I agree with ap. The assertions have been useful. In
|
Hi all, it seems to me the controversial assertion doesn’t actually prevent inappropriate use of RefCounted objects within the destructor, or even keeping a (raw) pointer to them beyond their lifetime. It just prevents calling ref/deref, with the goal, I guess, of preventing confusing pseudo-resurrection scenarios where the object ends up destructed but the ref count is bumped back above 0 by the time destruction is done. I think pseudo-resurrection can be defended against in a way that still allows temporary refs during the destructor: add an assertion to the end of deref() that after the destructor is called, the object doesn’t have more than the expected refs. It might even be possible to make this a RELEASE_ASSERT, since this shouldn’t be the fast path of deref(). Or maybe it should go in ~RefCountedBase? If that’s the most-base class, its destructor should be called last, right? (Maybe this doesn’t work with multiple inheritance). I guess another thing we’d want to ensure is that, even with temporary refs, we don’t re-enter the destructor, so deref() has to be smart enough to not delete the pointer again when a ref/deref pair occurs on an object that’s already in the course of being destructed. I think these changes would add the same actual safety guarantees as the current assert (maybe better, if it could be a RELEASE_ASSERT and not just ASSERT_WITH_SECURITY_IMPLICATIONS), without conflicting with the purpose of this patch. |
Right, and it only does so in debug / ASAN builds.
This PR does exactly that in RefCountedBase.
This is already the case. See https://commits.webkit.org/216324@main for example. The last
Right. As a matter of fact, this PR does exactly that. In lieu of ASSERT_WITH_SECURITY_IMPLICATIONS, we have a release assert that ref count is 1 at the end of RefCountedBase's destructor. |
Right, the PR already does what's proposed. I don't think that it's useful as a replacement, as we need to prevent using data members from super-classes that are already destructed. Guarding against (pseudo-)resurrection or double deletes is separate. The goal and result of this PR is allowing more unsafe code, which is why I object to it. |
I don’t see how the old assert prevents using data members from super-classes that are already destructed. Downcast operators (whether built in or our custom ones) wouldn’t do a OTOH landing this patch and then mechanically deploying smart pointer correctness would allow us to find cases where any functions outside the class do anything at all with a half-destructed object and do a project to eliminate such cases comprehensively, because they’d have to ref(). In theory we could do that incrementally first and then comprehensively deploy smart pointers. But I suspect many of the functions doing access without taking a ref during destruction are also used outside destruction, and their lack of ref may lead to latent memory corruption bugs in other situations. I don’t think it’s good to delay fixing those memory corruption bugs. I’ll add that it’s not clear to me how easy it would be to code around the need to call a nontrivial function in a destructor. It’s probably a bigger project in the average case than deploying a smart pointer to a site that is missing one. It’s probably not good to block a relatively easy project with security benefits behind a harder project with security benefits. Maybe what we need to do is propose and schedule a project for when we do the cleanup of suspiciously complex code in destructors. One more thought: deploying smart pointers comprehensively removes an entire class of bug (UAF of RefCounted objects), while the assert being removed here only finds some subset of a class of problems (inappropriate use of half-destructed objects) by chance. |
Tell me more! What subsystem is the plan fully implemented in, and how did it go?
Very good point. Let me think a bit. |
Even disregarding the argument elsewhere that this patch would only allow RefPtr in paths that don’t change lifetime… This seems like a fully general argument against fixing correctness of ref counting in our code, which surely cannot be right. We can’t hold back from fixing security bugs simply because in theory it could cause indirect effects that might cause non-obvious regressions. We have testing to protect us from regressions. Also, this argument is no longer about this assert directly providing any protection. It’s about keeping the assert specifically to impede the smart pointer checker project. That doesn’t seem like a good reason to r- the patch. If smart pointer checking is bad, actually, that is a case that needs to be made in the context of reviewing our overall plan for improving security, not this one patch review.
Ok, greatly narrows it to this rare case. (But I’m not sure this is actually possible b/c once the derived class destructor has run, I don’t think anything will downcast the object from the base class to the derived class, C++ will treat it as if it’s now the base class. Which could still cause bugs, but they would be more subtle logic bugs.)
I wouldn’t bet on this being true. To the extent it is, it’s thanks to use of smart pointers (single ownership and weak pointer variants as well as ref counting variants).
Security bugs existing due to missing ref counting is proven (unless you believe we have fixed the last ever). Attackers looking to exploit WebCore grep for raw pointers. Smart pointers fixing individual security bugs has also been proven. And we also know code that correctly uses smart pointers has not been the site of this type of UAF type confusion bug, at least in bugs found by or reported to us. I am not sure what remains to prove. That adding smart pointers en masse won’t change this? But we have already run that experiment by adding smart pointers en masse based on manual review, or on earlier versions of the tool. Not sure what could provide more evidence short of doing the whole project. |
Define 'subsystem', 'fully implemented', and the success criteria for 'how did it go'. We have 25607 uses of RefPtr, 6076 uses of RetainPtr, and 3861 uses of WeakPtr in WebKit. And, again, we adopted ARC, which has the same semantics as this patch (weaker, actually), in all of Safari. To the extent that we used to have use after free bugs all the time, and we now have them much more rarely, I would say it went really well. To repeat my question above, what additional adoption would be bigger than the adoption we've done so far, and yet still 'small' and yet still 'complete', and how would you evaluate whether it was ready to merge back to main? |
I don't think that's on me to define. What is your plan? Ryosuke said that this needed to be landed in order to evaluate, and sounds like there is no plan to evaluate anything, given this quoted question, and your other comments.
Adding RefPtrs where engineers found them to make sense demonstrates nothing about what will happen when they are deployed consistently as a consequence of formal policy. This policy is almost certainly going to be incomplete/wrong in its first iteration at least. Did you try to actually make any code at all to work the way it's envisioned for all WebKit code to converge towards? ARC is at best a philosophical example. The language is different enough (notably, well defined behavior of sending messages to nil), and toolchain support is on a different level. Still thinking about Ryosuke's and Maciej's comments. |
webkit.org/b/250637 / rdar://104274947 is an example of a security bug that got fixed by previous mass deployment of smart pointers. We've found numerous other cases where use-after-free bugs were fixed by other patches Jiewen and I landed a few years ago.
To be fair, we've found such an example in the past despite these assertions. When Jiewen & I mass deployed smart pointers, new security bugs were introduced by way of double deleting inside some destructors. This occurred because we used to let This patch improves upon this learning experience by introducing a new release assert for |
Sounds like this is an argument against deploying types of smart pointers that can affect object lifetime like Ref and RefPtr in general. To your credit, we know such a deployment could introduce new security bugs as I've outlined in #8748 (comment) until we made the aforementioned change to keep And again, even if there was such a concern / risk, I'd argue that such a concern is orthogonal to the PR at hand here. This PR removes assertions which will be hit if someone attempts to store an object which has begun its destruction in |
Thank you Ryosuke, I really appreciate your constructive and specific answers. I agree that my concern about introducing bugs by changing object lifetime is orthogonal to this PR. But reading back through this discussion, I find myself at a loss - the debate is way too fluid, and it's not clear to me if positions are getting clarified as part of the debate, or if you, Geoff and Maciej are talking about different things. It feels more like the latter. May I confirm what the points are with you?
To illustrate why I'm asking this:
|
The reason we need to land this PR is so that we can make more functions return smart pointer types instead of raw pointers & references. The evaluation process we've used in the past and we would like to use here is:
It's hard to do (2) on a branch other than
The plan is for me to experiment deploying smart pointers in more places using new rules and observe what kind of new smart pointer features are needed, or what kind of regressions we may unexpectedly introduce.
Right. Jiewen's patch was in no way a comprehensive deployment of smart pointers. Yet, it still informed us of the benefit of deploying smart pointers in local variables because it fixed a real security bug without us even noticing it. Jiewen and I have also landed other patches which made some functions return smart pointer types as well, and that's the basis of why I'm suggesting to do the same in new code.
Right, we want to deploy smart pointers everywhere eventually using the comprehensive rule we came up with a few years ago: https://lists.webkit.org/pipermail/webkit-dev/2020-September/031386.html I don't think the tooling is up to that level of adoption yet so I'm proposing to adopt a smaller scope of using smart pointers in local variables & member variables in new code - new code because deploying smart pointers in new code (hopefully) shouldn't introduce (perf or not) regressions in existing code. This PR will help this limited scope adoption by allowing local variables in functions that get called inside destructors to also use smart pointers. But most importantly, this PR will allow us to experiment with making more functions return smart pointers instead of raw pointers or references. As mentioned above, Jiewen has landed such a patch in the past, and we have some experience with it but one of the major roadblocks in deploying such code changes was the use of accessor functions getting called in destructors. Of course, we can deploy more smart pointers in local variables and data members with the existing
The way to avoid reference cycle is to use
I don't see why we can't adopt something like MiraclePtr if there is a valid use case for it in WebKit. Blink does employ a number of novel approaches to memory management including but not limited to an LLVM plugin that automatically generates write barriers for generational garbage collection in Oilpan (their garbage collector for C++ objects). Geoff and I have discussed the pros and cons of adopting such an approach in the past, for example, but we've concluded that what we're planning to do (as outlined in the webkit-dev thread) is pretty much isomorphic to what they did; it's just that we're doing it manually in the code base instead of doing it automatically by way of a compiler plugin. |
Thank you, this is very helpful. Looks like there are two directions being explored in parallel, both sensible:
To me, a function that returns This would be impractical if the next step was to do automatic code replacement across the project, but with the actual next steps that are planned, this doesn't seem like a crazy thing to take a look at.
This is not very related to this PR, however reading the thread made me wish for two things:
|
Perhaps we should continue this discussion in webkit-dev or #8907. But we already have a lot of experience deploying smart pointers for local variables & member variables so I'm pretty confident that we can deploy this style rule although we may need to make exceptions for things like
Sure, let me go find such a code change.
Good points. Would you mind replying to the email thread? Perhaps I should write some document about how smart pointers should be used ideally, and outline our plan to get there? |
Here's a patch to make With this patch applied, we hit the following assertion when running layout tests:
|
When |
This assertion is about RefPtr, not WeakPtr. |
Look at the patch and the backtrace in #8748 (comment) . |
I don't follow how that relates to Document's destructor at all. |
The assertion failure in #8748 (comment) is about |
I just realized that this patch isn't consistent with the plan posted to webkit-dev a couple weeks ago:
But then again, this exception for trivial functions is not in https://github.com/WebKit/WebKit/wiki/Smart-Pointer-Usage-Rules. Could you please clarify if it's still PoR? Will we have trivial functions return I've started looking at the crash in a debugger, and another observation is that |
Again, the plan here is to deploy smart pointers in more places than what I was suggesting in that thread. This PR is not needed to adopt the proposal to deploy smart pointers for data members & local variables. The point of this PR is to allow more deployment of smart pointers beyond those.
That's covered here:
|
C++ |
Sorry, I may not be following. https://github.com/WebKit/WebKit/wiki/Smart-Pointer-Usage-Rules also doesn't say that function returns will be smart pointers, as far as I can tell. So why do we want form-ref-ptr.patch, specifically this part:
|
We may want that as an alternative to (3) storing every function argument in a smart pointer. |
I see your point that it would be nice to tighten the behavior of WeakPtr, such that WeakPtr becomes null right at the beginning of destruction rather than some time later. Do you think that change should block further deployment of RefPtr? If so, why? |
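As a rough illustration of what "WeakPtr becomes null right at the beginning of destruction" could look like, here is a sketch. It is not how `WTF::WeakPtr` is implemented; `WeakImpl`, `WeakFormPtr`, `Form`, `Control` and `formWillBeDestroyed` are hypothetical names, and `std::shared_ptr` is used only to share the control block.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Form;

// Shared control block: weak pointers reach the object only through it.
struct WeakImpl {
    Form* object;
};

class WeakFormPtr {
public:
    WeakFormPtr() = default;
    explicit WeakFormPtr(std::shared_ptr<WeakImpl> impl) : m_impl(std::move(impl)) { }
    Form* get() const { return m_impl ? m_impl->object : nullptr; }

private:
    std::shared_ptr<WeakImpl> m_impl;
};

static bool g_weakWasNullDuringDestruction = false;

struct Control {
    WeakFormPtr form;
    // Called while the form's destructor is running.
    void formWillBeDestroyed() { g_weakWasNullDuringDestruction = !form.get(); }
};

class Form {
public:
    explicit Form(Control& control)
        : m_impl(std::make_shared<WeakImpl>(WeakImpl { this }))
        , m_control(control)
    {
    }

    ~Form()
    {
        m_impl->object = nullptr;        // revoke weak pointers first
        m_control.formWillBeDestroyed(); // later work cannot reach the form through them
    }

    WeakFormPtr weakPtr() const { return WeakFormPtr(m_impl); }

private:
    std::shared_ptr<WeakImpl> m_impl;
    Control& m_control;
};

bool weakPtrIsNullDuringDestruction()
{
    Control control;
    {
        Form form(control);
        control.form = form.weakPtr();
        assert(control.form.get() == &form); // valid while the form is alive
    }
    return g_weakWasNullDuringDestruction;
}
```

With this ordering, code reached from the destructor cannot turn a weak pointer back into a strong reference to the half-destroyed object, because the weak pointer already reads as null.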
There is no problem using more Ref and RefPtr unless you convert WeakPtr to RefPtr in the destruction phase. In WebKit/Source/WebCore/html/FormListedElement.cpp Lines 120 to 122 in c4ef868
|
Sure, I think you are restating the point that tightening the behavior of WeakPtr could make this assertion go away in this instance. Do you think that change should block further deployment of RefPtr? If so, why? Are you concerned about an attempt to retain an object past the end of destruction? Please note that this patch still asserts if a RefPtr retains an object past the end of destruction. Even if we tighten the behavior of WeakPtr, this assertion will also fire any time we adopt RefPtr where we were previously using raw pointer. Is your proposal that converting all raw pointers to WeakPtr should also block further deployment of RefPtr? If so, why?
Unfortunately, I believe what you are proposing here is a classic memory safety error -- exactly the kind of error we are trying to resolve by further deploying RefPtr. If you do not retain the pointer you are comparing, it may be recycled by the memory allocator, and then you can get a false positive ==. This is sometimes a serious security bug. You can try to stick to programming with raw pointers by reading every line of every function transitively called in this loop, and reasoning out that no line will free any objects. However, there are a few problems with that approach:
Does that change your mind? |
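A small sketch of the "retain before you run other code" point, using `std::shared_ptr` as a stand-in for `RefPtr`; `Form`, `Control`, `removeAllControls` and `controlSurvivesCallback` are hypothetical names. It does not reproduce address recycling itself. It only shows that a retained object stays alive (so its address cannot be handed out again) while unrelated code drops the owner's reference.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Control {
    explicit Control(bool& destroyedFlag) : destroyed(destroyedFlag) { }
    ~Control() { destroyed = true; }
    bool& destroyed;
};

struct Form {
    std::vector<std::shared_ptr<Control>> controls;
};

// Stands in for arbitrary code reached from a loop body that happens to
// drop the form's own references.
void removeAllControls(Form& form)
{
    form.controls.clear();
}

// Returns whether the control was still alive after the callback ran.
bool controlSurvivesCallback(bool retainFirst)
{
    bool destroyed = false;
    Form form;
    form.controls.push_back(std::make_shared<Control>(destroyed));

    std::shared_ptr<Control> retained;
    if (retainFirst)
        retained = form.controls.front(); // take a ref before running other code

    removeAllControls(form);
    return !destroyed;
}
```

Without the retained reference, the only thing keeping a later pointer comparison meaningful is the reader's belief that nothing in the callback frees the object.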
You are misunderstanding. Do you think formOwnerRemovedFromTree has the classic memory safety error because it's using a raw pointer? It has no problem even with a raw pointer or a weak pointer. The pointer is still valid in the function. We should discuss the problem with concrete examples, like formOwnerRemovedFromTree. |
Yes.
How did you reach this conclusion? Was it by reading every line of every function transitively called in this loop, and reasoning out that no line will free any objects (now or in the future)? If so, please see my explanation above for why that is not a sustainable way to program a web browser (or probably any other software).
Sure, it's sometimes helpful to look at concrete examples. However, if you only look at individual examples, you will "miss the forest for the trees" (not understand or appreciate a larger situation, problem, etc., because one is considering only a few parts of it). The motivation for deploying more idiomatic use of RefPtr, and writing coding style guidelines and verification tools that can locally verify appropriate use of RefPtr, is that even though engineers can and do reason about raw pointers successfully in many individual cases, on the whole the practice of doing so leads to hundreds of use after free bugs per year in the WebKit project. The task of more widely deploying RefPtr is motivated by that global pattern. Do you have a proposal for avoiding the next hundred use after free bugs in WebKit without deploying more idiomatic use of RefPtr? |
Show us the testcase that demonstrates the bug. |
Closing this in favor of https://commits.webkit.org/273805@main |