Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsdist: Fix a double-free when a DoH cross-protocol response is dropped #11075

Merged

Conversation

rgacogne
Copy link
Member

@rgacogne rgacogne commented Dec 7, 2021

Short description

Introduced in 1.7.0, and only when a cross-protocol response to a DoH query is dropped by a rule.

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)

@rgacogne rgacogne added this to the dnsdist-1.7.0 milestone Dec 7, 2021
@Habbie Habbie self-requested a review December 7, 2021 16:47
Copy link
Member

@omoerbeek omoerbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks good.

I checked the places where DoHTCPCrossQuerySender and DoHCrossProtocolQuery objects are used, and noted one spot where an DoHTCPCrossQuerySender object is first created and then the get() is done: line 1328. I do not think it is wrong, but the pattern looked strange to me.

@rgacogne
Copy link
Member Author

rgacogne commented Dec 8, 2021

Thanks Otto! I'll review all the places where we touch the reference counter to check that we are doing the right thing. I'll also check one more time if we cannot use a smart pointer instead but h2o's API make that hard, if not impossible.

@rgacogne
Copy link
Member Author

rgacogne commented Dec 8, 2021

It was indeed leaking when a UDP response to a DoH query was truncated. I pushed three commits:

  1. fixing that leak
  2. making it easier to understand which function has ownership of the object
  3. wrapping the object in a unique_ptr whenever possible, reducing the amount of manual book-keeping

@rgacogne
Copy link
Member Author

rgacogne commented Dec 8, 2021

2171e7c does not play well with TSAN, investigating.

As far as I can tell this is not actually needed, as we decrement
it right away, but it prevents TSAN from reporting a race when the
UDP response comes very fast, is truncated, and the query is then
passed to a TCP worker. TSAN seems to think that the thread is still
sending the UDP query when we touch it again in the TCP worker, which
does not really make sense to me.
My guess is that the memory barrier needed to update the ref counter
makes TSAN happy, but I might be missing something.
@rgacogne
Copy link
Member Author

rgacogne commented Dec 9, 2021

Fixed for now, it looks like TSAN is confused by the fact that the answer from the backend might come back in a UDP responder thread before the send() syscall returned in the initial thread.

@omoerbeek
Copy link
Member

In doh.cc:

/* you can't touch du after this line, because it might already have been freed */

still du is accessed in the block below (this is not part of this PR but it caught my eye anyway). Either the comment should be adapted or something else is going on.

@rgacogne
Copy link
Member Author

The comment should say that du can't be touched if that call returns a non-negative value, right?

Copy link
Member

@omoerbeek omoerbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small remarks. Ref counting is inherently a bit of a scary subject as it is easy to get wrong, but I do think this approach is better than before.

pdns/dnsdist.cc Outdated Show resolved Hide resolved
pdns/dnsdistdist/doh.cc Show resolved Hide resolved
pdns/dnsdist-idstate.hh Outdated Show resolved Hide resolved
@@ -241,7 +241,7 @@ struct IDState
std::unique_ptr<QTag> qTag{nullptr}; // 8
boost::optional<uint32_t> tempFailureTTL; // 8
const ClientState* cs{nullptr}; // 8
DOHUnit* du{nullptr}; // 8
DOHUnit* du{nullptr}; // 8 (not a unique_ptr because we currently need to be able to peek at it without knowing taking ownership until later)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use duNaked or something like that to make it very clear it's not a smart pointer when using it as other instances of du hold a unique_pointer.

@rgacogne
Copy link
Member Author

Thanks, Otto! I pushed a few commits following your advice, I'm still pondering the duNaked one, I can't decide whether I like that idea.

@omoerbeek
Copy link
Member

The comment should say that du can't be touched if that call returns a non-negative value, right?

yes, that is more clear

@rgacogne
Copy link
Member Author

I'm going to merge this without changing the duNaked stuff because I'd like to release rc1 quickly, and also because I hope to be able to use unique_ptrs everywhere in 1.8.0 anyway.

@rgacogne rgacogne merged commit ec10777 into PowerDNS:master Dec 15, 2021
@rgacogne rgacogne deleted the ddist-fix-dropped-doh-cross-responses branch December 15, 2021 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants