auth: Do an ANY lookup for all types then filter #9007
Conversation
Our metronome templates currently use …
Yes, if only because this PR currently does up to two lookups into the records cache (ANY, then the specific type) because of the SOA case, but only one will go to the backend. The 'uncached' SOA query is not accounted for in …
Failure of ci/circleci: test-auth-regress-gpgsql is unrelated (ALIAS).
Failure of test-auth-regress-gsqlite3 this time:
Ugh, ALIAS was quite stable in CircleCI for a while. I'll see if I can make the ALIAS target something local.
I looked into that idea a bit, but it's complicated by the fact that some backends (bind and …
That optimisation has been requested for the pipebackend; I did not check whether we do that today.
The more I look into this "optimization", the less I'm convinced it ever worked that way in the bind backend.
Unless I'm missing something huge, none of the backends I know of (including the unmerged DLSO and Cassandra backends) overrides …
pdns/ueberbackend.cc
```cpp
for (auto& rec : anyRecs) {
  if (q.qtype.getCode() == QType::ANY || rec.dr.d_type == q.qtype.getCode()) {
    rrs.push_back(std::move(rec));
```
emplace_back?
Won't the move-constructor be called in the exact same way in both cases?
Possibly. I just wonder if it could be rrs.emplace_back(rec)?
That would allocate and copy two DNSNames and one shared_ptr<DNSRecordContent> instead of moving them, so I don't think that would be better.
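For illustration, a minimal self-contained sketch of this point, with a hypothetical Rec type standing in for the record type in the diff above (the real type owns DNSNames and a shared_ptr<DNSRecordContent>, so copies cost more than moves):

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the record type: it owns heap-allocated data,
// so the copy constructor allocates while the move constructor does not.
struct Rec {
  std::string name;
};

int main() {
  std::vector<Rec> anyRecs{{"www.example.com."}, {"mail.example.com."}};
  std::vector<Rec> rrs;

  for (auto& rec : anyRecs) {
    // rrs.emplace_back(rec) would still invoke the copy constructor,
    // because 'rec' is an lvalue; emplace_back only saves work when
    // constructing in place from constructor arguments.
    rrs.push_back(std::move(rec)); // move constructor: steals the buffer
  }
  std::cout << rrs.size() << " records moved\n";
  return 0;
}
```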
After discussing it with @mind04 on the IRC channel, this might cause issues with some setups involving multiple backends. Unless we are willing to take the risk of breaking some setups, we should either limit this new behaviour to single-backend setups (easiest) or make it configurable (enabling the optimization for most multi-backend setups while not breaking the others).
If I read the diff right, only setups that have the same name in multiple backends could be affected?
Yes, my worry is that it might be more common than we think for custom backends, in order not to implement DNSSEC there and instead delegate that to a second backend. That's quite ugly in my humble opinion, since it requires duplicating the names and types in the second backend, but it clearly exists.
My problem with this is: there are only rumors of this existing, and nobody can say whether it would be a supported thing to do or not. The mere rumor of it existing prevents making stuff better for everyone else.
Limiting this behaviour to single-backend setups is a nice middle ground. And maybe add a configuration option to enable this for multi-backend setups as well.
I just wonder if this approach can be extended to fetch all RRs of a zone from the backend, filtering on the label afterwards. Then all RRs of a zone can be put in the query cache. And if PDNS knows that the query cache contains either all or none of a zone's RRs, it could defeat random subdomain attacks, since an NXDOMAIN would simply result from not finding the requested label in the query cache, without having to ask the backend for every query.
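A rough sketch of that idea, with made-up types and names (this is not PowerDNS code, just an illustration of the "complete zone in cache implies NXDOMAIN without a backend query" reasoning):

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

struct Record {
  std::string qname;
  uint16_t qtype;
  std::string content;
};

struct ZoneCache {
  bool complete = false;                     // holds all RRs of the zone, or none
  std::multimap<std::string, Record> byName; // keyed on owner name

  // nullopt means "ask the backend"; an empty vector means NXDOMAIN.
  std::optional<std::vector<Record>> lookup(const std::string& qname) const {
    auto range = byName.equal_range(qname);
    if (range.first != range.second) {
      std::vector<Record> out;
      for (auto it = range.first; it != range.second; ++it) {
        out.push_back(it->second);
      }
      return out;
    }
    if (complete) {
      // The cache holds the whole zone, so a missing label really is
      // NXDOMAIN: no backend query needed, even under a random
      // subdomain attack.
      return std::vector<Record>{};
    }
    return std::nullopt; // incomplete cache: fall through to the backend
  }
};
```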
There are several issues with that idea, mostly that the underlying database might be updated, and that records might have different TTLs, which would mean invalidating the whole records cache once the shortest TTL expires.
Since when does PDNS Auth respect TTLs for the packet/query cache? Actually, I think TTLs should NOT be considered for these caches. TTLs are designed for recursive DNS. Cache policies of the auth are a decision of the DNS operator, regardless of what the TTL says.
Lacks replication. I do not know the LMDB details, but SOA checks and AXFR do not scale to millions of zones and hundreds of secondaries.
Sounds good. But when you have a random subdomain attack, within a second the aggressive NSEC(3) cache is filled with the complete zone, so there is not much difference from loading the whole zone. Anyway, aggressive NSEC(3) caching sounds useful for reducing backend queries even without random subdomain attacks. Of course the technique should also be used for queries without the +DO bit set, to improve non-DNSSEC queries as well.
We have capped the TTD of the packet and the query caches with the lowest TTL for as long as I can remember.
I would agree with you if all backends could properly detect a change in the zone, which is not the case for database backends. That means we need to keep the TTD duration in the caches quite small, which greatly reduces the benefit of loading the whole zone into memory.
The SOA, NSEC and corresponding RRSIG records are loaded, not all the records. This might make a significant difference.
Agreed.
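To illustrate the capping described above (illustrative names, not the actual PowerDNS internals): the time-to-die of a cache entry is computed from the earlier of the configured cache duration and the record's own TTL.

```cpp
#include <algorithm>
#include <cstdint>
#include <ctime>

// time-to-die = now + min(configured cache TTL, lowest record TTL)
uint32_t computeTTD(time_t now, uint32_t cacheTTL, uint32_t lowestRecordTTL) {
  return static_cast<uint32_t>(now) + std::min(cacheTTL, lowestRecordTTL);
}
// e.g. a query-cache TTL of 60 with a record TTL of 20 yields now + 20,
// which is exactly the admin-versus-customer tension raised below.
```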
I was not aware of this. It should be mentioned in the docs.
I use a database backend. Actually, when I as the DNS operator set a query-cache TTL of X, but my customer sets a record TTL of Y < X, PDNS ignores the admin and follows the user. IMO that is not fine. So the configured cache TTL should be honoured regardless of record values, IMO.
Of course it should be a config option. I guess for fast backends that do not suffer during random subdomain attacks there is no need to cache the whole zone, as the whole zone is already in memory. And someone with huge zones using a DB backend will probably not use this feature.
In my experience the size of "normal" RRs is negligible compared to NSEC(3) and RRSIG RRs.
Of course there always exists a certain use case where things may get worse rather than better. Hence, such features should always be a config option for power users.
I pushed a commit making that behaviour configurable. It's enabled by default, even in multi-backend setups, because I believe it will be fine for most users and will bring a noticeable performance improvement.
While I think the config option is a good idea, maybe we want to name it differently, so further performance improvements can be done under the same option. IIRC …
Agreed, I'll change the name and the description so it means that all records for a given name should be unique to a backend. Or perhaps we should have a setting declaring that zones are not spread across multiple backends, which is a bit stricter but would allow more optimizations later?
@Habbie I hoped you'd chime in here. I think "no overlays" would make sense as a general idea. Supporting real, delegated sub-zones should still work IMO?
I believe this is good to merge once the option is renamed and defaulted to off!
I pushed a commit for this.
circle-ci build-auth failed with: …
Counting the number of queries sent to the backend(s), instead of relying on the number of cache misses.
It controls whether we send only 'ANY' lookups to our backends, instead of a mix of 'ANY' and exact types. This behaviour is enabled by default since it should save a lot of round-trips for most setups, but it can be disabled for multi-backend setups that require it.
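As a hedged illustration, the resulting knob would be toggled in pdns.conf roughly like this (the option name below follows the "consistent backends" wording used later in this thread; the final merged name and default may differ):

```ini
# pdns.conf sketch: assume every name lives in exactly one backend, so a
# single ANY lookup per name is safe and its answer can be cached for all types.
consistent-backends=yes
# Multi-backend setups that spread a name across backends would disable it:
# consistent-backends=no
```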
Thanks, that confused me for a bit, but I have now seen the light and am rebasing :)
Rebased, fixed, pushed. This passes locally. If I enable consistent backends, …
All lookups with id = -1 (SOA and lookups in findNS()) may result in answers from multiple zones. Please add logic to prevent/detect answers from multiple zones in the cache.
```diff
- d_question.qname=shorter;
- addNegCache(d_question);
+ d_question.qname = shorter;
+ addNegCache(d_question, d_question.qtype);
```
d_question.qtype and d_question.zoneId are uninitialized here when getAuth was called with cachedOk = false
Also passing d_question and d_question.qtype seems redundant. The same applies to addCache() a few lines down.
All lookups with id = -1 (SOA and lookups in findNS()) may result in answers from multiple zones.
Will fix, thanks!
Also passing d_question and d_question.qtype seems redundant. The same applies to addCache() a few lines down.
We actually need to pass it to be able to override the qtype in some cases, because d_question.qtype holds the requested qtype, but when s_doANYLookupsOnly is set we want to store records for ANY in the cache.
I'm pretty sure you can update d_question.qtype with ANY when s_doANYLookupsOnly is set.
I'm confused: how would we know the requested qtype then, so that we can filter the records in UeberBackend::get(), if we don't store it in d_question.qtype? d_handle.qtype is already set to ANY in that case, so we do the correct lookup when we iterate over backends in UeberBackend::handle::get().
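A simplified sketch of the filtering under discussion (standalone types, not the real UeberBackend interface): the backend or cache is asked for ANY, and the qtype the client originally requested is applied as a filter afterwards.

```cpp
#include <cstdint>
#include <vector>

struct DNSRecord {
  uint16_t d_type; // record type code; other fields omitted
};

constexpr uint16_t QTYPE_ANY = 255;

// Keep only the records matching the originally requested qtype;
// an ANY query keeps everything.
std::vector<DNSRecord> filterForQType(const std::vector<DNSRecord>& anyRecs,
                                      uint16_t requested) {
  std::vector<DNSRecord> out;
  for (const auto& rec : anyRecs) {
    if (requested == QTYPE_ANY || rec.d_type == requested) {
      out.push_back(rec);
    }
  }
  return out;
}
```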
```diff
- rr=*d_cachehandleiter++;;
+ if (d_cached) {
+   if (d_cachehandleiter != d_answers.end()) {
+     rr = *d_cachehandleiter++;
```
I think we need a check here to make sure the zone_id in rr matches the id passed to lookup()
QC.getEntry already uses the zoneId for the cache lookup, no?
I'm wondering how that works today? I'm not sure this PR changes that behaviour much: the SOA should not appear in more than one zone, and the NS records will if we have a parent and a child zone, but wasn't that already the case?
Counting the labels would prevent caching entries for the root.
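A minimal sketch of the kind of zone-id guard being requested a few comments up (field and variable names are illustrative, not the actual PowerDNS internals): cached answers whose zone id differs from the one passed to lookup() are skipped during iteration.

```cpp
#include <vector>

// Illustrative types and names, not the actual PowerDNS internals.
struct CachedRecord {
  int domainId; // zone the record was cached from; other fields omitted
};

struct CacheHandle {
  std::vector<CachedRecord> d_answers;
  std::vector<CachedRecord>::size_type d_pos = 0;
  int d_requestedZoneId = -1; // id passed to lookup(); -1 means "any zone"

  // Returns the next cached record, skipping records cached under a
  // different zone id. A fuller fix would also flag an id = -1 lookup
  // whose answers span more than one zone, as requested above.
  bool get(CachedRecord& rr) {
    while (d_pos < d_answers.size()) {
      const CachedRecord& rec = d_answers[d_pos++];
      if (d_requestedZoneId < 0 || rec.domainId == d_requestedZoneId) {
        rr = rec;
        return true;
      }
    }
    return false;
  }
};
```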
I gave this a stern look and would agree. IMO there are two remaining B->getSOA cases that could go away; and then we're down to a) FindNS and b) pdnsutil benchmark for callers passing …
I'm running this for today (in combination with #9464) and so far it's looking good.
For some reason this needs a trivial rebase on master when merging locally, but I don't see why.
I merged #9483 (inspired by, and partially taken from, this PR) instead, as it appears to give the same benefits while changing a lot less code. Thanks!
Short description
Most of our backends have a very high latency, meaning that it takes a long time to send a query and get the answer, regardless of whether we are asking for one type or several. Our code base often asks for a specific type, and our current code stores the answers for ANY queries separately from the ones for a specific type. This seems wasteful since the answer to an ANY query already contains the records for a more specific one, and our in-memory records cache is much faster than going to the backend. We could save a round-trip by looking for ANY answers when we don't find a specific one in the records cache, but our first query is often for the specific NS type because we are looking for a referral.
This PR converts all lookups to an ANY lookup instead, making sure that we fill the cache as fast as possible to save round-trips to the backend later.
My tests showed roughly one third fewer queries to the backend in simple cases, and probably more in DNSSEC cases, while achieving higher QPS boundaries (~ +30%). CPU usage is also significantly reduced while replaying a real-world PCAP.
The number of entries in the records cache is also significantly lower since we don't need to store a record twice, for ANY and for the exact type itself.
We could easily enable that change for specific backends only if we believe it might have a negative effect on some of them, although testing with the bind backend showed a slight improvement there as well, even though lookups in the bind backend are already quite fast.
I have not tested LMDB.
We could reduce the number of round-trips to the backend even more by getting rid of the 'SOA' special case, since I'm not aware of any backend currently implementing it in a special way.
Checklist
I have: