-
Notifications
You must be signed in to change notification settings - Fork 0
refactor: Refactor DNS handling for granular cache #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This refactoring significantly updates how DNS requests are processed,
improving caching efficiency and structure.
* Change the cache payload from `*dns.Msg` to `[]dns.RR` to allow
for granular caching of resource records per question.
* Iterate over all questions in a request, checking the blocklist and
cache individually.
* If an upstream request is necessary, extract only the relevant
answers from the response and cache them individually.
* Simplify error handling and response sending utilities.
Summary of ChangesHello @rm-hull, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant refactoring of the DNS handling mechanism, primarily focusing on improving caching efficiency and structure. The core change involves moving from caching entire DNS messages to caching individual resource records, enabling more precise and effective cache utilization. This also necessitated a rework of how incoming DNS requests are processed, allowing for per-question blocklist checks, cache lookups, and targeted upstream queries, ultimately leading to a more robust and optimized DNS resolution flow. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
||||||||||||||||||||||||
Removes temporary assertions checking for incorrect NXDOMAIN behavior in `TestDNSDispatcher_HandleDNSRequest_MultipleQuestions`. The test now correctly asserts that a successful response (`dns.RcodeSuccess`) is returned upon dispatching, along with the expected number of answers.
Pull Request Test Coverage Report for Build 19116119743Details
💛 - Coveralls |
Updates the DNS dispatcher to send all questions that were not
resolved by the internal cache to the upstream server in a
single request, rather than only sending the first question.
This commit also refactors the post-forwarding logic:
* Optimizes answer parsing by grouping answers received from
upstream into a map for fast lookup (`answerMap`).
* Ensures cache keys are consistently fully qualified using
`dns.Fqdn()`.
Refactor DNS blocking logic into a dedicated helper function, `sendNXDOMAIN`. When a query is fully blocked, the response now includes a Service of Authority (SOA) record in the authority section, alongside the `RcodeNameError`. This provides a more complete and compliant authoritative response for blocked domains.
This fixes an issue where forwarded upstream DNS requests were missing essential header fields. The upstream request is now properly constructed by: * Assigning a unique request ID using `dns.Id()`. * Copying the `RecursionDesired` flag from the original request. * Directly assigning the questions slice, removing verbose manual construction.
This updates how blocked DNS queries are handled to provide a more
compliant NXDOMAIN response signaling mechanism.
Previously, a separate function handled NXDOMAIN responses based on
whether all queries were blocked. Now:
* When a domain is blocked, an SOA record is immediately added to
the Authority section (`resp.Ns`).
* If the final response has no answers but includes authority
records, the response code is set to `RcodeNameError`.
* This refactors and simplifies the DNS handling logic.
Also updates documentation to simplify the suggested `go test`
command from `-bench=.` to `./...`.
This commit introduces granular metrics for tracking individual DNS
questions and refactors the core request handling logic.
* Adds `queryCounts` metric to track DNS questions broken down by
record type (A, CNAME, etc.) and whether they were blocked.
* Refactors the handler by extracting logic into `processQuestion`
and `resolveUpstream`.
* Updates blocked domain handling to return a SOA record in the
answer section and uses `RcodeSuccess` in the response, aligning
with typical sinkhole behavior.
Change the signature of `resolveUpstream` to return the `[]dns.RR`
answers directly, simplifying the function's responsibility.
This commit also:
* Adds tracking for the number of successful queries that are
forwarded to an upstream DNS server.
* Updates the `dns_request_count` help text to reflect the new
`forwarded` label.
Updates the DNS dispatcher to correctly relay specific error Rcdoes (like `RcodeBadSig`) returned by upstream resolvers instead of masking them as a generic `RcodeServerFailure`. This ensures that the client receives accurate status information from the configured upstream server when a lookup fails for reasons other than connectivity issues.
Renames the internal `handleError` method to `reportError`. This clarifies that the function's primary responsibility is logging errors and incrementing metrics, not fully handling error recovery. Also includes minor cleanup in `dns_test.go` (comment update and minor refactoring of test server setup order).
Ensure that when an upstream resolver returns an error (non-success
Rcode), the actual response code is returned to the caller.
This commit also improves test reliability:
* Refactors `TestDNSDispatcher_ResolveUpstream_BadRCode` to use
dynamic ports.
* Updates `waitForPort` to use a robust DNS exchange probe rather
than simple UDP dial checks.
This commit significantly refactors the DNS dispatcher unit tests to use dedicated, isolated local mock DNS servers managed by new helper functions (`startLocalDNS`, `dnsRecord`, and `waitForPort`). This eliminates complex, manually synchronized inline server setup logic within individual tests, greatly improving test reliability and maintainability. Minor cleanup changes were also applied to `internal/dns.go`: * Corrects the order of setting the response Rcode and reporting the error during upstream failures. * Removes redundant error logging in `resolveUpstream`.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a significant and well-executed refactoring of the DNS request handling logic. The change to a granular cache using []dns.RR and processing each question individually is a major improvement that correctly handles requests with multiple questions, including mixed allowed/blocked domains. The simplification of error handling and the extensive improvements to the test suite, especially the introduction of a local mock DNS server, greatly enhance the robustness and reliability of the codebase.
I've identified a few potential issues in the new logic concerning DNS correctness: one related to potentially misclassifying NODATA responses as NXDOMAIN, another related to cache key consistency, and a third regarding cache TTL handling which could lead to serving stale data. These are detailed in the specific comments. Overall, this is excellent work.
| if len(resp.Answer) == 0 && len(resp.Ns) > 0 { | ||
| resp.Rcode = dns.RcodeNameError | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic to set resp.Rcode to dns.RcodeNameError when there are no answers but there are records in the authority section (Ns) might be too aggressive. This could incorrectly convert a NODATA response (a valid domain name exists, but not for the queried record type) into an NXDOMAIN response (the domain name does not exist). A NODATA response typically has an RCODE of NOERROR, an empty answer section, and a SOA record in the authority section. This logic would incorrectly flag it as NXDOMAIN, which could cause issues for clients that rely on the distinction. Consider removing this block or making the logic more specific to only cases that are truly NXDOMAIN.
Ensures that DNS names used for caching keys are always fully qualified using `dns.Fqdn()`. This prevents mismatches when caching results for names that may or may not implicitly include the trailing dot. Additionally, this simplifies how the cache TTL is determined. We now strictly use the TTL provided by the upstream response header, removing complex logic involving `defaultTTL`.
This refactoring significantly updates how DNS requests are processed, improving caching efficiency and structure.
*dns.Msgto[]dns.RRto allowfor granular caching of resource records per question.
cache individually.
answers from the response and cache them individually.