Skip to content

Modernization: replace multiple hand-parsers with new memory-safe parser#581

Merged
bradh352 merged 57 commits intoc-ares:mainfrom
bradh352:dns_parser
Oct 25, 2023
Merged

Modernization: replace multiple hand-parsers with new memory-safe parser#581
bradh352 merged 57 commits intoc-ares:mainfrom
bradh352:dns_parser

Conversation

@bradh352
Copy link
Member

@bradh352 bradh352 commented Oct 23, 2023

New DNS record parsing code. The old code was basically just some helper macros and functions for parsing an entire DNS message. The caller had to know the RFCs to use the parsers, except for some pre-made exceptions. The new parsing code parses the entire DNS message into an opaque data structure in a memory safe manner with various accessors for reading and manipulating the data.

The existing parser helpers for the various record types were reimplemented as wrappers around the new parser.

The accessors allow easy iteration across the DNS record datastructure, and can be used to easily create dig-like output without needing to know anything about the various record types and formats as dynamic helpers are provided for enumeration of values and data types of those values.

At some point in the future, this new DNS record structure, accessors, and parser will be exposed publicly. This is not done at this point as we don't want to do that until the API is completely stable. Likely a write() function to output the DNS record back into an actual message buffer will be introduced with the stable API as well.

Some subtle bugs in the existing code were uncovered, some which had test cases which turned out to be bogus. Validation with third-party implementations (e.g. BIND9) were performed to validate such cases were indeed bugs.

Adding additional RR parsers such as for TLSA (#470) or SVCB/HTTPS (#566) are trivial now since focus can be put on only parsing the data within the RR, not the entire message. That said, as the new parser is not yet public, it isn't clear the best way to expose any new RRs (probably best to wait for the new parser to be public rather than hacking in another legacy function).

Some additional RRs that are part of DNS RFC1035 or EDNS RFC6891 that didn't have previously implemented parsers are now also implemented (e.g. HINFO, OPT). Any unrecognized RRs are encapsulated into a "RAW_RR" as binary data which can be inserted or extracted, but are otherwise not interpreted in any way.

Fix By: Brad House (@bradh352)

@bradh352 bradh352 mentioned this pull request Oct 23, 2023
@bradh352 bradh352 marked this pull request as draft October 23, 2023 12:51
@bradh352 bradh352 requested a review from bagder October 23, 2023 12:52
@bradh352 bradh352 marked this pull request as ready for review October 23, 2023 19:55
@bradh352
Copy link
Member Author

@bagder think this is good to go. Shouldn't be a considerable amount of risk, there may be areas where the new parser is more strict. The test cases provided pretty good coverage, and the live tests are passing.

The biggest one I can think of is if there is an unrecognized class or opcode, it will fail, but that would be a malformation we'd probably want to catch anyhow. Also, since it does a full message parse, instead of just pulling off bits it needs, malformed messages are caught rather than being missed, but I don't see how this would be a concern however.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant