-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Parsing record contents directly #1530
Comments
I know where you are coming from and feel your pain. I tried doing something similar for CoreDNS's caching back in the day.
|
ok, read it. I see. Let me be clear that the RFC 1035 zone format is a horrendous format, even if you remove most of it, like If we go back to the original task, it's: if have some bytes and I want those bytes to be the rdata of this
FWIW as eluded in my first comment, I never found a elegant way of doing the later in Go, do we need a So dunno - open to ideas, but I think a customized parser that works on the text representation isn't the best thing, nor the speediest |
Thank you @miekg! You raise some very good points. Initially I would've said It seems the only lexer function called by each type in Thanks for all the feedback so far! |
Coming back to this... still on the dunno. Had kept it in the back of my mind, and my question is. Why is this data in txt format in the first place? It makes more sense to have this in (uncompressed!) wire format, parse that back in and put the RR metadata on it, and voila -> RR |
In our case it's coming from the API or a database column that needs to be searchable text. In most other cases, I would agree that storing wire format data is better. Maybe ideally we'd have something similar to #1507 that allows parsing to work without a hard dependency on the lexer, but that is obviously a larger change. I'm still hoping to get a PR up at some point for you to review but if you don't object, I'm going to close this issue in the meantime since it's not really a bug and more of a "request for your thoughts". Thank you for your input, Miek! |
[ Quoting ***@***.***> in "Re: [miekg/dns] RFC: Parsing record..." ]
In our case it's coming from the API or a database column that needs to be
searchable text. In most other cases, I would agree that storing wire format
data is better.
Maybe ideally we'd have something similar to #1507 that allows parsing to work
without a hard dependency on the lexer, but that is obviously a larger change.
I'm still hoping to get a PR up at some point for you to review but if you
don't object, I'm going to close this issue in the meantime since it's not
really a bug and more of a "request for your thoughts".
Thank you for your input, Miek!
Cheers!
The only take-away I currently have is that the next version of miekg/dns should have
something for this.
What we can do (and want to do) with the current code, I still don't know.
/Miek
|
We often need to construct an RR from its name, type, TTL and content (for example, when reading records from the database or from user API requests.)
We currently use something similar to
NewRR(fmt.Sprintf("%s %d IN %s %s", name, ttl, type, content))
, but that requires a fair amount hacky extra work:1.2.3.4\n5.6.7.8
to parse without errors (as1.2.3.4
).$INCLUDE
somehow, so we need to disable that as a precaution.fe80::1\t
is considered a valid AAAA record content.It's also currently impossible to parse generic (RFC 3597) records into a
*dns.RFC3597
when the underlying type is known to this library becauseNewRR
always returns the known type's struct in that case, even when the content started with a\#
token.ToRFC3597
can be used to convert it back to a*dns.RFC3597
, but that callspack()
, which changes behavior depending on the underlying type. (For example, callingNewRR
with. 1 IN TYPE9999 \# 3 abcdef
produces Rdata\# 3 abcdef
as expected, whereas changing TYPE9999 to TYPE48 results in\# 4 abcdef00
instead.)This is far from ideal for our use case, so I've taken a stab at trying to build something more reasonable, which would require additional public interface. In short, I've been trying to build something that parses only the record's content, without any special zone file syntax.
My first thought was that this would be fairly simple to do with a single function (this one isn't super clean; consider it an example or first draft):
This gets us most of the way there, but I realized the lexer is also context-aware and also parses comments. The former is easily worked around by initializing the
zlexer
withowner: false
, which makes it expect content instead of an owner directive. The latter is more problematic and seems to require an entirely new, purpose-built lexer. SinceRR.parse()
expects a*zlexer
, that would either involve makingzlexer
an interface, or adding the new logic to the existingzlexer
struct. In addition, I noticed allRR.parse()
functions callslurpRemainder()
, which would then possibly have to be moved out.I'd be happy to work on this but before I spend too much time I wanted to ask whether this is a change you'd consider merging at all (hopefully), and, if so, whether someone with a deeper understanding of the parser and lexer has any design thoughts for me. I know you all tend to be conservative when it comes to adding new public interface (which is good), but I think this could be a useful addition that would enable use cases that are otherwise very difficult to achieve correctly.
The text was updated successfully, but these errors were encountered: