Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

9373: New RR classes for SSHFP and TSIG #954

Merged
merged 14 commits into from
Oct 20, 2018
Merged

Conversation

wiml
Copy link
Contributor

@wiml wiml commented Jan 23, 2018

The new RR classes are (mostly) independent of each other. This PR contains the new classes for SSHFP and TSIG. I'll submit a second PR for the others.

Contributor Checklist:

@adiroiban adiroiban requested review from a team and adiroiban March 4, 2018 13:11
Copy link
Member

@adiroiban adiroiban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. Happy to see these updates for DNS.

I am not sure if this is ready for review.

I did a quick review and left some comments.

Let me know what do you think.

Thanks!

"""
Assert that encoding C{record} and then decoding the resulting bytes
creates a record which compares equal to C{record}.
creates a record which compares equal to C{record}. Optionally assert
that the encoded bytes are equal to C{expectedEncoding}.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are also using pydoctor / epydoc markup for arguments. What is C{record} ?

But rather than adding an optional argument, I think that it would be best to have a separate/dedicated method.

I think that the name of this function can be changed to something like _assertRecordSerialization or _assertRecordEncoding.

The new method can do a call to _recordRoundtripTest() if required

Also, instead of encoding we can use expectedWireBytes or something like that.

b'\x00\x2A\x00\x00\x00\x00')
self._recordRoundtripTest(rr, expectedEncoding=rdata)

rr = dns.Record_TSIG(algorithm='hmac-sha256',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this needs separate test method with a dedicated docstring.

Two L{dns.Record_SSHFP} instances compare equal if and only if
they have the same key type, fingerprint type, fingerprint, and ttl.
"""
# Vary the key type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that existing code is using the same convention.
I am ok to keep multiple assertions into this single test, but please add a full stop at the end of the comment to make it clear that the comment is done and that it was not accidentally left unfinished.

def test_tsig(self):
"""
L{dns.Record_TSIG} instances compare equal if and only if they have the
same data and ttl.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is data ?

otherData=b'\x80\x00\x00\x00\x00\x08',
MAC=b'\x00\x01\x02\x03\x10\x11\x12\x13'
b'\x20\x21\x22\x23\x30\x31\x32\x33')
self._recordRoundtripTest(rr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use expectedEncoding here?

@cvar KEYTYPE_ECDSA: The key type value for C{ecdsa-sha2-*} keys.
@cvar KEYTYPE_Ed25519: The key type value for C{ed25519} keys.

@cvar FPTYPE_SHA1: The fptype value for SHA-1 fingerprints.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have this FINGERPRINT_TYPE_SHA1 to not have abbreviations and have a value which can be searched in the RFC

A transaction signature, encapsulated in a RR, as described
in U{RFC 2845 <https://tools.ietf.org/html/rfc2845>}.

@type algorithm: L{Name}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what type is Name?

also, to match the RFC, shouldn't this be algorithmName? as otherwise, instead of 'timeSigned' we can have just 'signed'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name is a class in the same module. The apidocs seem to find and link to it correctly as is. (It corresponds to a domain name.)

I would normally be in favor of shortening timeSigned but I think that changing timeSigned to signed would be confusing — there are too many things that could indicate (time signed, whether it's signed, the item being signed, etc). Calling it time would be OK, perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Sorry for the noise with Name.
As for naming the variables, I am fine with any other name if you think that it would make it easier to use.
As a non DNS guy, my initial thought was that keeping the names as close as possible to the RFC would reduce the confusion of whether something is associated with the RFC or is extra API.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"time" is the name of an existing module, so not a good choice. timeSigned is fine. It's the time it was signed, so timeSigned "reads" well.

@type keytype: L{int}
@ivar keytype: The SSH key algorithm or key type.
Note that the numbering used for SSH key algorithms is specific
to the SSHFP record type, and is not the same as the nubering
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nubering vs numbering


@type keytype: L{int}
@ivar keytype: The SSH key algorithm or key type.
Note that the numbering used for SSH key algorithms is specific
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this needs some indentation

A record containing the fingerprint of an SSH key.

@type keytype: L{int}
@ivar keytype: The SSH key algorithm or key type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key algorithm or key type... which one?

Looking at the RFC, I think that this is referred as "algorithm number" and I would prefer to have that terminology here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user documentation (e.g. the ssh man pages) mostly refer to it as the "key type" but the RFCs (RFC4255, RFC4252) seem to prefer "algorithm". I'll change it to algorithm and ALGORITHM_RSA etc. There's the possibility of different key types using the same underlying algorithm but that seems unlikely to confuse people in practice.

…the terminology used in the RFCs. Improve the docstrings and testcase organization.
@wiml
Copy link
Contributor Author

wiml commented Mar 11, 2018

I think my last commit addresses your comments. Twistedchecker rejects it for an over-long line in test_dns.py — but it also rejects it if I break that across two lines (twistedchecker doesn't like whitespace in keyword arguments). I'm not sure there's a fix for that that isn't worse than the problem.

@@ -442,6 +447,23 @@ def _recordRoundtripTest(self, record):
self.assertEqual(record, replica)


def _recordEncodingTest(self, record, expectedEncoding):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the previous function was based on this convention, but for new code I think that we could do better.

This is testing code, so unless there is some method which should be used by a single test, we can mark any method used by more tests as public. _recordEncodingTest vs recordEncodingTest

In the testing code, we should reserve the private things only for code which should not be touched :)


To make the tests easier to round, I would like to see domain specific language assertions.
So instead of _recordEncodingTest this should be something like assertWireFormat or something like that.

This test case is all about records... so no need for record in the function name... and encoding is very generic.

And the whole code from here is about tests, so no need for the Test suffix.

self._recordEncodingTest(rr, b'\x02\x01' + fp)
vs
self.assertWireFormat(b'\x02\x01' + fp, rr)

Minor comment: To make the tests easier to read, as a convention I would like to have new methods writtern as assert(expected, actual)

Copy link
Contributor Author

@wiml wiml Mar 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I'll rename the method.

I think encoding is the right word here, though, because it's testing the encode() method of the IEncodable protocol.

Maybe assertEncodedFormat or assertEncodedBytes ? assertEncodingResult ?

@adiroiban adiroiban self-assigned this Mar 11, 2018
@adiroiban
Copy link
Member

Thanks. Looks good. As the first review I have only read the RFC for SSHFP... will the the TSIG one and will make a final review... but I think that this is good shape.

Only minor comments about test style. When things go wrong, I end up reading a lot of tests, and it helps a lot to have clean/easy to read tests :)

@adiroiban
Copy link
Member

Twistedchecker rejects it for an over-long line in test_dns.py

I made a commit via web... let's see if TwistedChecker is happy.

instead of

someVariable = self.someLongCall(firstArgument=someValue,
                                                        secondArgument=otherValue)

I prefer this formatting as it helps with breaking long lines...and closing parenthesis on a separate line to reduce future diffs

someVariable = self.someLongCall(
    firstArgument=someValue,
    secondArgument=otherValue,
    )

Copy link
Member

@adiroiban adiroiban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSHFP looks good.
I suggest to change the scope of this PR to only SSHFP and have a separate ticket for TSIG...

I don't know what TKEY is for.

I think that SSHFP and TSIG are independent code so with separate PR we can merge one as soon as the code is ready.

See inline comments.

Thanks!

SPF = 99

# These record types do not exist in zones, but are transferred in
# messages the same way normal RRs are.
TKEY = 249
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is TKEY used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, TKEY slipped through from a future PR (I have several new RR implementations to submit but I split them up into smaller pull requests).

@type algorithm: L{Name}
@ivar algorithm: The name of the signature or MAC algorithm.

@type time: L{int}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the RFC I see

      Algorithm Name   domain-name    Name of the algorithm
                                      in domain name syntax.
      Time Signed      u_int48_t      seconds since 1-Jan-70 UTC.
      Fudge            u_int16_t      seconds of error permitted
                                      in Time Signed.
      MAC Size         u_int16_t      number of octets in MAC.
      MAC              octet stream   defined by Algorithm Name.
      Original ID      u_int16_t      original message ID
      Error            u_int16_t      expanded RCODE covering
                                      TSIG processing.
      Other Len        u_int16_t      length, in octets, of
                                      Other Data.
      Other Data       octet stream   empty unless Error == BADTIME

I suggest

algorithmName
timeSigned

There are already fine:

fudge
MAC
originalID
otherData

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this point - can you adjust time and algorithm to adi's suggestions, or just a response saying why you didn't want to do that?

fudge=5, MAC=None, originalID=0,
error=OK, otherData=b'', ttl=0):
self.algorithm = None if algorithm is None else Name(algorithm)
self.time = None if time is None else int(time)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do we need this?
Each if/else branch need a test... so I think that we should just assing self.timeSigned = timeSigned and don't try to add any more logic.

Otherwise, please document and test that it also accepts text/str for time.
I don't see any test where time is None or not an int.

FINGERPRINT_TYPE_SHA1 = 1
FINGERPRINT_TYPE_SHA256 = 2

def __init__(self, algorithm=0, fingerprintType=0, fingerprint=b'', ttl=0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the default values.
What is the reason of doing this?

I think that by default, we should initialize a valid object and document the default values ... or else make the argument mandatory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree, but the way that t.n.dns is written, each record type has to be instantiable with no arguments except for ttl. (See the parseRecords method of Message.)

There are certainly cleaner ways to write that, and I'd even be interested in implementing one if you think the Twisted community would be willing to accept a bunch of compatibility-breaking changes for such a thing, but I think that's something that should be done in a different PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just asking. Do we need to create by default an instance which is not valid?

This PR might need a review from someone who cares about DNS :)

I have never used the DNS part of Twisted, so in these matters of API design maybe is best to have someone else approve or reject it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most of the arguments should be mandatory, because there isn't really a good default value. I think we agree on that.

Because it's not possible to make them mandatory (without widespread compatibility-breaking changes to the module) I think that giving them invalid default values is the second-best option. It would be possible to give them arbitrary, but valid, defaults, but I wouldn't want anyone to write code that relies on those defaults. If fingerprintType defaulted to SHA1 (for example), someone would leave it unspecified in code that creates an SSHFP record, and I think that would be error-prone and confusing — a fingerprint value is nonsense unless its type (and key type) are also known.

There are a few arguments that I think do have useful default values (like the otherData field of TSIG) but in my opinion it would be ideal if most of the RR init arguments were mandatory.

Like I said, I would be happy to submit a patch to clean up this part of dns.py's API, but it would change the public interface and have to go through a full deprecation cycle anyway.

def __init__(self, algorithm=None, time=None,
fudge=5, MAC=None, originalID=0,
error=OK, otherData=b'', ttl=0):
self.algorithm = None if algorithm is None else Name(algorithm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the type of init argument is different from the instance member, we need to document the init argument.

Otherwise, it can be confusing... and we need tests for each if/else branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I did the Name conversion because it's consistent with the way the rest of the module is written and also I think it is a convenient behavior for users of the module. I'll document it.

The None branch is only there so that the record can be instantiated with an empty argument list, right before decode is called. IMHO the nullability of algorithm shouldn't be part of the public API; it is an internal implementation detail. It is in effect tested by the record round-trip tests, though, because they need to instantiate an "empty" record before calling decode.

error=OK, otherData=b'', ttl=0):
self.algorithm = None if algorithm is None else Name(algorithm)
self.time = None if time is None else int(time)
self.fudge = str2time(fudge)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this? I think that we need a test to document why we are converting the input

self.fudge = str2time(fudge)
self.MAC = MAC
self.originalID = int(originalID)
self.error = int(error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do the conversions?


def __init__(self, algorithm=None, time=None,
fudge=5, MAC=None, originalID=0,
error=OK, otherData=b'', ttl=0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since TTL is always 0 in RFC, do we want it to expose it ?
We can add it later, if needed... but until then https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ttl field doesn't even belong on this class, since TTL is a property of the record header, not of the record data. But t.n.dns expects one.

(Actually, it looks like it would be OK to have a ttl keyword argument but ignore it and have no ttl attribute. I think that would be worse; I'd rather have all the record classes be weird in the same way, than have them be inconsistent.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that if this record type has special rules around its TTL, that should at least be documented in its @ivar. (Which, btw, this doesn't have; it should have one.)

self._recordEncodingTest(rr, rdata)

rr = dns.Record_TSIG(algorithm='hmac-sha256',
time=4511798055, # More than 32 bits
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why more than 32bit... I think that this needs a separate test case with a dedicated docstring to explain what we are doing here :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main reason for the second set of values is to test the case where the optional error and otherData fields are present. I used a larger time stamp because that field is a 48-bit number, and in my experience it's common for a bug to show up when a field's value crosses a 32-bit boundary.

I suppose it wouldn't hurt anything to split this method into two, but it seems to me that would make the code harder to read, not easier. They're testing essentially the same thing (whether the record's encode/decode methods work).

@adiroiban adiroiban assigned wiml and unassigned adiroiban Mar 13, 2018
Copy link
Member

@adiroiban adiroiban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that we should create valid instance by default... other than, that, this PR should be ok, but it needs a review from someone else who has more experience with using the DNS api of Twisted.

@adiroiban adiroiban dismissed their stale review March 16, 2018 00:24

Need someone with some experience in Twisted DNS

@adiroiban
Copy link
Member

Thanks a lot for the explanation.
I think that a test case in which we initiate with default values and explain why do we have this, should help.

And we need tests for each if/else branch... but I prefer to not handle invalid input. If we document that int is required, then we should not try to accept string and convert it.

Without the tests and its docstring, just by looking at the code I am left wondering: Why do we need this nonsense ? :) WTF/Min

@glyph
Copy link
Member

glyph commented Jun 23, 2018

The travis failure is irrelevant to this change so I restarted it.

I don't know if I'm going to have time to fully deal with this today but I'm inclined to merge this ASAP. @wiml has been gamely suffering our process for months now, I think it meets the quality bar, so despite the fact that we might not have any real actual DNS ninjas to participate here we should probably trust his guidance.

Unless @wallrj wants to return from his long hiatus to opine... :)

@glyph
Copy link
Member

glyph commented Jun 24, 2018

./admin/pr_as_branch 954 9373 names-new-rrs done.

@codecov
Copy link

codecov bot commented Jul 16, 2018

Codecov Report

Merging #954 into trunk will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##            trunk     #954      +/-   ##
==========================================
+ Coverage   91.91%   91.91%   +<.01%     
==========================================
  Files         844      844              
  Lines      151023   151070      +47     
  Branches    13182    13176       -6     
==========================================
+ Hits       138815   138863      +48     
+ Misses      10113    10112       -1     
  Partials     2095     2095

Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @wiml ! Now that you yourself are a project member, I can leave an "address feedback then merge" review. Hooray!

  • Please address the point about the naming; a public attribute is forever, so it would be good to ack the point @adiroiban previously noted about naming. (It looks like you solidly addressed everything else.)
  • Please fix the ticket summary & description over on Trac to match what is actually being done here; it addresses 5 RR types, this implements 2. If there's more work to be done that isn't addressed by existing tickets, please file a new one for the remainder of the work you'd like to land. Also please do any gardening of other related tickets; you may have to add some additional .misc newsfiles here to address the other tickets. (Also, be extra sure to use the Fixes: ticket:XXXX syntax described in https://twistedmatrix.com/trac/wiki/ReviewProcess#Authors:Howtomergethechangetotrunk to auto-close any related tickets.

Thanks for persevering with this!

@type algorithm: L{Name}
@ivar algorithm: The name of the signature or MAC algorithm.

@type time: L{int}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this point - can you adjust time and algorithm to adi's suggestions, or just a response saying why you didn't want to do that?


def __init__(self, algorithm=None, time=None,
fudge=5, MAC=None, originalID=0,
error=OK, otherData=b'', ttl=0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that if this record type has special rules around its TTL, that should at least be documented in its @ivar. (Which, btw, this doesn't have; it should have one.)

self.fudge = str2time(fudge)
self.MAC = MAC
self.originalID = originalID
self.error = error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was already pining for attrs by the second attribute - but that's probably a task for another day :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think that the ttl attribute (and keyword parameter) shouldn't be there at all — I only added it for consistency with the other record classes. I wrote up my thinking here: https://twistedmatrix.com/trac/ticket/9550

@wiml wiml force-pushed the 9373-names-new-rrs branch 2 times, most recently from 53b27d0 to ed02ddd Compare October 11, 2018 22:59
@wiml
Copy link
Contributor Author

wiml commented Oct 18, 2018

I think the last bit of unresolved feedback is the TSIG.algorithm attribute's name, algorithm vs. algorithmName. My preference is to keep it as it is, because I prefer shorter attribute names when all else is equal. I think the balance is different than it is for timeSigned for a few reasons:

  1. There's no ambiguity of meaning (as there is with time or signed); there's only one algorithm salient to a TSIG.
  2. There's no conflict with some other common identifier.
  3. The algorithm of a TSIG record is a domain name, i.e., a Name; there isn't another value (like an int or class or something) that could be placed there instead. My rubric here is that any user familiar enough with transaction signatures to know how to use TSIG at all will know that the algorithm is a DNS-name.

Number 3 is probably the weakest point — the attribute could contain a str or bytes for example — but the Name suffix doesn't actually help disambiguate that case, so it's still syntactic noise.

@wiml wiml force-pushed the 9373-names-new-rrs branch 2 times, most recently from 32851c8 to 6e5d07c Compare October 20, 2018 04:24
@wiml wiml merged commit 6d4690b into twisted:trunk Oct 20, 2018
wiml added a commit that referenced this pull request Oct 20, 2018
Author: wiml
Reviewer: adiroiban, RussNelson, glyph
Fixes: ticket:9373
Refs: ticket:4602

New RR classes for SSHFP and TSIG
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants