-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prefer dnspython to py3dns #25
Conversation
For example, two consecutives of trustymail --debug --json --spf ed.gov were observed to give different results. It appears that the py3dns library does not always correctly handle the case where a DNS response is too big to fit in a UDP packet. Further testing showed that the dnspython library was more reliable. As a result, I changed trustymail to use dnspython instead of py3dns. I also modified requirements.txt and setup.py to add the new requirement.
Using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, a few fixes you may want to make, but nothing earth shattering. 👍
@@ -58,6 +58,7 @@ | |||
'requests', | |||
'docopt', | |||
'publicsuffix', | |||
'dnspython', | |||
'py3dns', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to remove py3dns
from the requirements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try removing py3dns
, but then I got errors regarding a missing import that spf
uses. So I put it back in.
trustymail/trustymail.py
Outdated
|
||
if record_text.startswith("\""): | ||
record_text = record[1:-1] | ||
record_text = record_text[1:-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this (record
-> record_text
) change the logical behavior of this function? or just something that is different about the two DNS libraries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I wonder if rather than the conditional, it would be better to simplify to:
record_text = record_text.lstrip('"')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First point: The new library returns an object. If it's an MX record then you have to access the exchange
and preference
member variables. If you treat the object as a string you get something like 10 mail.blah.com
.
Second point: I think you're right, and I'll make the change.
trustymail/trustymail.py
Outdated
handle_error("[DMARC]", domain, error) | ||
|
||
|
||
def find_host_from_ip(ip_addr): | ||
return DNS.revlookup(ip_addr) | ||
return str(resolver.query(reversename.from_address(ip_addr), "PTR")[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I'm a fan of splitting major indices out, to clarify what they are. Maybe something like:
hostname, _ = resolver.query(reversename.from_address(ip_addr), "PTR")
return str(hostname)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, does reversename.from_address(ip_addr)
not provide a hostname itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First point: I agree and will make the change.
Second point: No, it does not:
Python 3.6.3 (default, Oct 24 2017, 14:48:20)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dns.resolver
>>> import dns.reversename
>>> record = dns.resolver.query("www.npr.org", "A")
>>> print(record.response)
id 39913
opcode QUERY
rcode NOERROR
flags QR RD RA
;QUESTION
www.npr.org. IN A
;ANSWER
www.npr.org. 172 IN CNAME wildcard.npr.org.edgekey.net.
wildcard.npr.org.edgekey.net. 20988 IN CNAME e4539.g.akamaiedge.net.
e4539.g.akamaiedge.net. 19 IN A 23.46.53.110
;AUTHORITY
;ADDITIONAL
>>> n = dns.reversename.from_address("23.46.53.110")
>>> print(n)
110.53.46.23.in-addr.arpa.
>>>
user to specify the DNS hosts against which to query * Added another exception in the catch blocks that follow calls to dns.resolver.Resolver.query. In my testing I saw that this exception was occassionally thrown but not being caught.
…k of a line there was causing the command line options to be parsed incorrectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions about domain.errors
and domain.syntax_errors
.
@@ -221,20 +215,16 @@ def spf_scan(domain): | |||
logging.debug("\tResult Differs: Expected [{0}] - Actual [{1}]".format(result, response[0])) | |||
domain.errors.append("Result Differs: Expected [{0}] - Actual [{1}]".format(result, response[0])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Should this be
domain.syntax_errors
? - The field
domain.errors
doesn't even appear in the output oftrustymail
. Should it? Note that the methodhandle_errors
below also writes todomain.errors
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a moot point now, since I added domain.errors
to the output of trustymail
.
I did some testing with the TLDR; Go ahead and merge this pull request if you want. 😃 |
field is quite helpful when debugging why trustymail gives teh result it does fro a domain.
reliability. We only care about the content and correctness of the DNS records, not whether they fit inside a single UDP packet. * Retry DNS queries that return SERVFAIL, since this may only indicate a temporary network issue. * Removed some hard-wired timeout code that was mistakenly committed.
timeout passed in
We were using 127.0.0.1. When the domain being scanned uses the ptr mechanism, a reverse DNS lookup is done against the IP of the mail sender. This fails for 127.0.0.1. The solution is to use an IP that 1. Has a valid DNS PTR record 2. Is not used as a mail server by anyone I tried using c1b1.ncats.cyber.dhs.gov, since that has a valid DNS PTR record in our AWS Route 53 setup and it is not used as a mail server. I found, though, that Google DNS does not return the PTR record for that hostname. Until that is resolved I'm using an IP that is used by www.google.com that does have a valid PTR record and should not used as a mail server by anyone we scan.
a5c6cbd
to
8e1fc18
Compare
current-federal.csv. This was for two reasons: * PublicSuffixList returns a lower case version of the base domain. The case was not taken into account when comparing the base domain returned by PublicSuffixList with the domain passed into the Domain constructor. * When the domain is not equal to the base domain returned by PublicSuffixList, the Domain constructor does a DMARC scan to populate some data. That scan was being called in a way that was inconsistent with the current trustymail code. I fixed these issues by: * Converting domain names to lowercase in the Domain constructor and only using the lowercase version. * Changing the DMARC scan function call to be consistent with the current trustymail code. A further advantage of the first point is that many needless DMARC scans are no longer taking place.
/etc/resolv.conf ONLY if no DNS hostnames are specified.
we were using appears to be also used for mail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the excellent additions and fixes, @jsf9k. I've documented a few minor concerns, but given our timetable I'm going to commit these, filing issues where needed.
@@ -142,7 +143,8 @@ def generate_results(self): | |||
"DMARC Results on Base Domain": self.parent_dmarc_results(), | |||
"DMARC Policy": self.get_dmarc_policy(), | |||
|
|||
"Syntax Errors": self.format_list(self.syntax_errors) | |||
"Syntax Errors": self.format_list(self.syntax_errors), | |||
"Errors": self.format_list(self.errors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we embed scan results in our reports, and the attachments do get (occasionally 😅) looked at, the more clarity we can provide up front to enable self-servicing the better, lest we get sent a ton of inquiries. I agree with #27 that this should probably be renamed, and clarified. For example, the README indicates some of this ambiguity. A snippit reads:
- Syntax Errors - A list of syntax errors that were detected when scanning DMARC or SPF records, or checking for STARTTLS support.
- Errors - A list of any other errors encountered, such as DNS failures.
If Syntax Errors
are just for errors of syntax, network error information about STARTTLS should probably not go there.
Perhaps Debug
is a better field name than Errors
?
Also, the information that lands in Errors
can be ambiguous as to its provenance. For instance, in a scan of one domain, the error field results were:
[Errno 101] Network is unreachable, [Errno 101] Network is unreachable, [Errno 101] Network is unreachable, [Errno 101] Network is unreachable, [Errno 101] Network is unreachable
There actually was a network-related error. However, the error output is not clear as to the object whose network appears unreachable. (The domain scanned has 5 mail servers listed on their MX record, so I expect this error is in reference to those 5 hostnames.)
@@ -22,6 +23,14 @@ | |||
--dmarc Only check dmarc records | |||
--json Output is in json format (default csv) | |||
--debug Output should include error messages. | |||
--dns-hostnames=HOSTNAMES A comma-delimited list of DNS servers to query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider renaming dns-hostnames
just dns
. In most instances, IPs will be entered, not actual hostnames, and dns
is faster to type on the command-line.
# virginia.edu resolves to until we resolve why Google | ||
# DNS does not return the same PTR records as the CAL | ||
# DNS does for 64.69.57.18. | ||
query = spf.query("128.143.22.36", "email_wizard@" + domain.domain_name, domain.domain_name, strict=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm interested in the long-term solution to this, given @dhs-ncats' intent to craft tools to be reusable outside of our specific needs. I wonder if setting an IP-with-a-valid-PTR-record-that-doesn't-send-mail needs to be something a user supplies at the command line or as an environmental variable in order to run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other folks can use our IP-with-a-valid-PTR-record-that-doesn't-send-mail once the wrinkles are ironed out with our DNS. The IP I was going to use does have a valid PTR record (although it isn't publicly available yet) and does not send mail.
Having such an IP as a precondition for using trustymail seems to put a lot of burden on the user. I don't personally have such an IP, for example, because I don't own any static IPs. I would have to find one.
@jsf9k -- A recommendation for the future, might be good to prefix the pull request name with |
I'll do that. Thanks. |
Add caches for GitHub Actions.
Add codeowners file with team OIS maintainers.
During testing we found that
trustymail
was not always idempotent. For example, two consecutive runs oftrustymail --debug --json --spf ed.gov
were observed to give different results. It appears that thepy3dns
library does not always correctly handle the case where a DNS response is too big to fit in a UDP packet. Further testing showed that thednspython
library was more reliable.As a result, I changed
trustymail
to usednspython
instead ofpy3dns
. I also modifiedrequirements.txt
andsetup.py
to add the new requirement.This should resolve #24.