-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
smtp_banners: Update using Project Sonar data from 2017.11.30 and 2018.04.05 #160
Conversation
xml/smtp_banners.xml
Outdated
catch all for daemons that have no distinguishing fingerprint whatsoever | ||
</description> | ||
<fingerprint pattern="^(?i)(?:([^ ]+) )?E?SMTP(?: (?:Service )?Ready\.?)?$"> | ||
<description>catch all for daemons that have no distinguishing fingerprint whatsoever</description> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember there were some concerns for catch-alls. Specifically around ordering and whether we should include them or not. @jhart-r7 might have more info here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This catch all isn't new, I've just made the hostname optional for it so that it will matched even shorter non-specific names such as 'ESMTP ready'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be labeled as a catch all but the regular expression is not. It looks like it is intended to only match things that have an optional hostname followed by the string "ESMTP Service Ready" and a few minor variants. A catch-all would do something like match "ESMTP Service Ready" with arbitrary text before and after.
xml/smtp_banners.xml
Outdated
@@ -69,28 +69,43 @@ The system or service fingerprint with the highest certainty overwrites the othe | |||
http://www.argosoft.com/applications/mailserver/ | |||
Example: 220 ArGoSoft Mail Server, Version 1.4 (1.4.0.3) | |||
</description> | |||
<param pos="0" name="os.vendor" value="Microsoft"/> | |||
<param pos="0" name="os.family" value="Windows"/> | |||
<param pos="0" name="os.device" value="General"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not be adding the "General" fingerprint (here and elsewhere)
- Rename multiple Sendmail fingerprints with the description of "unknown" to something descriptive. The allows generating more accurate metrics of which FPs matched. Simplification of multiple description lines. - Tuning Sendmail fingerprints regex and ordering.
xml/smtp_banners.xml
Outdated
<description>IBM Domino SMTP MTA</description> | ||
<example host.name="foo.bar" service.version="9.0.1FP8 HF475">foo.bar ESMTP Service (IBM Domino Release 9.0.1FP8 HF475) ready at Thu, 30 Nov 2017 17:55:48 +0900</example> | ||
<example host.name="foo.bar" service.version="9.0.1">foo.bar ESMTP Service (IBM Domino Release 9.0.1) ready at Thu, 30 Nov 2017 10:12:26 +0100</example> | ||
<example service.version="9.0.1FP8"> ESMTP Service (IBM Domino Release 9.0.1FP8) ready at Thu, 30 Nov 2017 13:51:59 -0800</example> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is host.name
in this case? You made that outer group non-capturing and optional, but the inner will capture. I wonder if it is just set to the empty string? Do we care? I think the only solution would be to split these fingerprints up, one with a hostname and one with just spaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went this route to reduce the number of fingerprints. The result is an empty capture (at least in Ruby and (according to an online tester) Java. I'm ok with splitting them if you think it will make it easier to support or more cross-language compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my concern is that the behavior of an empty field in the "fingerprint" that we returns is undefined. It works one way when you use recog as a data source of fingerprints and an api/cli for interacting with them, it may work another way when used in something else, like metasploit-framework or other products. We don't have a good way to control this today, AFAIK.
IMO, split.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this on the phone. Good to go.
This PR updates the coverage of
xml/smtp_banners.xml
using data from Project Sonar's SMTP 25/tcp study on 2017.11.30. Additionally, there was significant reorganization and cleanup of the file.Note: Due to effort to cleanup description lines (remove duplicates, remove multilines, provide context, standardize format) almost every value for
<description>
has changed. This will impact the value returned asmatched
.Original fingerprint count: 129
Original fingerprint count: 133
Significant changes:
Improved the accuracy and/or flexibility of multiple fingerprints.
Changed ALL instances of
flags="REG_ICASE
to an inline flag ((?i:
) in order to make the regex compatible with more languages.Implemented fingerprint examples for those fingerprints where examples could be found. This sometimes resulted in removing fingerprints that were actually duplicates or trivially different.
Reworked
description
fields so as to remove examples and ensure that this field is unique within the file as the value of description serves as an identifier when processing fingerprints. Multiline descriptions were reduced to single line where possible. Almost every description was modified.Fixed multiple instances where captures where under/over capturing
Fixed multiple instances where the portion of the version banner that was captured was different between two products in the same family.
removed various real and example hostnames from examples and standardized on
foo.bar
Corrected
system.time.format
so as to match timestamp provided by serviceReworked date regex for multiple matches to remove inadvertent requirement for two digit day value when the banner included a single digit day.
Note: A few tweaks were made after the metrics below were generated and so the match % is higher
Overall fingerprint matches
New fingerprints and metrics
Existing fingerprint value shifts of interest
JAMES SMTP Server
picked up 1,256 matches (from 0 matches) in the 2018.04.05 data set simply due to fixing the regex which required 2 digit day values (05
) which the product didn't emit. Similar forA.K.I. PMail
which picked up 587 (from 0) after date change and fingerprint tweak.