smtp_banners: Update using Project Sonar data from 2017.11.30 and 2018.04.05 #160

Merged 9 commits on Apr 16, 2018

@tsellers-r7 tsellers-r7 commented Dec 4, 2017

This PR updates the coverage of xml/smtp_banners.xml using data from Project Sonar's SMTP 25/tcp study on 2017.11.30. Additionally, there was significant reorganization and cleanup of the file.

Note: Due to the effort to clean up description lines (remove duplicates, remove multiline values, provide context, standardize format), almost every value for <description> has changed. This will change the value returned as matched.
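Since the description value is meant to be unique within the file, a quick duplicate check can be sketched as follows. This is a minimal illustration only: the XML fragment, its patterns and descriptions, and the helper name `duplicate_descriptions` are all hypothetical and are not taken from xml/smtp_banners.xml or from Recog's own tooling.

```ruby
# Hypothetical fragment in the style of a fingerprint file. The patterns and
# descriptions are made up for illustration.
SAMPLE_XML = <<~XML
  <fingerprints>
    <fingerprint pattern="^example one$">
      <description>Example MTA - with version</description>
    </fingerprint>
    <fingerprint pattern="^example two$">
      <description>Example MTA - without version</description>
    </fingerprint>
    <fingerprint pattern="^example three$">
      <description>Example MTA - with version</description>
    </fingerprint>
  </fingerprints>
XML

# Return every description text that appears more than once.
# A plain regex scan is enough for this sketch; a real check would
# more likely use an XML parser.
def duplicate_descriptions(xml)
  descriptions = xml.scan(%r{<description>(.*?)</description>}m).flatten.map(&:strip)
  descriptions.tally.select { |_text, count| count > 1 }.keys
end

puts duplicate_descriptions(SAMPLE_XML).inspect
```

An empty result would indicate that every description in the scanned text is unique.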

Original fingerprint count: 129
Updated fingerprint count: 133

Significant changes:

  • Improved the accuracy and/or flexibility of multiple fingerprints.

  • Changed ALL instances of flags="REG_ICASE" to an inline flag ((?i:) in order to make the regex compatible with more languages.

  • Implemented fingerprint examples for those fingerprints where examples could be found. This sometimes resulted in removing fingerprints that were actually duplicates or trivially different.

  • Reworked description fields so as to remove examples and ensure that this field is unique within the file as the value of description serves as an identifier when processing fingerprints. Multiline descriptions were reduced to single line where possible. Almost every description was modified.

  • Fixed multiple instances where captures were under- or over-capturing.

  • Fixed multiple instances where the portion of the version banner that was captured was different between two products in the same family.

  • Removed various real and example hostnames from examples and standardized on foo.bar.

  • Corrected system.time.format so that it matches the timestamp provided by the service.

  • Reworked date regex for multiple matches to remove inadvertent requirement for two digit day value when the banner included a single digit day.
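As a rough illustration of the inline-flag change in the list above, the sketch below shows why carrying case-insensitivity inside the pattern string helps portability: the string can be compiled with no external options. The pattern strings and constant names are hypothetical, not taken from the fingerprint file.

```ruby
# Hypothetical pattern strings, as they might be stored in a pattern attribute.
LEGACY_SOURCE = '^esmtp sendmail$'        # relies on an external flag
INLINE_SOURCE = '^(?i:esmtp sendmail)$'   # carries the flag inside the pattern

# Compiling the legacy form correctly requires the caller to know about the flag.
LEGACY_NO_FLAG   = Regexp.new(LEGACY_SOURCE)
LEGACY_WITH_FLAG = Regexp.new(LEGACY_SOURCE, Regexp::IGNORECASE)

# The inline form needs nothing beyond the pattern string itself.
INLINE = Regexp.new(INLINE_SOURCE)

BANNER = 'ESMTP Sendmail'

puts LEGACY_NO_FLAG.match?(BANNER)    # flag was lost, so case matters
puts LEGACY_WITH_FLAG.match?(BANNER)  # works only if the flag is passed along
puts INLINE.match?(BANNER)            # works from the pattern string alone
```

Any consumer that only reads the pattern attribute and ignores a separate flags attribute would behave like `LEGACY_NO_FLAG` here, which is the failure mode the inline form avoids.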

Note: A few tweaks were made after the metrics below were generated, so the actual match percentage is higher than shown.

Overall fingerprint matches

| Total | 2018-04-05 BEFORE | 2018-04-05 AFTER |
|---|---|---|
| 6,561,395 | 5,364,381 | 5,490,301 |
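The overall figures above can be turned into match rates with a little arithmetic. The sketch below uses only the three numbers from the table; the constant and method names are illustrative.

```ruby
# Figures from the "Overall fingerprint matches" table (2018-04-05 dataset).
TOTAL          = 6_561_395
MATCHED_BEFORE = 5_364_381
MATCHED_AFTER  = 5_490_301

# Percentage of banners that matched some fingerprint, to two decimals.
def match_rate(matched, total)
  (matched.to_f / total * 100).round(2)
end

UNMATCHED_BEFORE = TOTAL - MATCHED_BEFORE
UNMATCHED_AFTER  = TOTAL - MATCHED_AFTER
NEWLY_MATCHED    = MATCHED_AFTER - MATCHED_BEFORE

puts "before: #{match_rate(MATCHED_BEFORE, TOTAL)}% matched, #{UNMATCHED_BEFORE} unmatched"
puts "after:  #{match_rate(MATCHED_AFTER, TOTAL)}% matched, #{UNMATCHED_AFTER} unmatched"
puts "newly matched banners: #{NEWLY_MATCHED}"
```

The unmatched counts computed this way (1,197,014 before and 1,071,094 after) agree with the "Null (unmatched)" row reported for the 2018-04-05 dataset.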

New fingerprints and metrics

Fingerprint 2017-11-30 Dataset 2018-04-05 Dataset
Kerio Connect 12,862 12,713
Ecelerity 15,719 15,084
MailEnable - Simple 6,344 6,780
SonicWall Email Security 5,729 5,315
Postfix - Ubuntu, Mail-in-a-Box package 5,048
IBM Domino SMTP MTA 3,632 4,472
Twisted SMTP server 1,842 4,165
Communigate Pro 4,284 4,303
PowerMTA 4,242 3,878
Lyris ListManager 2,981 2,742
Ma Jian WinWebMail 2,982 2,726
Sendmail - Debian patch only 3,955 3,664
Sendmail - Debian 7.x (wheezy) 8,894 1,400
Sendmail - Debian 8.x (jessie) 1,239 1,147
Sendmail - Debian 5.x (lenny) 486 400
Sendmail - Debian 3.1 (sarge) 335 334
Sendmail - Debian 4.x (etch) 122 64
Tobit Software David 1,988 1,907
Cellopoint E-mail Firewall 9,096 1,842

Existing fingerprint value shifts of interest

| Fingerprint | 2017-11-30 BEFORE | 2017-11-30 AFTER | 2018-04-05 BEFORE | 2018-04-05 AFTER |
|---|---|---|---|---|
| Null (unmatched) | 1,260,499 | 1,124,855 | 1,197,014 | 1,071,094 |
| Exim with version string and optional timestamp | 832,821 | 833,333 | 835,880 | 837,506 |
| Sendmail - Ubuntu | 3,495 | 5,811 | 1,950 | 3,993 |
| Lotus Domino SMTP MTA | 2,949 | 3,632 | 2,652 | 3,266 |
| MailEnable - Complex | 80,103 | 97,080 | 79,843 | 97,076 |
| Postfix - generic banner with amusing comments in parentheses | 17,438 | 13,805 | 17,407 | 13,845 |
| JAMES SMTP Server | | | 0 | 1,256 |
| A.K.I. PMail | | | 0 | 587 |

JAMES SMTP Server picked up 1,256 matches (from 0) in the 2018.04.05 data set simply because the regex was fixed: it had required a two-digit day value (e.g. 05), which the product doesn't emit. Similarly, A.K.I. PMail picked up 587 matches (from 0) after the date change and a fingerprint tweak.
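The effect of the day-of-month fix can be sketched with two simplified date fragments. Both patterns and both banners below are hypothetical stand-ins; the real JAMES and PMail fingerprints are not reproduced here.

```ruby
# Hypothetical date fragments. The strict form insists on a two-digit day,
# the flexible form accepts one or two digits.
STRICT_DAY   = /ready at \w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}/
FLEXIBLE_DAY = /ready at \w{3}, \d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}/

# Illustrative banners: one pads the day with a zero, one does not.
PADDED   = 'foo.bar ESMTP ready at Thu, 05 Apr 2018 10:12:26 +0100'
UNPADDED = 'foo.bar ESMTP ready at Thu, 5 Apr 2018 10:12:26 +0100'

[PADDED, UNPADDED].each do |banner|
  puts "#{banner.inspect}: strict=#{STRICT_DAY.match?(banner)} flexible=#{FLEXIBLE_DAY.match?(banner)}"
end
```

A service that never pads the day would only match the strict form on days 10 through 31, and a dataset collected on the 5th of the month would show zero matches, which is consistent with the jump from 0 described above.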

catch all for daemons that have no distinguishing fingerprint whatsoever
</description>
<fingerprint pattern="^(?i)(?:([^ ]+) )?E?SMTP(?: (?:Service )?Ready\.?)?$">
<description>catch all for daemons that have no distinguishing fingerprint whatsoever</description>
Contributor

I remember there were some concerns for catch-alls. Specifically around ordering and whether we should include them or not. @jhart-r7 might have more info here...

Contributor Author

This catch-all isn't new. I've just made the hostname optional so that it will match even shorter non-specific banners such as 'ESMTP ready'.

Contributor

This may be labeled as a catch all but the regular expression is not. It looks like it is intended to only match things that have an optional hostname followed by the string "ESMTP Service Ready" and a few minor variants. A catch-all would do something like match "ESMTP Service Ready" with arbitrary text before and after.
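The point about the pattern's scope can be checked directly. The sketch below compiles the pattern quoted in the diff above as a Ruby regex and tries a few banners; the banners and the helper name `classify` are illustrative, and other engines may treat the mid-pattern `(?i)` differently.

```ruby
# Pattern copied from the fingerprint quoted in the review thread.
CATCH_ALL = /^(?i)(?:([^ ]+) )?E?SMTP(?: (?:Service )?Ready\.?)?$/

# Report whether a banner matches and, if so, what the hostname capture holds.
def classify(banner)
  m = CATCH_ALL.match(banner)
  m ? { matched: true, host: m[1] } : { matched: false, host: nil }
end

[
  'foo.bar ESMTP Service Ready',      # hostname plus the full phrase
  'ESMTP ready',                      # no hostname, short phrase
  'SMTP',                             # bare protocol name
  'foo.bar ESMTP Postfix',            # extra product text after ESMTP
  'foo.bar ESMTP Service Ready (hi)'  # trailing text after the phrase
].each do |banner|
  puts "#{banner.inspect} => #{classify(banner)}"
end
```

In Ruby the first three banners match and the last two do not, which supports the reading that the expression accepts a narrow family of banners rather than arbitrary text around "ESMTP Service Ready".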

@@ -69,28 +69,43 @@ The system or service fingerprint with the highest certainty overwrites the othe
http://www.argosoft.com/applications/mailserver/
Example: 220 ArGoSoft Mail Server, Version 1.4 (1.4.0.3)
</description>
<param pos="0" name="os.vendor" value="Microsoft"/>
<param pos="0" name="os.family" value="Windows"/>
<param pos="0" name="os.device" value="General"/>
Contributor

We should not be adding the "General" fingerprint (here and elsewhere)

@rapid7 rapid7 deleted a comment from tsellers-r7 Dec 4, 2017
- Rename multiple Sendmail fingerprints with the description of "unknown" to
  something descriptive. This allows generating more accurate metrics of which
  FPs matched. Simplification of multiple description lines.

- Tuning Sendmail fingerprints regex and ordering.
<description>IBM Domino SMTP MTA</description>
<example host.name="foo.bar" service.version="9.0.1FP8 HF475">foo.bar ESMTP Service (IBM Domino Release 9.0.1FP8 HF475) ready at Thu, 30 Nov 2017 17:55:48 +0900</example>
<example host.name="foo.bar" service.version="9.0.1">foo.bar ESMTP Service (IBM Domino Release 9.0.1) ready at Thu, 30 Nov 2017 10:12:26 +0100</example>
<example service.version="9.0.1FP8"> ESMTP Service (IBM Domino Release 9.0.1FP8) ready at Thu, 30 Nov 2017 13:51:59 -0800</example>
Contributor

What is host.name in this case? You made the outer group non-capturing and optional, but the inner group will still capture. I wonder if it is just set to the empty string? Do we care? I think the only solution would be to split these fingerprints up, one with a hostname and one with just spaces.

Contributor Author

I went this route to reduce the number of fingerprints. The result is an empty capture, at least in Ruby and (according to an online tester) Java. I'm OK with splitting them if you think it will make them easier to support or more cross-language compatible.

Contributor

@jhart-r7 jhart-r7 Dec 15, 2017

I guess my concern is that the behavior of an empty field in the "fingerprint" that we return is undefined. It works one way when you use recog as a data source of fingerprints and an API/CLI for interacting with them; it may work another way when used in something else, like metasploit-framework or other products. We don't have a good way to control this today, AFAIK.

IMO, split.

Contributor

We discussed this on the phone. Good to go.
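For readers following the empty-capture question in this thread, here is a minimal Ruby sketch of the behavior being discussed. The regex is a hypothetical approximation in the spirit of the IBM Domino fingerprint (an optional non-capturing group wrapping a hostname capture); the real pattern is not shown in the thread and may differ. The third banner, with no leading space, is also made up for contrast.

```ruby
# Hypothetical pattern: optional outer group, capturing inner hostname group.
DOMINO = /^(?:([^ ]*) )?ESMTP Service \(IBM Domino Release ([^)]+)\) ready at/

# Return the two captures, or nil when the banner does not match at all.
def captures_for(banner)
  m = DOMINO.match(banner)
  m && { host: m[1], version: m[2] }
end

WITH_HOST     = 'foo.bar ESMTP Service (IBM Domino Release 9.0.1) ready at Thu, 30 Nov 2017 10:12:26 +0100'
LEADING_SPACE = ' ESMTP Service (IBM Domino Release 9.0.1FP8) ready at Thu, 30 Nov 2017 13:51:59 -0800'
NO_PREFIX     = 'ESMTP Service (IBM Domino Release 9.0.1) ready at Thu, 30 Nov 2017 10:12:26 +0100'

puts captures_for(WITH_HOST).inspect      # host is "foo.bar"
puts captures_for(LEADING_SPACE).inspect  # host is "" (group took part, matched nothing)
puts captures_for(NO_PREFIX).inspect      # host is nil (group did not take part)
```

In Ruby, the same optional group can therefore yield either an empty string or nil depending on the banner, so a consumer that wants one behavior would need to normalize (for example with `host.to_s.empty?`). How other engines and downstream products treat these two cases is exactly the undefined behavior raised in the review.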

@tsellers-r7 tsellers-r7 changed the title from "smtp_banners: Update using Project Sonar data from 2017.11.30" to "smtp_banners: Update using Project Sonar data from 2017.11.30 and 2018.04.05" on Apr 15, 2018
@tsellers-r7 tsellers-r7 merged commit 18418dd into rapid7:master Apr 16, 2018
@tsellers-r7 tsellers-r7 deleted the smtp_update branch April 16, 2018 15:45