Add Hadoop Fingerprints #715

wants to merge 3 commits into



Varunram commented Mar 1, 2017

With reference to a comment on #620. Opened as a separate PR for easier reviewing.

@nnposter Mind taking a look at these? Be sure to credit @maaaaz who developed the original probes. Thanks!

nnposter commented Mar 3, 2017

This will need a bit of touch-up:

  • category should not be info (perhaps a confusion with severity?) but an existing category, maybe management.
  • A typo: logn.jsp instead of login.jsp here
  • A minor issue: Dots should be escaped here
  • A typo: Cloduera here
  • Single quotes need to be escaped here
  • There are missing commas here and here
  • For the last two fingerprints it might be worth rearranging the matchers. As an example, it is more interesting to report the version than the state.

Please let me know if you have any questions or concerns.

Varunram commented Mar 3, 2017

@nnposter Made the required changes

nnposter commented Mar 3, 2017 edited

The single-quote escaping is still incorrect for two reasons:

  • There are two instances of single quotes inside the string, not just one. Both need to be escaped.
  • Using % ensures that the next character is not interpreted as a pattern meta-character, but that is not what you need here. Instead, you need to make sure that the Lua parser correctly identifies the string boundaries, so the escaping character is \, not %. See this example:
$ lua
Lua 5.2.3  Copyright (C) 1994-2013, PUC-Rio
> foo='aaa%'bbb%'ccc'
stdin:1: syntax error near '%'
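
For contrast, here is a small standalone sketch of both kinds of escaping. It is not taken from the fingerprint file; the variable names and the `loginXjsp` test string are made up for illustration.

```lua
-- loadstring exists in Lua 5.1/LuaJIT; load accepts a string in 5.2 and later.
local compile = loadstring or load

-- Backslash escapes the quote for the Lua parser, so this compiles as one string.
local ok_chunk = compile([[foo = 'aaa\'bbb\'ccc']])

-- Percent means nothing to the parser: the string ends at the second quote
-- and compilation fails with a syntax error.
local bad_chunk, err = compile([[foo = 'aaa%'bbb%'ccc']])

-- The backslashes are consumed by the parser and do not appear in the value.
local escaped = 'aaa\'bbb\'ccc'

-- Percent does its job inside a pattern: an unescaped dot matches any
-- character, while %. matches only a literal dot.
local unescaped_dot = string.find('loginXjsp', 'login.jsp') ~= nil
local escaped_dot = string.find('loginXjsp', 'login%.jsp') ~= nil

print(ok_chunk ~= nil, bad_chunk == nil, escaped, unescaped_dot, escaped_dot)
```

In short, \ is for the Lua parser and % is for the pattern matcher, and a single-quoted pattern containing a literal quote needs the former.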

Let's also try to tighten the Cloudera Manager pattern to make it more future-proof against false-positives. The relevant payload to be parsed is:

<script type="text/javascript">
var clouderaManager = {
    version: '5.7.0',
    state: '0',
    license: 'no uuid',

So let's:

  • Anchor the pattern on the assignment var clouderaManager, not just clouderaManager, just like Thomas originally had it.
  • Restrict the potential overrun due to .* in front of version. Otherwise we would be attempting to grab the last occurrence of version in the entire response.

The end-result would be something like:
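
A sketch of one pattern that satisfies both points, checked against the payload quoted above. The pattern is an assumption about the shape of the fix and not necessarily what was committed; the `decoy` string is invented to show the overrun being blocked.

```lua
-- Payload excerpt as quoted in this comment.
local body = [[
<script type="text/javascript">
var clouderaManager = {
    version: '5.7.0',
    state: '0',
    license: 'no uuid',
]]

-- Hypothetical pattern: anchored on the assignment "var clouderaManager =",
-- and [^}]- (lazy, no closing brace allowed) keeps the search for "version:"
-- inside the object literal instead of running across the whole response.
local pattern = 'var%s+clouderaManager%s*=%s*{[^}]-version:%s*\'([^\']*)\''

local version = body:match(pattern)

-- Invented decoy: a "version" key that belongs to a different object
-- must not be picked up.
local decoy = "var clouderaManager = {}; var other = { version: '9.9' };"
local decoy_version = decoy:match(pattern)

print(version, decoy_version)
```

Both single quotes inside the pattern are escaped with \ so that the Lua parser sees a single string literal.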


For the YARN patterns we might be grabbing too much of the versioning information.

2.6.0-cdh5.9.0 from 69e4e0d951319fea693402c9f82449447fd00a17  by jenkins source checksum 63b7a782ae4cd2a7918821206cd0a0a6 on 2016-10-21T08:28Z

If we stop at the version itself, not including the source of the version, then we could collect and combine all three pieces of information, namely the RM version, state, and Hadoop version, still producing a relatively compact one-liner:

8088/tcp open  radan-http syn-ack
| http-enum:
|_  /cluster/cluster: Hadoop YARN Resource Manager version 2.6.0-cdh5.9.0, state "started", Hadoop version 2.6.0-cdh5.9.0

I am not sure we need to care about the full version string. Does anybody believe otherwise?

The implementation could look something like:

        match = 'ResourceManager state:.-<td>%s*([^%s<]*)'
                .. '.-ResourceManager version:.-<td>%s*([^%s<]*)'
                .. '.-Hadoop version:.-<td>%s*([^%s<]*)',
        output = 'Hadoop YARN Resource Manager version \\2,'
                 .. ' state "\\1", Hadoop version \\3'
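
To try a pattern like this outside Nmap, it can be run with plain `string.match`. In the sketch below the HTML fragment is an invented approximation of the page, not a captured response, and the `gsub` step only emulates the `\1`..`\3` substitution for illustration; it is not the http-enum implementation.

```lua
local pattern = 'ResourceManager state:.-<td>%s*([^%s<]*)'
             .. '.-ResourceManager version:.-<td>%s*([^%s<]*)'
             .. '.-Hadoop version:.-<td>%s*([^%s<]*)'
local template = 'Hadoop YARN Resource Manager version \\2,'
              .. ' state "\\1", Hadoop version \\3'

-- Invented sample markup; only the labels and the version string format
-- come from the discussion above.
local html = [[
<table>
<tr><th>ResourceManager state:</th><td>
  STARTED
</td></tr>
<tr><th>ResourceManager version:</th><td>
  2.6.0-cdh5.9.0 from 69e4e0d951319fea693402c9f82449447fd00a17 by jenkins source checksum 63b7a782ae4cd2a7918821206cd0a0a6 on 2016-10-21T08:28Z
</td></tr>
<tr><th>Hadoop version:</th><td>
  2.6.0-cdh5.9.0 from 69e4e0d951319fea693402c9f82449447fd00a17 by jenkins source checksum 63b7a782ae4cd2a7918821206cd0a0a6 on 2016-10-21T08:28Z
</td></tr>
</table>
]]

-- Each capture stops at the first whitespace or '<', so the
-- "from ... by jenkins ..." tail is left out.
local state, rm_version, hadoop_version = html:match(pattern)
local captures = { state, rm_version, hadoop_version }

-- Emulate the backreference substitution: replace \N with the Nth capture.
local line = (template:gsub('\\(%d)', function(n)
  return captures[tonumber(n)]
end))

print(line)
```

Because `[^%s<]*` ends at the first space, only the bare version is captured, which is what keeps the combined output to one line.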

A similar extraction pattern could be used for Node Manager. I would also drop the following part entirely. It is redundant:

        match = '<h3>%s*NodeManager%s*</h3>',
        output = 'Hadoop YARN Node Manager WebUI'

Varunram commented Mar 3, 2017 edited

Thanks for the update! Updating PR shortly

nnposter commented Mar 3, 2017

I did not intend to say that you should just replace "Resource" with "Node". The newly updated Node Manager pattern is no longer matching the literal strings observable in Thomas' scripts, such as "NodeManager" vs. "Node Manager" or the absence of the state.

I have made those adjustments and I am committing the fingerprints shortly.
Thank you and @maaaaz for your hard work on this.

nmap-bot closed this in fe622e1 Mar 4, 2017

Varunram deleted the Varunram:fingerprints branch Mar 4, 2017

maaaaz commented Mar 4, 2017

Thank you @nnposter and @Varunram !

maaaaz commented May 4, 2017

Sorry to re-open that conversation but it could be a good idea to take the old hadoop modules and integrate the probes into the http-enum script, as they are just fingerprinting from HTTP headers.
