Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wrong reverse dns lookups in shadowserver reports cause a crash of the FQDN-validation #1022

Closed
bernhardreiter opened this issue Jun 27, 2017 · 8 comments
Labels
bug Indicates an unexpected problem or unintended behavior component: bots
Milestone

Comments

@bernhardreiter
Copy link
Contributor

bernhardreiter commented Jun 27, 2017

Right now, if a reverse dns returns something bogus, there is a crash and dump
in sanitize()

Happened with shadowserver opendns resolver, and the hostname field was something like
.example.net

 'source_queue': 'shadowserver-parser-dns-open-resolvers-queue',
 'traceback': ['Traceback (most recent call last):',
               '  File "/usr/lib/python3.4/encodings/idna.py", line 165, in '
               'encode',
               '    raise UnicodeError("label empty or too long")',
               'UnicodeError: label empty or too long',
               '',
               'The above exception was the direct cause of the following '
               'exception:',
               '',
               'Traceback (most recent call last):',
               '  File "/usr/lib/python3/dist-packages/intelmq/lib/bot.py", '
               'line 606, in process',
               '    events = list(filter(bool, self.parse_line(line, 
report)))',
               '  File '
               '"/usr/lib/python3/dist-packages/intelmq/bots/parsers/shadowserver/parser.py", '
               'line 150, in parse_line',
               '    event.add(intelmqkey, value)',
               '  File '
               '"/usr/lib/python3/dist-packages/intelmq/lib/message.py", 
line '
               '202, in add',
               '    value = self.__sanitize_value(key, value)',
               '  File '
               '"/usr/lib/python3/dist-packages/intelmq/lib/message.py", 
line '
               '298, in __sanitize_value',
               '    return class_reference().sanitize(value)',
               '  File '
               '"/usr/lib/python3/dist-packages/intelmq/lib/harmonization.py", '
               'line 374, in sanitize',
               "    return value.encode('idna').decode().lower()",
               "UnicodeError: encoding with 'idna' codec failed 
(UnicodeError: "
               'label empty or too long)']}
@ghost ghost added bug Indicates an unexpected problem or unintended behavior component: bots labels Jun 28, 2017
@ghost ghost added this to the v1.0 Stable Release milestone Jun 28, 2017
@dmth
Copy link
Contributor

dmth commented Jun 28, 2017

harmonization.py does this, to validate "FQDN" -Data. As you can see, leading dots are not stripped.

    @staticmethod
    def sanitize(value):
        value = value.rstrip('.')
        if value:
            return value.encode('idna').decode().lower()

Minimalistic example to reproduce the error:

>>> badurl = ".example.com"
>>> goodurl = "example.com"

>>> goodurl.encode("idna")
b'example.com'

>>> badurl.encode("idna")
Traceback (most recent call last):
  File "/usr/lib/python3.4/encodings/idna.py", line 165, in encode
    raise UnicodeError("label empty or too long")
UnicodeError: label empty or too long

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label empty or too long)

The data in the report is simply wrong. But how to deal with this?

Reject the dataset (the call of is_valid()) should fail now)?

    @staticmethod
    def sanitize(value):
        value = value.rstrip('.')
        if value[0] == ".":
            return False
        if value:
            return value.encode('idna').decode().lower()

Or alter and sanitize the dataset?

    @staticmethod
    def sanitize(value):
        value = value.rstrip('.')
        value = value.lstrip('.')
        if value:
            return value.encode('idna').decode().lower()

Is an indication of this alteration necessary?

@dmth dmth changed the title leave hostname empty, if the reverse dns lookup produces wrong reverse dns lookups in shadowserver reports cause a crash of the FQDN-validation Jun 30, 2017
@dmth
Copy link
Contributor

dmth commented Jun 30, 2017

This issue shall only focus on a solution for the Shadowserver-Parser, for instance by creating a validatior in config.py

A universal solution for this problem is pursued in #1030

@ghost
Copy link

ghost commented Jul 3, 2017

IMHO we can strip the leading dot always. See also #369 for the discussion on trailing dot

Ping @aaronkaplan

EDIT: s/zero/dot/ :/

@bernhardreiter
Copy link
Contributor Author

@wagner-certat you are refering to the leading dot (.)?

@ghost
Copy link

ghost commented Jul 3, 2017

@bernhardreiter yes :)

@dmth
Copy link
Contributor

dmth commented Jul 6, 2017

We are going to work on that issue, first as validator in the shadowserver-config.
If we create a PR for that , we'll create it against the master branch, as it is a bugfix.

dmth pushed a commit to Intevation/intelmq that referenced this issue Jul 6, 2017
Extends the validate_fqdn method by removing trailing dots from
the fqdn.
This fixes certtools#1022 on the
shadowserver-parser config level.

All "hostname" carrying values are now validated using this function.

Nevertheless this issue should still be discussed as it is present for all
other feeds. See certtools#1030 for the
general solution.
dmth pushed a commit to Intevation/intelmq that referenced this issue Jul 6, 2017
Extends the validate_fqdn method by removing trailing dots from
the fqdn.
This fixes certtools#1022 on the
shadowserver-parser config level.

All "hostname" carrying values are now validated using this function.

Nevertheless this issue should still be discussed as it is present for all
other feeds. See certtools#1030 for the
general solution.
@ghost
Copy link

ghost commented Jul 6, 2017 via email

@dmth
Copy link
Contributor

dmth commented Jul 6, 2017

I'll push a commit fixing this to both branches of course.

Great I'm looking forward to it.

@ghost ghost closed this as completed in ab3b31e Jul 6, 2017
ghost pushed a commit that referenced this issue Jul 6, 2017
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior component: bots
Projects
None yet
Development

No branches or pull requests

2 participants