BUG: lib/harm & bots: Use new RSIT mapping

see CHANGELOG and NEWS

fixes #1380

  - replace `botnet drone` with `infected-system`
  - replace `infected system` with `infected-system`
  - replace `ids alert` with `ids-alert`
  - replace `c&c` with `c2server`
  - replace `malware configuration` with `malware-configuration`
wagner-certat committed May 14, 2019
1 parent 9d4f62b commit e25cf7cce6b243bcd16c1ee6783dc03e62580918
Showing with 257 additions and 176 deletions.
  1. +8 −0 CHANGELOG.md
  2. +28 −0 NEWS.md
  3. +2 −2 docs/Feeds.md
  4. +11 −5 docs/Harmonization-fields.md
  5. +2 −2 intelmq/bots/BOTS
  6. +6 −7 intelmq/bots/experts/idea/expert.py
  7. +4 −5 intelmq/bots/experts/taxonomy/expert.py
  8. +1 −1 intelmq/bots/parsers/abusech/parser_domain.py
  9. +1 −1 intelmq/bots/parsers/abusech/parser_ip.py
  10. +2 −2 intelmq/bots/parsers/abusech/parser_ransomware.py
  11. +1 −1 intelmq/bots/parsers/alienvault/parser.py
  12. +2 −2 intelmq/bots/parsers/bambenek/parser.py
  13. +7 −7 intelmq/bots/parsers/blocklistde/parser.py
  14. +1 −1 intelmq/bots/parsers/blueliv/parser_crimeserver.py
  15. +4 −4 intelmq/bots/parsers/cert_eu/parser_csv.py
  16. +5 −5 intelmq/bots/parsers/cymru/parser_cap_program.py
  17. +2 −2 intelmq/bots/parsers/fraunhofer/parser_ddosattack_cnc.py
  18. +1 −1 intelmq/bots/parsers/fraunhofer/parser_dga.py
  19. +4 −4 intelmq/bots/parsers/mcafee/parser_atd.py
  20. +1 −1 intelmq/bots/parsers/microsoft/parser_ctip.py
  21. +4 −4 intelmq/bots/parsers/misp/parser.py
  22. +4 −4 intelmq/bots/parsers/n6/parser_n6stomp.py
  23. +1 −1 intelmq/bots/parsers/netlab_360/parser.py
  24. +4 −4 intelmq/bots/parsers/shadowserver/config.py
  25. +3 −3 intelmq/bots/parsers/spamhaus/parser_cert.py
  26. +2 −2 intelmq/bots/parsers/taichung/parser.py
  27. +2 −2 intelmq/etc/feeds.yaml
  28. +26 −5 intelmq/lib/harmonization.py
  29. +1 −1 intelmq/tests/bots/collectors/tcp/test_collector.py
  30. +2 −2 intelmq/tests/bots/experts/generic_db_lookup/test_expert.py
  31. +1 −1 intelmq/tests/bots/experts/idea/test_expert.py
  32. +1 −1 intelmq/tests/bots/experts/modify/test_expert.py
  33. +1 −1 intelmq/tests/bots/experts/sieve/test_expert.py
  34. +2 −2 intelmq/tests/bots/experts/sieve/test_sieve_files/test_string_match_value_list.sieve
  35. +5 −5 intelmq/tests/bots/outputs/elasticsearch/test_output.py
  36. +2 −2 intelmq/tests/bots/outputs/mongodb/test_output.py
  37. +1 −1 intelmq/tests/bots/outputs/postgresql/test_output.py
  38. +1 −1 intelmq/tests/bots/parsers/abusech/test_parser_domain.py
  39. +1 −1 intelmq/tests/bots/parsers/abusech/test_parser_ip.py
  40. +1 −1 intelmq/tests/bots/parsers/abusech/test_parser_ip_zeus.py
  41. +4 −4 intelmq/tests/bots/parsers/abusech/test_parser_ransomware.py
  42. +2 −2 intelmq/tests/bots/parsers/bambenek/test_parser.py
  43. +1 −1 intelmq/tests/bots/parsers/blocklistde/test_parser.py
  44. +4 −4 intelmq/tests/bots/parsers/cymru/test_cap_program.py
  45. +4 −4 intelmq/tests/bots/parsers/cymru/test_cap_program_new.py
  46. +1 −1 intelmq/tests/bots/parsers/fraunhofer/test_parser_ddosattack_cnc.py
  47. +1 −1 intelmq/tests/bots/parsers/fraunhofer/test_parser_dga.py
  48. +4 −4 intelmq/tests/bots/parsers/generic/test_parser_csv4.py
  49. +2 −2 intelmq/tests/bots/parsers/generic/test_parser_csv_data_type.py
  50. +3 −3 intelmq/tests/bots/parsers/generic/test_parser_csv_extra_regex.py
  51. +3 −3 intelmq/tests/bots/parsers/generic/test_parser_multivalue_cols.py
  52. +2 −2 intelmq/tests/bots/parsers/json/data.json
  53. +1 −1 intelmq/tests/bots/parsers/json/data2.json
  54. +5 −5 intelmq/tests/bots/parsers/json/test_parser.py
  55. +1 −1 intelmq/tests/bots/parsers/mcafee/test_parser_atd.py
  56. +5 −5 intelmq/tests/bots/parsers/microsoft/test_parser_ctip.py
  57. +2 −2 intelmq/tests/bots/parsers/n6/test_parser.py
  58. +2 −2 intelmq/tests/bots/parsers/netlab_360/test_parser.py
  59. +8 −8 intelmq/tests/bots/parsers/shadowserver/test_drone_hadoop.py
  60. +14 −14 intelmq/tests/bots/parsers/shadowserver/test_microsoft_sinkhole.py
  61. +3 −3 intelmq/tests/bots/parsers/shadowserver/test_sinkhole6_http.py
  62. +2 −2 intelmq/tests/bots/parsers/shadowserver/test_sinkhole_http_drone.py
  63. +5 −5 intelmq/tests/bots/parsers/spamhaus/test_parser_cert.py
  64. +20 −0 intelmq/tests/lib/test_harmonization.py
@@ -13,13 +13,21 @@ CHANGELOG
- Use `statistics_*` parameters for bot's statistics (#1402).
- Introduce `collector_empty_process` for collectors with an empty `process()` method. It has a hardcoded minimum sleep time of 1s, which prevents endless loops that cause high load (#1364).
- `intelmq.lib.pipeline`: redis: OOM can also be low memory, add this to log message (#1405).
- `intelmq.lib.harmonization`: ClassificationType: Update RSIT mapping (#1380):
- replace `botnet drone` with `infected-system`
- replace `infected system` with `infected-system`
- replace `ids alert` with `ids-alert`
- replace `c&c` with `c2server`
- replace `malware configuration` with `malware-configuration`
- sanitize replaces these values on the fly
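The on-the-fly replacement described in this entry can be pictured with a minimal sketch. It is not the code from `intelmq/lib/harmonization.py`: the names `LEGACY_TO_RSIT` and `sanitize_classification_type` are hypothetical, and the lower-casing and whitespace stripping are assumptions made for illustration.

```python
# Legacy classification.type values and their RSIT replacements,
# copied from the list in this changelog entry.
LEGACY_TO_RSIT = {
    "botnet drone": "infected-system",
    "infected system": "infected-system",
    "ids alert": "ids-alert",
    "c&c": "c2server",
    "malware configuration": "malware-configuration",
}


def sanitize_classification_type(value: str) -> str:
    """Hypothetical helper: normalize a value, then map legacy names.

    Stripping and lower-casing are assumptions of this sketch.
    Values that are not in the legacy table pass through unchanged.
    """
    normalized = value.strip().lower()
    return LEGACY_TO_RSIT.get(normalized, normalized)


for example in ("botnet drone", "c&c", "phishing"):
    print(example, "->", sanitize_classification_type(example))
```

Keeping the old names in a single lookup table means a value that already uses the new spelling is returned as it is, so applying the helper twice gives the same result as applying it once.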

### Development
- Applied isort to all core files and core-related test files, sorting the imports there (everything except bots and bots' tests).

### Harmonization

### Bots
- Use the new RSIT types in several bots, see above
#### Collectors

#### Parsers
28 NEWS.md
@@ -11,6 +11,12 @@ See the changelog for a full list of changes.
### Tools

### Harmonization
The allowed values for the `classification.type` field have been updated to the RSIT mapping. These old values have changed and are automatically mapped to the new ones:
- `botnet drone` to `infected-system`
- `infected system` to `infected-system`
- `ids alert` to `ids-alert`
- `c&c` to `c2server`
- `malware configuration` to `malware-configuration`

### Configuration
Four new values have been introduced to configure the statistics database. Add them to your `defaults.conf` file:
@@ -22,6 +28,28 @@ Four new values have been introduced to configure the statistics database. Add t
### Libraries

### Postgres databases
The following statements are optional and update existing data.
Check whether your data uses these classification types and adapt the statements to your setup if necessary.
```SQL
UPDATE events
SET "classification.type" = 'infected-system'
WHERE "classification.type" = 'botnet drone';
UPDATE events
SET "classification.type" = 'infected-system'
WHERE "classification.type" = 'infected system';
UPDATE events
SET "classification.type" = 'ids-alert'
WHERE "classification.type" = 'ids alert';
UPDATE events
SET "classification.type" = 'c2server'
WHERE "classification.type" = 'c&c';
UPDATE events
SET "classification.type" = 'malware-configuration'
WHERE "classification.type" = 'malware configuration';
```
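The statements above all follow one pattern, so they can also be derived from a single mapping table. The sketch below is an illustration only: `LEGACY_TO_RSIT`, `UPDATE_QUERY` and `build_updates` are hypothetical names, and the `%s` placeholders assume a driver such as psycopg2 that accepts this parameter style. The code only builds the queries and does not connect to a database.

```python
from typing import List, Tuple

# Old value -> new value, as in the SQL statements above.
LEGACY_TO_RSIT = {
    "botnet drone": "infected-system",
    "infected system": "infected-system",
    "ids alert": "ids-alert",
    "c&c": "c2server",
    "malware configuration": "malware-configuration",
}

# Parameterized form of the UPDATE statement; the table name "events"
# is taken from the statements above.
UPDATE_QUERY = (
    'UPDATE events '
    'SET "classification.type" = %s '
    'WHERE "classification.type" = %s'
)


def build_updates() -> List[Tuple[str, Tuple[str, str]]]:
    """Return one (query, (new_value, old_value)) pair per legacy type."""
    return [(UPDATE_QUERY, (new, old)) for old, new in LEGACY_TO_RSIT.items()]


for query, params in build_updates():
    print(query, params)
```

With a database cursor at hand, each pair could be passed to `cursor.execute(query, params)`. Passing the values as parameters avoids quoting problems with values such as `c&c`.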

In the section for 1.1.0 there was this command:
```
2.0.0.beta1 release (2019-04-10)
@@ -81,7 +81,7 @@ To add feeds to this file add them to `intelmq/etc/feeds.yaml` and then run
* * `columns`: `['time.source', 'source.ip', 'malware.name', 'status', 'extra.SBL', 'source.as_name', 'source.geolocation.cc']`
* * `ignore_values`: `['', '', '', '', 'Not listed', '', '']`
* * `skip_table_head`: `True`
* * `type`: `c&c`
* * `type`: `c2server`


## Feodo Tracker IPs
@@ -750,7 +750,7 @@ To add feeds to this file add them to `intelmq/etc/feeds.yaml` and then run
* * `columns`: `['time.source', 'source.url', 'source.ip', 'malware.name', '__IGNORE__']`
* * `default_url_protocol`: `http://`
* * `skip_table_head`: `True`
* * `type`: `c&c`
* * `type`: `c2server`


# DShield
@@ -130,14 +130,20 @@ Reference Security Incident Taxonomy Working Group – RSIT WG
https://github.com/enisaeu/Reference-Security-Incident-Taxonomy-Task-Force/
with extensions.

These old values are automatically mapped to the new ones:
'botnet drone' -> 'infected-system'
'ids alert' -> 'ids-alert'
'c&c' -> 'c2server'
'infected system' -> 'infected-system'
'malware configuration' -> 'malware-configuration'

Allowed values are:
* application-compromise
* backdoor
* blacklist
* botnet drone
* brute-force
* burglary
* c&c
* c2server
* compromised
* copyright
* data-loss
@@ -149,12 +155,12 @@ Allowed values are:
* dropzone
* exploit
* harmful-speech
* ids alert
* infected system
* ids-alert
* infected-system
* information-disclosure
* leak
* malware
* malware configuration
* malware-configuration
* malware-distribution
* masquerade
* other
@@ -449,7 +449,7 @@
"filter_type": null,
"skip_header": true,
"time_format": null,
"type": "c&c",
"type": "c2server",
"type_translation": null
}
},
@@ -474,7 +474,7 @@
"split_index": 0,
"default_url_protocol": "http://",
"time_format": null,
"type": "c&c"
"type": "c2server"
}
},
"HpHosts": {
@@ -30,15 +30,14 @@ class IdeaExpertBot(Bot):
"spam": "Abusive.Spam",
"scanner": "Recon.Scanning",
"dropzone": "Information.UnauthorizedAccess",
"infected system": "Malware",
"malware configuration": "Malware",
"botnet drone": "Malware",
"infected-system": "Malware",
"malware-configuration": "Malware",
"ransomware": "Malware",
"malware": "Malware",
"c&c": "Intrusion.Botnet",
"c2server": "Intrusion.Botnet",
"exploit": "Attempt.Exploit",
"brute-force": "Attempt.Login",
"ids alert": "Attempt.Exploit",
"ids-alert": "Attempt.Exploit",
"defacement": "Intrusion.AppCompromise",
"compromised": "Intrusion.AdminCompromise",
"backdoor": "Intrusion.AdminCompromise",
@@ -84,8 +83,8 @@ class IdeaExpertBot(Bot):

"phishing": "Phishing",
"dropzone": "Dropzone",
"malware configuration": "MalwareConf",
"c&c": "CC",
"malware-configuration": "MalwareConf",
"c2server": "CC",
"dga domain": "DGA",
"proxy": "Proxy",
"tor": "Tor",
@@ -34,7 +34,7 @@
"social-engineering": "information-gathering",
"brute-force": "intrusion attempts",
"exploit": "intrusion attempts",
"ids alert": "intrusion attempts", # ENISA eCSIRT-II taxonomy: 'ids-alert'
"ids-alert": "intrusion attempts",
"application-compromise": "intrusions",
"backdoor": "intrusions", # not in ENISA eCSIRT-II taxonomy
"burglary": "intrusions",
@@ -44,12 +44,11 @@
"unauthorized-command": "intrusions", # not in ENISA eCSIRT-II taxonomy
"unauthorized-login": "intrusions", # not in ENISA eCSIRT-II taxonomy
"unprivileged-account-compromise": "intrusions",
"botnet drone": "malicious code", # not in ENISA eCSIRT-II taxonomy, deprecated -> infected system
"c&c": "malicious code", # ENISA eCSIRT-II taxonomy: 'c2server'
"c2server": "malicious code", # ENISA eCSIRT-II taxonomy: 'c2server'
"dga domain": "malicious code", # not in ENISA eCSIRT-II taxonomy
"infected system": "malicious code", # ENISA eCSIRT-II taxonomy: 'infected-system'
"infected-system": "malicious code", # ENISA eCSIRT-II taxonomy: 'infected-system'
"malware": "malicious code", # not in ENISA eCSIRT-II taxonomy
"malware configuration": "malicious code", # ENISA eCSIRT-II taxonomy: 'malware-configuration'
"malware-configuration": "malicious code", # ENISA eCSIRT-II taxonomy: 'malware-configuration'
"malware-distribution": "malicious code",
"ransomware": "malicious code", # not in ENISA eCSIRT-II taxonomy
"blacklist": "other",
@@ -32,7 +32,7 @@ def parse_line(self, line, report):
event = self.new_event(report)
event.add('time.source', self.lastgenerated)
event.add('classification.taxonomy', 'malicious code')
event.add('classification.type', 'c&c')
event.add('classification.type', 'c2server')
event.add('source.fqdn', line)
event.add("raw", line)
event.add("malware.name", SOURCE_FEEDS[report["feed.url"]])
@@ -74,7 +74,7 @@ def __process_defaults(self, event, line, feed_url):
defaults = {
('malware.name', FEEDS[feed_url]['malware']),
('raw', line),
('classification.type', 'c&c'),
('classification.type', 'c2server'),
('classification.taxonomy', 'malicious code'),
('extra.feed_last_generated', self.__last_generated_date)
}
@@ -38,7 +38,7 @@ def process(self):
for nrow in csv.reader(io.StringIO(new_row)):
ev = Event(report)
ev.add('classification.taxonomy', 'malicious code')
ev.add('classification.type', 'c&c')
ev.add('classification.type', 'c2server')
ev.add('classification.identifier', nrow[2].lower())
ev.add('time.source', nrow[0] + ' UTC', overwrite=True)
ev.add('status', nrow[5])
@@ -51,7 +51,7 @@ def process(self):
else:
event = Event(report)
event.add('classification.taxonomy', 'malicious code')
event.add('classification.type', 'c&c')
event.add('classification.type', 'c2server')
event.add('classification.identifier', row[2].lower())
event.add('time.source', row[0] + ' UTC')
event.add('status', row[5])
@@ -3,7 +3,7 @@
from intelmq.lib.bot import ParserBot

CLASSIFICATION = {
"c&c": "c&c",
"c2server": "c2server",
"scanning host": "scanner",
"malicious host": "malware",
"spamming": "spam",
@@ -46,13 +46,13 @@ def parse_line(self, line, report):
if report['feed.url'] in BambenekParserBot.IPMASTERLIST:
event.add('source.ip', value[0])
event.add('time.source', value[2] + ' UTC')
event.add('classification.type', 'c&c')
event.add('classification.type', 'c2server')
event.add('status', 'online')

elif report['feed.url'] in BambenekParserBot.DOMMASTERLIST:
event.add('source.fqdn', value[0])
event.add('time.source', value[2] + ' UTC')
event.add('classification.type', 'c&c')
event.add('classification.type', 'c2server')
event.add('status', 'online')

elif report['feed.url'] in BambenekParserBot.DGA_FEED:
@@ -9,37 +9,37 @@
"classification.type": "blacklist",
},
"ssh.txt": {
"classification.type": "ids alert",
"classification.type": "ids-alert",
"protocol.application": "ssh",
"event_description.text": "IP reported as having run attacks on the "
"service SSH",
},
"mail.txt": {
"classification.type": "ids alert",
"classification.type": "ids-alert",
"protocol.application": "smtp",
"event_description.text": "IP reported as having run attacks on the "
"service Mail, Postfix",
},
"apache.txt": {
"classification.type": "ids alert",
"classification.type": "ids-alert",
"protocol.application": "http",
"event_description.text": "IP reported as having run attacks on the "
"service Apache, Apache-DDoS, RFI-Attacks",
},
"imap.txt": {
"classification.type": "ids alert",
"classification.type": "ids-alert",
"protocol.application": "imap",
"event_description.text": "IP reported as having run attacks on the "
"service IMAP, SASL, POP3",
},
"ftp.txt": {
"classification.type": "ids alert",
"classification.type": "ids-alert",
"protocol.application": "ftp",
"event_description.text": "IP reported as having run attacks on the "
"service FTP",
},
"sip.txt": {
"classification.type": "ids alert",
"classification.type": "ids-alert",
"protocol.application": "sip",
"event_description.text": "IP reported as having run attacks on the "
"service SIP, VOIP, Asterisk",
@@ -55,7 +55,7 @@
"2 months",
},
"ircbot.txt": {
"classification.type": "infected system",
"classification.type": "infected-system",
"protocol.application": "irc",
},
"bruteforcelogin.txt": {
@@ -13,7 +13,7 @@
'EXPLOIT_KIT': 'exploit',
'BACKDOOR': 'backdoor',
'TOR_IP': 'proxy',
'C_AND_C': 'c&c'
'C_AND_C': 'c2server'
}


@@ -19,17 +19,17 @@ class CertEUCSVParserBot(ParserBot):
abuse_to_intelmq = defaultdict(lambda: "unknown", {
"backdoor": "backdoor",
"blacklist": "blacklist",
"botnet drone": "botnet drone",
"botnet drone": "infected-system",
"brute-force": "brute-force",
"c&c": "c&c",
"c2server": "c2server",
"compromised server": "compromised",
"ddos infrastructure": "ddos",
"ddos target": "ddos",
"defacement": "defacement",
"dropzone": "dropzone",
"exploit url": "exploit",
"ids alert": "ids alert",
"malware configuration": "malware configuration",
"ids alert": "ids-alert",
"malware-configuration": "malware-configuration",
"malware url": "malware",
"phishing": "phishing",
"ransomware": "ransomware",
@@ -3,11 +3,11 @@
from intelmq.lib.bot import ParserBot

MAPPING_STATIC = {'bot': {
'classification.type': 'infected system'},
'classification.type': 'infected-system'},
'bruteforce': {
'classification.type': 'brute-force'},
'controller': {
'classification.type': 'c&c'},
'classification.type': 'c2server'},
'darknet': {'classification.type': 'scanner',
'classification.identifier': 'darknet'},
'phishing': {'classification.type': 'phishing',
@@ -76,7 +76,7 @@ def parse_line(self, line, report):
elif report_type == 'bots':
# bots|192.0.2.1|ASN|YYYY-MM-DD HH:MM:SS|[srcport <PORT>] [mwtype <TYPE>] [destaddr <IPADDR>] [comment]|ASNAME
# TYPE can contain spaces -.-
event.add('classification.type', 'infected system')
event.add('classification.type', 'infected-system')
comment_results = {}
comment_key = None
comment_value = []
@@ -113,7 +113,7 @@ def parse_line(self, line, report):
# ddosreport|192.0.2.1|ASN|YYYY-MM-DD HH:MM:SS|[<PROTOCOL> <PORT>] [category: <CATEGORY>]
# [servpass: <PASSWORD>] [SSL] [url: <URL>]|ASNAME
raise NotImplementedError('Report %r not implemented, format is unknown.' % report_type)
event['classification.type'] = 'c&c'
event['classification.type'] = 'c2server'
event['protocol.application'] = comment_split[0]
event['source.port'] = comment_split[1]
# TODO: category? password? ssl?
@@ -198,7 +198,7 @@ def parse_line(self, line, report):
break
elif report_type == 'toxbot': # TODO: verify
# toxbot|192.0.2.1|ASN|YYYY-MM-DD HH:MM:SS|srcport <SOURCE PORT>|ASNAME
event.add('classification.type', 'infected system')
event.add('classification.type', 'infected-system')
event.add('classification.identifier', report_type)
event.add('malware.name', report_type)
event['extra.source_port'] = int(comment_split[1])
@@ -2,7 +2,7 @@
"""
The source provides a stream/list of newline separated JSON objects. Each line
represents a single event observed by a DDoS C&C tracker, like an attack
command. This parser emits a c&c event for the C&C tracked server the
command. This parser emits a c2server event for the C&C tracked server the
observed event originated from. If the bot receives a report with a known
C&C type but with an unknown message type, it generates a C&C event with a
feed.accuracy given by the parameter unknown_messagetype_accuracy, if set.
@@ -27,7 +27,7 @@ def __parse_cnc_server(self, message, line, report):
'unsupported cnctype %s.' % message['cnctype'])

event = self.__new_event(message, line, report)
event.add('classification.type', 'c&c')
event.add('classification.type', 'c2server')
event.add('classification.taxonomy', 'malicious code')
event.add('source.fqdn', message['cnc']['domain'])
event.add('source.ip', message['cnc']['ip'])