From e28e24749f93fc39298523ece695b1993a781fef Mon Sep 17 00:00:00 2001
From: cstella
Date: Thu, 5 Oct 2017 16:08:28 -0400
Subject: [PATCH 1/4] Added more detailed documentation around global configs.

---
 metron-platform/metron-common/README.md   | 26 +++++++++++++-
 .../metron-elasticsearch/README.md        | 34 +++++++++++++++++++
 metron-platform/metron-indexing/README.md | 13 +++++++
 metron-platform/metron-parsers/README.md  | 17 +++++++++-
 4 files changed, 88 insertions(+), 2 deletions(-)
 create mode 100644 metron-platform/metron-elasticsearch/README.md

diff --git a/metron-platform/metron-common/README.md b/metron-platform/metron-common/README.md
index 54738f8230..725c191dc9 100644
--- a/metron-platform/metron-common/README.md
+++ b/metron-platform/metron-common/README.md
@@ -47,7 +47,7 @@ This configuration is stored in zookeeper, but looks something like
   "es.ip": "node1",
   "es.port": "9300",
   "es.date.format": "yyyy.MM.dd.HH",
-  "parser.error.topic": "indexing"
+  "parser.error.topic": "indexing",
   "fieldValidations" : [
     {
       "input" : [ "ip_src_addr", "ip_dst_addr" ],
@@ -60,6 +60,30 @@ This configuration is stored in zookeeper, but looks something like
 }
 ```
 
+The parts of our stack that use the global config are documented throughout the Metron documentation,
+but a convenient index is provided here:
+
+| Property Name | Subsystem | Type | Ambari Property |
+|---------------------------------------------------------------------------------------------------------------|---------------|------------|----------------------------|
+| [`es.clustername`](metron-platform/metron-elasticsearch#esclustername) | Indexing | String | `es_cluster_name` |
+| [`es.ip`](metron-platform/metron-elasticsearch#esip) | Indexing | String | `es_hosts` |
+| [`es.port`](metron-platform/metron-elasticsearch#esport) | Indexing | String | `es_port` |
+| [`es.date.format`](metron-platform/metron-elasticsearch#esdateformat) | Indexing | String | `es_date_format` |
+| [`fieldValidations`](#validation-framework) | Parsing | Object | N/A |
+| [`parser.error.topic`](metron-platform/metron-parsers#parsererrortopic) | Parsing | String | N/A |
+| [`stellar.function.paths`](metron-stellar/stellar-common#stellarfunctionpaths) | Stellar | CSV String | N/A |
+| [`stellar.function.resolver.includes`](metron-stellar/stellar-common#stellarfunctionresolverincludesexcludes) | Stellar | CSV String | N/A |
+| [`stellar.function.resolver.excludes`](metron-stellar/stellar-common#stellarfunctionresolverincludesexcludes) | Stellar | CSV String | N/A |
+| [`profiler.period.duration`](metron-analytics/metron-profiler#profilerperiodduration) | Profiler | Integer | `profiler_period_duration` |
+| [`profiler.period.duration.units`](metron-analytics/metron-profiler#profilerperioddurationunits) | Profiler | String | `profiler_period_units` |
+| [`update.hbase.table`](metron-platform/metron-indexing#updatehbasetable) | REST/Indexing | String | `update_hbase_table` |
+| [`update.hbase.cf`](metron-platform/metron-indexing#updatehbasecf) | REST/Indexing | String | `update_hbase_cf` |
+
+## Note Configs in Ambari
+If a field is managed via Ambari, you should change the field via
+Ambari. Otherwise, upon service restarts, you may find your update
+overwritten.
+
 # Validation Framework
 
 Inside of the global configuration, there is a validation framework in
diff --git a/metron-platform/metron-elasticsearch/README.md b/metron-platform/metron-elasticsearch/README.md
new file mode 100644
index 0000000000..0808769332
--- /dev/null
+++ b/metron-platform/metron-elasticsearch/README.md
@@ -0,0 +1,34 @@
+# Contents
+
+Elasticsearch is a very popular indexing target for Metron. In order to
+configure Elasticsearch, there are a few properties that one must set up in the global
+configuration.
+
+## Properties
+
+### `es.clustername`
+
+The name of the Elasticsearch cluster. See [here](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster.name)
+
+### `es.ip`
+
+Specifies the nodes in the Elasticsearch cluster to use for writing.
+The format is one of the following:
+* A hostname or IP address with a port (e.g. `hostname1:1234`), in which case `es.port` is ignored.
+* A hostname or IP address without a port (e.g. `hostname1`), in which case `es.port` is used.
+* A string containing a CSV of hostnames without ports (e.g. `hostname1,hostname2,hostname3`) without spaces between. `es.port` is assumed to be the port for each host.
+* A string containing a CSV of hostnames with ports (e.g. `hostname1:1234,hostname2:1234,hostname3:1234`) without spaces between. `es.port` is ignored.
+* A list of hostnames with ports (e.g. `[ "hostname1:1234", "hostname2:1234"]`). Note, `es.port` is NOT used in this construction.
+
+### `es.port`
+
+The port for the Elasticsearch hosts. This will be used in accordance with the discussion of `es.ip`.
+
+### `es.date.format`
+
+The date format to use when constructing the indices. For every message, the date format will be applied
+to the current time, and the result becomes the last part of the name of the index to which the message is written.
+
+For instance, an `es.date.format` of `yyyy.MM.dd.HH` would cause the indices to
+roll hourly, whereas an `es.date.format` of `yyyy.MM.dd` would cause the indices to
+roll daily.
diff --git a/metron-platform/metron-indexing/README.md b/metron-platform/metron-indexing/README.md
index e65152c71f..9a682eb4c2 100644
--- a/metron-platform/metron-indexing/README.md
+++ b/metron-platform/metron-indexing/README.md
@@ -146,6 +146,19 @@ in parallel. This enables a flexible strategy for specifying your backing store
 For instance, currently the REST API supports the update functionality and may be configured with a list of
 IndexDao implementations to use to support the updates.
 
+### The `HBaseDao`
+
+Updates will be written to HBase. The key is the GUID and,
+for each new version, a new column is created with the message as its value.
+
+The HBase table and column family are configured via fields in the global configuration.
+
+#### `update.hbase.table`
+The HBase table to use for message updates.
+
+#### `update.hbase.cf`
+The HBase column family to use for message updates.
+
 ### The `MetaAlertDao`
 
 The goal of meta alerts is to be able to group together a set of alerts while being able to transparently perform actions
diff --git a/metron-platform/metron-parsers/README.md b/metron-platform/metron-parsers/README.md
index 141e232d03..ef2d48fa24 100644
--- a/metron-platform/metron-parsers/README.md
+++ b/metron-platform/metron-parsers/README.md
@@ -76,7 +76,22 @@ So putting it all together a typical Metron message with all 5-tuple fields pres
 
 ## Global Configuration
 
-See the "[Global Configuration](../metron-common)" section.
+There are a few properties which can be managed in the global configuration that are pertinent to
+parsers and parsing in general.
+
+### `parser.error.topic`
+
+The topic to which messages that could not be parsed due to an error are sent.
+Error messages will be indexed under a sensor type of `error` and the messages will have
+the following fields:
+* `sensor.type`: `error`
+* `failed_sensor_type` : The sensor type of the message which could not be parsed
+* `error_type` : The error type, in this case `parser`.
+* `stack` : The stack trace of the error
+* `hostname` : The hostname of the node where the error happened
+* `raw_message` : The raw message in string form
+* `raw_message_bytes` : The raw message bytes
+* `error_hash` : A hash of the error message
 
 ## Parser Configuration
 
From a8570271b1b747806ca27c9741a331f8dad00c28 Mon Sep 17 00:00:00 2001
From: cstella
Date: Thu, 5 Oct 2017 16:18:10 -0400
Subject: [PATCH 2/4] Fixed links.

---
 metron-platform/metron-common/README.md | 30 ++++++++++++-------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/metron-platform/metron-common/README.md b/metron-platform/metron-common/README.md
index 725c191dc9..b10841b159 100644
--- a/metron-platform/metron-common/README.md
+++ b/metron-platform/metron-common/README.md
@@ -63,21 +63,21 @@ This configuration is stored in zookeeper, but looks something like
 The parts of our stack that use the global config are documented throughout the Metron documentation,
 but a convenient index is provided here:
 
-| Property Name | Subsystem | Type | Ambari Property |
-|---------------------------------------------------------------------------------------------------------------|---------------|------------|----------------------------|
-| [`es.clustername`](metron-platform/metron-elasticsearch#esclustername) | Indexing | String | `es_cluster_name` |
-| [`es.ip`](metron-platform/metron-elasticsearch#esip) | Indexing | String | `es_hosts` |
-| [`es.port`](metron-platform/metron-elasticsearch#esport) | Indexing | String | `es_port` |
-| [`es.date.format`](metron-platform/metron-elasticsearch#esdateformat) | Indexing | String | `es_date_format` |
-| [`fieldValidations`](#validation-framework) | Parsing | Object | N/A |
-| [`parser.error.topic`](metron-platform/metron-parsers#parsererrortopic) | Parsing | String | N/A |
-| [`stellar.function.paths`](metron-stellar/stellar-common#stellarfunctionpaths) | Stellar | CSV String | N/A |
-| [`stellar.function.resolver.includes`](metron-stellar/stellar-common#stellarfunctionresolverincludesexcludes) | Stellar | CSV String | N/A |
-| [`stellar.function.resolver.excludes`](metron-stellar/stellar-common#stellarfunctionresolverincludesexcludes) | Stellar | CSV String | N/A |
-| [`profiler.period.duration`](metron-analytics/metron-profiler#profilerperiodduration) | Profiler | Integer | `profiler_period_duration` |
-| [`profiler.period.duration.units`](metron-analytics/metron-profiler#profilerperioddurationunits) | Profiler | String | `profiler_period_units` |
-| [`update.hbase.table`](metron-platform/metron-indexing#updatehbasetable) | REST/Indexing | String | `update_hbase_table` |
-| [`update.hbase.cf`](metron-platform/metron-indexing#updatehbasecf) | REST/Indexing | String | `update_hbase_cf` |
+| Property Name | Subsystem | Type | Ambari Property |
+|---------------------------------------------------------------------------------------------------------------------|---------------|------------|----------------------------|
+| [`es.clustername`](../metron-elasticsearch#esclustername) | Indexing | String | `es_cluster_name` |
+| [`es.ip`](../metron-elasticsearch#esip) | Indexing | String | `es_hosts` |
+| [`es.port`](../metron-elasticsearch#esport) | Indexing | String | `es_port` |
+| [`es.date.format`](../metron-elasticsearch#esdateformat) | Indexing | String | `es_date_format` |
+| [`fieldValidations`](#validation-framework) | Parsing | Object | N/A |
+| [`parser.error.topic`](../metron-parsers#parsererrortopic) | Parsing | String | N/A |
+| [`stellar.function.paths`](../../metron-stellar/stellar-common#stellarfunctionpaths) | Stellar | CSV String | N/A |
+| [`stellar.function.resolver.includes`](../../metron-stellar/stellar-common#stellarfunctionresolverincludesexcludes) | Stellar | CSV String | N/A |
+| [`stellar.function.resolver.excludes`](../../metron-stellar/stellar-common#stellarfunctionresolverincludesexcludes) | Stellar | CSV String | N/A |
+| [`profiler.period.duration`](../../metron-analytics/metron-profiler#profilerperiodduration) | Profiler | Integer | `profiler_period_duration` |
+| [`profiler.period.duration.units`](../../metron-analytics/metron-profiler#profilerperioddurationunits) | Profiler | String | `profiler_period_units` |
+| [`update.hbase.table`](../metron-indexing#updatehbasetable) | REST/Indexing | String | `update_hbase_table` |
+| [`update.hbase.cf`](../metron-indexing#updatehbasecf) | REST/Indexing | String | `update_hbase_cf` |
 
 ## Note Configs in Ambari
 If a field is managed via Ambari, you should change the field via
From d07e7634273cea825d70e778f6865ef1e809d8cf Mon Sep 17 00:00:00 2001
From: cstella
Date: Thu, 5 Oct 2017 17:01:32 -0400
Subject: [PATCH 3/4] Added geo

---
 metron-platform/metron-common/README.md     |  1 +
 metron-platform/metron-enrichment/README.md | 20 +++++++++++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/metron-platform/metron-common/README.md b/metron-platform/metron-common/README.md
index b10841b159..5f9fec6015 100644
--- a/metron-platform/metron-common/README.md
+++ b/metron-platform/metron-common/README.md
@@ -78,6 +78,7 @@ but a convenient index is provided here:
 | [`profiler.period.duration.units`](../../metron-analytics/metron-profiler#profilerperioddurationunits) | Profiler | String | `profiler_period_units` |
 | [`update.hbase.table`](../metron-indexing#updatehbasetable) | REST/Indexing | String | `update_hbase_table` |
 | [`update.hbase.cf`](../metron-indexing#updatehbasecf) | REST/Indexing | String | `update_hbase_cf` |
+| [`geo.hdfs.file`](../metron-enrichment#geohdfsfile) | Enrichment | String | `geo_hdfs_file` |
 
 ## Note Configs in Ambari
 If a field is managed via Ambari, you should change the field via
diff --git a/metron-platform/metron-enrichment/README.md b/metron-platform/metron-enrichment/README.md
index 10f2cd4809..fc13bf2730 100644
--- a/metron-platform/metron-enrichment/README.md
+++ b/metron-platform/metron-enrichment/README.md
@@ -25,9 +25,26 @@ defined by JSON documents stored in zookeeper.
 There are two types of configurations at the moment, `global` and
 `sensor` specific.
 
+
 ## Global Configuration
 
-See the "[Global Configuration](../metron-common)" section.
+There are a few enrichments which have independent configurations, such
+as from the global config.
+
+Also, see the "[Global Configuration](../metron-common)" section for
+more discussion of the global config.
+
+### GeoIP
+Metron supports enrichment of IP information using
+[GeoLite2](https://dev.maxmind.com/geoip/geoip2/geolite2/). The
+location of the file is managed in the global config.
+
+#### `geo.hdfs.file`
+
+The location on HDFS of the GeoLite2 database file to use for GeoIP
+lookups. This file will be localized on the storm supervisors running
+the topology and used from there. If this file changes, a topology
+restart will be required.
 
 ## Sensor Enrichment Configuration
 
@@ -269,6 +286,7 @@ An example configuration for the YAF sensor is as follows:
 
 ThreatIntel alert levels are emitted as a new field "threat.triage.level." So for the example above, an incoming message that trips the `ip_src_addr` rule will have a new field threat.triage.level=10.
 
+
 # Example Enrichment via Stellar
 
 Let's walk through doing a simple enrichment using Stellar on your cluster using the Squid topology.
From 9a4d220c73321032219383c903d235810988d80f Mon Sep 17 00:00:00 2001
From: cstella
Date: Thu, 5 Oct 2017 17:15:40 -0400
Subject: [PATCH 4/4] Reworded.

---
 metron-platform/metron-enrichment/README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/metron-platform/metron-enrichment/README.md b/metron-platform/metron-enrichment/README.md
index fc13bf2730..70bf832208 100644
--- a/metron-platform/metron-enrichment/README.md
+++ b/metron-platform/metron-enrichment/README.md
@@ -43,8 +43,9 @@ location of the file is managed in the global config.
 
 The location on HDFS of the GeoLite2 database file to use for GeoIP
 lookups. This file will be localized on the storm supervisors running
-the topology and used from there. If this file changes, a topology
-restart will be required.
+the topology and used from there. Localization is lazy, so if this property
+changes in a running topology, the file will be localized from HDFS the first
+time the file is used via the geo enrichment.
 
 ## Sensor Enrichment Configuration
 
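
Note on patch 4/4: the lazy localization described for `geo.hdfs.file` can be pictured with a small sketch. This is only an illustration in Python under stated assumptions, not Metron's implementation. The class `LazyGeoDatabase` and the `localize` callback (standing in for "copy the file from HDFS and open it") are hypothetical names; only the `geo.hdfs.file` property comes from the patch.

```python
from typing import Callable, Dict, Optional


class LazyGeoDatabase:
    """Hypothetical holder that localizes the GeoLite2 file only on use."""

    def __init__(self, localize: Callable[[str], object]) -> None:
        # `localize` is an assumed stand-in for fetching the file from HDFS
        # and opening it; it is not a Metron API.
        self._localize = localize
        self._loaded_path: Optional[str] = None
        self._db: Optional[object] = None

    def get(self, global_config: Dict[str, str]) -> object:
        """Return the database for the currently configured path."""
        path = global_config["geo.hdfs.file"]
        # Localize only when the configured path differs from the loaded one,
        # so a config change costs nothing until the next lookup.
        if path != self._loaded_path:
            self._db = self._localize(path)
            self._loaded_path = path
        return self._db
```

In this sketch, changing the property does no work by itself; the file is fetched again on the next call to `get`, which mirrors the "upon first use" wording in the patch.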