allow customizing indexes for logs written to OpenSearch/Elasticsearch #313

Closed
mmguero opened this issue Dec 18, 2023 · 2 comments
Labels
elastic Related to issue with external ElasticSearch/Kibana output enhancement New feature or request logstash Relating to Malcolm's use of Logstash opensearch Relating to Malcolm's use of OpenSearch
Milestone

Comments

@mmguero
Collaborator

mmguero commented Dec 18, 2023

Some users want to define their own separate indices for Zeek and Suricata logs rather than lumping them in with the Arkime indices. Users who do this need to understand that, unless they create something like an Elasticsearch data view, those logs won't be visible in the UI/dashboards.

We will also allow Arkime's rotateIndex value to be configured via an environment variable.

Applicable configuration via environment variables:

  • arkime.env
# How often to create a new index in OpenSearch/Elasticsearch
#   https://arkime.com/settings#rotateIndex
ARKIME_ROTATE_INDEX=daily
  • opensearch.env
# OpenSearch index patterns and timestamp fields
# Index pattern for network traffic logs written via Logstash (e.g., Zeek logs, Suricata alerts)
MALCOLM_NETWORK_INDEX_PATTERN=arkime_sessions3-*
# Default time field to use for network traffic logs in Logstash and Dashboards
MALCOLM_NETWORK_INDEX_TIME_FIELD=firstPacket
# Suffix used to create index to which network traffic logs are written (supports Ruby strftime strings in %{})
MALCOLM_NETWORK_INDEX_SUFFIX=%{%y%m%d}
# Index pattern for other logs written via Logstash (e.g., nginx, beats, fluent-bit, etc.)
MALCOLM_OTHER_INDEX_PATTERN=malcolm_beats_*
# Default time field to use for other logs in Logstash and Dashboards
MALCOLM_OTHER_INDEX_TIME_FIELD=@timestamp
# Suffix used to create index to which other logs are written (supports Ruby strftime strings in %{})
MALCOLM_OTHER_INDEX_SUFFIX=%{%y%m%d}
# Index pattern used specifically by Arkime (will probably match MALCOLM_NETWORK_INDEX_PATTERN, should probably be arkime_sessions3-*)
ARKIME_NETWORK_INDEX_PATTERN=arkime_sessions3-*
# Default time field used for sessions in Arkime viewer
ARKIME_NETWORK_INDEX_TIME_FIELD=firstPacket
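
To make the pattern/suffix relationship concrete, here is a minimal Python sketch of how a `%{...}` suffix could turn into an index name. It is an illustration only, not Malcolm's Logstash code: the helper names `expand_suffix` and `index_name` are hypothetical, and the assumption is that the trailing `*` of the pattern is replaced by the expanded suffix. Python's `strftime` is used as a stand-in for Ruby's, which shares the `%y`, `%m`, `%d`, `%H` and `%U` directives used in this issue. The "other" indices also carry a per-source component (for example `nginx_`) that this sketch does not model.

```python
from datetime import datetime, timezone


def expand_suffix(suffix: str, when: datetime) -> str:
    """Expand a '%{...}' suffix by passing the inner format to strftime."""
    if suffix.startswith("%{") and suffix.endswith("}"):
        return when.strftime(suffix[2:-1])
    return suffix  # no %{...} wrapper: treat the suffix as a literal string


def index_name(pattern: str, suffix: str, when: datetime) -> str:
    """Assumption: the trailing '*' of the pattern is replaced by the suffix."""
    return pattern.rstrip("*") + expand_suffix(suffix, when)


# Hypothetical event time, chosen only for illustration
when = datetime(2024, 1, 15, 20, 0, tzinfo=timezone.utc)
print(index_name("arkime_sessions3-*", "%{%y%m%d}", when))  # arkime_sessions3-240115
```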
@mmguero mmguero added enhancement New feature or request opensearch Relating to Malcolm's use of OpenSearch logstash Relating to Malcolm's use of Logstash elastic Related to issue with external ElasticSearch/Kibana output labels Dec 18, 2023
@mmguero mmguero added this to the v24.01.0 milestone Dec 18, 2023
@mmguero mmguero self-assigned this Jan 9, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 10, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 10, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 10, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 11, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 11, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 11, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 11, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 11, 2024
@mmguero mmguero changed the title allow overriding destination index for Zeek and Suricata logs allow customizing indexes for logs written to OpenSearch/Elasticsearch Jan 11, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 15, 2024
@mmguero
Collaborator Author

mmguero commented Jan 15, 2024

For example, in order to use weekly time bucketing for the indices:

  • arkime.env
...
# How often to create a new index in OpenSearch/Elasticsearch
#   https://arkime.com/settings#rotateIndex
ARKIME_ROTATE_INDEX=weekly
...
  • opensearch.env
...
MALCOLM_NETWORK_INDEX_SUFFIX=%{%yw%U}
...
MALCOLM_OTHER_INDEX_SUFFIX=%{%yw%U}
...

resulting in:

dc exec -u $(id -u) opensearch curl -sS -XGET http://localhost:9200/_cat/indices|grep 24w
green  open malcolm_beats_nginx_24w02         -TvMFHXrSoCF5gkL_IRLjQ 1 0    9   2  74.2kb  74.2kb
green  open malcolm_beats_disk_24w02          5Q64CQkqRGi7zNsoUOA-MQ 1 0   21   0  39.1kb  39.1kb
green  open malcolm_beats_thermal_24w02       Z4hFCNqoTEOvls7Bvowwaw 1 0   10   0  79.8kb  79.8kb
green  open arkime_sessions3-24w02            WiPND_cDRVyVN1U-Tmsw6w 1 0  897   0   1.9mb   1.9mb
green  open malcolm_beats_mem_24w02           73TwZodrRQSP8Sy7DrC2Cg 1 0    6   0  41.4kb  41.4kb
green  open malcolm_beats_network_24w02       yE-Nh0U5R8ivusN6-DrD0w 1 0    6   0  57.7kb  57.7kb
green  open malcolm_beats_cpu_24w02           gAICFOJoR46APHXuvu0NhQ 1 0    2   0  51.1kb  51.1kb
green  open malcolm_beats_systemd_24w02       DxBV4MhWSniRBNeblMHm1A 1 0  122  11 313.6kb 313.6kb
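
If the listing needs to be checked programmatically, a small parser along these lines might help. It is a sketch that assumes the default `_cat/indices` column order (health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size, pri.store.size); the `IndexRow` type and `parse_cat_indices` function are hypothetical names, and the sample text is copied from two of the rows above.

```python
from typing import List, NamedTuple


class IndexRow(NamedTuple):
    health: str
    status: str
    index: str
    uuid: str
    pri: int
    rep: int
    docs_count: int
    docs_deleted: int
    store_size: str
    pri_store_size: str


def parse_cat_indices(text: str) -> List[IndexRow]:
    """Parse whitespace-separated _cat/indices rows, skipping malformed lines."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 10:
            continue  # not a data row in the assumed default layout
        rows.append(IndexRow(f[0], f[1], f[2], f[3],
                             int(f[4]), int(f[5]), int(f[6]), int(f[7]),
                             f[8], f[9]))
    return rows


SAMPLE = """\
green  open malcolm_beats_nginx_24w02         -TvMFHXrSoCF5gkL_IRLjQ 1 0    9   2  74.2kb  74.2kb
green  open arkime_sessions3-24w02            WiPND_cDRVyVN1U-Tmsw6w 1 0  897   0   1.9mb   1.9mb
"""

rows = parse_cat_indices(SAMPLE)
for row in rows:
    print(row.index, row.docs_count)
```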

@mmguero
Collaborator Author

mmguero commented Jan 15, 2024

Or hourly:

  • arkime.env
...
# How often to create a new index in OpenSearch/Elasticsearch
#   https://arkime.com/settings#rotateIndex
ARKIME_ROTATE_INDEX=hourly
...
  • opensearch.env
...
MALCOLM_NETWORK_INDEX_SUFFIX=%{%y%m%dh%H}
...
MALCOLM_OTHER_INDEX_SUFFIX=%{%y%m%dh%H}
...

resulting in:

dc exec -u $(id -u) opensearch curl -sS -XGET http://localhost:9200/_cat/indices|grep 15h
green open malcolm_beats_thermal_240115h20 h0_uCC0_QleHz4CA0Uu8tw 1 0    9   0  60.6kb  60.6kb
green open malcolm_beats_cpu_240115h20     jYRE9N7CRwCHwI0CWzWMjQ 1 0    3   0  76.6kb  76.6kb
green open malcolm_beats_mem_240115h20     oo9rfLNGSweLXH9q_JeX0g 1 0    6   0  33.4kb  33.4kb
green open malcolm_beats_systemd_240115h20 jIi_E1CoTZCuflzLt1GoCw 1 0   90   7 247.3kb 247.3kb
green open malcolm_beats_disk_240115h20    BUY76o1xRPCS_yPfgg-MuQ 1 0   21   0  35.7kb  35.7kb
green open malcolm_beats_network_240115h20 uS7aCi2MSku-su287dmVHQ 1 0    6   0    40kb    40kb
green open malcolm_beats_nginx_240115h20   N0MEgqtxSRm-qdB7667TQA 1 0   11   2  32.3kb  32.3kb
green open arkime_sessions3-240115h20      HZpVSHOLScmO5cd8-T4M7A 1 0  512   0 898.5kb 898.5kb
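
The daily, weekly and hourly examples in this issue change `ARKIME_ROTATE_INDEX` and the index suffixes together. A simple lint for that pairing could look like the sketch below. The mapping contains only the three combinations shown in this issue, the `check_env` helper is hypothetical, and treating a mismatch on the network suffix as a problem is an assumption inferred from the examples rather than a documented rule.

```python
# rotateIndex value -> suffix, limited to the combinations shown in this issue
ROTATE_TO_SUFFIX = {
    "daily": "%{%y%m%d}",
    "weekly": "%{%yw%U}",
    "hourly": "%{%y%m%dh%H}",
}


def check_env(env: dict) -> list:
    """Return human-readable mismatches between rotation and network suffix."""
    rotate = env.get("ARKIME_ROTATE_INDEX", "daily")
    expected = ROTATE_TO_SUFFIX.get(rotate)
    if expected is None:
        return ["no suffix known in this sketch for ARKIME_ROTATE_INDEX=" + rotate]
    actual = env.get("MALCOLM_NETWORK_INDEX_SUFFIX", ROTATE_TO_SUFFIX["daily"])
    if actual != expected:
        return ["MALCOLM_NETWORK_INDEX_SUFFIX=%s does not match %s rotation "
                "(expected %s)" % (actual, rotate, expected)]
    return []


print(check_env({"ARKIME_ROTATE_INDEX": "hourly",
                 "MALCOLM_NETWORK_INDEX_SUFFIX": "%{%y%m%dh%H}"}))
print(check_env({"ARKIME_ROTATE_INDEX": "weekly",
                 "MALCOLM_NETWORK_INDEX_SUFFIX": "%{%y%m%d}"}))
```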

mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 15, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 16, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 16, 2024
@mmguero mmguero closed this as completed Jan 16, 2024
This was referenced Jan 17, 2024
mmguero added a commit that referenced this issue Jan 17, 2024
Malcolm v24.01.0 contains new features, improvements, bug fixes and component version updates.

v23.12.1...v24.0.1

* Features and enhancements
    + new Malcolm instance landing page (#252)
    + file carve download with password-protected .zip file (#288)
    + new "all files except common plain text files" option for Malcolm's file carving to match Hedgehog capability (#290)
    + allow customizing indexes for logs written to OpenSearch/Elasticsearch (#313)
    + more consistently differentiate between uploaded and live-captured traffic (#321)
    + make download extracted file context item from Arkime smarter (#330)
    + improve netbox device type library import by using "official" import script (#384)
* Component version updates
    + Alpine Linux to [v3.19](https://alpinelinux.org/posts/Alpine-3.19.0-released.html) as the base for some Docker images
    + Fluent Bit to [v2.2.2](https://github.com/fluent/fluent-bit/releases/tag/v2.2.2)
    + Beats to [v8.11.4](https://www.elastic.co/guide/en/beats/libbeat/8.11/release-notes-8.11.4.html)
    + LogStash to [v8.11.4](https://www.elastic.co/guide/en/logstash/current/logstash-8-11-4.html)
* Bug fixes
    + Suricata Alerts dashboard "Alerts - Tags" visualization is useless (#314)
    + third party logs are not parsed correctly from fluentbit -> fluentd aggregator -> Malcolm (#318)
    + update document lookup APIs to search either network or host data (#322)
    + suricata rule update is broken (#323)
    + time sync from hedgehog to Malcolm opensearch instance not working (#324)
    + fix issue specifying database mode via command-line
    + have pruning of OpenSearch indices (based on size) include "other" Malcolm indices as well (e.g., nginx logs, system resources, third-party logs, etc.)
* Configuration changes (in [environment variables](https://idaholab.github.io/Malcolm/docs/malcolm-config.html#MalcolmConfigEnvVars) in [`./config/`](https://github.com/idaholab/Malcolm/tree/v24.0.1/config))
    + added the following variables with relation to #313
        - added `ARKIME_ROTATE_INDEX` to [`arkime.env`](https://github.com/idaholab/Malcolm/tree/v24.0.1/arkime.env.example) with default value of `daily` (see [Arkime docs on rotateIndex](https://arkime.com/settings#rotateIndex))
        - added the following variables and defaults to [`opensearch.env`](https://github.com/idaholab/Malcolm/tree/v24.0.1/opensearch.env.example):
        ```
        # OpenSearch index patterns and timestamp fields
        # Index pattern for network traffic logs written via Logstash (e.g., Zeek logs, Suricata alerts)
        MALCOLM_NETWORK_INDEX_PATTERN=arkime_sessions3-*
        # Default time field to use for network traffic logs in Logstash and Dashboards
        MALCOLM_NETWORK_INDEX_TIME_FIELD=firstPacket
        # Suffix used to create index to which network traffic logs are written (supports Ruby strftime strings in %{})
        MALCOLM_NETWORK_INDEX_SUFFIX=%{%y%m%d}
        # Index pattern for other logs written via Logstash (e.g., nginx, beats, fluent-bit, etc.)
        MALCOLM_OTHER_INDEX_PATTERN=malcolm_beats_*
        # Default time field to use for other logs in Logstash and Dashboards
        MALCOLM_OTHER_INDEX_TIME_FIELD=@timestamp
        # Suffix used to create index to which other logs are written (supports Ruby strftime strings in %{})
        MALCOLM_OTHER_INDEX_SUFFIX=%{%y%m%d}
        # Index pattern used specifically by Arkime (will probably match MALCOLM_NETWORK_INDEX_PATTERN, should probably be arkime_sessions3-*)
        ARKIME_NETWORK_INDEX_PATTERN=arkime_sessions3-*
        # Default time field used for sessions in Arkime viewer
        ARKIME_NETWORK_INDEX_TIME_FIELD=firstPacket
        ```
    + changed default for `EXTRACTED_FILE_HTTP_SERVER_KEY` to `infected` in [`zeek-secret.env`](https://github.com/idaholab/Malcolm/tree/v24.0.1/zeek-secret.env.example)
    + added `EXTRACTED_FILE_HTTP_SERVER_ZIP` with default value of `false` in [`zeek.env`](https://github.com/idaholab/Malcolm/tree/v24.0.1/zeek.env.example), see (#288)
mmguero added a commit to cisagov/Malcolm that referenced this issue Jan 17, 2024
Malcolm v24.01.0 contains new features, improvements, bug fixes and component version updates.

v23.12.1...v24.0.1

* Features and enhancements
    + new Malcolm instance landing page (idaholab#252)
    + file carve download with password-protected .zip file (idaholab#288)
    + new "all files except common plain text files" option for Malcolm's file carving to match Hedgehog capability (idaholab#290)
    + allow customizing indexes for logs written to OpenSearch/Elasticsearch (idaholab#313)
    + more consistently differentiate between uploaded and live-captured traffic (idaholab#321)
    + make download extracted file context item from Arkime smarter (idaholab#330)
    + improve netbox device type library import by using "official" import script (idaholab#384)
* Component version updates
    + Alpine Linux to [v3.19](https://alpinelinux.org/posts/Alpine-3.19.0-released.html) as the base for some Docker images
    + Fluent Bit to [v2.2.2](https://github.com/fluent/fluent-bit/releases/tag/v2.2.2)
    + Beats to [v8.11.4](https://www.elastic.co/guide/en/beats/libbeat/8.11/release-notes-8.11.4.html)
    + LogStash to [v8.11.4](https://www.elastic.co/guide/en/logstash/current/logstash-8-11-4.html)
* Bug fixes
    + Suricata Alerts dashboard "Alerts - Tags" visualization is useless (idaholab#314)
    + third party logs are not parsed correctly from fluentbit -> fluentd aggregator -> Malcolm (idaholab#318)
    + update document lookup APIs to search either network or host data (idaholab#322)
    + suricata rule update is broken (idaholab#323)
    + time sync from hedgehog to Malcolm opensearch instance not working (idaholab#324)
    + fix issue specifying database mode via command-line
    + have pruning of OpenSearch indices (based on size) include "other" Malcolm indices as well (e.g., nginx logs, system resources, third-party logs, etc.)
* Configuration changes (in [environment variables](https://idaholab.github.io/Malcolm/docs/malcolm-config.html#MalcolmConfigEnvVars) in [`./config/`](https://github.com/cisagov/Malcolm/tree/v24.0.1/config))
    + added the following variables with relation to idaholab#313
        - added `ARKIME_ROTATE_INDEX` to [`arkime.env`](https://github.com/cisagov/Malcolm/tree/v24.0.1/arkime.env.example) with default value of `daily` (see [Arkime docs on rotateIndex](https://arkime.com/settings#rotateIndex))
        - added the following variables and defaults to [`opensearch.env`](https://github.com/cisagov/Malcolm/tree/v24.0.1/opensearch.env.example):
        ```
        # OpenSearch index patterns and timestamp fields
        # Index pattern for network traffic logs written via Logstash (e.g., Zeek logs, Suricata alerts)
        MALCOLM_NETWORK_INDEX_PATTERN=arkime_sessions3-*
        # Default time field to use for network traffic logs in Logstash and Dashboards
        MALCOLM_NETWORK_INDEX_TIME_FIELD=firstPacket
        # Suffix used to create index to which network traffic logs are written (supports Ruby strftime strings in %{})
        MALCOLM_NETWORK_INDEX_SUFFIX=%{%y%m%d}
        # Index pattern for other logs written via Logstash (e.g., nginx, beats, fluent-bit, etc.)
        MALCOLM_OTHER_INDEX_PATTERN=malcolm_beats_*
        # Default time field to use for other logs in Logstash and Dashboards
        MALCOLM_OTHER_INDEX_TIME_FIELD=@timestamp
        # Suffix used to create index to which other logs are written (supports Ruby strftime strings in %{})
        MALCOLM_OTHER_INDEX_SUFFIX=%{%y%m%d}
        # Index pattern used specifically by Arkime (will probably match MALCOLM_NETWORK_INDEX_PATTERN, should probably be arkime_sessions3-*)
        ARKIME_NETWORK_INDEX_PATTERN=arkime_sessions3-*
        # Default time field used for sessions in Arkime viewer
        ARKIME_NETWORK_INDEX_TIME_FIELD=firstPacket
        ```
    + changed default for `EXTRACTED_FILE_HTTP_SERVER_KEY` to `infected` in [`zeek-secret.env`](https://github.com/cisagov/Malcolm/tree/v24.0.1/zeek-secret.env.example)
    + added `EXTRACTED_FILE_HTTP_SERVER_ZIP` with default value of `false` in [`zeek.env`](https://github.com/cisagov/Malcolm/tree/v24.0.1/zeek.env.example), see (idaholab#288)
@mmguero mmguero added the falcon label Jan 22, 2024
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 31, 2024
…icsearch are not working

Here's a list of what's been fixed/modified:

* `opensearch.env` and `dashboards.env` environment variables are now provided to `nginx-proxy` container so that the decisions for these links can be made based on them
* `envsubst` is used to expand variables into NGINX conf files in `/etc/nginx/conf.d` based on templates found in `/etc/nginx/templates`
* some custom logic in the `docker_entrypoint.sh` script for NGINX to do the environment variable substitution and some variable massaging prior to starting up NGINX
* two new environment variables in `dashboards.env.example` (blank by default, but they can be manually overridden if there's more going on than we can automatically figure out based on the `DASHBOARDS_URL` variable):
```
NGINX_DASHBOARDS_PREFIX=
NGINX_DASHBOARDS_PROXY_PASS=
```
* not 100% related, but the landing page now shows "Kibana" with an elastic icon rather than "Dashboards" if we're in elasticsearch/kibana mode
* fixed issue with arkime to dashboards links not using correct index and time field names if those are not the defaults (see idaholab#313)
* removed some stuff left over from base image that wasn't ever really used
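
As a rough illustration of the template expansion described above, the following Python sketch mimics `envsubst` when it is given an explicit variable list: only the named variables are substituted, and NGINX's own runtime variables such as `$host` are left alone. The template text, the `render` helper and the example values are all hypothetical and are not taken from Malcolm's actual files in `/etc/nginx/templates`.

```python
from string import Template

# Hypothetical template fragment, not Malcolm's real NGINX configuration
TEMPLATE = """\
location ${NGINX_DASHBOARDS_PREFIX} {
    proxy_pass ${NGINX_DASHBOARDS_PROXY_PASS};
    proxy_set_header Host $host;
}
"""

ALLOWED = ("NGINX_DASHBOARDS_PREFIX", "NGINX_DASHBOARDS_PROXY_PASS")


def render(template: str, env: dict, allowed: tuple) -> str:
    """Substitute only the allowed variables; unset ones expand to ''."""
    values = {name: env.get(name, "") for name in allowed}
    # safe_substitute leaves unknown placeholders (e.g. $host) untouched
    return Template(template).safe_substitute(values)


# Example values are assumptions chosen for illustration
example_env = {
    "NGINX_DASHBOARDS_PREFIX": "/dashboards",
    "NGINX_DASHBOARDS_PROXY_PASS": "http://dashboards:5601",
}
print(render(TEMPLATE, example_env, ALLOWED))
```

In a real entrypoint script the values would come from the container's environment rather than a hard-coded dictionary.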
Projects
Status: Released