Skip to content

Commit

Permalink
README updates
Browse files Browse the repository at this point in the history
  • Loading branch information
mmguero committed Aug 22, 2022
1 parent b70154b commit 6f0cce6
Showing 1 changed file with 91 additions and 4 deletions.
95 changes: 91 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,8 @@ You can help steer Malcolm's development by sharing your ideas and feedback. Ple
* [macOS host system configuration](#HostSystemConfigMac)
* [Windows host system configuration](#HostSystemConfigWindows)
* [Running Malcolm](#Running)
* [OpenSearch instances](#OpenSearchInstance)
* [Authentication and authorization for remote OpenSearch clusters](#OpenSearchAuth)
* [Configure authentication](#AuthSetup)
* [Local account management](#AuthBasicAccountManagement)
* [Lightweight Directory Access Protocol (LDAP) authentication](#AuthLDAP)
Expand Down Expand Up @@ -553,6 +555,12 @@ Various other environment variables inside of `docker-compose.yml` can be tweake
* `LOGSTASH_SEVERITY_SCORING` - if set to `true`, Logstash will perform [severity scoring](#Severity) when analyzing Zeek logs (default `true`)
* `MANAGE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will be marked as available for deletion by Arkime if available storage space becomes too low (default `false`)
* `MAXMIND_GEOIP_DB_LICENSE_KEY` - Malcolm uses MaxMind's free GeoLite2 databases for GeoIP lookups. As of December 30, 2019, these databases are [no longer available](https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/) for download via a public URL. Instead, they must be downloaded using a MaxMind license key (available without charge [from MaxMind](https://www.maxmind.com/en/geolite2/signup)). The license key can be specified here for GeoIP database downloads during build- and run-time.
* `OPENSEARCH_LOCAL` - if set to `true`, Malcolm will use its own internal [OpenSearch instance](#OpenSearchInstance) (default `true`)
* `OPENSEARCH_URL` - when using Malcolm's internal OpenSearch instance (i.e., `OPENSEARCH_LOCAL` is `true`) this should be `http://opensearch:9200`, otherwise this value specifies the primary remote instance URL in the format `protocol://host:port` (default `http://opensearch:9200`)
* `OPENSEARCH_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the primary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`)
* `OPENSEARCH_SECONDARY` - if set to `true`, Malcolm will forward logs to a secondary remote OpenSearch instance in addition to the primary (local or remote) OpenSearch instance (default `false`)
* `OPENSEARCH_SECONDARY_URL` - when forwarding to a secondary remote OpenSearch instance (i.e., `OPENSEARCH_SECONDARY` is `true`) this value specifies the secondary remote instance URL in the format `protocol://host:port`
* `OPENSEARCH_SECONDARY_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the secondary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`)
* `NGINX_BASIC_AUTH` - if set to `true`, use [TLS-encrypted HTTP basic](#AuthBasicAccountManagement) authentication (default); if set to `false`, use [Lightweight Directory Access Protocol (LDAP)](#AuthLDAP) authentication
* `NGINX_LOG_ACCESS_AND_ERRORS` - if set to `true`, all access to Malcolm via its [web interfaces](#UserInterfaceURLs) will be logged to OpenSearch (default `false`)
* `NGINX_SSL` - if set to `true`, require HTTPS connections to Malcolm's `nginx-proxy` container (default); if set to `false`, use unencrypted HTTP connections (using unsecured HTTP connections is **NOT** recommended unless you are running Malcolm behind another reverse proxy like Traefik, Caddy, etc.)
Expand All @@ -577,7 +585,7 @@ Various other environment variables inside of `docker-compose.yml` can be tweake
* `VTOT_API2_KEY` – used to specify a [VirusTotal Public API v.20](https://www.virustotal.com/en/documentation/public-api/) key, which, if specified, will be used to submit hashes of [Zeek-extracted files](#ZeekFileExtraction) to VirusTotal
* `ZEEK_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Zeek, and the resulting logs will also be imported (default `false`)
* `ZEEK_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Zeek logs (default `1`)
* `ZEEK_DISABLE_...` - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting `ZEEK_DISABLE_LOG_PASSWORDS` to `true` to disable logging of cleartext passwords)
* `ZEEK_DISABLE_` - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting `ZEEK_DISABLE_LOG_PASSWORDS` to `true` to disable logging of cleartext passwords)
* `ZEEK_DISABLE_BEST_GUESS_ICS` - see ["Best Guess" Fingerprinting for ICS Protocols](#ICSBestGuess)
* `ZEEK_EXTRACTOR_MODE` – determines the file extraction behavior for file transfers detected by Zeek; see [Automatic file extraction and scanning](#ZeekFileExtraction) for more details
* `ZEEK_INTEL_FEED_SINCE` - when querying a [TAXII](#ZeekIntelSTIX) or [MISP](#ZeekIntelMISP) feed, only process threat indicators that have been created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2021`) or relative interval (`30 days ago`)
Expand Down Expand Up @@ -674,7 +682,7 @@ blockdev --setra 512 /dev/sda

* Change the I/O scheduler to `deadline` or `noop`. Again, this can be done in a variety of ways. The simplest is to add `elevator=deadline` to the arguments in `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, then running `sudo update-grub2`

* If you are planning on using very large data sets, consider formatting the drive containing `opensearch` volume as XFS.
* If you are planning on using very large data sets, consider formatting the drive containing the `opensearch` volume as XFS.

After making all of these changes, do a reboot for good measure!

Expand Down Expand Up @@ -734,6 +742,85 @@ Once Docker is installed, configured and running as described in the previous se

## <a name="Running"></a>Running Malcolm

### <a name="OpenSearchInstance"></a>OpenSearch instances

Malcolm's default standalone configuration is to use a local [OpenSearch](https://opensearch.org/) instance in a Docker container to index and search network traffic metadata. OpenSearch can also run as a [cluster](https://opensearch.org/docs/latest/opensearch/cluster/) with instances distributed across multiple nodes with dedicated [roles](https://opensearch.org/docs/latest/opensearch/cluster/#nodes) like cluster manager, data node, ingest node, etc.

As the permutations of OpenSearch cluster configurations are numerous, it is beyond Malcolm's scope to set up multi-node clusters. However, Malcolm can be configured to use a remote OpenSearch cluster rather than its own internal instance.

The `OPENSEARCH_…` [environment variables in `docker-compose.yml`](#DockerComposeYml) control whether Malcolm uses its own local OpenSearch instance or a remote OpenSearch instance as its primary data store. The configuration portion of Malcolm install script ([`./scripts/install.py --configure`](#ConfigAndTuning)) can help you configure these options.

For example, to use the default standalone configuration, answer `Y` when prompted `Should Malcolm use and maintain its own OpenSearch instance?`.

Or, to use a remote OpenSearch cluster:

```
Should Malcolm use and maintain its own OpenSearch instance? (Y/n): n
Enter primary remote OpenSearch connection URL (e.g., https://192.168.1.123:9200): https://192.168.1.123:9200
Require SSL certificate validation for communication with primary OpenSearch instance? (y/N): n
You must run auth_setup after install.py to store OpenSearch connection credentials.
```

Whether the primary OpenSearch instance is a locally maintained single-node instance or is a remote cluster, Malcolm can be configured additionally forward logs to a secondary remote OpenSearch instance. The `OPENSEARCH_SECONDARY_…` [environment variables in `docker-compose.yml`](#DockerComposeYml) control this behavior. Configuration of a remote secondary OpenSearch instance is similar to that of a remote primary OpenSearch instance:


```
Forward Logstash logs to a secondary remote OpenSearch instance? (y/N): y
Enter secondary remote OpenSearch connection URL (e.g., https://192.168.1.123:9200): https://192.168.1.124:9200
Require SSL certificate validation for communication with secondary OpenSearch instance? (y/N): n
You must run auth_setup after install.py to store OpenSearch connection credentials.
```

### <a name="OpenSearchAuth"></a>Authentication and authorization for remote OpenSearch clusters

In addition to setting the environment variables in [`docker-compose.yml`](#DockerComposeYml) as described above, you must provide Malcolm with credentials for it to be able to communicate with remote OpenSearch instances. These credentials are stored in the Malcolm installation directory as `.opensearch.primary.curlrc` and `.opensearch.secondary.curlrc` for the primary and secondary OpenSearch connections, respectively, and are bind mounted into the Docker containers which need to communicate with OpenSearch. These [cURL-formatted](https://everything.curl.dev/cmdline/configfile) config files can be generated for you by the [`auth_setup`](#AuthSetup) script as illustrated:

```
$ ./scripts/auth_setup
Store username/password for primary remote OpenSearch instance? (y/N): y
OpenSearch username: servicedb
servicedb password: :
servicedb password (again): :
Require SSL certificate validation for OpenSearch communication? (Y/n): n
Store username/password for secondary remote OpenSearch instance? (y/N): y
OpenSearch username: remotedb
remotedb password: :
remotedb password (again): :
Require SSL certificate validation for OpenSearch communication? (Y/n): n
```

These files are created with permissions such that only the user account running Malcolm can access them:

```
$ ls -la .opensearch.*.curlrc
-rw------- 1 tlacuache tlacuache 36 Aug 22 14:17 .opensearch.primary.curlrc
-rw------- 1 tlacuache tlacuache 35 Aug 22 14:18 .opensearch.secondary.curlrc
```

One caveat with Malcolm using a remote OpenSearch cluster as its primary document store is that the accounts used to access Malcolm's [web interfaces](#UserInterfaceURLs), particularly [OpenSearch Dashboards](#Dashboards), are in some instance passed directly through to OpenSearch itself. For this reason, both Malcolm and the remote primary OpenSearch instance must have the same account information. The easiest way to accomplish this is to use an Active Directory/LDAP server that both [Malcolm](#AuthLDAP) and [OpenSearch](https://opensearch.org/docs/latest/security-plugin/configuration/ldap/) use as a common authentication backend.

See the OpenSearch documentation on [access control](https://opensearch.org/docs/latest/security-plugin/access-control/index/) for more information.

### <a name="AuthSetup"></a>Configure authentication

Malcolm requires authentication to access the [user interface](#UserInterfaceURLs). [Nginx](https://nginx.org/) can authenticate users with either local TLS-encrypted HTTP basic authentication or using a remote Lightweight Directory Access Protocol (LDAP) authentication server.
Expand Down Expand Up @@ -1334,7 +1421,7 @@ The `EXTRACTED_FILE_PRESERVATION` [environment variable in `docker-compose.yml`]
* `all`: preserve flagged files in `./zeek-logs/extract_files/quarantine` and all other extracted files in `./zeek-logs/extract_files/preserved`
* `none`: preserve no extracted files

The `EXTRACTED_FILE_HTTP_SERVER_...` [environment variables in `docker-compose.yml`](#DockerComposeYml) configure access to the Zeek-extracted files path through the means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download.
The `EXTRACTED_FILE_HTTP_SERVER_` [environment variables in `docker-compose.yml`](#DockerComposeYml) configure access to the Zeek-extracted files path through the means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download.

### <a name="HostAndSubnetNaming"></a>Automatic host and subnet name assignment

Expand Down Expand Up @@ -3407,7 +3494,7 @@ Malcolm uses [OpenSearch](https://opensearch.org/) and [OpenSearch Dashboards](h
* Logs appended to textual log files (e.g., `tail`-ing a log file)
* The output of an external script or program
* Messages in the form of MQTT control packets
* many more...
* many more

Refer to [**Forwarding Third-Party Logs to Malcolm**](./scripts/third-party-logs/README.md) for more information.

Expand Down

0 comments on commit 6f0cce6

Please sign in to comment.