Add documentation for the Amazon Security Lake integration #226

Merged · 8 commits · May 28, 2024
74 changes: 3 additions & 71 deletions integrations/README.md
@@ -12,79 +12,11 @@ also improve the protection of your workloads, applications, and data. Security
Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service normalizes
and combines security data from AWS and a broad range of enterprise security data sources.

#### Development guide
Refer to these documents for more information about this integration:

A demo of the integration can be started using the content of this folder and Docker.
* [User Guide](./amazon-security-lake/README.md).
* [Developer Guide](./amazon-security-lake/CONTRIBUTING.md).

```console
docker compose -f ./docker/amazon-security-lake.yml up -d
```

This Docker Compose project will bring up a _wazuh-indexer_ node, a _wazuh-dashboard_ node,
a _logstash_ node, our event generator and an AWS Lambda Python container. On the one hand, the event generator will constantly push events
to the indexer, into the `wazuh-alerts-4.x-sample` index by default (refer to the [events
generator](./tools/events-generator/README.md) documentation for customization options).
On the other hand, Logstash will continuously query for new data and deliver it to the output configured in the
pipeline, which can be one of `indexer-to-s3` or `indexer-to-file`.
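
To verify that events are actually being indexed, you can query the sample index directly. This is a sketch only, assuming the demo indexer listens on `https://localhost:9200` with the default `admin:admin` credentials:

```console
curl -sk -u admin:admin "https://localhost:9200/wazuh-alerts-4.x-sample/_count?pretty"
```

The reported `count` should keep growing while the events generator is running.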

The `indexer-to-s3` pipeline is the method used by the integration. This pipeline delivers
the data to an S3 bucket, where it is processed by a Lambda function and finally
sent to the Amazon Security Lake bucket in Parquet format.

<!-- TODO continue with S3 credentials setup -->

Attach a terminal to the container and start the integration by starting Logstash, as follows:

```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
```

After 5 minutes, the first batch of data will show up at http://localhost:9444/ui/wazuh-indexer-aux-bucket.
You'll need to invoke the Lambda function manually, selecting the log file to process.

```bash
bash amazon-security-lake/src/invoke-lambda.sh <file>
```

Processed data will be uploaded to http://localhost:9444/ui/wazuh-indexer-amazon-security-lake-bucket. Click on any file to download it,
and check its content using `parquet-tools`. Just make sure to set up the virtual environment first, using [requirements.txt](./amazon-security-lake/).

```bash
parquet-tools show <parquet-file>
```

Bucket names can be configured by editing the [amazon-security-lake.yml](./docker/amazon-security-lake.yml) file.

For development or debugging purposes, you may want to enable hot-reloading, configuration testing, or debugging on these files
by using the `--config.reload.automatic`, `--config.test_and_exit`, or `--debug` flags, respectively.

For production usage, follow the instructions in our documentation page about this matter.
(_when-its-done_)

As a last note, we would like to point out that we also use this Docker environment for development.

#### Deployment guide

- Create one S3 bucket to store the raw events, for example: `wazuh-security-lake-integration`
- Create a new AWS Lambda function
- Create an IAM role with access to the S3 bucket created above.
- Select Python 3.12 as the runtime
- Configure the runtime to have 512 MB of memory and a 30-second timeout (see the CLI sketch after this list)
- Configure an S3 trigger so that every object created in the bucket with a `.txt` extension invokes the Lambda.
- Run `make` to generate a zip deployment package, or create it manually as per the [AWS Lambda documentation](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-dependencies).
- Upload the zip package to the bucket. Then, load it into the Lambda function from S3 as per these instructions: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-package.html#gettingstarted-package-zip
- Create a Custom Source within Security Lake for the Wazuh Parquet files as per the following guide: https://docs.aws.amazon.com/security-lake/latest/userguide/custom-sources.html
- Set your **AWS account ID** in the Custom Source's **AWS account with permission to write data** field.
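
The same setup can be scripted with the AWS CLI. The snippet below is a sketch only: the function name, handler, account ID and role are placeholders, and the zip file is the package produced by `make` in the previous steps.

```console
# Placeholders: adjust the function name, account ID, role and handler to your environment.
aws lambda create-function \
  --function-name wazuh-to-security-lake \
  --runtime python3.12 \
  --handler lambda_function.lambda_handler \
  --memory-size 512 \
  --timeout 30 \
  --role arn:aws:iam::<account-id>:role/<lambda-execution-role> \
  --zip-file fileb://wazuh_to_amazon_security_lake.zip
```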

<!-- TODO Configure AWS Lambda Environment Variables /-->
<!-- TODO Install and configure Logstash /-->

The instructions in this section are based on the following AWS tutorials and documentation.

- [Tutorial: Using an Amazon S3 trigger to create thumbnail images](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-tutorial.html)
- [Tutorial: Using an Amazon S3 trigger to invoke a Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html)
- [Working with .zip file archives for Python Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html)
- [Best practices for working with AWS Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)

### Other integrations

59 changes: 59 additions & 0 deletions integrations/amazon-security-lake/CONTRIBUTING.md
@@ -0,0 +1,59 @@
# Wazuh to Amazon Security Lake Integration Development Guide

## Deployment guide on Docker

A demo of the integration can be started using the content of this folder and Docker. Open a terminal in the `wazuh-indexer/integrations` folder and start the environment.

```console
docker compose -f ./docker/amazon-security-lake.yml up -d
```

This Docker Compose project will bring up these services:

- a _wazuh-indexer_ node
- a _wazuh-dashboard_ node
- a _logstash_ node
- our [events generator](./tools/events-generator/README.md)
- an AWS Lambda Python container.

On the one hand, the event generator will constantly push events to the indexer, into the `wazuh-alerts-4.x-sample` index by default (refer to the [events generator](./tools/events-generator/README.md) documentation for customization options). On the other hand, Logstash will query for new data and deliver it to the output configured in the pipeline, which can be one of `indexer-to-s3` or `indexer-to-file`.

The `indexer-to-s3` pipeline is the method used by the integration. This pipeline delivers the data to an S3 bucket, where it is processed by a Lambda function and finally sent to the Amazon Security Lake bucket in Parquet format.


Attach a terminal to the container and start the integration by starting Logstash, as follows:

```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
```
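
Alternatively, for a quick local check that skips S3, the file pipeline can be started instead. This is a sketch, assuming its configuration file sits next to the S3 one and follows the same naming:

```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-file.conf --path.settings /etc/logstash
```

The rest of this guide assumes the `indexer-to-s3` pipeline.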

After 5 minutes, the first batch of data will show up at http://localhost:9444/ui/wazuh-aws-security-lake-raw. You'll need to invoke the Lambda function manually, selecting the log file to process.

```bash
bash amazon-security-lake/src/invoke-lambda.sh <file>
```
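
To find the name of the log file to pass to the script, you can list the raw bucket. This is a sketch only; it assumes the local S3-compatible service behind `http://localhost:9444` accepts standard S3 API calls and placeholder credentials:

```console
# Placeholder credentials; the local S3-compatible service is assumed not to validate them.
AWS_ACCESS_KEY_ID=test AWS_SECRET_ACCESS_KEY=test \
  aws --endpoint-url http://localhost:9444 s3 ls s3://wazuh-aws-security-lake-raw --recursive
```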

Processed data will be uploaded to http://localhost:9444/ui/wazuh-aws-security-lake-parquet. Click on any file to download it, and check its content using `parquet-tools`. Just make sure to set up the virtual environment first, using [requirements.txt](./amazon-security-lake/).
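
A minimal virtual environment setup, assuming `requirements.txt` lives in the `amazon-security-lake` folder:

```console
python3 -m venv venv
source venv/bin/activate
pip install -r amazon-security-lake/requirements.txt
```

With the environment active, inspect any downloaded file: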

```bash
parquet-tools show <parquet-file>
```

If the `S3_BUCKET_OCSF` variable is set in the container running the AWS Lambda function, intermediate data in OCSF and JSON format will be written to a dedicated bucket. This is enabled by default, writing to the `wazuh-aws-security-lake-ocsf` bucket. Bucket names and additional environment variables can be configured by editing the [amazon-security-lake.yml](./docker/amazon-security-lake.yml) file.
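
To check which buckets and variables the Lambda container is actually using, you can inspect its environment. A sketch, with `<lambda-service>` standing in for whatever the service is named in the Compose file:

```console
docker compose -f ./docker/amazon-security-lake.yml exec <lambda-service> env | grep -i bucket
```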

For development or debugging purposes, you may want to enable hot-reloading, configuration testing, or debugging on these files by using the `--config.reload.automatic`, `--config.test_and_exit`, or `--debug` flags, respectively.
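
For example, to run the S3 pipeline with hot-reloading enabled, add the flag to the same command used above:

```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash --config.reload.automatic
```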

For production usage, follow the instructions in our documentation page on this matter; see [README.md](README.md). The instructions in that document are based on the following AWS tutorials and documentation.

- [Tutorial: Using an Amazon S3 trigger to create thumbnail images](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-tutorial.html)
- [Tutorial: Using an Amazon S3 trigger to invoke a Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html)
- [Working with .zip file archives for Python Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html)
- [Best practices for working with AWS Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)

## Makefile

**Docker is required**.

The [Makefile](./Makefile) in this folder automates the generation of a zip deployment package containing the source code and the required dependencies for the AWS Lambda function. Simply run `make` and it will generate the `wazuh_to_amazon_security_lake.zip` file. The main target runs a Docker container to install the Python 3 dependencies locally, then zips the source code and the dependencies together.
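
For example, to build the package and then list its contents as a quick sanity check (any archive tool works; `unzip -l` is just one option):

```console
make
unzip -l wazuh_to_amazon_security_lake.zip | head
```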

4 changes: 3 additions & 1 deletion integrations/amazon-security-lake/Makefile
@@ -25,4 +25,6 @@ $(TARGET):

clean:
	@rm -rf $(TARGET)
	@py3clean .
	docker run -v `pwd`:/src -w /src \
		python:3.12 \
		py3clean .