[conf] support schema definitions in multiple files vs logs.json file #981

blakemotl · 2019-08-23T20:26:22Z

to: @ryandeivert @chunyong-lin @Ryxias
cc: @airbnb/streamalert-maintainers

Background

Currently, all schemas are defined in one large logs.json file. This is hard to keep organized, and it would be better if the schemas could be split by group, e.g. carbonblack, ghe, slack, etc. That is exactly what this PR does.

Changes

Split schema into multiple files in a new conf/schemas directory
Added the ability to configure the priority of individual schemas within a larger group, after splitting the schema files uncovered an ordering issue
Added unit tests for the new functionality and added priority to some rule tests
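
For readers who want a concrete picture of the two changes above, here is a minimal sketch of how split schema files could be merged and then ordered by priority. It is not the code added in this PR: the helper names `load_schemas` and `order_by_priority`, the duplicate-name check, and the assumption that priority lives under a `configuration.priority` key are all illustrative.

```python
import json
import os


def load_schemas(schemas_dir):
    """Merge every *.json file in schemas_dir into one dict of log schemas.

    Hypothetical helper for illustration; not the function added by this PR.
    """
    merged = {}
    # Sort file names so the load order is the same on every platform
    for name in sorted(os.listdir(schemas_dir)):
        if not name.endswith('.json'):
            continue
        with open(os.path.join(schemas_dir, name)) as schema_file:
            schemas = json.load(schema_file)
        # Refuse to silently overwrite a schema defined in an earlier file
        duplicates = set(schemas) & set(merged)
        if duplicates:
            raise ValueError(
                'duplicate schema names in {}: {}'.format(name, sorted(duplicates))
            )
        merged.update(schemas)
    return merged


def order_by_priority(schemas):
    """Return (name, schema) pairs, lowest priority value first.

    ASSUMPTION: priority is stored at schema['configuration']['priority'].
    Schemas without a priority sort last; the name breaks ties.
    """
    def sort_key(item):
        name, schema = item
        priority = schema.get('configuration', {}).get('priority', float('inf'))
        return (priority, name)

    return sorted(schemas.items(), key=sort_key)
```

Sorting explicitly, rather than relying on file or key order, is one way to avoid the kind of ordering issue that splitting the files exposed.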

Testing

Unit tests and integration tests passed locally.

conf/schemas/cloudtrail.json

conf/schemas/cloudwatch.json

docs/source/firehose.rst

docs/source/rules.rst

docs/source/testing.rst

stream_alert/shared/config.py

docs/source/conf-schemas.rst

chunyong-lin

logs.json file is so long, we really need to split it. Thank you for working on this issue!

conf/schemas/cloudtrail.json

docs/source/conf-datasources.rst

docs/source/conf-schemas.rst

tests/unit/stream_alert_shared/test_config.py

ryandeivert

nice work so far @blakemotl - some comments for you!

docs/source/rules.rst

docs/source/testing.rst

stream_alert/shared/config.py

blakemotl · 2019-08-29T22:09:30Z

Have made several updates. PTAL

chunyong-lin · 2019-08-30T21:43:07Z

@blakemotl Can you resolve the conflict?

docs/source/conf-datasources.rst

docs/source/conf-schemas.rst

ryandeivert · 2019-08-30T21:49:11Z

stream_alert/shared/config.py

@@ -17,6 +17,11 @@
 import json
 import os

+from stream_alert.shared.logger import get_logger
+
+


remove 1 line here. constants don't need to abide by the 2 newline rule

stream_alert/shared/config.py

blakemotl · 2019-09-05T00:56:14Z

Fixed the issues Ryan pointed out, and fixed the merge conflict.

coveralls · 2019-09-05T16:30:13Z

Coverage decreased (-0.002%) to 96.606% when pulling a9c9780 on breakup-log-schema-file into 0d545a3 on release-3-0-0.

blakemotl · 2019-09-05T17:21:11Z

Fixed another issue I found: the CLI tried to rewrite logs.json to disk, which fails. Logs do not need to be written by the CLI, and making that work with the new schemas system (where it is aware of which file each schema came from) would require a lot of work.
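
As a rough illustration of that fix, the sketch below writes each top-level config key to its own file and skips the keys that are only ever read. The function name `write_config`, the `READ_ONLY_KEYS` set, and the one-file-per-key layout are assumptions made for the example, not the CLI's actual code.

```python
import json
import os

# ASSUMPTION: top-level keys that are loaded from disk but never written back
READ_ONLY_KEYS = {'logs', 'schemas'}


def write_config(config, conf_dir):
    """Write each top-level key to <conf_dir>/<key>.json, skipping read-only keys.

    Hypothetical helper; returns the sorted list of file names written.
    """
    written = []
    for key, value in config.items():
        if key in READ_ONLY_KEYS:
            # Schemas may come from many files, so there is no single target to rewrite
            continue
        file_name = '{}.json'.format(key)
        with open(os.path.join(conf_dir, file_name), 'w') as conf_file:
            json.dump(value, conf_file, indent=2, sort_keys=True)
        written.append(file_name)
    return sorted(written)
```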

Ryxias

Great job @blakemotl !!

:partycatbug: :partycatbug: :partycatbug: :partycatbug:

* Changes 3.0.0+ behavior to by default include SQS prefix (#979)
* Changes 3.0.0+ behavior to by default include SQS prefix
* PR comment
* Fix up tests
* Convert all python to python3.7 (#974)
* [setup] Configure the Vagrantfile for Python 2.x and 3.x development. The `vagrant/` folder contains bash scripts used to configure virtualenv, virtualenvwrapper, streamalert, and terraform. Additionally, a patch for the recent libssl1.1 manual prompt during apt configuration is included to allow for automatic builds without user interaction. Build scripts are mostly configurable through the Vagrantfile and the manipulation of the exposed environment variables (`SA_*` environment variables). These allow versions and credentials to be specified and automatically propagated to the guest VM built by Vagrant.
* [core][setup] Update dependencies to Python 3. The requirements-top-level.txt was modified to remove the version pinning of the aliyun-python-sdk-* dependencies, allowing the upgrade to Python 3 compatible versions. The requirements.txt was updated to reflect the Python 3 compatible versions of the project dependencies.
* [setup] Fix the `.gitignore` after the fork merge which required rebase.
* [testing] Run `2to3-2.7 -n -w tests/`, no modifications of output.
* [testing] Change `assert_items_equal` to `assert_count_equal`. In Python 3, the `assertItemsEqual` function is named `assertCountEqual` (https://docs.python.org/2/library/unittest.html#unittest.TestCase.assertItemsEqual). Since Nose derives from this naming, the function references have been updated across all tests so imports work correctly.
* [core] The raw 2to3 pass for the core packages. This handles many of the syntax issues and semantic differences. It appears that the remaining issues are mostly type and library changes from Python 2.x to Python 3.x.
* [core] Change division to work the same as in Python 2. In Python 2, '/' division of two integers results in an integer. However, in Python 3 it results in a float. Changing it to '//' performs floored division.
* [core] Change core semantics related to strings vs bytes. In Python 3 the difference between strings and bytes is made explicit. These can be enforced through the use of `.encode(...)` and `.decode(...)`. This commit addresses this difference in most of the core library.
* [core] Remove the forward compatibility `__bool__`.
* [testing] Fix the patching of builtins. The builtins package is now required to allow patching of builtins. This commit imports it in the appropriate tests and updates the patch targets appropriately.
* [testing] Update the use of `reload` for Python 3. The `reload` function was moved to the `importlib` package in Python 3. This commit adds the appropriate import and updates the usage within tests.
* [core] Update error message reporting for Python3. In Python 3, not all error objects have the `.message` attribute. Instead, the generally accepted practice is to perform a `str(exception)` to convert it into a string. This commit updates the usage appropriately across the core library.
* [core] Update generators to use return instead of StopIteration. In later versions of Python 3 (PEP 479), raising `StopIteration` inside a generator surfaces as a `RuntimeError` instead of signaling an exhausted generator. This commit converts the use of `StopIteration` to `return`, signaling generator exhaustion without raising the error.
* [testing] Update the alert merger test to disregard call order. The `mock_logger.assert_has_calls(...)` asserts that the calls provided are executed in the order provided. However, in Python 3, the order of calls is different. The `any_order=True` parameter is provided to the assertion to disregard this order. Upon deeper analysis of the test semantics, it is testing to ensure that merging, deletion, and dispatching occur appropriately. The implementation semantics appear to be unchanged and correct according to the test, and the only difference between the Python 2 and Python 3 versions is execution order.
* [testing] Mock a class variable to support a `dict.get`. Since the `Normalizer` class is mocked, it returns `None` for the `_types_config` class variable of type `dict`. In order to allow a `.get`, the `_types_config` is patched to be an empty `dict`.
* [testing] Update the error returned by a failed JSON parse call.
* [testing] Switch to assert_dict_equal and change order of test `dict`s. This commit continues to update certain comparisons to use the `assert_dict_equal` helper, and also fixes tests with incorrectly ordered dictionaries.
* [testing][wip] Is this supposed to require failing status codes? It appears that the `PhantomOutput._setup_container` should `return false`, which required stubbing the `get_mock` and `post_mock` to return failing status codes. This was added to allow the test to pass, but needs to be checked for correctness.
* [testing] Update another test to use `assert_dict_equal`.
* [testing] Update the DuoApp test to properly patch abstract methods. In Python 3 the `__abstractmethods__` variable contains all abstract methods which must be implemented in classes deriving from it. This causes problems when attempting to test the `DuoApp` class, since it does not implement the `_type` method. Furthermore, the `patch.object` decorator will patch instances of a particular object in all methods matching the specified method name test prefix (e.g. `test_some_method` where the test prefix is `test_*`). This causes issues because the `setup` function does not match the test prefix, which results in an error when attempting to create an instance of `DuoApp` before each test. This commit fixes this issue by patching all methods matching the test prefix, as well as explicitly patching the `setup` method.
* [testing] Use the Python 3 AST when generating rule checksum in testing. The AST has changed between Python 2 and Python 3. Therefore, the test AST required updating in order to reflect the Python version upgrade.
* [testing][wip] Update the terraform generation test to be order-agnostic. The use of pre-defined strings was replaced with the `ANY` helper to reduce dependency on order. Any value produced under these fields will result in a True assertion during the test, ensuring correct production, while failing to account for changes in specific key values.
* [testing] Bump the version of travis to use Python 3.7.
* [testing] Xenial is required on Travis for Python 3.7.
* address PR comments
* removed reference to `__nonzero__`
* updates to manage.py for python3, plus some fixes for integration tests
* update the terraform version configured by vagrant
* updated use of the regex._pattern_type to regex.Pattern
* lambdas now run on python3.7
* convert bytes to string for generating athena partition refresh query
* fix for requirements.txt and athena unit tests
* linter fixes and relevant unit test updates
* bandit properly excludes the tests directory
* documentation for getting started and contributing now references python 3.7
* address PR comments
* linter fixes and relevant unit test updates
* updating the intercom unit tests to be py3 compliant
* one last merge artifact fixed
* removed outdated requirements
* addressing initial PR comments
* better error handling when manage.py invoked without proper commands, plus addressing PR comments
* linter fixes
* remove cbapi dependency from alert_processor lambda
* added cbapi back in, pip install for lambda packaging updated to not use cache directory
* merging master into release-3-0-0 (#982)
* [apps] Correctly update aliyun timestamp (#978)
* tweaking .gitignore file slightly for venv (#980)
* misc fixups to python3 changes (#986)
* adding `__pycache__` to gitignore
* adding logic to not raise exception on s3 test event in athena func
* adding record logging upon rule failure
* ensuring csv reader receives str when bytes is passed
* ensuring same order for some terraform variables
* fixing typo
* adding unit test for csv bytes
* adding traceback formatter to prevent crappy output (#988)
* removing ushlex that is incompatible with py3 (default shlex supports unicode now) (#990)
* merging master into release-3-0-0 branch (#991)
* [apps] Correctly update aliyun timestamp (#978)
* tweaking .gitignore file slightly for venv (#980)
* Hotfix/links and spelling (#967)
* Fix broken link in the documentation
* Minor spelling updates in comments, code, and docs
* Bumped slack app timeout (#983)
* small hack to retain original lambda formatter (#993)
* small hack to retain original lambda formatter
* allowing for optional formatter spec
* fixing bug with joining non-string values (#994)
* [apps][box] upgrade box sdk and rebuild the dependencies package (#997)
* [apps][box] upgrade box sdk and rebuild the dependencies package
* Address my wordy headers
* adding proper vagrant ignore (#1002)
* [rule_promotion] Change default value of alert_count to -1 (#1001)
* [rule_promotion] Change default value of alert_count to -1
* Address Ryan's comment, good catch
* [apps] Handle JSONDecoderError (#999)
* [apps] Handle JSONDecoderError
* Update comment
* Breakup log schema file (#981)
* Separated logs.json into multiple files
* Added unit tests for split schema config
* Moved schema loading logic into function
* Added documentation to cover the new split file schemas
* Removed extraneous import
* Removed extraneous whitespace
* Fixed wording, whitespace, added comments
* Fixed more json formatting
* Added comments for SchemaSorter and clarified docs
* More docs and fixed whitespace issues
* Docs clarifications, fixed schemas dir test
* Fixed another docs issue
* Added test for logs and schema exists
* Consolidated schemas logic
* Fixed docs, added schemas as a TLK
* Removed SchemaSorter inheritance from object
* Exclude logs.json from being written by CLI
* Public release for LookupTables (#1003)
* [LookupTables] Phase 1 implementation of LookupTables rebuild (#969)
* First pass for lookuptables
* First in a long line of tests
* Metaprogramming hell
* Removes old code
* Working through kinks and whatever
* Better messaging, better reliability
* Tests and such
* Tests for the S3 driver
* It's coming together
* Making more progress on dynamodb driver
* Getting closer
* Deletes old LookupTables
* Maybe maybe maybe
* Theres gotta be a better way
* Gottem tests
* Add tests for proving the concept of multi-lookuptables on a single DynamoDB Table
* Add lots of tests and such
* Ready for testing
* A comment
* Amending pylint errors
* DRYS out DynamoDb mock testing code
* Missing region
* ?
* adds cache classes prior to integrating them
* Fixup
* Ports over dynamodb driver
* Ports S3 driver to cache
* Pylint
* Cache eviction
* PR feedback fixup
* pr feedback
* Updates to fix python3
* Pylint
* Refactors manage.py (#992)
* App and configure commands
* Athena command
* Terraform build and clean
* Create alarm commands
* Metrics command
* Fix bug
* Terraform commands
* List targets command
* Status command
* Kinesis and rollback command
* Rule staging
* Finished
* Fixups
* Ye
* Pylint
* Done
* PR feedback; cleanup
* Pr feedback
* LookupTables management via manage.py (#984)
* Refactors s3 driver stuff
* Yeah!!!! it works!!!
* Test coverage
* Pylint
* ??
* done
* Pylint
* We gucci now
* Changes default configurations
* Fixup
* Fix bug
* PR feedback
* Automatically generates AWS IAM Policies for LookupTables (#996)
* Terraform modules
* Fixup
* Fix bugs
* More fixups
* Manage.py generate
* Fix bug
* BROKEN COMMIT programmatically generate roles
* Fixup subtle bu
* Uses aws_iam_role_policy_attachment instead of iam_policy_attachment
* DRYs out terraform module generation code
* LookupTables documentation (#1004)
* Documentation
* Updates test coverage and adds some bugfixes
* fixup
* Fixup
* Fixup
* Fixup
* Fixes issues with Integration tests (#1005)
* Fixes a bug in integration tests Signed-off-by: Derek Wang <derek.wang@airbnb.com>
* Touches up tests
* pylint
* Touchups
* PR feedback
* Terraform format (#1007)
* [LookupTables] Fixes bug with module generation (#1009)
* Fixes a bug where clusters in stream alert CLI were wrong
* Manage build erroneously said it accepted cluster arguments
* Fix issue with defaults when no LookupTables resources are available
* Re-works LookupTables terraform generation to work when types of tables are omitted
* PR feedback
* PR fixup
* Accepts JSON from LookupTables CLI, adds `list-add` command (#1010)
* Supports JSON for LookupTables
* Yep
* Refactors some stuff and adds list-add command
* Fixup
* Fixup
* Rework pylint
* fixing package naming for consistency (#1013)
* renaming folders for consistency and updating imports
* more updates to import paths, unit tests passing
* migrating alert_merger tests
* migrating shared tests
* migrating apps tests
* migrating athena partition refresh tests
* migrating alert processor tests
* migrating rule promotion tests
* fixing bad copy/paste
* migrating more publishers testing, other alert proc renames
* migrating threat intel downloader tests
* fixing more bad copy/paste
* migrating streamalert cli terraform tests
* migrating the rest of streamalert cli tests
* removing old cruft fixtures
* removing weird test printing
* fixing docs version logic
* updating logger prefix
* adding reset logic to LookupTablesCore
* fix more naming, related to terraform module paths (#1014)
* renaming folders for consistency and updating imports
* more updates to import paths, unit tests passing
* migrating alert_merger tests
* migrating shared tests
* migrating apps tests
* migrating athena partition refresh tests
* migrating alert processor tests
* migrating rule promotion tests
* fixing bad copy/paste
* migrating more publishers testing, other alert proc renames
* migrating threat intel downloader tests
* fixing more bad copy/paste
* migrating streamalert cli terraform tests
* migrating the rest of streamalert cli tests
* removing old cruft fixtures
* removing weird test printing
* fixing docs version logic
* updating logger prefix
* adding reset logic to LookupTablesCore
* updating tf_stream_alert_flow_logs module references
* updating tf_stream_alert_app_iam module references
* updating tf_stream_alert_athena module references
* updating tf_stream_alert_cloudtrail module references
* updating tf_stream_alert_cloudwatch module references
* updating tf_stream_alert_globals module references
* updating tf_stream_alert_kinesis_events module references
* updating tf_stream_alert_kinesis_firehose_delivery_stream module references
* updating tf_stream_alert_kinesis_firehose_setup module references
* updating tf_stream_alert_kinesis_streams module references
* updating tf_stream_alert_monitoring module references
* updating tf_stream_alert_s3_events module references
* updates to some stream_alert references previously missed
* fixing formatting in docs
* updates to terraform for better namespacing, consistency, tagging (#1015)
* fixing role name in tf_cloudtrail module
* fixing role name in tf_flow_logs module, updating tests
* fixing role name in tf_kinesis_streams module, updating var name and tests
* fixing role name in tf_cloudwatch module, updating var name and tests
* fixing role name in tf_kinesis_firehose_setup module, updating var name and tests
* updating to tf_threat_intel_downloader module, rm unused vars
* role policy name updates, etc, etc
* fixing tf_* module references
* resolving more naming issues
* adding tags to resources that support them
* adding fix for #886
* fixing last small things
* DRYing out some old code
* updating sns topic with prefix
* pr feedback
* updating monitoring sns topic references
* fix missing tf var issue (#1018)
* fixing tf missing var
* updating terraform aws provider
* adding prefix to firehoses and other small updates (#1022)
* removing athena default db name to enforce that one is provided
* formatting nit
* adding default value for metric to tf_metric_filters module
* prefixing firehoses created for data
* adding missing test docstring
* making prefix for unit testing consistent
* adding prefixing support for data firehoses
* adding role_count var to lookuptables terraform due to tf bug (#1021)
* fixing missing tf var and adding support for optional firehose prefixing (#1024)
* fixing bug introduced in #1022
* adding optional prefixing for firehose
* adding client support for optional prefix
* nit naming change
* updating old tf var for bug fixed upstream
* fixing resource name in lookuptables
* pr feedback
* [WIP - do not merge] fixing bugs found during resource migration (#1025)
* ensuring order of metric alarms remains unchanged
* getting rid of terrible naming
* fixing firehose permission with classifier
* fixing sorting of metric alarms
* fixing missed resource name
* [LookupTables] Adds new JSON file import CLI command (#1017)
* Enhancements to lookuptables cli
* Fixup
* Yea
* Updates
* Fix tests
* pylint
* Fixup PR feedback
* fixup
* Fix typo in docs (#1028)
* Fix little bug (#1031)
* AlertMerger query optimizations (#1030)
* Fix generator bug and add a limit of 5000 alerts
* Avoid consistent, blocking table scan
* fixup
* Removes a (probably) unused method. AlertTable.rule_names() (#1033)
* updating the flow logs module to not create unnecessary resources (#1034)
* updating the flow logs module to not create unnecessary resources
* pr feedback
* classifier function naming update, rules engine optional stats output (#1019)
* updating classifier naming convention; BREAKING CHANGE
* update to support toggling output of rule stats
* docstring update and import removal
* removing cruft
* making change for proper formatting of kinesis stream names (#1027)
* making change for proper formatting of kinesis stream names
* adding custom kinesis stream name support
* updating unit tests for custom stream name support, adding new one
* removing 'kinesis' suffix from stream names
* removing unused variable
* adding documentation for optional kinesis stream name (#1037)
* adding documentation for optional kinesis stream name
* nit change
* fixing iam group prefix
* small tweak to env var setting for rules lambda (#1039)
* Migrate sources to cluster (#1008)
* Added data_sources to clusters
* Moved sources to cluster config
* Updated unit tests
* Changed CLUSTERS env var to actual cluster
* Updated tests, docs and references to sources TLK
* Docs update
* Python 2-3 compat issue
* Pylint fixes
* Realized validate sources is an internal function
* Added data_sources TLK to example json blocks
* Switched from using
tuples * Quote cleanup, moved more code into validate_sources * More cleanup * Get env variable at load time not runtime * Fix cluster env errors, detect duplicate data sources * Removed no-op i forgot about from testing * Fixed docs errors, corrected dupe source checking * Fixed confusion with missing sources versus invalid sources * Fixed some namespacing and import order from rebase * Fixed more namespacing * Fixed global loading sources config in test classifier * Added multi cluster support for testing * Fixed some bad decisions * Name changes * Added brea for finding cluster in cli handler * Wording cleanup, error on missind data_sources, other cleanup * Fixed a missing , * [terraform] update cloudwatch and flow logs terraform module to reduce redundancy (#1041) * adding aliased terraform providers for regions * updating tf_flow_logs module to remove redundancy * adding a tf_cloudwatch_logs_destination module to replace tf_cloudwatch module * removing replaced tf_cloudwatch module * updating infinitedict function to support initial value * updating terraform generation code for aforementioned changes * updating tests 01 * removing legacy test because we ain't living in the past * fixing tests for default terraform settings * updating tests 02 * updates to documentation * addressing PR feedback * fixing a few bugs (#1042) * update gsuite apps for changes to gsuite api (#1046) * updating gsuite streamalert apps * removing deprecated saml app * rm unused import * updating requirements for new google api client, etc * updating packaging for new pinned versions * updating gsuite apps for new python sdk library * misc pylint fixup * pinning pylint version because annoyed * [cli] Handle when AttributeError when test classifier (#1045) * [cli] Handle when AttributeError when test classifier * Address comment * fix bugs with updated dependencies and gsuite app (#1049) * updating box precompiled dependencies * fixing gsuite groups app name * updating readme * final 
updates for terraform 12 support (#1052) * [terraform] upgrade Terraform to 0.12.9 (#1035) * Terraform 0.12 upgrade * adding conf files back * minor consmetic edits * test case fix * terraform upgrade on new files * removing deprecated sources.json * not sorting keys in config, since it is not necessary * fixing typo * removing s3 events legacy garbage * updating tf_s3_events module for proper handling of more than one filter per bucket * updating tf_s3_events module generation code and tests * updating documentation to reflect changes to tf_s3_events module * fixing unit test * fixing small bugs with terraform 12 code (#1053) * fixing some bugs found with terraform 0.12.9 deployment * fixing some bugs and removing unnecessary prefixing * Critical API call detection fixes (#1029) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * Hotfix/links and spelling (#967) * Fix broken link in the documentation * Minor spelling updates in comments, code, and docs * Bumped slack app timeout (#983) * Fix logic for S3 Public Block Access; Add detection of Organization calls * update to use sets vs lists * [rules] Detect use of PutAccountPublicAccessBlock (#1023) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * Hotfix/links and spelling (#967) * Fix broken link in the documentation * Minor spelling updates in comments, code, and docs * Bumped slack app timeout (#983) * Detect use of PutAccountPublicAccessBlock * [docs] Add Glue to the permissions needed (#1020) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * Hotfix/links and spelling (#967) * Fix broken link in the documentation * Minor spelling updates in comments, code, and docs * Bumped slack app timeout (#983) * Add Glue to the permissions needed Learnt this the hard way... 
* adding fix for box app request timeout (#1040) (#1057)
* adding fix for #1040
* addressing pr feedback
* [terraform] support remote state file locking (#1059)
* Added terraform state file locking support
* Switched from enabled to disabled by default
* Updated tests to account for new configuration
* Updated documentation to explain how to set up state locking
* DDB Table is now created automagically
* Finalized code for managing the dynamo table
* Added tests
* Removed no longer needed config
* Updated docs
* Various updates from PR feedback
* Pylint fixes whoops
* Import order, trailing whitespace
* Fixed function comment format
* [cli] Fix unknown module referenced bug (#1060)
* raise exception when config sources are misconfigured (#1063)
* raising exception when source is not defined
* updating unit tests for source exceptions
* fixing pylint
* updating old ConfigError class
* renaming all known references of "stream_alert" to "streamalert" (#1064)
* mass rename of stream_alert_secrets
* updating config reading for new "streamalert" module name
* updating unit tests for config read logic
* fixing unit tests that did not have a docstring
* updating stream_alert_app references
* vagrant naming update
* more stream_alert_apps renames
* remaining batch rename of stream_alert --> streamalert
* [schema] Add two new keys to osquery.json file (#1062)
* [cli] Allow to rebuild partitions when the statement length exceeds the limit (#1067)
* [cli] Allow to rebuild partitions when the statement length exceeds the limit
* address comments
* refactoring the cloudtrail terraform module to decouple cloudwatch events (#1069)
* updating s3_events tf module for flexibility
* renaming cloudwatch module to cloudwatch_destinations
* moving cloudwatch events to its own module
* updating cloudtrail terraform module to not be so poop filled
* updating readmes
* moving cloudtrail --> cloudwatch logs logic to tf submodule
* updating cloudtrail module generation code and tests
* small tweak to cloudtrail generation
* updating docs for new modules
* small update
* removing invalid/unused statement (#1071)
* Don't throw an exception if no partitions are added (#1070)
* Don't throw an exception if no partitions are added
* Switched to assert_has_calls
* Cleaned up unused imports
* Docs/general update (#1076)
* Updated terraform version and git branch
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* removed step no longer required, as the choices are dynamically created based on the @StreamAlertOutput class decorator
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* reset to stable and changed note
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [docs] Correcting URL in contributing.rst as the previous one returned an HTTP 404 error (#1081)
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [core] Fix a parser bug when processing raw event encapsulated in a string (#1085)
  see issue: #1084 for more information
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [unit test] Use single quote around strings (#1087)
* [core] Adding trendmicro malware schema and rule (#1077)
  [testing] Added trendmicro schema and rule test
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [terraform] Implemented a fix for the count error (#1089)
  [testing] fixed test for rules_engine assertion
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [terraform] fixed destroy issue by reverting #1060 (#1093)
  This in turn re-introduced #1047. I fixed this by ensuring that the cleanup function removes the metric_filters.tf.json file, otherwise terraform reads this in as part of its deployment.
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [testing] enable trend tests, previously only schema (#1096)
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [rule] Fix cloudtrail_public_resources (#1102)
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [core] Updated cloudtrail:events optional_key (#1101)
  Updated the optional_top_level_keys for cloudtrail:events
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [core] Added trendmicro normalized_types (#1105)
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* updating duo auth schema for new alias key (#1129)
* Fix bug with default value for firehose use_prefix (#1122)
* [docs/misc] documentation overhaul, config format changes, removing periods from bucket names (#1114)
* updating secrets bucket name to remove periods
* updating s3-logging bucket name to remove periods
* updating athena-results bucket name to remove periods
* updating streamalerts bucket name to remove periods
* updating terraform-state bucket name to remove periods
* updating streamalert-data bucket name to remove periods
* fixing misc places that were missed regarding periods in bucket names
* making data and alerts bucket names configurable
* restructuring docs to allow for highlighting global settings
* doc updates for alerts_table config
* updating documentation, round 1
* moving clusters and global docs to new file
* trailing space removal
* Update to commands for consistency
* misc formatting fixes, migrating rule-staging config to global
* updating cluster config docs
* massive updates to docs
* adding other changes related to doc updates and config changes
* adding prefix validation for periods
* removing prefix setting trash
* updates to remove need for setting terraform config
* removing nonsense for athena bucket configuration
* addressing PR comments
* removing kinesis region setting since it would break things
* addressing chunyong PR feedback
* adding new streamalert images and updating doc references
* [terraform] switching to state lock table being managed by terraform (#1131)
* Switch State Lock Table to be Managed by Terraform
* Missed some things whoops
* Pylint.......................
* Add newline between functions
* [misc] updating authors/contributors, copyright updates, docs project name fix (#1134)
* updating copyright name
* updating authors
* updating docs project name and copyright
* updating README.rst
* more copyright fix
* misc doc updates
* re-adding background to images so the content is visible upon click
* final update to images (hopefully)
* [core] Moved the secret_store from S3 to SSM (#1142)
* [core] Moved the secret_store from S3 to SSM
  [testing] Updated unit_tests to use SSM instead of S3
  [terraform] Updated alert_processor permissions so it can pull from param store
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [core] Removed S3Driver for Credentials
  [testing] Added SSMDriver Tests
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
  Co-authored-by: Ryxias <sunsilverdragon@gmail.com>
* Follow up PR to SSM migration (#1147)
* Changes the ssm credentials conventions
* Refactors some code to be clearer
* Fixup
* Add a FIXME comment
* Another fixup
* Doublequote
* fixup
* [cli] Added additional commands to outputs (#1138)
* [cli] Added additional commands to outputs
  'set' - Set one output, using user_input (pass --update to overwrite existing outputs)
  'get' - Get configured outputs for a service (includes creds) optionally pass --decriptors to only pull certain descriptor secrets for the service
  'set-from-file' - Set numerous outputs via a json file. Can set multiple services and descriptors (pass --update to overwrite existing outputs)
  'generate-skeleton' - Use to create a skeleton json file to be used with 'set-from-file'
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* [cli] Added 'list' to outputs
  Also updated inline with PR comments
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* fixing bug related to athena data bucket not existing (#1163)
* adding fix for #1158
* removing unnecessary perms for data bucket access by rule promotion function
* [cli] Fix backend initialization (#1166)
  Previously, the init command had a '-b' flag. This only initialized a local backend but the description on the flag stated 'useful for refreshing a pre-existing deployment'. This flag now actually reflects this
  Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com>
* fixing venv module usage (#1174)
* adding updated graphic with new font (#1182)
* updating graphic one more time (#1184)
* Remove sample access bucket name to prevent broken deploys
* feedback
Co-authored-by: Ryxias <derek.wang@airbnb.com>
Co-authored-by: Garret Reece <GarretReece@users.noreply.github.com>
Co-authored-by: darkjokelady <chunyong.lin@gmail.com>
Co-authored-by: Blake Motl <blake@motl.dev>
Co-authored-by: Scott Piper <0xdabbad00@users.noreply.github.com>
Co-authored-by: Ricard Flores Duran <ricard.flores@teamcmp.com>
Co-authored-by: jack1902 <39212456+jack1902@users.noreply.github.com>
Co-authored-by: Ryxias <sunsilverdragon@gmail.com>

* Changes 3.0.0+ behavior to by default include SQS prefix (#979)
* Changes 3.0.0+ behavior to by default include SQS prefix
* PR comment
* Fix up tests
* Convert all python to python3.7 (#974)
* [setup] Configure the Vagrantfile for Python 2.x and 3.x development.
  The `vagrant/` folder contains bash scripts used to configure virtualenv, virtualenvwrapper, streamalert, and terraform. Additionally, a patch for the recent libssl1.1 manual prompt during apt configuration is included to allow for automatic builds without user interaction. Build scripts are mostly configurable through the Vagrantfile and the manipulation of the exposed environment variables (`SA_*` environment variables). These allow versions and credentials to be specified and automatically propagated to the guest VM built by Vagrant.
* [core][setup] Update dependencies to Python 3.
  The requirements-top-level.txt was modified to remove the version pinning of the aliyun-python-sdk-* dependencies, allowing the upgrade to Python 3 compatible versions. The requirements.txt was updated to reflect the Python 3 compatible versions of the project dependencies.
* [setup] Fix the `.gitignore` after the fork merge which required rebase.
* [testing] Run `2to3-2.7 -n -w tests/`, no modifications of output.
* [testing] Change `assert_items_equal` to `assert_count_equal`.
  In Python 3, the `assertItemsEqual` function is named `assertCountEqual` (https://docs.python.org/2/library/unittest.html#unittest.TestCase.assertItemsEqual). Since Nose derives from this naming, the function references have been updated across all tests so imports work correctly.
* [core] The raw 2to3 pass for the core packages.
  This handles many of the syntax issues and semantic differences. It appears that the remaining issues are mostly type and library changes from Python 2.x to Python 3.x.
* [core] Change division to work the same as in Python 2.
  In Python 2, '/' division of two integers results in an integer. However, in Python 3 it results in a float. Changing it to '//' performs floored division.
* [core] Change core semantics related to strings vs bytes.
  In Python 3 the difference between strings and bytes is made explicit. The distinction can be enforced through the use of `.encode(...)` and `.decode(...)`. This commit addresses this difference in most of the core library.
* [core] Remove the forward compatibility __bool__.
* [testing] Fix the patching of builtins.
  The builtins package is now required to allow patching of builtins. This commit imports it in the appropriate tests and updates the patch targets appropriately.
* [testing] Update the use of `reload` for Python 3.
  The `reload` function was moved to the `importlib` package in Python 3. This commit adds the appropriate import and updates the usage within tests.
* [core] Update error message reporting for Python 3.
  In Python 3, exception objects no longer have a built-in `.message` attribute. Instead, the generally accepted practice is to call `str(exception)` to convert it into a string. This commit updates the usage appropriately across the core library.
* [core] Update generators to use return instead of StopIteration.
  In Python 3.7 and later, a `StopIteration` raised inside a generator body is converted to a `RuntimeError` instead of signaling an exhausted generator (PEP 479). This commit converts the use of `StopIteration` to `return`, signaling generator exhaustion without raising the error.
* [testing] Update the alert merger test to disregard call order.
  The `mock_logger.assert_has_calls(...)` asserts that the calls provided are executed in the order provided. However, in Python 3, the order of calls is different. The `any_order=True` parameter is provided to the assertion to disregard this order. Upon deeper analysis of the test semantics, it is testing to ensure that merging, deletion, and dispatching occur appropriately. The implementation semantics appear to be unchanged and correct according to the test, and the only difference between the Python 2 and Python 3 versions is execution order.
* [testing] Mock a class variable to support a `dict.get`.
  Since the `Normalizer` class is mocked, it returns `None` for the `_types_config` class variable of type `dict`. In order to allow a `.get`, the `_types_config` is patched to be an empty `dict`.
* [testing] Update the error returned by a failed JSON parse call.
* [testing] Switch to assert_dict_equal and change order of test `dict`s.
  This commit continues to update certain comparisons to use the `assert_dict_equal` helper, and also fixes tests with incorrectly ordered dictionaries.
* [testing][wip] Is this supposed to require failing status codes?
  It appears that the `PhantomOutput._setup_container` should `return false`, which required stubbing the `get_mock` and `post_mock` to return failing status codes. This was added to allow the test to pass, but needs to be checked for correctness.
* [testing] Update another test to use `assert_dict_equal`.
* [testing] Update the DuoApp test to properly patch abstract methods.
  In Python 3 the `__abstractmethods__` variable contains all abstract methods which must be implemented in classes deriving from it. This causes problems when attempting to test the `DuoApp` class, since it does not implement the `_type` method. Furthermore, the `patch.object` decorator will patch instances of a particular object in all methods matching the specified method name test prefix (e.g. `test_some_method` where the test prefix is `test_*`). This causes issues because the `setup` function does not match the test prefix, which results in an error when attempting to create an instance of `DuoApp` before each test. This commit fixes this issue by patching all methods matching the test prefix, as well as explicitly patching the `setup` method.
* [testing] Use the Python 3 AST when generating rule checksum in testing.
  The AST has changed between Python 2 and Python 3. Therefore, the test AST required updating in order to reflect the Python version upgrade.
* [testing][wip] Update the terraform generation test to be order-agnostic.
  The use of pre-defined strings was replaced with the `ANY` helper to reduce dependency on order. Any value produced under these fields will result in a True assertion during the test, ensuring correct production, while failing to account for changes in specific key values.
* [testing] Bump the version of travis to use Python 3.7.
* [testing] Xenial is required on Travis for Python 3.7.
* address PR comments
* removed reference to __nonzero__
* updates to manage.py for python3, plus some fixes for integration tests
* update the terraform version configured by vagrant
* updated use of the regex._pattern_type to regex.Pattern
* lambdas now run on python3.7
* convert bytes to string for generating athena partition refresh query
* fix for requirements.txt and athena unit tests
* linter fixes and relevant unit test updates
* bandit properly excludes the tests directory
* documentation for getting started and contributing now references python 3.7
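The generator change listed above (using `return` instead of `StopIteration`) can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from the StreamAlert code base; the sketch only assumes Python 3.7 or later, where PEP 479 behavior is the default.

```python
def take_while_positive(values):
    """Yield values until the first non-positive one (hypothetical helper)."""
    for value in values:
        if value <= 0:
            # A bare `return` ends the generator cleanly on Python 2 and 3.
            return
        yield value


def legacy_take_while_positive(values):
    """Same logic written the Python 2 way, kept only to show the failure."""
    for value in values:
        if value <= 0:
            # On Python 3.7+ this is converted to RuntimeError (PEP 479).
            raise StopIteration
        yield value


print(list(take_while_positive([3, 1, 0, 5])))

try:
    list(legacy_take_while_positive([3, 1, 0, 5]))
except RuntimeError as err:
    print('legacy generator failed:', err)
```

The first form is the one that stays safe across interpreter versions, which is presumably why the migration standardized on it.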
* [setup] Configure the Vagrantfile for Python 2.x and 3.x development.
  The `vagrant/` folder contains bash scripts used to configure virtualenv, virtualenvwrapper, streamalert, and terraform. Additionally, a patch for the recent libssl1.1 manual prompt during apt configuration is included to allow for automatic builds without user interaction. Build scripts are mostly configurable through the Vagrantfile and the manipulation of the exposed environment variables (`SA_*` environment variables). These allow versions and credentials to be specified and automatically propagated to the guest VM built by Vagrant.
* [core][setup] Update dependencies to Python 3.
  The requirements-top-level.txt was modified to remove the version pinning of the aliyun-python-sdk-* dependencies, allowing the upgrade to Python 3 compatible versions. The requirements.txt was updated to reflect the Python 3 compatible versions of the project dependencies.
* [setup] Fix the `.gitignore` after the fork merge which required rebase.
* [testing] Run `2to3-2.7 -n -w tests/`, no modifications of output.
* [testing] Change `assert_items_equal` to `assert_count_equal`.
  In Python 3, the `assertItemsEqual` function is named `assertCountEqual` (https://docs.python.org/2/library/unittest.html#unittest.TestCase.assertItemsEqual). Since Nose derives from this naming, the function references have been updated across all tests so imports work correctly.
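As a rough illustration of the renamed assertion, the sketch below uses only the standard-library `unittest` method `assertCountEqual`, from which the Nose helper name `assert_count_equal` is derived. The test class and the `run_example` function are hypothetical names chosen for this example.

```python
import unittest


class CountEqualExample(unittest.TestCase):
    """Hypothetical test case showing the Python 3 name of the assertion."""

    def test_same_items_any_order(self):
        # Passes: same elements with the same multiplicities, order ignored.
        self.assertCountEqual(['a', 'b', 'b'], ['b', 'a', 'b'])

    def test_multiplicity_matters(self):
        # The inner comparison fails because 'b' appears twice on one side only.
        with self.assertRaises(AssertionError):
            self.assertCountEqual(['a', 'b', 'b'], ['a', 'b'])


def run_example():
    """Run the example tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualExample)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == '__main__':
    run_example()
```

Despite the name, the assertion compares element counts per item, not just the total length, so it is a drop-in replacement for the Python 2 `assertItemsEqual`.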
* [core] The raw 2to3 pass for the core packages. This handles many of the syntax issues and semantic differences. It appears that the remaining issues are mostly type and library changes from Python 2.x to Python 3.x.
* [core] Change division to work the same as in Python 2. In Python 2, '/' division of two integers results in an integer. However, in Python 3 it results in a float. Changing it to '//' performs floored division, which preserves the Python 2 behavior.
* [core] Change core semantics related to strings vs bytes. In Python 3 the difference between strings and bytes is made explicit. Conversions between the two are performed through `.encode(...)` and `.decode(...)`. This commit addresses this difference in most of the core library.
* [core] Remove the forward compatibility `__bool__`.
* [testing] Fix the patching of builtins. The builtins package is now required to allow patching of builtins. This commit imports it in the appropriate tests and updates the patch targets appropriately.
* [testing] Update the use of `reload` for Python 3. The `reload` function was moved to the `importlib` package in Python 3. This commit adds the appropriate import and updates the usage within tests.
* [core] Update error message reporting for Python 3. In Python 3, not all error objects have the `.message` attribute. Instead, the generally accepted practice is to perform a `str(exception)` to convert it into a string. This commit updates the usage appropriately across the core library.
* [core] Update generators to use return instead of StopIteration. As of Python 3.7 (PEP 479), a `StopIteration` raised inside a generator is converted into a `RuntimeError` instead of signaling an exhausted generator. This commit converts the use of `StopIteration` to `return`, signaling generator exhaustion without raising the error.
* [testing] Update the alert merger test to disregard call order. The `mock_logger.assert_has_calls(...)` asserts that the calls provided are executed in the order provided. However, in Python 3, the order of calls is different.
The `any_order=True` parameter is provided to the assertion to disregard this order. Upon deeper analysis of the test semantics, it is testing to ensure that merging, deletion, and dispatching occur appropriately. The implementation semantics appear to be unchanged and correct according to the test, and the only difference between the Python 2 and Python 3 versions is execution order.
* [testing] Mock a class variable to support a `dict.get`. Since the `Normalizer` class is mocked, it returns `None` for the `_types_config` class variable of type `dict`. In order to allow a `.get`, the `_types_config` is patched to be an empty `dict`.
* [testing] Update the error returned by a failed JSON parse call.
* [testing] Switch to assert_dict_equal and change order of test `dict`s. This commit continues to update certain comparisons to use the `assert_dict_equal` helper, and also fixes tests with incorrectly ordered dictionaries.
* [testing][wip] Is this supposed to require failing status codes? It appears that the `PhantomOutput._setup_container` should `return False`, which required stubbing the `get_mock` and `post_mock` to return failing status codes. This was added to allow the test to pass, but needs to be checked for correctness.
* [testing] Update another test to use `assert_dict_equal`.
* [testing] Update the DuoApp test to properly patch abstract methods. In Python 3 the `__abstractmethods__` variable contains all abstract methods which must be implemented in classes deriving from it. This causes problems when attempting to test the `DuoApp` class, since it does not implement the `_type` method. Furthermore, when applied to a test class, the `patch.object` decorator only patches methods whose names match the test prefix (e.g. `test_some_method` where the test prefix is `test_*`).
This causes issues because the `setup` function does not match the test prefix, so it is left unpatched, which results in an error when attempting to create an instance of `DuoApp` before each test. This commit fixes this issue by patching all methods matching the test prefix, as well as explicitly patching the `setup` method.
* [testing] Use the Python 3 AST when generating rule checksum in testing. The AST has changed between Python 2 and Python 3. Therefore, the test AST required updating in order to reflect the Python version upgrade.
* [testing][wip] Update the terraform generation test to be order-agnostic. The use of pre-defined strings was replaced with the `ANY` helper to reduce dependency on order. Any value produced under these fields passes the assertion, which confirms that the fields are generated but no longer catches changes to their specific values.
* [testing] Bump the version of travis to use Python 3.7.
* [testing] Xenial is required on Travis for Python 3.7.
* address PR comments
* removed reference to `__nonzero__`
* updates to manage.py for python3, plus some fixes for integration tests
* update the terraform version configured by vagrant
* updated use of the regex._pattern_type to regex.Pattern
* lambdas now run on python3.7
* convert bytes to string for generating athena partition refresh query
* fix for requirements.txt and athena unit tests
* linter fixes and relevant unit test updates
* bandit properly excludes the tests directory
* documentation for getting started and contributing now references python 3.7
* [setup] Configure the Vagrantfile for Python 2.x and 3.x development. The `vagrant/` folder contains bash scripts used to configure virtualenv, virtualenvwrapper, streamalert, and terraform. Additionally, a patch for the recent libssl1.1 manual prompt during apt configuration is included to allow for automatic builds without user interaction.
Build scripts are mostly configurable through the Vagrantfile and the manipulation of the exposed environment variables (`SA_*` environment variables). These allow versions and credentials to be specified and automatically propagated to the guest VM built by Vagrant.
* [core][setup] Update dependencies to Python 3. The requirements-top-level.txt was modified to remove the version pinning of the aliyun-python-sdk-* dependencies, allowing the upgrade to Python 3 compatible versions. The requirements.txt was updated to reflect the Python 3 compatible versions of the project dependencies.
* [testing] Run `2to3-2.7 -n -w tests/`, no modifications of output.
* [core] The raw 2to3 pass for the core packages. This handles many of the syntax issues and semantic differences. It appears that the remaining issues are mostly type and library changes from Python 2.x to Python 3.x.
* [core] Change core semantics related to strings vs bytes. In Python 3 the difference between strings and bytes is made explicit. Conversions between the two are performed through `.encode(...)` and `.decode(...)`. This commit addresses this difference in most of the core library.
* [testing] Fix the patching of builtins. The builtins package is now required to allow patching of builtins. This commit imports it in the appropriate tests and updates the patch targets appropriately.
* [testing] Switch to assert_dict_equal and change order of test `dict`s. This commit continues to update certain comparisons to use the `assert_dict_equal` helper, and also fixes tests with incorrectly ordered dictionaries.
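The builtins patching change in the bullets above can be sketched roughly as follows. This is an illustration only, assuming the standard library `unittest.mock`; `read_contents`, `demo`, and the path are hypothetical and do not come from the test suite.

```python
import builtins
from unittest.mock import mock_open, patch


def read_contents(path):
    """Hypothetical function under test that relies on the builtin open()."""
    with open(path) as handle:
        return handle.read()


def demo():
    """Patch the builtin open() so no real file is touched."""
    # Python 2 tests targeted the '__builtin__' module; in Python 3 the
    # module is named 'builtins', so the patch target changes accordingly.
    with patch.object(builtins, 'open', mock_open(read_data='hello world')):
        return read_contents('/nonexistent/path')
```

The same target can also be given as a string, `patch('builtins.open', ...)`, which is the form most affected by the module rename.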
* address PR comments * linter fixes and relevant unit test updates * updating the intercom unit tests to be py3 compliant * one last merge artifact fixed * removed outdated requirements * addressing initial PR comments * better error handling when manage.py invoked without proper commands, plus addressing PR comments * linter fixes * remove cbapi dependency from alert_processor lambda * added cbapi back in, pip install for lambda packaging updated to not use cache directory * merging master into release-3-0-0 (#982) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * misc fixups to python3 changes (#986) * adding __pycache__ to gitignore * adding logic to not raise exception on s3 test event in athena func * adding record logging upon rule failure * ensuring csv reader receives str when bytes is passed * ensuring same order for some terraform variables * fixing typo * adding unit test for csv bytes * adding traceback formatter for to prevent crappy output (#988) * removing ushlex that is incompatible with py3 (defaultshlex supports unicode now) (#990) * merging master into release-3-0-0 branch (#991) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * Hotfix/links and spelling (#967) * Fix broken link in the documentation * Minor spelling updates in comments, code, and docs * Bumped slack app timeout (#983) * small hack to retain original lambda formatter (#993) * small hack to retain original lambda formatter * allowing for optional formatter spec * fixing bug with joining non-string values (#994) * [apps][box] upgrade box sdk and rebuild the dependencies package (#997) * [apps][box] upgrade box sdk and rebuild the dependencies package * Address my wordy headers * adding proper vagrant ignore (#1002) * [rule_promotion] Change default value of alert_count to -1 (#1001) * [rule_promotion] Change default value of alert_count to -1 * Address Ryan's comment, good 
catch * [apps] Handle JSONDecoderError (#999) * [apps] Handle JSONDecoderError * Update comment * Breakup log schema file (#981) * Separated logs.json into multiple files * Added unit tests for split schema config * Moved schema loading logic into function * Added documentation to cover the new split file schemas * Removed extraneous import * Removed extraneous whitespace * Fixed wording, whitespace, added comments * Fixed more josn formatting * Added comments for SchemaSorter and clarified docs * More docs and fixed whitespace issues * Docs clarifications, fixed schemas dir test * Fixed another docs issue * Added test for logs and schema exists * Consolidated schemas logic * Fixed docs, added schemas as a TLK * Removed SchemaSorter inheritance from object * Exclude logs.json from being written CLI * Public release for LookupTables (#1003) * [LookupTables] Phase 1 implementation of LookupTables rebuild (#969) * First pass for lookuptables * First in a long line of tests * Metaprogramming hell * Removes old code * Working through kinks and whatever * Better messaging, better reliability * Tests and such * Tests for the S3 driver * It's coming together * Making more progress on dynamodb driver * Getting closer * Deletes old LookupTables * Maybe maybe maybe * Theres gotta bea better way * Gottem tests * Add tests for proving the concept of multi-lookuptables on a single DynamoDB Table * Add lots of tests and such * Ready for testing * A comment * Amending pylint errors * DRYS out DynamoDb mock testing code * Missing region * ? 
* adds cache classes prior to integrating them * Fixup * Ports over dynamodb driver * Ports S3 driver to cache * Pylint * Cache eviction * PR feedback fixup * pr feedback * Updates to fix python3 * Pylint * Refactors manage.py (#992) * App and configure commands * Athena command * Terraform build and clean * Create alarm commands * Metrics command * Fix bug * Terraform commands * List targets command * Status command * Kinesis and rollback command * Rule stagnign * Finished * Fixups * Ye * Pylint * Done * PR feedback; cleanup * Pr feedback * LookupTables management via manage.py (#984) * Refactors s3 driver stuff * Yeah!!!! it works!!! * Test coverage * Pylint * ?? * done * Pytlint * We gucci now * Changes default configurations * Fixup * Fix bug * PR feedback * Automatically generates AWS IAM Policies for LookupTables (#996) * Terraform modules * Fixup * Fix bugs * More fixups * Manage.py generate * Fix bug * BROKEN COMMIT programmatically generate roles * Fixup subtle bu * Uses aws_iam_role_policy_attachment instead of iam_policy_attachemnt * DRYs out terraform module generation code * LookupTables documentation (#1004) * Documentation * Updates test coverage and adds some bugfixes * fixup * Fixup * Fixup * Fixup * Fixes issues with Integration tests (#1005) * Fixes a bug in integrationtests Signed-off-by: Derek Wang <derek.wang@airbnb.com> * Touches up tests * pylint * Touchups * PR feedbak * Terraform format (#1007) * [LookupTables] Fixes bug with module generation (#1009) * Fixes a bug where clusters in stream alert CLI were wrong * Manage build erroneously said it accepted cluster arguments * Fix issue with defaults when no LookupTables resources are available * Re-works LookupTables terraform generation to work when types of tables are omitted * PR feedback * PR fixup * Accepts JSON from LookupTables CLI, adds `list-add` command (#1010) * Supports JSON for LookupTables * Yep * Refactors some stuff and adds list-add command * Fixup * Fixup * Rework pylint * 
fixing package naming for consistency (#1013) * renaming folders for consistency and updating imports * more updates to import paths, unit tests passing * migrating alert_merger tests * migrating shared tests * migrating apps tests * migrating athena partition refresh tests * migrating alert processor tests * migrating rule promotion tests * fixing bad copy/paste * migrating more publishers testing, other alert proc renames * migrating threat intel downloader tests * fixing more bad copy/paste * migrating streamalert cli terraform tests * migrating the rest of streamalert cli tests * removing old cruft fixtures * removing weird test printing * fixing docs version logic * updating logger prefix * adding reset logic to LookupTablesCore * fix more naming, related to terraform module paths (#1014) * renaming folders for consistency and updating imports * more updates to import paths, unit tests passing * migrating alert_merger tests * migrating shared tests * migrating apps tests * migrating athena partition refresh tests * migrating alert processor tests * migrating rule promotion tests * fixing bad copy/paste * migrating more publishers testing, other alert proc renames * migrating threat intel downloader tests * fixing more bad copy/paste * migrating streamalert cli terraform tests * migrating the rest of streamalert cli tests * removing old cruft fixtures * removing weird test printing * fixing docs version logic * updating logger prefix * adding reset logic to LookupTablesCore * updating tf_stream_alert_flow_logs module references * updating tf_stream_alert_app_iam module references * updating tf_stream_alert_athena module references * updating tf_stream_alert_cloudtrail module references * updating tf_stream_alert_cloudwatch module references * updating tf_stream_alert_globals module references * updating tf_stream_alert_kinesis_events module references * updating tf_stream_alert_kinesis_firehose_delivery_stream module references * updating 
tf_stream_alert_kinesis_firehose_setup module references * updating tf_stream_alert_kinesis_streams module references * updating tf_stream_alert_monitoring module references * updating tf_stream_alert_s3_events module references * updates to some stream_alert references previously missed * fixing formatting in docs * updates to terraform for better namespacing, consistency, tagging (#1015) * fixing role name in tf_cloudtrail module * fixing role name in tf_flow_logs module, updating tests * fixing role name in tf_kinesis_streams module, updating var name and tests * fixing role name in tf_cloudwatch module, updating var name and tests * fixing role name in tf_kinesis_firehose_setup module, updating var name and tests * updating to tf_threat_intel_downloader module, rm unused vars * role policy name updates, etc, etc * fixing tf_* module references * resolving more naming issues * adding tags to resources that support them * adding fix for #886 * fixing last small things * DRYing out some old code * updating sns topic with prefix * pr feedback * updating monitoring sns topic references * fix missing tf var issue (#1018) * fixing tf missing var * updating terraform aws provider * adding prefix to firehoses and other small updates (#1022) * removing athena default db name to enforce that one is provided * formatting nit * adding default value for metric to tf_metric_filters module * prefixing firehoses created for data * adding missing test docstring * making prefix for unit testing consistent * adding prefixing support for data firehoses * adding role_count var to lookuptables terraform due to tf bug (#1021) * fixing missing tf var and adding support for optional firehose prefixing (#1024) * fixing bug introduced in #1022 * adding optional prefixing for firehose * adding client support for optional prefix * nit naming change * updating old tf var for bug fixed upstream * fixing resource name in lookuptables * pr feedback * [WIP - do not merge] fixing bugs found 
during resource migration (#1025) * ensuring order of metric alarms remains unchanged * getting rid of terrible naming * fixing firehose permission with classifier * fixing sorting of metric alarms * fixing missed resource name * [LookupTables] Adds new JSON file import CLI command (#1017) * Enhancements to lookuptables cli * Fixup * Yea * Updates * Fix tests * pylint * Fixup PR feedback * fixup * Fix typo in docs (#1028) * Fix little bug (#1031) * AlertMerger query optimizations (#1030) * Fix generator bug and add a limit of 5000 alerts * Avoid consistent, blocking table scan * fixup * Removes a (probably) unused method. AlertTable.rule_names() (#1033) * updating the flow logs module to not create unnecessary resources (#1034) * updating the flow logs module to not create unnecessary resources * pr feedback * classifier function naming update, rules engine optional stats output (#1019) * updating classifier naming convention; BREAKING CHANGE * update to support toggling output of rule stats * docstring update and import removal * removing cruft * making change for proper formatting of kinesis stream names (#1027) * making change for proper formatting of kinesis stream names * adding custom kinesis stream name support * updating unit tests for custom stream name support, adding new one * removing 'kinesis' suffix from stream names * removing unused variable * adding documentation for optional kinesis stream name (#1037) * adding documentation for optional kinesis stream name * nit change * fixing iam group prefix * small tweak to env var setting for rules lambda (#1039) * Migrate sources to cluster (#1008) * Added data_sources to clusters * Moved sources to cluster config * Updated unit tests * Changed CLUSTERS env var to actual cluster * Updated tests, docs and references to sources TLK * Docs update * Python 2-3 compat issue * Pylint fixes * Realized validate sources is an internal function * Added data_sources TLK to example json blocks * Switched from using 
tuples * Quote cleanup, moved more code into validate_sources * More cleanup * Get env variable at load time not runtime * Fix cluster env errors, detect duplicate data sources * Removed no-op i forgot about from testing * Fixed docs errors, corrected dupe source checking * Fixed confusion with missing sources versus invalid sources * Fixed some namespacing and import order from rebase * Fixed more namespacing * Fixed global loading sources config in test classifier * Added multi cluster support for testing * Fixed some bad decisions * Name changes * Added brea for finding cluster in cli handler * Wording cleanup, error on missind data_sources, other cleanup * Fixed a missing , * [terraform] update cloudwatch and flow logs terraform module to reduce redundancy (#1041) * adding aliased terraform providers for regions * updating tf_flow_logs module to remove redundancy * adding a tf_cloudwatch_logs_destination module to replace tf_cloudwatch module * removing replaced tf_cloudwatch module * updating infinitedict function to support initial value * updating terraform generation code for aforementioned changes * updating tests 01 * removing legacy test because we ain't living in the past * fixing tests for default terraform settings * updating tests 02 * updates to documentation * addressing PR feedback * fixing a few bugs (#1042) * update gsuite apps for changes to gsuite api (#1046) * updating gsuite streamalert apps * removing deprecated saml app * rm unused import * updating requirements for new google api client, etc * updating packaging for new pinned versions * updating gsuite apps for new python sdk library * misc pylint fixup * pinning pylint version because annoyed * [cli] Handle when AttributeError when test classifier (#1045) * [cli] Handle when AttributeError when test classifier * Address comment * fix bugs with updated dependencies and gsuite app (#1049) * updating box precompiled dependencies * fixing gsuite groups app name * updating readme * final 
updates for terraform 12 support (#1052) * [terraform] upgrade Terraform to 0.12.9 (#1035) * Terraform 0.12 upgrade * adding conf files back * minor consmetic edits * test case fix * terraform upgrade on new files * removing deprecated sources.json * not sorting keys in config, since it is not necessary * fixing typo * removing s3 events legacy garbage * updating tf_s3_events module for proper handling of more than one filter per bucket * updating tf_s3_events module generation code and tests * updating documentation to reflect changes to tf_s3_events module * fixing unit test * fixing small bugs with terraform 12 code (#1053) * fixing some bugs found with terraform 0.12.9 deployment * fixing some bugs and removing unnecessary prefixing * Critical API call detection fixes (#1029) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * Hotfix/links and spelling (#967) * Fix broken link in the documentation * Minor spelling updates in comments, code, and docs * Bumped slack app timeout (#983) * Fix logic for S3 Public Block Access; Add detection of Organization calls * update to use sets vs lists * [rules] Detect use of PutAccountPublicAccessBlock (#1023) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * Hotfix/links and spelling (#967) * Fix broken link in the documentation * Minor spelling updates in comments, code, and docs * Bumped slack app timeout (#983) * Detect use of PutAccountPublicAccessBlock * [docs] Add Glue to the permissions needed (#1020) * [apps] Correctly update aliyun timestamp (#978) * tweaking .gitignore file slightly for venv (#980) * Hotfix/links and spelling (#967) * Fix broken link in the documentation * Minor spelling updates in comments, code, and docs * Bumped slack app timeout (#983) * Add Glue to the permissions needed Learnt this the hard way... 
* adding fix for box app request timeout (#1040) (#1057) * adding fix for #1040 * addressing pr feedback * [terraform] support remote state file locking (#1059) * Added terraform state file locking support * Switched from enabled to disabled by default * Updated tests to account for new configuration * Updated documentation to explain how to setup state locking * DDB Table is now created automagically * Finalized code for managing the dynamo table * Added tests * Removed no longer needed config * Updated docs * Various updates from PR feedback * Pylint fixes whoops * Import order, trailing whitespace * Fixed function comment format * [cli] Fix unknown module referenced bug (#1060) * raise exception when config sources are misconfigured (#1063) * raising exception when source is not defined * updating unit tests for source exceptions * fixing pylint * updating old ConfigError class * renaming all known references of "stream_alert" to "streamalert" (#1064) * mass rename of stream_alert_secrets * updating config reading for new "streamalert" module name * updating unit tests for config read logic * fixing unit tests that did not have a docstring * updating stream_alert_app references * vagrant naming update * more stream_alert_apps renames * remaining batch rename of stream_alert --> streamalert * [schema] Add two new keys to osquery.json file (#1062) * [cli] Allow to rebuild partitions when the statement length exceeds the limit (#1067) * [cli] Allow to rebuild partitions when the statement length exceeds the limit * address comments * refactoring the cloudtrail terraform module to decouple cloudwatch events (#1069) * updating s3_events tf module for flexibility * renaming cloudwatch module to cloudwatch_destinations * moving cloudwatch events to its own module * updating cloudtrail terraform module to not be so poop filled * updating readmes * moving cloudtrail --> cloudwatch logs logic to tf submodule * updating cloudtrail module generation code and tests * small 
tweak to cloudtrail generation * updating docs for new modules * small update * removing invalid/unused statement (#1071) * Don't throw an exception if no partitions are added (#1070) * Don't throw an exception if no partitions are added * Switched to assert_has_calls * Cleaned up unused imports * Docs/general update (#1076) * Updated terraform version and git branch Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * removed step no longer required, as the choices are dynamically created based on the @StreamAlertOutput class decorator Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * reset to stable and changed note Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [docs] Correcting URL in contributing.rst as previous a HTTP 404 error (#1081) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Fix a parser bug when processing raw event encapsulated in a string (#1085) see issue: #1084 for more information Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [unit test] Use single quote around strings (#1087) * [core] Adding trendmicro malware schema and rule (#1077) [testing] Added trendmicro schema and rule test Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [terraform] Implemented a fix for the count error (#1089) [testing] fixed test for rules_engine assertion Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [terraform] fixed destroy issue by reverting #1060 (#1093) This in-turn re-introduced #1047. I fixed this by ensuring that the cleanup function removes the metric_filters.tf.json file, otherwise terraform reads this in as part of its deployment. 
Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [testing] enable trend tests, previously only schema (#1096) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [rule] Fix cloudtrail_public_resources (#1102) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Updated cloudtrail:events optional_key (#1101) Updated the optional_top_level_keys for cloudtrail:events Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [outputs] add Microsoft Teams as an alerting output (#1079) * [core] Initial Microsoft Teams output code commit, looking for feedback Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [testing] added TeamsOutput Testing (used slack tests as template), ammended list for output_base aswell Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [docs] added Microsoft Teams to output documentation [setup] added pymsteams to reuirements-top-level and added sample-webhook to outputs.json Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Moved pymsteams to package.py [docs] Corrected docstring for teams and added teams to outputs [core] Added Alert section to card (didn't have the alert_id which made it confusing previously) [testing] re-wrote the tests Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Added dynamic_outputs to Rule (#1095) * Now possible to pass dynamic_outputs to the @rule decorator and have outputs be dynamically configured based on information in the record. 
For example, you could use lookup_tables to map an account_id to an owner which maps to an output [testing] Updated unit tests and added additional tests for new dynamic_outputs [docs] Added dynamic_outputs documentation Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] added aws-ses as an output (#1082) [testing] added aws-ses output tests [docs] updated docs/source/outputs.rst to include aws ses [terraform] Added ses:SendRawEmail to tf_alert_processor_iam Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [docs] added aws-ses to outputs.rst (#1103) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Added trendmicro normalized_types (#1105) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * threat_intel_downloader module now uses tf_lambda module (#1074) * threat intel downloader terraform module now uses tf_lambda * small cleanups make for happier linters * fixed some stale references in the threat_intel_downloader terraform module * rebase release-3-1-0 from release-3-0-0 (#1109) * Docs/general update (#1076) * Updated terraform version and git branch Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * removed step no longer required, as the choices are dynamically created based on the @StreamAlertOutput class decorator Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * reset to stable and changed note Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [docs] Correcting URL in contributing.rst as previous a HTTP 404 error (#1081) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Fix a parser bug when processing raw event encapsulated in a string (#1085) see issue: #1084 for more information Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [unit test] Use single quote around strings (#1087) * [core] Adding trendmicro malware schema and rule (#1077) [testing] 
Added trendmicro schema and rule test Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [terraform] Implemented a fix for the count error (#1089) [testing] fixed test for rules_engine assertion Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [terraform] fixed destroy issue by reverting #1060 (#1093) This in-turn re-introduced #1047. I fixed this by ensuring that the cleanup function removes the metric_filters.tf.json file, otherwise terraform reads this in as part of its deployment. Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [testing] enable trend tests, previously only schema (#1096) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [rule] Fix cloudtrail_public_resources (#1102) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Updated cloudtrail:events optional_key (#1101) Updated the optional_top_level_keys for cloudtrail:events Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Added trendmicro normalized_types (#1105) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [outputs] add Microsoft Teams as an alerting output (#1079) * [core] Initial Microsoft Teams output code commit, looking for feedback Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [testing] added TeamsOutput Testing (used slack tests as template), ammended list for output_base aswell Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [docs] added Microsoft Teams to output documentation [setup] added pymsteams to reuirements-top-level and added sample-webhook to outputs.json Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Moved pymsteams to package.py [docs] Corrected docstring for teams and added teams to outputs [core] Added Alert section to card (didn't have the alert_id which made it confusing previously) [testing] re-wrote the tests Signed-off-by: 
jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Added dynamic_outputs to Rule (#1095) * Now possible to pass dynamic_outputs to the @rule decorator and have outputs be dynamically configured based on information in the record. For example, you could use lookup_tables to map an account_id to an owner which maps to an output [testing] Updated unit tests and added additional tests for new dynamic_outputs [docs] Added dynamic_outputs documentation Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] added aws-ses as an output (#1082) [testing] added aws-ses output tests [docs] updated docs/source/outputs.rst to include aws ses [terraform] Added ses:SendRawEmail to tf_alert_processor_iam Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [docs] added aws-ses to outputs.rst (#1103) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * threat_intel_downloader module now uses tf_lambda module (#1074) * threat intel downloader terraform module now uses tf_lambda * small cleanups make for happier linters * fixed some stale references in the threat_intel_downloader terraform module Co-authored-by: jack1902 <39212456+jack1902@users.noreply.github.com> Co-authored-by: darkjokelady <chunyong.lin@gmail.com> Co-authored-by: Garret Reece <GarretReece@users.noreply.github.com> * Vagrant AWS cli and env vars (#1112) * [setup] fixing missing awscli Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [setup] Natively pass AWS credentials through ssh Updated the sshd_config file to allow AWS_* to be passed through when using vagrant ssh. 
This allows you to not hardcode credentials inside of the vm Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Fixed bug with dynamic_outputs and publishers (#1125) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [testing] Setting other AWS variables during testing (#1121) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [rules] Add community rule to detect ssh login activity based on osquery events (#1127) * [rules] Add community rule to alert on ssh login activity based on osquery detection * address comments * updating duo auth schema for new alias key (#1129) * Kinda rebase PR 874 to support parquet format * bug bug bug don't bite me * Add athena_partition_refresh lambda function back * Address comment * Fix bug with default value for firehose use_prefix (#1122) * fix pylint failure * [docs/misc] documentation overhaul, config format changes, removing periods from bucket names (#1114) * updating secrets bucket name to remove periods * updating s3-logging bucket name to remove periods * updating athena-results bucket name to remove periods * updating streamalerts bucket name to remove periods * updating terraform-state bucket name to remove periods * updating streamalert-data bucket name to remove periods * fixing misc places that were missed regarding periods in bucket names * making data and alerts bucket names configurable * restructuring docs to allow for highlighting global settings * doc updates for alerts_table config * updating documentation, round 1 * moving clusters and global docs to new file * trailing space removal * Update to commands for consistency * misc formatting fixes, migrating rule-staging config to global * updating cluster config docs * massive updates to docs * adding other changes related to doc updates and config changes * adding prefix validation for periods * removing prefix setting trash * updates to remove need for setting terraform config * removing nonsense 
for athena bucket configuration * addressing PR comments * removing kinesis region setting since it would break things * addressing chunyong PR feedback * adding new streamalert images and updating doc references * fixup * [terraform] switching to state lock table being managed by terraform (#1131) * Switch State Lock Table to be Managed by Terraform * Missed some things whoops * Pylint....................... * Add newline between functions * Official release of StreamQuery, StreamAlert's scheduled query service (#1133) * Ports over internal StreamQuery code; integrates into manage.py (#1128) * Ports over StreamQuery code verbatim, without proprietary query packs * Removes some dead streamquery code, adds Apache license doc blocks, restructures some stuff * Pylint fixups * An extremely WIP commit for StreamQuery terraform * Fix some bugs in terraform module * Renames streamquery to scheduled_queries * First WORKING DRAFT of scheduled queries. - renames terraform resources to scheduled_queries - implements functioning deploy - configuration * Adds new query_parameters execution field. Supports dynamic query_packs. Supports better tf module configurations * Move scheduled_queries directory properly * Extracts parameter generation into own file * Documentation * Adds in lambda_config * Fixup tests * Fix documentation * Fix test files * Typo * Fixup code-block * Fix terraform comment conventions * Fixes some doc blocks, removes __versions__ * Fix stuff * Fixup * Refactors StreamQuery a little to "hide" the ServiceContainer (#1130) * Refactors StreamQuery a little to "hide" the ServiceContainer * Restructures tests * Fixup * fixup * Fixup * Fixup again * fixup again * [docs] Updated style and fixed bug (#1139) * Updated the style of the documentation to be consistent with the new format and fixed indentation * Fixed a bug in rules.rst when referencing the dynamic_outputs. 
swapped _ to - * Updated references according to https://documentation-style-guide-sphinx.readthedocs.io/en/latest/style-guide.html#headings Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [docs] Fixing docstring within method (#1141) Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [misc] updating authors/contributors, copyright updates, docs project name fix (#1134) * updating copyright name * updating authors * updating docs project name and copyright * updating README.rst * more copyright fix * misc doc updates * re-adding background to images so the content is visible upon click * final update to images (hopefully) * Add packetbeat parser * [core] Moved the secret_store from S3 to SSM (#1142) * [core] Moved the secret_store from S3 to SSM [testing] Updated unit_tests to use SSM instead of S3 [terraform] Updated alert_processor permissions so it can pull from param store Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [core] Removed S3Driver for Credentials [testing] Added SSMDriver Tests Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> Co-authored-by: Ryxias <sunsilverdragon@gmail.com> * Follow up PR to SSM migration (#1147) * Changes the ssm credentials conventions * Refactors some code to be clearer * Fixup * Add a FIXME comment * Another fixup * Doublequote * fixup * [cli] Added additional commands to outputs (#1138) * [cli] Added additional commands to outputs 'set' - Set one output, using user_input (pass --update to overwrite existing outputs) 'get' - Get configured outputs for a service (includes creds) optionally pass --decriptors to only pull certain descriptor secrets for the service 'set-from-file' - Set numerous outputs via a json file. 
Can set multiple services and descriptors (pass --update to overwrite existing outputs) 'generate-skeleton' - Use to create a skeleton json file to be used with 'set-from-file' Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * [cli] Added 'list' to outputs Also updated inline with PR comments Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * address leftovers when merge to latest release-3-1-0 * Ooooops, forgot to address ryan's comment * Add some schema tests. * Add Tests for the Rules :) * [rules] Added AWS Config Compliance and Remediation Rules * Aws prefix added to relevant matchers Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * Add packbeat reference. * Fixes a bug when generating scheduled_queries terraform (#1162) * FIx incorrect input restrictions * Fix bug with bucket names * fixing bug related to athena data bucket not existing (#1163) * adding fix for #1158 * removing unnecessary perms for data bucket access by rule promotion function * [cli] Fix backend initialization (#1166) Previously, the init command had a '-b' flag. This only initialized a local backend but the description on the flag stated 'useful for refreshing a pre-existing deployment'. This flag now actually reflects this Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * fixing bug with targeting resources (#1150) * [cli] Fix backend initialization (#1166) Previously, the init command had a '-b' flag. This only initialized a local backend but the description on the flag stated 'useful for refreshing a pre-existing deployment'. This flag now actually reflects this Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * Remove duplicated resource and variables. 
* [core] Added button ability to teams output (#1168) It is now possible to pass '@teams.buttons' via the publication to add links in the form of buttons to the Teams Card [testing] Added appropriate unit_tests Signed-off-by: jack1902 <39212456+jack1902@users.noreply.github.com> * Add log_type as an optional TLK * Add OSquery snapshot test. * fixing venv module usage (#1174) * Add isotimestamp to duo schema (#1175) * Add backward compatible support for output in JSON format * Change default store_format to null * [docs] Update historical search docs * adding support for cloudtrail to sns for new objects in s3 (#1177) * [core] Use terraform to generate Athena tables * Add log_type to batch. * Address comments * Add additional prefix 'parquet' to firehose and athena table file location * adding updated graphic with new font (#1182) * updating graphic one more time (#1184) * refactor rule integration tests to be included with rules (#1179) * first pass at rule integration test refactor * adding proper return for manage.py main func * adding change for UniqueSortedListAction in CLI * changing `validate_schema_only` to `classify_only` * initial change for rules directory testing * fixing bad integration test formatting/usage * quick update to docs * updating packbeat rules for previous changes * misc formatting fixes * simple helper to ensure valid directories are passed for rule tests * using file handles for test rules * first updates to testing framework * update to packetbeat test events * moving classify only test event * WIP - initial fixtures refactor * adding event file to encapsulate event files and events * fix test event * moving code * first take on mocks * fix threatintel mocks * finalized mocking of fixtures * updates to events and test results * finalized handler code * prettify and cleanup * adding example duo rule using lookup tables * adding readme for local fixture usage * adding/updating readmes for fixtures * adding some unit tests * fix for 
excluding test * TestEvent unit tests * more testevent tests * fixing copy pasta * removing unused imports * First draft for publisher tests with jmespath * Fixup * Fixup + docs * Removes all publisher tests + enhances integration tests (#1185) * Deletes slack publisher unit test and moves them to integration tests * Removes final publisher tests, adds new clause * fixing bug with printing during rules * removing old cruft * removing old fixture files * support for updated format for test fixtures * updating test events for new fixture formatt * pylint fixup * updating publisher rule test file as example * removing trash script * fixing bug when handling multiple test files * updating docs for rule tests * updating publisher example test event * nitting out double quotes (") * adding docs for test fixture usage * disabling threat intel again * disabling rule by default that uses threat intel Co-authored-by: Derek Wang <derek.wang@airbnb.com> * [core][testing] sanitize log name when there are dots in it * Address comments and add more test cases * Refactor config helpers little bit after rebase * fixing some test events after merge of #1179 (#1189) * fixup for after merge of #1179 * changing logger code slightly * [core] Hash firehose stream name it is too long * [ci] disable bandt check on hashlib.md5() method * [docs] Update doc * Address comments * Address more comment * Remove sample access bucket name to prevent broken deploys * feedback * ignoring json files during packaging, fixes for log names (#1197) * checking for defined log schema * ignoring json files during packaging * addressing firehose naming issue * additional fixes * updates to unit tests * addressing feedback * Cylin fix create table via athena tf (#1198) * Fix create_table and generate_data_table_schema bug * Delete inaccurate comments Co-authored-by: Chunyong Lin <chunyong.lin@airbnb.com> * [tf] Change alerts_bucket type to string in tf_rule_promotion_iam module * Move Terraform Files to 
Support Packaging (re #1194) (#1200) * Move terraform config to streamalert_cli * adding back providers.tf * small updates to terraform path stuff Co-authored-by: Blake Motl <blake.motl@airbnb.com> * fixing bug related to ignoring all json files (#1201) * [fix] Fix some bugs during internal deployment of parquet (#1202) Co-authored-by: Chunyong Lin <chunyong.lin@airbnb.com> * Make scheduled_queries/ directory configurable from global.json * [cli] fix a minor bug when list-targets * [doc] Update docs for file_format setting * [core] Missing alerts table creation during init * Change logging serverity to warning if firehose module is not enabled * typo * Address comments * fixing bug with clean subcommand (#1212) * pr feedback * bumping version to 3.1.0 Co-authored-by: Ryxias <derek.wang@airbnb.com> Co-authored-by: Garret Reece <GarretReece@users.noreply.github.com> Co-authored-by: darkjokelady <chunyong.lin@gmail.com> Co-authored-by: Blake Motl <blake@motl.dev> Co-authored-by: Scott Piper <0xdabbad00@users.noreply.github.com> Co-authored-by: Ricard Flores Duran <ricard.flores@teamcmp.com> Co-authored-by: jack1902 <39212456+jack1902@users.noreply.github.com> Co-authored-by: Chunyong Lin <chunyong.lin@airbnb.com> Co-authored-by: Gavin Elder <gavin@improbable.io> Co-authored-by: Ryxias <sunsilverdragon@gmail.com> Co-authored-by: Gavin <gav.elder@gmail.com> Co-authored-by: Blake Motl <blake.motl@airbnb.com>

Blake Motl added 6 commits August 22, 2019 15:54

Separated logs.json into multiple files

e0ee278

Added unit tests for split schema config

04cf7f4

Moved schema loading logic into function

c86a35d

Added documentation to cover the new split file schemas

46a7c30

Removed extraneous import

d171f14

Removed extraneous whitespace

91e8d05

Ryxias reviewed Aug 23, 2019

Fixed wording, whitespace, added comments

3d45ae0

chunyong-lin reviewed Aug 26, 2019

ryandeivert added classifier logs log schemas labels Aug 26, 2019

ryandeivert added this to the 3.0.0 milestone Aug 26, 2019

ryandeivert added improvement and removed logs labels Aug 26, 2019

Blake Motl added 3 commits August 26, 2019 13:16

Fixed more josn formatting

0f80f5b

Added comments for SchemaSorter and clarified docs

069c193

More docs and fixed whitespace issues

988bdb1

ryandeivert reviewed Aug 26, 2019

Blake Motl added 4 commits August 26, 2019 15:32

Docs clarifications, fixed schemas dir test

69e21f9

Fixed another docs issue

991a855

Added test for logs and schema exists

27b388b

Consolidated schemas logic

163b75b

ryandeivert reviewed Aug 30, 2019

Blake Motl added 2 commits September 4, 2019 17:18

Fixed docs, added schemas as a TLK

302754b

Merged with release-3-0-0

8e8d061

Removed SchemaSorter inheritance from object

28d154a

Exclude logs.json from being written CLI

a9c9780

Ryxias approved these changes Sep 5, 2019

blakemotl merged commit 8edb0dd into release-3-0-0 Sep 5, 2019

Ryxias deleted the breakup-log-schema-file branch September 5, 2019 23:48

ryandeivert changed the title ~~Breakup log schema file~~ [logs] support schema definitions in multiple files vs logs.json file Feb 3, 2020

ryandeivert changed the title ~~[logs] support schema definitions in multiple files vs logs.json file~~ [conf] support schema definitions in multiple files vs logs.json file Feb 3, 2020
