Add CarbonBlack downloader #52

austinbyers · 2017-08-29T01:57:03Z

Overview

Size: Extra Large

CarbonBlack automatically uploads new binaries that it finds on endpoints; users who have CarbonBlack can now optionally enable a CarbonBlack downloader Lambda function to copy binaries from CarbonBlack into BinaryAlert.

To enable the downloader, run the new python3 manage.py configure command, which lets the user set the CarbonBlack URL and encrypts its API key.

Additionally, python3 manage.py cb_copy_all allows users to copy the entire CarbonBlack binary corpus into BinaryAlert in one go.

Full documentation will be added in the next PR.

Change Summary

Adds lambda_functions/downloader
- Adds an enable_carbon_black_downloader Terraform variable. All downloader resources are created only if the downloader is explicitly enabled.
Changes to manage.py:
- Adds configure and cb_copy_all commands
- Renames test command to unit_test (to distinguish it from live_test)
- Refactors configuration management into a separate class
- Adds unit test coverage for the first time
Pylint has been slightly relaxed to allow longer variable names and more branches
Travis email notifications disabled
Updates pip requirements to their latest versions
Includes type annotations in all new code

Resolves: #29 (add downloader)
Resolves: #48 (additional name prefix validation)
Contributes to: #34 (type annotations)

Tested

CI

Added unit tests for the downloader code as well as most of manage.py. As you can see from the commit history, mocking everything correctly was a huge pain. In the future, I think we should remove moto entirely (we already have to write our own Dynamo and S3 mocks due to 2 separate moto issues).

Test Deploy: Downloader Disabled (Default)

$ python3 manage.py deploy
ERROR: name_prefix "" does not match format [a-z][a-z0-9_]{3,50}
Please run "python3 manage.py configure"

$ python3 manage.py configure
AWS Region (us-east-1):
Unique name prefix, e.g. "company_team": ba_test_638
Enable the CarbonBlack downloader [yes/no]? (no): no
Updated configuration successfully saved to terraform/terraform.tfvars!

$ python3 manage.py deploy
...
Apply complete! Resources: 41 added, 0 changed, 0 destroyed.

$ python3 manage.py live_test
Live test succeeded!

$ python3 manage.py cb_copy_all
ERROR: CarbonBlack downloader is not enabled.
Please run "python3 manage.py configure"
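
The name prefix check seen in the first deploy attempt can be sketched as follows. Only the pattern is taken from the error message above; the function name, the use of a full-string match, and the ValueError are assumptions for illustration, not the actual manage.py code.

```python
import re

# Pattern copied from the error message above; everything else is illustrative.
NAME_PREFIX_PATTERN = re.compile(r'[a-z][a-z0-9_]{3,50}')


def validate_name_prefix(name_prefix: str) -> None:
    """Raise ValueError if name_prefix does not match the expected format.

    Hypothetical helper: assumes the whole string must match the pattern.
    """
    if not NAME_PREFIX_PATTERN.fullmatch(name_prefix):
        raise ValueError(
            'name_prefix "{}" does not match format {}'.format(
                name_prefix, NAME_PREFIX_PATTERN.pattern))
```

Under this reading, "ba_test_638" is accepted, while an empty prefix, a prefix shorter than four characters, or one that starts with a digit or uppercase letter is rejected.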

Test Deploy: Enable Downloader

After the previous deploy, we can easily re-configure and re-deploy:

$ python3 manage.py configure
AWS Region (us-east-1):
Unique name prefix, e.g. "company_team" (ba_test_638):
Enable the CarbonBlack downloader [yes/no]? (no): yes
CarbonBlack URL: [URL redacted]
CarbonBlack API token (only needs binary read access):
Terraforming KMS key...
aws_kms_key.carbon_black_credentials: Creation complete
aws_kms_alias.encrypt_credentials_alias: Creation complete
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Encrypting API token...
Updated configuration successfully saved to terraform/terraform.tfvars!

$ python3 manage.py deploy
Apply complete! Resources: 8 added

$ python3 manage.py cb_copy_all
[Sampled Output]
2017-08-28 18:49:39,630 INFO   Start 32 consumers
2017-08-28 18:49:40,042 DEBUG  Enqueuing CopyTask [#0] MD5: ...
2017-08-28 18:49:40,045 DEBUG  Enqueuing CopyTask [#1] MD5: ...
2017-08-28 18:49:40,046 DEBUG  Enqueuing CopyTask [#2] MD5: ...
2017-08-28 18:49:40,046 INFO   [Consumer-1] Executing CopyTask [#0] MD5: ...
2017-08-28 18:49:40,047 INFO   [Consumer-2] Executing CopyTask [#1] MD5: ...
2017-08-28 18:49:40,047 INFO   [Consumer-3] Executing CopyTask [#2] MD5: ...
... More logs ...
2017-08-28 18:49:55,089 INFO   [Consumer-1] Exiting
2017-08-28 18:49:57,675 INFO   [Consumer-2] Exiting
2017-08-28 18:50:07,184 INFO   [Consumer-3] Exiting
2017-08-28 18:50:07,184 INFO   All CopyTasks Finished!
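
The log output above suggests a producer/consumer layout: tasks are enqueued, a pool of consumers executes them, and each consumer exits once the queue is drained. A minimal sketch of that pattern is below. It is not the code from copy_all.py: it uses threads instead of processes to stay self-contained, and copy_all, copy_func and num_consumers are hypothetical names.

```python
import logging
import queue
import threading
from typing import Callable, List, Optional

LOGGER = logging.getLogger('copy_all_sketch')


def copy_all(md5s: List[str], copy_func: Callable[[str], None],
             num_consumers: int = 4) -> List[str]:
    """Run copy_func for every MD5 using a pool of consumers.

    Returns the sorted list of MD5s whose copy raised an exception.
    """
    tasks = queue.Queue()   # Holds MD5 strings, then one None sentinel per consumer.
    failed = queue.Queue()  # Collects MD5s that could not be copied.

    def consume() -> None:
        name = threading.current_thread().name
        while True:
            md5 = tasks.get()  # type: Optional[str]
            if md5 is None:  # Sentinel: no more work for this consumer.
                LOGGER.info('[%s] Exiting', name)
                return
            try:
                LOGGER.info('[%s] Executing CopyTask MD5: %s', name, md5)
                copy_func(md5)
            except Exception:  # Record the failure and keep consuming.
                LOGGER.exception('[%s] CopyTask failed for MD5: %s', name, md5)
                failed.put(md5)

    consumers = [
        threading.Thread(target=consume, name='Consumer-{}'.format(i + 1))
        for i in range(num_consumers)
    ]
    for consumer in consumers:
        consumer.start()
    for md5 in md5s:
        tasks.put(md5)
    for _ in consumers:
        tasks.put(None)
    for consumer in consumers:
        consumer.join()

    failed_md5s = []
    while not failed.empty():
        failed_md5s.append(failed.get())
    return sorted(failed_md5s)
```

Collecting failures in a second queue lets the caller report every failed MD5 at the end instead of stopping at the first error.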

Reviewers

to: @ryandeivert
cc: @mime-frame @airbnb/binaryalert-maintainers

coveralls · 2017-08-29T01:59:00Z

Coverage increased (+3.6%) to 90.134% when pulling bb60c79 on abb--downloader into 6134b33 on master.

ryandeivert · 2017-08-30T18:17:39Z

lambda_functions/downloader/copy_all.py

+        self.failed_queue = failed_queue
+
+        # Each process needs its own logger to avoid race conditions.
+        self.logger = logging.getLogger(self.name)


As an aside, the logging package actually handles race conditions pretty well through its use of locks, etc. (as seen here). That said, this depends on the threading package being available, and I don't think you'll have that in Lambda (?). So in normal use cases a separate logger for each process is probably unnecessary, but it is probably the safer/better solution in this case.

This is actually a one-off script that is not designed to run in Lambda. It's probably confusing that it's part of the lambda_functions hierarchy, but I'm not sure where a better place to put it would be.

You're right; Python logging is thread-safe. I'll remove this.

ryandeivert

Great work!! This is super exciting to open source. A few comments/questions throughout. Also excellent job on all the unit tests & mocks. You're a pro!

ryandeivert · 2017-08-30T18:19:48Z

lambda_functions/downloader/copy_all.py

+    if failed_md5s:
+        logger.error(
+            '%d %s failed to copy: \n%s', len(failed_md5s),
+            'binary' if len(failed_md5s) == 1 else 'binaries', '\n'.join(sorted(failed_md5s)))


ryandeivert · 2017-08-30T18:24:17Z

lambda_functions/downloader/main.py

+LOGGER = logging.getLogger()
+LOGGER.setLevel(logging.INFO)
+
+ENCRYPTED_TOKEN = os.environ['ENCRYPTED_CARBON_BLACK_API_TOKEN']


Since these are required env variables, would you consider putting this into a try/except KeyError block that logs an error telling the user which env variables they must export (and then re-raises the exception)? Currently the KeyError is raised without much context on how to fix it. For instance:

try:
    CARBON_BLACK_URL = os.environ['CARBON_BLACK_URL']
    ENCRYPTED_TOKEN = os.environ['ENCRYPTED_CARBON_BLACK_API_TOKEN']
    TARGET_S3_BUCKET = os.environ['TARGET_S3_BUCKET']
except KeyError as err:
    # err.args[0] is the missing variable name (Python 3 exceptions have no .message attribute).
    LOGGER.error('Please export the environment variable \'%s\' using...blah', err.args[0])
    raise

I notice the 'TARGET_S3_BUCKET' env var is accessed within the _upload_to_s3 function and could presumably result in bigger problems if it doesn't exist. If there is not a chance that these would ever be missing, then you can probably safely ignore this :).

That's a good idea. I'm going to leave it for now because I want to standardize this across all of the Lambda functions in a subsequent PR.

For now, I will definitely add this check in copy_all.py, since that's the only code designed to be invoked by a local user.
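
One way to standardize the check discussed in this thread is a small helper that validates all required variables at import time and reports every missing one at once. This is only a sketch: the variable names come from the diff and the comment above, while load_required_env and its signature are hypothetical.

```python
import logging
import os
from typing import Dict, Mapping, Sequence

LOGGER = logging.getLogger(__name__)

# Variable names taken from the review thread above.
REQUIRED_ENV_VARS = (
    'CARBON_BLACK_URL',
    'ENCRYPTED_CARBON_BLACK_API_TOKEN',
    'TARGET_S3_BUCKET',
)


def load_required_env(names: Sequence[str],
                      environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the values of all required variables, or raise KeyError.

    Hypothetical helper: logs every missing variable before raising, so the
    user does not have to fix them one failure at a time.
    """
    missing = [name for name in names if name not in environ]
    if missing:
        LOGGER.error('Please export the environment variable(s): %s',
                     ', '.join(missing))
        raise KeyError(', '.join(missing))
    return {name: environ[name] for name in names}
```

Calling this once at module load also addresses the concern about reading TARGET_S3_BUCKET inside a function that runs in a loop: a missing variable fails immediately rather than partway through a batch.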

ryandeivert · 2017-08-30T18:33:18Z

lambda_functions/downloader/main.py

+    """Upload a binary to S3, keyed by a UUID.
+
+    Args:
+        local_file_path: [string] Path to the file to upload.


Consider switching to parentheses-style argument types (similar to what we started doing in SA).

I think you must have been looking at an older commit; all types in new code should be replaced by explicit annotations (which obviates the need for types in the docstrings).
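
To illustrate the convention described in the reply, here is a small annotated function whose docstring describes the arguments without repeating their types. The function itself (build_s3_metadata and its metadata keys) is hypothetical and not part of the PR.

```python
from typing import Dict


def build_s3_metadata(md5: str, observed_path: str) -> Dict[str, str]:
    """Build the metadata attached to an uploaded binary (hypothetical helper).

    Args:
        md5: MD5 digest of the binary.
        observed_path: Path where the binary was seen on the endpoint.

    Returns:
        Metadata key/value pairs.
    """
    return {'carbon_black_md5': md5, 'observed_path': observed_path}
```

The types live only in the signature, where tools such as mypy can check them, so the docstring cannot drift out of sync with them.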

ryandeivert · 2017-08-30T18:36:45Z

lambda_functions/downloader/main.py

+
+    with open(local_file_path, 'rb') as target_file:
+        S3_CLIENT.put_object(
+            Bucket=os.environ['TARGET_S3_BUCKET'],


See my other comment about env variables (and my concern about accessing this within a function/loop in case it doesn't exist).

ryandeivert · 2017-08-30T18:38:17Z

tests/lambda_functions/build_test.py

    def test_build_all(self, mock_print):
        """Verify that the top-level build function executes without error."""
        build.build(self._tempdir)
-        self.assertEqual(3, mock_print.call_count)
+        self.assertEqual(4, mock_print.call_count)


++ for call count. easy tests are the best tests

ryandeivert · 2017-08-30T20:45:10Z

manage.py

-        except ManagerError as error:
-            # Print error type and message, not full stack trace.
-            sys.exit('{}: {}'.format(type(error).__name__, error))
+        self._parse_config(allow_empty=(command == 'test'))


Can you clarify (to me) what the allow_empty param is used for here? It looks like you mention it's only used in unit tests, but then it's used here (not in a unit test).

This was a confusing flag to help with validation; it has since been removed, and the code is hopefully clearer now.

ryandeivert · 2017-08-30T20:46:38Z

lambda_functions/downloader/main.py


 # Exponential backoff: try up to 4 times, waiting longer each time.
 RETRY_SLEEP_SECS = [0, 30, 60, 120]


-def _download_from_carbon_black(md5):
+@backoff.on_exception(backoff.expo, ObjectNotFoundError, max_tries=5)
+def _download_from_carbon_black(binary: Binary) -> str:


++ for typing
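
The diff replaces a hand-written sleep schedule with the backoff library's decorator. To show the idea without depending on that library, here is a stdlib-only stand-in for retrying on one exception type with exponentially growing waits. retry_expo, base_delay and the injectable sleep parameter are assumptions for this sketch, and ObjectNotFoundError is redefined locally; the real library also supports options such as jitter that are omitted here.

```python
import functools
import time
from typing import Any, Callable, Type


class ObjectNotFoundError(Exception):
    """Local stand-in for the exception named in the diff above."""


def retry_expo(exc_type: Type[Exception], max_tries: int,
               base_delay: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> Callable:
    """Retry the wrapped function on exc_type, doubling the wait each time."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(max_tries):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == max_tries - 1:
                        raise  # Out of tries: surface the last error.
                    sleep(base_delay * 2 ** attempt)  # 1, 2, 4, ... seconds
        return wrapper
    return decorator
```

With max_tries=5 the function is called at most five times, with at most four waits in between.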

ryandeivert · 2017-08-30T20:47:55Z

lambda_functions/downloader/main.py

-
-def _upload_to_s3(local_file_path, md5, observed_path):
-    """Upload a binary to S3, keyed by a UUID.
+    download_path = '/tmp/carbonblack_{}'.format(binary.md5)


Instead of using a hardcoded tmp dir here, I'd suggest using the tempfile module, which offers interoperability across OSes.

Hmmm, that's a good point. Lambda explicitly allocates /tmp (which may not be the value returned by gettempdir). I'll try it and see!
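
A sketch of the suggestion, using tempfile.gettempdir() from the standard library. The file name format is taken from the diff above; download_path_for is a hypothetical helper name. As noted in the reply, whether gettempdir() resolves to /tmp inside Lambda would need to be verified.

```python
import os
import tempfile


def download_path_for(md5: str) -> str:
    """Return a download path inside the platform's temporary directory."""
    # gettempdir() consults TMPDIR/TEMP/TMP first, then platform defaults.
    return os.path.join(tempfile.gettempdir(), 'carbonblack_{}'.format(md5))
```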

ryandeivert · 2017-08-30T20:56:21Z

tests/lambda_functions/downloader/copy_all_test.py

@@ -30,7 +30,7 @@ class MockMain(object):
    """Mock out the downloader Lambda main.py."""
    CARBON_BLACK = MockCarbonBlack()

-    def __init__(self, inject_errors: bool=False):
+    def __init__(self, inject_errors: bool = False):


Is the space around the equals something Typing requires? Typically this causes a pylint error so just wondering

Pylint actually complains if there is not the extra space:

************* Module tests.lambda_functions.downloader.copy_all_test
C: 39, 0: Exactly one space required around keyword argument assignment
    def __init__(self, inject_errors: bool=False):
                                          ^ (bad-whitespace)

However, this is not keyword argument assignment, so I'm guessing pylint doesn't exactly know how to handle type annotations?

I prefer the no-space version, but I don't want to have to disable the whitespace rule, so I guess I'll live with it :)

Another note:

def __init__(self, inject_errors=False: bool):

is invalid syntax.

Thanks for clarifying!
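
The three spellings discussed in this thread can be checked directly with the built-in compile(). The helper name is_valid_syntax is made up for this sketch; the signatures are the ones quoted above, shortened to a function named f.

```python
def is_valid_syntax(source: str) -> bool:
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        compile(source, '<sketch>', 'exec')
    except SyntaxError:
        return False
    return True


# Annotation first, then the default, with spaces around "=".
WITH_SPACES = 'def f(self, inject_errors: bool = False): pass'
# Same order without spaces: valid Python, but flagged by pylint above.
NO_SPACES = 'def f(self, inject_errors: bool=False): pass'
# Default before the annotation: rejected by the parser.
DEFAULT_FIRST = 'def f(self, inject_errors=False: bool): pass'
```

Only DEFAULT_FIRST fails to compile; the choice between the first two is purely a style question.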

ryandeivert · 2017-08-30T21:05:09Z

tests/manage_test.py

+    @mock.patch.object(manage.Manager, 'build')
+    @mock.patch.object(manage.Manager, 'apply')
+    @mock.patch.object(manage.Manager, 'analyze_all')
+    def test_deploy(self, mock_analyze: mock.MagicMock, mock_apply: mock.MagicMock,


stacks on stacks on stacks

boots and cats and stacks on stacks
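
For readers less familiar with stacked patches: decorators are applied bottom-up, so the mock for the decorator closest to the function arrives as the first argument. The sketch below uses a stand-in Manager class and a plain function instead of a test method; the third parameter name, mock_build, is an assumption, since the signature in the diff is truncated.

```python
from unittest import mock


class Manager:
    """Stand-in for manage.Manager with the three patched methods."""

    def build(self) -> str:
        return 'real build'

    def apply(self) -> str:
        return 'real apply'

    def analyze_all(self) -> str:
        return 'real analyze_all'


@mock.patch.object(Manager, 'build')
@mock.patch.object(Manager, 'apply')
@mock.patch.object(Manager, 'analyze_all')
def check_order(mock_analyze: mock.MagicMock, mock_apply: mock.MagicMock,
                mock_build: mock.MagicMock) -> tuple:
    """Report whether each argument is the mock installed on Manager."""
    return (
        mock_analyze is Manager.analyze_all,  # Bottom decorator, first argument.
        mock_apply is Manager.apply,
        mock_build is Manager.build,          # Top decorator, last argument.
    )
```

Once check_order returns, the patches are undone and Manager's original methods are restored.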

coveralls · 2017-08-31T01:31:43Z

Coverage increased (+3.6%) to 90.139% when pulling b4b8ae4 on abb--downloader into 548fbfb on master.

coveralls · 2017-08-31T01:53:34Z

Coverage increased (+3.6%) to 90.139% when pulling be70930 on abb--downloader into 548fbfb on master.

austinbyers · 2017-08-31T01:59:29Z

@ryandeivert PTAL

I've squashed the commits to simplify things and addressed all of the feedback. Be sure to look at the downloader code again, because your last review was on an older commit for some reason.

ryandeivert · 2017-08-31T16:47:22Z

@austinbyers you're right - I had been stepping through commits since this PR was so large :)

ryandeivert

LGTM

austinbyers · 2017-08-31T17:53:36Z

@ryandeivert Thanks for your review! I should have done a better job of keeping track of and squashing the commits. I did another deploy just to test everything one last time, and once the Travis tests pass I'll go ahead and merge.

coveralls · 2017-08-31T17:54:33Z

Coverage increased (+3.6%) to 90.139% when pulling bb0e217 on abb--downloader into 548fbfb on master.

austinbyers requested a review from ryandeivert August 29, 2017 01:57

austinbyers added downloader cli testing terraform labels Aug 29, 2017

austinbyers modified the milestone: 1.0.0 Aug 29, 2017

ryandeivert reviewed Aug 30, 2017

ryandeivert requested changes Aug 30, 2017

airbnb deleted a comment from ryandeivert Aug 31, 2017

austinbyers force-pushed the abb--downloader branch from b4b8ae4 to 5551964 Compare August 31, 2017 01:38

Austin Byers added 2 commits August 30, 2017 18:41

Add downloader and update requirements

225eafb

Address PR feedback

ef0dfc6

austinbyers force-pushed the abb--downloader branch from 5551964 to ef0dfc6 Compare August 31, 2017 01:41

Austin Byers added 2 commits August 30, 2017 18:41

Merge branch 'master' of github.com:airbnb/binaryalert into abb--downloader

1f14af5

Fix bug from rebase

be70930

ryandeivert approved these changes Aug 31, 2017

Fix conditional downloader build and terraform fmt

bb0e217

austinbyers merged commit 7e6af6d into master Aug 31, 2017

austinbyers deleted the abb--downloader branch August 31, 2017 17:55

austinbyers mentioned this pull request Sep 6, 2017

Analyzer Revamp #57

Merged
