Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for AWS KMS #527

Merged
merged 27 commits into from
Feb 19, 2021
Merged

Added support for AWS KMS #527

merged 27 commits into from
Feb 19, 2021

Conversation

mpurusottamc
Copy link
Contributor

This is an initial version of pulling KMS key, alias and grant details and populating graph db.

@achantavy
Copy link
Contributor

@mpurusottamc Thanks so much for your PR and sorry for the delays. We've all been a bit more tied up than normal but I'll be able to get you a review late this week/early next week if someone else doesn't beat me to it.

docs/schema/aws.md Show resolved Hide resolved
cartography/intel/aws/kms.py Outdated Show resolved Hide resolved
cartography/intel/aws/kms.py Show resolved Hide resolved
Copy link
Contributor

@achantavy achantavy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR! I left a few comments about performance, schema conventions, and potentially using resource permission relationships. Let me know if you have any questions and hope this helps.

cartography/intel/aws/kms.py Outdated Show resolved Hide resolved
for kms_key in data:
neo4j_session.run(
ingest_keys,
KeyId=kms_key["KeyId"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between a KeyId and an Arn?

For AWS syncs, we should to do something like

ON CREATE SET kmskey.id = kms_key.arn,
kmskey.firstseen = timestamp()
SET kmskey.arn = kms_key.arn,
[...]

It's a bit duplicative, but at least viewers of your data will be able to tell what field(s) the id was based off of.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KeyId is just the Key Identifier, whereas ARN is fully formed URI for the Key. While populating Neo4j, we are using KeyId as the primary identifier.

As suggested, we have used UNWIND method for populating keys, grants and aliases.

@timeit
def parse_policy(key, policy):
"""
Uses PolicyUniverse to parse S3 policies and returns the internet accessibility results
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copypasta; this isn't an s3 policy

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed (oops!)

ingest_policies = """
UNWIND {policies} AS policy
MATCH (k:KMSKey) where k.name = policy.kms_key
SET k.anonymous_access = (coalesce(k.anonymous_access, false) OR policy.internet_accessible),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your schema documentation should include what kmskey.anonymous_access means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

schema updated

Ingest KMS Key Grants into neo4j.
"""
ingest_grants = """
MERGE (g:KMSGrant{id: {GrantId}})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about using unwind.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

for grant in grants_list:
neo4j_session.run(
ingest_grants,
GrantId=grant["GrantId"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this based off an Arn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each KMS Key Grant has an Id. but no ARN. That's why we are using the GrantId as the primary field.



@timeit
def load_kms_key_details(neo4j_session, policy_alias_grants_data, region, aws_account_id, update_tag):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize you used the s3 sync as a model here, and while this works, I wonder if using Resource Permission Relationships would be more powerful. See https://github.com/lyft/cartography/blob/master/docs/usage/permissions-mapping.md and (for a longer read) https://eng.lyft.com/iam-whatever-you-say-iam-febce59d1e3b.

The rough idea is that you edit the permission_relationships.yaml file and cartography draws links from the IAM principals to the resource nodes based on the allowed actions:
image

This RPR change for KMS keys might not get merged here depending on how it impacts performance (which is why we have that yaml config file that lets users decide which permissions they care about syncing) but I wanted to let you know that this feature exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achantavy you are right, we used S3 as the reference while populating KMS entities into Cartography. Our idea is to capture all the policies, statements, etc. which would allow the users for more fine grained queries

@achantavy
Copy link
Contributor

Oh another thing, please write integration tests to ensure your new nodes and relationships are created as expected; see https://github.com/lyft/cartography/blob/master/tests/integration/cartography/intel/aws/test_eks.py.

@mpurusottamc
Copy link
Contributor Author

@achantavy we have made all the required changes based on your feedback.

achantavy
achantavy previously approved these changes Feb 18, 2021
Copy link
Contributor

@achantavy achantavy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your contribution! :)

key['DeletionDate'] = str(key.get('DeletionDate'))
key['ValidTo'] = str(key.get('ValidTo'))

# The owner data returned by the API maps to the aws account nickname and not the IAM user
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this comment was copied from the s3 sync, is it still relevant here?

Copy link
Collaborator

@kedarghule kedarghule Feb 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achantavy No it's not relevant. I have removed it in latest commit :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to update your branch with the latest master and then I can merge it in.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achantavy Done :D

@achantavy achantavy merged commit 69eda2e into lyft:master Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants