A security toolkit for Amazon S3
Branch: master
Clone or download
Latest commit 684db5a Dec 17, 2018
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
bin Catch client errors as well Apr 20, 2018
s3tk Version bump to 0.2.1 Oct 16, 2018
tests
.gitignore First commit Sep 13, 2017
CHANGELOG.md
Dockerfile Added Docker image - closes #6 Apr 20, 2018
LICENSE.txt First commit Sep 13, 2017
MANIFEST.in First commit Sep 13, 2017
Makefile
README.md Use https link Dec 18, 2018
requirements.txt First commit Sep 13, 2017
setup.py Version bump to 0.2.1 Oct 16, 2018

README.md

s3tk

A security toolkit for Amazon S3

Another day, another leaky Amazon S3 bucket

— The Register, 12 Jul 2017

Don’t be the... next... big... data... leak

Screenshot

🍊 Battle-tested at Instacart

Installation

Run:

pip install s3tk

You can use the AWS CLI to set up your AWS credentials:

pip install awscli
aws configure

See IAM policies needed for each command.

Commands

Scan

Scan your buckets for:

  • ACL open to public
  • policy open to public
  • logging enabled
  • versioning enabled
  • default encryption enabled
s3tk scan

Only run on specific buckets

s3tk scan my-bucket my-bucket-2

Also works with wildcards

s3tk scan "my-bucket*"

Confirm correct log bucket(s) and prefix

s3tk scan --log-bucket my-s3-logs --log-bucket other-region-logs --log-prefix "{bucket}/"

Skip logging, versioning, or default encryption

s3tk scan --skip-logging --skip-versioning --skip-default-encryption

Get email notifications of failures (via SNS)

s3tk scan --sns-topic arn:aws:sns:...

List Policy

List bucket policies

s3tk list-policy

Only run on specific buckets

s3tk list-policy my-bucket my-bucket-2

Show named statements

s3tk list-policy --named

Set Policy

Note: This replaces the previous policy

Only private uploads

s3tk set-policy my-bucket --no-object-acl

Delete Policy

Delete policy

s3tk delete-policy my-bucket

Enable Logging

Enable logging on all buckets

s3tk enable-logging --log-bucket my-s3-logs

Only on specific buckets

s3tk enable-logging my-bucket my-bucket-2 --log-bucket my-s3-logs

Set log prefix ({bucket}/ by default)

s3tk enable-logging --log-bucket my-s3-logs --log-prefix "logs/{bucket}/"

Use the --dry-run flag to test

A few notes about logging:

  • buckets with logging already enabled are not updated at all
  • the log bucket must in the same region as the source bucket - run this command multiple times for different regions
  • it can take over an hour for logs to show up

Enable Versioning

Enable versioning on all buckets

s3tk enable-versioning

Only on specific buckets

s3tk enable-versioning my-bucket my-bucket-2

Use the --dry-run flag to test

Enable Default Encryption

Enable default encryption on all buckets

s3tk enable-default-encryption

Only on specific buckets

s3tk enable-default-encryption my-bucket my-bucket-2

This does not encrypt existing objects - use the encrypt command for this

Use the --dry-run flag to test

Scan Object ACL

Scan ACL on all objects in a bucket

s3tk scan-object-acl my-bucket

Only certain objects

s3tk scan-object-acl my-bucket --only "*.pdf"

Except certain objects

s3tk scan-object-acl my-bucket --except "*.jpg"

Reset Object ACL

Reset ACL on all objects in a bucket

s3tk reset-object-acl my-bucket

This makes all objects private. See bucket policies for how to enforce going forward.

Use the --dry-run flag to test

Specify certain objects the same way as scan-object-acl

Encrypt

Encrypt all objects in a bucket with server-side encryption

s3tk encrypt my-bucket

Use S3-managed keys by default. For KMS-managed keys, use:

s3tk encrypt my-bucket --kms-key-id arn:aws:kms:...

For customer-provided keys, use:

s3tk encrypt my-bucket --customer-key secret-key

Use the --dry-run flag to test

Specify certain objects the same way as scan-object-acl

Note: Objects will lose any custom ACL

Delete Unencrypted Versions

Delete all unencrypted versions of objects in a bucket

s3tk delete-unencrypted-versions my-bucket

For safety, this will not delete any current versions of objects

Use the --dry-run flag to test

Specify certain objects the same way as scan-object-acl

Scan DNS

Scan Route 53 for buckets to make sure you own them

s3tk scan-dns

Otherwise, you may be susceptible to subdomain takeover

Credentials

Credentials can be specified in ~/.aws/credentials or with environment variables. See this guide for an explanation of environment variables.

You can specify a profile to use with:

AWS_PROFILE=your-profile s3tk

IAM Policies

Here are the permissions needed for each command. Only include statements you need.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Scan",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:GetBucketAcl",
                "s3:GetBucketPolicy",
                "s3:GetBucketLogging",
                "s3:GetBucketVersioning",
                "s3:GetEncryptionConfiguration"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ScanDNS",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "route53:ListHostedZones",
                "route53:ListResourceRecordSets"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ListPolicy",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:GetBucketPolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "SetPolicy",
            "Effect": "Allow",
            "Action": [
                "s3:PutBucketPolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "DeletePolicy",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteBucketPolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EnableLogging",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:PutBucketLogging"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EnableVersioning",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:PutBucketVersioning"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EnableDefaultEncryption",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:PutEncryptionConfiguration"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ResetObjectAcl",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObjectAcl",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*"
            ]
        },
        {
            "Sid": "Encrypt",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*"
            ]
        },
        {
            "Sid": "DeleteUnencryptedVersions",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucketVersions",
                "s3:GetObjectVersion",
                "s3:DeleteObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*"
            ]
        }
    ]
}

Access Logs

Amazon Athena is great for querying S3 logs. Create a table (thanks to this post for the table structure) with:

CREATE EXTERNAL TABLE my_bucket (
    bucket_owner string,
    bucket string,
    time string,
    remote_ip string,
    requester string,
    request_id string,
    operation string,
    key string,
    request_verb string,
    request_url string,
    request_proto string,
    status_code string,
    error_code string,
    bytes_sent string,
    object_size string,
    total_time string,
    turn_around_time string,
    referrer string,
    user_agent string,
    version_id string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
    'serialization.format' = '1',
    'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\\") ([^ ]*)$'
) LOCATION 's3://my-s3-logs/my-bucket/';

Change the last line to point to your log bucket (and prefix) and query away

SELECT
    date_parse(time, '%d/%b/%Y:%H:%i:%S +0000') AS time,
    request_url,
    remote_ip,
    user_agent
FROM
    my_bucket
WHERE
    requester = '-'
    AND status_code LIKE '2%'
    AND request_url LIKE '/some-keys%'
ORDER BY 1

CloudTrail Logs

Amazon Athena is also great for querying CloudTrail logs. Create a table (thanks to this post for the table structure) with:

CREATE EXTERNAL TABLE cloudtrail_logs (
    eventversion STRING,
    userIdentity STRUCT<
        type:STRING,
        principalid:STRING,
        arn:STRING,
        accountid:STRING,
        invokedby:STRING,
        accesskeyid:STRING,
        userName:String,
        sessioncontext:STRUCT<
            attributes:STRUCT<
                mfaauthenticated:STRING,
                creationdate:STRING>,
            sessionIssuer:STRUCT<
                type:STRING,
                principalId:STRING,
                arn:STRING,
                accountId:STRING,
                userName:STRING>>>,
    eventTime STRING,
    eventSource STRING,
    eventName STRING,
    awsRegion STRING,
    sourceIpAddress STRING,
    userAgent STRING,
    errorCode STRING,
    errorMessage STRING,
    requestId  STRING,
    eventId  STRING,
    resources ARRAY<STRUCT<
        ARN:STRING,
        accountId:STRING,
        type:STRING>>,
    eventType STRING,
    apiVersion  STRING,
    readOnly BOOLEAN,
    recipientAccountId STRING,
    sharedEventID STRING,
    vpcEndpointId STRING,
    requestParameters STRING,
    responseElements STRING,
    additionalEventData STRING,
    serviceEventDetails STRING
)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED  AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION  's3://my-cloudtrail-logs/'

Change the last line to point to your CloudTrail log bucket and query away

SELECT
    eventTime,
    eventName,
    userIdentity.userName,
    requestParameters
FROM
    cloudtrail_logs
WHERE
    eventName LIKE '%Bucket%'
ORDER BY 1

Best Practices

Keep things simple and follow the principle of least privilege to reduce the chance of mistakes.

  • Strictly limit who can perform bucket-related operations
  • Avoid mixing objects with different permissions in the same bucket (use a bucket policy to enforce this)
  • Don’t specify public read permissions on a bucket level (no GetObject in bucket policy)
  • Monitor configuration frequently for changes

Bucket Policies

Only private uploads

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObjectAcl",
            "Resource": "arn:aws:s3:::my-bucket/*"
        }
    ]
}

Notes

The set-policy, enable-logging, enable-versioning, and enable-default-encryption commands are provided for convenience. We recommend Terraform for managing your buckets.

resource "aws_s3_bucket" "my_bucket" {
  bucket = "my-bucket"
  acl    = "private"

  logging {
    target_bucket = "my-s3-logs"
    target_prefix = "my-bucket/"
  }

  versioning {
    enabled = true
  }
}

Upgrading

Run:

pip install s3tk --upgrade

To use master, run:

pip install git+https://github.com/ankane/s3tk.git --upgrade

Docker

Run:

docker run -it ankane/s3tk aws configure

Commit your credentials:

docker commit $(docker ps -l -q) my-s3tk

And run:

docker run -it my-s3tk s3tk scan

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help: