# Amazon S3
> Introduction to AWS S3 & Glacier

- toc: true 
- comments: true
- author: Ankush Agarwal
- categories: [aws,S3,Glacier]

### Introduction

    Amazon S3 serves as the durable target storage for Amazon Kinesis and Amazon Elastic MapReduce (Amazon EMR).
        It is also used as the storage for Amazon Elastic Block Store (Amazon EBS) and Amazon Relational Database
        Service (Amazon RDS) snapshots, and as a data staging or loading storage mechanism for
        Amazon Redshift and Amazon DynamoDB.
           
    Amazon S3 objects are automatically replicated on multiple devices in multiple facilities within a region.
    You can create and use multiple buckets; you can have up to 100 per account by default.
    Objects can range in size from 0 bytes up to 5 TB, and a single bucket can store an unlimited number of objects.
    
    The native interface for Amazon S3 is a REST (Representational State Transfer) API. 
        With the REST interface, you use standard HTTP or HTTPS requests to create and delete buckets, 
        list keys, and read and write objects.
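
As a rough illustration of how the operations above map onto HTTP, here is a minimal Python sketch. The helper name `s3_rest_request`, the bucket name, and the default region are assumptions made for this example. Real requests also need AWS Signature Version 4 authentication, which is left out here.

```python
from urllib.parse import quote


def s3_rest_request(operation, bucket, key=None, region="us-east-1"):
    """Return the (HTTP method, URL) pair for a basic S3 REST operation.

    Hypothetical helper for illustration only: it builds virtual-hosted-style
    URLs and does not sign or send anything.
    """
    base = f"https://{bucket}.s3.{region}.amazonaws.com"
    if operation == "create_bucket":
        return "PUT", base + "/"
    if operation == "delete_bucket":
        return "DELETE", base + "/"
    if operation == "list_keys":
        return "GET", base + "/?list-type=2"
    if key is None:
        raise ValueError("object operations need a key")
    path = "/" + quote(key, safe="/")  # percent-encode everything except '/'
    if operation == "read_object":
        return "GET", base + path
    if operation == "write_object":
        return "PUT", base + path
    raise ValueError(f"unknown operation: {operation}")


for op in ("create_bucket", "list_keys", "write_object", "read_object"):
    print(op, *s3_rest_request(op, "example-bucket", "photos/cat 1.jpg"))
```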
        
    Amazon S3 achieves high durability by automatically storing data redundantly on multiple devices in
        multiple facilities within a region. It is designed to sustain the concurrent loss of data in two
        facilities without loss of user data. 

### Access Control
    Amazon S3 is secure by default; when you create a bucket or object in Amazon S3, only you have access. 
        To allow you to give controlled access to others, Amazon S3 provides both coarse-grained access 
        controls (Amazon S3 Access Control Lists [ACLs]), and fine-grained access controls (Amazon S3 
        bucket policies, AWS Identity and Access Management [IAM] policies, and query-string authentication).
        
    Using an Amazon S3 bucket policy, you can specify who can access the bucket, from where (by Classless 
        Inter-Domain Routing [CIDR] block or IP address), and during what time of day.
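
A bucket policy with these conditions might look like the sketch below. The bucket name, CIDR block, and timestamps are placeholders chosen for this example. Note that `aws:CurrentTime` compares against absolute timestamps, so this expresses a fixed access window rather than a schedule that repeats every day.

```python
import json


def build_bucket_policy(bucket, cidr, not_before, not_after):
    """Build a bucket policy that allows reads from one CIDR block in a time window.

    All argument values are caller-supplied; nothing is sent to AWS.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromCidrDuringWindow",
                "Effect": "Allow",
                "Principal": "*",                       # who
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "IpAddress": {"aws:SourceIp": cidr},               # from where
                    "DateGreaterThan": {"aws:CurrentTime": not_before},  # when
                    "DateLessThan": {"aws:CurrentTime": not_after},
                },
            }
        ],
    }


policy = build_bucket_policy(
    "example-bucket",            # placeholder bucket name
    "203.0.113.0/24",            # documentation-only address range
    "2030-01-01T09:00:00Z",
    "2030-01-01T17:00:00Z",
)
print(json.dumps(policy, indent=2))
```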
        
    Finally, IAM policies that grant access to an Amazon S3 bucket may be associated directly with IAM
        principals, just as they can grant access to any other AWS service or resource.
        
    Lifecycle configurations are attached to the bucket and can apply to all objects in the bucket or 
        only to objects specified by a prefix.
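
To make the prefix scoping concrete, here is a small sketch that builds a lifecycle configuration in the dictionary shape accepted by boto3's `put_bucket_lifecycle_configuration`. The rule ID, the `logs/` prefix, and the transition and expiration days are assumptions picked for illustration. An empty prefix would apply the rule to every object in the bucket.

```python
def build_lifecycle_configuration(prefix, archive_after_days, expire_after_days,
                                  rule_id="example-archive-rule"):
    """Build a one-rule lifecycle configuration scoped to a key prefix.

    The rule moves matching objects to the GLACIER storage class and later
    expires them. The day counts and rule_id are illustrative values.
    """
    if expire_after_days <= archive_after_days:
        raise ValueError("expiration must come after the transition")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},   # "" means all objects
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = build_lifecycle_configuration("logs/", 30, 365)
print(config)
# With boto3 and credentials available, this could be applied with:
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```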

### Encryption

    To encrypt your Amazon S3 data in flight, you can use the Amazon S3 Secure Sockets Layer (SSL) API 
        endpoints. This ensures that all data sent to and from Amazon S3 is encrypted while in transit 
        using the HTTPS protocol.
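
One common way to make sure clients actually use the HTTPS endpoints is a bucket policy statement that denies any request made without TLS. The sketch below builds such a statement. The bucket name is a placeholder, and the statement would need to be placed inside a full policy document before use.

```python
def build_https_only_statement(bucket):
    """Build a policy statement that denies all S3 actions over plain HTTP."""
    return {
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            f"arn:aws:s3:::{bucket}",      # the bucket itself
            f"arn:aws:s3:::{bucket}/*",    # every object in it
        ],
        # aws:SecureTransport is "false" when the request did not use TLS.
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


print(build_https_only_statement("example-bucket"))
```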
        
    To encrypt your Amazon S3 data at rest, you can use several variations of Server-Side Encryption (SSE). 
        Amazon S3 encrypts your data at the object level as it writes it to disks in its data centers and 
        decrypts it for you when you access it. All SSE performed by Amazon S3 and AWS Key Management Service
        (AWS KMS) uses the 256-bit Advanced Encryption Standard (AES). You can also encrypt your Amazon 
        S3 data at rest using Client-Side Encryption, encrypting your data on the client before sending it 
        to Amazon S3.
        
    SSE-S3 (AWS-Managed Keys)
        This is a fully integrated “check-box-style” encryption solution where AWS handles the key management 
        and key protection for Amazon S3. Every object is encrypted with a unique key. The actual object key 
        itself is then further encrypted by a separate master key. A new master key is issued at least monthly, 
        with AWS rotating the keys. Encrypted data, encryption keys, and master keys are all stored separately
        on secure hosts, further enhancing protection.
        
    SSE-KMS (AWS KMS Keys)
        This is a fully integrated solution where Amazon handles your key management and protection for 
        Amazon S3, but where you manage the keys. SSE-KMS offers several additional benefits compared to SSE-S3.
        Using SSE-KMS, there are separate permissions for using the master key, which provide protection against
        unauthorized access to your objects stored in Amazon S3 and an additional layer of control.
        
    SSE-C (Customer-Provided Keys)
        This is used when you want to maintain your own encryption keys but don’t want to manage or implement 
        your own client-side encryption library. With SSE-C, AWS will do the encryption/decryption of your 
        objects while you maintain full control of the keys used to encrypt/decrypt the objects in Amazon S3.
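
At the REST level, the three SSE variants differ mainly in the request headers sent with an upload. The sketch below builds those headers. The function name `sse_headers` and the KMS key ID shown are made up for this example, and the SSE-C key is randomly generated only for demonstration.

```python
import base64
import hashlib
import os


def sse_headers(mode, kms_key_id=None, customer_key=None):
    """Return the S3 request headers that select a server-side encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:  # omit to use the account's default KMS key for S3
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        md5_digest = hashlib.md5(customer_key).digest()  # integrity check only
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key":
                base64.b64encode(customer_key).decode("ascii"),
            "x-amz-server-side-encryption-customer-key-MD5":
                base64.b64encode(md5_digest).decode("ascii"),
        }
    raise ValueError(f"unknown mode: {mode}")


print(sse_headers("SSE-S3"))
print(sse_headers("SSE-KMS", kms_key_id="example-key-id"))  # hypothetical key ID
print(sorted(sse_headers("SSE-C", customer_key=os.urandom(32))))
```

With SSE-C the same key headers must be sent again on every read, since Amazon S3 does not keep the key.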
   
    Client-Side Encryption
        Client-side encryption refers to encrypting data on the client side of your application before sending 
        it to Amazon S3. 

### Pointers

    Pre-Signed URLs
        All Amazon S3 objects by default are private, meaning that only the owner has access. However, the 
        object owner can optionally share objects with others by creating a pre-signed URL, using their own
        security credentials to grant time-limited permission to download the objects. When you create a 
        pre-signed URL for your object, you must provide your security credentials and specify a bucket name, 
        an object key, the HTTP method (GET to download the object), and an expiration date and time. The 
        pre-signed URLs are valid only for the specified duration. This is particularly useful to protect 
        against “content scraping” of web content such as media files stored in Amazon S3.
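
With boto3, creating such a URL is roughly one call. The sketch below wraps `generate_presigned_url` in a small helper. The helper name, the bucket and key, and the 15-minute default lifetime are assumptions for this example. The client is passed in so the function itself has no dependency beyond an object exposing that method.

```python
def presigned_download_url(s3_client, bucket, key, expires_in_seconds=900):
    """Return a time-limited GET URL for one object.

    s3_client is expected to be a boto3 S3 client (or anything exposing a
    compatible generate_presigned_url method). The URL is signed with the
    credentials that client was created with.
    """
    if expires_in_seconds <= 0:
        raise ValueError("expiry must be a positive number of seconds")
    return s3_client.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in_seconds,
        HttpMethod="GET",
    )


# Typical use (requires boto3 and AWS credentials):
#   import boto3
#   url = presigned_download_url(boto3.client("s3"),
#                                "example-bucket", "media/video.mp4")
```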
        
    Multipart Upload
        To better support uploading or copying of large objects, Amazon S3 provides the Multipart Upload API. 
        This allows you to upload large objects as a set of parts, which generally gives better network 
        utilization (through parallel transfers), the ability to pause and resume, and the ability to 
        upload objects where the size is initially unknown.
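
To show how an object is cut into parts, here is a sketch that plans the part boundaries for an object of known size. The 5 MiB minimum part size and the 10,000-part maximum are the documented Amazon S3 limits, not values from the text above, and the 8 MiB default is an arbitrary choice. Each `(part_number, offset, length)` tuple could then be uploaded independently and in parallel.

```python
import math

MIN_PART_SIZE = 5 * 1024 * 1024   # every part except the last must be >= 5 MiB
MAX_PARTS = 10_000                # upper bound on parts per upload


def plan_parts(total_size, part_size=8 * 1024 * 1024):
    """Split total_size bytes into (part_number, offset, length) tuples."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    # Grow the part size if needed so the upload stays within the limits.
    part_size = max(part_size, MIN_PART_SIZE, math.ceil(total_size / MAX_PARTS))
    parts = []
    offset = 0
    number = 1  # part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


for part in plan_parts(20 * 1024 * 1024):
    print(part)
```

When the size is not known up front, the same idea applies, except that parts are cut from the incoming stream as it arrives instead of being planned in advance.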
        
    Range GETs
        It is possible to download (GET) only a portion of an object in both Amazon S3 and Amazon Glacier by 
        using something called a Range GET. Using the Range HTTP header in the GET request or equivalent 
        parameters in one of the SDK wrapper libraries, you specify a range of bytes of the object. This can 
        be useful in dealing with large objects when you have poor connectivity or to download only a known 
        portion of a large Amazon Glacier backup.
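
A Range GET is an ordinary GET with a `Range` header. The sketch below builds such a request with the standard library but does not send it. The URL is a placeholder, and an unauthenticated request like this would only succeed against a public object or a pre-signed URL. Byte positions are inclusive, so `bytes=0-1023` asks for the first 1,024 bytes.

```python
from urllib.request import Request


def range_get_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    return Request(
        url,
        headers={"Range": f"bytes={first_byte}-{last_byte}"},
        method="GET",
    )


req = range_get_request(
    "https://example-bucket.s3.us-east-1.amazonaws.com/backup.tar",  # placeholder
    0,
    1023,
)
print(req.get_method(), req.full_url, req.get_header("Range"))
```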
        
    Cross-Region Replication
        Cross-region replication is a feature of Amazon S3 that allows you to asynchronously replicate all 
        new objects in the source bucket in one AWS region to a target bucket in another region. Any metadata 
        and ACLs associated with the object are also part of the replication. After you set up cross-region
        replication on your source bucket, any changes to the data, metadata, or ACLs on an object trigger a 
        new replication to the destination bucket. To enable cross-region replication, versioning must be 
        turned on for both source and destination buckets, and you must use an IAM policy to give Amazon 
        S3 permission to replicate objects on your behalf.
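
The two prerequisites and the replication rule can be sketched as plain dictionaries in the shape boto3 expects. The role ARN, bucket names, and rule ID below are placeholders, and the choice to replicate every key (empty prefix) without replicating delete markers is an assumption made for this example.

```python
def build_replication_setup(role_arn, destination_bucket):
    """Return (versioning_config, replication_config) for cross-region replication.

    role_arn is the IAM role Amazon S3 assumes to replicate on your behalf.
    """
    versioning = {"Status": "Enabled"}  # needed on source AND destination
    replication = {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-everything",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches all keys
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }
    return versioning, replication


versioning, replication = build_replication_setup(
    "arn:aws:iam::123456789012:role/example-replication-role",  # placeholder
    "example-destination-bucket",
)
print(versioning)
print(replication)
# With boto3 and credentials available, these could be applied with:
#   s3.put_bucket_versioning(Bucket=name, VersioningConfiguration=versioning)
#   s3.put_bucket_replication(Bucket=source, ReplicationConfiguration=replication)
```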
        
    Logging
        In order to track requests to your Amazon S3 bucket, you can enable Amazon S3 server access logs. 
        Logging is off by default, but it can easily be enabled.
        
    Event Notifications
        Amazon S3 event notifications can be sent in response to actions taken on objects uploaded or stored 
        in Amazon S3. Event notifications enable you to run workflows, send alerts, or perform other actions 
        in response to changes in your objects stored in Amazon S3. You can use Amazon S3 event notifications 
        to set up triggers to perform actions, such as transcoding media files when they are uploaded, processing
        data files when they become available, and synchronizing Amazon S3 objects with other data stores.
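
If the notification target is an AWS Lambda function, the handler receives a JSON event listing the affected objects. Here is a minimal sketch of pulling the bucket and key out of that event. The function names are made up for this example, and the `print` stands in for real work such as starting a transcode job. Object keys arrive URL-encoded, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs for the object-created records in an S3 event."""
    results = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # ignore deletions and other event types
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # '+' means space
        results.append((bucket, key))
    return results


def handler(event, context):
    """Lambda-style entry point; replace the print with the real action."""
    for bucket, key in uploaded_objects(event):
        print(f"new object: s3://{bucket}/{key}")


sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {"bucket": {"name": "media"},
                   "object": {"key": "raw/my+video.mp4"}},
        }
    ]
}
handler(sample_event, None)
```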
        
    Another common pattern is to use Amazon S3 as bulk “blob” storage for data, while keeping an index to 
        that data in another service, such as Amazon DynamoDB or Amazon RDS. This allows quick searches and 
        complex queries on key names without listing keys continually.
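
The index pattern can be sketched with an in-memory SQLite database standing in for Amazon RDS or Amazon DynamoDB. The table layout, column names, and sample rows are invented for this example. The point is that lookups hit the index, and Amazon S3 is only touched to fetch the objects whose keys come back.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory index of S3 keys and a few searchable attributes."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE objects ("
        " s3_key TEXT PRIMARY KEY,"
        " owner TEXT,"
        " content_type TEXT,"
        " size_bytes INTEGER)"
    )
    db.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)
    return db


def keys_for_owner(db, owner, min_size=0):
    """Find keys by attributes without listing the bucket."""
    cursor = db.execute(
        "SELECT s3_key FROM objects"
        " WHERE owner = ? AND size_bytes >= ?"
        " ORDER BY s3_key",
        (owner, min_size),
    )
    return [row[0] for row in cursor]


index = build_index([
    ("photos/a.jpg", "alice", "image/jpeg", 120_000),
    ("photos/b.jpg", "bob", "image/jpeg", 95_000),
    ("videos/c.mp4", "alice", "video/mp4", 48_000_000),
])
print(keys_for_owner(index, "alice"))
print(keys_for_owner(index, "alice", min_size=1_000_000))
```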

### Amazon Glacier

    Archives
        In Amazon Glacier, data is stored in archives. An archive can contain up to 40 TB of data, and you 
        can have an unlimited number of archives. 
        
    Vaults
        Vaults are containers for archives. Each AWS account can have up to 1,000 vaults per region. You can control 
        access to your vaults and the actions allowed using IAM policies or vault access policies.
        
    Vault Locks
        You can easily deploy and enforce compliance controls for individual Amazon Glacier vaults with a 
        vault lock policy. You can specify controls such as Write Once Read Many (WORM) in a vault lock 
        policy and lock the policy from future edits. Once locked, the policy can no longer be changed.
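
A WORM-style control can be expressed as a vault lock policy that denies archive deletion until an archive reaches a given age. The sketch below builds such a policy document. The vault ARN and the 365-day retention period are placeholders for this example.

```python
import json


def build_vault_lock_policy(vault_arn, retention_days):
    """Build a vault lock policy that blocks deleting archives younger than retention_days."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(retention_days)
                    }
                },
            }
        ],
    }


policy = build_vault_lock_policy(
    "arn:aws:glacier:us-west-2:123456789012:vaults/examplevault",  # placeholder
    365,
)
print(json.dumps(policy, indent=2))
```

With boto3, a policy like this would typically be attached with the Glacier client's `initiate_vault_lock` call and then made permanent with `complete_vault_lock`. The gap between the two calls is the chance to test the policy before it becomes immutable.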
        
    