# What is S3 object storage?

## Definition 1
An ordered collection of bytes.

<img src="images/array.png">

Generally, this collection of bytes has some structure that lets you interpret the data in a useful way, for example ASCII, JPEG, NetCDF, or GeoTIFF.
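To make "structure" concrete, here is a small sketch (not part of the original notebook) that looks at the same eight bytes in three ways. The byte values are the ASCII codes for "GES DISC"; the variable names are only illustrative.

```python
import struct

# The same eight bytes, viewed three ways.
data = bytes([71, 69, 83, 32, 68, 73, 83, 67])

as_text = data.decode("ascii")         # each byte as an ASCII character
as_ints = list(data)                   # each byte as an unsigned 8-bit integer
as_words = struct.unpack("<2I", data)  # two little-endian unsigned 32-bit integers

print(as_text)
print(as_ints)
print(as_words)
```

The bytes never change; only the rule used to interpret them does. A file format such as NetCDF or GeoTIFF is, in this sense, a much more elaborate interpretation rule.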

## Definition 2
An API for network storage.

<img src="images/cloud_storage.png" width=500>

# Fake AWS: localstack

This tutorial uses [localstack](https://github.com/localstack/localstack) to emulate AWS locally, focusing on [Amazon S3](https://aws.amazon.com/s3/) object storage.

## Setup: Create the bucket. Do this only once!

You only need to create the S3 bucket once after bringing up localstack. If you try to create the bucket again, the call may fail with an error such as `BucketAlreadyOwnedByYou`.
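If you'd rather be able to re-run the setup cell safely, one option is to check for the bucket first. The sketch below is an assumption, not part of the original notebook: `ensure_bucket` is a hypothetical helper, and it relies only on the `list_buckets` and `create_bucket` methods of the boto3 S3 client.

```python
def ensure_bucket(client, name):
    """Create bucket `name` unless the client already lists it.

    Returns True if the bucket was created, False if it already existed.
    """
    existing = {b["Name"] for b in client.list_buckets().get("Buckets", [])}
    if name in existing:
        return False
    client.create_bucket(Bucket=name)
    return True
```

With the client from the cell below, this would be called as `ensure_bucket(s3_client, bucket_name)`. The check and the create are two separate requests, so this is fine for a single-user tutorial but is not safe against another process creating the bucket in between.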

In [1]:
import boto3

# Point the S3 client at the local emulator instead of real AWS.
s3_client = boto3.client('s3', endpoint_url="http://localstack:4566")
bucket_name = "demo-bucket"
s3_client.create_bucket(Bucket=bucket_name)

{'ResponseMetadata': {'RequestId': '66E257A1DEC6EF22',
  'HostId': 'MzRISOwyjmnup66E257A1DEC6EF227/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/xml; charset=utf-8',
   'content-length': '165',
   'access-control-allow-origin': '*',
   'last-modified': 'Wed, 06 Jan 2021 15:54:22 GMT',
   'x-amz-request-id': '66E257A1DEC6EF22',
   'x-amz-id-2': 'MzRISOwyjmnup66E257A1DEC6EF227/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
   'access-control-allow-methods': 'HEAD,GET,PUT,POST,DELETE,OPTIONS,PATCH',
   'access-control-allow-headers': 'authorization,content-type,content-md5,cache-control,x-amz-content-sha256,x-amz-date,x-amz-security-token,x-amz-user-agent,x-amz-target,x-amz-acl,x-amz-version-id,x-localstack-target,x-amz-tagging',
   'access-control-expose-headers': 'x-amz-version-id',
   'connection': 'close',
   'date': 'Wed, 06 Jan 2021 15:54:22 GMT',
   'server': 'hypercorn-h11'},
  'RetryAttempts': 0}}

# An ASCII-encoded object

## Create an object with the characters "GES DISC" inside

Reminder, the encoding will be:

<img src="images/GES_DISC_array.svg">

We'll use the key **"GES DISC.txt"**.


In [2]:
def str_to_binary(str_):
    """Convert a string to a bytearray, one byte per character, using 'ord'.

    Only works for characters with code points 0-255 (which includes ASCII).
    """
    bin_arr = bytearray()
    bin_arr.extend(map(ord, str_))
    return bin_arr

bin_str = str_to_binary("GES DISC")

object_key = "GES DISC.txt"

s3_client.put_object(
    Body=bin_str, 
    Bucket=bucket_name,
    Key=object_key,
    ContentType="text/plain"
)

{'ResponseMetadata': {'RequestId': '77E0A7AF01FE1C73',
  'HostId': 'MzRISOwyjmnup77E0A7AF01FE1C737/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'text/html; charset=utf-8',
   'content-length': '0',
   'etag': '"f262183818ac528524d199fcdca26ccc"',
   'last-modified': 'Wed, 06 Jan 2021 15:54:22 GMT',
   'access-control-allow-origin': '*',
   'x-amz-request-id': '77E0A7AF01FE1C73',
   'x-amz-id-2': 'MzRISOwyjmnup77E0A7AF01FE1C737/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
   'access-control-allow-methods': 'HEAD,GET,PUT,POST,DELETE,OPTIONS,PATCH',
   'access-control-allow-headers': 'authorization,content-type,content-md5,cache-control,x-amz-content-sha256,x-amz-date,x-amz-security-token,x-amz-user-agent,x-amz-target,x-amz-acl,x-amz-version-id,x-localstack-target,x-amz-tagging',
   'access-control-expose-headers': 'x-amz-version-id',
   'connection': 'close',
   'date': 'Wed, 06 Jan 2021 15:54:22 GMT',
   'server': 'hypercorn-h11'},
  'RetryAttemp
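
As a side note on the helper used above: for pure-ASCII text, Python's built-in `str.encode` produces the same bytes. The sketch below repeats the helper so it can run on its own, and also shows where the `ord`-based approach stops working. The euro sign is just an illustrative non-ASCII character.

```python
def str_to_binary(str_):
    """Same helper as in the cell above: one byte per character via ord()."""
    bin_arr = bytearray()
    bin_arr.extend(map(ord, str_))
    return bin_arr

text = "GES DISC"
# For pure-ASCII text, the built-in encoder gives the same bytes.
same = bytes(str_to_binary(text)) == text.encode("ascii")

# Code points above 255 do not fit in one byte,
# so the ord()-based helper raises ValueError for them.
try:
    str_to_binary("€")
    helper_failed = False
except ValueError:
    helper_failed = True

# An explicit encoding handles them; UTF-8 uses three bytes for the euro sign.
euro_bytes = "€".encode("utf-8")

print(same, helper_failed, len(euro_bytes))
```

For this tutorial the difference doesn't matter, since "GES DISC" is plain ASCII.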

## List the contents of the bucket to see our new object

In [3]:
out = s3_client.list_objects(Bucket=bucket_name)
out["Contents"]

[{'Key': 'GES DISC.txt',
  'LastModified': datetime.datetime(2021, 1, 6, 15, 54, 22, 834000, tzinfo=tzlocal()),
  'ETag': '"f262183818ac528524d199fcdca26ccc"',
  'Size': 8,
  'StorageClass': 'STANDARD',
  'Owner': {'DisplayName': 'webfile',
   'ID': '75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c078efc7c6caea54ba06a'}}]
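
Two things to keep in mind with the listing call above: the response has no `"Contents"` key at all when the bucket is empty, and a single call returns at most 1000 keys. A minimal sketch that handles both, assuming the boto3 `list_objects_v2` paginator (`list_keys` is a hypothetical helper name):

```python
def list_keys(client, bucket):
    """Return every object key in `bucket`, following pagination."""
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is omitted from a page when there are no objects.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys
```

In this notebook it would be called as `list_keys(s3_client, bucket_name)`.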

## Example 1: read the entire object into memory as binary

In [4]:
resp = s3_client.get_object(Bucket=bucket_name,Key=object_key)
resp

{'ResponseMetadata': {'RequestId': 'D93C2A8E5C368260',
  'HostId': 'MzRISOwyjmnupD93C2A8E5C3682607/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'text/plain',
   'content-length': '8',
   'content-md5': '8mIYOBisUoUk0Zn83KJszA==',
   'etag': '"f262183818ac528524d199fcdca26ccc"',
   'last-modified': 'Wed, 06 Jan 2021 15:54:22 GMT',
   'access-control-allow-origin': '*',
   'x-amz-request-id': 'D93C2A8E5C368260',
   'x-amz-id-2': 'MzRISOwyjmnupD93C2A8E5C3682607/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
   'accept-ranges': 'bytes',
   'content-language': 'en-US',
   'cache-control': 'no-cache',
   'content-encoding': 'identity',
   'access-control-allow-methods': 'HEAD,GET,PUT,POST,DELETE,OPTIONS,PATCH',
   'access-control-allow-headers': 'authorization,content-type,content-md5,cache-control,x-amz-content-sha256,x-amz-date,x-amz-security-token,x-amz-user-agent,x-amz-target,x-amz-acl,x-amz-version-id,x-localstack-target,x-amz-tagging',
   'access-

In [5]:
list(resp["Body"])

[b'GES DISC']
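
Iterating over `resp["Body"]` yields the data in chunks, which is why the output above is a list with one `bytes` element. If you want the whole object as a single value, the streaming body also has a `read()` method. A minimal sketch, where `read_text` is a hypothetical helper and ASCII is assumed as the default encoding because that is what this object contains:

```python
def read_text(client, bucket, key, encoding="ascii"):
    """Fetch a whole object and decode it to a string."""
    resp = client.get_object(Bucket=bucket, Key=key)
    body = resp["Body"]
    try:
        raw = body.read()  # read the full stream into one bytes object
    finally:
        body.close()       # release the underlying connection
    return raw.decode(encoding)
```

Note that the body is a stream: once it has been read or iterated, the bytes are consumed, so a second read needs a fresh `get_object` call.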

## Example 2: read only bytes 2-4 (zero-based, inclusive)

In [6]:
resp = s3_client.get_object(Bucket=bucket_name,Key=object_key,Range='bytes=2-4')
list(resp["Body"])

[b'S D']
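
The `Range` value follows HTTP byte-range syntax: offsets start at 0 and both ends are inclusive. That differs from Python slices, where the end is exclusive. A small local sketch of the correspondence (`byte_range` is a hypothetical helper, and no S3 call is made here):

```python
def byte_range(first, last):
    """Build an HTTP Range header value for bytes first..last, both inclusive."""
    if first < 0 or last < first:
        raise ValueError("need 0 <= first <= last")
    return f"bytes={first}-{last}"

data = b"GES DISC"
first, last = 2, 4

header = byte_range(first, last)  # value for the Range parameter
local = data[first:last + 1]      # equivalent Python slice (end is exclusive)

print(header)
print(local)
```

So `bytes=2-4` asks for three bytes, the same ones as `data[2:5]`.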

## Example 3: overwrite the entire object with lower-case text

In [7]:
s3_client.put_object(
    Body=str_to_binary("ges disc"), 
    Bucket=bucket_name,
    Key=object_key,
    ContentType="text/plain"
)

{'ResponseMetadata': {'RequestId': '3A05160A94B14F6B',
  'HostId': 'MzRISOwyjmnup3A05160A94B14F6B7/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'text/html; charset=utf-8',
   'content-length': '0',
   'etag': '"e9848c445d1398547c0161189d849ba0"',
   'last-modified': 'Wed, 06 Jan 2021 15:54:22 GMT',
   'access-control-allow-origin': '*',
   'x-amz-request-id': '3A05160A94B14F6B',
   'x-amz-id-2': 'MzRISOwyjmnup3A05160A94B14F6B7/JypPGXLh0OVFGcJaaO3KW/hRAqKOpIEEp',
   'access-control-allow-methods': 'HEAD,GET,PUT,POST,DELETE,OPTIONS,PATCH',
   'access-control-allow-headers': 'authorization,content-type,content-md5,cache-control,x-amz-content-sha256,x-amz-date,x-amz-security-token,x-amz-user-agent,x-amz-target,x-amz-acl,x-amz-version-id,x-localstack-target,x-amz-tagging',
   'access-control-expose-headers': 'x-amz-version-id',
   'connection': 'close',
   'date': 'Wed, 06 Jan 2021 15:54:22 GMT',
   'server': 'hypercorn-h11'},
  'RetryAttemp
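
Notice that the `etag` in this response differs from the one returned by the first upload. For an object written with a single simple `put_object` call like these, the ETag is typically the hex MD5 digest of the content (multipart or some encrypted uploads behave differently), and the `content-md5` header seen earlier is the same digest in base64. A sketch of that relationship, with hypothetical helper names:

```python
import base64
import hashlib

def md5_etag(body):
    """Hex MD5 digest wrapped in double quotes, the form S3 uses for ETag."""
    return '"' + hashlib.md5(body).hexdigest() + '"'

def etag_to_content_md5(etag):
    """Convert a quoted hex ETag into the base64 form used by Content-MD5."""
    return base64.b64encode(bytes.fromhex(etag.strip('"'))).decode("ascii")

# ETag copied from the upper-case object's responses earlier in this notebook.
etag_upper = '"f262183818ac528524d199fcdca26ccc"'
print(etag_to_content_md5(etag_upper))

# Different content gives a different digest, hence a different ETag.
print(md5_etag(b"GES DISC") != md5_etag(b"ges disc"))
```

Comparing ETags is a cheap way to tell whether an object's content has changed without downloading it.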

## Example 4: update part of the object

<br />
<br />
<font size=15> ??????? </font>
<br />
<br />

This is a trick example: it's not possible. You cannot update part of an S3 object in place. `PutObject` always replaces the whole object: https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html.
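
The usual workaround is read-modify-write: download the object, change the bytes locally, and upload the whole thing again. Here is a minimal sketch under that assumption. `patch_object` is a hypothetical helper, not an S3 API, and it only handles same-length replacements inside the existing object.

```python
def patch_object(client, bucket, key, offset, new_bytes):
    """Emulate a partial update: download, splice locally, upload everything."""
    resp = client.get_object(Bucket=bucket, Key=key)
    data = bytearray(resp["Body"].read())
    if offset < 0 or offset + len(new_bytes) > len(data):
        raise ValueError("patch must fit inside the existing object")
    data[offset:offset + len(new_bytes)] = new_bytes
    client.put_object(
        Body=bytes(data),
        Bucket=bucket,
        Key=key,
        # Carry the content type over; put_object does not keep the old one.
        ContentType=resp.get("ContentType", "binary/octet-stream"),
    )
    return bytes(data)
```

For example, `patch_object(s3_client, bucket_name, object_key, 4, b"DISC")` would turn "ges disc" into "ges DISC". The whole object still crosses the network twice, and the two requests are not atomic, so a concurrent writer could be overwritten.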