# S3 Read API

## What is S3 Read API

AWS S3 provides a [wide range of APIs](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html). Some of them only retrieve information from the server and do not change the state of the S3 bucket (no objects are moved, changed, or deleted). Unlike Write API functions, misusing a Read API function **will NOT** modify or delete your data. It is therefore a good idea to explore the Read API functions before diving into the Write API.

## Configure the AWS Context object

In [4]:
!aws sts get-caller-identity

{
    "UserId": "ABCDEFABCDEFABCDEFABC",
    "Account": "111122223333",
    "Arn": "arn:aws:iam::111122223333:user/johndoe"
}


The ``Context`` object stores a pre-authenticated [boto session](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/session.html), which is created using your default credentials (if available). However, you can also configure a custom boto session yourself and attach it to the context.

In [6]:
import boto3
from s3pathlib import context

context.attach_boto_session(
    boto3.session.Session(
        region_name="us-east-1",
        profile_name="my_aws_profile",
    )
)

When ``s3pathlib`` makes AWS API calls, it uses the boto session stored in the Context object by default. However, you can always explicitly pass a custom boto session to an API call if needed.

In [10]:
from s3pathlib import S3Path
from boto_session_manager import BotoSesManager

bsm = BotoSesManager(
    region_name="us-east-1",
    profile_name="my_aws_profile",
)
s3path = S3Path("s3://my-bucket/test.txt")
_ = s3path.write_text("hello world", bsm=bsm) # explicitly pass the boto session

If you are running the code on a cloud machine such as AWS EC2 or AWS Lambda, follow [this official guide](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) to grant your compute environment proper AWS S3 access.

## Get S3 Object Metadata

In [13]:
s3path = S3Path("s3://s3pathlib/test.txt")
s3path.write_text("hello world" * 1000) # create a test object

S3Path('s3://s3pathlib/test.txt')

In [14]:
s3path.etag

'4d5d1cba9eb18884a5410f4b83bc6951'

In [15]:
s3path.last_modified_at

datetime.datetime(2023, 4, 20, 7, 1, 13, tzinfo=tzutc())
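
The value above is a timezone-aware UTC datetime, so it should only be compared with other timezone-aware datetimes. Mixing it with a naive ``datetime.now()`` raises ``TypeError``. A minimal sketch of an age check follows. The timestamp is copied from the output above as a stand-in for ``s3path.last_modified_at``, and ``is_older_than`` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Stand-in for s3path.last_modified_at, copied from the output above.
last_modified_at = datetime(2023, 4, 20, 7, 1, 13, tzinfo=timezone.utc)


def is_older_than(ts: datetime, days: int, now: Optional[datetime] = None) -> bool:
    """Return True if ``ts`` is more than ``days`` days before ``now``."""
    if now is None:
        now = datetime.now(timezone.utc)  # aware "now", never a naive one
    return now - ts > timedelta(days=days)


print(is_older_than(last_modified_at, days=30))
```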

In [16]:
s3path.size

11000

In [17]:
s3path.size_for_human

'10.74 KB'
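
The human-readable value is consistent with binary units: 11000 / 1024 ≈ 10.74. The sketch below shows one way such a string can be produced. ``format_size`` is a hypothetical helper written for illustration and is not necessarily how ``s3pathlib`` implements ``size_for_human``.

```python
def format_size(n_bytes: int) -> str:
    """Format a byte count using 1024-based units (illustrative only)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(format_size(11000))  # 10.74 KB
```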

In [18]:
s3path.version_id

'null'

In [23]:
print(s3path.expire_at)

None

Note that the ``S3Path`` object caches the metadata after the first API call. If the object changes on the server side, call ``clear_cache()`` to fetch the latest data, as the next three cells show.


In [26]:
# Create a test file
s3path = S3Path("s3://s3pathlib/file-with-metadata.txt")
s3path.write_text("hello world", metadata={"creator": "s3pathlib"})
print(s3path.size)
print(s3path.metadata)

11
{'creator': 's3pathlib'}


In [27]:
# The server-side data has changed
s3path.write_text("hello charlice", metadata={"creator": "charlice"})
# You still see the old (cached) data
print(s3path.size)
print(s3path.metadata)

11
{'creator': 's3pathlib'}


In [28]:
# After you clear the cache, you get the latest data
s3path.clear_cache()
print(s3path.size)
print(s3path.metadata)

14
{'creator': 'charlice'}
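
The three cells above show that metadata is read once and then reused until ``clear_cache()`` is called. The sketch below reproduces that behavior with an in-memory stand-in for S3. ``FakeStore`` and ``CachedPath`` are hypothetical classes written for illustration; they are not ``s3pathlib`` internals.

```python
class FakeStore:
    """In-memory stand-in for an S3 bucket (illustration only)."""

    def __init__(self):
        self.objects = {}
        self.head_calls = 0  # counts simulated metadata requests

    def put(self, key, body, metadata):
        size = len(body.encode("utf-8"))
        self.objects[key] = {"size": size, "metadata": dict(metadata)}

    def head(self, key):
        self.head_calls += 1
        return dict(self.objects[key])


class CachedPath:
    """Fetches metadata lazily and keeps it until clear_cache() is called."""

    def __init__(self, store, key):
        self._store = store
        self._key = key
        self._head = None  # cached metadata response

    def _get_head(self):
        if self._head is None:  # only hit the store on a cache miss
            self._head = self._store.head(self._key)
        return self._head

    @property
    def size(self):
        return self._get_head()["size"]

    @property
    def metadata(self):
        return self._get_head()["metadata"]

    def clear_cache(self):
        self._head = None


store = FakeStore()
store.put("file-with-metadata.txt", "hello world", {"creator": "s3pathlib"})
path = CachedPath(store, "file-with-metadata.txt")
print(path.size, path.metadata)  # 11 {'creator': 's3pathlib'}

store.put("file-with-metadata.txt", "hello charlice", {"creator": "charlice"})
print(path.size, path.metadata)  # 11 {'creator': 's3pathlib'}

path.clear_cache()
print(path.size, path.metadata)  # 14 {'creator': 'charlice'}
```

Caching this way saves one request per attribute access, at the cost of possibly stale values when the object is modified elsewhere.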
