# AWS

- hide: false
- toc: true
- comments: true
- categories: [python, tools]

Basic AWS interaction patterns.

In [3]:
import platform
import botocore.session
import pandas as pd
import s3fs

## Setup

There are multiple ways to access your AWS account. I store config and credential files in `~/.aws` as discussed [here](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html). The AWS SDKs and the libraries built on them (such as `botocore` and `s3fs`) find these files automatically, so I don't need to pass credentials in code.
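For illustration, here is a minimal sketch of what a credentials file with two profiles looks like and how its profile names can be listed. The profile names and key values are placeholders, the file is written to a temporary directory rather than `~/.aws`, and the parsing uses the standard library's INI reader rather than an AWS client.

```python
import configparser
import tempfile
from pathlib import Path

# Placeholder content in the INI format used by ~/.aws/credentials.
# Profile names and key values are made up for illustration.
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = EXAMPLEKEYDEFAULT
aws_secret_access_key = example-secret-default

[mac-profile]
aws_access_key_id = EXAMPLEKEYMAC
aws_secret_access_key = example-secret-mac
"""


def list_profiles(credentials_path):
    """Return the profile names (section headers) in a credentials file."""
    parser = configparser.ConfigParser()
    parser.read(credentials_path)
    return parser.sections()


# Write the sample to a throwaway location and read the profiles back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / 'credentials'
    sample_path.write_text(SAMPLE_CREDENTIALS)
    profiles = list_profiles(sample_path)

print(profiles)
```

Pointing `list_profiles` at `Path.home() / '.aws' / 'credentials'` would list the profiles on your own machine, assuming the file exists.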

## List bucket content

In [4]:
bucket = 'fgu-mdb'

fs = s3fs.S3FileSystem()
fs.ls(bucket)

['fgu-mdb/data_000.csv']
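If a bucket holds more than a handful of files, it can help to filter the listing. The sketch below assumes only that the filesystem object has an `ls` method returning a list of paths, as `s3fs.S3FileSystem` does above. `list_by_suffix` is a hypothetical helper, and `FakeFS` with its bucket and file names is a stand-in so the example runs without AWS access.

```python
def list_by_suffix(fs, bucket, suffix='.csv'):
    """Return the paths in `bucket` that end with `suffix`, sorted."""
    return sorted(p for p in fs.ls(bucket) if p.endswith(suffix))


class FakeFS:
    """Stand-in for s3fs.S3FileSystem so the sketch runs offline."""

    def ls(self, bucket):
        # Made-up listing; a real filesystem would query S3 here.
        return [
            f'{bucket}/data_001.csv',
            f'{bucket}/notes.txt',
            f'{bucket}/data_000.csv',
        ]


csv_paths = list_by_suffix(FakeFS(), 'example-bucket')
print(csv_paths)
```

With a real filesystem, the call would be `list_by_suffix(s3fs.S3FileSystem(), bucket)`.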

## Read from S3

In [5]:
fp = f's3://{bucket}/data_000.csv'
df = pd.read_csv(fp, sep='|')
df.shape

(1000, 27)

## Read with custom profile

AWS credential and config files allow you to store multiple profiles. Below I access data using a different profile depending on what machine I run the code on.

When Pandas reads files from S3 as above, it goes through `s3fs`, which relies on `botocore` under the hood and uses the default profile. To change the profile, we need to set up a `botocore` session ourselves and pass it to `s3fs`.

In [None]:
profile = 'linux-profile'
if platform.system() == 'Darwin':
    profile = 'mac-profile'

session = botocore.session.Session(profile=profile)
fs = s3fs.S3FileSystem(session=session)

path = 'path-to-file'
# Open the file through the profile-specific filesystem, then hand it to Pandas
with fs.open(path) as f:
    df = pd.read_csv(f)
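
An alternative sketch that avoids building a session by hand: `botocore` also honours the `AWS_PROFILE` environment variable, so setting it before any filesystem object is created should select the profile for everything that follows. `pick_profile` is a hypothetical helper, the profile names are the ones used above, and this swaps in the environment-variable approach rather than the session shown in the cell.

```python
import os
import platform


def pick_profile(system=None):
    """Map an OS name (as returned by platform.system()) to a profile name."""
    system = system or platform.system()
    return 'mac-profile' if system == 'Darwin' else 'linux-profile'


# Set the variable before creating any S3 client or filesystem object.
os.environ['AWS_PROFILE'] = pick_profile()
print(os.environ['AWS_PROFILE'])
```

After this, a plain `s3fs.S3FileSystem()` or a `pd.read_csv` call on an S3 path should pick up the chosen profile, provided no S3 connection was created earlier in the process.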
