# AWS

- hide: false
- toc: true
- comments: true
- categories: [python, tools]

Basic AWS interaction patterns.

In [1]:
import pandas as pd
import s3fs

## Setup

There are multiple ways to access your AWS account. I store config and credential files in `~/.aws` as discussed [here](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html). The AWS CLI and SDK-based libraries look for these files automatically, so no credentials need to appear in the code.
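To check which profiles are available, the credentials file can be inspected directly, since it is an INI file with one section per profile. The sketch below is a minimal illustration, not part of any AWS library: `list_profiles` is a hypothetical helper name, and it assumes the default location `~/.aws/credentials`.

```python
import configparser
from pathlib import Path


def list_profiles(credentials_path=Path.home() / ".aws" / "credentials"):
    """Return the profile names defined in an AWS credentials file."""
    parser = configparser.ConfigParser()
    # read() silently skips missing files, so an absent file yields an empty list
    parser.read(credentials_path)
    # Each section header, such as [default], is one profile
    return parser.sections()


if __name__ == "__main__":
    print(list_profiles())
```

Note that the separate `~/.aws/config` file names non-default sections `[profile name]`, so this helper is meant for the credentials file only.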

## List content of bucket

In [3]:
bucket = 'fgu-mdb'

fs = s3fs.S3FileSystem()
fs.ls(bucket)

['fgu-mdb/data_000.csv']

## Read file from S3

Pandas can read data directly from S3 when `s3fs` is installed.

In [5]:
fp = f's3://{bucket}/data_000.csv'
df = pd.read_csv(fp, sep='|')
df.shape

(1000, 27)

## Read data with selected profile

AWS credential and config files allow you to store multiple profiles. Below I access data using a different profile depending on what machine I run the code on.

When Pandas reads files from S3 as above, it goes through `s3fs`, which builds on `botocore` and picks up the default profile. To use a different profile, we need to create a session for that profile ourselves and pass it to `s3fs`.

In [9]:
import platform
import botocore.session

In [None]:
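# Use the named profile by default and the default profile on Linux machines
profile = 'profile-name'
if platform.system() == 'Linux':
    profile = 'default'

# Create a session for the chosen profile and hand it to s3fs
session = botocore.session.Session(profile=profile)
fs = s3fs.S3FileSystem(session=session)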

path = 'path-to-file'
df = pd.read_csv(fs.open(path))
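
A possible variation on the cell above, sketched under assumptions rather than taken from the original notebook: `pick_profile` and `read_csv_with_profile` are hypothetical helper names. Passing `storage_options={'profile': ...}` assumes a pandas version that supports `storage_options` and an s3fs version that forwards `profile` to the session it creates. Recent s3fs releases are built on `aiobotocore` and may not accept a plain `botocore` session, which is why this route can be worth trying.

```python
import platform


def pick_profile(system=None, named_profile="profile-name"):
    """Return 'default' on Linux and the named profile elsewhere."""
    system = platform.system() if system is None else system
    return "default" if system == "Linux" else named_profile


def read_csv_with_profile(path, profile, **read_csv_kwargs):
    """Read a CSV from S3 with a given AWS profile (needs pandas and s3fs)."""
    import pandas as pd  # imported lazily: pick_profile has no dependencies

    # Assumption: s3fs passes `profile` on to the session it creates
    return pd.read_csv(path, storage_options={"profile": profile}, **read_csv_kwargs)
```

With these helpers, the earlier read might look like `df = read_csv_with_profile('s3://fgu-mdb/data_000.csv', pick_profile(), sep='|')`.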