# AWS S3 Buckets Example

## Assumptions
 - Working from a Jupyter notebook locally
 - Not concerned about access or keeping data private

For the purpose of this lesson, the `wine_classifier.pickle` file has already been uploaded to Greg's AWS account. To make your own pickle file available from code, you would need to create an AWS account and upload the file to an S3 bucket.

## AWS CLI
Installation instructions are [here](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) and the CLI docs are [here](https://docs.aws.amazon.com/cli/latest/reference/s3/). You will need the CLI to upload files that are too large for the S3 web console (160 GB per object, according to the AWS documentation), but it's clunkier than integrating directly into Python and won't work with all deployment techniques.

In [None]:
# !aws s3 cp s3://flatiron-ds-2020-07-28/wine_classifier.pickle wine_classifier.pickle
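The cell above copies from S3 to the local machine; uploading uses the same `aws s3 cp` command with the arguments swapped. Outside a notebook, where the `!` shell shortcut is not available, one option is to call the CLI through `subprocess`. This is a minimal sketch: the helper names `build_s3_upload_command` and `upload_with_cli` are hypothetical, and it assumes the AWS CLI is installed and configured with credentials that can write to the bucket.

```python
import subprocess


def build_s3_upload_command(local_path, bucket, key):
    """Return the argument list for `aws s3 cp <local file> <S3 URI>`."""
    return ["aws", "s3", "cp", local_path, f"s3://{bucket}/{key}"]


def upload_with_cli(local_path, bucket, key):
    """Run the upload; raises CalledProcessError if the CLI exits non-zero."""
    command = build_s3_upload_command(local_path, bucket, key)
    subprocess.run(command, check=True)
```

For example, `upload_with_cli("wine_classifier.pickle", "your-bucket-name", "wine_classifier.pickle")` would upload the local pickle file, where `your-bucket-name` is a placeholder for a bucket you own.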

## Python Code

In [None]:
import pickle
BUCKET = "greg-flatiron-example"
PICKLE_FILE = "wine_classifier.pickle"
CSV_FILE = "wine.csv"

## Boto3
Boto3 is the AWS SDK for Python, used here to connect to S3 buckets. Docs [here](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html). It is a good tool for large pickle files.

In [None]:
#!conda install boto3 -y

In [None]:
import boto3
s3 = boto3.resource("s3")

obj = s3.Object(BUCKET, PICKLE_FILE)

In [None]:
# Returns a dict of response metadata plus a streaming "Body"
obj.get()

In [None]:
# Read the whole object into memory as bytes, then peek at the first 100
response_body = obj.get()["Body"].read()
response_body[:100]

In [None]:
loaded_model = pickle.loads(response_body)
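
Reading the `Body` returns raw `bytes`, which is why the cell above calls `pickle.loads` (the bytes variant) rather than `pickle.load` (which expects a file object). The following is a minimal offline sketch of that difference, using a plain dictionary as a stand-in for the model; the variable names are illustrative only.

```python
import io
import pickle

# Stand-in for the trained model; any picklable object behaves the same way.
stand_in_model = {"alpha": 0.5, "max_iter": 100}

# Serialize to bytes, similar to how the file stored in S3 was produced.
payload = pickle.dumps(stand_in_model, protocol=4)

# Protocol 2 and later start with the PROTO opcode (0x80) and the protocol number.
header = payload[:2]

# pickle.loads takes the bytes directly...
restored_from_bytes = pickle.loads(payload)

# ...while pickle.load needs a file-like object, such as BytesIO.
restored_from_file = pickle.load(io.BytesIO(payload))
```

Unpickling can execute arbitrary code, so only load pickle files from buckets you trust.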

In [None]:
loaded_model.get_params()
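
The Boto3 steps above can be collected into one reusable function. This is a sketch under stated assumptions, not part of the original lesson: `load_pickle_from_s3` and its `s3` and `anonymous` parameters are hypothetical names, and the unsigned-request branch assumes the bucket permits public reads.

```python
import pickle


def load_pickle_from_s3(bucket, key, s3=None, anonymous=False):
    """Download an S3 object and unpickle it.

    Only unpickle data you trust: pickle can run arbitrary code on load.
    """
    if s3 is None:
        # Imported lazily so the helper can be exercised with a stand-in
        # resource even where boto3 is not installed.
        import boto3

        if anonymous:
            # Assumption: the bucket allows public reads, so requests
            # can be sent unsigned (no AWS credentials needed).
            from botocore import UNSIGNED
            from botocore.client import Config

            s3 = boto3.resource("s3", config=Config(signature_version=UNSIGNED))
        else:
            s3 = boto3.resource("s3")

    # Fetch the object and read the full body into memory as bytes.
    body = s3.Object(bucket, key).get()["Body"].read()
    return pickle.loads(body)
```

With the constants defined earlier, `loaded_model = load_pickle_from_s3(BUCKET, PICKLE_FILE)` would reproduce the cells above in a single call.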

## Pandas

If you are loading a CSV rather than a pickle, you can read it directly with pandas. First you'll need to install `s3fs` (docs [here](https://s3fs.readthedocs.io/en/latest/)), which lets Python access S3 buckets as if they were part of the local file system.

In [None]:
#!conda install s3fs -c conda-forge -y

In [None]:
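import pandas as pd
# S3 URIs always use forward slashes, so build the path explicitly
# rather than with os.path.join (which uses backslashes on Windows)
file_path = f"s3://{BUCKET}/{CSV_FILE}"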

In [None]:
df = pd.read_csv(file_path)
df.head()
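
S3 URIs always use forward slashes regardless of the local operating system, so when keys include folder-like prefixes it can help to build the URI with a small helper. This is a minimal sketch; `s3_uri` is a hypothetical name, not a pandas or s3fs function.

```python
def s3_uri(bucket, *key_parts):
    """Join a bucket name and key segments into an s3:// URI."""
    # Strip stray slashes so segments join with exactly one "/" between them.
    key = "/".join(part.strip("/") for part in key_parts)
    return f"s3://{bucket}/{key}"


csv_uri = s3_uri("greg-flatiron-example", "wine.csv")
nested_uri = s3_uri("greg-flatiron-example", "data/", "wine.csv")
```

The result can be passed straight to `pd.read_csv`. If the bucket is public and no AWS credentials are configured, pandas 1.2 and later also accepts `storage_options={"anon": True}`, which it forwards to s3fs.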