
<span>This notebook connects to a public bucket on Amazon Web Services' S3 cloud storage service and uploads data from your notebook or script. Storing the data in the cloud lets anyone on our team pull it and use it in their own analysis. Note that you will need to create an Amazon S3 bucket and bucket policy beforehand. The second half of the notebook shows the reverse: code to pull your data back down.</span>


### Import Data

In [None]:
# Import modules
import pandas as pd
import boto3
from io import StringIO

# Import some pokemon data
df = pd.read_csv('Data/pokemon.csv')

# View the head of the data
df.head()

### Writing Data to AWS S3

In [None]:
# Create a csv string buffer
csv_buffer = StringIO()

# Save the dataframe to the csv buffer
df.to_csv(csv_buffer)

# Initialize the S3 resource in boto3
s3 = boto3.resource('s3')

# Connect to a public bucket and save our data
# 'bucket_name' is a placeholder for this example
s3.Bucket('bucket_name').put_object(
    Key='demo/pokemon.csv', Body=csv_buffer.getvalue())
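
The upload steps above could be wrapped in a helper that mirrors the read function later in this notebook. The following is a minimal sketch, not the notebook's own method: the name `write_s3_csv` and the optional `client` argument are assumptions made for illustration. It also passes `index=False` to `to_csv`, since writing the default row index means the file comes back with an extra `Unnamed: 0` column when it is read again.

```python
from io import StringIO


def write_s3_csv(df, bucket_name, file_path, client=None):
    """Serialize a dataframe to CSV in memory and upload it to S3.

    Hypothetical helper, not part of the original notebook.
    """
    if client is None:
        # Imported here so a stand-in client can be passed for testing
        import boto3
        client = boto3.client('s3')

    # Write the dataframe into an in-memory text buffer
    csv_buffer = StringIO()
    # index=False keeps the row index out of the uploaded file
    df.to_csv(csv_buffer, index=False)

    # Upload the CSV text as UTF-8 bytes under the given key
    return client.put_object(
        Bucket=bucket_name,
        Key=file_path,
        Body=csv_buffer.getvalue().encode('utf-8'))
```

With real credentials configured, a call might look like `write_s3_csv(df, 'bucket_name', 'demo/pokemon.csv')`, where `'bucket_name'` is still a placeholder.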

### Reading a File from AWS S3

In [None]:
# Import modules
import boto3
import pandas as pd
from io import BytesIO

# Set up a client, resource and bucket connection
client = boto3.client('s3')
resource = boto3.resource('s3')
bucket = resource.Bucket('demobucket')

# pull down the aws csv file from your bucket
csv_file = client.get_object(
    Bucket='demobucket', Key='pokemon.csv')

# Read the csv bytes into a dataframe
df = pd.read_csv(BytesIO(csv_file['Body'].read()))

# View the head of the dataframe
df.head(5)
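
The `Key` passed to `get_object` has to match the key the file was uploaded under exactly, including any prefix such as `demo/`. If you are unsure which keys exist, something like the sketch below could list them first. The helper name `list_csv_keys` is an assumption for illustration; it relies on the S3 client's `list_objects_v2` paginator.

```python
def list_csv_keys(client, bucket_name, prefix=''):
    """Return the keys ending in '.csv' under a prefix in a bucket.

    Hypothetical helper, not part of the original notebook.
    """
    keys = []
    # The paginator follows continuation tokens for large listings
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # 'Contents' is absent from a page when nothing matches
        for obj in page.get('Contents', []):
            if obj['Key'].endswith('.csv'):
                keys.append(obj['Key'])
    return keys

# Example call (needs AWS credentials and a real bucket):
# list_csv_keys(boto3.client('s3'), 'demobucket', prefix='demo/')
```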

### Read From S3 Function

In [None]:
# Implementing the above code as a function
def read_s3_csv(bucket_name, file_path):
    '''
    Connect to S3 using the bucket name and file key you have
    provided and return the file contents as a dataframe
    
    Parameters
    ----------
    bucket_name (str): the name of the S3 bucket you're connecting to
    file_path (str): key (path) of the file within the S3 bucket
    
    Example
    ----------
    >>> read_s3_csv('demobucket','pokemon.csv')
    >>> read_s3_csv('demobucket','digimon.csv')
    
    '''
    client = boto3.client('s3')
    resource = boto3.resource('s3')
    bucket = resource.Bucket(bucket_name)

    csv_file = client.get_object(
        Bucket=bucket_name, Key=file_path)

    df = pd.read_csv(BytesIO(csv_file['Body'].read()))
    return df

### Calling the Function

In [None]:
# Run the read_s3_csv function
df = read_s3_csv('demobucket','pokemon.csv')

# view the head of the dataframe
df.head()
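
If the requested key does not exist in the bucket, `get_object` raises an error rather than returning an empty result. One possible way to handle that is sketched below. The helper name `get_csv_bytes` and the choice to return `None` are assumptions for illustration; the `NoSuchKey` exception is looked up on the client's `exceptions` attribute.

```python
def get_csv_bytes(client, bucket_name, file_path):
    """Return the raw bytes of an S3 object, or None if the key is absent.

    Hypothetical helper, not part of the original notebook.
    """
    try:
        response = client.get_object(Bucket=bucket_name, Key=file_path)
    except client.exceptions.NoSuchKey:
        # The bucket exists but holds no object under this key
        return None
    # 'Body' is a stream; read() pulls the whole file into memory
    return response['Body'].read()
```

The returned bytes can be wrapped in `BytesIO` and passed to `pd.read_csv`, as in the cells above.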