# S3 file manipulations using boto3

This notebook shows how to move files within an S3 bucket using the boto3 Python library. S3 has no native move operation, so a move is done by copying the file to its new key and then deleting the original.

The example I will use consists of an S3 bucket that has three folders: <b>input</b>, <b>archived</b> and <b>errors</b>. 

<img src='files/s3_all_three_folders.png' width=400> <br>
Initially, the <b>archived</b> and <b>errors</b> folders are empty, whereas the <b>input</b> folder has a single file in it: *IF009_1.txt* <br>

<img src='files/s3_input_folder.png' width=400>


In [1]:
import yaml
import boto3

In [2]:
# get authorization keys for S3
with open('files/aws_config.yml', 'r') as f:
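    # safe_load avoids executing arbitrary YAML tags; plain yaml.load(f) needs an explicit Loader in recent PyYAML
    doc = yaml.safe_load(f)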
    accessKey = doc['accessKey']
    secretKey = doc['secretKey']
    bucket_name = doc['bucket_name']

In [3]:
# instantiate S3 bucket 
s3 = boto3.resource('s3',
                    aws_access_key_id=accessKey,
                    aws_secret_access_key=secretKey)
bucket = s3.Bucket(bucket_name)


In [4]:
# the paths to the input file and the destination where I want to move the input file to 
input_key = 'IF009_data/input/IF009_1.txt'
output_key = 'IF009_data/archived/IF009_1.txt'
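The two keys above differ only in their parent folder, so the destination key can also be derived from the input key rather than typed by hand. The following is a minimal sketch; `destination_key` is a hypothetical helper that is not part of boto3, and it assumes keys follow the `<root>/<folder>/<file>` layout used in this example.

```python
import posixpath


def destination_key(input_key, folder):
    """Swap the parent folder of `input_key` for `folder`, keeping the file name.

    Hypothetical helper: assumes keys look like '<root>/<folder>/<file>'.
    S3 keys always use '/', so posixpath is used instead of os.path.
    """
    parent, filename = posixpath.split(input_key)
    root = posixpath.dirname(parent)
    return posixpath.join(root, folder, filename)


archived_key = destination_key('IF009_data/input/IF009_1.txt', 'archived')
errors_key = destination_key('IF009_data/input/IF009_1.txt', 'errors')
```

The same helper gives the key for the <b>errors</b> folder, which is handy if a file that fails processing should be moved there instead.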


In [5]:
# instantiate output object
output_obj = bucket.Object(output_key)

# copy file from input_key to output_key
return_message = output_obj.copy_from(CopySource={'Bucket':bucket_name, 'Key':input_key})
HTTP_Status = return_message["ResponseMetadata"]["HTTPStatusCode"]
print(HTTP_Status)

200


This has successfully copied the file from the <b>input</b> folder to the <b>archived</b> folder. Now we need to remove the file from the <b>input</b> folder.

In [6]:
# if the copy object call was successful, delete the original 
if HTTP_Status == 200:
    input_obj = bucket.Object(input_key)
    delete_message = input_obj.delete()
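The copy-then-delete steps can be wrapped in one function. This is a sketch rather than a definitive implementation: `move_object` is a hypothetical name, and the `bucket` argument is assumed to behave like a boto3 `Bucket` resource (it needs a `name` attribute and an `Object(key)` method). boto3 normally raises a `botocore.exceptions.ClientError` when a call fails, so the status check is an extra safeguard that mirrors the cells above.

```python
def move_object(bucket, src_key, dst_key):
    """Copy `src_key` to `dst_key` inside `bucket`, then delete the original.

    Hypothetical helper. Returns True if the object was moved and False if
    the copy did not report HTTP 200, in which case nothing is deleted.
    """
    response = bucket.Object(dst_key).copy_from(
        CopySource={'Bucket': bucket.name, 'Key': src_key})
    status = response['ResponseMetadata']['HTTPStatusCode']
    if status != 200:
        return False
    # only remove the source once the copy has been confirmed
    bucket.Object(src_key).delete()
    return True
```

With the variables defined earlier in this notebook, the call would be `move_object(bucket, input_key, output_key)`.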


S3 has no real folders: the console displays a zero-byte object whose key ends with "/" as a folder. To add a new folder, its key must therefore end with "/"; otherwise the object is created as an ordinary file.

In [7]:
new_key = 'IF009_data/new_folder/'
# the Bucket resource already knows its own name, so only the key and an empty body are needed
new_folder = bucket.put_object(Key=new_key, Body=b'')


The new directory structure of the bucket looks as follows:<br>
<img src='files/s3_new_folder.png' width=400>
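
Because a missing trailing "/" silently creates a file instead of a folder, it can help to normalize the key before uploading. A minimal sketch, where `as_folder_key` is a hypothetical helper and not a boto3 function:

```python
def as_folder_key(prefix):
    """Return `prefix` with exactly one trailing '/' so it is shown as a folder."""
    return prefix.rstrip('/') + '/'


folder_key = as_folder_key('IF009_data/new_folder')
```

The result can then be passed as the `Key` argument of `bucket.put_object` together with an empty `Body`.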
