# Upload, Read, Download and Delete S3 Bucket objects/files

## To configure IAM

- __*Get Access Key ID and Secret Access Key*__
     - After logging in to your AWS account, type __*IAM*__ into the search bar at the top of the AWS Console.
     - From the results shown, select the __*IAM*__ service.
     - In the left panel, select __*Users*__.
     - In the top-right, click the __*Add users*__ button.
     - In the __*User name*__ box, enter a name for the user.
     - Under __*Select AWS credential type*__, tick one or both checkboxes depending on your requirements. I am selecting the 1st checkbox.
     - In the bottom-right, click the __*Next: Permissions*__ button.
     - Under __*Set permissions*__, click the __*Create Group*__ button.
     - In the __*Group name*__ box, enter a group name, then tick one of the policy checkboxes to control what the group can access. I am selecting __*AdministratorAccess*__.
     - In the bottom-right, click the __*Create Group*__ button.
     - In the bottom-right, click __*Next: Tags*__, then __*Next: Review*__, then __*Create user*__.
     - Finally, you will see the __*Access Key ID*__ and __*Secret Access Key*__, which you can copy.
     - You can also download the keys as a CSV file. Keep them confidential and do not share them with anyone.
     - We will need these keys to create buckets and add objects/files to them.
     
- __*Configure IAM user*__
     - Once you have the __*Access Key ID*__ and __*Secret Access Key*__, open CMD or the Anaconda Prompt and run __*aws configure*__.
     - It will prompt for the __*Access Key ID*__, __*Secret Access Key*__, __*Region Name*__ and __*Output Format*__; provide each one.
     
##### Refer to the video: https://www.youtube.com/watch?v=qGS9UiCFVbo&ab_channel=AWSMadeEasy
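For context, `aws configure` saves the answers in INI-style files under `~/.aws/`: the keys go to `credentials`, the region and output format to `config`. The sketch below writes a sample `credentials` file to a temporary directory and reads it back with the standard library. `read_profile` is a hypothetical helper name, and the key values are placeholders, not real credentials.

```python
import configparser
import tempfile
from pathlib import Path

# Placeholder content in the INI layout of ~/.aws/credentials (fake values).
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = EXAMPLE_KEY_ID
aws_secret_access_key = EXAMPLE_SECRET_KEY
"""

def read_profile(credentials_path, profile="default"):
    """Return (access_key_id, secret_access_key) for one profile."""
    parser = configparser.ConfigParser()
    parser.read(credentials_path)
    section = parser[profile]
    return section["aws_access_key_id"], section["aws_secret_access_key"]

# Write the sample to a throwaway directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "credentials"
    sample_path.write_text(SAMPLE_CREDENTIALS)
    key_id, secret = read_profile(sample_path)

print(key_id)
```

To inspect the real file, the same helper could be pointed at `Path.home() / ".aws" / "credentials"`.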

In [5]:
import pandas as pd
import boto3
import os

### Create 2 dummy dataframes to upload to the S3 bucket

In [3]:
df_dummy_1 = pd.DataFrame({'name':['sandeep', 'sandy', 'maddy'],
                            'age':['25','22','26']})

df_dummy_1

|   | name | age |
|---|------|-----|
| 0 | sandeep | 25 |
| 1 | sandy | 22 |
| 2 | maddy | 26 |


In [4]:
df_dummy_2 = pd.DataFrame({'name':['sam', 'sammy', 'maddy'],
                            'age':['21','23','24']})

df_dummy_2

|   | name | age |
|---|------|-----|
| 0 | sam | 21 |
| 1 | sammy | 23 |
| 2 | maddy | 24 |


### Save both dataframes as CSV files in our local path

In [21]:
df_dummy_1.to_csv('dummy_1.csv', index=False)
df_dummy_2.to_csv('dummy_2.csv', index=False)

print('Displaying only CSV files...')
print([i for i in os.listdir() if i.split('.')[-1]=='csv'])

Displaying only CSV files...
['dummy_1.csv', 'dummy_2.csv', 'new_user_credentials.csv']


### Get the Access Key ID and Secret Access Key from the CSV downloaded while configuring the IAM user
- To configure IAM, refer to the first markdown cell.

In [22]:
df_keys = pd.read_csv('new_user_credentials.csv')

# getting access key id and secret access key
access_key_id = df_keys['Access key ID'][0]
secret_access_key = df_keys['Secret access key'][0]
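
If pandas is not needed for this step, the same two values can be read with the standard library. This is a minimal sketch, assuming the column names used above (`Access key ID`, `Secret access key`). `load_keys` is a hypothetical helper, and the demo file holds placeholder values rather than real credentials.

```python
import csv
import tempfile
from pathlib import Path

REQUIRED_COLUMNS = {"Access key ID", "Secret access key"}

def load_keys(csv_path):
    """Return (access_key_id, secret_access_key) from the first data row."""
    with open(csv_path, newline="") as handle:
        row = next(csv.DictReader(handle), None)
    if row is None:
        raise ValueError(f"{csv_path} has no data rows")
    missing = REQUIRED_COLUMNS - set(row)
    if missing:
        raise KeyError(f"missing columns: {sorted(missing)}")
    return row["Access key ID"], row["Secret access key"]

# Demonstration with a throwaway file containing placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = Path(tmp) / "demo_credentials.csv"
    demo_csv.write_text(
        "Access key ID,Secret access key\nEXAMPLE_ID,EXAMPLE_SECRET\n")
    demo_id, demo_secret = load_keys(demo_csv)

print(demo_id)
```

Failing early on a missing column gives a clearer error than a failed request later on.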

# Creating a boto3 resource instance

In [25]:
# connect to the s3 service with credentials and info
s3_boto = boto3.resource(service_name='s3',
                         region_name='us-east-1',
                         aws_access_key_id=access_key_id,
                         aws_secret_access_key=secret_access_key)

### Check the buckets we have

In [28]:
for b in s3_boto.buckets.all():
    print(b)

s3.Bucket(name='first-bucket-555')


##### We have only one bucket, named first-bucket-555. We will upload files to it.

# Uploading files to the S3 bucket first-bucket-555

In [29]:
# select the bucket
bucket_select = s3_boto.Bucket('first-bucket-555')

In [33]:
# check if we have any objects/files in it

for f in bucket_select.objects.all():
    print(f)

##### Nothing was printed, meaning there are no files in the bucket.
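
The emptiness check above can be wrapped in a small helper. This is a minimal sketch: `bucket_is_empty` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins for a real boto3 bucket so that the example runs without AWS access.

```python
from types import SimpleNamespace

def bucket_is_empty(bucket):
    """Return True when the bucket yields no objects.

    `bucket` is expected to behave like a boto3 Bucket resource;
    `objects.limit(1)` requests at most one object from the listing.
    """
    for _ in bucket.objects.limit(1):
        return False
    return True

# Stand-ins (not boto3 objects) that mimic `bucket.objects.limit(count)`.
empty_bucket = SimpleNamespace(
    objects=SimpleNamespace(limit=lambda count: iter([])))
filled_bucket = SimpleNamespace(
    objects=SimpleNamespace(
        limit=lambda count: iter(["s3_dummy_1.csv"][:count])))

print(bucket_is_empty(empty_bucket))   # True
print(bucket_is_empty(filled_bucket))  # False
```

With the real resource this would be called as `bucket_is_empty(bucket_select)`.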

In [36]:
# bucket_select is the instance for selected bucket created above

# uploading dummy_1 file
bucket_select.upload_file(Filename='dummy_1.csv', Key='s3_dummy_1.csv')

# uploading dummy_2 file
bucket_select.upload_file(Filename='dummy_2.csv', Key='s3_dummy_2.csv')

In [37]:
# Now check if we have any objects/files in it

for f in bucket_select.objects.all():
    print(f)

s3.ObjectSummary(bucket_name='first-bucket-555', key='s3_dummy_1.csv')
s3.ObjectSummary(bucket_name='first-bucket-555', key='s3_dummy_2.csv')


##### 2 objects/files were created inside first-bucket-555.
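
When there are more than a couple of files, the two `upload_file` calls above could be driven from a mapping of local file name to S3 key. This is a sketch under assumptions: `upload_files` is a hypothetical helper, and `RecordingBucket` is a stand-in that only records calls, so nothing is sent to AWS here.

```python
class RecordingBucket:
    """Stand-in for a boto3 Bucket that only records upload calls."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Key):
        self.calls.append((Filename, Key))

def upload_files(bucket, file_to_key):
    """Upload each local file under its S3 key; return the keys in order."""
    uploaded = []
    for filename, key in file_to_key.items():
        bucket.upload_file(Filename=filename, Key=key)
        uploaded.append(key)
    return uploaded

demo_bucket = RecordingBucket()
uploaded_keys = upload_files(demo_bucket, {
    "dummy_1.csv": "s3_dummy_1.csv",
    "dummy_2.csv": "s3_dummy_2.csv",
})
print(uploaded_keys)  # ['s3_dummy_1.csv', 's3_dummy_2.csv']
```

Passing `bucket_select` instead of `demo_bucket` would perform the real uploads.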

# Delete one of the files from the bucket

In [40]:
# select the bucket
bucket_select = s3_boto.Bucket('first-bucket-555')

bucket_select.Object('s3_dummy_2.csv').delete()

{'ResponseMetadata': {'RequestId': 'A9JMQE6PDB47JR09',
  'HostId': 'egfFX+krb8SQe0r2o9qZxA99TM53/AWBDRB8T2di8ttp+7FhsqWHwMCXK1BOnhoiXiqiqTlL/cE=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'egfFX+krb8SQe0r2o9qZxA99TM53/AWBDRB8T2di8ttp+7FhsqWHwMCXK1BOnhoiXiqiqTlL/cE=',
   'x-amz-request-id': 'A9JMQE6PDB47JR09',
   'date': 'Tue, 14 Dec 2021 12:44:35 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

In [41]:
# Now check if we have any objects/files in it

for f in bucket_select.objects.all():
    print(f)

s3.ObjectSummary(bucket_name='first-bucket-555', key='s3_dummy_1.csv')


##### The s3_dummy_2.csv object was deleted from the bucket.
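
The delete response shown above carries `'HTTPStatusCode': 204`, which can be checked in code instead of by eye. A minimal sketch, assuming the response has the same shape as the output above; `delete_object` is a hypothetical helper and `fake_bucket` is a stand-in for `bucket_select`, so no request is made.

```python
from types import SimpleNamespace

def delete_object(bucket, key):
    """Delete one object and report whether the response status was 204."""
    response = bucket.Object(key).delete()
    return response["ResponseMetadata"]["HTTPStatusCode"] == 204

# Stand-in that returns a trimmed copy of the response shape seen above.
fake_response = {"ResponseMetadata": {"HTTPStatusCode": 204,
                                      "RetryAttempts": 0}}
fake_bucket = SimpleNamespace(
    Object=lambda key: SimpleNamespace(delete=lambda: fake_response))

deleted = delete_object(fake_bucket, "s3_dummy_2.csv")
print(deleted)  # True
```

The status code only confirms that the request succeeded; listing the bucket afterwards, as done above, remains the direct way to confirm the object is gone.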

# Get a file from the bucket

In [48]:
# select the bucket
bucket_select = s3_boto.Bucket('first-bucket-555')

# get s3_dummy_1.csv from the bucket (returns a response dict; nothing is saved locally)
obj = bucket_select.Object('s3_dummy_1.csv').get()

In [49]:
obj

{'ResponseMetadata': {'RequestId': 'KJYQ653RG4X7D9D9',
  'HostId': 'Js3kL0gEz7C8A3LajvI+xx85uF1LYLvJxv3NvutTKB8yBuuCixXLT+NwZ84gZ5i6EicuYWh23l4=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'Js3kL0gEz7C8A3LajvI+xx85uF1LYLvJxv3NvutTKB8yBuuCixXLT+NwZ84gZ5i6EicuYWh23l4=',
   'x-amz-request-id': 'KJYQ653RG4X7D9D9',
   'date': 'Tue, 14 Dec 2021 12:47:31 GMT',
   'last-modified': 'Tue, 14 Dec 2021 12:38:58 GMT',
   'etag': '"11bcd569cbd8205402394ecaf557ee4b"',
   'accept-ranges': 'bytes',
   'content-type': 'binary/octet-stream',
   'server': 'AmazonS3',
   'content-length': '42'},
  'RetryAttempts': 0},
 'AcceptRanges': 'bytes',
 'LastModified': datetime.datetime(2021, 12, 14, 12, 38, 58, tzinfo=tzutc()),
 'ContentLength': 42,
 'ETag': '"11bcd569cbd8205402394ecaf557ee4b"',
 'ContentType': 'binary/octet-stream',
 'Metadata': {},
 'Body': <botocore.response.StreamingBody at 0x28c547bee80>}

##### Our data is under the 'Body' key of the response

In [50]:
df_get = pd.read_csv(obj['Body'])

df_get

|   | name | age |
|---|------|-----|
| 0 | sandeep | 25 |
| 1 | sandy | 22 |
| 2 | maddy | 26 |
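
The `Body` value is a stream, so its bytes are used up as they are read; passing the same `obj['Body']` to `pd.read_csv` a second time would find nothing left. The sketch below reads the stream once and keeps the parsed rows. It uses the standard `csv` module instead of pandas, `rows_from_body` is a hypothetical helper, and an `io.BytesIO` object stands in for the `StreamingBody`.

```python
import csv
import io

def rows_from_body(body):
    """Read a byte stream to the end and parse it as CSV rows (dicts)."""
    text = body.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text, newline="")))

# Stand-in for obj['Body'] with the same rows as dummy_1.csv.
fake_body = io.BytesIO(b"name,age\r\nsandeep,25\r\nsandy,22\r\nmaddy,26\r\n")

rows = rows_from_body(fake_body)
leftover = fake_body.read()  # the stream is already exhausted

print(rows[0])   # {'name': 'sandeep', 'age': '25'}
print(leftover)  # b''
```

With the real response, `rows_from_body(obj['Body'])` would be called once and the result reused.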


# Download the file from the bucket

In [51]:
# check the csv files we have in our local
print('Displaying only CSV files...')
print([i for i in os.listdir() if i.split('.')[-1]=='csv'])

Displaying only CSV files...
['dummy_1.csv', 'dummy_2.csv', 'new_user_credentials.csv']


In [53]:
# select the bucket
bucket_select = s3_boto.Bucket('first-bucket-555')

# download and save in our local
bucket_select.download_file(Key='s3_dummy_1.csv', Filename='dummy_1_downloaded.csv')

df_download = pd.read_csv('dummy_1_downloaded.csv')
df_download

|   | name | age |
|---|------|-----|
| 0 | sandeep | 25 |
| 1 | sandy | 22 |
| 2 | maddy | 26 |


In [54]:
# check the csv files we have in our local
print('Displaying only CSV files...')
print([i for i in os.listdir() if i.split('.')[-1]=='csv'])

Displaying only CSV files...
['dummy_1.csv', 'dummy_1_downloaded.csv', 'dummy_2.csv', 'new_user_credentials.csv']


##### The object s3_dummy_1.csv was downloaded from the bucket and saved locally as dummy_1_downloaded.csv
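
For cases where overwriting a local file would be unwanted, the download step can be wrapped with a guard. This is only a sketch: `download_to` is a hypothetical helper, and `FakeBucket` is a stand-in that writes fixed bytes instead of contacting S3.

```python
import tempfile
from pathlib import Path

def download_to(bucket, key, destination):
    """Download `key` to `destination`, refusing to replace an existing file."""
    path = Path(destination)
    if path.exists():
        raise FileExistsError(f"{path} already exists")
    bucket.download_file(Key=key, Filename=str(path))
    return path

class FakeBucket:
    """Stand-in for a boto3 Bucket: writes fixed bytes, no network access."""

    def download_file(self, Key, Filename):
        Path(Filename).write_bytes(b"name,age\nsandeep,25\n")

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "dummy_1_downloaded.csv"
    saved = download_to(FakeBucket(), "s3_dummy_1.csv", target)
    first_ok = saved.exists()
    try:
        download_to(FakeBucket(), "s3_dummy_1.csv", target)
        second_refused = False
    except FileExistsError:
        second_refused = True

print(first_ok, second_refused)  # True True
```

With the real resource the call would be `download_to(bucket_select, 's3_dummy_1.csv', 'dummy_1_downloaded.csv')`.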