# S3 Bucket: List / Read / Write Functions


## First Section: How to list the files currently in the S3 bucket


## Second Section: How to read a CSV from S3 into a DataFrame, so there is something to test with


## Third Section: Two different methods for writing to the S3 bucket


## Everything below relies on having AWS CLI credentials (access keys) configured and working. Without valid credentials, AWS will reject the calls.

In [47]:
# Libraries required for List / Read / Write.
import boto3
import pandas as pd
from io import StringIO  # Only used for the write step; not needed for reading.

## The line below must be run before any of the functions (List / Read / Write).

In [None]:
client = boto3.client('s3')
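
Because every call depends on working credentials, it can help to confirm them before touching the bucket. The sketch below is one possible check, not part of the original notebook: the helper name `describe_caller` is hypothetical, and it assumes an STS client is passed in (for example `boto3.client('sts')`). `get_caller_identity` returns the identity behind the active credentials and raises an exception when none are valid.

```python
def describe_caller(sts_client):
    """Return the ARN of the identity behind the active AWS credentials.

    sts_client is assumed to be a boto3 STS client, e.g. boto3.client('sts').
    The call raises an exception if the credentials are missing or invalid.
    """
    identity = sts_client.get_caller_identity()
    return identity["Arn"]
```

Calling `describe_caller(boto3.client('sts'))` should return the ARN of the user or role whose keys are configured.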

## List Function
### The code below lists all files in the S3 bucket without direct access to the bucket itself.

In [79]:
# List the objects in the S3 bucket.

bucket_name = 'ucsd.final.project.music'
response = client.list_objects_v2(Bucket=bucket_name)  # Returns at most 1,000 objects per call.
files = response.get("Contents", [])  # "Contents" is absent when the bucket is empty.
for file in files:
    print(f"File_Name: {file['Key']}, Size: {file['Size']}, Date_M: {file['LastModified']}")

File_Name: stuff_df.csv, Size: 129, Date_M: 2022-10-20 02:46:55+00:00
File_Name: test_data.csv, Size: 119, Date_M: 2022-10-16 21:26:16+00:00
File_Name: test_upload.csv, Size: 129, Date_M: 2022-10-20 02:49:26+00:00
File_Name: top50.csv, Size: 3873, Date_M: 2022-10-16 22:51:49+00:00
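
A single `list_objects_v2` call returns at most 1,000 objects, so a larger bucket would be listed only partially. The sketch below shows one way to follow the pagination with the client's built-in paginator. The helper name `list_all_objects` is hypothetical, and it assumes the same kind of `client` object created with `boto3.client('s3')`.

```python
def list_all_objects(client, bucket_name):
    """Yield every object entry in the bucket, following pagination.

    client is assumed to be a boto3 S3 client.
    """
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # A page has no "Contents" key when it holds no objects.
        for obj in page.get("Contents", []):
            yield obj
```

Each yielded entry has the same `Key`, `Size`, and `LastModified` fields used in the loop above, so `for file in list_all_objects(client, bucket_name):` can replace the original loop.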


## Read Function

In [50]:
file_name = input("What file do you want to pull into a dataframe? Don't forget the extension: ")
file_name

What file do you want to pull into a dataframe? Don't forget the extension: top50.csv


'top50.csv'

In [51]:
path = 's3://ucsd.final.project.music/' + str(file_name)
path

's3://ucsd.final.project.music/top50.csv'

In [52]:
df = pd.read_csv(path)  # Reading an s3:// path with pandas requires the s3fs package.

In [53]:
df.head()

Unnamed: 0.1,Unnamed: 0,Track.Name,Artist.Name,Genre,Beats.Per.Minute,Energy,Danceability,Loudness..dB..,Liveness,Valence.,Length.,Acousticness..,Speechiness.,Popularity
0,1,Señorita,Shawn Mendes,canadian pop,117,55,76,-6,8,75,191,4,3,79
1,2,China,Anuel AA,reggaeton flow,105,81,79,-4,8,61,302,8,9,92
2,3,boyfriend (with Social House),Ariana Grande,dance pop,190,80,40,-4,16,70,186,12,46,85
3,4,Beautiful People (feat. Khalid),Ed Sheeran,pop,93,65,64,-8,8,55,198,12,19,86
4,5,Goodbyes (Feat. Young Thug),Post Malone,dfw rap,150,65,58,-4,11,18,175,45,7,94
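
Reading an `s3://` path directly with `pd.read_csv` depends on the optional `s3fs` package. If that package is not available, the same file can be fetched through the boto3 client instead. The sketch below is one possible approach: the helper name `fetch_csv_buffer` is hypothetical, and it assumes a boto3 S3 client is passed in.

```python
import io


def fetch_csv_buffer(client, bucket_name, key):
    """Download an object from S3 and return its bytes as an in-memory file.

    client is assumed to be a boto3 S3 client. The returned buffer can be
    handed to any reader that accepts a file-like object.
    """
    response = client.get_object(Bucket=bucket_name, Key=key)
    # response["Body"] is a streaming object; read() pulls the full content.
    return io.BytesIO(response["Body"].read())
```

The result can be passed straight to pandas, for example `df = pd.read_csv(fetch_csv_buffer(client, 'ucsd.final.project.music', file_name))`.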


## This ends the Read segment of the code.

## This begins the Write segment of the code. There are two methods below.
## This is Method I.

In [54]:
# Give the DataFrame a different name to show that it uploads to S3.
aws_ul_test1_df = df  # Same DataFrame under a new name (not a copy).
aws_ul_test1_df.head()

Unnamed: 0.1,Unnamed: 0,Track.Name,Artist.Name,Genre,Beats.Per.Minute,Energy,Danceability,Loudness..dB..,Liveness,Valence.,Length.,Acousticness..,Speechiness.,Popularity
0,1,Señorita,Shawn Mendes,canadian pop,117,55,76,-6,8,75,191,4,3,79
1,2,China,Anuel AA,reggaeton flow,105,81,79,-4,8,61,302,8,9,92
2,3,boyfriend (with Social House),Ariana Grande,dance pop,190,80,40,-4,16,70,186,12,46,85
3,4,Beautiful People (feat. Khalid),Ed Sheeran,pop,93,65,64,-8,8,55,198,12,19,86
4,5,Goodbyes (Feat. Young Thug),Post Malone,dfw rap,150,65,58,-4,11,18,175,45,7,94


In [55]:
to_s3_upload_name = input('What do you want to name this file in the S3 Bucket?  Do not forget the file extension: ')

What do you want to name this file in the S3 Bucket?  Do not forget the file extension: top50_test_method1.csv


In [56]:
to_s3_upload_name

'top50_test_method1.csv'

In [57]:
# An HTTPStatusCode of 200 in the response indicates success; RetryAttempts = 0 means no retries were needed.
filename = to_s3_upload_name
bucketName = 'ucsd.final.project.music'

csv_buffer = StringIO()
aws_ul_test1_df.to_csv(csv_buffer)  # Also writes the row index; pass index=False to leave it out.

response = client.put_object(
    ACL='private',
    Body=csv_buffer.getvalue(),
    Bucket=bucketName,
    Key=filename)

response

{'ResponseMetadata': {'RequestId': 'S3D5BASVJD9TCG74',
  'HostId': 'J7T4cQHGL4iw/lzB+bD8uvZZyiyOwO7VSVDRHH2wJI3Ub36x3aS/fYhwDp4WDKqw6bFLw31Cc2U=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'J7T4cQHGL4iw/lzB+bD8uvZZyiyOwO7VSVDRHH2wJI3Ub36x3aS/fYhwDp4WDKqw6bFLw31Cc2U=',
   'x-amz-request-id': 'S3D5BASVJD9TCG74',
   'date': 'Thu, 20 Oct 2022 18:14:48 GMT',
   'etag': '"46e0f80d314f6721dd75a32b1e0f28df"',
   'server': 'AmazonS3',
   'content-length': '0'},
  'RetryAttempts': 0},
 'ETag': '"46e0f80d314f6721dd75a32b1e0f28df"'}
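
Rather than reading the response by eye, the status code can be checked in code. The sketch below is a minimal example with a hypothetical helper name, `upload_succeeded`. It only inspects the response dictionary shown above. Note that boto3 generally raises an exception when a request fails, so this check is a confirmation rather than the only safeguard.

```python
def upload_succeeded(response):
    """Return True when the S3 response reports HTTP status 200."""
    status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
    return status == 200
```

With the response from the cell above, `upload_succeeded(response)` evaluates to `True`.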

##  This ends Method I.

## This begins Method II.

In [58]:
to_s3_upload_name2 = input('What do you want to name this file in the S3 Bucket?  Do not forget the file extension: ')
to_s3_upload_name2

What do you want to name this file in the S3 Bucket?  Do not forget the file extension: top50_test_method2.csv


'top50_test_method2.csv'

In [59]:
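# An HTTPStatusCode of 200 in the response indicates success; RetryAttempts = 0 means no retries were needed.
bucket = 'ucsd.final.project.music'  # Already created on S3, so this is hardcoded.
csv_buffer = StringIO()
# The recorded run uploaded a DataFrame named stuff_df, which is never defined in this notebook;
# the DataFrame from Method I is used here so the cell runs.
aws_ul_test1_df.to_csv(csv_buffer)
s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, to_s3_upload_name2).put(Body=csv_buffer.getvalue())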

{'ResponseMetadata': {'RequestId': 'DB9KGZBHJNB693EV',
  'HostId': 'Svjbw16h/3uBHTidCEhGxo87+H+3gd2U4UpaX66ChgbExVEdRxDTTMLSFsVgpQ7T0gwEo+NHTzI=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'Svjbw16h/3uBHTidCEhGxo87+H+3gd2U4UpaX66ChgbExVEdRxDTTMLSFsVgpQ7T0gwEo+NHTzI=',
   'x-amz-request-id': 'DB9KGZBHJNB693EV',
   'date': 'Thu, 20 Oct 2022 18:15:08 GMT',
   'etag': '"d5347638b86b0ae87eb9e312df865f6d"',
   'server': 'AmazonS3',
   'content-length': '0'},
  'RetryAttempts': 0},
 'ETag': '"d5347638b86b0ae87eb9e312df865f6d"'}
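
Both methods return an `ETag`. For a simple, single-part upload that is unencrypted or uses SSE-S3, the ETag is the MD5 digest of the uploaded bytes, so it can be compared against a locally computed hash. The sketch below rests on those assumptions and on the text being sent as UTF-8. The helper name `etag_matches` is hypothetical, and the check would not hold for multipart or KMS-encrypted uploads.

```python
import hashlib


def etag_matches(csv_text, response):
    """Compare the MD5 of the uploaded text with the ETag S3 returned.

    Assumes a single-part upload whose ETag is the MD5 of the body,
    and that the text was sent UTF-8 encoded.
    """
    local_md5 = hashlib.md5(csv_text.encode("utf-8")).hexdigest()
    remote_etag = response["ETag"].strip('"')  # S3 wraps the ETag in double quotes.
    return local_md5 == remote_etag
```

For either method, `etag_matches(csv_buffer.getvalue(), response)` can be run on the returned response (for Method II, capture the return value of `.put(...)` in a variable first).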

## This ends Method II.