# boto3 setup and access

Create a plain-text credentials file at `~/.aws/credentials` (the `.aws` folder in your home directory, not the filesystem root).

Format:

[Profile Name]

aws_access_key_id = xxx

aws_secret_access_key = xxx/xxx
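The layout above is standard INI, so it can be checked with Python's built-in `configparser`. The following is a minimal sketch, not part of the original notebook: the helper name `list_profiles` is hypothetical, the key values are dummies, and the text is parsed from an inline string rather than the real file.

```python
import configparser

# Placeholder contents mirroring ~/.aws/credentials (all key values are dummies).
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = xxx
aws_secret_access_key = xxx/xxx

[Admin_Profile]
aws_access_key_id = yyy
aws_secret_access_key = yyy/yyy
"""


def list_profiles(credentials_text):
    """Return the profile names (section headers) found in the credentials text."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    return parser.sections()


profiles = list_profiles(SAMPLE_CREDENTIALS)
print(profiles)
```

To inspect a real file, the same helper could be given the text of `~/.aws/credentials` instead of the sample string.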

AWS credentials stored in the location above are detected automatically by `boto3` when it establishes a connection.


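A **Session** is where connectivity to AWS services is initiated. A session created without arguments uses the default credential chain (e.g. the `default` profile in `~/.aws/credentials`, or the IAM instance profile when the code runs on EC2). The session created in this notebook names a profile and region explicitly instead.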

A **Resource** is the high-level service class and the one recommended here. It ties a Python object to a particular AWS resource (for example, a bucket) so that you can pass that object around and work with the abstraction rather than worrying about which underlying service calls are being made.

**Clients** provide a low-level interface to the AWS service. Their definitions are generated from a JSON service description in the botocore library. The botocore package is shared between boto3 and the AWS CLI.

To summarize, resources are higher-level abstractions of AWS services than clients. Resources are the recommended pattern for using boto3 because you don’t have to worry about many of the underlying details when interacting with AWS services. As a result, code written with resources tends to be simpler.

However, Resources aren’t available for all AWS services. In such cases, there is no other choice but to use a Client instead.
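To make the difference concrete, here is a small sketch of the same task (listing bucket names) written against each interface. The function names are hypothetical and the functions take an already-created object as an argument, so nothing here opens a connection by itself.

```python
def bucket_names_via_client(s3_client):
    """Low-level: list_buckets() returns a plain dict mirroring the API response."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]


def bucket_names_via_resource(s3_resource):
    """High-level: buckets.all() yields Bucket objects with a .name attribute."""
    return [bucket.name for bucket in s3_resource.buckets.all()]
```

Given a `boto3.Session` called `session`, `bucket_names_via_client(session.client('s3'))` and `bucket_names_via_resource(session.resource('s3'))` should return the same list; the client version works with dictionaries, the resource version with objects.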

In [1]:
import boto3
import pandas as pd

In [4]:
session = boto3.Session(
    profile_name='Admin_Profile',
    region_name='eu-west-2',
)

In [5]:
s3 = session.resource('s3')

In [6]:
buckets = s3.buckets.all()

In [7]:
for bucket in buckets:
    print(bucket)

s3.Bucket(name='spotify-etl-data')


Let's try uploading a file to the bucket on S3.

In [3]:
parquet_file_to_load = pd.read_parquet('tmp/rock_albums.parquet')

Access the bucket through the S3 resource with `s3.Bucket()` and call its `upload_file()` method to upload the file. Note that `upload_file()` takes the path of the local file, not the DataFrame loaded above.

In [9]:
s3.Bucket('spotify-etl-data').upload_file(Filename='tmp/rock_albums.parquet', Key='rock_albums.parquet')

In [11]:
for my_bucket_object in s3.Bucket('spotify-etl-data').objects.all():
    print(my_bucket_object.key)

rock_albums.parquet
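Once a bucket holds many objects, listing everything becomes noisy. A possible helper, sketched here with the hypothetical name `list_keys`, narrows the listing to keys that start with a given prefix by using the collection's `filter(Prefix=...)` method and falls back to `all()` when no prefix is given.

```python
def list_keys(bucket, prefix=None):
    """Return the keys in `bucket`, optionally limited to those starting with `prefix`."""
    if prefix:
        objects = bucket.objects.filter(Prefix=prefix)
    else:
        objects = bucket.objects.all()
    return [obj.key for obj in objects]
```

For example, `list_keys(s3.Bucket('spotify-etl-data'), prefix='rock')` would return only the keys beginning with `rock`.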


In [12]:
from datetime import datetime

In [23]:
date_today = datetime.today().strftime('%Y-%m-%d')

In [24]:
date_today

'2023-07-08'

In [27]:
def upload_to_s3(local_path, key, bucket):
    '''
    Upload a file to S3, prefixing the key with today's date
    Needs a boto3 session and resource to be set up and named s3
    '''
    try:
        date_today = datetime.today().strftime('%Y-%m-%d')

        # e.g. 2023-07-08_rock_albums.parquet
        s3_filename = f'{date_today}_{key}'

        s3.Bucket(bucket).upload_file(Filename=local_path, Key=s3_filename)

        print(f'Uploaded file {s3_filename} to {bucket}')

    # Catch Exception rather than using a bare except, so Ctrl-C still works
    except Exception:
        print('Something went wrong - check inputs')


In [30]:
upload_to_s3('tmp/rock_albums.parquet', key='rock_albums.parquet', bucket='spossssstify-etl-data')

Something went wrong - check inputs
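The generic message above does not say why the upload failed (here the bucket name appears to be misspelt). One possible variant is sketched below under a few assumptions: the names `dated_key` and `upload_with_report` are hypothetical, the function receives a Bucket object (e.g. `s3.Bucket('spotify-etl-data')`) instead of relying on a global `s3`, and it still catches a broad `Exception` but reports its type and message.

```python
from datetime import date


def dated_key(key, day=None):
    """Prefix `key` with an ISO date (YYYY-MM-DD), defaulting to today."""
    day = day or date.today()
    return f"{day.isoformat()}_{key}"


def upload_with_report(bucket, local_path, key, day=None):
    """Upload `local_path` to `bucket` under a dated key.

    Returns the S3 key on success, or None if the upload raised an exception.
    """
    s3_key = dated_key(key, day)
    try:
        bucket.upload_file(Filename=local_path, Key=s3_key)
    except Exception as exc:  # broad on purpose in this sketch
        print(f"Upload of {local_path} failed: {type(exc).__name__}: {exc}")
        return None
    print(f"Uploaded file {s3_key} to {bucket.name}")
    return s3_key
```

Returning the key (or `None`) lets the calling code decide what to do after a failure instead of relying only on printed output.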
