## A Beginner’s Guide to Data Engineering: Harnessing the Power of Python to Access AWS Services
- This notebook shows how to access AWS services programmatically from Python using Boto3.
- It contains the code for the blog post [A Beginner’s Guide to Data Engineering: Harnessing the Power of Python to Access AWS Services]()

### 1. Setting Up AWS Credentials

In [2]:
# configparser parses the .cfg file that holds the security credentials
import configparser

In [3]:
# Initialize the ConfigParser object
config = configparser.ConfigParser()
# Read the configuration from the local file; the context manager closes the file afterwards
with open('credentials.cfg') as cfg_file:
    config.read_file(cfg_file)
# Load the parameters from the config file into variables
ACCESS_KEY = config.get('AWS', 'ACCESS_KEY')
SECRET_ACCESS_KEY = config.get('AWS', 'SECRET_ACCESS_KEY')
AWS_REGION = config.get('AWS', 'AWS_REGION')
print("Configuration loaded successfully !")

Configuration loaded successfully !
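
The cell above expects a `credentials.cfg` file with an `[AWS]` section in the notebook's working directory. The file itself is not shown, so the sketch below assumes a plausible layout inferred from the three keys the cell reads. The values are placeholders, and `load_aws_settings` is a hypothetical helper rather than part of the original notebook. It parses the text with `read_string` and fails early when a key is missing.

```python
import configparser

# Assumed layout of credentials.cfg, inferred from the keys read above.
# The values are placeholders, not real credentials.
SAMPLE_CFG = """
[AWS]
ACCESS_KEY = AKIAEXAMPLEKEYID
SECRET_ACCESS_KEY = exampleSecretAccessKey
AWS_REGION = ap-south-1
"""

REQUIRED_KEYS = ("ACCESS_KEY", "SECRET_ACCESS_KEY", "AWS_REGION")


def load_aws_settings(cfg_text):
    """Parse config text and return the three AWS settings as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Collect every required key that is absent from the [AWS] section
    missing = [key for key in REQUIRED_KEYS if not parser.has_option("AWS", key)]
    if missing:
        raise KeyError(f"credentials.cfg is missing: {', '.join(missing)}")
    return {key: parser.get("AWS", key) for key in REQUIRED_KEYS}


settings = load_aws_settings(SAMPLE_CFG)
print(settings["AWS_REGION"])
```

Because the file holds secrets, it is usually kept out of version control, for example by listing it in `.gitignore`.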


### 2. Installing Boto3

In [None]:
%pip install boto3

### 3. Configuring a Boto3 Client to Access the S3 Service

In [4]:
import boto3

# Configure the S3 client with the region and credentials loaded above
s3_client = boto3.client(
    's3',
    region_name=AWS_REGION,
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_ACCESS_KEY,
)

### 4. Using the Boto3 Client to Access the S3 Service

#### a. Listing the available S3 buckets

In [9]:
try:
    # Get the list of all buckets owned by this account
    response = s3_client.list_buckets()
    # Each entry in response['Buckets'] is a dict; keep only the bucket names
    buckets_list = [bucket['Name'] for bucket in response['Buckets']]
    print(f"Bucket list fetched successfully.\nThere are currently {len(buckets_list)} S3 buckets in your account.\nIt includes:{buckets_list}")
except Exception as e:
    print(f"Failed to fetch the list of all available buckets\nFollowing error occurred:{e}")

Bucket list fetched successfully.
There are currently 6 S3 buckets in your account.
It includes:['anish-shilpakar-athena-query-results', 'anish-shilpakar-ccp-2022-demo', 'anish-shilpakar-replication-demo', 'anish-shilpakar-server-access-logs', 'coderush-anish-etl-source-data', 'elasticbeanstalk-ap-south-1-977481651193']
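
`list_buckets()` returns a dictionary whose `Buckets` entry is a list of per-bucket dictionaries with `Name` and `CreationDate` keys. The sketch below pulls out the names from a hand-written response, so it runs without a live AWS call. `bucket_names` is a hypothetical helper, and the sample bucket names and dates are made up for illustration.

```python
from datetime import datetime, timezone


def bucket_names(response):
    """Return the bucket names from a list_buckets-style response dict."""
    return [bucket["Name"] for bucket in response.get("Buckets", [])]


# Hand-written stand-in for s3_client.list_buckets(); not real account data.
sample_response = {
    "Buckets": [
        {"Name": "example-logs", "CreationDate": datetime(2023, 1, 5, tzinfo=timezone.utc)},
        {"Name": "example-source-data", "CreationDate": datetime(2023, 2, 9, tzinfo=timezone.utc)},
    ]
}

names = bucket_names(sample_response)
print(f"{len(names)} buckets: {names}")
```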


#### b. Creating a new S3 bucket

In [11]:
# Name of the data bucket; S3 bucket names must be globally unique
SOURCE_DATA_BUCKET = "aws-python-ace-bucket"
# Check whether the bucket already exists in this account before creating it
if SOURCE_DATA_BUCKET not in buckets_list:
    print(f"Creating new S3 bucket: {SOURCE_DATA_BUCKET} . . .")
    try:
        response = s3_client.create_bucket(
            Bucket=SOURCE_DATA_BUCKET,
            CreateBucketConfiguration={
                    'LocationConstraint': AWS_REGION
                    }
        )
        print(f"S3 bucket: {SOURCE_DATA_BUCKET} created successfully !")
    except Exception as e:
        print(f"Failed to create s3 bucket: {SOURCE_DATA_BUCKET}. Following error encountered:\n{e}")
else:
    print(f"{SOURCE_DATA_BUCKET} already exists!")

Creating new S3 bucket: aws-python-ace-bucket . . .
S3 bucket: aws-python-ace-bucket created successfully !
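
One caveat with the cell above: S3 rejects a `LocationConstraint` of `us-east-1`, so the `CreateBucketConfiguration` argument has to be left out when the configured region is `us-east-1`. The sketch below shows one way to build the arguments for `create_bucket` so that the same code works in either case. `create_bucket_kwargs` is a hypothetical helper name, not part of Boto3.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for s3_client.create_bucket."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("aws-python-ace-bucket", "ap-south-1"))
print(create_bucket_kwargs("aws-python-ace-bucket", "us-east-1"))
```

With a configured client, the call would then be `s3_client.create_bucket(**create_bucket_kwargs(SOURCE_DATA_BUCKET, AWS_REGION))`.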


#### c. Uploading a file to the S3 bucket

In [12]:
FILE_TO_UPLOAD = "test_data.csv"

# Upload the dataset file from the local directory to the S3 bucket
try:
    print("Started uploading the files . . .")
    # upload_file takes three positional arguments:
    #   1. the path of the local file to upload,
    #   2. the name of the destination S3 bucket,
    #   3. the object key (name) to store the file under.
    s3_client.upload_file(FILE_TO_UPLOAD, SOURCE_DATA_BUCKET, FILE_TO_UPLOAD)
    print(f'Successfully uploaded the data {FILE_TO_UPLOAD} to {SOURCE_DATA_BUCKET}')
except Exception as e:
    print(f"Failed to upload datasets to s3 bucket.\nFollowing error occurred:\n{e}")

Started uploading the files . . .
Successfully uploaded the data test_data.csv to aws-python-ace-bucket
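
The third argument of `upload_file` is the object key, which does not have to match the local file name. Below is a minimal sketch of uploading under a key prefix, which the S3 console displays as a folder. The `raw/` prefix is an assumption, `object_key` and `upload_under_prefix` are illustrative helpers, and `RecordingClient` is a stand-in that only records the call so the example runs without AWS access.

```python
import posixpath
from pathlib import Path


def object_key(local_path, prefix=""):
    """Build an S3 object key from a local file name and an optional prefix."""
    file_name = Path(local_path).name
    # S3 keys always use forward slashes, regardless of the local OS.
    return posixpath.join(prefix.strip("/"), file_name) if prefix else file_name


def upload_under_prefix(s3_client, local_path, bucket, prefix=""):
    """Upload local_path to bucket under prefix and return the key used."""
    key = object_key(local_path, prefix)
    s3_client.upload_file(str(local_path), bucket, key)
    return key


class RecordingClient:
    """Stand-in for a Boto3 S3 client; records calls instead of uploading."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Bucket, Key):
        self.calls.append((Filename, Bucket, Key))


client = RecordingClient()
key = upload_under_prefix(client, "test_data.csv", "aws-python-ace-bucket", prefix="raw/")
print(key)
```

Passing the `s3_client` created earlier instead of `RecordingClient()` would perform the real upload.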
