# Boto3

Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python. It is a Python library that allows you to write software that makes use of services like Amazon S3, Amazon EC2, and Amazon DynamoDB, among others.

With Boto3, you can create, configure, and manage AWS services using Python code. It provides an easy-to-use, object-oriented API that abstracts the underlying AWS service APIs, making it simpler to interact with AWS services programmatically.

### Some key features of Boto3 include:

1. Resource APIs: Boto3 provides resource APIs that allow you to interact with AWS services using higher-level abstractions. These resource objects encapsulate the low-level service operations, making it more intuitive to work with AWS resources.
2. Session Management: Boto3 manages authentication and configuration settings through sessions. You can create a session by providing your AWS access key, secret key, and optional region and profile settings.
3. Automatic Retries: Boto3 has built-in support for automatic retries. It can handle common transient errors, such as throttling or service unavailability, by automatically retrying the failed requests with exponential backoff.
4. Pagination: Boto3 simplifies working with paginated results from AWS service APIs. It provides a paginator object that allows you to iterate over the results easily without having to handle pagination manually.
5. Waiters: Boto3 offers waiters that allow you to wait for specific conditions to be met before proceeding with your code execution. For example, you can use a waiter to wait for an EC2 instance to reach a specific state before performing further actions.
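The pagination feature in the list above can be sketched as follows. This is a minimal illustration, not part of the original notebook: the helper name `list_all_keys` and its parameters are hypothetical, and the function expects an S3 client such as the one returned by `boto3.client('s3')`.

```python
def list_all_keys(s3_client, bucket_name, prefix=""):
    """Return every object key in a bucket, following pagination.

    `s3_client` is assumed to be an S3 client, e.g. boto3.client('s3').
    """
    # list_objects_v2 returns at most 1000 keys per call; the paginator
    # issues the follow-up requests for us.
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is missing from a page when nothing matches.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys
```

With a real client this would be called as `list_all_keys(s3_client, "my-bucket")`, where the bucket name is a placeholder.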

To get started with Boto3, you need to install it using pip:

`pip install boto3`

Once installed, you can import the boto3 module in your Python code and start interacting with AWS services. You'll need to configure your AWS credentials and permissions to access the desired services.

In [1]:
pip install boto3

Collecting boto3
  Downloading boto3-1.38.44-py3-none-any.whl.metadata (6.6 kB)
Collecting botocore<1.39.0,>=1.38.44 (from boto3)
  Downloading botocore-1.38.44-py3-none-any.whl.metadata (5.7 kB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3)
  Downloading jmespath-1.0.1-py3-none-any.whl.metadata (7.6 kB)
Collecting s3transfer<0.14.0,>=0.13.0 (from boto3)
  Downloading s3transfer-0.13.0-py3-none-any.whl.metadata (1.7 kB)
Downloading boto3-1.38.44-py3-none-any.whl (139 kB)
Downloading botocore-1.38.44-py3-none-any.whl (13.7 MB)
Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Downloading s3transfer-0.13.0-py3-none-any.whl (85 kB)

# S3

In [6]:
import boto3 # Boto3 lib
from botocore.exceptions import ClientError # AWS exceptions

Before we start, go to your IAM user and create an access key and secret access key.

In [7]:
# Placeholder values: never commit real keys to a notebook or repository.
AWS_ACCESS_KEY_ID = "YOUR_ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY = "YOUR_SECRET_ACCESS_KEY"



How to create an access key and secret:

1. Open the IAM Console:
> From the AWS Console, navigate to Services → IAM (Identity and Access Management).

2. Select the User:
> In the left sidebar, click on Users, then choose the IAM user you want to create credentials for.

3. Create Access Key:
> In the user’s summary page, go to the “Security credentials” tab.
> Scroll down to the “Access keys” section and click “Create access key”.
> Choose the use case (e.g., Command Line Interface (CLI)), then click Next and Create access key.

4. Copy and Store the Keys:

> Once generated, you’ll get:
  > Access key ID
  > Secret access key

Important: Copy the Secret Access Key now—you won’t be able to retrieve it again later.

Store both keys securely (e.g., in a password manager or AWS Secrets Manager).
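One way to keep the keys out of the notebook itself is to read them from environment variables. The sketch below assumes the standard variable names `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN`; the helper `load_aws_credentials` is a hypothetical name introduced here for illustration.

```python
import os


def load_aws_credentials(env=None):
    """Read AWS credentials from environment variables.

    Raises KeyError naming the missing variable if a required one is absent.
    """
    env = os.environ if env is None else env
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        # Only present when using temporary credentials; None otherwise.
        "aws_session_token": env.get("AWS_SESSION_TOKEN"),
    }
```

The returned dictionary uses the same keyword names that `boto3.client()` accepts, so it could be passed as `boto3.client('s3', **load_aws_credentials())`.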

To work with Amazon S3 using Boto3, you have two options: using the S3 client or the S3 resource.

S3 Client: The S3 client provides a low-level interface to interact with Amazon S3. You can create an S3 client using the boto3.client() method:

With the S3 client, you can perform various operations such as creating buckets, uploading objects, downloading objects, listing buckets and objects, and more. The client provides methods that correspond to the Amazon S3 API operations.

In [8]:
s3_client = boto3.client('s3',
                         aws_access_key_id=AWS_ACCESS_KEY_ID,
                         aws_secret_access_key=AWS_SECRET_ACCESS_KEY)

S3 Resource: The S3 resource provides a higher-level, object-oriented interface to work with Amazon S3. It abstracts away some of the low-level details and provides a more intuitive way to interact with S3 buckets and objects. You can create an S3 resource using the boto3.resource() method:

With the S3 resource, you can work with buckets and objects as Python objects. It provides methods and attributes that allow you to perform common S3 operations more easily, such as creating buckets, uploading files, downloading files, and iterating over objects in a bucket.

In [10]:
s3_resource = boto3.resource('s3')
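To illustrate what "buckets and objects as Python objects" looks like in practice, here is a small sketch. The helper name `list_keys_with_resource` is hypothetical; it assumes an S3 resource such as the one returned by `boto3.resource('s3')`.

```python
def list_keys_with_resource(s3_resource, bucket_name):
    """Return (key, size_in_bytes) pairs for every object in a bucket.

    `s3_resource` is assumed to be an S3 resource, e.g. boto3.resource('s3').
    """
    bucket = s3_resource.Bucket(bucket_name)
    # objects.all() yields object summaries and handles pagination itself.
    return [(obj.key, obj.size) for obj in bucket.objects.all()]
```

Compared with the client, there is no response dictionary to unpack: each object summary exposes attributes such as `key` and `size` directly.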

In [9]:
session = boto3.session.Session(aws_access_key_id=AWS_ACCESS_KEY_ID,
                                aws_secret_access_key=AWS_SECRET_ACCESS_KEY)

## Create an Amazon S3 bucket

In [14]:
student_name = "yair"
bucket_name = f"{student_name}-bucket-2025"
region = None #'us-east-1'

In [15]:
bucket_name

'yair-bucket-2025'

In [9]:
# Create bucket
try:
    s3_client.create_bucket(Bucket=bucket_name)
except ClientError as e:
    print(e)
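Bucket creation can be combined with a waiter so that later cells only run once the bucket is visible. This is a sketch under assumptions: `create_bucket_and_wait` is a hypothetical helper, the delay and attempt counts are arbitrary, and the client is assumed to be configured for `us-east-1` (other regions need a `CreateBucketConfiguration`).

```python
def create_bucket_and_wait(s3_client, bucket_name):
    """Create a bucket, then block until S3 reports that it exists."""
    s3_client.create_bucket(Bucket=bucket_name)
    # The 'bucket_exists' waiter polls until the bucket is reachable.
    waiter = s3_client.get_waiter("bucket_exists")
    waiter.wait(
        Bucket=bucket_name,
        WaiterConfig={"Delay": 2, "MaxAttempts": 10},  # illustrative values
    )
    return bucket_name
```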

In [None]:
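# Create bucket in a specific region (set `region` first, e.g. 'eu-west-1').
# Note: 'us-east-1' is the default and is not accepted as a LocationConstraint.
try:
    s3_client = boto3.client('s3', region_name=region,
                             aws_access_key_id=AWS_ACCESS_KEY_ID,
                             aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
    location = {'LocationConstraint': region}
    s3_client.create_bucket(Bucket=bucket_name, CreateBucketConfiguration=location)
except ClientError as e:
    print(e)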
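Because `us-east-1` is the default region and is not accepted as a `LocationConstraint`, one way to cover both cases with a single call is to build the arguments conditionally. The helper name `build_create_bucket_kwargs` is hypothetical.

```python
def build_create_bucket_kwargs(bucket_name, region=None):
    """Build keyword arguments for s3_client.create_bucket().

    The location constraint is added only for regions other than us-east-1.
    """
    kwargs = {"Bucket": bucket_name}
    if region not in (None, "us-east-1"):
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs
```

It would then be used as `s3_client.create_bucket(**build_create_bucket_kwargs(bucket_name, region))`, with the client itself created for the same region.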

## List existing buckets

In [12]:
response = s3_client.list_buckets()

# Output the bucket names
print('Existing buckets:')
for bucket in response['Buckets']:
    print(f'  {bucket["Name"]}')

Existing buckets:
  athena-results-204597968057-1740914320
  athena-results-204597968057-1740914322
  athena-results-204597968057-1740914343
  athena-results-204597968057-1740914345
  athena-results-204597968057-1740914380
  athena-results-204597968057-1740914382
  athena-results-204597968057-1740914478
  athena-results-204597968057-1740914535
  athena-results-204597968057-1741074348
  athena-results-204597968057-1741074350
  athena-results-204597968057-1741074433
  athena-results-204597968057-1741074436
  athena-results-204597968057-1741074461
  athena-results-204597968057-1741074463
  athena-results-204597968057-1741074513
  athena-results-204597968057-1741074516
  athena-results-204597968057-1741074943
  athena-results-204597968057-1741074945
  athena-results-204597968057-1741074951
  athena-results-204597968057-1741076004
  athena-results-204597968057-1741076014
  athena-results-204597968057-1741523653
  athena-results-204597968057-1741523661
  athena0303
  aviv-athena
  aviv-athen

## Read Files from S3

In [18]:
import pandas as pd

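First, upload the following file to your bucket through the S3 console: cakes.csv (object keys are case-sensitive, so the name must match the one used in the code).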

S3 console: https://console.aws.amazon.com/s3/home

In [16]:
file_name = "cakes.csv"

#### Option #1

In [15]:
# S3 client
response = s3_client.get_object(Bucket=bucket_name, Key=file_name)

In [16]:
response

{'ResponseMetadata': {'RequestId': 'JWQXZPJQ29FQJ392',
  'HostId': 'ForZ3+r0c53b2sIXqCYaE7HosHrtopuQh8E4ZYh5XMroAsGlz3SXF9/9fbUENRWBlsSDl+pCojXpnxDt+TtMdwT0ylyDGUnN',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'ForZ3+r0c53b2sIXqCYaE7HosHrtopuQh8E4ZYh5XMroAsGlz3SXF9/9fbUENRWBlsSDl+pCojXpnxDt+TtMdwT0ylyDGUnN',
   'x-amz-request-id': 'JWQXZPJQ29FQJ392',
   'date': 'Thu, 26 Jun 2025 17:09:49 GMT',
   'last-modified': 'Thu, 26 Jun 2025 17:09:29 GMT',
   'etag': '"d7393aa09d8fae412146a1a4f0699d59"',
   'x-amz-checksum-crc64nvme': 'bAnstcib0bU=',
   'x-amz-checksum-type': 'FULL_OBJECT',
   'x-amz-server-side-encryption': 'AES256',
   'accept-ranges': 'bytes',
   'content-type': 'text/csv',
   'content-length': '6703',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'AcceptRanges': 'bytes',
 'LastModified': datetime.datetime(2025, 6, 26, 17, 9, 29, tzinfo=tzutc()),
 'ContentLength': 6703,
 'ETag': '"d7393aa09d8fae412146a1a4f0699d59"',
 'ChecksumCRC64NVME': 'bAnstcib0bU=',
 'Che

In [17]:
df = pd.read_csv(response['Body'])

In [18]:
df.head()

| Unnamed: 0 | Radius [cm] | Layers | Topping | Price |
|---|---|---|---|---|
| 0 | 16 | 1 | Picture | 311 |
| 1 | 19 | 3 | Simple | 555 |
| 2 | 4 | 2 | Decorative | 89 |
| 3 | 7 | 1 | Picture | 90 |
| 4 | 7 | 2 | Decorative | 100 |
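The `Body` of a `get_object` response is a stream that can be read only once. If pandas is not needed, the same object can be parsed with the standard library. This is a minimal sketch; `read_csv_rows` is a hypothetical helper and UTF-8 encoding is an assumption about the file.

```python
import csv
import io


def read_csv_rows(s3_client, bucket_name, key, encoding="utf-8"):
    """Download a CSV object and return its rows as a list of dicts."""
    response = s3_client.get_object(Bucket=bucket_name, Key=key)
    # Read the whole stream into memory, then decode it to text.
    text = response["Body"].read().decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))
```

All values come back as strings, so numeric columns would still need converting. That is one reason `pd.read_csv` is the more convenient choice for analysis.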


#### Option #2

In [2]:
pip install s3fs



In [19]:
# Using S3FileSystem
import s3fs

s3 = s3fs.S3FileSystem(anon=False, key=AWS_ACCESS_KEY_ID, secret=AWS_SECRET_ACCESS_KEY)
uri_file = f'{bucket_name}/{file_name}'

# Open in binary read mode and let pandas parse the stream
with s3.open(uri_file, 'rb') as f:
    df = pd.read_csv(f)

In [20]:
df

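| Unnamed: 0 | Radius [cm] | Layers | Topping | Price |
|---|---|---|---|---|
| 0 | 16 | 1 | Picture | 311 |
| 1 | 19 | 3 | Simple | 555 |
| 2 | 4 | 2 | Decorative | 89 |
| 3 | 7 | 1 | Picture | 90 |
| 4 | 7 | 2 | Decorative | 100 |
| ... | ... | ... | ... | ... |
| 367 | 10 | 1 | Writing | 119 |
| 368 | 14 | 1 | Simple | 235 |
| 369 | 6 | 1 | Writing | 36 |
| 370 | 4 | 3 | Extreme | 180 |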


#### Option #3

You can also supply credentials through the shared credentials file or environment variables.

1- Add access credentials to your ~/.aws/credentials config file

[default]
> aws_access_key_id=AKIAIOSFODNN7EXAMPLE

> aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Or

2- Set the following environment variables with their proper values:

> AWS_ACCESS_KEY_ID

> AWS_SECRET_ACCESS_KEY

> AWS_SESSION_TOKEN (only needed for temporary credentials)


* Note: this does not work in the notebook.

In [None]:
# Then you can call read_csv directly with an s3:// URI (requires s3fs)
df = pd.read_csv(f's3://{bucket_name}/{file_name}')

## Upload files to S3

In [None]:
df_to_upload = df.sample(100)

#### Option #1

In [None]:
# Using S3FileSystem (no keys passed, so it relies on the default credential chain)
import s3fs

s3 = s3fs.S3FileSystem(anon=False)

# Open in text mode ('w') so to_csv can write strings
with s3.open(f'{bucket_name}/new_dataset.csv', 'w') as f:
    df_to_upload.to_csv(f)

#### Option #2

In [None]:
df_to_upload.to_csv('temp.csv')

In [None]:
# upload_file returns None; it raises an exception if the upload fails
s3_client.upload_file('temp.csv', bucket_name, 'new_dataset.csv')
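If writing a temporary file is undesirable, the CSV can be serialized in memory and sent with `put_object`. This is a sketch under assumptions: `upload_dataframe_as_csv` is a hypothetical helper, `df` is expected to be a pandas DataFrame, and dropping the index (`index=False`) is a choice made here, not something the notebook specifies.

```python
def upload_dataframe_as_csv(s3_client, df, bucket_name, key):
    """Serialize a DataFrame to CSV in memory and upload it to S3.

    Returns the number of bytes uploaded.
    """
    # to_csv returns the CSV as a string when no path is given.
    body = df.to_csv(index=False).encode("utf-8")
    s3_client.put_object(
        Bucket=bucket_name,
        Key=key,
        Body=body,
        ContentType="text/csv",
    )
    return len(body)
```

For large files `upload_file` is usually the better fit, since it can split the transfer into multiple parts, while `put_object` sends everything in one request.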

# Your Turn

1. Create an access key and secret
2. Create a new bucket named after you plus the word "bucket", for example: "yair-bucket" (bucket names cannot contain underscores). You can use the console or Python code
3. Upload the following dataset into your bucket: [dataset_open_close](https://drive.google.com/file/d/1QpFOw6eLpwC6CJRqU_nXcfvITjJyTPid/view?usp=sharing)
4. Copy the file "dataset_high_low" from the bucket "naya-college-public" to your bucket (search online for how to copy an object from one bucket to another)
5. Read both datasets and join them on the key "Date"
6. Save the new dataset into your bucket inside a new folder called "Result"