# Boto3 & LocalStack

![Boto3 & LocalStack image](img/s3-10.png)


## 🏠 What is LocalStack?

**LocalStack** is a platform that emulates several AWS cloud services on your own machine, such as object storage and message queues. Because the services run locally, you can debug and refine your code before deploying it to a production environment.

**LocalStack** is also an effective way to learn how to implement and deploy services: it runs in a Docker container, so you need neither an AWS account nor a credit card.
In this tutorial, we create a LocalStack container and use it to exercise the main features of S3.

---

## What is Boto3?

**`Boto3`** is a 🐍 Python library for working with AWS services. It simplifies tasks such as creating, managing, and configuring these services.

There are two primary implementations within Boto3: 
* **Resource implementation**: provides a higher-level, object-oriented interface, abstracting away low-level details and offering simplified interactions with AWS services. 
* **Client implementation**: offers a lower-level, service-oriented interface, providing more granular control and flexibility for interacting with AWS services directly.
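
To make the difference concrete, here is a minimal sketch that lists bucket names through each interface. The helper names are illustrative and not part of Boto3; each function expects an object already created with `boto3.client('s3', ...)` or `boto3.resource('s3', ...)`.

```python
def bucket_names_via_client(s3_client):
    """Client interface: the call returns a plain dict shaped like the S3 API response."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response.get("Buckets", [])]


def bucket_names_via_resource(s3_resource):
    """Resource interface: buckets are exposed as Python objects with attributes."""
    return [bucket.name for bucket in s3_resource.buckets.all()]
```

The rest of this tutorial uses the client interface.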


---

## Prerequisites
Before you begin, ensure that you have the following installed:

* 🐳 Docker
* 🐙 Docker Compose


---

### 🚀 Build and run the Docker Compose environment

#### 1. Clone the repository
```bash
git clone https://github.com/r0mymendez/LocalStack-boto3.git
cd LocalStack-boto3
```
#### 2. Build and run the containers

`docker-compose -f docker-compose.yaml up --build`

---

### 🚀 Using LocalStack with Boto3: A Step-by-Step Guide
### 🛠️ Install Boto3

```!pip install boto3```




In [109]:
import boto3
import json 
import requests
import pandas as pd
from datetime import datetime
import io
import os

### 🛠️ Create a session using the LocalStack endpoint
The following code snippet initializes a client for accessing the S3 service using the LocalStack endpoint.

In [29]:

s3 = boto3.client(
    service_name='s3',
    aws_access_key_id='test',
    aws_secret_access_key='test',
    endpoint_url='http://localhost:4566',
)
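
If the endpoint differs between environments, the connection settings can be collected in one place. The sketch below is only one way to do it: the `LOCALSTACK_ENDPOINT` environment variable and the helper name are assumptions made for illustration, and the region is an arbitrary default.

```python
import os

# Hypothetical environment variable name; adjust it to your own setup.
ENDPOINT_ENV_VAR = "LOCALSTACK_ENDPOINT"


def localstack_client_kwargs(service_name="s3"):
    """Build keyword arguments for boto3.client() that point at LocalStack."""
    return {
        "service_name": service_name,
        "endpoint_url": os.environ.get(ENDPOINT_ENV_VAR, "http://localhost:4566"),
        # Dummy credentials, as used throughout this tutorial.
        "aws_access_key_id": "test",
        "aws_secret_access_key": "test",
        # Assumed region, set explicitly so the client does not depend on local AWS config.
        "region_name": "us-east-1",
    }
```

With Boto3 imported, the client can then be created with `s3 = boto3.client(**localstack_client_kwargs())`.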

### 🛠️ Create new buckets
Below is the code snippet to create new buckets using the Boto3 library.

In [55]:
# create buckets
bucket_name_news = 'news'
bucket_name_config = 'news-config'

s3.create_bucket(Bucket=bucket_name_news)
s3.create_bucket(Bucket=bucket_name_config)

{'ResponseMetadata': {'RequestId': 'b421bab1-f7d4-4da5-9559-99a9be7132d6',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/xml',
   'access-control-allow-origin': '*',
   'access-control-allow-methods': 'HEAD,GET,PUT,POST,DELETE,OPTIONS,PATCH',
   'access-control-allow-headers': 'authorization,cache-control,content-length,content-md5,content-type,etag,location,x-amz-acl,x-amz-content-sha256,x-amz-date,x-amz-request-id,x-amz-security-token,x-amz-tagging,x-amz-target,x-amz-user-agent,x-amz-version-id,x-amzn-requestid,x-localstack-target,amz-sdk-invocation-id,amz-sdk-request',
   'access-control-expose-headers': 'etag,x-amz-version-id',
   'vary': 'Origin',
   'location': '/news-config',
   'x-amz-request-id': 'b421bab1-f7d4-4da5-9559-99a9be7132d6',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'connection': 'close',
   '
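
If you re-run the notebook often, you may prefer to check which buckets already exist before creating them. The following is a small sketch of that idea; `ensure_buckets` is a hypothetical helper, not a Boto3 function, and it relies only on the `list_buckets` and `create_bucket` client calls shown in this tutorial.

```python
def ensure_buckets(s3_client, bucket_names):
    """Create only the buckets that are missing and return the names that were created."""
    existing = {b["Name"] for b in s3_client.list_buckets().get("Buckets", [])}
    created = []
    for name in bucket_names:
        if name not in existing:
            s3_client.create_bucket(Bucket=name)
            created.append(name)
    return created
```

For example, `ensure_buckets(s3, ['news', 'news-config'])` returns the list of buckets it actually had to create.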

### 📋 List all buckets
After creating a bucket, you can use the following code to list all the buckets available at your endpoint.

In [56]:
# List all buckets
response = s3.list_buckets()
pd.json_normalize(response['Buckets'])

|   | Name | CreationDate |
|---|------|--------------|
| 0 | news | 2024-02-11 13:34:50+00:00 |
| 1 | news-config | 2024-02-11 13:34:50+00:00 |


### 📤 Upload the JSON file to S3
After requesting the list of news section names from the API, the following code serializes the response as JSON and uploads it to the `news-config` bucket created earlier.

In [57]:
# request the news section names (the configuration data)
url = 'https://ok.surf/api/v1/cors/news-section-names' 
response = requests.get(url)
if response.status_code == 200:
    data = response.json()
    print('data', data)
    # upload the data to S3 as a JSON file
    s3.put_object(Bucket=bucket_name_config, Key='news-section/data_config.json', Body=json.dumps(data))



data ['US', 'World', 'Business', 'Technology', 'Entertainment', 'Sports', 'Science', 'Health']
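
To confirm that the upload worked, the object can be downloaded and parsed again. Below is a minimal round-trip sketch; `put_json` and `get_json` are illustrative helper names, and setting `ContentType` is an optional addition that the original cell does not use.

```python
import json


def put_json(s3_client, bucket, key, payload):
    """Serialize payload as UTF-8 JSON and store it under the given key."""
    body = json.dumps(payload).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body, ContentType="application/json")


def get_json(s3_client, bucket, key):
    """Download an object and parse its body as JSON."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())
```

For example, `get_json(s3, bucket_name_config, 'news-section/data_config.json')` should return the same list of section names that was uploaded.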


### 📋 List all objects
Now, let's list all the objects stored in a bucket. Since the previous step uploaded a JSON file, the `news-config` bucket should contain one object.

In [77]:
def list_objects(bucket_name):
    response = s3.list_objects(Bucket=bucket_name)
    # 'Contents' is missing from the response when the bucket is empty
    return pd.json_normalize(response.get('Contents', []))

In [78]:
# list all objects in the bucket
list_objects(bucket_name=bucket_name_config)

|   | Key | LastModified | ETag | Size | StorageClass | Owner.DisplayName | Owner.ID |
|---|-----|--------------|------|------|--------------|-------------------|----------|
| 0 | news-section/data_config.json | 2024-02-11 16:45:59+00:00 | "d61029b184d21dae1febcb46062216d3" | 89 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
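
A single `list_objects` call returns at most 1,000 keys, so larger buckets need pagination. The sketch below uses the client's built-in paginator for `list_objects_v2`; the function name `iter_object_keys` is illustrative.

```python
def iter_object_keys(s3_client, bucket_name, prefix=""):
    """Yield every key in the bucket, following pagination across response pages."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is absent when a page (or the whole bucket) has no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

For example, `list(iter_object_keys(s3, bucket_name_config))` collects all keys into a list, and the optional `prefix` argument restricts the listing to keys that start with a given string.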


### 📄 Upload multiple CSV files to S3
In the following code snippet, we call another API endpoint to fetch the news for each topic. We then save one CSV file per topic, organizing the files into folders by topic and by the date of the request. This lets you keep multiple files in the same bucket under a predictable layout.

In [59]:
# Request the news feed API method
url = 'https://ok.surf/api/v1/news-feed' 
response = requests.get(url)
response.raise_for_status()  # stop here if the request failed
data = response.json()

# Date-based folder name, e.g. dt=20240211
folder_dt = f'dt={datetime.now().strftime("%Y%m%d")}'

for item in data.keys():
    tmp = pd.json_normalize(data[item])
    tmp['section'] = item   
    tmp['download_date'] = datetime.now()
    tmp['date'] = pd.to_datetime(tmp['download_date']).dt.date
    # Note: the whole string, including the "s3://news/" prefix, becomes the object key
    path = f"s3://{bucket_name_news}/{item}/{folder_dt}/data_{item}_news.csv"

    # write the DataFrame to an in-memory buffer and upload it to S3
    bytes_io = io.BytesIO()
    tmp.to_csv(bytes_io, index=False)
    bytes_io.seek(0)
    s3.put_object(Bucket=bucket_name_news, Key=path, Body=bytes_io)



In [76]:
# list all objects in the bucket
list_objects(bucket_name=bucket_name_news)


|   | Key | LastModified | ETag | Size | StorageClass | Owner.DisplayName | Owner.ID |
|---|-----|--------------|------|------|--------------|-------------------|----------|
| 0 | s3://news/Business/dt=20240211/data_Business_n... | 2024-02-11 16:54:44+00:00 | "726c3efe54a64430696e9b439b86359d" | 47379 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 1 | s3://news/Entertainment/dt=20240211/data_Enter... | 2024-02-11 16:54:45+00:00 | "98077f8ac226ed86dedc1386d9d5a83d" | 48912 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 2 | s3://news/Health/dt=20240211/data_Health_news.csv | 2024-02-11 16:54:45+00:00 | "fdef00b0c81c43688ac13070711b8fcd" | 48812 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 3 | s3://news/Science/dt=20240211/data_Science_new... | 2024-02-11 16:54:45+00:00 | "f09bbefec5f3d2ebe735c350ba94cf04" | 41337 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 4 | s3://news/Sports/dt=20240211/data_Sports_news.csv | 2024-02-11 16:54:45+00:00 | "63f410354b8428f0f3b8bb6c410da9ef" | 44785 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 5 | s3://news/Technology/dt=20240211/data_Technolo... | 2024-02-11 16:54:46+00:00 | "0ccf318f020879415b4427e6089e830d" | 46095 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 6 | s3://news/US/dt=20240211/data_US_news.csv | 2024-02-11 16:54:46+00:00 | "3dbaee3dca2c9f558f142df702a74c00" | 48665 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 7 | s3://news/World/dt=20240211/data_World_news.csv | 2024-02-11 16:54:46+00:00 | "2d8ecaf9f961f9b1040da9ede525654d" | 46466 | STANDARD | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
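
The keys in this listing start with `s3://news/` because the upload passes the full URI as `Key`. Since the bucket is already given through the `Bucket` argument, a key can hold just the path inside the bucket. The helper below is a hypothetical alternative, not the code used in this tutorial; it builds such a key with the same topic and date layout.

```python
from datetime import date


def build_news_key(section, day):
    """Return a key such as 'Technology/dt=20240211/data_Technology_news.csv'."""
    return f"{section}/dt={day:%Y%m%d}/data_{section}_news.csv"
```

With keys shaped like this, all files for one topic share a common prefix such as `Technology/`, which can be passed as `Prefix` when listing objects.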


### 📄 Read a CSV file from S3
In this section, we aim to read a file containing news about technology topics from S3. To accomplish this, we first retrieve the name of the file in the bucket. Then, we read this file and print the contents as a pandas dataframe.

In [88]:
# Get the technology file
files = list_objects(bucket_name=bucket_name_news)
technology_file = files[files['Key'].str.find('Technology')>=0]['Key'].values[0]
print('file_name',technology_file)

file_name s3://news/Technology/dt=20240211/data_Technology_news.csv


In [89]:
# get the file from s3 using boto3
obj = s3.get_object(Bucket=bucket_name_news, Key=technology_file)
data_tech = pd.read_csv(obj['Body'])

data_tech

|   | link | og | source | source_icon | title | section | download_date | date |
|---|------|----|--------|-------------|-------|---------|---------------|------|
| 0 | https://news.google.com/articles/CBMiqgFodHRwc... | https://encrypted-tbn2.gstatic.com/images?q=tb... | Times of India | https://encrypted-tbn2.gstatic.com/faviconV2?u... | OpenAI CEO Sam Altman says Apple Vision Pro is... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| 1 | https://news.google.com/articles/CBMiNGh0dHBzO... | https://encrypted-tbn1.gstatic.com/images?q=tb... | 9to5Google | https://encrypted-tbn3.gstatic.com/faviconV2?u... | Gemini voice commands no longer require you to... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| 2 | https://news.google.com/articles/CBMiP2h0dHA6L... | https://encrypted-tbn0.gstatic.com/images?q=tb... | BikeRadar | https://encrypted-tbn0.gstatic.com/faviconV2?u... | With Zwift planning to test Apple Vision Pro, ... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| 3 | https://news.google.com/articles/CBMihQFodHRwc... | https://encrypted-tbn2.gstatic.com/images?q=tb... | Engadget | https://encrypted-tbn1.gstatic.com/faviconV2?u... | Two of our favorite Anker power banks are on s... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| 4 | https://news.google.com/articles/CBMiamh0dHBzO... | https://encrypted-tbn2.gstatic.com/images?q=tb... | TechRadar | https://encrypted-tbn2.gstatic.com/faviconV2?u... | Whatever you do, don't buy a Samsung Galaxy S2... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 65 | https://news.google.com/articles/CBMiQGh0dHBzO... | https://encrypted-tbn3.gstatic.com/images?q=tb... | Space.com | https://encrypted-tbn2.gstatic.com/faviconV2?u... | New 'Judas' game trailer teases a dystopian sc... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| 66 | https://news.google.com/articles/CCAiC3g0dS1JV... | https://encrypted-tbn3.gstatic.com/images?q=tb... | IGN | https://yt3.ggpht.com/aBCeBf7Qlr3OwsS-RB3Mgql_... | Florida Joker Wants to Voice Himself in GTA 6 ... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| 67 | https://news.google.com/articles/CBMiTGh0dHBzO... | https://encrypted-tbn2.gstatic.com/images?q=tb... | Fox News | https://encrypted-tbn3.gstatic.com/faviconV2?u... | How to know when it is time to replace your Mac | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
| 68 | https://news.google.com/articles/CBMiY2h0dHBzO... | https://encrypted-tbn0.gstatic.com/images?q=tb... | ScienceAlert | https://encrypted-tbn2.gstatic.com/faviconV2?u... | NASA Plan to Put a Nuclear Reactor on The Moon... | Technology | 2024-02-11 17:54:45.929840 | 2024-02-11 |
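
When pandas is not available, the same object can be read with the standard library. This is a small sketch under that assumption; `read_csv_rows` is an illustrative name, and it assumes the file is UTF-8 encoded with a header row, as produced by `to_csv` above.

```python
import csv
import io


def read_csv_rows(s3_client, bucket, key, encoding="utf-8"):
    """Download a CSV object and return its rows as a list of dicts keyed by column name."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))
```

Each row is a dictionary, so `rows[0]['title']` would give the title of the first news item.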


### 🏷️ Add tags to the bucket
When creating a resource in the cloud, it is considered a best practice to add tags for organizing resources, controlling costs, or applying security policies based on these labels. The following code demonstrates how to add tags to a bucket using a method from the boto3 library.

In [90]:
s3.put_bucket_tagging(
    Bucket=bucket_name_news,
    Tagging={
        'TagSet': [
            {
                'Key': 'Environment',
                'Value': 'Test'
            },
            {
                'Key': 'Project',
                'Value': 'Localstack+Boto3'
            }
        ]
    }
)

{'ResponseMetadata': {'RequestId': '4667f3d4-7b62-4576-bc49-819a236ed5c0',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'content-type': 'application/xml',
   'x-amz-request-id': '4667f3d4-7b62-4576-bc49-819a236ed5c0',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'connection': 'close',
   'date': 'Sun, 11 Feb 2024 17:12:43 GMT',
   'server': 'hypercorn-h11'},
  'RetryAttempts': 0}}

In [93]:
# get the tagging
pd.json_normalize(s3.get_bucket_tagging(Bucket=bucket_name_news)['TagSet'])

|   | Key | Value |
|---|-----|-------|
| 0 | Environment | Test |
| 1 | Project | Localstack+Boto3 |
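
Tags are often easier to manage as a plain dictionary. The two helpers below are an illustrative sketch (the names are not part of Boto3) that convert between a dictionary and the `TagSet` list format used by `put_bucket_tagging` and `get_bucket_tagging`.

```python
def to_tag_set(tags):
    """Convert {'Environment': 'Test'} into the TagSet list expected by put_bucket_tagging."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]


def from_tag_set(tag_set):
    """Convert a TagSet list from get_bucket_tagging back into a plain dict."""
    return {tag["Key"]: tag["Value"] for tag in tag_set}
```

For example, the call above could be written as `s3.put_bucket_tagging(Bucket=bucket_name_news, Tagging={'TagSet': to_tag_set({'Environment': 'Test', 'Project': 'Localstack+Boto3'})})`.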


### 🔄 Versioning in the bucket
Another good practice is to enable versioning for your bucket. Versioning lets you keep and recover different versions of the same object. In the following code, we create a file with the inventory of objects in the bucket and save it twice.

In [94]:

# allow versioning in the bucket
s3.put_bucket_versioning(
    Bucket=bucket_name_news,
    VersioningConfiguration={
        'Status': 'Enabled'
    }
)



{'ResponseMetadata': {'RequestId': '133b3c62-aa82-4faf-af13-cca5228c83d4',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/xml',
   'x-amz-request-id': '133b3c62-aa82-4faf-af13-cca5228c83d4',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'connection': 'close',
   'content-length': '0',
   'date': 'Sun, 11 Feb 2024 18:06:56 GMT',
   'server': 'hypercorn-h11'},
  'RetryAttempts': 0}}

In [96]:
# Add new file to the bucket

# file name
file_name = 'inventory.csv'

# list all objects in the bucket
files = list_objects(bucket_name=bucket_name_news)
bytes_io = io.BytesIO()
files.to_csv(bytes_io, index=False)
bytes_io.seek(0)
# upload the data to s3
s3.put_object(Bucket=bucket_name_news, Key=file_name, Body=bytes_io)



{'ResponseMetadata': {'RequestId': '533f69ad-a631-46d4-bbff-fa982be92bcd',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/xml',
   'etag': '"9735f75dc2f10f885b3531d40bbef7b7"',
   'x-amz-server-side-encryption': 'AES256',
   'x-amz-version-id': '8ZflZE-hRLua_o8GWkLQzA',
   'x-amz-request-id': '533f69ad-a631-46d4-bbff-fa982be92bcd',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'connection': 'close',
   'content-length': '0',
   'date': 'Sun, 11 Feb 2024 18:30:03 GMT',
   'server': 'hypercorn-h11'},
  'RetryAttempts': 0},
 'ETag': '"9735f75dc2f10f885b3531d40bbef7b7"',
 'ServerSideEncryption': 'AES256',
 'VersionId': '8ZflZE-hRLua_o8GWkLQzA'}

In [97]:
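# upload the same key again to create a second version
# Note: bytes_io was already read to the end by the previous upload, so without
# bytes_io.seek(0) this call stores an empty object (Size 0 in the listing below)
s3.put_object(Bucket=bucket_name_news, Key=file_name, Body=bytes_io)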

{'ResponseMetadata': {'RequestId': '0d521fb8-e63d-4277-ada9-b44595f1a257',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/xml',
   'etag': '"d41d8cd98f00b204e9800998ecf8427e"',
   'x-amz-server-side-encryption': 'AES256',
   'x-amz-version-id': 'hrasQC8zpVfHThaKL5mZ9g',
   'x-amz-request-id': '0d521fb8-e63d-4277-ada9-b44595f1a257',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'connection': 'close',
   'content-length': '0',
   'date': 'Sun, 11 Feb 2024 18:30:18 GMT',
   'server': 'hypercorn-h11'},
  'RetryAttempts': 0},
 'ETag': '"d41d8cd98f00b204e9800998ecf8427e"',
 'ServerSideEncryption': 'AES256',
 'VersionId': 'hrasQC8zpVfHThaKL5mZ9g'}

In [107]:
# List all the versions of the object
versions = s3.list_object_versions(Bucket=bucket_name_news, Prefix=file_name)

pd.json_normalize(versions['Versions'])

|   | ETag | Size | StorageClass | Key | VersionId | IsLatest | LastModified | Owner.DisplayName | Owner.ID |
|---|------|------|--------------|-----|-----------|----------|--------------|-------------------|----------|
| 0 | "d41d8cd98f00b204e9800998ecf8427e" | 0 | STANDARD | inventory.csv | hrasQC8zpVfHThaKL5mZ9g | True | 2024-02-11 18:30:18+00:00 | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
| 1 | "9735f75dc2f10f885b3531d40bbef7b7" | 1709 | STANDARD | inventory.csv | 8ZflZE-hRLua_o8GWkLQzA | False | 2024-02-11 18:30:03+00:00 | webfile | 75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07... |
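
Recovering an older version means passing its `VersionId` to `get_object`. The sketch below picks the most recent version that is not the latest one; `previous_version_id` is a hypothetical helper that works on the `Versions` list returned by `list_object_versions`.

```python
def previous_version_id(versions):
    """Return the VersionId of the newest version that is not the latest, or None."""
    older = [v for v in versions if not v["IsLatest"]]
    if not older:
        return None
    return max(older, key=lambda v: v["LastModified"])["VersionId"]
```

The result can then be used as `s3.get_object(Bucket=bucket_name_news, Key=file_name, VersionId=previous_version_id(versions['Versions']))` to read the earlier copy of the file.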


### 🗑️ Create a static site using an S3 bucket

This section uses the command line instead of Boto3, which requires installing the `awscli-local` tool, designed specifically for use with **LocalStack**.

The `awscli-local` tool provides the `awslocal` command, a wrapper around the AWS CLI that automatically redirects commands to the local LocalStack endpoints instead of the real AWS endpoints.

In [None]:
# install awslocal to use the cli to interact with localstack
!pip3.11 install awscli-local

In [118]:
# the following command creates a static website in s3
!awslocal s3api create-bucket --bucket docs-web
# add the website configuration
!awslocal s3 website s3://docs-web/ --index-document index.html --error-document error.html
# synchronize the static site with the S3 bucket
!awslocal s3 sync static-site s3://docs-web

#------------------------------------------------------------------------------------------

# If you are using localstack, you can access the website using the following url
#  http://docs-web.s3-website.localhost.localstack.cloud:4566/


#------------------------------------------------------------------------------------------

{
    "Location": "/docs-web"
}
upload: static-site/error.html to s3://docs-web/error.html       
upload: static-site/index.html to s3://docs-web/index.html       
upload: static-site/img/img_localstack.png to s3://docs-web/img/img_localstack.png


Url Site: http://docs-web.s3-website.localhost.localstack.cloud:4566/

![](img/localstack-static-site.png)
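
The website configuration can also be applied from Python with the client's `put_bucket_website` method. The following is a minimal sketch of that alternative, not the approach used in this tutorial; the helper names are illustrative, and uploading the site files would still need a separate step.

```python
def website_configuration(index_document="index.html", error_document="error.html"):
    """Build the WebsiteConfiguration dict accepted by put_bucket_website."""
    return {
        "IndexDocument": {"Suffix": index_document},
        "ErrorDocument": {"Key": error_document},
    }


def enable_static_website(s3_client, bucket_name):
    """Apply the website configuration to an existing bucket."""
    s3_client.put_bucket_website(
        Bucket=bucket_name,
        WebsiteConfiguration=website_configuration(),
    )
```

For example, `enable_static_website(s3, 'docs-web')` mirrors the `awslocal s3 website` command shown above.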

# 📚 References
If you want to learn...

* [AWS Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#installation)
* [LocalStack](https://docs.localstack.cloud/overview/)
* [API:OkSurf News](https://ok.surf/#endpoints)