# Boto3 & LocalStudio

![Boto3 & LocalStudio image](img/s3-10.png)


## 🏠 What is localstack?

**Localstack** is a platform that provides a local version of several cloud services, allowing you to simulate a development environment with AWS services. This allows you to debug and refine your code before deploying it to a production environment. For this reason, Localstack is a valuable tool for emulating essential AWS services such as object storage and message queues, among others.

Also, **Localstack** serves as an effective tool for learning to implement and deploy services using a Docker container without the need for an AWS account or the use of your credit card. 
In this tutorial, we create a Localstack container to implement the main functionalities of S3 services.

---

## What is boto3?

**`Boto3`** is a 🐍 Python library that allows the integration with AWS services, facilitating various tasks such as creation, management, and configuration of these services.

There are two primary implementations within Boto3: 
* **Resource implementation**: provides a higher-level, object-oriented interface, abstracting away low-level details and offering simplified interactions with AWS services. 
* **Client implementation**: offers a lower-level, service-oriented interface, providing more granular control and flexibility for interacting with AWS services directly.


---

## Prerequisites
Before you begin, ensure that you have the following installed:

* 🐳 Docker
* 🐙 Docker Compose


---

### 🚀 Build and run the Docker Compose environment

#### 1. Clone the repository
 ```bash
   git clone https://github.com/r0mymendez/LocalStack-boto3.git
   cd LocalStack-boto3
```
#### 2. Build an run the docker compose 
  
`docker-compose -f docker-compose.yaml up --build`

---

### 🚀 Using LocalStack with Boto3: A Step-by-Step Guide
### 🛠️ Install Boto3

```!pip install boto3```




In [2]:
import boto3
import json 
import requests
import pandas as pd
from datetime import datetime
import io
import os

### 🛠️ Create a session using the localstack endpoint
The following code snippet initializes a client for accessing the S3 service using the LocalStack endpoint.

In [3]:

s3 = boto3.client(
    service_name='s3',
    aws_access_key_id='test',
    aws_secret_access_key='test',
    endpoint_url='http://localhost:4566',
)

### 🛠️ Create new buckets
Below is the code snippet to create new buckets using the Boto3 library

In [4]:
# create buckets
bucket_name_news = 'news'
bucket_name_config = 'news-config'

s3.create_bucket(Bucket= bucket_name_news)
s3.create_bucket(Bucket=bucket_name_config)

{'ResponseMetadata': {'RequestId': 'a2b07001-d2ac-43f5-9ba1-c819c1ae6ff0',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'TwistedWeb/24.3.0',
   'date': 'Fri, 10 May 2024 21:36:23 GMT',
   'access-control-allow-origin': '*',
   'access-control-allow-methods': 'HEAD,GET,PUT,POST,DELETE,OPTIONS,PATCH',
   'access-control-allow-headers': 'authorization,cache-control,content-length,content-md5,content-type,etag,location,x-amz-acl,x-amz-content-sha256,x-amz-date,x-amz-request-id,x-amz-security-token,x-amz-tagging,x-amz-target,x-amz-user-agent,x-amz-version-id,x-amzn-requestid,x-localstack-target,amz-sdk-invocation-id,amz-sdk-request',
   'access-control-expose-headers': 'etag,x-amz-version-id',
   'vary': 'Origin',
   'location': '/news-config',
   'x-amz-request-id': 'a2b07001-d2ac-43f5-9ba1-c819c1ae6ff0',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VL

### 📋 List all buckets
After creating a bucket, you can use the following code to list all the buckets available at your endpoint.

In [7]:
# List all buckets
response = s3.list_buckets()
pd.json_normalize(response['Buckets'])

Unnamed: 0,Name,CreationDate
0,news,2024-05-10 21:36:23+00:00
1,news-config,2024-05-10 21:36:23+00:00


### 📤 Upload the JSON file to s3
Once we extract data from the API to gather information about news topics, the following code generates a JSON file and uploads it to the S3 bucket previously created.

In [9]:
# invoke the config news
url = 'https://ok.surf/api/v1/cors/news-section-names'
response = requests.get(url)
if response.status_code==200:
    data = response.json()
    # ad json file to s3
    print('data', data)
    # upload the data to s3
    s3.put_object(Bucket=bucket_name_config, Key='news-section/data_config.json', Body=json.dumps(data))



data ['US', 'World', 'Business', 'Technology', 'Entertainment', 'Sports', 'Science', 'Health']


### 📋 List all objects
Now, let's list all the objects stored in our bucket. Since we might have stored a JSON file in the previous step, we'll include code to retrieve all objects from the bucket.

In [10]:
def list_objects(bucket_name):
    response = s3.list_objects(Bucket=bucket_name)
    return pd.json_normalize(response['Contents'])

In [11]:
# list all objects in the bucket
list_objects(bucket_name=bucket_name_config)

Unnamed: 0,Key,LastModified,ETag,Size,StorageClass,Owner.DisplayName,Owner.ID
0,news-section/data_config.json,2024-05-10 21:41:35+00:00,"""d61029b184d21dae1febcb46062216d3""",89,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...


###  📄 Upload multiple CSV files to s3 
In the following code snippet, we will request another method from the API to extract news for each topic. Subsequently, we will create different folders in the bucket to save CSV files containing the news for each topic. This code enables you to save multiple files in the same bucket while organizing them into folders based on the topic and the date of the data request.

In [12]:
# Request the news feed API Method
url = 'https://ok.surf/api/v1/news-feed' 
response = requests.get(url)
if response.status_code==200:
    data = response.json()

# Add the json file to s3
folder_dt =  f'dt={datetime.now().strftime("%Y%m%d")}'

for item in data.keys():
    tmp = pd.json_normalize(data[item])
    tmp['section'] = item   
    tmp['download_date'] = datetime.now()
    tmp['date'] = pd.to_datetime(tmp['download_date']).dt.date
    path = f"s3://{bucket_name_news}/{item}/{folder_dt}/data_{item}_news.csv"

    # upload multiple files to s3
    bytes_io = io.BytesIO()
    tmp.to_csv(bytes_io, index=False)
    bytes_io.seek(0)
    s3.put_object(Bucket=bucket_name_news, Key=path, Body=bytes_io)



In [13]:
# list all objects in the bucket
list_objects(bucket_name=bucket_name_news)


Unnamed: 0,Key,LastModified,ETag,Size,StorageClass,Owner.DisplayName,Owner.ID
0,s3://news/Business/dt=20240511/data_Business_n...,2024-05-10 21:46:55+00:00,"""c9cfff1962c8959c93aa60e644e68cd3""",47523,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
1,s3://news/Entertainment/dt=20240511/data_Enter...,2024-05-10 21:46:55+00:00,"""9c2a7cb7fe97d95dc6759f5503c46e19""",25610,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
2,s3://news/Health/dt=20240511/data_Health_news.csv,2024-05-10 21:46:55+00:00,"""d97d1d0fd3c2d70fd9f0f2ed0636de5e""",44806,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
3,s3://news/Science/dt=20240511/data_Science_new...,2024-05-10 21:46:55+00:00,"""85e00f8ff2e116ae7436eaf1c2974576""",44191,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
4,s3://news/Sports/dt=20240511/data_Sports_news.csv,2024-05-10 21:46:55+00:00,"""9152f7a9de110bb08b92d38525ca9317""",45166,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
5,s3://news/Technology/dt=20240511/data_Technolo...,2024-05-10 21:46:55+00:00,"""b25168c99e9b5abf785461451d2a0006""",14151,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
6,s3://news/US/dt=20240511/data_US_news.csv,2024-05-10 21:46:55+00:00,"""2542a4ee73d2bbc32a9823c9d91bb111""",46965,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
7,s3://news/World/dt=20240511/data_World_news.csv,2024-05-10 21:46:55+00:00,"""6aba1e054eb1e750c43219df3a19cf2c""",27795,STANDARD,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...


###  📄 Read csv file from s3 
In this section, we aim to read a file containing news about technology topics from S3. To accomplish this, we first retrieve the name of the file in the bucket. Then, we read this file and print the contents as a pandas dataframe.

In [14]:
# Get the technology file
files = list_objects(bucket_name=bucket_name_news)
technology_file = files[files['Key'].str.find('Technology')>=0]['Key'].values[0]
print('file_name',technology_file)

file_name s3://news/Technology/dt=20240511/data_Technology_news.csv


In [15]:
# get the file from s3 using boto3
obj = s3.get_object(Bucket=bucket_name_news, Key=technology_file)
data_tech = pd.read_csv(obj['Body'])

data_tech

Unnamed: 0,link,og,source,source_icon,title,section,download_date,date
0,https://news.google.com/articles/CBMiPGh0dHBzO...,https://encrypted-tbn3.gstatic.com/images?q=tb...,MacRumors,https://encrypted-tbn1.gstatic.com/faviconV2?u...,"Apple Apologizes for 'Crush' iPad Pro Ad, Won'...",Technology,2024-05-11 00:46:55.444180,2024-05-11
1,https://news.google.com/articles/CBMiXGh0dHBzO...,https://encrypted-tbn2.gstatic.com/images?q=tb...,The Verge,https://encrypted-tbn2.gstatic.com/faviconV2?u...,Sam Altman shoots down reports of search engin...,Technology,2024-05-11 00:46:55.444180,2024-05-11
2,https://news.google.com/articles/CBMiRmh0dHBzO...,https://encrypted-tbn2.gstatic.com/images?q=tb...,The New York Times,https://encrypted-tbn2.gstatic.com/faviconV2?u...,Apple Will Revamp Siri to Catch Up to Its Chat...,Technology,2024-05-11 00:46:55.444180,2024-05-11
3,https://news.google.com/articles/CBMiQmh0dHBzO...,https://encrypted-tbn2.gstatic.com/images?q=tb...,9to5Mac,https://encrypted-tbn0.gstatic.com/faviconV2?u...,M4 iPad Pro lacks always-on display despite OL...,Technology,2024-05-11 00:46:55.444180,2024-05-11
4,https://news.google.com/articles/CBMigAFodHRwc...,https://encrypted-tbn0.gstatic.com/images?q=tb...,IGN,https://encrypted-tbn1.gstatic.com/faviconV2?u...,"100 Days Later, Neuralink’s First Human Patien...",Technology,2024-05-11 00:46:55.444180,2024-05-11
5,https://news.google.com/articles/CBMiR2h0dHBzO...,https://encrypted-tbn0.gstatic.com/images?q=tb...,9to5Google,https://encrypted-tbn3.gstatic.com/faviconV2?u...,YouTube for Android TV gets new sidebar animat...,Technology,2024-05-11 00:46:55.444180,2024-05-11
6,https://news.google.com/articles/CBMiUWh0dHBzO...,https://encrypted-tbn1.gstatic.com/images?q=tb...,Gizmodo,https://encrypted-tbn1.gstatic.com/faviconV2?u...,Nintendo Doesn't Want to Deal With X/Twitter E...,Technology,2024-05-11 00:46:55.444180,2024-05-11
7,https://news.google.com/articles/CBMiXGh0dHBzO...,https://encrypted-tbn2.gstatic.com/images?q=tb...,CNET,https://encrypted-tbn3.gstatic.com/faviconV2?u...,Put a 512GB Google Pixel 7 Pro in Your Pocket ...,Technology,2024-05-11 00:46:55.444180,2024-05-11
8,https://news.google.com/articles/CBMiVmh0dHBzO...,https://encrypted-tbn1.gstatic.com/images?q=tb...,Android Authority,https://encrypted-tbn3.gstatic.com/faviconV2?u...,Google Messages will soon hide texts from bloc...,Technology,2024-05-11 00:46:55.444180,2024-05-11
9,https://news.google.com/articles/CBMifmh0dHBzO...,https://encrypted-tbn1.gstatic.com/images?q=tb...,Game Informer,https://encrypted-tbn2.gstatic.com/faviconV2?u...,Xbox President Addresses Bethesda Studio Closu...,Technology,2024-05-11 00:46:55.444180,2024-05-11


 ### 🏷️ Add tags to the bucket
 When creating a resource in the cloud, it is considered a best practice to add tags for organizing resources, controlling costs, or applying security policies based on these labels. The following code demonstrates how to add tags to a bucket using a method from the boto3 library.

In [16]:
s3.put_bucket_tagging(
    Bucket=bucket_name_news,
    Tagging={
        'TagSet': [
            {
                'Key': 'Environment',
                'Value': 'Test'
            },
            {
                'Key': 'Project',
                'Value': 'Localstack+Boto3'
            }
        ]
    }
)

{'ResponseMetadata': {'RequestId': '86e6d957-b63a-4935-bfd9-b68bc0d9b70c',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'server': 'TwistedWeb/24.3.0',
   'date': 'Fri, 10 May 2024 21:50:40 GMT',
   'x-amz-request-id': '86e6d957-b63a-4935-bfd9-b68bc0d9b70c',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234='},
  'RetryAttempts': 0}}

In [17]:
# get the tagging
pd.json_normalize(s3.get_bucket_tagging(Bucket=bucket_name_news)['TagSet'])

Unnamed: 0,Key,Value
0,Environment,Test
1,Project,Localstack+Boto3


### 🔄 Versioning in the bucket
Another good practice to apply is enabling versioning for your bucket. Versioning provides a way to recover and keep different versions of the same object. In the following code, we will create a file with the inventory of objects in the bucket and save the file twice. 

In [18]:

# allow versioning in the bucket
s3.put_bucket_versioning(
    Bucket=bucket_name_news,
    VersioningConfiguration={
        'Status': 'Enabled'
    }
)



{'ResponseMetadata': {'RequestId': '4560040d-028c-4b46-aa58-54577669aa8f',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'TwistedWeb/24.3.0',
   'date': 'Fri, 10 May 2024 21:53:40 GMT',
   'x-amz-request-id': '4560040d-028c-4b46-aa58-54577669aa8f',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'content-length': '0'},
  'RetryAttempts': 0}}

In [22]:
# Add new file to the bucket

# file name
file_name = 'inventory.csv'

# list all objects in the bucket
files = list_objects(bucket_name=bucket_name_news)
bytes_io = io.BytesIO()
files.to_csv(bytes_io, index=False)
bytes_io.seek(0)
# upload the data to s3
s3.put_object(Bucket=bucket_name_news, Key=file_name, Body=bytes_io)



{'ResponseMetadata': {'RequestId': 'a9fc51fc-d56d-40a4-a342-f39060ac2f96',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'TwistedWeb/24.3.0',
   'date': 'Fri, 10 May 2024 21:57:58 GMT',
   'etag': '"98a5cdd5e6d4e6726fbc582591c523c0"',
   'x-amz-server-side-encryption': 'AES256',
   'x-amz-version-id': 'oZZ9rpTx5fC6EbVWwEAOcQ',
   'x-amz-request-id': 'a9fc51fc-d56d-40a4-a342-f39060ac2f96',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'content-length': '0'},
  'RetryAttempts': 0},
 'ETag': '"98a5cdd5e6d4e6726fbc582591c523c0"',
 'ServerSideEncryption': 'AES256',
 'VersionId': 'oZZ9rpTx5fC6EbVWwEAOcQ'}

In [20]:
# add again the same file
s3.put_object(Bucket=bucket_name_news, Key=file_name, Body=bytes_io)

{'ResponseMetadata': {'RequestId': '63a47799-d56c-4208-a944-7796fa6ab813',
  'HostId': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'TwistedWeb/24.3.0',
   'date': 'Fri, 10 May 2024 21:57:12 GMT',
   'etag': '"d41d8cd98f00b204e9800998ecf8427e"',
   'x-amz-server-side-encryption': 'AES256',
   'x-amz-version-id': 'yzmhajY7VxnbFeIc7dR5cQ',
   'x-amz-request-id': '63a47799-d56c-4208-a944-7796fa6ab813',
   'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
   'content-length': '0'},
  'RetryAttempts': 0},
 'ETag': '"d41d8cd98f00b204e9800998ecf8427e"',
 'ServerSideEncryption': 'AES256',
 'VersionId': 'yzmhajY7VxnbFeIc7dR5cQ'}

In [23]:
# List all the version of the object
versions = s3.list_object_versions(Bucket=bucket_name_news, Prefix=file_name)

pd.json_normalize(versions['Versions'])

Unnamed: 0,ETag,Size,StorageClass,Key,VersionId,IsLatest,LastModified,Owner.DisplayName,Owner.ID
0,"""98a5cdd5e6d4e6726fbc582591c523c0""",1882,STANDARD,inventory.csv,oZZ9rpTx5fC6EbVWwEAOcQ,True,2024-05-10 21:57:58+00:00,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
1,"""d41d8cd98f00b204e9800998ecf8427e""",0,STANDARD,inventory.csv,yzmhajY7VxnbFeIc7dR5cQ,False,2024-05-10 21:57:12+00:00,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...
2,"""49f0a741c10b856635b2a2a413d930cc""",1718,STANDARD,inventory.csv,2h7LcY4Yh9D3m4BkAPUQ2A,False,2024-05-10 21:56:09+00:00,webfile,75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c07...


### 🗑️ Create a static site using s3 bucket

In this section, we need to utilize a different command, which requires prior installation of the `awscli-local` tool specifically designed for use with **LocalStack**.

The `awscli-local` tool facilitates developers in seamlessly engaging with the **LocalStack** instance, because you can automatically redirecting commands to local endpoints instead of real AWS endpoints.

In [24]:
# install awslocal to use the cli to interact with localstack
!pip3.11 install awscli-local

Collecting awscli-local
  Downloading awscli-local-0.22.0.tar.gz (11 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting localstack-client
  Downloading localstack-client-2.5.tar.gz (10 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: awscli-local, localstack-client
  Building wheel for awscli-local (setup.py): started
  Building wheel for awscli-local (setup.py): finished with status 'done'
  Created wheel for awscli-local: filename=awscli_local-0.22.0-py3-none-any.whl size=12038 sha256=703ebc6c06fabb82a48c4c70ef0e6207075a37f57125373f7ea6bf39f0d0f155
  Stored in directory: c:\users\vm140\appdata\local\pip\cache\wheels\a9\b7\36\a10e7c94446e33854d2f6e03a51daf6396f46007fb3cad1f12
  Building wheel for localstack-client (setup.py): started
  Building wheel for localstack-client (setup.py): finished with status 'done'
  Created whe


[notice] A new release of pip available: 22.3.1 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip


In [27]:
# the following command creates a static website in s3
!awslocal s3api create-bucket --bucket docs-web
# add the website configuration
!awslocal s3 website s3://docs-web/ --index-document index.html --error-document error.html
# syncronize the static site with the s3 bucket
!awslocal s3 sync static-site s3://docs-web

#------------------------------------------------------------------------------------------

# If you are using localstack, you can access the website using the following url
#  http://docs-web.s3-website.localhost.localstack.cloud:4566/


#------------------------------------------------------------------------------------------

{
    "Location": "/docs-web"
}
Completed 698 Bytes/70.0 KiB (52.5 KiB/s) with 3 file(s) remaining
upload: static-site\error.html to s3://docs-web/error.html        
Completed 698 Bytes/70.0 KiB (52.5 KiB/s) with 2 file(s) remaining
Completed 1.8 KiB/70.0 KiB (107.9 KiB/s) with 2 file(s) remaining 
upload: static-site\index.html to s3://docs-web/index.html        
Completed 1.8 KiB/70.0 KiB (107.9 KiB/s) with 1 file(s) remaining
Completed 70.0 KiB/70.0 KiB (950.5 KiB/s) with 1 file(s) remaining
upload: static-site\img\img_localstack.png to s3://docs-web/img/img_localstack.png


Url Site: http://docs-web.s3-website.localhost.localstack.cloud:4566/

![](img/localstack-static-site.png)

# 📚 References
If you want to learn...

* [AWS Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#installation)
* [LocalStack](https://docs.localstack.cloud/overview/)
* [API:OkSurf News](https://ok.surf/#endpoints)