In [1]:
import boto3  # AWS SDK for Python
from botocore.config import Config  # Client configuration options for the SDK
from dotenv import load_dotenv  # Loads a .env file containing protected information
import os  # Access to environment variables

In [2]:
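# Read the .env file and add its entries to the process environment
load_dotenv()
# Copy the credentials from the environment into Python variables
# NOTE: the variable names are blank here; fill in the names used in your .env file,
# otherwise the lookups find nothing and both values are None
ACCESS_KEY = os.getenv("")
SECRET_KEY = os.getenv("")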

In [3]:
# Initialize a session using the AWS keys
session = boto3.Session( # A Session stores the credentials and configuration shared by the clients created from it
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)

In [4]:
# Create a client from the session and specify the endpoint (where the data is located)
s3 = session.client(
    's3', # Connect to S3 (Simple Storage Service); other AWS service names can be passed here
    endpoint_url='https://files.polygon.io', # Base URL of the S3-compatible service to access
    config=Config(signature_version='s3v4'), # Sign requests with AWS Signature Version 4: each request carries a
                                             # signature derived from the secret key, so the key itself is never sent
)
# The code above is everything needed to access the S3 flat files; from here you can use commands such as list objects or get object

In [5]:
# Initialize a paginator for listing objects
paginator = s3.get_paginator('list_objects_v2')

## 🌐 Understanding Requests and Paginators in S3 (Conceptual Overview)

### 📤 What is a Request?

A **request** is a single operation sent from your client (e.g., Python code) to a server (e.g., AWS S3 or Polygon’s S3-compatible endpoint). For example, when you ask to list files in a folder-like structure in a bucket, that is a request.

S3’s `list_objects_v2` request returns at most 1,000 objects (files) per response. If more objects exist, the response contains only the first "page", sets `IsTruncated` to true, and includes a `NextContinuationToken` that is used to request the next page.
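
As a rough illustration of a single request, the sketch below sends one `list_objects_v2` call and reports whether more pages remain. The helper name `list_first_page` and its `bucket` and `prefix` arguments are hypothetical; the client is assumed to be one like the `s3` object created earlier.

```python
def list_first_page(s3_client, bucket, prefix=""):
    """Send one list_objects_v2 request and report whether more pages exist.

    `bucket` and `prefix` are placeholders supplied by the caller.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)

    # "Contents" is missing from the response when no object matches the prefix.
    keys = [obj["Key"] for obj in response.get("Contents", [])]

    # IsTruncated is True when more keys remain beyond this page;
    # NextContinuationToken is only present in that case.
    is_truncated = response.get("IsTruncated", False)
    next_token = response.get("NextContinuationToken")
    return keys, is_truncated, next_token
```

Calling `list_first_page(s3, bucket, prefix)` returns the keys from the first page only, so a `True` flag means the listing is incomplete.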

---

### 🔁 What is a Paginator?

A **paginator** is a built-in tool provided by `boto3` that automatically handles repeated requests when the response is paginated. 

Instead of manually tracking continuation tokens and sending new requests, the paginator transparently performs this for you. It lets you iterate over all the data as if it were returned in one big response.
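
A minimal sketch of that iteration, assuming a client like the `s3` object created earlier. The generator name `iter_object_keys` and the `bucket` and `prefix` arguments are hypothetical; `get_paginator` and `paginate` are the boto3 calls doing the work.

```python
def iter_object_keys(s3_client, bucket, prefix=""):
    """Yield every object key under `prefix`, across all pages."""
    paginator = s3_client.get_paginator("list_objects_v2")

    # paginate() returns an iterable of response pages; each follow-up
    # request is only sent when the loop asks for the next page.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no "Contents" entry when it holds no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

Because this is a generator, `for key in iter_object_keys(s3, bucket, prefix): print(key)` starts printing as soon as the first page arrives rather than waiting for the full listing.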

---

### 🪣 S3 Paginators Specifically

S3 paginators are used to retrieve more than 1000 files (objects) from a bucket. You create a paginator specifically for the `list_objects_v2` operation, which is the improved version of the original S3 listing API.

The paginator handles:
- Sending the first request
- Detecting if the result is truncated (chopped off)
- Sending follow-up requests with the continuation token
- Returning each full page of results one after the other
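
To make those steps concrete, here is roughly what the equivalent manual loop could look like. It is a simplified sketch of the idea, not boto3's actual implementation, and the function name `list_all_keys_manually` is hypothetical.

```python
def list_all_keys_manually(s3_client, bucket, prefix=""):
    """Collect every key under `prefix` by following continuation tokens by hand."""
    keys = []
    request_args = {"Bucket": bucket, "Prefix": prefix}

    while True:
        # First pass: no token. Later passes: token from the previous page.
        response = s3_client.list_objects_v2(**request_args)
        keys.extend(obj["Key"] for obj in response.get("Contents", []))

        # Stop once S3 reports that the listing is complete.
        if not response.get("IsTruncated"):
            break

        # Otherwise ask for the next page, starting where this one ended.
        request_args["ContinuationToken"] = response["NextContinuationToken"]

    return keys
```

The paginator removes the need to write and maintain this loop yourself.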

---

### 📌 Key Parameters Used with S3 Paginators

- **Bucket**: The name of the S3 bucket you are querying.
- **Prefix**: A folder-like path that limits the results to objects that begin with that string.
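- **Delimiter** (optional): Used to group files as if they were in folders (commonly set to `/`). Keys that share the same text up to the next delimiter are rolled up into a single `CommonPrefixes` entry instead of being listed individually.
- **PaginationConfig** (optional): Allows advanced control through `PageSize` (objects per request), `MaxItems` (total objects to return), and `StartingToken` (resume from a specific point).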
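
The sketch below combines these parameters to list only the "subfolders" directly under a prefix. The function name `list_subfolders` and the default `page_size` are illustrative assumptions, as are the `bucket` and `prefix` values the caller passes in.

```python
def list_subfolders(s3_client, bucket, prefix="", page_size=500):
    """Return the folder-like prefixes directly under `prefix`."""
    paginator = s3_client.get_paginator("list_objects_v2")
    pages = paginator.paginate(
        Bucket=bucket,
        Prefix=prefix,
        Delimiter="/",  # roll keys up at the next "/" after the prefix
        PaginationConfig={"PageSize": page_size},  # objects requested per call
    )

    folders = []
    for page in pages:
        # With a delimiter set, grouped keys appear under "CommonPrefixes"
        # rather than "Contents".
        folders.extend(entry["Prefix"] for entry in page.get("CommonPrefixes", []))
    return folders
```

A smaller `PageSize` means more requests with smaller responses; it does not change the final result.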

---

### ✅ Summary

- A **request** retrieves a single chunk of data from S3.
- A **paginator** automates multiple requests so you can work with large datasets easily.
- S3 paginators are essential when listing more than 1000 files in a bucket or folder-like structure.
