# AAPL Data Loading Test with S3 Stock Client

This notebook demonstrates how to load Apple (AAPL) stock data from our S3 parquet storage using `S3StockDataClient`, and runs the same queries through `DuckDBStockClient`.

## Setup and Configuration

In [1]:
import pandas as pd
import logging

# Import the stock data clients
from clients import DuckDBStockClient
from clients.s3_stock_client import S3StockDataClient

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

print("Setup complete!")

Setup complete!


## Initialize the Stock Clients

Create the clients that read the processed stock data. The DuckDB client is created and queried first; the S3 client is then configured to connect to our S3 bucket.

In [2]:
client = DuckDBStockClient()

INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials


In [7]:
client.get_data(tickers='AAPL', years=2025)

Unnamed: 0,window_start_et,open,high,low,close,volume,transactions,ticker,year
0,2025-03-12 09:30:00,220.1400,220.500,219.270,219.8450,1376617,15261,AAPL,2025
1,2025-03-12 09:31:00,219.7601,220.330,219.560,219.6600,294749,4466,AAPL,2025
2,2025-03-12 09:32:00,219.6000,220.245,219.600,219.9700,325246,4759,AAPL,2025
3,2025-03-12 09:33:00,219.9700,220.500,219.970,220.1650,388737,4894,AAPL,2025
4,2025-03-12 09:34:00,220.1800,221.300,220.130,220.4365,523188,5792,AAPL,2025
...,...,...,...,...,...,...,...,...,...
27295,2025-06-20 15:55:00,200.5600,201.700,200.560,201.3500,1799984,16704,AAPL,2025
27296,2025-06-20 15:56:00,201.3300,201.420,201.025,201.0900,757154,7816,AAPL,2025
27297,2025-06-20 15:57:00,201.0900,201.170,200.860,200.9750,796178,8860,AAPL,2025
27298,2025-06-20 15:58:00,200.9800,201.090,200.930,200.9500,885863,8519,AAPL,2025
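
Before relying on the loaded bars, it can help to check that each row is internally consistent. The following is a minimal sketch, not part of either client: `find_invalid_bars` is a hypothetical helper, and the sample frame is built by hand from the first three rows shown above so that the snippet runs without S3 access.

```python
import pandas as pd

# Sample rows copied from the output above (first three 1-minute bars).
sample = pd.DataFrame(
    {
        "window_start_et": pd.to_datetime(
            ["2025-03-12 09:30:00", "2025-03-12 09:31:00", "2025-03-12 09:32:00"]
        ),
        "open": [220.14, 219.7601, 219.60],
        "high": [220.50, 220.33, 220.245],
        "low": [219.27, 219.56, 219.60],
        "close": [219.845, 219.66, 219.97],
        "volume": [1376617, 294749, 325246],
    }
)


def find_invalid_bars(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose OHLC values or volume are inconsistent (hypothetical helper)."""
    # The high must be at least as large as every other price in the bar.
    high_ok = df["high"] >= df[["open", "close", "low"]].max(axis=1)
    # The low must be no larger than every other price in the bar.
    low_ok = df["low"] <= df[["open", "close", "high"]].min(axis=1)
    # Volume cannot be negative.
    volume_ok = df["volume"] >= 0
    return df[~(high_ok & low_ok & volume_ok)]


invalid = find_invalid_bars(sample)
print(f"{len(invalid)} invalid bars out of {len(sample)}")
```

The same function could be applied to the full frame returned by `client.get_data(...)`, assuming it carries the column names shown in the output.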


In [4]:
# Initialize the S3 client
client_s3 = S3StockDataClient(
    bucket="anawatp-us-stocks",
    base_prefix="parquet", 
    aws_region="us-west-2",  # Match the region from our ETL job
    cache_enabled=False  # Disable caching for testing
)

INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials


## Explore Available Data

Let's see which tickers are available in our dataset.

In [8]:
client_s3.get_available_tickers()

['A',
 'AA',
 'AAA',
 'AAAU',
 'AACB',
 'AACBR',
 'AACBU',
 'AACG',
 'AACIU',
 'AACT',
 'AACT.U',
 'AACT.WS',
 'AADI',
 'AADR',
 'AAL',
 'AAM',
 'AAM.U',
 'AAM.WS',
 'AAME',
 'AAMI',
 'AAOI',
 'AAON',
 'AAP',
 'AAPB',
 'AAPD',
 'AAPG',
 'AAPL',
 'AAPR',
 'AAPU',
 'AAPW',
 'AAPX',
 'AAPY',
 'AARD',
 'AAT',
 'AAUC',
 'AAVM',
 'AAXJ',
 'AB',
 'ABAT',
 'ABBV',
 'ABCB',
 'ABCL',
 'ABCS',
 'ABEO',
 'ABEQ',
 'ABEV',
 'ABFL',
 'ABG',
 'ABHY',
 'ABIG',
 'ABL',
 'ABLD',
 'ABLG',
 'ABLLL',
 'ABLLW',
 'ABLS',
 'ABLV',
 'ABLVW',
 'ABM',
 'ABNB',
 'ABNY',
 'ABOS',
 'ABOT',
 'ABP',
 'ABPWW',
 'ABR',
 'ABRpD',
 'ABRpE',
 'ABRpF',
 'ABSI',
 'ABT',
 'ABTS',
 'ABUS',
 'ABVC',
 'ABVE',
 'ABVEW',
 'ABVX',
 'ABXB',
 'AC',
 'ACA',
 'ACAD',
 'ACB',
 'ACCD',
 'ACCO',
 'ACCS',
 'ACDC',
 'ACEL',
 'ACES',
 'ACET',
 'ACGL',
 'ACGLN',
 'ACGLO',
 'ACGR',
 'ACHC',
 'ACHL',
 'ACHR',
 'ACHR.WS',
 'ACHV',
 'ACI',
 'ACIC',
 'ACIO',
 'ACIU',
 'ACIW',
 'ACLC',
 'ACLO',
 'ACLS',
 'ACLX',
 'ACM',
 'ACMR',
 'ACN',
 'ACNB',
 ...
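
The ticker list mixes plain symbols with entries such as `AACT.WS`, `AAM.U`, and `ABRpD`. A quick way to confirm that a symbol is present and to narrow the list is sketched below. The list here is a hand-copied subset of the output above, and treating all-uppercase alphabetic symbols as "plain" is an assumption about the naming convention, not something the client guarantees.

```python
# A hand-copied subset of the ticker list shown above.
tickers = [
    "AACT", "AACT.U", "AACT.WS", "AAP", "AAPB", "AAPD",
    "AAPL", "ABRpD", "ACHR", "ACHR.WS",
]

# Membership test: is the symbol we want to load available?
has_aapl = "AAPL" in tickers

# Assumption: symbols with a dot suffix or a lowercase letter denote
# non-common share classes, so keep only all-uppercase alphabetic symbols.
plain = [t for t in tickers if t.isalpha() and t.isupper()]

# Prefix search, e.g. to find symbols related to AAPL.
aap_prefixed = [t for t in tickers if t.startswith("AAP")]

print(has_aapl)
print(plain)
print(aap_prefixed)
```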

## Examine Data Structure

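Load the 2025 AAPL data through the S3 client and inspect the returned DataFrame. Note that the rows are not sorted by `window_start_et`, and that `year` and `ticker` appear in a different column order than in the DuckDB client's output above.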

In [5]:
result = client_s3.get_data(tickers='AAPL', years=2025)
result


Unnamed: 0,window_start_et,open,high,low,close,volume,transactions,year,ticker
0,2025-05-28 09:30:00,200.590,201.53,200.5000,201.34,931287,12470,2025,AAPL
1,2025-04-07 09:30:00,177.200,177.97,176.1000,177.31,7591929,53401,2025,AAPL
2,2025-05-28 09:31:00,201.270,201.77,201.2500,201.68,259238,3549,2025,AAPL
3,2025-04-07 09:31:00,177.355,177.98,175.6200,177.60,1637954,20621,2025,AAPL
4,2025-05-28 09:32:00,201.681,202.69,201.6810,202.61,425147,5212,2025,AAPL
...,...,...,...,...,...,...,...,...,...
27295,2025-04-15 15:55:00,202.590,202.77,202.4300,202.48,234190,3605,2025,AAPL
27296,2025-04-15 15:56:00,202.470,202.54,202.3601,202.51,167573,2532,2025,AAPL
27297,2025-04-15 15:57:00,202.515,202.53,202.2500,202.28,200028,2820,2025,AAPL
27298,2025-04-15 15:58:00,202.285,202.33,202.1300,202.14,291442,3747,2025,AAPL
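
Because the S3 result is not in chronological order and its column order differs from the DuckDB result, the two frames need to be normalized before they can be compared. The sketch below shows one way to do that with plain pandas. `normalize` and `COLUMNS` are hypothetical names, the sample rows are copied from the output above, and `other` is only a stand-in for a second client's result (the same rows in a different row and column order).

```python
import pandas as pd

# Shared column order used for comparison (taken from the DuckDB client's output).
COLUMNS = [
    "window_start_et", "open", "high", "low", "close",
    "volume", "transactions", "ticker", "year",
]


def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Sort by ticker and timestamp, reset the index, and fix the column order."""
    out = df.copy()
    out["window_start_et"] = pd.to_datetime(out["window_start_et"])
    out = out.sort_values(["ticker", "window_start_et"]).reset_index(drop=True)
    return out[COLUMNS]


# First four rows of the S3 client's output above.
s3_sample = pd.DataFrame(
    {
        "window_start_et": [
            "2025-05-28 09:30:00", "2025-04-07 09:30:00",
            "2025-05-28 09:31:00", "2025-04-07 09:31:00",
        ],
        "open": [200.59, 177.20, 201.27, 177.355],
        "high": [201.53, 177.97, 201.77, 177.98],
        "low": [200.50, 176.10, 201.25, 175.62],
        "close": [201.34, 177.31, 201.68, 177.60],
        "volume": [931287, 7591929, 259238, 1637954],
        "transactions": [12470, 53401, 3549, 20621],
        "year": [2025] * 4,
        "ticker": ["AAPL"] * 4,
    }
)

# Stand-in for another client's result: same rows, reversed, columns reordered.
other = s3_sample[COLUMNS].iloc[::-1]

a = normalize(s3_sample)
b = normalize(other)

# Raises AssertionError if the normalized frames differ.
pd.testing.assert_frame_equal(a, b)
print(a[["window_start_et", "close"]])
```

With the real data, `normalize(result)` and the normalized DuckDB frame could be compared the same way, assuming both clients return the same set of columns and compatible dtypes.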


## Test Date Range Filtering

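Let's test loading AAPL data for a specific date range, here through the DuckDB client (`client`). As with the full-year load, the returned rows are not in chronological order.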

In [3]:
client.get_data(tickers='AAPL', start_date='2025-03-12', end_date='2025-03-15')

Unnamed: 0,window_start_et,open,high,low,close,volume,transactions,year,ticker
5460,2025-03-13 09:30:00,215.950,216.03,214.90,215.1732,1481429,15341,2025,AAPL
5462,2025-03-13 09:31:00,215.150,215.65,215.02,215.1300,244927,3009,2025,AAPL
5464,2025-03-13 09:32:00,215.170,215.66,214.69,214.7600,265537,4485,2025,AAPL
5466,2025-03-13 09:33:00,214.790,215.10,214.37,214.6050,407137,5654,2025,AAPL
5468,2025-03-13 09:34:00,214.615,214.96,214.37,214.8800,313900,4605,2025,AAPL
...,...,...,...,...,...,...,...,...,...
24565,2025-03-12 15:55:00,216.800,217.27,216.80,217.1200,545098,7065,2025,AAPL
24566,2025-03-12 15:56:00,217.100,217.19,216.91,217.1000,411316,4686,2025,AAPL
24567,2025-03-12 15:57:00,217.100,217.17,216.97,217.1700,391577,5161,2025,AAPL
24568,2025-03-12 15:58:00,217.165,217.27,217.07,217.2000,479251,5999,2025,AAPL
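
To confirm that a date-range query returned only rows inside the requested window, the timestamps can be checked directly. This is a minimal sketch: `within_range` is a hypothetical helper, the sample rows are copied from the output above, and treating `end_date` as inclusive of the whole day is an assumption about the client's semantics that the output does not confirm.

```python
import pandas as pd


def within_range(df: pd.DataFrame, start_date: str, end_date: str,
                 column: str = "window_start_et") -> bool:
    """Return True if every timestamp lies in [start_date, end_date] (hypothetical helper)."""
    ts = pd.to_datetime(df[column])
    start = pd.Timestamp(start_date)
    # Assumption: end_date is inclusive, so accept anything before the next midnight.
    end = pd.Timestamp(end_date) + pd.Timedelta(days=1)
    return bool(((ts >= start) & (ts < end)).all())


# A few rows copied from the date-range output above.
sample = pd.DataFrame(
    {
        "window_start_et": [
            "2025-03-13 09:30:00", "2025-03-13 09:31:00",
            "2025-03-12 15:57:00", "2025-03-12 15:58:00",
        ],
        "close": [215.1732, 215.13, 217.17, 217.20],
    }
)

print(within_range(sample, "2025-03-12", "2025-03-15"))
print(within_range(sample, "2025-03-13", "2025-03-15"))
```

The first call checks the window used in the query above; the second uses a narrower window that excludes the 2025-03-12 rows, which shows that the check can fail.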
