### Explore Stock Data

In [1]:
import boto3
import json
import pandas as pd
from io import BytesIO

In [2]:
# s3 uri for one of the files to explore
s3_uri = 's3://wildwesttech-bronze-dev/alphavantage_bronze/SPY/dataload=20230217/SPY.json'

In [3]:
# split the S3 URI into bucket and key, then load the object
bucket, key = s3_uri.replace('s3://', '', 1).split('/', 1)
s3 = boto3.resource('s3')
obj = s3.Object(bucket, key)
data = json.load(obj.get()['Body'])

In [4]:
# after viewing the data, grab the Time Series data to work with
TimeSeries = data['Time Series (5min)']

In [5]:
# transpose the rows and columns for a cleaner dataframe
df = pd.DataFrame.from_dict(TimeSeries).transpose()
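To see why the transpose is needed, here is a small sketch using a hypothetical two-timestamp sample shaped like the time series above (the values are made up for illustration). By default `from_dict` treats each outer key as a column, so the timestamps end up as columns until the frame is transposed.

```python
import pandas as pd

# Hypothetical sample shaped like data['Time Series (5min)']; values are illustrative only
sample = {
    "2023-02-16 20:00:00": {"1. open": "1.0", "4. close": "2.0"},
    "2023-02-16 19:55:00": {"1. open": "3.0", "4. close": "4.0"},
}

wide = pd.DataFrame.from_dict(sample)  # outer keys (timestamps) become columns
tall = wide.transpose()                # timestamps become the row index instead

print(list(wide.columns))  # the two timestamps
print(list(tall.columns))  # the field names
```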

In [6]:
# let's rename the columns
columns={'index': 'refresh_datetime',
        '1. open':'open',
        '2. high':'high',
        '3. low':'low',
        '4. close':'close',
        '5. volume':'volume'}

In [7]:
# move the index into a regular column, apply the new names, and take a final look
df = df.reset_index().rename(columns=columns)
df.head()

| | refresh_datetime | open | high | low | close | volume |
|---|---|---|---|---|---|---|
| 0 | 2023-02-16 20:00:00 | 407.31 | 407.31 | 407.26 | 407.3 | 7967 |
| 1 | 2023-02-16 19:55:00 | 407.24 | 407.3 | 407.19 | 407.3 | 22941 |
| 2 | 2023-02-16 19:50:00 | 407.34 | 407.35 | 407.21 | 407.25 | 28018 |
| 3 | 2023-02-16 19:45:00 | 407.32 | 407.4 | 407.3 | 407.38 | 2895 |
| 4 | 2023-02-16 19:40:00 | 407.34 | 407.34 | 407.26 | 407.3 | 3315 |
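One thing the preview does not show is the column types. Assuming the JSON delivers every value as a string (an assumption worth checking with `df.dtypes`), a conversion step along these lines would make the columns usable for arithmetic and time-based filtering. The `sample_df` frame below is a hypothetical stand-in for `df`, reusing two rows from the preview.

```python
import pandas as pd

# Hypothetical stand-in for df, assuming every value arrives as a string
sample_df = pd.DataFrame({
    "refresh_datetime": ["2023-02-16 20:00:00", "2023-02-16 19:55:00"],
    "open": ["407.31", "407.24"],
    "high": ["407.31", "407.3"],
    "low": ["407.26", "407.19"],
    "close": ["407.3", "407.3"],
    "volume": ["7967", "22941"],
})

typed_df = sample_df.copy()

# parse the timestamp strings into datetime64 values
typed_df["refresh_datetime"] = pd.to_datetime(typed_df["refresh_datetime"])

# convert the price and volume strings into numbers
for col in ["open", "high", "low", "close", "volume"]:
    typed_df[col] = pd.to_numeric(typed_df[col])

print(typed_df.dtypes)
```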


### Recap
- We explored this file: `s3://wildwesttech-bronze-dev/alphavantage_bronze/SPY/dataload=20230217/SPY.json`
- The JSON file contained two major components: metadata and time series.
- I am more interested in the time series data.
- After loading the time series data, I noticed that each timestamp was a column and each field (open, high, low, close, volume) was a row, so I transposed the rows and columns.
- I also pulled the index into a regular column and renamed the columns so they would be easier to work with in the future.
- The final dataframe has one row per 5-minute interval, with the timestamp and the open, high, low, close, and volume values as columns.
- Next steps will focus on how to handle stock symbols other than SPY and how to handle multiple load dates.
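
As a starting point for those next steps, the cells above could be folded into two small helpers. This is only a sketch: the names `build_key` and `time_series_to_frame` are hypothetical, and the key layout is assumed to follow the same pattern as the SPY file for every symbol and load date, which has not been verified here.

```python
import pandas as pd

# same renaming used in the exploration above
COLUMNS = {
    "index": "refresh_datetime",
    "1. open": "open",
    "2. high": "high",
    "3. low": "low",
    "4. close": "close",
    "5. volume": "volume",
}

def build_key(symbol, load_date):
    """Build an object key, assuming the layout seen for the SPY file."""
    return f"alphavantage_bronze/{symbol}/dataload={load_date}/{symbol}.json"

def time_series_to_frame(payload, symbol, load_date):
    """Turn one parsed JSON payload into a frame tagged with symbol and load date."""
    df = pd.DataFrame.from_dict(payload["Time Series (5min)"]).transpose()
    df = df.reset_index().rename(columns=COLUMNS)
    df.insert(0, "symbol", symbol)
    df.insert(1, "load_date", load_date)
    return df

# Hypothetical one-row payload, reusing the first row of the preview above
payload = {
    "Time Series (5min)": {
        "2023-02-16 20:00:00": {
            "1. open": "407.31", "2. high": "407.31", "3. low": "407.26",
            "4. close": "407.3", "5. volume": "7967",
        }
    }
}
frame = time_series_to_frame(payload, "SPY", "20230217")
print(build_key("SPY", "20230217"))
print(frame)
```

Frames for several symbols and load dates could then be stacked with `pd.concat`, with each payload read from S3 the same way as in the exploration above.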