# Stock Price Scraping
This Jupyter notebook uses the Alpha Vantage API to retrieve daily price and trading-volume data for GameStop (GME) stock.
The data is then used to plot a candlestick chart showing the stock's daily open, high, low and close prices.

### Scraping and Processing

In [31]:
import os
import json

import requests
import pandas as pd
import mplfinance as mpf

In [32]:
os.makedirs("../data", exist_ok=True)
os.makedirs("../visualisations", exist_ok=True)
os.makedirs("../data/gme_data", exist_ok=True)

Fetching daily data from the Alpha Vantage API and saving it to a JSON file

In [33]:
file_path = "../data/gme_data/gme_daily_data.json"

# Download the data only if it has not already been saved locally
if not os.path.exists(file_path):
    url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=GME&outputsize=full&apikey=71F522PIQRFAFZZO'
    response = requests.get(url)
    response.raise_for_status()  # Stop here on an HTTP error instead of saving a bad response
    gme_daily_data = response.json()

    with open(file_path, "w") as file:
        json.dump(gme_daily_data, file)
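
Because the cell above only downloads when the file is missing, any reply that lacks price data would be cached and never refreshed. The following is a minimal sketch of a guard that could run before `json.dump`, assuming the only requirement is the `Time Series (Daily)` key that the next cell reads. The helper name `has_daily_series` and both sample payloads are hypothetical and are not real API responses.

```python
def has_daily_series(payload):
    """Return True if payload holds a non-empty 'Time Series (Daily)' mapping."""
    series = payload.get("Time Series (Daily)") if isinstance(payload, dict) else None
    return isinstance(series, dict) and len(series) > 0


# Illustrative payloads only (hypothetical, not captured from the API)
good = {"Time Series (Daily)": {"2024-02-05": {"4. close": "13.46"}}}
bad = {"Note": "placeholder message without price data"}

print(has_daily_series(good))
print(has_daily_series(bad))
```

In the notebook, the file would then be written only when `has_daily_series(gme_daily_data)` is true.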

Creating the DataFrame

In [34]:
gme_daily_data_df = pd.read_json('../data/gme_data/gme_daily_data.json')

# Extract the nested daily time series (date -> open/high/low/close/volume)
time_series_daily_data = gme_daily_data_df['Time Series (Daily)'].dropna().to_dict()

# Build the DataFrame from the time series data and rename the columns
gme_daily_transformed_df = pd.DataFrame.from_dict(time_series_daily_data, orient='index')
gme_daily_transformed_df.reset_index(inplace=True)
gme_daily_transformed_df.rename(columns={'index': 'Date', '1. open': 'Open', '2. high': 'High', '3. low': 'Low', '4. close': 'Close', '5. volume': 'Volume'}, inplace=True)
# Sort chronologically; sort_index() would be a no-op on the freshly reset integer index
gme_daily_transformed_df_sorted = gme_daily_transformed_df.sort_values('Date')

# Checking the data types of the columns
gme_daily_transformed_df_sorted.dtypes

Date      object
Open      object
High      object
Low       object
Close     object
Volume    object
dtype: object

Converting the columns to appropriate data types

In [35]:
gme_daily_transformed_df = gme_daily_transformed_df.astype({
    'Date': 'datetime64[ns]',
    'Open': 'float',
    'High': 'float',
    'Low': 'float',
    'Close': 'float',
    'Volume': 'float'
})

gme_daily_transformed_df.head()

        Date   Open   High    Low  Close     Volume
0 2024-02-05  14.50  14.61  13.40  13.46  4361519.0
1 2024-02-02  14.15  14.92  14.08  14.73  2924462.0
2 2024-02-01  14.34  14.42  14.02  14.42  2220154.0
3 2024-01-31  14.40  14.83  14.22  14.23  2684680.0
4 2024-01-30  14.54  14.82  14.51  14.55  1652561.0
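
As an alternative to `astype`, the same conversion can be sketched with `pd.to_datetime` and `pd.to_numeric`, which infer a suitable numeric type per column. This is only an illustration, assuming string-typed input like the API returns; the two rows reuse values from the table above, and the names `raw` and `converted` are made up for the example.

```python
import pandas as pd

# Two rows in the string-typed shape the notebook starts from
raw = pd.DataFrame({
    "Date": ["2024-02-05", "2024-02-02"],
    "Open": ["14.5", "14.15"],
    "Close": ["13.46", "14.73"],
    "Volume": ["4361519", "2924462"],
})

converted = raw.copy()
converted["Date"] = pd.to_datetime(converted["Date"])

# to_numeric picks a float type for prices and an integer type for whole-number volumes
numeric_cols = ["Open", "Close", "Volume"]
converted[numeric_cols] = converted[numeric_cols].apply(pd.to_numeric)

print(converted.dtypes)
```

Unlike the `astype` call in the notebook, this leaves `Volume` as an integer column rather than a float.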


Pickling the DataFrame so it can be reused across Jupyter notebooks

In [36]:
gme_daily_transformed_df.to_pickle("../scraping/gme_daily_transformed_df.pkl")
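
A pickle keeps the column data types, so another notebook can load the frame with `pd.read_pickle` without repeating the conversion step. The round trip below is a small sketch that uses a temporary directory and a hypothetical file name, `example_df.pkl`, rather than the notebook's real path.

```python
import os
import tempfile

import pandas as pd

# Small frame with already-converted dtypes (two rows from the notebook's data)
df = pd.DataFrame({
    "Date": pd.to_datetime(["2024-02-05", "2024-02-02"]),
    "Close": [13.46, 14.73],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = os.path.join(tmp_dir, "example_df.pkl")
    df.to_pickle(pkl_path)
    restored = pd.read_pickle(pkl_path)

print(restored.dtypes)
print(restored.equals(df))
```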

### Visualisation
Plotting a candlestick chart of the GME stock price during the GameStop short squeeze using mplfinance

In [37]:
# Filtering the DataFrame to include only the data from January 2021 to April 2021
gme_jan_apr2021_df = gme_daily_transformed_df[(gme_daily_transformed_df['Date'] >= '2021-01-01') & (gme_daily_transformed_df['Date'] <= '2021-04-30')].copy()
gme_jan_apr2021_df.set_index('Date', inplace=True)

# Sorting the DataFrame by 'Date' in ascending order
gme_jan_apr2021_df.sort_index(inplace=True)

# Plotting the candlestick chart
mpf.plot(
    gme_jan_apr2021_df,
    type='candle',
    style='yahoo',
    volume=True,
    figsize=(14, 8),
    datetime_format='%b %d, %Y',
    xrotation=45,
    title=dict(title='GME Stock Price (Jan 2021-Apr 2021)', y=0.9, fontsize=16, color='black'),
    ylabel='Price ($)',
    ylabel_lower='Trading Volume',
    scale_width_adjustment=dict(volume=0.8, candle=1.0),
    savefig=dict(fname='../visualisations/gme_stock_price_candlestick_chart.png', dpi=100, bbox_inches='tight')
)
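
The date filter in this cell could also be written as a label slice once `Date` is the index. The sketch below assumes a `DatetimeIndex` and uses placeholder values, not real GME prices; the names `df` and `window` are illustrative only.

```python
import pandas as pd

# Placeholder values, not real GME prices
df = pd.DataFrame(
    {"Close": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(["2021-05-03", "2021-04-30", "2021-01-04", "2020-12-31"]),
)
df.index.name = "Date"

# Sort first so the index is monotonic; string slicing on a DatetimeIndex
# includes both end points
window = df.sort_index().loc["2021-01-01":"2021-04-30"]

print(window)
```

Only the rows dated 2021-01-04 and 2021-04-30 fall inside the window, in ascending date order, which is the ordering `mpf.plot` is given in the cell above.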