# Retrieving stock data for the top 10 most mentioned stocks on WSB

In this notebook, we gather daily stock price data for the 10 most mentioned stocks in the dataset.

Note that the top 10 ranking is based on the output of the sentiment model and entity recognition.

In [2]:
# The 10 most mentioned stocks in the dataset, as (ticker, mention count) pairs.
stocks_tickers = [
    ("GME", 13717),
    ("AMC", 4872),
    ("BB", 2119),
    ("TSLA", 1023),
    ("PLTR", 873),
    ("AMZN", 520),
    ("AAPL", 505),
    ("SNDL", 380),
    ("F", 372),
    ("AMD", 347)
]

In [3]:
# We only need the tickers, so drop the counts
stocks_tickers = [ticker for ticker, _ in stocks_tickers]
stocks_tickers

['GME', 'AMC', 'BB', 'TSLA', 'PLTR', 'AMZN', 'AAPL', 'SNDL', 'F', 'AMD']
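
The cell above overwrites `stocks_tickers`, so the mention counts are no longer available afterwards. If the counts are needed later (for example, to weight or label the stocks), one option is to keep them in a separate mapping. The following is a minimal sketch; the names `ticker_mentions` and `mention_counts` are hypothetical and do not appear in the original notebook.

```python
# Hypothetical variant: keep the counts alongside the ticker list.
ticker_mentions = [
    ("GME", 13717), ("AMC", 4872), ("BB", 2119), ("TSLA", 1023), ("PLTR", 873),
    ("AMZN", 520), ("AAPL", 505), ("SNDL", 380), ("F", 372), ("AMD", 347),
]

# dict preserves insertion order, so the ranking by mentions is kept.
mention_counts = dict(ticker_mentions)
stocks_tickers = list(mention_counts)

print(stocks_tickers)
print(mention_counts["GME"])
```

With this layout, `stocks_tickers` is the same list as before, and `mention_counts[ticker]` gives the count for any ticker.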

In [4]:
# Next, using yfinance, we download daily stock data for each of the 10 tickers
import yfinance as yf
import pandas as pd
import time

# Use the specified date range
start_date = "2020-09-01"
end_date = "2021-09-01"

print(f"Fetching stock data from {start_date} to {end_date}")

# Create an empty DataFrame; the first ticker's dates become its index
combined_stock_data = pd.DataFrame()

# Get data for each ticker
for ticker in stocks_tickers:
    try:
        print(f"Fetching data for {ticker}...")
        # Add small delay
        time.sleep(1)
        
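        # Download stock data (only adjusted close).
        # auto_adjust=False is passed explicitly because newer yfinance releases
        # default to auto_adjust=True, which omits the 'Adj Close' column.
        stock = yf.download(ticker, start=start_date, end=end_date, auto_adjust=False)['Adj Close']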
        
        # Add the stock prices as a new column
        if not stock.empty:
            combined_stock_data[ticker] = stock
            print(f"Successfully retrieved {len(stock)} days of data for {ticker}")
        else:
            print(f"No data found for {ticker}")
            
    except Exception as e:
        print(f"Error fetching data for {ticker}: {str(e)}")

# Save combined data
combined_stock_data.to_csv('stock_prices.csv')
print("Saved stock prices to stock_prices.csv")

Fetching stock data from 2020-09-01 to 2021-09-01
Fetching data for GME...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for GME
Fetching data for AMC...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for AMC
Fetching data for BB...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for BB
Fetching data for TSLA...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for TSLA
Fetching data for PLTR...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 232 days of data for PLTR
Fetching data for AMZN...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for AMZN
Fetching data for AAPL...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for AAPL
Fetching data for SNDL...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for SNDL
Fetching data for F...


[*********************100%***********************]  1 of 1 completed


Successfully retrieved 252 days of data for F
Fetching data for AMD...


[*********************100%***********************]  1 of 1 completed

Successfully retrieved 252 days of data for AMD
Saved stock prices to stock_prices.csv
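
The log shows 232 rows for PLTR against 252 for every other ticker, most likely because PLTR only began trading on 2020-09-30, partway through the requested window. Since each column is aligned on the DataFrame's date index, the earlier dates should end up as `NaN` in the PLTR column rather than shortening the table. The sketch below illustrates that alignment behaviour with small synthetic series (the tickers `AAA` and `BBB` and all prices are made up, not real market data) and shows one way to count the gaps.

```python
import pandas as pd

# Synthetic prices, NOT real market data: "AAA" trades on all five days,
# while "BBB" only starts on the third day (like a mid-period listing).
dates = pd.bdate_range("2020-09-01", periods=5)
full_history = pd.Series([10.0, 10.5, 11.0, 10.8, 11.2], index=dates)
late_listing = pd.Series([20.0, 20.4, 19.9], index=dates[2:])

combined = pd.DataFrame()
combined["AAA"] = full_history   # first column sets the date index
combined["BBB"] = late_listing   # aligned on that index; missing dates become NaN

# Count missing values and find the first date with data for each ticker.
missing_per_ticker = combined.isna().sum()
first_valid = {col: combined[col].first_valid_index() for col in combined.columns}

print(missing_per_ticker)
print(first_valid)
```

Running the same two checks on `combined_stock_data` before saving would confirm whether the PLTR column starts with missing values, which matters for any later step that assumes complete price histories.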



