##### Stock Price Data using yfinance

In [None]:
# Imports
import yfinance as yf
import datetime as dt

In [16]:
stocks = ['AAPL', 'TSLA']
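# start_date is relative to the run date while end_date is fixed,
# so rerunning this cell on another day returns a different window
start_date = dt.datetime.now() - dt.timedelta(days=90)
end_date = dt.date(2024, 12, 31)

df = yf.download(stocks, start=start_date, end=end_date)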
df.head()

[*********************100%***********************]  2 of 2 completed


| Date | Close (AAPL) | Close (TSLA) | High (AAPL) | High (TSLA) | Low (AAPL) | Low (TSLA) | Open (AAPL) | Open (TSLA) | Volume (AAPL) | Volume (TSLA) |
|---|---|---|---|---|---|---|---|---|---|---|
| 2024-11-14 | 227.96936 | 311.179993 | 228.61864 | 329.980011 | 224.752895 | 310.369995 | 224.772878 | 327.690002 | 44923900 | 120726100 |
| 2024-11-15 | 224.752884 | 320.720001 | 226.670773 | 324.679993 | 224.02369 | 309.220001 | 226.15134 | 310.570007 | 47923700 | 114440300 |
| 2024-11-18 | 227.769577 | 338.73999 | 229.487689 | 348.549988 | 224.922701 | 330.01001 | 225.002615 | 340.730011 | 44686000 | 126547500 |
| 2024-11-19 | 228.029282 | 346.0 | 229.907222 | 347.380005 | 226.411066 | 332.75 | 226.730706 | 335.76001 | 36211800 | 88852500 |
| 2024-11-20 | 228.748489 | 342.029999 | 229.677461 | 346.600006 | 225.641904 | 334.299988 | 227.809519 | 345.0 | 35169600 | 66340700 |
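
The frame returned for several tickers has two column levels, `Price` and `Ticker`. A minimal sketch of how those levels can be selected or flattened follows. It rebuilds the first two rows of the output above by hand (Close and Volume only) so it runs without network access; the names `sample`, `close`, `aapl_close`, and `flat` are illustrative and not part of the notebook.

```python
import pandas as pd

# Hand-built stand-in for the first two rows of the download output
# (Close and Volume only), mirroring its (Price, Ticker) column levels.
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAPL", "TSLA"]], names=["Price", "Ticker"]
)
sample = pd.DataFrame(
    [
        [227.96936, 311.179993, 44923900, 120726100],
        [224.752884, 320.720001, 47923700, 114440300],
    ],
    index=pd.to_datetime(["2024-11-14", "2024-11-15"]).rename("Date"),
    columns=columns,
)

# Selecting on the first level returns one column per ticker.
close = sample["Close"]

# Selecting a (Price, Ticker) pair returns a single Series.
aapl_close = sample[("Close", "AAPL")]

# Flatten the two levels into single names such as "Close_AAPL".
flat = sample.copy()
flat.columns = ["_".join(col) for col in flat.columns]
```

The same selections should apply to `df` from the cell above, assuming it keeps the two-level column layout shown in the output.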


##### News Data using NY Times Article Search API
More information on the API [here](https://developer.nytimes.com/docs/articlesearch-product/1/overview).

In [2]:
# Imports
import requests
from os import getenv
from dotenv import load_dotenv

In [3]:
# Parameters
load_dotenv()
API_KEY = getenv("NYT_API_KEY")
begin_date = "20250130"
end_date = "20250210"
fq = 'DeepSeek AND section_name:("Business" OR "Technology" OR "Markets")'

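# Pass the query as params so requests URL-encodes the spaces, quotes and parentheses in fq
url = "https://api.nytimes.com/svc/search/v2/articlesearch.json"
params = {"fq": fq, "begin_date": begin_date, "end_date": end_date, "api-key": API_KEY}
response = requests.get(url, params=params, timeout=30)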

In [4]:
# Check response status
if response.status_code == 200:
    data = response.json()
    articles = data.get("response", {}).get("docs", [])

    # The Article Search API returns 10 results per page, so this counts the first page only
    print(f"Number of articles: {len(articles)}\n")

    # Print the first 5 articles
    for article in articles[:5]:
        print(f"- {article.get('headline', {}).get('main', 'No title')}")
        print(f"  URL: {article.get('web_url', 'No URL')}\n")

else:
    print(f"Error: {response.status_code}, {response.text}")

Number of articles: 10

- Microsoft and Nvidia: The Tech Giants Taking a Quieter Approach to Trump
  URL: https://www.nytimes.com/2025/02/08/technology/microsoft-nvidia-trump.html

- How Helpful Is Operator, OpenAI’s New A.I. Agent?
  URL: https://www.nytimes.com/2025/02/01/technology/openai-operator-agent.html

- Alphabet Revenue Disappoints Investors on Weak Cloud Sales
  URL: https://www.nytimes.com/2025/02/04/technology/alphabet-google-earnings.html

- What DeepSeek? Big Tech Keeps Its A.I. Building Boom Alive.
  URL: https://www.nytimes.com/2025/02/08/technology/deepseek-data-centers-ai.html

- SoftBank Said to Be in Talks to Invest as Much as $25 Billion in OpenAI
  URL: https://www.nytimes.com/2025/01/30/technology/softbank-openai-ai-investment.html
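
The request above returns a single page of results. A minimal sketch of collecting further pages is shown below, assuming the API's 0-indexed `page` query parameter. `build_params` and `collect_articles` are hypothetical helpers, not part of the notebook, and `fetch_page` is any callable that maps a page number to the parsed JSON response, which keeps the loop testable without an API key.

```python
import time


def build_params(fq, begin_date, end_date, api_key, page=0):
    """Query parameters for one page of results (pages are 0-indexed)."""
    return {
        "fq": fq,
        "begin_date": begin_date,
        "end_date": end_date,
        "page": page,
        "api-key": api_key,
    }


def collect_articles(fetch_page, max_pages=5, pause_seconds=0.0):
    """Gather docs page by page until a page is empty or max_pages is reached.

    fetch_page is a hypothetical callable: page number -> parsed JSON dict.
    pause_seconds is an assumed knob for spacing out calls under rate limits.
    """
    articles = []
    for page in range(max_pages):
        data = fetch_page(page)
        docs = data.get("response", {}).get("docs") or []
        if not docs:
            break
        articles.extend(docs)
        time.sleep(pause_seconds)
    return articles


# Possible wiring with requests (not executed here):
# fetch = lambda page: requests.get(
#     "https://api.nytimes.com/svc/search/v2/articlesearch.json",
#     params=build_params(fq, begin_date, end_date, API_KEY, page),
#     timeout=30,
# ).json()
# articles = collect_articles(fetch, max_pages=3, pause_seconds=12)
```

The pause value in the commented wiring is a guess at a safe spacing between calls; check the current rate limits in the developer portal before relying on it.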



##### Data used in Reviewed Papers (non-exhaustive)

[Stock market prediction analysis by incorporating social and news opinion and sentiment](https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=6485&context=sis_research)<br>
Stock Price Data: Uses DJIA data from Yahoo Finance from 1/1/2007 to 31/12/2016.<br>
**News** Data: Uses articles from the NY Times Archive (via API), from 1/1/2007 to 31/12/2016.

[Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism](https://arxiv.org/abs/2406.06594)<br>
3 of the 4 datasets from this paper can be found [here](https://github.com/deeptrade-public/slot).<br>
These datasets comprise stock price data and **Twitter** data from 2014 to 2020; each dataset covers 38 to 87 stocks.