Documentation - https://github.com/ranaroussi/yfinance/

# Overview:

### yfinance has all the core data needed for our project


*   **Financials** - extensive historical data for US stocks: prices, income statements, balance sheets, ratios, dividends/splits, ...
*   **SEC Filings** - up-to-date records of SEC filings, which we can extract and analyze
*   **News** - stock-specific news articles from the Yahoo Finance website (which is said to be a good source of stock news)

#### What's Missing (see Team Deliverable 1 for alternative sources):


1.   Real-time stock prices
2.   General business and current-affairs news articles that are not stock-specific

Recommendation: use yfinance for Phase 1 and add alternative sources for additional functionality.







In [13]:
# Setup
!pip install yfinance
import pandas as pd

pd.set_option('display.max_columns', None)

def print_full(x):
    # Temporarily lift pandas display limits, print x, then restore the previous settings
    with pd.option_context(
        'display.max_rows', None,
        'display.max_columns', None,
        'display.width', 2000,
        'display.float_format', '{:20,.2f}'.format,
        'display.max_colwidth', None,
    ):
        print(x)


# **Stock Overview**

In [26]:
import yfinance as yf
import pandas as pd

msft = yf.Ticker("MSFT")

# get all stock info (msft.info is a dict); wrap it in a list to build a one-row DataFrame
ticker_info = pd.DataFrame([msft.info])
display(ticker_info)
print_full(ticker_info)

Unnamed: 0,address1,city,state,zip,country,phone,website,industry,industryKey,industryDisp,...,returnOnEquity,freeCashflow,operatingCashflow,earningsGrowth,revenueGrowth,grossMargins,ebitdaMargins,operatingMargins,financialCurrency,trailingPegRatio
0,One Microsoft Way,Redmond,WA,98052-6399,United States,425 882 8080,https://www.microsoft.com,Software - Infrastructure,software-infrastructure,Software - Infrastructure,...,0.37133,56705249280,118547996672,0.097,0.152,0.69764,0.52804,0.43143,USD,2.2674


            address1     city state         zip        country         phone                    website                   industry              industryKey               industryDisp      sector   sectorKey  sectorDisp  \
0  One Microsoft Way  Redmond    WA  98052-6399  United States  425 882 8080  https://www.microsoft.com  Software - Infrastructure  software-infrastructure  Software - Infrastructure  Technology  technology  Technology   

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           

In [17]:
# get historical market data (daily bars for the last month)
hist = msft.history(period="1mo")
hist


Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
2024-09-17 00:00:00-04:00,440.230011,441.850006,432.269989,435.149994,18874200,0.0,0.0
2024-09-18 00:00:00-04:00,435.0,436.029999,430.410004,430.809998,18898000,0.0,0.0
2024-09-19 00:00:00-04:00,441.230011,441.5,436.899994,438.690002,21706600,0.0,0.0
2024-09-20 00:00:00-04:00,437.220001,439.23999,434.220001,435.269989,55167100,0.0,0.0
2024-09-23 00:00:00-04:00,434.279999,436.459991,430.390015,433.51001,15128900,0.0,0.0
2024-09-24 00:00:00-04:00,433.0,433.350006,426.100006,429.170013,17015800,0.0,0.0
2024-09-25 00:00:00-04:00,429.829987,433.119995,428.570007,432.109985,13396400,0.0,0.0
2024-09-26 00:00:00-04:00,435.089996,435.299988,429.130005,431.309998,14492000,0.0,0.0
2024-09-27 00:00:00-04:00,431.519989,431.850006,427.470001,428.019989,14896100,0.0,0.0
2024-09-30 00:00:00-04:00,428.209991,430.420013,425.369995,430.299988,16807300,0.0,0.0
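A common next step with this price history is computing simple returns. The sketch below uses only the `Close` values printed above, copied into a plain list so it runs without a network call; on the real DataFrame, `hist["Close"].pct_change()` should give the same daily series. The variable names are illustrative.

```python
# Close prices copied from the msft.history(period="1mo") rows shown above
closes = [
    435.149994, 430.809998, 438.690002, 435.269989, 433.51001,
    429.170013, 432.109985, 431.309998, 428.019989, 430.299988,
]

# Simple daily return: today's close divided by the previous close, minus 1
daily_returns = [curr / prev - 1 for prev, curr in zip(closes, closes[1:])]

# Simple return over the whole window shown
period_return = closes[-1] / closes[0] - 1

for r in daily_returns:
    print(f"{r:+.4%}")
print(f"period: {period_return:+.4%}")
```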


In [18]:

# show meta information about the history (requires history() to be called first)
msft.history_metadata


{'currency': 'USD',
 'symbol': 'MSFT',
 'exchangeName': 'NMS',
 'fullExchangeName': 'NasdaqGS',
 'instrumentType': 'EQUITY',
 'firstTradeDate': 511108200,
 'regularMarketTime': 1729108801,
 'hasPrePostMarketData': True,
 'gmtoffset': -14400,
 'timezone': 'EDT',
 'exchangeTimezoneName': 'America/New_York',
 'regularMarketPrice': 416.12,
 'fiftyTwoWeekHigh': 416.36,
 'fiftyTwoWeekLow': 410.48,
 'regularMarketDayHigh': 416.36,
 'regularMarketDayLow': 410.48,
 'regularMarketVolume': 15062246,
 'longName': 'Microsoft Corporation',
 'shortName': 'Microsoft Corporation',
 'chartPreviousClose': 431.34,
 'priceHint': 2,
 'currentTradingPeriod': {'pre': {'timezone': 'EDT',
   'end': 1729085400,
   'start': 1729065600,
   'gmtoffset': -14400},
  'regular': {'timezone': 'EDT',
   'end': 1729108800,
   'start': 1729085400,
   'gmtoffset': -14400},
  'post': {'timezone': 'EDT',
   'end': 1729123200,
   'start': 1729108800,
   'gmtoffset': -14400}},
 'dataGranularity': '1d',
 'range': '1mo',
 'validR
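The time fields in this metadata look like Unix epoch seconds, with `gmtoffset` giving the exchange's current UTC offset in seconds. Under that assumption, a minimal sketch for turning them into readable datetimes (values copied from the output above):

```python
from datetime import datetime, timedelta, timezone

# Fields copied from the msft.history_metadata output above
meta = {
    "firstTradeDate": 511108200,
    "regularMarketTime": 1729108801,
    "gmtoffset": -14400,
    "timezone": "EDT",
}

# Fixed-offset timezone built from gmtoffset (assumed to be seconds east of UTC)
exchange_tz = timezone(timedelta(seconds=meta["gmtoffset"]), meta["timezone"])

# Latest market timestamp, expressed in the exchange's current offset
market_time = datetime.fromtimestamp(meta["regularMarketTime"], tz=exchange_tz)

# gmtoffset describes the current offset only, so keep the 1986 timestamp in UTC
first_trade_utc = datetime.fromtimestamp(meta["firstTradeDate"], tz=timezone.utc)

print(market_time.isoformat())      # 2024-10-16T16:00:01-04:00
print(first_trade_utc.isoformat())  # 1986-03-13T14:30:00+00:00
```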

In [30]:
# show actions (dividends, splits, capital gains)
msft.actions


Date,Dividends,Stock Splits
1987-09-21 00:00:00-04:00,0.00,2.0
1990-04-16 00:00:00-04:00,0.00,2.0
1991-06-27 00:00:00-04:00,0.00,1.5
1992-06-15 00:00:00-04:00,0.00,1.5
1994-05-23 00:00:00-04:00,0.00,2.0
...,...,...
2023-08-16 00:00:00-04:00,0.68,0.0
2023-11-15 00:00:00-05:00,0.75,0.0
2024-02-14 00:00:00-05:00,0.75,0.0
2024-05-15 00:00:00-04:00,0.75,0.0


In [31]:
# show share count
msft.get_shares_full(start="2022-01-01", end=None)


Unnamed: 0,0
2022-01-27 00:00:00-05:00,7496869888
2022-02-04 00:00:00-05:00,7800719872
2022-02-05 00:00:00-05:00,7496869888
2022-02-11 00:00:00-05:00,7496869888
2022-03-04 00:00:00-05:00,7605040128
...,...
2024-10-09 00:00:00-04:00,7433039872
2024-10-12 00:00:00-04:00,7433039872
2024-10-15 00:00:00-04:00,7433039872
2024-10-16 00:00:00-04:00,7433750016


In [36]:
# show upcoming events (dividend and earnings dates) with earnings/revenue estimates
pd.DataFrame.from_dict(msft.calendar)


Unnamed: 0,Dividend Date,Ex-Dividend Date,Earnings Date,Earnings High,Earnings Low,Earnings Average,Revenue High,Revenue Low,Revenue Average
0,2024-12-12,2024-11-21,2024-10-30,3.17,2.96,3.09,65196500000,64221700000,64483900000


# **Text Data**

**SEC Filings**

In [37]:
# SEC Filings
msft.sec_filings

[{'date': datetime.date(2024, 8, 21),
  'epochDate': 1724198400,
  'type': '8-K',
  'title': 'Corporate Changes & Voting Matters',
  'edgarUrl': 'https://finance.yahoo.com/sec-filing/MSFT/0001193125-24-204403_789019',
  'exhibits': {'8-K': 'https://cdn.yahoofinance.com/prod/sec-filings/0000789019/000119312524204403/d846847d8k.htm',
   'EX-99.1': 'https://cdn.yahoofinance.com/prod/sec-filings/0000789019/000119312524204403/d846847dex991.htm',
   'EXCEL': 'https://s3.amazonaws.com/finance-pri-uw2/sec-filings/0000789019/000119312524204403/Financial_Report.xlsx'},
  'maxAge': 1},
 {'date': datetime.date(2024, 7, 30),
  'epochDate': 1722297600,
  'type': '10-K',
  'title': 'Periodic Financial Reports',
  'edgarUrl': 'https://finance.yahoo.com/sec-filing/MSFT/0000950170-24-087843_789019',
  'exhibits': {'10-K': 'https://cdn.yahoofinance.com/prod/sec-filings/0000789019/000095017024087843/msft-20240630.htm',
   'EXCEL': 'https://s3.amazonaws.com/finance-pri-uw2/sec-filings/0000789019/0000950170
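To extract filings for analysis, we will usually want only one form type and the URL of its main document. The sketch below assumes the record layout shown in the output above (a `type` field and an `exhibits` dict keyed by form type); the sample list is copied from that output and trimmed, and `main_documents` is a hypothetical helper, not part of yfinance.

```python
from datetime import date

# Sample records copied from the msft.sec_filings output above,
# trimmed to the fields used here
filings = [
    {
        "date": date(2024, 8, 21),
        "type": "8-K",
        "title": "Corporate Changes & Voting Matters",
        "exhibits": {
            "8-K": "https://cdn.yahoofinance.com/prod/sec-filings/0000789019/000119312524204403/d846847d8k.htm",
            "EX-99.1": "https://cdn.yahoofinance.com/prod/sec-filings/0000789019/000119312524204403/d846847dex991.htm",
        },
    },
    {
        "date": date(2024, 7, 30),
        "type": "10-K",
        "title": "Periodic Financial Reports",
        "exhibits": {
            "10-K": "https://cdn.yahoofinance.com/prod/sec-filings/0000789019/000095017024087843/msft-20240630.htm",
        },
    },
]


def main_documents(filings, form_type):
    """Return (date, url) pairs for the main document of each filing of form_type."""
    results = []
    for filing in filings:
        if filing.get("type") != form_type:
            continue
        # Assumption: the main document is stored under the form type itself
        url = filing.get("exhibits", {}).get(form_type)
        if url is not None:
            results.append((filing["date"], url))
    return results


annual_reports = main_documents(filings, "10-K")
print(annual_reports)
```

With live data, the same call would be `main_documents(msft.sec_filings, "10-K")`.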

**Financials Extracted from Reports**

In [39]:

# - income statement
msft.income_stmt
msft.quarterly_income_stmt
# - balance sheet
msft.balance_sheet
msft.quarterly_balance_sheet
# - cash flow statement
msft.cashflow
msft.quarterly_cashflow
# see `Ticker.get_income_stmt()` for more options



Unnamed: 0,2024-06-30,2023-06-30,2022-06-30,2021-06-30
Tax Effect Of Unusual Items,-99918000.0,-2850000.0,43754000.0,180160797.164637
Tax Rate For Calcs,0.182,0.19,0.131,0.138266
Normalized EBITDA,133558000000.0,105155000000.0,99905000000.0,83831000000.0
Total Unusual Items,-549000000.0,-15000000.0,334000000.0,1303000000.0
Total Unusual Items Excluding Goodwill,-549000000.0,-15000000.0,334000000.0,1303000000.0
Net Income From Continuing Operation Net Minority Interest,88136000000.0,72361000000.0,72738000000.0,61271000000.0
Reconciled Depreciation,22287000000.0,13861000000.0,14460000000.0,11686000000.0
Reconciled Cost Of Revenue,74114000000.0,65863000000.0,62650000000.0,52232000000.0
EBITDA,133009000000.0,105140000000.0,100239000000.0,85134000000.0
EBIT,110722000000.0,91279000000.0,85779000000.0,73448000000.0
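As a quick sanity check on the extracted figures, two relationships appear to hold in the rows shown above: EBITDA minus EBIT equals Reconciled Depreciation, and EBITDA minus Total Unusual Items equals Normalized EBITDA. These identities are inferred from the printed numbers, not documented by yfinance, so treat the sketch below as an illustration only. Values are copied from the 2024 and 2023 columns.

```python
# Values copied from the msft.income_stmt output above (two fiscal years)
income = {
    "2024-06-30": {
        "EBITDA": 133009000000.0,
        "EBIT": 110722000000.0,
        "Reconciled Depreciation": 22287000000.0,
        "Total Unusual Items": -549000000.0,
        "Normalized EBITDA": 133558000000.0,
    },
    "2023-06-30": {
        "EBITDA": 105140000000.0,
        "EBIT": 91279000000.0,
        "Reconciled Depreciation": 13861000000.0,
        "Total Unusual Items": -15000000.0,
        "Normalized EBITDA": 105155000000.0,
    },
}


def consistency_checks(row, tolerance=1.0):
    """Return two booleans: depreciation identity holds, normalization identity holds."""
    depreciation_ok = abs(
        row["EBITDA"] - row["EBIT"] - row["Reconciled Depreciation"]
    ) < tolerance
    normalized_ok = abs(
        row["EBITDA"] - row["Total Unusual Items"] - row["Normalized EBITDA"]
    ) < tolerance
    return depreciation_ok, normalized_ok


results = {year: consistency_checks(row) for year, row in income.items()}
print(results)
```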


**News**

In [45]:
# show news
msft.news

[{'uuid': 'bfb047c8-8575-35f3-a53a-1ad4be8328c0',
  'title': 'Amazon invests $500M to go nuclear after Microsoft and Google',
  'publisher': 'Yahoo Finance Video',
  'link': 'https://finance.yahoo.com/video/amazon-invests-500m-nuclear-microsoft-210914709.html',
  'providerPublishTime': 1729112954,
  'type': 'VIDEO',
  'thumbnail': {'resolutions': [{'url': 'https://s.yimg.com/uu/api/res/1.2/U00L0aofIuKpLDFf2nQj5g--~B/aD0yOTExO3c9NTE4NDthcHBpZD15dGFjaHlvbg--/https://s.yimg.com/os/creatr-uploaded-images/2024-10/2d4b5670-8c02-11ef-ab77-3612be33fd94',
     'width': 5184,
     'height': 2911,
     'tag': 'original'},
    {'url': 'https://s.yimg.com/uu/api/res/1.2/r5LY6fFGeieZciaUpj2axQ--~B/Zmk9ZmlsbDtoPTE0MDtweW9mZj0wO3c9MTQwO2FwcGlkPXl0YWNoeW9u/https://s.yimg.com/os/creatr-uploaded-images/2024-10/2d4b5670-8c02-11ef-ab77-3612be33fd94',
     'width': 140,
     'height': 140,
     'tag': '140x140'}]},
  'relatedTickers': ['GOOG', 'MSFT', 'AMZN']},
 {'uuid': 'a8d7ae12-1829-3dfd-aca0-243b0ffeb64
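For downstream text analysis we mostly need the title, publisher, and a readable timestamp. The sketch below assumes the item layout shown in the output above (`providerPublishTime` as Unix epoch seconds, `relatedTickers` as a list of symbols); other yfinance versions may return a different structure. The sample item is copied from that output, and `summarize` is a hypothetical helper.

```python
from datetime import datetime, timezone

# One item copied from the msft.news output above, trimmed to the fields used here
news = [
    {
        "title": "Amazon invests $500M to go nuclear after Microsoft and Google",
        "publisher": "Yahoo Finance Video",
        "providerPublishTime": 1729112954,
        "type": "VIDEO",
        "relatedTickers": ["GOOG", "MSFT", "AMZN"],
    },
]


def summarize(items, ticker):
    """Keep items that mention `ticker` and convert the publish time to ISO 8601 (UTC)."""
    rows = []
    for item in items:
        if ticker not in item.get("relatedTickers", []):
            continue
        published = datetime.fromtimestamp(item["providerPublishTime"], tz=timezone.utc)
        rows.append(
            {
                "published_utc": published.isoformat(),
                "publisher": item["publisher"],
                "title": item["title"],
            }
        )
    return rows


msft_news = summarize(news, "MSFT")
print(msft_news[0]["published_utc"])  # 2024-10-16T21:09:14+00:00
```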

**Miscellaneous info from various sources**

In [44]:
# show holders
msft.major_holders
msft.institutional_holders
msft.mutualfund_holders
msft.insider_transactions
msft.insider_purchases
msft.insider_roster_holders

msft.sustainability

# show recommendations
msft.recommendations
msft.recommendations_summary
msft.upgrades_downgrades

# show analysts data
msft.analyst_price_targets
msft.earnings_estimate
msft.revenue_estimate
msft.earnings_history
msft.eps_trend
msft.eps_revisions
msft.growth_estimates

# Show future and historic earnings dates, returns at most next 4 quarters and last 8 quarters by default.
# Note: If more are needed use msft.get_earnings_dates(limit=XX) with increased limit argument.
msft.earnings_dates

# show ISIN code - *experimental*
# ISIN = International Securities Identification Number
msft.isin

# show options expirations
msft.options



# get option chain for a specific expiration (a 'YYYY-MM-DD' string taken from msft.options)
opt = msft.option_chain(msft.options[0])
# data available via: opt.calls, opt.puts


# **Tips and Tricks**

**Multiple Stock Tickers at once**

In [48]:
import yfinance as yf

tickers = yf.Tickers('msft aapl goog')

# access each ticker using (example)
tickers.tickers['MSFT'].info
tickers.tickers['AAPL'].history(period="1mo")
tickers.tickers['GOOG'].actions

data = yf.download("SPY AAPL", period="1mo")


[*********************100%***********************]  2 of 2 completed


Price,Adj Close,Adj Close,Close,Close,High,High,Low,Low,Open,Open,Volume,Volume
Ticker,AAPL,SPY,AAPL,SPY,AAPL,SPY,AAPL,SPY,AAPL,SPY,AAPL,SPY
Date,,,,,,,,,,,,
2024-09-17 00:00:00+00:00,216.789993,561.348206,216.789993,563.070007,216.899994,566.580017,214.5,560.789978,215.75,565.099976,45519300,49321000
2024-09-18 00:00:00+00:00,220.690002,559.68335,220.690002,561.400024,222.710007,568.690002,217.539993,560.830017,217.550003,563.73999,59894900,59044900
2024-09-19 00:00:00+00:00,228.869995,569.234009,228.869995,570.97998,229.820007,572.880005,224.630005,568.080017,224.990005,571.01001,66781300,75315500
2024-09-20 00:00:00+00:00,228.199997,568.25,228.199997,568.25,233.089996,569.309998,227.619995,565.169983,229.970001,567.840027,318679900,77503100
2024-09-23 00:00:00+00:00,226.470001,569.669983,226.470001,569.669983,229.449997,570.330017,225.809998,568.099976,227.339996,569.340027,54146000,44116900
2024-09-24 00:00:00+00:00,227.369995,571.299988,227.369995,571.299988,229.350006,571.359985,225.729996,567.599976,228.649994,570.47998,43556100,46805700
2024-09-25 00:00:00+00:00,226.369995,570.039978,226.369995,570.039978,227.289993,571.890015,224.020004,568.909973,224.929993,571.140015,42308700,38428600
2024-09-26 00:00:00+00:00,227.520004,572.299988,227.520004,572.299988,228.5,574.710022,225.410004,569.900024,227.300003,574.380005,36636700,48336000
2024-09-27 00:00:00+00:00,227.789993,571.469971,227.789993,571.469971,229.520004,574.219971,227.300003,570.419983,228.460007,573.390015,34026000,42100900
2024-09-30 00:00:00+00:00,233.0,573.76001,233.0,573.76001,233.0,574.380005,229.649994,568.080017,230.039993,570.419983,54541900,63557400


**Sector-level Queries**

In [55]:
import yfinance as yf

tech = yf.Sector('technology')
software = yf.Industry('software-infrastructure')

# Common information
tech.key
tech.name
tech.symbol
tech.ticker
tech.overview
tech.top_companies
tech.research_reports

# Sector information
tech.top_etfs
tech.top_mutual_funds
tech.industries

# Industry information
software.sector_key
software.sector_name
software.top_performing_companies
software.top_growth_companies

import yfinance as yf

# Ticker to Sector and Industry
msft = yf.Ticker('MSFT')
tech = yf.Sector(msft.info.get('sectorKey'))
software = yf.Industry(msft.info.get('industryKey'))

# Sector and Industry to Ticker
tech_ticker = tech.ticker
tech_ticker.info
software_ticker = software.ticker
software_ticker.history()

[{'id': 'MS_0P000004NY_AnalystReport_1729112188000',
  'headHtml': 'Analyst Report: BlackBerry Limited',
  'provider': 'Morningstar',
  'targetPrice': 3.4,
  'targetPriceStatus': 'Maintained',
  'investmentRating': 'Bullish',
  'reportDate': '2024-10-16T20:56:28Z',
  'reportTitle': 'BlackBerry, once known for being the world’s largest smartphone manufacturer, is now exclusively a software provider with a stated goal of end-to-end secure communication for enterprises. The firm provides endpoint management and protection to enterprises, specializing in regulated industries like government, as well as embedded software to the automotive, medical, and industrial markets. ',
  'reportType': 'Analyst Report'},
 {'id': 'MS_0P0000002X_AnalystReport_1729097397000',
  'headHtml': 'Analyst Report: ASML Holding N.V.',
  'provider': 'Morningstar',
  'targetPrice': 935.0,
  'targetPriceStatus': 'Maintained',
  'investmentRating': 'Bullish',
  'reportDate': '2024-10-16T16:49:57Z',
  'reportTitle': 'A

# Logging
yfinance uses Python's logging module to handle messages; by default it only prints errors. When debugging, call yf.enable_debug_mode() to switch logging to debug level with custom formatting.
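Because messages go through the standard logging module, their verbosity can also be adjusted with ordinary logging calls. The sketch below assumes the library logs under the logger name "yfinance"; that name is an assumption worth verifying against the installed version.

```python
import logging

# Assumption: yfinance emits its messages through a logger named "yfinance"
yf_logger = logging.getLogger("yfinance")

# Raise the threshold so routine error messages are suppressed in batch jobs
yf_logger.setLevel(logging.CRITICAL)

print(yf_logger.isEnabledFor(logging.ERROR))     # False
print(yf_logger.isEnabledFor(logging.CRITICAL))  # True
```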



# Smart Scraping and Rate-limiting


In [58]:
!pip install "yfinance[nospam]"

Collecting requests-cache>=1.0 (from yfinance[nospam])
  Downloading requests_cache-1.2.1-py3-none-any.whl.metadata (9.9 kB)
Collecting requests-ratelimiter>=0.3.1 (from yfinance[nospam])
  Downloading requests_ratelimiter-0.7.0-py3-none-any.whl.metadata (12 kB)
Collecting cattrs>=22.2 (from requests-cache>=1.0->yfinance[nospam])
  Downloading cattrs-24.1.2-py3-none-any.whl.metadata (8.4 kB)
Collecting url-normalize>=1.4 (from requests-cache>=1.0->yfinance[nospam])
  Downloading url_normalize-1.4.3-py2.py3-none-any.whl.metadata (3.1 kB)
Collecting pyrate-limiter<3.0 (from requests-ratelimiter>=0.3.1->yfinance[nospam])
  Downloading pyrate_limiter-2.10.0-py3-none-any.whl.metadata (15 kB)
Downloading requests_cache-1.2.1-py3-none-any.whl (61 kB)
Downloading requests_ratelimiter-0.7.0-py3-none-any.whl (9.3 kB)
Downloading cattrs-24.1.2-py3-none-any.whl (66 kB)

In [59]:
import requests_cache
session = requests_cache.CachedSession('yfinance.cache')
session.headers['User-agent'] = 'my-program/1.0'
ticker = yf.Ticker('msft', session=session)
# The scraped response will be stored in the cache
ticker.actions


Date,Dividends,Stock Splits
1987-09-21 00:00:00-04:00,0.00,2.0
1990-04-16 00:00:00-04:00,0.00,2.0
1991-06-27 00:00:00-04:00,0.00,1.5
1992-06-15 00:00:00-04:00,0.00,1.5
1994-05-23 00:00:00-04:00,0.00,2.0
...,...,...
2023-08-16 00:00:00-04:00,0.68,0.0
2023-11-15 00:00:00-05:00,0.75,0.0
2024-02-14 00:00:00-05:00,0.75,0.0
2024-05-15 00:00:00-04:00,0.75,0.0


Combine requests_cache with rate limiting to avoid triggering Yahoo's rate limiter/blocker, which can corrupt data.


In [None]:


from requests import Session
from requests_cache import CacheMixin, SQLiteCache
from requests_ratelimiter import LimiterMixin, MemoryQueueBucket
from pyrate_limiter import Duration, RequestRate, Limiter


# A requests Session that both caches responses and rate-limits outgoing requests
class CachedLimiterSession(CacheMixin, LimiterMixin, Session):
    pass


session = CachedLimiterSession(
    limiter=Limiter(RequestRate(2, Duration.SECOND*5)),  # max 2 requests per 5 seconds
    bucket_class=MemoryQueueBucket,
    backend=SQLiteCache("yfinance.cache"),  # responses are cached in a local SQLite file
)
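To make the "2 requests per 5 seconds" setting concrete, here is a minimal standard-library sketch of a sliding-window limiter. It illustrates the idea only and is not how pyrate_limiter is implemented; the class name `SlidingWindowLimiter` and the injectable `clock` parameter are hypothetical, the latter added so the behaviour can be shown without real waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window of period_seconds."""

    def __init__(self, max_requests, period_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.period = period_seconds
        self.clock = clock
        self._stamps = deque()  # timestamps of recently allowed requests

    def try_acquire(self):
        now = self.clock()
        # Forget requests that have fallen out of the window
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False


# Demonstration with a fake clock so no real time passes
fake_now = [0.0]
limiter = SlidingWindowLimiter(2, 5, clock=lambda: fake_now[0])

first_burst = [limiter.try_acquire() for _ in range(3)]  # third call is rejected
fake_now[0] = 5.0                                        # window has moved on
second_burst = [limiter.try_acquire() for _ in range(3)]

print(first_burst, second_burst)
```

A real session would wait or queue instead of returning `False`, which is the role the bucket class plays in the configuration above.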

# Managing Multi-Level Columns
A Stack Overflow answer to "How to deal with multi-level column names downloaded with yfinance?" covers this topic.

yfinance returns a pandas.DataFrame with multi-level column names: one level for the ticker and one for the stock price data. The answer discusses:

*   How to correctly read the multi-level columns after saving the dataframe to a CSV with pandas.DataFrame.to_csv
*   How to download single or multiple tickers into a single dataframe with single-level column names and a ticker column
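A minimal sketch of the second point, flattening a multi-ticker frame into single-level columns plus a ticker column. It builds a small synthetic frame shaped like the `yf.download("SPY AAPL", ...)` output shown earlier, assuming the column levels are named `Price` and `Ticker` as in that output; level names and order may differ between yfinance versions. `to_long_format` is a hypothetical helper, and this is one possible approach, not necessarily the one in the Stack Overflow answer.

```python
import pandas as pd

# Synthetic frame mimicking the multi-ticker download layout (values from the output above)
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAPL", "SPY"]], names=["Price", "Ticker"]
)
index = pd.DatetimeIndex(["2024-09-17", "2024-09-18"], name="Date")
wide = pd.DataFrame(
    [
        [216.789993, 563.070007, 45519300, 49321000],
        [220.690002, 561.400024, 59894900, 59044900],
    ],
    index=index,
    columns=columns,
)


def to_long_format(wide):
    """Return one row per (Date, Ticker) with single-level price columns."""
    frames = []
    for ticker in wide.columns.get_level_values("Ticker").unique():
        # Select this ticker's columns and drop the Ticker level
        one = wide.xs(ticker, axis=1, level="Ticker").copy()
        one.insert(0, "Ticker", ticker)
        frames.append(one)
    return pd.concat(frames).reset_index()


long = to_long_format(wide)
print(long)
```

The long layout round-trips through `to_csv` and `read_csv` without any special header handling, which avoids the first problem the answer describes.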