Add a scraper for Yahoo Finance #3

mrhappyasthma · 2019-02-02T01:07:51Z

Yahoo Finance is particularly useful for its analysis page: https://finance.yahoo.com/quote/<symbol>/analysis

The value of interest there is Next 5 Years (per annum).


mrhappyasthma · 2019-08-03T06:20:52Z

We can use an XPath query to scrape the value:

//table[last()]//tr[last()-1]//td[2]

https://finance.yahoo.com/quote/AMZN/analysis?p=AMZN
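To make it concrete what that query selects, here is a minimal sketch run against a made-up, well-formed fragment rather than the live page. It uses the standard library's ElementTree as a stand-in for a full XPath engine, so the path has to be relative (`.//`). The sample table contents are assumptions for illustration only, and the real page markup may differ.

```python
import xml.etree.ElementTree as ET

# Made-up fragment: the last table holds the growth estimates, and the
# second-to-last row is "Next 5 Years (per annum)".
SAMPLE = """<html><body>
<table><tbody>
<tr><td>Avg. Estimate</td><td>1.00</td></tr>
</tbody></table>
<table><tbody>
<tr><td>Current Qtr.</td><td>10.00%</td></tr>
<tr><td>Next 5 Years (per annum)</td><td>25.50%</td></tr>
<tr><td>Past 5 Years (per annum)</td><td>30.00%</td></tr>
</tbody></table>
</body></html>"""

root = ET.fromstring(SAMPLE)

# Last table -> second-to-last row -> second cell.
cells = root.findall(".//table[last()]//tr[last()-1]//td[2]")
growth = cells[0].text if cells else None
print(growth)
```

Because the query is purely positional, it silently returns a different cell if Yahoo adds or removes a row, so checking the label in the first cell of the row would be a sensible safeguard.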

mrhappyasthma · 2021-01-19T04:01:25Z

Some info on the Yahoo Finance API: https://observablehq.com/@stroked/yahoofinance

mrhappyasthma · 2021-01-19T04:36:24Z

regularMarketPrice and marketCap are available from query1.finance.yahoo.com/v7/finance/quote?fields=regularMarketPrice,marketCap&symbols=<symbol>

trailingAnnualDividendRate and dividendDate are also available.

For company info and SEC filings (sector, website, industry, longBusinessSummary, companyOfficers), use the quoteSummary endpoint:

https://query1.finance.yahoo.com/v10/finance/quoteSummary/MSFT?modules=assetProfile,secFilings

The full list of modules is:

modules = [
  "assetProfile",
  "incomeStatementHistory",
  "incomeStatementHistoryQuarterly",
  "balanceSheetHistory",
  "balanceSheetHistoryQuarterly",
  "cashFlowStatementHistory",
  "cashFlowStatementHistoryQuarterly",
  "defaultKeyStatistics",
  "financialData",
  "calendarEvents",
  "secFilings",
  "recommendationTrend",
  "upgradeDowngradeHistory",
  "institutionOwnership",
  "fundOwnership",
  "majorDirectHolders",
  "majorHoldersBreakdown",
  "insiderTransactions",
  "insiderHolders",
  "netSharePurchaseActivity",
  "earnings",
  "earningsHistory",
  "earningsTrend",
  "industryTrend",
  "indexTrend",
  "sectorTrend",
]

Ex-dividend date comes from calendarEvents.

Cash on hand comes from balanceSheetHistory (see #22).
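The two endpoints in this comment can be assembled with the standard library. This is only a sketch: the helper names `quote_url` and `quote_summary_url` are hypothetical, and it builds the URLs without fetching them, so it says nothing about whether the endpoints respond or what they return.

```python
from urllib.parse import quote, urlencode

QUOTE_URL = "https://query1.finance.yahoo.com/v7/finance/quote"
QUOTE_SUMMARY_URL = "https://query1.finance.yahoo.com/v10/finance/quoteSummary"


def quote_url(symbol, fields):
    """Build the v7 quote URL for one symbol and a list of field names."""
    # safe="," keeps the comma separators readable instead of %2C.
    query = urlencode({"fields": ",".join(fields), "symbols": symbol}, safe=",")
    return f"{QUOTE_URL}?{query}"


def quote_summary_url(symbol, modules):
    """Build the v10 quoteSummary URL for one symbol and a list of modules."""
    query = urlencode({"modules": ",".join(modules)}, safe=",")
    return f"{QUOTE_SUMMARY_URL}/{quote(symbol)}?{query}"


print(quote_url("MSFT", ["regularMarketPrice", "marketCap"]))
print(quote_summary_url("MSFT", ["assetProfile", "secFilings"]))
```

Keeping the module list as a Python list and joining it at the last moment makes it easy to request only the modules a given calculation needs.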

mrhappyasthma · 2021-01-19T04:37:16Z

The one value I need but can't yet figure out how to get is Next 5 Years (per annum). It is used in the calculations that determine pricing.

mrhappyasthma · 2021-01-22T05:53:04Z

This value arrives as part of the main page response, so we can simply fetch the analysis page by URL.

The JSON is embedded in the page's React bootstrap script, assigned to root.App.main.

https://stackoverflow.com/a/39635322/1366973
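One way to pull that embedded JSON out of the page source is to locate the `root.App.main` marker and let the JSON decoder read a single object from that point. The sketch below runs against a made-up script snippet: the helper name `extract_app_main` and the sample payload are assumptions, and the structure of the real payload is not shown here.

```python
import json

MARKER = "root.App.main"


def extract_app_main(page_source):
    """Return the object assigned to root.App.main, or None if absent."""
    start = page_source.find(MARKER)
    if start == -1:
        return None
    brace = page_source.find("{", start)
    if brace == -1:
        return None
    # raw_decode parses one JSON value starting at `brace` and ignores
    # whatever JavaScript follows it (the trailing ';' and so on).
    obj, _end = json.JSONDecoder().raw_decode(page_source, brace)
    return obj


# Made-up stand-in for the script block in the fetched page.
sample = (
    "<script>(function (root) {\n"
    'root.App.main = {"context": {"symbol": "AMZN"}};\n'
    "}(this));</script>"
)

app_main = extract_app_main(sample)
print(app_main["context"]["symbol"])
```

Using `raw_decode` avoids a regular expression that has to guess where the object ends, which matters when the payload contains nested braces or semicolons inside strings.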

mrhappyasthma · 2021-05-14T22:22:46Z

I'm not entirely sure why, but the code below works fine in a local test, yet when ported to run on the server it does not find the string in the output.

import re

import lxml.html as html
import requests

def isPercentage(text):
  # True when the text starts with a percentage such as '12.5%'.
  # Nodes without text have text == None, so guard before matching.
  return text is not None and re.match(r'\d+(\.\d+)?%', text) is not None

def parseNextPercentage(iterator):
  # Advance the shared iterator to the next node whose text is a percentage.
  # Returns None if the document ends before one is found.
  for node in iterator:
    if isPercentage(node.text):
      return node.text
  return None

r = requests.get('https://finance.yahoo.com/quote/FB/analysis?p=FB')
tree = html.fromstring(bytes(r.text, encoding='utf8'))
tree_iterator = tree.iter()
for element in tree_iterator:
  # The percentage follows the node holding the row label.
  if element.text == 'Next 5 Years (per annum)':
    print(parseNextPercentage(tree_iterator))

mrhappyasthma · 2021-05-14T22:31:41Z

Oh, it was a copy-paste error. Of course :P

mrhappyasthma · 2021-05-14T22:36:33Z

Work mostly complete in 250485a.

mrhappyasthma · 2021-05-15T00:14:41Z

As of afa1844, the code is being used to calculate margin of safety.

I still need to parse the current price from the quote and display that.

mrhappyasthma · 2021-05-15T00:41:37Z

Plenty of good data here too: https://stackoverflow.com/questions/44030983/yahoo-finance-url-not-working

mrhappyasthma · 2021-05-15T01:16:06Z

Quote scraping added in 155a766.

The only remaining piece (although I don't have an immediate use for it) is fetching quoteSummary modules: #3 (comment)

mrhappyasthma · 2021-05-15T01:30:36Z

Started the implementation here: 5bf4a85

mrhappyasthma · 2021-06-13T04:21:26Z

It seems like the parsing can be done along the lines of this:

results = data['quoteSummary']['result']
moduleData = {}
for module in self.modules:
  for result in results:
    if module in result:
      moduleData[module] = result[module]
      break

This should produce a dictionary keyed by module name, where each value is that module's data taken from the first result that contains it.
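To show the shape of that dictionary, including what happens when a requested module is missing, here is the same loop wrapped in a function and run on a made-up payload. The function name `extract_modules` and the sample values are assumptions for illustration, not the project's actual code or real Yahoo data.

```python
def extract_modules(data, modules):
    """Map each requested module name to its data, skipping absent ones."""
    results = data["quoteSummary"]["result"]
    module_data = {}
    for module in modules:
        for result in results:
            if module in result:
                module_data[module] = result[module]
                break  # Take the first result that has this module.
    return module_data


# Made-up payload with two of the three requested modules present.
sample = {
    "quoteSummary": {
        "result": [
            {
                "assetProfile": {"sector": "Technology"},
                "secFilings": {"filings": []},
            }
        ]
    }
}

found = extract_modules(sample, ["assetProfile", "secFilings", "financialData"])
print(sorted(found))
```

A module that no result contains is simply left out of the dictionary, so callers should use `in` or `.get()` rather than indexing directly.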

mrhappyasthma · 2021-06-13T04:32:57Z

Reading data from a file, I confirmed this approach works:

import json

# Load a saved quoteSummary response from disk.
with open("temp.txt", "r") as f:
  data = json.load(f)

results = data['quoteSummary']['result']
modules = ['assetProfile', 'secFilings', 'financialData']
moduleData = {}
for module in modules:
  for result in results:
    if module in result:
      moduleData[module] = result[module]
      break

# Print the name of each module that was found.
for key in moduleData:
  print(key)

This was referenced Jan 19, 2021

Add insider compensation information from Morningstar #9

Closed

Get company info from Polygon.io #25

Closed

Dividend info from seeking alpha #18

Closed

Query seeking alpha quote. #17

Closed

mrhappyasthma mentioned this issue May 15, 2021

[Feature request] Fill the 'meaning' section with information about the company #30

Closed

mrhappyasthma added the Priority label Jun 10, 2021

mrhappyasthma closed this as completed in 4c5acd1 Jun 13, 2021

mrhappyasthma mentioned this issue Nov 22, 2022

Fetch pe ratios from stockrow #58

Closed

mrhappyasthma mentioned this issue Jun 25, 2024

#28: Cleaning up a little to enable unit-testing API code. #78

Merged
