# 1. TRENDING STOCK SYMBOLS

API domain: https://yfapi.net
API path: /v1/finance/trending/{region}

Extract data from the yahoofinancials API (URL path /v1/finance/trending/{region}).
The request returns the 20 trending stock symbols for the region given in the URL.

The request uses your basic-subscription API key, obtained from the yahoofinancials web site.
Current region: US

In [1]:
# Import required modules
import pandas as pd

In [2]:
##  Ensure you have your own config.py file in the same folder so you can import your API key
import requests
import pprint
from config import API_KEY

region_selected = "US"
url = "https://yfapi.net/v1/finance/trending/" + region_selected
## Alternate url
# url = "https://rest.yahoofinanceapi.com/v1/finance/trending/" + region_selected

# Credentials to include
headers = {
    'x-api-key': API_KEY
    }
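
If keeping the key in a config.py file is not convenient, one alternative is to read it from an environment variable. The sketch below is only an illustration: the helper `build_headers` and the variable name `YF_API_KEY` are assumptions, not part of the API or of this notebook.

```python
import os

def build_headers(env_var="YF_API_KEY"):
    """Build the request headers from an API key stored in an environment variable.

    Both this helper and the default variable name are hypothetical.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    # Same header name as in the cell above
    return {"x-api-key": api_key}
```

With the variable set in the shell that launches Jupyter, `headers = build_headers()` would replace the `headers` dictionary above.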

In [3]:
## Make API request (no query string required for this API)
response1 = requests.request("GET", url, headers=headers).json()

# What does the response look like?  We will have to use indexing to get to the 'result' level
pprint.pprint(response1)

{'finance': {'error': None,
             'result': [{'count': 20,
                         'jobTimestamp': 1632172086193,
                         'quotes': [{'symbol': '^DJI'},
                                    {'symbol': 'BABA'},
                                    {'symbol': 'ZIVO'},
                                    {'symbol': 'BTC-USD'},
                                    {'symbol': '^GSPC'},
                                    {'symbol': 'EDSA'},
                                    {'symbol': '^IXIC'},
                                    {'symbol': 'AMZN'},
                                    {'symbol': '^VIX'},
                                    {'symbol': 'DISCB'},
                                    {'symbol': 'LEN'},
                                    {'symbol': '3333.HK'},
                                    {'symbol': 'EGRNY'},
                                    {'symbol': 'SPY'},
                                    {'symbol': 'NVDA'},
                              

# Optional: check that 'error' is None.
# The query pulled 20 stock symbols for the US region.
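
The optional error check could look like the sketch below. `extract_symbols` is a hypothetical helper, and `sample_response` is a shortened copy of the response printed above, so the block runs without calling the API.

```python
def extract_symbols(payload):
    """Return the list of symbols, or raise if the API reported an error."""
    finance = payload.get("finance", {})
    error = finance.get("error")
    if error is not None:
        raise ValueError(f"API returned an error: {error}")
    results = finance.get("result") or []
    if not results:
        return []
    # Only the first result is used, matching the indexing in the cells below
    return [quote["symbol"] for quote in results[0].get("quotes", [])]

# Shortened sample shaped like the response printed above
sample_response = {
    "finance": {
        "error": None,
        "result": [
            {
                "count": 3,
                "quotes": [{"symbol": "^DJI"}, {"symbol": "BABA"}, {"symbol": "ZIVO"}],
                "jobTimestamp": 1632172086193,
                "startInterval": 202109202000,
            }
        ],
    }
}

symbols = extract_symbols(sample_response)
print(symbols)
```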

In [5]:
## How many results did we get?  A single result with multiple fields
len(response1['finance']['result'])

1

In [6]:
## Parse through the resulting dictionary tree to get at the 'results'.  Look at only the first
##  result in order to see what columns we received.
response1['finance']['result'][0]

{'count': 20,
 'quotes': [{'symbol': '^DJI'},
  {'symbol': 'BABA'},
  {'symbol': 'ZIVO'},
  {'symbol': 'BTC-USD'},
  {'symbol': '^GSPC'},
  {'symbol': 'EDSA'},
  {'symbol': '^IXIC'},
  {'symbol': 'AMZN'},
  {'symbol': '^VIX'},
  {'symbol': 'DISCB'},
  {'symbol': 'LEN'},
  {'symbol': '3333.HK'},
  {'symbol': 'EGRNY'},
  {'symbol': 'SPY'},
  {'symbol': 'NVDA'},
  {'symbol': 'NIO'},
  {'symbol': 'NWAU'},
  {'symbol': 'ARQQ'},
  {'symbol': 'AAL'},
  {'symbol': 'TQQQ'}],
 'jobTimestamp': 1632172086193,
 'startInterval': 202109202000}

In [7]:
# Create a new DataFrame from all the results in the dictionary
response1_df = pd.DataFrame(response1['finance']['result'])
response1_df.head()

Unnamed: 0,count,quotes,jobTimestamp,startInterval
0,20,"[{'symbol': '^DJI'}, {'symbol': 'BABA'}, {'sym...",1632172086193,202109202000


In [8]:
# Extract startInterval (quotes and jobTimestamp can be pulled the same way)
start_interval = response1_df['startInterval']
start_interval

0    202109202000
Name: startInterval, dtype: int64

In [9]:
# Collect each stock symbol retrieved, for later use in retrieving quotes and details
## Using a 'for' loop
# stock_list = []
# for stock in response1_df['quotes'][0]:
#     stock_list.append(stock['symbol'])
## Using a list comprehension
stock_list = [stock['symbol'] for stock in response1_df['quotes'][0]]
stock_list

['^DJI',
 'BABA',
 'ZIVO',
 'BTC-USD',
 '^GSPC',
 'EDSA',
 '^IXIC',
 'AMZN',
 '^VIX',
 'DISCB',
 'LEN',
 '3333.HK',
 'EGRNY',
 'SPY',
 'NVDA',
 'NIO',
 'NWAU',
 'ARQQ',
 'AAL',
 'TQQQ']

In [10]:
# The quote API query wants the list of symbols as a single comma-separated string
#  with no spaces.  Max is 10 per request.
query_string = ",".join(stock_list)
query_string

'^DJI,BABA,ZIVO,BTC-USD,^GSPC,EDSA,^IXIC,AMZN,^VIX,DISCB,LEN,3333.HK,EGRNY,SPY,NVDA,NIO,NWAU,ARQQ,AAL,TQQQ'

In [11]:
# For now, take the first 10 symbols and run the quote API on them
query_string = ",".join(stock_list[:10])
query_string

'^DJI,BABA,ZIVO,BTC-USD,^GSPC,EDSA,^IXIC,AMZN,^VIX,DISCB'
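
Because the quote API accepts at most 10 symbols per request, covering all 20 trending symbols takes two query strings. The sketch below splits the list into batches; `batch_symbols` is a hypothetical helper, and `stock_list` is retyped from the output above so the block runs on its own.

```python
def batch_symbols(symbols, batch_size=10):
    """Split a list of symbols into comma-separated strings of at most batch_size symbols."""
    return [
        ",".join(symbols[i:i + batch_size])
        for i in range(0, len(symbols), batch_size)
    ]

# The 20 trending symbols retrieved above
stock_list = ['^DJI', 'BABA', 'ZIVO', 'BTC-USD', '^GSPC', 'EDSA', '^IXIC',
              'AMZN', '^VIX', 'DISCB', 'LEN', '3333.HK', 'EGRNY', 'SPY',
              'NVDA', 'NIO', 'NWAU', 'ARQQ', 'AAL', 'TQQQ']

batches = batch_symbols(stock_list)
for batch in batches:
    print(batch)
```

Each string in `batches` can then be passed as the "symbols" value of one request.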

# 2. REAL TIME QUOTE DATA

API domain: https://yfapi.net
API path: /v6/finance/quote
API variables:  querystring  {"symbols" : "string1,string2,string3,..."}

Extract data from the yahoofinancials API (URL path /v6/finance/quote) with "symbols" set to a comma-separated list of up to 10 stock symbols at a time.

Use the query_string of 10 stock symbols built above.

In [12]:
## Query of /v6/finance/quote API

url = "https://yfapi.net/v6/finance/quote"
## Alternate url
# url = "https://rest.yahoofinanceapi.com/v6/finance/quote"

## Warning: no space between symbols
querystring = {"symbols" : query_string}

# Credentials to include
headers = {
    'x-api-key': API_KEY
    }

In [13]:
## Make API request (query string required for this API)
response2 = requests.request("GET", url, headers=headers, params=querystring).json()

# What does the response look like?  We will have to use indexing to get to the 'result' level
pprint.pprint(response2)

{'quoteResponse': {'error': None,
                   'result': [{'ask': 34053.03,
                               'askSize': 0,
                               'averageDailyVolume10Day': 358158333,
                               'averageDailyVolume3Month': 292187460,
                               'bid': 33866.97,
                               'bidSize': 0,
                               'currency': 'USD',
                               'esgPopulated': False,
                               'exchange': 'DJI',
                               'exchangeDataDelayedBy': 0,
                               'exchangeTimezoneName': 'America/New_York',
                               'exchangeTimezoneShortName': 'EDT',
                               'fiftyDayAverage': 35138.49,
                               'fiftyDayAverageChange': -1168.0195,
                               'fiftyDayAverageChangePercent': -0.03324046,
                               'fiftyTwoWeekHigh': 35631.19,
                     

In [14]:
## How many records did we get?
len(response2['quoteResponse']['result'])

10

In [15]:
# Create a new DataFrame from all the results in the dictionary
response2_df = pd.DataFrame(response2['quoteResponse']['result'])
response2_df

Unnamed: 0,language,region,quoteType,quoteSourceName,triggerable,currency,fiftyDayAverage,fiftyDayAverageChange,fiftyDayAverageChangePercent,twoHundredDayAverage,...,ipoExpectedDate,circulatingSupply,lastMarket,volume24Hr,volumeAllCurrencies,fromCurrency,toCurrency,startDate,coinImageUrl,dividendDate
0,en-US,US,INDEX,Delayed Quote,True,USD,35138.49,-1168.0195,-0.03324,34242.938,...,,,,,,,,,,
1,en-US,US,EQUITY,Nasdaq Real Time Price,True,USD,175.28589,-23.795883,-0.135755,209.69588,...,,,,,,,,,,
2,en-US,US,EQUITY,Nasdaq Real Time Price,True,USD,3.277059,1.382941,0.422007,2.215072,...,2021-05-28,,,,,,,,,
3,en-US,US,CRYPTOCURRENCY,CoinMarketCap,True,USD,46515.508,-2815.871,-0.060536,45906.29,...,,18821662.0,CoinMarketCap,43229430000.0,43229430000.0,BTC,USD=X,1367104000.0,https://s.yimg.com/uc/fin/img/reports-thumbnai...,
4,en-US,US,INDEX,Delayed Quote,True,USD,4466.533,-108.80322,-0.02436,4242.1626,...,,,,,,,,,,
5,en-US,US,EQUITY,Nasdaq Real Time Price,True,USD,5.075,6.845,1.348769,5.452971,...,,,,,,,,,,1560125000.0
6,en-US,US,INDEX,Delayed Quote,True,USD,14982.439,-268.53613,-0.017923,14146.466,...,,,,,,,,,,
7,en-US,US,EQUITY,Nasdaq Real Time Price,True,USD,3371.3582,-15.628174,-0.004636,3342.737,...,,,,,,,,,,
8,en-US,US,INDEX,,True,USD,17.817648,7.892351,0.442951,18.38942,...,,,,,,,,,,
9,en-US,US,EQUITY,Nasdaq Real Time Price,True,USD,44.98294,22.51706,0.500569,64.96022,...,,,,,,,,,,


# 3. STOCK HISTORY

API domain: https://yfapi.net
API path: /v8/finance/spark
API variables:  querystring  {"symbols" : "string1,string2,string3,..."}
                interval  (1 minute to 1 month)
                range     (1 day to 1 month to 5 years)

Extract data from the yahoofinancials API (URL path /v8/finance/spark) with "symbols" set to a comma-separated list of up to 10 stock symbols at a time.  Specify the data interval and the range of the desired time span.

Use the query_string of 10 stock symbols.  Returns, for each symbol, a list of timestamps and the matching list of closing prices.

In [16]:
## Query of /v8/finance/spark API

url = "https://yfapi.net/v8/finance/spark"
## Alternate url
# url = "https://rest.yahoofinanceapi.com/v8/finance/spark"

## Warning: no space between symbols
my_interval = "1d"
my_range = "1mo"
querystring = {"symbols" : query_string,
               "interval": my_interval,
               "range"   : my_range
              }

# Credentials to include
headers = {
    'x-api-key': API_KEY
    }

In [17]:
## Make API request (query string required for this API)
response3 = requests.request("GET", url, headers=headers, params=querystring).json()

# What does the response look like?  We will have to use indexing to get to the 'result' level
pprint.pprint(response3)

{'AMZN': {'chartPreviousClose': 3199.95,
          'close': [3265.87,
                    3305.78,
                    3299.18,
                    3316.0,
                    3349.63,
                    3421.57,
                    3470.79,
                    3479.0,
                    3463.12,
                    3478.05,
                    3509.29,
                    3525.5,
                    3484.16,
                    3469.15,
                    3457.17,
                    3450.0,
                    3475.79,
                    3488.24,
                    3462.52,
                    3355.73],
          'dataGranularity': 300,
          'end': None,
          'previousClose': None,
          'start': None,
          'symbol': 'AMZN',
          'timestamp': [1629725400,
                        1629811800,
                        1629898200,
                        1629984600,
                        1630071000,
                        1630330200,
                       

In [19]:
## How many records did we get?
len(response3)

10

In [20]:
# Create a new DataFrame from all the results in the dictionary
response3_df = pd.DataFrame(response3)
response3_df

Unnamed: 0,BABA,BTC-USD,ZIVO,^GSPC,^IXIC,^VIX,EDSA,DISCB,^DJI,AMZN
timestamp,"[1629725400, 1629811800, 1629898200, 162998460...","[1629414000, 1629500400, 1629586800, 162967320...","[1629725400, 1629811800, 1629984600, 163007100...","[1629725400, 1629811800, 1629898200, 162998460...","[1629725400, 1629811800, 1629898200, 162998460...","[1629725400, 1629811800, 1629898200, 162998460...","[1629725400, 1629811800, 1629898200, 162998460...","[1629725400, 1629811800, 1629898200, 162998460...","[1629725400, 1629811800, 1629898200, 162998460...","[1629725400, 1629811800, 1629898200, 162998460..."
symbol,BABA,BTC-USD,ZIVO,^GSPC,^IXIC,^VIX,EDSA,DISCB,^DJI,AMZN
previousClose,,,,,,,,,,
chartPreviousClose,157.96,49339.176,3.2,4441.67,14714.66,18.56,4.2,38.17,35120.08,3199.95
dataGranularity,300,300,300,300,300,300,300,300,300,300
end,,,,,,,,,,
start,,,,,,,,,,
close,"[161.06, 171.7, 169.1, 165.24, 159.47, 162.29,...","[49339.176, 48905.492, 49321.652, 49546.15, 47...","[3.39, 3.27, 3.25, 3.23, 3.24, 3.18, 3.14, 3.3...","[4479.53, 4486.23, 4496.19, 4470.0, 4509.37, 4...","[14942.65, 15019.8, 15041.86, 14945.81, 15129....","[17.15, 17.22, 16.79, 18.84, 16.39, 16.19, 16....","[4.35, 4.46, 4.68, 4.93, 5.01, 5.2, 5.87, 6.12...","[39.34, 40.73, 44.39, 43.13, 43.07, 50.83, 50....","[35335.71, 35366.26, 35405.5, 35213.12, 35455....","[3265.87, 3305.78, 3299.18, 3316.0, 3349.63, 3..."


In [28]:
# Or turn the dictionary of dictionaries into a list of dictionaries before
# feeding it into a DataFrame.  A DataFrame is easiest to build from a list of dictionaries.

list3 = []
for value in response3:
    # print(response3[value])
    list3.append(response3[value])
## Alternate syntax with list comprehension
# list3 = [response3[value] for value in response3]
list3[0]

{'timestamp': [1629725400,
  1629811800,
  1629898200,
  1629984600,
  1630071000,
  1630330200,
  1630416600,
  1630503000,
  1630589400,
  1630675800,
  1631021400,
  1631107800,
  1631194200,
  1631280600,
  1631539800,
  1631626200,
  1631712600,
  1631799000,
  1631885400,
  1632168002],
 'symbol': 'BABA',
 'previousClose': None,
 'chartPreviousClose': 157.96,
 'dataGranularity': 300,
 'end': None,
 'start': None,
 'close': [161.06,
  171.7,
  169.1,
  165.24,
  159.47,
  162.29,
  166.99,
  173.28,
  172.0,
  170.3,
  175.16,
  170.71,
  167.32,
  168.1,
  165.41,
  160.15,
  157.86,
  156.26,
  160.05,
  151.49]}

In [29]:
list3_df = pd.DataFrame(list3)
list3_df

Unnamed: 0,timestamp,symbol,previousClose,chartPreviousClose,dataGranularity,end,start,close
0,"[1629725400, 1629811800, 1629898200, 162998460...",BABA,,157.96,300,,,"[161.06, 171.7, 169.1, 165.24, 159.47, 162.29,..."
1,"[1629414000, 1629500400, 1629586800, 162967320...",BTC-USD,,49339.176,300,,,"[49339.176, 48905.492, 49321.652, 49546.15, 47..."
2,"[1629725400, 1629811800, 1629984600, 163007100...",ZIVO,,3.2,300,,,"[3.39, 3.27, 3.25, 3.23, 3.24, 3.18, 3.14, 3.3..."
3,"[1629725400, 1629811800, 1629898200, 162998460...",^GSPC,,4441.67,300,,,"[4479.53, 4486.23, 4496.19, 4470.0, 4509.37, 4..."
4,"[1629725400, 1629811800, 1629898200, 162998460...",^IXIC,,14714.66,300,,,"[14942.65, 15019.8, 15041.86, 14945.81, 15129...."
5,"[1629725400, 1629811800, 1629898200, 162998460...",^VIX,,18.56,300,,,"[17.15, 17.22, 16.79, 18.84, 16.39, 16.19, 16...."
6,"[1629725400, 1629811800, 1629898200, 162998460...",EDSA,,4.2,300,,,"[4.35, 4.46, 4.68, 4.93, 5.01, 5.2, 5.87, 6.12..."
7,"[1629725400, 1629811800, 1629898200, 162998460...",DISCB,,38.17,300,,,"[39.34, 40.73, 44.39, 43.13, 43.07, 50.83, 50...."
8,"[1629725400, 1629811800, 1629898200, 162998460...",^DJI,,35120.08,300,,,"[35335.71, 35366.26, 35405.5, 35213.12, 35455...."
9,"[1629725400, 1629811800, 1629898200, 162998460...",AMZN,,3199.95,300,,,"[3265.87, 3305.78, 3299.18, 3316.0, 3349.63, 3..."


# 4. Steps beyond

We could save the raw outputs into 3 database tables.

Transformations (either in the database or here in Jupyter Notebook):
From above, we would probably want another table with the stock symbol, timestamp, and closing price.  Do we need a 'time of capture' column?
For the timestamp, we would likely need Python code that transforms the Unix timestamp into MM-DD-YYYY format when inserting into a PostgreSQL table.  Alternatively, we can ingest this field as-is and perform the conversion in the database.
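
A minimal sketch of the Python route, under two assumptions: the dates are wanted in UTC, and one row per symbol and timestamp is the target shape. `to_long_rows` is a hypothetical helper, and `sample_spark` is a shortened copy of the BABA entry from the history response above.

```python
from datetime import datetime, timezone

# Shortened sample shaped like the /v8/finance/spark response above
sample_spark = {
    "BABA": {
        "symbol": "BABA",
        "timestamp": [1629725400, 1629811800, 1629898200],
        "close": [161.06, 171.7, 169.1],
    }
}

def to_long_rows(spark_response):
    """Flatten {symbol: {...}} into one row per (symbol, timestamp)."""
    rows = []
    for symbol, data in spark_response.items():
        for ts, close in zip(data["timestamp"], data["close"]):
            # Unix timestamps count seconds since 1970-01-01 UTC
            date_str = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%m-%d-%Y")
            rows.append({"symbol": symbol, "timestamp": ts, "date": date_str, "close": close})
    return rows

rows = to_long_rows(sample_spark)
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame` gives a table with one row per closing price, which is closer to what a database table expects than the list-valued columns above.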

Load (in the database):
For each raw table above, decide which columns the final set of tables needs.  Then create those tables, run the calculations, and produce the final set of tables.
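
The load step, with the date conversion done in the database, might be sketched as follows. This uses an in-memory SQLite database purely as a stand-in so the example runs anywhere; PostgreSQL has its own date functions, so the SQL would differ there. The table name `stock_history` and its column names are assumptions, and the rows are the first three BABA values from the output above.

```python
import sqlite3

# Long-form rows: (symbol, Unix timestamp, closing price)
rows = [
    ("BABA", 1629725400, 161.06),
    ("BABA", 1629811800, 171.7),
    ("BABA", 1629898200, 169.1),
]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real database
conn.execute(
    """
    CREATE TABLE stock_history (
        symbol  TEXT    NOT NULL,
        ts_unix INTEGER NOT NULL,
        close   REAL,
        PRIMARY KEY (symbol, ts_unix)
    )
    """
)
conn.executemany(
    "INSERT INTO stock_history (symbol, ts_unix, close) VALUES (?, ?, ?)", rows
)
conn.commit()

# Ingest the timestamp as-is, then convert it to MM-DD-YYYY inside the database
query = """
    SELECT symbol, strftime('%m-%d-%Y', ts_unix, 'unixepoch') AS trade_date, close
    FROM stock_history
    ORDER BY ts_unix
"""
result = conn.execute(query).fetchall()
print(result)
conn.close()
```

Storing the raw Unix timestamp and formatting it only at query time keeps the original value available if another date format is needed later.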