<a href="https://colab.research.google.com/github/DavidBillayio/PriceDashboard/blob/main/PriceDashboard.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome to the Price Dashboard

This notebook scrapes price data from Yahoo Finance and exports it to a CSV file.

In [1]:
# first we import the required modules

from bs4 import BeautifulSoup
import requests
import sched, time, datetime
import pandas as pd
import numpy as np

In [2]:
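# Then we define our scheduler class so the program can run automatically
class PeriodicScheduler(object):
    """Runs an action immediately, then again every `interval` seconds."""

    def __init__(self):
        self.scheduler = sched.scheduler(time.time, time.sleep)

    def setup(self, interval, action, actionargs=()):
        # Run the action now, then queue this method to fire again after `interval`
        action(*actionargs)
        self.scheduler.enter(interval, 1, self.setup,
                             (interval, action, actionargs))

    def run(self):
        # Blocks while events are queued; setup re-queues itself, so this never returns
        self.scheduler.run()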
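
The re-queueing pattern used by the scheduler class can be tried out without waiting a full minute per run. The following is only a minimal sketch: `FakeClock`, `run_periodically`, and the `max_runs` limit are hypothetical names introduced for illustration and are not part of the notebook. The fake clock replaces `time.time` and `time.sleep`, so the three runs complete instantly.

```python
import sched


class FakeClock:
    """Hypothetical stand-in for time.time/time.sleep so the demo finishes instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        # Instead of really sleeping, just move the simulated clock forward
        self.now += seconds


def run_periodically(scheduler, interval, action, max_runs):
    """Call action now, then re-queue it every `interval` seconds, max_runs times in total."""
    action()
    if max_runs > 1:
        scheduler.enter(interval, 1, run_periodically,
                        (scheduler, interval, action, max_runs - 1))


clock = FakeClock()
scheduler = sched.scheduler(clock.time, clock.sleep)
call_times = []

run_periodically(scheduler, 60, lambda: call_times.append(clock.time()), max_runs=3)
scheduler.run()  # returns once the queue is empty
print(call_times)
```

Unlike the notebook's class, this variant stops re-queueing after `max_runs` calls, so `scheduler.run()` returns instead of blocking indefinitely.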

In [8]:
#Then we will define our various functions that will run our dashboard
def getPrice(link):
    """Scrape the links to get the prices"""
    r = requests.get(link)
    soup = BeautifulSoup(r.text, 'lxml')
    price = soup.find_all('div', {'class' :'D(ib) Mend(20px)'})[0].find('span').text
    return price

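def periodic(scheduler, interval, action, actionargs=()):
    """Run action now and re-queue it on the given sched.scheduler every interval seconds"""
    # enter() must be called on a scheduler instance, with the arguments passed as one tuple
    scheduler.enter(interval, 1, periodic, (scheduler, interval, action, actionargs))
    action(*actionargs)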

def priceRefresh(links, update_time, Price_df):
    """Iterate through the links, scrape each price, and return a one-row
        DataFrame that periodic_event adds to the .csv file
    """
    # Build the columns from the links argument instead of relying on globals
    update_df = pd.DataFrame([list(np.zeros(len(links) + 1))], columns=['time'] + list(links))
    update_df['time'] = update_time
    for link in links:
        price = getPrice(links[link])
        print('The value of {} is {}'.format(link, price))
        update_df[link] = price
    return update_df

def periodic_event():
    """This is our base function that iterates each time. 
    We call the priceRefresh function then use the update_df 
    to add to the .csv file. The csv file is overwritten each iteration."""
    global Price_df
    update_time = datetime.datetime.now()
    print(update_time)
    update_df = priceRefresh(sitelinks, update_time,Price_df)
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    Price_df = pd.concat([Price_df, update_df], ignore_index=True)
    
    Price_df.to_csv('PriceData.csv', index = False)
    print('Done')
    

In [4]:
# Here are the sitelinks. Note: getPrice looks for the price in one specific
# part of the Yahoo Finance quote page, so links to other sites, or to pages
# with a different layout, may not work.
sitelinks = {
    'BTCUSD' : 'https://ca.finance.yahoo.com/quote/BTC-USD?p=BTC-USD',
    'ETHUSD' : 'https://ca.finance.yahoo.com/quote/ETH-USD?p=ETH-USD',
    'SP500' : 'https://ca.finance.yahoo.com/quote/%5EGSPC?p=^GSPC',
    'CADUSD' : 'https://ca.finance.yahoo.com/quote/CADUSD=X?p=CADUSD=X',
    'VIX' : 'https://ca.finance.yahoo.com/quote/%5EVIX?p=^VIX'
        }
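
The links above share one pattern: the symbol appears in the path with `^` percent-encoded as `%5E`, and again unencoded in the `p` query parameter. As a rough sketch, a hypothetical helper (`yahoo_quote_url` is not part of the notebook) could build such links from a symbol. Whether the pattern holds for symbols other than the five listed is an assumption.

```python
from urllib.parse import quote


def yahoo_quote_url(symbol):
    """Build a quote-page link in the same shape as the sitelinks entries."""
    # Percent-encode the path part ('^' becomes '%5E') but leave '=' as is,
    # matching links such as .../quote/CADUSD=X?p=CADUSD=X
    path_symbol = quote(symbol, safe="=")
    return f"https://ca.finance.yahoo.com/quote/{path_symbol}?p={symbol}"


for symbol in ["BTC-USD", "^GSPC", "CADUSD=X"]:
    print(yahoo_quote_url(symbol))
```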

In [5]:
# Here is an example:
print(getPrice('https://ca.finance.yahoo.com/quote/%5EIXIC?p=^IXIC'))
print(getPrice('https://www.tradingview.com/symbols/NASDAQ-NDAQ/'))

# The first call prints the NASDAQ value scraped from Yahoo Finance.
# The second call raises "IndexError: list index out of range" because the
# TradingView page has no element that matches the find_all search, so the
# result list is empty and indexing it with [0] fails.

# This error won't affect our next code box.

11,420.98


IndexError: ignored
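
If the same `IndexError` were raised inside the scheduled loop, it would stop the updates. One possible guard, shown here only as a sketch, is a small wrapper that returns `None` when no match is found. Both `safe_price` and `fake_get_price` are hypothetical names; the latter merely imitates how `getPrice` behaves so that the example runs without network access.

```python
def safe_price(get_price, link):
    """Return get_price(link), or None when the scraper finds no matching element."""
    try:
        return get_price(link)
    except IndexError:
        # The find_all search returned an empty list, so [0] failed
        return None


def fake_get_price(link):
    """Hypothetical stand-in for getPrice: only Yahoo pages 'contain' the price element."""
    matches = ["11,420.98"] if "yahoo.com" in link else []
    return matches[0]


print(safe_price(fake_get_price, "https://ca.finance.yahoo.com/quote/%5EIXIC?p=^IXIC"))
print(safe_price(fake_get_price, "https://www.tradingview.com/symbols/NASDAQ-NDAQ/"))
```

In the notebook, the real `getPrice` would be passed in place of `fake_get_price`.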

In [None]:
# After all of that setup, we initialize our Price dataframe and then call our scheduler

column_names = list(sitelinks.keys())
column_names.insert(0, 'time')
# Price_df starts with one placeholder row of zeros, which is also written to the .csv file
Price_df = pd.DataFrame([list(np.zeros(len(sitelinks) + 1))], columns=column_names)



interval = 60  # seconds between updates, i.e. every minute
periodic_scheduler = PeriodicScheduler()
periodic_scheduler.setup(interval, periodic_event)  # runs the event once now and queues the next run
periodic_scheduler.run()  # starts the scheduler; this blocks until the cell is interrupted

2020-10-08 20:25:36.024187
The value of BTCUSD is 10,876.52
The value of ETHUSD is 351.00
The value of SP500 is 3,446.83
The value of CADUSD is 0.7578
The value of VIX is 26.42
Done


If you open the file folder icon on the left-hand side, you will notice that our PriceData.csv file is created after the first iteration and updated with new data every minute.
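
The prices are scraped as text, so values such as `10,876.52` carry a thousands separator and need converting before any arithmetic. The sketch below reads a sample in the assumed layout of PriceData.csv (column names taken from `column_names`, values from the printed output above) and converts the prices to floats. `parse_price` is a hypothetical helper, and the sample leaves out the placeholder row of zeros.

```python
import csv
import io


def parse_price(text):
    """Convert a scraped price string such as '10,876.52' to a float."""
    return float(text.replace(",", ""))


# Assumed sample of PriceData.csv; prices containing a comma are quoted in CSV
sample_csv = io.StringIO(
    'time,BTCUSD,ETHUSD,SP500,CADUSD,VIX\n'
    '2020-10-08 20:25:36.024187,"10,876.52",351.00,"3,446.83",0.7578,26.42\n'
)

rows = [
    # Keep the timestamp as text and convert every other column to a float
    {name: (value if name == "time" else parse_price(value))
     for name, value in row.items()}
    for row in csv.DictReader(sample_csv)
]
print(rows[0]["BTCUSD"], rows[0]["SP500"])
```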