This notebook contains code to **retrieve** 'beta' and 'Marketcap' values from Yahoo Finance. It requires the input file ```symbol_sgx.csv```, which lists the stock symbols in a ```Symbol``` column.

Note that retrieving the 'beta' and 'Marketcap' data takes a long time (>20 min). None of the methods explored gave a shorter runtime:
* via the ```yahoofinancials``` package (>90 min)
* via ```info = yf.Ticker(stock).info```, and then ```beta = info['beta']``` (buggy: crashes if the beta or marketcap value is not available for a ticker)
* via ```info = yf.Ticker(stock).info```, and then ```beta = info.get('beta')``` (current method, >20 min)
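The difference between the last two methods comes down to how a missing key is handled. The sketch below uses a plain dictionary as a stand-in for ```info``` (an assumption for illustration; no request is made to Yahoo Finance) to show that subscript access raises ```KeyError``` while ```.get()``` returns ```None```.

```python
# Stand-in for yf.Ticker(stock).info when a ticker has no beta published.
# The dict contents are illustrative, not a real API response.
info = {"symbol": "NLC.SI", "marketCap": 34656628}

# Subscript access raises KeyError when the key is absent.
try:
    beta = info["beta"]
except KeyError:
    beta = "KeyError raised"

# .get() returns None (or a supplied default) instead of raising.
beta_safe = info.get("beta")
marketcap = info.get("marketCap")

print(beta, beta_safe, marketcap)
```

With ```.get()```, a ticker without a beta simply produces a missing value in the resulting table rather than stopping the loop.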

In [1]:
import pandas as pd

def symbols_to_list(filepath):
    df = pd.read_csv(filepath)
    return df['Symbol'].tolist()

symbol_sgx = symbols_to_list('symbol_sgx.csv')

In [2]:
import yfinance as yf
import time

# Collect one row per ticker and build the DataFrame once at the end;
# this avoids calling pd.concat on every iteration.
rows = []

start_time = time.time()

for stock in symbol_sgx:
    info = yf.Ticker(stock).info
    # .get() returns None when a ticker has no beta or market cap
    rows.append({'Stock': stock,
                 'Beta': info.get('beta'),
                 'Marketcap': info.get('marketCap')})

end_time = time.time()
print(f'duration = {((end_time-start_time)/60):.2f} min')

df = pd.DataFrame(rows, columns=['Stock', 'Beta', 'Marketcap'])
df

duration = 22.58 min


Unnamed: 0,Stock,Beta,Marketcap
0,502.SI,3.334184,6134100
1,NLC.SI,,34656628
2,1Y1.SI,,37729640
3,BQC.SI,-0.143465,12670270
4,BTJ.SI,1.117496,48901932
...,...,...,...
722,TID.SI,,
723,LG9.SI,,
724,KJ7.SI,,
725,KV4.SI,,
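The retrieval loop makes one request per ticker, one after another, so most of the runtime is likely spent waiting on the network. One option that was not among the methods explored here is to issue the requests from a thread pool. The following is only a sketch under that assumption: ```fetch_beta_marketcap``` and ```fetch_all``` are hypothetical helper names, ```max_workers=8``` is an arbitrary choice, the speed-up has not been measured, and Yahoo Finance may throttle or reject parallel requests.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_beta_marketcap(stock):
    """Fetch one ticker's values using the same lookups as the loop above."""
    import yfinance as yf  # imported lazily so fetch_all can be used without it
    info = yf.Ticker(stock).info
    return {'Stock': stock,
            'Beta': info.get('beta'),
            'Marketcap': info.get('marketCap')}


def fetch_all(symbols, fetch_one=fetch_beta_marketcap, max_workers=8):
    """Apply fetch_one to every symbol in a thread pool.

    Executor.map returns results in the same order as the input symbols.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, symbols))
```

It could then be used as ```df = pd.DataFrame(fetch_all(symbol_sgx))```. Passing ```fetch_one``` as a parameter makes it possible to try the helper with a stub function before pointing it at the real API.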


The following line of code has been commented out so that the user does not accidentally overwrite the input file ```data_beta_marketcap_raw.csv``` already included in the submission.

In [3]:
# # save raw df to csv before processing 
# df.to_csv('data_beta_marketcap_raw.csv',index=False)
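An alternative to commenting the line out is to write the file only when it does not already exist. This is a minimal sketch of that idea; ```save_if_absent``` is a hypothetical helper name and is not part of the submission.

```python
from pathlib import Path


def save_if_absent(df, filepath):
    """Write df to CSV only if filepath does not exist; return True if written."""
    path = Path(filepath)
    if path.exists():
        print(f'{path} already exists, not overwriting')
        return False
    df.to_csv(path, index=False)
    return True
```

Calling ```save_if_absent(df, 'data_beta_marketcap_raw.csv')``` would then leave the included file untouched and only create it if it has been deleted.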