## Section 3.1: Extract data from Yahoo! Finance page
<hr>

This notebook does three things:
 - Loads the `yfinance` module and the tickers from the output file of the previous notebook
 - Selects the top N tickers, as specified by the user, and downloads the open price, closing price, and volume of each stock for the last 30 days
 - Stores the values in a CSV file for further analysis and visualization

In [None]:
import pandas as pd
from datetime import date, timedelta

In [None]:
# Load the yfinance module, installing it first if it is not already available
success_msg = 'Yahoo! Finance module loaded.'
try:
    import yfinance as yf
    print(success_msg)
except ModuleNotFoundError:
    print('Installing Yahoo! Finance python module.\n')
    !pip install yfinance
    import yfinance as yf
    print(success_msg)

In [None]:
# load the tickers dataframe from the previous notebook's output csv file
df = pd.read_csv('./Outputfile.csv')

In [None]:
df.head()

In [None]:
# Get the number of stocks the user wants to analyze
num_of_stocks = int(input("How many stocks do you want to analyze : "))

In [None]:
# Get the num_of_stocks most frequently occurring tickers from the tickers dataframe
tickers = df['Ticker'].dropna()
top_tickers = tickers.value_counts()[:num_of_stocks].index.tolist()
print(top_tickers)
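
As a quick illustration of the selection step above, here is a minimal sketch on made-up data. The `sample` dataframe and the `top_n_tickers` helper are hypothetical and are not part of this notebook; they only show how `dropna()` followed by `value_counts()` ranks tickers by frequency.

```python
import pandas as pd

# Hypothetical stand-in for the previous notebook's output (not real data)
sample = pd.DataFrame(
    {"Ticker": ["AAPL", "TSLA", "AAPL", None, "MSFT", "AAPL", "TSLA"]}
)

def top_n_tickers(frame, n):
    """Return the n most frequent non-null values in the 'Ticker' column."""
    # value_counts() sorts by frequency in descending order
    counts = frame["Ticker"].dropna().value_counts()
    # head(n) simply returns every ticker when n exceeds the number available
    return counts.head(n).index.tolist()

top_two = top_n_tickers(sample, 2)
print(top_two)
```

Asking for more tickers than exist in the data returns the full ranked list rather than raising an error, so a large user input is harmless here.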

Now that we have the top N tickers, we loop through the list and download the last month of historical data for each one using the `yfinance.download()` function. We then add a date column and a ticker column to each dataset for visualization purposes. Finally, we concatenate the per-ticker datasets and save the result in a CSV file.

In [None]:
# create an empty list to store the historical stock details temporarily.
temp_dataframes = []
# loop through the list of top N tickers and download the historical data (last 30 days) for each
end_date = date.today()
start_date = end_date - timedelta(days=30)
for tick in top_tickers:
    temp_stock = yf.download(tick, start=start_date.isoformat(), end=end_date.isoformat())

    # copy the DatetimeIndex into a 'Date' column formatted as YYYY-MM-DD
    temp_stock['Date'] = temp_stock.index.strftime('%Y-%m-%d')

    # add the ticker column to the dataframe
    temp_stock['Ticker'] = tick
    # append the dataframe to the temp_dataframes list
    temp_dataframes.append(temp_stock)

# concatenate the list into a pandas dataframe and store it as CSV
all_tickers = pd.concat(temp_dataframes)
all_tickers.to_csv('Tickers.csv', index=False)
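
The sketch below shows, without any network access, why the date is copied into its own column before saving with `index=False`. The `fake_history` helper and the `AAA`/`BBB` tickers are hypothetical stand-ins for downloaded data, and the CSV is written to an in-memory buffer instead of `Tickers.csv`.

```python
import io
import pandas as pd

def fake_history(ticker, closes):
    """Build a small date-indexed price table (hypothetical stand-in for a download)."""
    idx = pd.date_range("2024-01-02", periods=len(closes), freq="D")
    frame = pd.DataFrame({"Close": closes}, index=idx)
    # Copy the index into a plain string column so it survives index=False
    frame["Date"] = frame.index.strftime("%Y-%m-%d")
    # Label every row so the tickers stay distinguishable after concatenation
    frame["Ticker"] = ticker
    return frame

frames = [fake_history("AAA", [10.0, 11.0]), fake_history("BBB", [20.0, 21.0, 22.0])]
combined = pd.concat(frames)

# Write to an in-memory buffer and read it back to inspect the stored layout
buffer = io.StringIO()
combined.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)
print(round_trip)
```

Because the index is dropped on write, the `Date` and `Ticker` columns are the only record of which row belongs to which day and stock.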
    
