
# Extracting data with the yfinance library

yfinance is an open-source library for accessing financial information from Yahoo! Finance.

For more information about yfinance:

https://algotrading101.com/learn/yfinance-guide/

In [None]:
# Collecting the data from Yahoo Finance
import yfinance as yf

# Data manipulation
import pandas as pd

# Date handling
import datetime
from datetime import date, timedelta

In [None]:
def extraer_data(Ticker, Name):
  """Download about two years of daily data for Ticker and label it with Name."""
  # Window: the last 730 days (about 2 years), ending today
  today = date.today()

  # Start and end dates as YYYY-MM-DD strings
  end_date = today.strftime("%Y-%m-%d")
  start_date = (today - timedelta(days=730)).strftime("%Y-%m-%d")

  # Download the data into a dataframe
  # (yfinance treats `end` as exclusive)
  data = yf.download(Ticker,
                     start=start_date,
                     end=end_date,
                     progress=False)

  # Add the Ticker and a readable Name as the first two columns,
  # since the ticker symbol alone can be hard to recognize
  data.insert(0, 'Name', Name)
  data.insert(0, 'Ticker', Ticker)
  return data
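
The date arithmetic inside `extraer_data` can be checked on its own, without a network call. The sketch below is only illustrative: `date_window` and its `today` parameter are hypothetical names introduced here so that the result is reproducible, and are not part of the notebook or of yfinance.

```python
from datetime import date, timedelta

def date_window(days=730, today=None):
    """Return (start, end) as 'YYYY-MM-DD' strings for a look-back window."""
    # `today` is injectable only to make the example deterministic
    end = today or date.today()
    start = end - timedelta(days=days)
    return start.strftime("%Y-%m-%d"), end.strftime("%Y-%m-%d")

# Fixed reference date chosen for illustration
start, end = date_window(today=date(2024, 2, 27))
print(start, end)
```

Note that 730 calendar days is only approximately two years, and the number of trading days in the window is smaller because weekends and market holidays produce no rows.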

The resulting dataframe has the following columns:

Ticker: The stock ticker symbol.

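Name: A readable name we give to the instrument, since the ticker alone can be hard to recognize.

Date: The trading date. It is the dataframe's index rather than a regular column.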

Open: The opening price of the stock for the day.

High: The highest price of the stock during the day.

Low: The lowest price of the stock during the day.

Close: The closing price of the stock for the day.

Adj Close: The adjusted closing price, which accounts for corporate actions such as dividends and stock splits.

Volume: The number of shares traded during the day.
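
Depending on the installed yfinance version, the downloaded columns may not match this list exactly. Some releases return a two-level column index of (field, ticker) even for a single ticker, and `Adj Close` may be absent when prices are adjusted automatically. The sketch below uses only pandas on a synthetic frame with made-up values; `flatten_columns` is a hypothetical helper, not a yfinance function, and it assumes a single ticker so that dropping the ticker level loses no information.

```python
import pandas as pd

def flatten_columns(df):
    """Return a copy whose columns are single-level (field names only)."""
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # Keep the first level (e.g. 'Close'); assumes one ticker per frame
        out.columns = out.columns.get_level_values(0)
    return out

# Synthetic frame mimicking a two-level (field, ticker) layout
cols = pd.MultiIndex.from_tuples([("Close", "^GSPC"), ("Volume", "^GSPC")])
raw = pd.DataFrame(
    [[4000.0, 100], [4010.0, 120]],
    index=pd.to_datetime(["2024-01-02", "2024-01-03"]),
    columns=cols,
)

flat = flatten_columns(raw)
print(list(flat.columns))
```

A frame that already has single-level columns passes through unchanged, so the helper could be applied right after the download regardless of version.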

In [None]:
# Project: Basic Quantitative Analysis of Index Markets
# The data here is extracted and formatted as required by this specific project.
# Other projects might have other requirements.

# Obtaining the data for the indices:
SP500 = extraer_data('^GSPC', 'SP500')
NASDAQ = extraer_data('^IXIC', 'NASDAQ')
DowJones = extraer_data('^DJI', 'DowJones')

# Concatenating the three dataframes
data = pd.concat([SP500, NASDAQ, DowJones], axis=0)
# Sanity check: number of rows per ticker
print(data['Ticker'].value_counts())
data.reset_index(inplace=True)

# Pivoting to wide format: one row per date, one Close-price column per index
index_data = data.pivot(index='Date', columns='Name', values='Close')

# Saving the dataset
# Replace the path with the location where the file should be saved
# (this one assumes Google Drive is mounted in Colab)
index_data.to_csv('/content/drive/MyDrive/Assets/index_data.csv')
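
The concatenate-then-pivot step can be tried on a small synthetic dataset to see the change from long to wide format. This is a minimal sketch: `make_frame`, the tickers `AAA` and `BBB`, and all prices are made up for illustration and stand in for the frames returned by `extraer_data`.

```python
import pandas as pd

def make_frame(ticker, name, closes):
    """Build a tiny frame shaped like the output of extraer_data."""
    dates = pd.to_datetime(["2024-01-02", "2024-01-03"])
    df = pd.DataFrame({"Close": closes}, index=pd.Index(dates, name="Date"))
    df.insert(0, "Name", name)
    df.insert(0, "Ticker", ticker)
    return df

# Long format: the rows of each instrument are stacked one after another
long_df = pd.concat(
    [make_frame("AAA", "IndexA", [100.0, 101.0]),
     make_frame("BBB", "IndexB", [200.0, 198.0])],
    axis=0,
).reset_index()

# Wide format: one row per date, one column per instrument
wide = long_df.pivot(index="Date", columns="Name", values="Close")
print(wide)
```

`pivot` requires each (Date, Name) pair to appear only once; duplicated pairs raise an error, which is one reason to check the row counts per ticker before pivoting.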

In [None]:
index_data.describe()

Name,DowJones,NASDAQ,SP500
count,501.0,501.0,501.0
mean,33741.720625,12687.791115,4219.886035
std,1981.121175,1418.463676,325.163433
min,28725.509766,10213.290039,3577.030029
25%,32799.921875,11466.980469,3970.040039
50%,33684.53125,12500.570312,4158.240234
75%,34517.730469,13751.400391,4451.140137
max,39131.53125,16041.620117,5088.799805
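
When reading this summary, note that `count` reports the number of non-missing values in each column, and that the `50%` row is the median. The sketch below shows this on a synthetic frame with made-up values and one deliberately missing entry; it is not derived from the index data above.

```python
import numpy as np
import pandas as pd

# Synthetic prices; IndexB has one missing value on purpose
prices = pd.DataFrame({
    "IndexA": [100.0, 102.0, 104.0],
    "IndexB": [50.0, np.nan, 54.0],
})

summary = prices.describe()
print(summary.loc[["count", "mean", "50%"]])
```

Equal counts across all columns, as in the output above, suggest that the three indices have a Close value for every date in the pivoted table.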
