# Get Stock Price Data 

As machine learning practitioners, we need to collect stock price data for regression analysis and time series analysis.

We can easily download it from <a href="https://finance.yahoo.com/">Yahoo Finance</a>. But an application that analyzes up-to-date stock prices needs to fetch the latest data each time it runs instead of relying on a previously downloaded file. This notebook shows how to get stock price data for any time interval using Python.

Yahoo Finance is one of the most popular websites for collecting stock price data. You can visit the website, enter the company’s name or stock symbol, and download the dataset. But to get the latest data every time your code runs, you can use yfinance, an open-source Python library that fetches stock price data from Yahoo Finance. It is a community-maintained package, not an official API provided by Yahoo.

**To use yfinance, install it with pip: `pip install yfinance`**

In [1]:
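# Data handling and plotting
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Downloader for Yahoo Finance data
import yfinance as yf

# Silence library warnings to keep the notebook output readable
import warnings
warnings.filterwarnings("ignore")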



In [11]:
from datetime import date, timedelta

# Date window: the last 360 days up to today, as YYYY-MM-DD strings
end_date = date.today().strftime("%Y-%m-%d")
start_date = (date.today() - timedelta(days=360)).strftime("%Y-%m-%d")

# Download daily AAPL prices for that window (the end date is exclusive)
data = yf.download('AAPL', start=start_date, end=end_date, progress=False)
data.head()

| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 2022-05-16 | 145.550003 | 147.520004 | 144.179993 | 145.539993 | 144.876221 | 86643800 |
| 2022-05-17 | 148.860001 | 149.770004 | 146.679993 | 149.240005 | 148.559341 | 78336300 |
| 2022-05-18 | 146.850006 | 147.360001 | 139.899994 | 140.820007 | 140.17775 | 109742900 |
| 2022-05-19 | 139.880005 | 141.660004 | 136.600006 | 137.350006 | 136.723587 | 136095600 |
| 2022-05-20 | 139.089996 | 140.699997 | 132.610001 | 137.589996 | 136.962479 | 137426100 |


In [12]:
data.describe()

| | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| count | 246.0 | 246.0 | 246.0 | 246.0 | 246.0 | 246.0 |
| mean | 149.566504 | 151.558089 | 147.877439 | 149.850691 | 149.514176 | 76687670.0 |
| std | 11.160746 | 10.988558 | 11.417049 | 11.2692 | 11.305615 | 22812650.0 |
| min | 126.010002 | 127.769997 | 124.169998 | 125.019997 | 124.829399 | 35195900.0 |
| 25% | 142.067497 | 143.537502 | 139.924995 | 142.420002 | 141.96542 | 62145980.0 |
| 50% | 148.864998 | 150.889999 | 147.264999 | 149.375 | 148.933762 | 73333900.0 |
| 75% | 157.235004 | 158.472504 | 154.605 | 157.317505 | 156.698341 | 86620000.0 |
| max | 173.75 | 176.149994 | 173.119995 | 174.550003 | 173.99527 | 164762400.0 |


In [13]:
data.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 246 entries, 2022-05-16 to 2023-05-08
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Open       246 non-null    float64
 1   High       246 non-null    float64
 2   Low        246 non-null    float64
 3   Close      246 non-null    float64
 4   Adj Close  246 non-null    float64
 5   Volume     246 non-null    int64  
dtypes: float64(5), int64(1)
memory usage: 13.5 KB


In [8]:
data.shape

(246, 6)

In [15]:
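# Copy the Date index into a regular column, put it first,
# then replace the index with plain integers 0..n-1
data["Date"] = data.index
data = data[["Date", "Open", "High",
             "Low", "Close", "Adj Close", "Volume"]]
data.reset_index(drop=True, inplace=True)
data.head()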

| | Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|---|
| 0 | 2022-05-16 | 145.550003 | 147.520004 | 144.179993 | 145.539993 | 144.876221 | 86643800 |
| 1 | 2022-05-17 | 148.860001 | 149.770004 | 146.679993 | 149.240005 | 148.559341 | 78336300 |
| 2 | 2022-05-18 | 146.850006 | 147.360001 | 139.899994 | 140.820007 | 140.17775 | 109742900 |
| 3 | 2022-05-19 | 139.880005 | 141.660004 | 136.600006 | 137.350006 | 136.723587 | 136095600 |
| 4 | 2022-05-20 | 139.089996 | 140.699997 | 132.610001 | 137.589996 | 136.962479 | 137426100 |
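
The reshaping step in the cell above can probably be written more compactly with `DataFrame.reset_index()`, which moves a named index into a regular column and installs a fresh integer index. The sketch below uses a tiny hand-built frame as a stand-in for the downloaded data; the name `prices` and the choice of two columns are assumptions made for illustration, and it assumes the index is named `Date`, as it is in the output shown earlier.

```python
import pandas as pd

# Stand-in for the downloaded frame (hypothetical, two rows only):
# a DatetimeIndex named "Date", with values copied from the output above.
idx = pd.DatetimeIndex(["2022-05-16", "2022-05-17"], name="Date")
prices = pd.DataFrame(
    {"Open": [145.550003, 148.860001], "Close": [145.539993, 149.240005]},
    index=idx,
)

# reset_index() turns the named index into a leading "Date" column
# and gives the result a default 0..n-1 integer index.
flat = prices.reset_index()

print(flat.columns.tolist())
print(flat.index.tolist())
```

Because `reset_index()` returns a new frame, `prices` itself keeps its date index, which is handy if later steps still need date-based slicing.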


## Summary

As the final output shows, the dataset has the same columns as the file we would download manually from Yahoo Finance: Date, Open, High, Low, Close, Adj Close, and Volume. This is how we can get stock price data using Python.
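
For an application that needs fresh data on every run, the download steps could be wrapped in a small function. This is a minimal sketch, not part of the original notebook: the function name `fetch_recent_prices` and its `today` and `download` parameters are hypothetical, added so the date window can be pinned and the downloader swapped out when testing without network access. By default it calls `yf.download` with the same arguments used earlier.

```python
from datetime import date, timedelta


def fetch_recent_prices(symbol, days=360, today=None, download=None):
    """Download daily prices for `symbol` covering the last `days` days.

    `today` and `download` are hypothetical hooks for this sketch;
    they are not part of yfinance.
    """
    if download is None:
        # Imported here so the function can be exercised with a stand-in
        # downloader even where yfinance is not installed.
        import yfinance as yf
        download = yf.download

    end = today or date.today()
    start = end - timedelta(days=days)
    return download(
        symbol,
        start=start.strftime("%Y-%m-%d"),
        end=end.strftime("%Y-%m-%d"),
        progress=False,
    )


# Typical call (needs network access):
# data = fetch_recent_prices("AAPL")
```

Passing a different `days` value changes the length of the window, and passing another ticker symbol fetches a different stock with the same code path.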