Installing the yfinance library to extract data from Yahoo Finance

Yahoo Finance is a great source for stock market data, including historical stock prices, dividends, and financial indicators. yfinance is a Python library for downloading that data.

!pip install yfinance

In [86]:
import yfinance as yf
import pandas as pd


#### Download stock data for Apple (AAPL) from 2020-01-01 up to (but not including) 2025-01-01


In [18]:
data = yf.download('AAPL', start='2020-01-01', end='2025-01-01')


[*********************100%***********************]  1 of 1 completed


In [110]:
# Creating the DataFrame (yf.download already returns a DataFrame, so this step is optional)
df = pd.DataFrame(data)

In [100]:
df.head()

Price,Close,High,Low,Open,Volume
Ticker,AAPL,AAPL,AAPL,AAPL,AAPL
Date,,,,,
2020-01-02,72.716072,72.776598,71.466812,71.721019,135480400
2020-01-03,72.009117,72.771745,71.783962,71.941328,146322800
2020-01-06,72.582909,72.621646,70.876075,71.127866,118387200
2020-01-07,72.241554,72.849231,72.021238,72.592601,108872000
2020-01-08,73.403648,73.706279,71.943759,71.943759,132079200


### Understanding the Data
This gives you historical stock data for Apple (AAPL) between the given start and end dates. The columns typically include:

Open: The opening price.

High: The highest price of the day.

Low: The lowest price of the day.

Close: The closing price.

Volume: The trading volume.
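As a quick sanity check on these columns, here is a minimal sketch that rebuilds the five rows from the `df.head()` output above by hand (so it runs without a network connection) and verifies that Low and High really bound Open and Close. The names `sample`, `prices_consistent`, and `daily_range` are made up for this illustration and are not part of the notebook.

```python
import pandas as pd

# First five AAPL rows, copied from the df.head() output above.
sample = pd.DataFrame(
    {
        "Open": [71.721019, 71.941328, 71.127866, 72.592601, 71.943759],
        "High": [72.776598, 72.771745, 72.621646, 72.849231, 73.706279],
        "Low": [71.466812, 71.783962, 70.876075, 72.021238, 71.943759],
        "Close": [72.716072, 72.009117, 72.582909, 72.241554, 73.403648],
        "Volume": [135480400, 146322800, 118387200, 108872000, 132079200],
    },
    index=pd.to_datetime(
        ["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07", "2020-01-08"]
    ),
)

# Low must not exceed Open or Close, and High must not fall below them.
prices_consistent = (
    (sample["Low"] <= sample[["Open", "Close"]].min(axis=1))
    & (sample["High"] >= sample[["Open", "Close"]].max(axis=1))
).all()

# Daily trading range: how far the price moved within each session.
daily_range = sample["High"] - sample["Low"]

print(prices_consistent)
print(daily_range.round(6))
```

The same two expressions should work on the full downloaded frame once its columns have a single level.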

#### Volume: The trading volume
Trading volume is the total number of units (such as shares of a stock or coins of a cryptocurrency) that are traded in a given period of time. It shows how much activity there is in the market for that asset.

For example, if 1000 shares of a stock change hands today, the trading volume for the day is 1000.

High trading volume means there’s a lot of interest in the asset, and low volume means fewer people are buying or selling it. It's useful for understanding how active something is in the market.
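The 1000-share example can be sketched in a few lines of plain Python. The `trades` list below is hypothetical data invented for illustration; each entry is one executed trade (a buyer matched with a seller), so each trade's shares are counted once.

```python
# Hypothetical trades for one stock on one day: (price, shares).
trades = [(100.0, 400), (100.5, 350), (99.8, 250)]

# Trading volume is the total number of shares that changed hands.
volume = sum(shares for _, shares in trades)

print(volume)  # 1000
```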

In [101]:
df.columns

MultiIndex([( 'Close', 'AAPL'),
            (  'High', 'AAPL'),
            (   'Low', 'AAPL'),
            (  'Open', 'AAPL'),
            ('Volume', 'AAPL')],
           names=['Price', 'Ticker'])

So the columns are a MultiIndex. We could work with the MultiIndex directly,
but here we will convert it to a single level.

(A MultiIndex in pandas is an index with multiple levels, also called a hierarchical index, which lets you represent and work with more complex data structures.)
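To make the idea concrete, here is a small sketch that builds a two-level column index by hand, using the same level names (`Price`, `Ticker`) that `df.columns` shows above. The frame `demo` and the variables `close` and `aapl` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Two-level columns: the first level is the price field, the second the ticker.
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAPL"]], names=["Price", "Ticker"]
)
demo = pd.DataFrame(
    [[72.716072, 135480400], [72.009117, 146322800]],
    index=pd.to_datetime(["2020-01-02", "2020-01-03"]),
    columns=columns,
)

# A single column is addressed with a tuple, one entry per level.
close = demo[("Close", "AAPL")]

# xs() takes a cross-section: every column for one ticker, with that level dropped.
aapl = demo.xs("AAPL", axis=1, level="Ticker")

print(demo.columns.nlevels)
print(list(aapl.columns))
```

With more than one ticker in the download, the second level is what keeps the tickers apart, which is why the hierarchy is there in the first place.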

Before that, let's create a copy of our DataFrame.

In [111]:
df1 = df.copy()

In [112]:
# Convert MultiIndex columns to a single level by combining the names
df1.columns = [' '.join(col) for col in df1.columns]


The columns now have a single level, so we can start the analysis.
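Joining the level names is one way to flatten the columns. When there is only one ticker, another option is to drop the `Ticker` level entirely with `DataFrame.droplevel`, which gives plain `Close`, `High`, ... names right away. The sketch below compares both on a tiny hand-built frame; `demo`, `joined`, and `dropped` are illustrative names.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("Close", "AAPL"), ("High", "AAPL")], names=["Price", "Ticker"]
)
demo = pd.DataFrame([[72.716072, 72.776598]], columns=columns)

# Option 1: join the level names, as done above.
joined = demo.copy()
joined.columns = [" ".join(col) for col in joined.columns]

# Option 2: drop the ticker level (only safe when there is a single ticker,
# otherwise the remaining names would collide).
dropped = demo.droplevel("Ticker", axis=1)

print(list(joined.columns))   # ['Close AAPL', 'High AAPL']
print(list(dropped.columns))  # ['Close', 'High']
```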

In [113]:
df1.head()

Date,Close AAPL,High AAPL,Low AAPL,Open AAPL,Volume AAPL
2020-01-02,72.716072,72.776598,71.466812,71.721019,135480400
2020-01-03,72.009117,72.771745,71.783962,71.941328,146322800
2020-01-06,72.582909,72.621646,70.876075,71.127866,118387200
2020-01-07,72.241554,72.849231,72.021238,72.592601,108872000
2020-01-08,73.403648,73.706279,71.943759,71.943759,132079200


Date is still the index, but we want it as a regular column.

So let's reset the index.

In [114]:
df1 = df1.reset_index()

In [115]:
df1.head()

,Date,Close AAPL,High AAPL,Low AAPL,Open AAPL,Volume AAPL
0,2020-01-02,72.716072,72.776598,71.466812,71.721019,135480400
1,2020-01-03,72.009117,72.771745,71.783962,71.941328,146322800
2,2020-01-06,72.582909,72.621646,70.876075,71.127866,118387200
3,2020-01-07,72.241554,72.849231,72.021238,72.592601,108872000
4,2020-01-08,73.403648,73.706279,71.943759,71.943759,132079200
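Here is a small sketch of what `reset_index` does, on a hand-built two-row frame (`demo`, `flat`, and `restored` are illustrative names). The named index becomes an ordinary column and a fresh 0, 1, 2, ... index takes its place; `set_index` reverses the step if the dates are needed as the index again later, for example for time-based slicing.

```python
import pandas as pd

demo = pd.DataFrame(
    {"Close AAPL": [72.716072, 72.009117]},
    index=pd.DatetimeIndex(["2020-01-02", "2020-01-03"], name="Date"),
)

# Move the Date index into a regular column.
flat = demo.reset_index()

# Put it back as the index when needed.
restored = flat.set_index("Date")

print(list(flat.columns))  # ['Date', 'Close AAPL']
print(list(flat.index))    # [0, 1]
print(restored.index.name)  # Date
```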


#### Changing the column names

In [116]:
df1.columns = df1.columns.str.replace(' AAPL', '')

In [131]:
df1.rename(columns={'Close':'Closing_Price',
                   'High': 'Heighest_Price',
                   'Low':'Lowest_Price',
                   'Open':'Opeaning Price',
                   'Volume':'Treading_Volume'}, inplace = True)
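One thing to watch with `rename`: by default it silently skips keys that do not match any column, so a typo in the mapping goes unnoticed. A minimal sketch, assuming a reasonably recent pandas where `rename` accepts `errors="raise"`; the frame `demo` and the flag `typo_detected` are illustrative, and `regex=False` is passed only to make the literal-string replacement explicit.

```python
import pandas as pd

demo = pd.DataFrame(columns=["Date", "Close AAPL", "Low AAPL"])

# Strip the ticker suffix as a literal string, not a regular expression.
demo.columns = demo.columns.str.replace(" AAPL", "", regex=False)

# errors="raise" makes rename fail loudly if a key is not an existing column.
demo = demo.rename(
    columns={"Close": "Closing_Price", "Low": "Lowest_Price"}, errors="raise"
)

# A key that does not exist now raises instead of being ignored.
typo_detected = False
try:
    demo.rename(columns={"Clsoe": "Closing_Price"}, errors="raise")
except KeyError:
    typo_detected = True

print(list(demo.columns))
print(typo_detected)
```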

### Saving the file to the PC  

In [133]:
df1.to_csv(r'C:\Users\abhi\Desktop\Domain wise Project\Financial Analysis\Apple data extracted from yahoo/Apple_Data.CSV', index = False)

,Date,Closing_Price,Heighest_Price,Lowest_Price,Opeaning Price,Treading_Volume
0,2020-01-02,72.716072,72.776598,71.466812,71.721019,135480400
1,2020-01-03,72.009117,72.771745,71.783962,71.941328,146322800
2,2020-01-06,72.582909,72.621646,70.876075,71.127866,118387200
3,2020-01-07,72.241554,72.849231,72.021238,72.592601,108872000
4,2020-01-08,73.403648,73.706279,71.943759,71.943759,132079200
...,...,...,...,...,...,...
1253,2024-12-24,257.916443,257.926411,255.009620,255.209412,23234700
1254,2024-12-26,258.735504,259.814335,257.347047,257.906429,27237100
1255,2024-12-27,255.309296,258.415896,252.782075,257.546826,42355300
1256,2024-12-30,251.923019,253.221595,250.474615,251.952985,35557500


In [8]:
file = pd.read_csv('Apple_Data.csv')

In [9]:
df = pd.DataFrame(file)

In [10]:
df.head()

,Date,Closing_Price,Heighest_Price,Lowest_Price,Opeaning Price,Treading_Volume
0,2020-01-02,72.716072,72.776598,71.466812,71.721019,135480400
1,2020-01-03,72.009117,72.771745,71.783962,71.941328,146322800
2,2020-01-06,72.582909,72.621646,70.876075,71.127866,118387200
3,2020-01-07,72.241554,72.849231,72.021238,72.592601,108872000
4,2020-01-08,73.403648,73.706279,71.943759,71.943759,132079200
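Two details of the save and reload steps are worth a short sketch: what `index=False` changes, and how to get `Date` back as a real datetime column, since a plain `read_csv` does not convert it. The example writes to in-memory buffers instead of the path used above, so it runs anywhere; `demo`, `clean`, and `extra` are illustrative names.

```python
import io

import pandas as pd

demo = pd.DataFrame(
    {"Date": ["2020-01-02", "2020-01-03"], "Closing_Price": [72.716072, 72.009117]}
)

# index=False: only the real columns are written.
without_index = io.StringIO()
demo.to_csv(without_index, index=False)

# Default (index=True): the 0, 1, ... row labels are written as an extra column.
with_index = io.StringIO()
demo.to_csv(with_index)

without_index.seek(0)
with_index.seek(0)

# parse_dates converts the Date column to datetime while reading.
clean = pd.read_csv(without_index, parse_dates=["Date"])
extra = pd.read_csv(with_index)

print(list(clean.columns))  # ['Date', 'Closing_Price']
print(list(extra.columns))  # ['Unnamed: 0', 'Date', 'Closing_Price']
print(clean["Date"].dtype)
```

Passing `parse_dates=["Date"]` to the `read_csv` call above should have the same effect on the saved Apple file.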
