<a href="https://www.kaggle.com/code/dattapadal/eda-stock-market-analysis-using-python?scriptVersionId=116166543" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:
!pip install yfinance

Collecting yfinance
  Downloading yfinance-0.2.3-py2.py3-none-any.whl (50 kB)
Collecting pytz>=2022.5
  Downloading pytz-2022.7-py2.py3-none-any.whl (499 kB)
Collecting multitasking>=0.0.7
  Downloading multitasking-0.0.11-py3-none-any.whl (8.5 kB)
Installing collected packages: pytz, multitasking, yfinance
  Attempting uninstall: pytz
    Found existing installation: pytz 2022.1
    Uninstalling pytz-2022.1:
      Successfully uninstalled pytz-2022.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
beatrix-jupyterlab 3.1.7 requires google-cloud-bigquery-storage, which is not installed.
pandas-profil

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import yfinance as yf
import datetime
from datetime import date,timedelta


In [3]:
end_date = date.today().strftime("%Y-%m-%d")
start_date = (date.today() - timedelta(days=365)).strftime("%Y-%m-%d")

# print(start_date,end_date )
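
The one-year window computed above can be wrapped in a small helper so that it can be checked against a fixed date. This is only a minimal sketch: `trailing_window` and its `today` parameter are hypothetical names introduced here for illustration and are not part of the notebook.

```python
from datetime import date, timedelta

def trailing_window(days=365, today=None):
    """Return (start, end) as 'YYYY-MM-DD' strings covering the last `days` days."""
    # `today` can be passed in only to make the result reproducible
    end = today or date.today()
    start = end - timedelta(days=days)
    return start.strftime("%Y-%m-%d"), end.strftime("%Y-%m-%d")

window_start_str, window_end_str = trailing_window(today=date(2023, 1, 12))
print(window_start_str, window_end_str)  # 2022-01-12 2023-01-12
```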

In [4]:
data = yf.download('GOOG', start=start_date, end=end_date, progress=False)
data.head()

Date,Open,High,Low,Close,Adj Close,Volume
2022-01-12,141.554504,142.814255,141.112,141.647995,141.647995,23642000
2022-01-13,141.8405,143.185501,138.914001,139.130997,139.130997,26566000
2022-01-14,137.5,141.2005,137.5,139.786499,139.786499,23826000
2022-01-18,136.600006,137.391495,135.617004,136.290497,136.290497,27382000
2022-01-19,136.938507,138.399506,135.5,135.651993,135.651993,20796000


In the dataframe above, 'Date' is used as the index. Let's reset the index and move 'Date' into a regular column.

In [5]:

data.insert(0,'Date',data.index)
data.reset_index(drop=True, inplace=True)
data

,Date,Open,High,Low,Close,Adj Close,Volume
0,2022-01-12,141.554504,142.814255,141.112000,141.647995,141.647995,23642000
1,2022-01-13,141.840500,143.185501,138.914001,139.130997,139.130997,26566000
2,2022-01-14,137.500000,141.200500,137.500000,139.786499,139.786499,23826000
3,2022-01-18,136.600006,137.391495,135.617004,136.290497,136.290497,27382000
4,2022-01-19,136.938507,138.399506,135.500000,135.651993,135.651993,20796000
...,...,...,...,...,...,...,...
246,2023-01-05,88.070000,88.209999,86.559998,86.769997,86.769997,23136100
247,2023-01-06,87.360001,88.470001,85.570000,88.160004,88.160004,26604400
248,2023-01-09,89.195000,90.830002,88.580002,88.800003,88.800003,22996700
249,2023-01-10,86.720001,89.474998,86.699997,89.239998,89.239998,22855600
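
The same result can usually be obtained with a single `reset_index()` call, which moves a named index into a regular column. Below is a minimal sketch on a hand-built two-row frame. The values are copied from the output above, and `sample` and `flat` are hypothetical names used only for illustration.

```python
import pandas as pd

# Two rows copied from the output above, indexed by date like the downloaded data
sample = pd.DataFrame(
    {"Open": [141.554504, 141.8405], "Close": [141.647995, 139.130997]},
    index=pd.DatetimeIndex(["2022-01-12", "2022-01-13"], name="Date"),
)

# reset_index() turns the 'Date' index into the first column
# and replaces the index with 0, 1, 2, ...
flat = sample.reset_index()
print(flat.columns.tolist())  # ['Date', 'Open', 'Close']
```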


A candlestick chart is a good starting point for analyzing a stock: it shows the open, high, low, and close prices for each trading day.

In [6]:
fig = go.Figure(data = [go.Candlestick(x=data["Date"],
                                       open = data["Open"],
                                       high = data["High"],
                                       low = data["Low"],
                                       close = data["Close"]
                                      )
                       ]
               )
fig.update_layout(title='Google Stock Price Analysis', xaxis_rangeslider_visible=False)
fig.show()
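
By default, Plotly draws a candle in green when the close is above the open and in red when it is below. The sketch below applies that comparison to the first three rows of the table shown earlier. `candle_direction` is a hypothetical helper written for this illustration, not a Plotly function.

```python
def candle_direction(open_price, close_price):
    """Label a candle by comparing its close with its open."""
    if close_price > open_price:
        return "up"
    if close_price < open_price:
        return "down"
    return "flat"

# (open, close) pairs copied from the first three rows of the data above
rows = {
    "2022-01-12": (141.554504, 141.647995),
    "2022-01-13": (141.8405, 139.130997),
    "2022-01-14": (137.5, 139.786499),
}
for day, (open_price, close_price) in rows.items():
    print(day, candle_direction(open_price, close_price))
```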

A bar plot is also a handy way to visualize a stock, especially over the long term. The plot below shows the closing prices of Google's stock as bars.

In [7]:
fig = px.bar(data, x="Date", y="Close")
fig.show()

Another valuable tool for stock market analysis is a range slider. It lets us examine the price between two specific points in time by selecting the period interactively.

Below is an example.

In [8]:
fig = px.line(data, x='Date', y='Close', title='Stock market analysis with Rangeslider')
fig.update_xaxes(rangeslider_visible=True)
fig.show()
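
The slider selects a window interactively. The same window can be taken in code by filtering on the 'Date' column. Below is a minimal sketch on a hand-built frame: the rows are copied from the output above, and the frame name and chosen dates are assumptions made for illustration.

```python
import pandas as pd

sample = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2022-01-12", "2022-01-13", "2022-01-14", "2022-01-18", "2022-01-19"]
    ),
    "Close": [141.647995, 139.130997, 139.786499, 136.290497, 135.651993],
})

# Keep rows whose date falls inside the window (both ends inclusive)
window = sample[
    sample["Date"].between(pd.Timestamp("2022-01-13"), pd.Timestamp("2022-01-18"))
]
print(len(window))  # 3
```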

Another interactive feature we can add is a time period selector. Time period selectors are buttons that restrict the graph to a specific period, for example one month, three months, six months, or a year.

Below is an example.

In [9]:
fig = px.line(data, x='Date', y='Close', title='Stock market analysis with Time Period Selections')
fig.update_xaxes(
    rangeselector = dict(
        buttons = list([
            dict(count=1, label='1m', step='month', stepmode='backward'),
            dict(count=3, label='3m', step='month', stepmode='backward'),
            dict(count=6, label='6m', step='month', stepmode='backward'),
            dict(count=1, label='1y', step='year', stepmode='backward'),
            dict(step='all')
        ])
    ))
fig.show()
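
With `stepmode='backward'`, each button moves the start of the visible range back by `count` steps. The sketch below computes the windows the four dated buttons would roughly correspond to, assuming the visible range ends at the last data point (2023-01-10 in the output above). `window_start` is a hypothetical helper, and calendar-month arithmetic via `pd.DateOffset` is an approximation of what Plotly does internally.

```python
import pandas as pd

def window_start(range_end, months):
    """Start of a window reaching `months` calendar months back from `range_end`."""
    return pd.Timestamp(range_end) - pd.DateOffset(months=months)

last_day = "2023-01-10"  # last date in the data above
# (label, months back) pairs mirroring the buttons defined above
for label, months in [("1m", 1), ("3m", 3), ("6m", 6), ("1y", 12)]:
    print(label, window_start(last_day, months).date(), "to", last_day)
```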

Markets are closed on weekends and holidays, so no prices are recorded on those days and the chart shows gaps. To hide the weekend gaps, add a range break to the x-axis.

In [10]:
fig = px.scatter(data, 
                 x='Date',
                 y='Close', 
                 range_x=[start_date, end_date],  # match the downloaded window
                 title='Stock Market Analysis by Hiding Weekend Gaps'
                )
# bounds run from the start of Saturday to the start of Monday, hiding both weekend days
fig.update_xaxes(rangebreaks=[dict(bounds=['sat','mon'])])
fig.show()
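
Weekend bounds do not cover market holidays. For example, the data above jumps from 2022-01-14 to 2022-01-18, skipping Monday 2022-01-17. Plotly's `rangebreaks` also accepts a `values` list of specific dates to hide. The sketch below derives such a list by comparing the trading dates with pandas business days. `missing_weekdays` is a hypothetical helper, and treating every missing weekday as a holiday is an assumption.

```python
import pandas as pd

def missing_weekdays(dates):
    """Return Mon-Fri dates between the first and last entry that have no row."""
    dates = pd.DatetimeIndex(dates)
    all_weekdays = pd.bdate_range(dates.min(), dates.max())
    return all_weekdays.difference(dates)

# Dates copied from the first five rows of the data above
trading_days = ["2022-01-12", "2022-01-13", "2022-01-14", "2022-01-18", "2022-01-19"]
holidays = missing_weekdays(trading_days).strftime("%Y-%m-%d").tolist()
print(holidays)  # ['2022-01-17']

# Plain dicts in the shape accepted by fig.update_xaxes(rangebreaks=...)
rangebreaks = [dict(bounds=["sat", "mon"]), dict(values=holidays)]
```

Passing this list as `fig.update_xaxes(rangebreaks=rangebreaks)` should hide both the weekends and the listed dates.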