**Yes Bank_Stock Market Analysis EDA Project**

# **Project Name**    - Exploring_Yes_Banks_Stock_Market_analysis



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Name**            - Deepak Kumar


# **Project Summary -**

**YES Bank Stock Performance Analysis Post-Fraud Case (FY 2018)**

**Overview 🌟**

YES Bank, a leading Indian bank operating nationwide since 2004 🏦, faced a major setback in FY 2018 due to a high-profile fraud case involving its co-founder, Rana Kapoor 💼. This scandal severely impacted the bank's stock price 📉, leading to a loss of investor confidence and significant market volatility.

In this project, I analyzed 184 monthly records 📊 of YES Bank's stock data to uncover insights and patterns. The dataset included key metrics such as opening price 🟢, closing price 🔴, highest price ⬆️, and lowest price ⬇️. Through exploratory data analysis (EDA) 🔍 and data visualization 📊, I aimed to understand the impact of the fraud case on stock performance, identify trends, and derive actionable insights for investors and stakeholders.

**Detailed Analysis 📑**

**1. Problem Statement 🎯**

**Objective:** Analyze YES Bank's stock performance post the FY 2018 fraud case to understand its impact on stock prices and investor sentiment.

**Key Questions:**

How did the stock price trend change after the scandal?

What were the key patterns in monthly stock performance?

How did the fraud case influence market volatility and investor behavior?

**2. Dataset Description 📂**

The dataset contains 184 monthly records of YES Bank's stock data, including:

Opening Price 🟢: The stock price at the start of the month.

Closing Price 🔴: The stock price at the end of the month.

Highest Price ⬆️: The maximum price reached during the month.

Lowest Price ⬇️: The minimum price reached during the month.

**3. Approach and Methodology 🔧**

**Step 1: Data Collection and Cleaning 🧹**

Collected historical stock price data covering 184 monthly records.

Cleaned the dataset to handle missing values, inconsistencies, and outliers.

**Step 2: Exploratory Data Analysis (EDA) 🔍**

Performed EDA to identify trends, correlations, and anomalies in the stock price movements.

Analyzed monthly trends to detect patterns in opening, closing, highest, and lowest prices.

**Step 3: Data Visualization 📊**

**Created visualizations using Matplotlib and Seaborn to make the data more understandable:**

**Line charts** to track stock price trends over time.

**Bar charts** to compare monthly high and low prices.

**Histograms** to understand the distribution of closing prices.

**Heatmaps** to highlight correlations between variables like Open, Close, High, and Low.

**Box plots** and **violin plots** to analyze price volatility.

**Scatter plots** to visualize relationships between opening and closing prices.

**Pie charts** to show the percentage of high and low prices.


**Step 4: Insight Generation 💡**

Derived actionable insights to understand the stock's performance before, during, and after the fraud case.

Identified key factors influencing stock price volatility and recovery patterns.

**4. Key Findings 📊**

**Sharp Decline Post-Scandal 📉**

The stock price experienced a significant drop immediately after the fraud case came to light, reflecting a loss of investor trust.

**Increased Volatility 🌪️**

Higher volatility was observed in the months following the scandal, with frequent fluctuations in opening and closing prices.

**Recovery Patterns 🔄**

Identified periods of partial recovery, indicating attempts by the market to stabilize the stock price.

**Monthly Trends 📅**

Certain months showed recurring patterns, such as price spikes, which could be linked to external market factors.

**5. Tools and Technologies Used 🛠️**

Python: For data processing and analysis.

Pandas & NumPy: For data cleaning and manipulation.

Matplotlib & Seaborn: For data visualization.

Jupyter Notebook: For interactive analysis and reporting.

**6. Impact and Value Addition 💡**

This analysis provided a clear understanding of how the fraud case impacted YES Bank's stock performance and highlighted key trends in its recovery journey. The insights can be valuable for:

**Investors:** To make informed decisions based on historical trends.

**Analysts:** To understand market behavior during crises.

**Stakeholders:** To assess the long-term impact of the scandal on the bank's financial health.

# **GitHub Link -**

https://github.com/Deepakkumar7774/Yes-Bank-EDA-Project

# **Problem Statement**


**Yes Bank, a prominent Indian bank, experienced a significant financial scandal in FY 2018, which heavily impacted its stock prices. Understanding the factors influencing stock behavior—such as trends, volatility, and correlations—has become critical for making informed investment decisions and devising strategic financial plans. 📈💼 This project aims to analyze historical stock price data, including variables like opening, closing, high, and low prices, to uncover insights and patterns that can support investors and stakeholders in navigating future market scenarios. 📊🔍**

#### **Define Your Business Objective?**

**The objective is to perform exploratory data analysis on the given data to uncover insights and patterns.**

1. Understanding the overall trends of price.
2. Identifying any significant change in the stock price.
3. Analyzing any volatility of the stock price.
4. And finding an overview and effects on the stock over a period of time.
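One way to make objective 3 concrete is to measure volatility as the rolling standard deviation of period-over-period percentage changes. The sketch below is an illustration, not part of the original notebook: the function name `rolling_volatility`, the window length, and the toy prices are all assumptions.

```python
import pandas as pd

def rolling_volatility(close, window=6):
    """Rolling standard deviation of period-over-period returns."""
    returns = close.pct_change()          # first value is NaN by construction
    return returns.rolling(window=window).std()

# Made-up closing prices that alternate +10% and -10%.
toy_close = pd.Series([100.0, 110.0, 99.0, 108.9, 98.01])
vol = rolling_volatility(toy_close, window=2)
print(vol)
```

On the real data the same function could be called as `rolling_volatility(df['Close'])`, with the window chosen to suit the question being asked.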

# ***Let's Begin !***

### Import Libraries

In [None]:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np


### Dataset Loading

In [None]:
import pandas as pd
file_path = ("/content/data_YesBank_StockPrices.csv")
df = pd.read_csv(file_path)
print(df.head())
print(df.tail())
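
Since the later charts depend on chronological order, it can help to parse and sort the dates right after loading. The following is a minimal sketch under stated assumptions: the helper `load_prices` is hypothetical, the rows are made up, and the `Mon-YY` date layout is assumed from the `format='%b-%y'` used later in this notebook.

```python
import io
import pandas as pd

# Made-up rows in the assumed layout of the CSV (Date like "Jul-05").
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close\n"
    "Sep-05,13.48,14.87,12.27,13.30\n"
    "Jul-05,13.00,14.00,11.25,12.46\n"
    "Aug-05,12.58,14.88,12.55,13.42\n"
)

def load_prices(source):
    """Read the price table, parse 'Mon-YY' dates, and sort chronologically."""
    frame = pd.read_csv(source)
    frame["Date"] = pd.to_datetime(frame["Date"], format="%b-%y")
    return frame.sort_values("Date").reset_index(drop=True)

prices = load_prices(sample_csv)
print(prices)
```

`load_prices` accepts a file path as well as a file-like object, because `pd.read_csv` handles both.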

### Dataset First View

In [None]:
print(df.info())


### Dataset Rows & Columns count

In [None]:
print("The total count of rows & columns:", df.shape)


#### Duplicate Values

In [None]:

print(df.duplicated().sum())

#### Missing Values/Null Values

In [None]:
print(df.isnull().sum())

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(10,6))
sns.heatmap(df.isnull(), cbar=False, cmap='viridis', yticklabels=False)
plt.title('Heatmap of missing values')
plt.show()

The given data contains the high, low, open, and close prices. As there are no missing values, we will proceed with further analysis.

## ***2. Understanding Your Variables***

In [None]:
print(df.columns)

In [None]:
print(df.describe())  # call the method; without parentheses only the method object is printed

### Variables Description

**The dataset has 184 monthly records**

1. Date = Month and year of the stock data. Datetime - Used to analyze trends over time and perform time-series analysis.
2. Open = Opening price of the stock for the month. Numerical, INR (₹) - Helps identify how the stock starts trading each month.
3. High = Highest price of the stock for the month. Numerical, INR (₹) - Indicates the stock's peak performance for the month.
4. Low = Lowest price of the stock for the month. Numerical, INR (₹) - Highlights the minimum price level, useful for understanding volatility.
5. Close = Closing price of the stock for the month. Numerical, INR (₹) - Shows the stock's final value for the month, essential for trend analysis.
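These definitions imply a simple consistency rule: in every row, High should be at least as large as both Open and Close, and Low should be no larger than either. A hedged sketch of such a check follows; `find_inconsistent_rows` is a hypothetical helper and the toy rows are invented, with one deliberately broken.

```python
import pandas as pd

def find_inconsistent_rows(frame):
    """Return rows where High/Low do not bracket Open and Close."""
    high_ok = frame["High"] >= frame[["Open", "Close"]].max(axis=1)
    low_ok = frame["Low"] <= frame[["Open", "Close"]].min(axis=1)
    return frame[~(high_ok & low_ok)]

# Made-up data; the third row has High below Open, which is invalid.
toy = pd.DataFrame({
    "Open": [10.0, 12.0, 15.0],
    "High": [12.5, 13.0, 14.0],
    "Low": [9.5, 11.0, 13.5],
    "Close": [12.0, 12.5, 13.8],
})
bad_rows = find_inconsistent_rows(toy)
print(bad_rows)
```

An empty result on the real `df` would suggest the four price columns are internally consistent.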

**Check Unique Values for each variable.**

In [None]:

for column in df.columns:
    print(f"Column: {column}")
    print(f"Unique Values: {df[column].unique()[:10]}")
    print(f"Total Unique Values: {df[column].nunique()}")
    print("-" * 40)


## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1 Line Chart: Stock Price Over Time

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Ensure 'Date' column is in datetime format
df['Date'] = pd.to_datetime(df['Date'], format='%b-%y')

# Plot Closing Prices Over Time
plt.figure(figsize=(12, 6))
plt.plot(df['Date'], df['Close'], label='Closing Price', color='blue')
plt.title('Yes Bank Stock Closing Price Over Time', fontsize=16)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Closing Price (₹)', fontsize=12)
plt.legend()
plt.grid()  # Add grid for better readability
plt.tight_layout()  # Adjust layout
plt.show()

* A line chart shows how the price moves over time and makes sharp changes easy to spot, which makes it well suited to time-series analysis.

* Here the line chart shows the monthly closing price, which can be used to read off recent and all-time lows; the price fell below ₹25 by 2020.

* Business Impact:
   1. Stakeholders can use this chart to gauge how the stock has performed, which can help them decide on zones in which to trade and invest (i.e., support and resistance zones).
   2. The chart shows the long-term financial impact of the scandal, reflecting a loss of investor confidence and trust.
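The sharp changes visible on a line chart can also be expressed as numbers. The sketch below, which is illustrative and not taken from the notebook, computes the period-over-period percentage change and the drawdown from the running peak; `drawdown_from_peak` and the toy series are assumptions.

```python
import pandas as pd

def drawdown_from_peak(close):
    """Percentage decline of each value from the running maximum so far."""
    running_peak = close.cummax()
    return (close - running_peak) / running_peak * 100

# Made-up closing prices: a rise, a fall, and a partial recovery.
toy_close = pd.Series([100.0, 120.0, 90.0, 60.0, 75.0])
monthly_change = toy_close.pct_change() * 100
drawdown = drawdown_from_peak(toy_close)
print(drawdown.round(1).tolist())
```

Applied to `df['Close']`, the minimum of the drawdown series would give the deepest fall from any earlier peak.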

#### Chart - 2 Bar chart: Monthly Stock Prices

In [None]:
plt.figure(figsize=(12,6))
plt.bar(df['Date'], df['High'], color='purple')
plt.title('Monthly High Price')
plt.xlabel('Date')
plt.ylabel('High Price')
plt.show()

* The bar chart shows the monthly high of the stock over the period 2006-2020. It represents the fluctuations in the highest stock prices, making it easier to spot changes or patterns and to compare months.

* The monthly highs show robust performance around 2016-2018, followed by a decline after the scandal came out in 2018. The price dropped significantly, the stock failed to recover, and it continued to make lower lows.

* Business impact:

  1. Investment Timing: Investors can identify resistance at the higher highs and use it to determine entry or exit points.
  2. Risk Assessment: The chart shows that the stock fell significantly after 2018, and investing in such a stock can lose capital, so it is better to wait for the stock to settle before investing.
  3. Marketing strategy/Historical Insights: Stakeholders can use this chart to understand the episode and build a strategy for similar market scenarios.
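With one bar per month the chart can get crowded, so a yearly summary of the highs may be easier to compare. This is a small sketch with invented numbers; the grouping by calendar year is an assumption about what a reader might want, not something the notebook does.

```python
import pandas as pd

# Made-up monthly highs spanning three calendar years.
toy = pd.DataFrame({
    "Date": pd.to_datetime(
        ["Jan-17", "Jun-17", "Feb-18", "Sep-18", "Mar-19"], format="%b-%y"
    ),
    "High": [300.0, 330.0, 370.0, 350.0, 280.0],
})

# Highest monthly high within each calendar year.
yearly_high = toy.groupby(toy["Date"].dt.year)["High"].max()
peak_year = yearly_high.idxmax()
print(yearly_high)
```

The resulting series can be passed to `plt.bar(yearly_high.index, yearly_high.values)` to draw one bar per year.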

#### Chart - 3 Histogram: Distribution of closing price

In [None]:
plt.figure(figsize=(10,6))
plt.hist(df['Close'], bins=20, color='red', edgecolor='black')
plt.title('Distribution of closing price')
plt.xlabel('closing price')
plt.ylabel('Frequency')
plt.show()

* The Histogram has been used to display the frequency distribution of closing prices over the given dataset. This type of chart is ideal for understanding how stock prices are distributed within specific ranges.

*  The majority of closing prices fall within the ₹0-₹50 range, highlighting a concentration at lower price levels. As the closing price increases, the frequency decreases significantly, with very few observations above ₹200. This indicates that Yes Bank's stock mostly traded at lower levels during the analyzed period, likely reflecting the impact of financial challenges.

* Because the histogram shows which price ranges the stock closed in most often, it can help investors and the company judge where a support level may lie and from where the price might bounce back.
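The share of observations in each price range can be tabulated directly, which puts numbers behind what the histogram shows. The bin edges, labels, and prices below are made up for illustration.

```python
import pandas as pd

# Made-up closing prices and hypothetical price ranges.
toy_close = pd.Series([12.0, 30.0, 45.0, 48.0, 80.0, 150.0, 210.0, 320.0])
bins = [0, 50, 100, 200, 400]
labels = ["0-50", "50-100", "100-200", "200-400"]

# Assign each price to a range (right edge inclusive), then compute percentages.
bucket = pd.cut(toy_close, bins=bins, labels=labels)
share = bucket.value_counts(normalize=True).sort_index() * 100
print(share)
```

Running the same steps on `df['Close']` would give the percentage of months in each range.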

#### Chart - 4 Box plot: Price Volatility

In [None]:
plt.figure(figsize=(10,6))
sns.boxplot(data=df[['Open', 'High', 'Low', 'Close']])
plt.title('Stock Price Volatility')
plt.ylabel('Price (₹)')  # the box plot's y-axis shows price, not frequency
plt.show()

*  The Box Plot was selected because it provides a clear and concise summary of the distribution and variability (volatility) of stock prices for the four categories: Open, High, Low, and Close prices. It visually highlights the central tendency, spread, and outliers, which are critical for understanding the behavior of stock prices over time. Box plots are especially useful for comparing multiple categories side by side.

*  The chart displays the spread of each price series (Open, High, Low, Close) through its quartiles. The median line within each box marks the central value, showing the typical price for each category. The whiskers extend to the data points within 1.5 times the interquartile range, giving a view of the range of prices.

* Business Impact:-
   1. Risk Management: By analyzing volatility, businesses and investors can identify periods of heightened risk and strategize accordingly.
   2. Investment Strategy: Outliers and range help investors decide on entry and exit points for their trades, optimizing profitability.
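The 1.5 × IQR rule behind the whiskers can be reproduced by hand. The sketch below is an illustration under stated assumptions: `iqr_outliers` is a hypothetical helper, the data is invented, and quartiles use the default linear interpolation of `Series.quantile`.

```python
import pandas as pd

def iqr_outliers(values):
    """Return points outside the 1.5 * IQR fences, plus the fences."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = values[(values < lower) | (values > upper)]
    return outliers, (lower, upper)

# Made-up prices with one value far above the rest.
toy_close = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 100.0])
outliers, fences = iqr_outliers(toy_close)
print(outliers.tolist(), fences)
```

Points beyond the fences are the ones a box plot draws individually; the whiskers stop at the most extreme data points that are still inside the fences.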

#### Chart - 5 Scatter plot: Closing Price Over Time

In [None]:
# Chart 5: Scatter plot of closing price over time
plt.figure(figsize=(10,6))
plt.scatter(df['Date'], df['Close'])
plt.title('Stock Prices Scatter Plot')
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()

* The Scatter Plot has been used because it is ideal for displaying the individual data points over time, allowing a clear visualization of the fluctuations and trends in Yes Bank's stock prices.

* The chart shows the trend and fluctuations in Yes Bank stock prices from 2006 to 2020.
It highlights a gradual rise in stock prices from 2006 until around 2014, followed by a sharp increase peaking in 2018. After 2018, there is a steep decline, reflecting the impact of the financial fraud case and loss of investor confidence.

* The scatter plot gives insight into the historical performance of Yes Bank's stock, helping stakeholders understand the trends and critical events affecting stock prices. Businesses can use such data when planning strategies around the stock and its derivatives.
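The project summary lists a scatter plot of opening against closing prices. A minimal sketch of such a plot could look like the following; the function name `plot_open_vs_close`, the dashed reference line, and the toy values are assumptions added for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_open_vs_close(frame):
    """Scatter Open against Close with a dashed Close = Open reference line."""
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(frame["Open"], frame["Close"], alpha=0.7)
    lo = min(frame["Open"].min(), frame["Close"].min())
    hi = max(frame["Open"].max(), frame["Close"].max())
    ax.plot([lo, hi], [lo, hi], linestyle="--", color="gray", label="Close = Open")
    ax.set_xlabel("Opening Price (₹)")
    ax.set_ylabel("Closing Price (₹)")
    ax.set_title("Opening vs. Closing Price")
    ax.legend()
    return ax

# Made-up prices; with the real data, pass df instead.
toy = pd.DataFrame({"Open": [10.0, 50.0, 200.0], "Close": [12.0, 45.0, 180.0]})
ax = plot_open_vs_close(toy)
```

Points above the dashed line are periods that closed higher than they opened; points below it closed lower. Call `plt.show()` afterwards to display the figure.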

#### Chart - 6 Pie Chart:percentage of high and low price

In [None]:
mean_high = df['High'].mean()
mean_low = df['Low'].mean()
labels = ['high', 'Low']
sizes = [mean_high, mean_low]

plt.figure(figsize=(10,6))
plt.pie(sizes, labels=labels, autopct = '%1.1f%%', colors =['gold', 'lightblue'])
plt.title('High vs low prices (percentage)')
plt.show()


* The Pie Chart has been used because it gives a simple visual comparison of the average high price and the average low price in the dataset.

* The chart shows shares of 55.0% for the mean high price and 45.0% for the mean low price.
Each share is that mean divided by the sum of the two means, so it reflects the average gap between monthly highs and lows, not how often high or low prices occurred.

* Because a month's high is by definition at least as large as its low, the high share is always at least 50%. The chart therefore summarizes the average monthly trading range and does not by itself indicate the stock's trend before or after the scam.
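To see where percentages like these come from: `plt.pie` normalizes its inputs, so each wedge is one value divided by the total. The sketch below reproduces that arithmetic; `pie_shares` is a hypothetical helper and the two input means are made up.

```python
def pie_shares(values):
    """Percentage of the total that each value represents, rounded to 0.1."""
    total = sum(values)
    return [round(v / total * 100, 1) for v in values]

# Made-up mean High and mean Low, chosen only to illustrate the arithmetic.
shares = pie_shares([110.0, 90.0])
print(shares)
```

The wedges therefore compare the sizes of the two averages; they do not count how many months had high or low prices.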

#### Chart - 7 Rolling mean chart: Smoothed closing Price

In [None]:
df['SMA_3'] = df['Close'].rolling(window=3).mean()  # 3-month rolling average
plt.figure(figsize=(12, 6))
plt.plot(df['Date'], df['Close'], label='Closing Price', alpha=0.6)
plt.plot(df['Date'], df['SMA_3'], label='3-Month SMA', color='red')
plt.title('Closing Prices with 3-Month Rolling Mean')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.legend()
plt.show()


* This Line Chart with a 3-Month Rolling Mean (SMA) has been used to track and smooth the fluctuations in Yes Bank’s closing stock prices over time (2005–2020). The actual closing prices (blue line) represent raw data, while the rolling mean (red line) highlights the underlying trend by reducing short-term noise. This makes it easier to spot consistent trends, peaks, and dips, which are critical for long-term analysis.

* The blue line represents actual closing stock prices, showcasing significant fluctuations. The red line smooths these prices using a 3-month rolling average, revealing a clearer picture of price trends.

* Business Impact:
  1. Long-Term Trend Analysis: The rolling mean provides insights into the stock’s broader performance trends, helping businesses and investors plan long-term strategies.

  2. Noise Reduction for Clarity: By smoothing out month-to-month volatility, the rolling mean offers a steadier basis for decision-making.

  3. Market Behavior Insights: The chart highlights the significant price drop post-2018, emphasizing the lasting impact of external events like the scandal.

  4. Investment Strategies: This visualization helps identify support levels and patterns, enabling investors to make informed decisions on buying, selling, or holding stocks.
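A toy series makes the behavior of the 3-period rolling mean easy to verify by hand. The values are invented; the second variant with `min_periods=1` is an optional alternative, not what the notebook uses.

```python
import pandas as pd

# Made-up closing prices.
toy_close = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# Default: the first two entries are NaN because a full window of 3 is required.
sma_3 = toy_close.rolling(window=3).mean()

# Alternative: average over however many values are available so far.
sma_3_partial = toy_close.rolling(window=3, min_periods=1).mean()

print(sma_3.tolist())
print(sma_3_partial.tolist())
```

With the default settings the red line in the chart starts two periods later than the blue one, which is why `SMA_3` has missing values at the start.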

#### Chart - 8 Violin Plot: Distribution of Prices

In [None]:
plt.figure(figsize =(8,6))
sns.violinplot(data=df[['Open', 'High', 'Low', 'Close']])
plt.title('violin plot of stock price')
plt.show()

* The violin plot has been used because it combines a box plot with a kernel density estimate, so it shows both summary statistics (median and quartiles) and the full shape of the distribution for Open, High, Low, and Close.

* 1. The width of each violin shows how often prices fall in a given range: wider sections mark price levels where the stock spent more months.
  2. Placing the four violins side by side allows the shapes of the Open, High, Low, and Close distributions to be compared directly.

* Business Impact:-
   1. Risk Management: The spread and tails of each violin indicate how widely prices have ranged, which helps stakeholders gauge volatility.
   2. Price Range Identification: The widest sections point to the price ranges in which the stock most commonly traded.

#### Chart - 9 Density Plot: Distribution of Closing Price

In [None]:
plt.figure(figsize=(8, 6))
sns.kdeplot(df['Close'], fill=True, color='navy')  # 'fill' replaces the deprecated 'shade' argument
plt.title('Closing Price Density Plot')
plt.xlabel('Closing Price')
plt.ylabel('Density')
plt.show()

* This chart is excellent for visualizing the distribution of closing prices over the trading sessions. A density plot helps identify patterns, clusters, and where most of the data is concentrated, without the granularity of individual points like a scatterplot or the rigidity of a histogram.

* The plot shows that most of the closing prices are clustered around ₹50, as indicated by the peak in density in that range. The density gradually declines as prices move away from this peak, with a smaller secondary bump around ₹300, indicating some trading activity there as well.

* The chart shows where the closing prices are concentrated. The high-density regions represent the most common price ranges, helping investors see the price points at which the stock has usually traded. This is useful for assessing the stock's market demand and stability.

#### Chart - 10 Stacked Bar Chart: Comparing High, Low, Open, and Close

In [None]:
import numpy as np

ind = np.arange(len(df))  # X-axis indices
plt.figure(figsize=(12, 6))
# Each layer starts where the previous layers end, so 'bottom' is the cumulative sum
plt.bar(ind, df['Open'], label='Open', color='orange')
plt.bar(ind, df['Close'], bottom=df['Open'], label='Close', color='blue')
plt.bar(ind, df['High'], bottom=df['Open'] + df['Close'], label='High', color='green')
plt.bar(ind, df['Low'], bottom=df['Open'] + df['Close'] + df['High'], label='Low', color='red')
plt.title('Stacked Bar Chart of Prices')
plt.xlabel('Month index')
plt.ylabel('Price (₹)')
plt.legend()
plt.tight_layout()
plt.show()


* Using a stacked format provides a cumulative view of these price types for each month, making it easier to observe the interplay and relative differences.

* The chart visually represents how the prices (Open, Close, High, and Low) are distributed for each month over the specified period. For instance, it shows:
   1. Month-to-month price fluctuations.
   2. Comparisons between the highest and lowest prices in any given month.
   3. Trends and anomalies within the range of prices.

* Business Impact:-
   1. Market Dynamics Insight: By examining the interplay of these prices, one can better understand market sentiment and trading activity.
   2. Trend Analysis: Businesses and investors can identify recurring patterns over time, such as regular peaks in High prices or consistency in Closing prices.

#### Chart - 11 - Correlation Heatmap

In [None]:
plt.figure(figsize=(8, 6))
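# Restrict to the price columns so the Date and SMA_3 columns are not included
sns.heatmap(df[['Open', 'High', 'Low', 'Close']].corr(), annot=True, cmap='coolwarm')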
plt.title('Correlation Heatmap')
plt.show()


##### 1. Why did you pick the specific chart?

A correlation heatmap is a tool for visualizing relationships between numerical variables in the dataset. It helps identify patterns or dependencies among the stock price variables, which are crucial for financial analysis and prediction.

##### 2. What is/are the insight(s) found from the chart?

The open, high, low, and close prices exhibit a very high positive correlation with one another (close to 1): they tend to move together, meaning that when one price increases, the others generally do as well.
These insights highlight the strong interdependence of stock price variables, which is expected in financial data. It also suggests that these factors can collectively provide valuable inputs for predicting trends or patterns in stock prices.

#### Chart - 12 - Pair Plot

In [None]:
sns.pairplot(df[['Open', 'High', 'Low', 'Close']])
plt.show()


##### 1. Why did you pick the specific chart?

A pair plot helps visualize both the distribution of individual variables and the relationships between them. It shows how open, close, low, and high correlate with each other: the scatter plots highlight patterns and dependencies, while the histograms on the diagonal give insight into the distribution of each variable.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that the variables are all strongly interdependent. Stock prices often cluster at lower ranges, and the consistency across the panels can help in predicting prices.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?


**The analysis of Yes Bank stock reveals significant price drops that are relevant to the business objective.**

  * Trend Analysis: The line chart highlights that before the Kapoor scandal came out, that is, before 2018, the stock was performing very well and making higher highs. Studying the trend change and price drop that followed can help businesses prepare strategies for similar market scenarios.
  * Volatility Analysis: Box plots and violin plots reveal periods of high volatility, helping stakeholders identify patterns and adjust risk management strategies.
  * Behavioural Pattern: Correlation heatmaps and pair plots emphasize the strong interdependence of variables like open, close, high, and low price, which can be leveraged for predictive modeling and decision making.
  * Support and Resistance: Histograms and scatter plots show recurring price ranges, helping to identify possible support and resistance levels for the stock price, which can inform critical investment decisions aimed at higher profits.
  * Market Recovery Potential: Rolling average and density plots highlight the possibility of market recovery, providing investors and stakeholders with a roadmap for potential buyback or sell-off strategies.

# **Conclusion**

The exploratory data analysis provides a comprehensive understanding of Yes Bank's stock performance before and after the scandal. The findings indicate a severe decline in stock prices following the 2018 scandal, along with notable periods of volatility. Despite the challenges, the data also shows potential recovery trends and strong interrelations among stock price variables. 📉📈

These insights can serve as a foundation for data-driven decision-making. Whether it's for investment strategies, risk mitigation, or forecasting, businesses can confidently use this analysis to navigate Yes Bank's market trends and make informed decisions going forward. 📊🔍

**Hurrah! I have successfully completed your EDA Capstone Project !!!**