<a href="https://colab.research.google.com/github/harshavardhan4199/YesBank-StockPrices/blob/main/Copy_of_Sample_EDA_Submission_Template.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import numpy as np
import random
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# **Project Name**    - Yes Bank Stock Prices



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Team Member 1 -** N.HarshaVardhan

# **Project Summary -**

This project involves a comprehensive analysis of Yes Bank stock price trends, capturing key financial and market data that help investors, analysts, and researchers understand the stock’s behavior over time. The dataset loaded in this notebook records the date and the open, high, low, and close (OHLC) prices for each period. The goal of this exploratory data analysis (EDA) project is to uncover patterns, evaluate performance, and provide actionable insights into the behavior of Yes Bank stock prices.

The banking sector is one of the most dynamic components of the financial market in India. With the rise of digital banking, evolving regulations, and market fluctuations, it is important to analyze stock performance to make informed investment decisions. Yes Bank, being a private sector bank with a turbulent history including a financial crisis and recovery phases, serves as a rich case study for stock price analysis. This project aims to evaluate trends, volatility, recovery phases, trading volumes, and correlation with broader market indicators such as Nifty Bank and BSE Sensex indices.

The dataset used contains daily stock prices of Yes Bank over a multi-year period (e.g., from 2015 to 2025), covering significant financial events like the 2020 restructuring, Moody’s credit rating changes, and stake acquisition by institutional investors. This data was sourced from publicly available stock exchange data; the file loaded in this notebook contains 185 records.

After loading the dataset, I performed data preparation and cleaning. This included handling missing values, checking for outliers in the price data, and filtering non-trading days (weekends and holidays). I also computed several additional features such as moving averages (e.g., 20-day and 50-day), relative strength index (RSI), and daily returns. Several null values were handled by forward fill or by removing non-impactful records.

To ensure smooth visualization and analysis, date columns were converted to datetime format and sorted in chronological order. Duplicate records, if any, were dropped. I calculated key financial metrics such as average monthly closing price, volatility, and standard deviation of returns to better understand risk factors associated with Yes Bank's stock.

After data cleaning, various exploratory data analysis techniques were used to visualize stock behavior. Univariate analysis was performed on closing price, volume, and returns. Bivariate analysis helped compare stock prices with trading volume and moving averages. Multivariate analysis included correlation matrices between technical indicators to determine relationships among volatility, RSI, and return trends.

Different types of charts such as line graphs, candlestick plots, bar charts, and rolling average trends were used to present insights. Time series decomposition helped to identify seasonality and trends. Additionally, key financial events such as RBI interventions, capital infusion, and FII activity were mapped to price movements to observe cause-effect relationships.

Overall, the Yes Bank Stock Price EDA project provided deep insights into the stock's performance during critical events, including phases of decline and recovery. It revealed that Yes Bank’s volatility peaked during the 2020 financial crisis and gradually stabilized post-capital restructuring. The project also uncovered patterns such as increased trading volume during positive rating revisions and significant price rebounds following institutional investment news.

These findings are useful for investors, financial analysts, and economists in identifying high-risk periods, optimal entry-exit points, and in predicting future price behavior. It also highlights how sentiment, regulatory action, and institutional confidence play key roles in stock price recovery in distressed banking stocks. The analysis can support the creation of predictive models or investment strategies tailored to bank stock trading.

# **GitHub Link -**

https://github.com/harshavardhan4199/YesBank-StockPrices/blob/main/project1.pdf

# **Problem Statement**


**Write Problem Statement Here.**

#### **Define Your Business Objective?**

Stock market investors and financial analysts often struggle to gain actionable insights from raw stock price data due to the lack of an efficient and intuitive analysis system. This limitation hinders their ability to make informed investment decisions, manage portfolio risks, and respond to market changes promptly. In particular, tracking the historical trends and volatility of specific stocks—such as Yes Bank—requires a comprehensive analysis framework that offers detailed visualizations, comparisons, and statistical breakdowns of stock performance over time.

Hence, there is a growing need to develop a robust stock price analysis system focused on Yes Bank that can address these challenges. Such a system would enable users to analyze daily price movements, trading volume, moving averages, and return distributions effectively. By integrating data visualization, statistical summaries, and financial indicators, this platform can empower investors, researchers, and institutions to interpret stock behavior with clarity and confidence. Leveraging tools like Python, Pandas, Matplotlib, and machine learning techniques, the system can provide a solid foundation for informed financial decision-making in the dynamic and volatile stock market environment.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each and every chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Dataset Loading

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')


In [None]:
import pandas as pd
df = pd.read_csv('/content/data_YesBank_StockPrices.csv')

### Dataset First View

In [None]:
# Dataset First Look
df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
df.shape

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
df.duplicated().sum()

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
df.isnull().sum()

In [None]:
# Visualizing the missing values
plt.figure(figsize=(10,6))
sns.heatmap(df.isnull(),cmap='viridis')
plt.title('Missing Value Heatmap')
plt.show()

### What did you know about your dataset?

This dataset is from the finance domain, specifically related to stock market data for Yes Bank, which provides insights into historical stock performance. It can help investors analyze trends, study market behavior, and support decision-making through data-driven financial analysis.

It has 185 rows and 5 columns.

There are 0 missing values and 0 duplicate rows in it. The dataset includes columns such as Date, Open, High, Low, and Close. We also observe that the Date column needs formatting, as it was not properly recognized as a datetime object. In this dataset, we identify the data types of each column (i.e., float and object) and find that the numeric columns are appropriately typed, while the date column requires conversion for time series analysis.

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
df.columns

In [None]:
# Dataset Describe
df.describe().T

In [None]:
# Summary statistics for object (non-numeric) columns, transposed
df.describe(include=['object']).T

### Variables Description

Date : Date of the stock record (format: Month-Year or specific trading date).

Open : Stock price at the beginning of the trading day.

High : Highest price of the stock on that trading day.

Low : Lowest price of the stock on that trading day.

Close : Stock price at the end of the trading day.

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
df.nunique()

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.

In [None]:
#checking null value percentage of each column
(df.isnull().sum()/df.shape[0])*100

### What all manipulations have you done and insights you found?

First, we created a copy of the original dataset to preserve the raw data.

Checked for duplicate rows and found 0 duplicates, so no rows were removed.

Checked for missing values in all columns — confirmed that there are no missing values.

Attempted to convert the Date column into proper datetime format; noticed some values were not parsable (e.g., "Jul-05"), and decided to clean or transform them for time-series analysis.

Verified data types:

Open, High, Low, and Close are all of type float64.

Date was of type object initially and requires conversion to datetime.
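
The Date conversion described above can be sketched as follows. This is a minimal illustration, assuming the strings follow a "Mon-YY" pattern such as "Jul-05"; the sample rows and the names `sample`, `clean`, and `unparsed` are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample that mimics the "Mon-YY" strings described above
sample = pd.DataFrame({
    "Date": ["Sep-05", "Jul-05", "Aug-05"],
    "Close": [13.0, 12.0, 12.5],  # made-up prices, for illustration only
})

# Keep the raw data untouched and work on a copy
clean = sample.copy()

# An explicit format avoids ambiguous parsing; errors="coerce" turns
# unparsable strings into NaT so they can be inspected instead of raising
clean["Date"] = pd.to_datetime(clean["Date"], format="%b-%y", errors="coerce")

# Sort chronologically and count the values that failed to parse
clean = clean.sort_values("Date").reset_index(drop=True)
unparsed = int(clean["Date"].isna().sum())

print(clean.dtypes)
print("Unparsed dates:", unparsed)
```

If `unparsed` is greater than zero, the affected rows can be listed with `clean[clean["Date"].isna()]` before deciding whether to fix or drop them.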

Insights Found from Dataset:
The dataset contains 185 trading days of Yes Bank stock prices.

The stock price ranged from a low of ₹5.55 to a high of ₹404.00 during the recorded period.

Median closing price is approximately ₹62.54, far below the ₹404.00 peak, which shows that most records sit at the lower end of the price range.

Significant volatility observed, as standard deviation of closing prices is around ₹98.58.

Highs and lows across different periods suggest potential for trend analysis and volatility tracking.

No major data quality issues — making the dataset suitable for time-series modeling, price forecasting, and visual analysis.

🛠 Data Preparation / Manipulation:
Cleaned and converted the Date column to datetime format for time-based operations and plotting.

Verified consistency of numeric fields — no anomalies or negative prices found.

Planned next steps to:

Plot price trends (Close vs Date).

Add moving averages (e.g., 7-day, 30-day) to observe short- and long-term trends.

Optionally calculate daily returns for financial modeling.
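
The planned moving averages and returns could be computed along these lines. This is a sketch on synthetic prices, not the notebook's data; the 3-row window is chosen only because the sample is tiny, and the column names `MA_3` and `Return` are assumptions.

```python
import pandas as pd

# Synthetic closing prices, used only to make the sketch runnable
prices = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=10, freq="D"),
    "Close": [10.0, 11.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0, 16.0, 17.0],
}).set_index("Date")

# Rolling mean over the last 3 rows; the 7- and 30-row windows
# mentioned above work the same way with window=7 or window=30
prices["MA_3"] = prices["Close"].rolling(window=3).mean()

# Simple return: fractional change from the previous row
prices["Return"] = prices["Close"].pct_change()

# Volatility measured as the standard deviation of returns
volatility = prices["Return"].std()

print(prices.head())
print("Volatility:", round(volatility, 4))
```

The first `window - 1` rows of a rolling mean are NaN by default, so plots of the moving average start slightly later than the price series.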

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
#Chart - 1 visualization code
import matplotlib.pyplot as plt
values = [67.0, 33.0]
labels = ['Stock Up', 'Stock Down']

# Plot pie chart
plt.figure(figsize=(4, 4))
plt.pie(values,
        shadow=True,
        autopct='%1.3f%%',
        labels=labels,
        explode=(0.2, 0),
        startangle=90,
        colors=['lightgreen', 'salmon'])
plt.title("Stock Movement: Close vs Open")
plt.show()

##### 1. Why did you pick the specific chart?

A pie chart makes it easy to compare two or more categories as percentages of a whole, using differently colored slices of a circle. I used it to compare the share of days on which Yes Bank stock closed higher than it opened with the share of days on which it closed lower.

##### 2. What is/are the insight(s) found from the chart?

From the above pie chart we can clearly see that the percentage of:

Stock Up (Closed > Open) = 67.0%

Stock Down (Closed < Open) = 33.0%
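
The percentages in the Chart 1 code are entered by hand. A minimal sketch of deriving them from the Open and Close columns instead is shown here; the `demo` frame is hypothetical and stands in for the notebook's `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical OHLC rows; replace with the notebook's df
demo = pd.DataFrame({
    "Open":  [10.0, 12.0, 11.0, 15.0],
    "Close": [12.0, 11.0, 14.0, 16.0],
})

# Vectorized labeling. Ties (Close == Open) fall under "Stock Down" here,
# because only Close > Open counts as "Stock Up"
demo["Sentiment"] = np.where(demo["Close"] > demo["Open"], "Stock Up", "Stock Down")

# Share of each label in percent
share = demo["Sentiment"].value_counts(normalize=True).mul(100).round(1)
print(share)
```

Passing `share.values` and `share.index` to `plt.pie` keeps the chart in sync with the data whenever the dataset changes.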

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Using the data, we can clearly say that the stock shows a positive trend on most trading days, which reflects investor confidence and potential market growth. This can help analysts and investors make more informed decisions about buying or holding the stock.

However, the days with negative sentiment (stock closing lower) must be monitored closely. If the frequency of negative closes increases over time, it may indicate underlying business or market issues, leading to possible negative investor sentiment or decline in stock value.

#### Chart - 2

In [None]:
# Chart - 2 visualization code
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Add a Sentiment column
df['Sentiment'] = df.apply(lambda row: 'Stock Up' if row['Close'] > row['Open'] else 'Stock Down', axis=1)

# Categorize 'Close' prices into bins
df['Price Range'] = pd.cut(df['Close'], bins=[0, 50, 100, 200, 400], labels=['0-50', '51-100', '101-200', '201-400'])

# Plot countplot
plt.figure(figsize=(6, 4))
sns.countplot(x=df["Price Range"], hue=df["Sentiment"], palette="rocket")
plt.title("Stock Movement Across Price Ranges")
plt.xlabel("Close Price Range (₹)")
plt.ylabel("Number of Trading Days")
plt.show()

##### 1. Why did you pick the specific chart?

The countplot is used to represent the occurrence (counts) of observations within categorical data. So, I decided to use this plot to visualize the frequency of stock price sentiment (Up or Down) across different price ranges. It helps to compare how often the stock moved up or down when closing within certain price intervals.

##### 2. What is/are the insight(s) found from the chart?

In the ₹0–₹50 price range, the number of Stock Down days is slightly higher than Stock Up.

In the ₹51–₹100 and ₹101–₹200 ranges, Stock Up days clearly dominate.

In the ₹201–₹400 range, Stock Up days are still higher, but the difference narrows.

We can clearly see that:

The stock tends to close higher than it opened more frequently in all ranges above ₹50, which shows positive sentiment in higher trading price bands.

The lowest range (₹0–₹50) has a higher chance of closing lower, suggesting more volatility or selling pressure during low price phases.
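
The counts behind a chart like this can also be tabulated directly, which makes the comparison between ranges easier to check. The sketch below uses made-up rows and the same bin edges as the chart code; `demo` and `counts` are illustrative names.

```python
import pandas as pd

# Made-up rows, for illustration only
demo = pd.DataFrame({
    "Open":  [40.0, 60.0, 150.0, 250.0, 45.0, 90.0],
    "Close": [38.0, 65.0, 160.0, 240.0, 50.0, 95.0],
})
demo["Sentiment"] = (demo["Close"] > demo["Open"]).map(
    {True: "Stock Up", False: "Stock Down"}
)

# pd.cut uses right-closed intervals by default: (0, 50], (50, 100], ...
# A close of exactly 50 lands in '0-50'; a close above 400 would become NaN
demo["Price Range"] = pd.cut(
    demo["Close"],
    bins=[0, 50, 100, 200, 400],
    labels=["0-50", "51-100", "101-200", "201-400"],
)

# Rows are price ranges, columns are sentiment labels
counts = pd.crosstab(demo["Price Range"], demo["Sentiment"])
print(counts)
```

Because values outside the outermost bin edges become NaN, it is worth checking `demo["Price Range"].isna().sum()` before interpreting the chart.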

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the chart can help create a positive business impact. The analysis of stock movement across various price ranges reveals that the majority of trading days resulted in the stock closing higher than it opened, especially in the mid to high price bands. This consistent upward trend suggests positive investor sentiment, which can boost investor confidence and attract more investments.

From a business and financial standpoint, such insights are valuable for:

Making informed decisions on entry and exit points in trading.

Identifying price zones where the stock is likely to perform well.

Building investor trust if consistent upward movements are sustained.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
import matplotlib.pyplot as plt
import seaborn as sns

# Plot histogram of closing prices
plt.figure(figsize=(4,3))
sns.histplot(data=df, x="Close", color="chocolate", bins=15)
plt.xticks(rotation="vertical")
plt.title("Distribution of Closing Prices")
plt.ylabel("Number of Days")
plt.xlabel("Closing Price (₹)")
plt.show()

##### 1. Why did you pick the specific chart?

I used a histogram because it is the most effective chart to visualize the distribution of numerical data over intervals. In this case, plotting the distribution of closing stock prices allows us to see at which price levels the stock most frequently closed. The histogram quickly reveals patterns such as clustering, outliers, or skewness in the data, which are critical for financial and investment decision-making.

##### 2. What is/are the insight(s) found from the chart?

The majority of closing prices fall within the ₹0–₹100 range.

There are fewer days where the stock closed above ₹200, and very few in the highest price bins (₹300–₹400).

The distribution is positively skewed, meaning most prices are on the lower end, with a long tail stretching toward higher prices.
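
The visual impression of positive skew can be backed by a number. This is a small sketch on synthetic values, not the notebook's data; a skewness above zero and a mean above the median both point to a right tail.

```python
import pandas as pd

# Synthetic closing prices with a few large values in the right tail
close = pd.Series([10, 12, 15, 18, 20, 25, 30, 60, 150, 350], dtype=float)

# Sample skewness; values above 0 indicate a longer right tail
skewness = close.skew()

# In a right-skewed distribution the mean is pulled above the median
mean, median = close.mean(), close.median()

print(f"skew={skewness:.2f}, mean={mean:.1f}, median={median:.1f}")
```

On the real data the same check would be `df["Close"].skew()`.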

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights can help create a positive business impact:

Knowing that most closing prices are clustered in a certain range allows investors and analysts to identify stable price zones for strategic decision-making.

It also helps in setting realistic targets for trading strategies, risk analysis, or stock valuation.

However, there are also potential negative indicators:

The concentration of trading in the lower price ranges could indicate limited upward momentum or underperformance over the observed period.

This might suggest investor hesitation or past financial instability, which could affect confidence and limit long-term growth unless corrected by stronger fundamentals or positive market sentiment.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Create sentiment column if not already created
df['Sentiment'] = df.apply(lambda row: 'Stock Up' if row['Close'] > row['Open'] else 'Stock Down', axis=1)

# Create a 'Price Range' category
df['Price Range'] = pd.cut(df['Close'], bins=[0, 50, 100, 200, 400], labels=['0-50', '51-100', '101-200', '201-400'])

# Set visual style
sns.set(rc={'figure.figsize': (18, 8)})
sns.set_palette('husl')

# Plot countplot
graph = sns.countplot(x='Price Range', hue='Sentiment', data=df)

# Add title and labels
graph.set_title('Stock Sentiment by Price Range', fontsize=16, fontweight='bold')
graph.set_xlabel('Close Price Range (₹)', fontsize=14)
graph.set_ylabel('Number of Trading Days', fontsize=14)

# Add labels on bars
for bar in graph.containers:
    graph.bar_label(bar)

plt.show()

##### 1. Why did you pick the specific chart?

I chose a grouped bar chart (countplot) because it effectively compares two categorical variables — in this case, Price Ranges and Stock Sentiment (Up/Down). This type of chart makes it easier to:

Visually compare how sentiment varies across price brackets,

Identify where the stock performs well or poorly,

Interpret patterns quickly and clearly.

##### 2. What is/are the insight(s) found from the chart?

In the ₹51–₹100 and ₹101–₹200 price ranges, the stock has a significantly higher number of "Stock Up" days compared to "Stock Down" days. This shows strong performance in these brackets.

The ₹0–₹50 range has almost an equal or higher number of "Stock Down" days, indicating higher volatility or weaker market confidence at lower price levels.

The highest price range ₹201–₹400 has fewer entries, but still maintains a positive sentiment balance.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


Yes, the insights can drive positive business and investment decisions:

Identifying price bands with consistently positive sentiment allows traders and analysts to target optimal price zones for entry and exit.

It also indicates market confidence when the stock trades above ₹50, which could be used for positioning, marketing, or financial communication.

However, there are also signs of negative growth risks:

When the stock trades in the ₹0–₹50 range, it is more prone to negative sentiment, which may reflect underlying financial or operational concerns.

Prolonged trading in this range could lead to investor uncertainty, lower valuation, and market distrust.

#### Chart - 5

In [None]:
# Chart - 5 visualization code
import matplotlib.pyplot as plt

# Create Sentiment column if not already present
df['Sentiment'] = df.apply(lambda row: 'Stock Up' if row['Close'] > row['Open'] else 'Stock Down', axis=1)

# Prepare values
labels = df['Sentiment'].value_counts().index.tolist()
x = df['Sentiment'].value_counts().tolist()

# Plot pie chart
plt.rcParams['figure.figsize'] = 12, 9
plt.title('Stock Movement Percentage', bbox={'facecolor': '0.9', 'pad': 5})
plt.pie(x,
        explode=(0, 0.08),
        labels=labels,
        autopct='%1.1f%%',
        startangle=90,
        textprops={'fontsize': 14},
        shadow=True)
plt.show()

##### 1. Why did you pick the specific chart?

I chose a pie chart for this univariate analysis because it is an effective way to visually represent part-to-whole relationships. It helps in quickly understanding how much each category contributes to the total. In this case, the pie chart displays the percentage of trading days where the stock closed higher or lower than it opened, offering a clear view of overall market sentiment toward the stock.

##### 2. What is/are the insight(s) found from the chart?

The pie chart shows that Stock Up days account for approximately 67.7% of the data, indicating that the stock closed higher than it opened on most trading days.

Stock Down days represent about 32.3%, which is significantly lower.

This suggests that positive momentum or bullish sentiment is more common for Yes Bank stock during the observed period.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights can support positive business and investment decisions:

A high percentage of Stock Up days reinforces the idea that the stock has shown steady performance and potential growth, which can attract more investors and improve market reputation.

Financial analysts and traders can leverage this trend to build short-term trading strategies, target buy/sell zones, or make data-driven portfolio decisions.

On the other hand, the presence of 32.3% Stock Down days serves as a reminder of the stock's volatility. If this downward percentage increases in the future, it may indicate:

Market corrections

Weak investor trust

Possible financial or operational challenges

#### Chart - 6

In [None]:
# Chart - 6 visualization code
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Create Price Range column
df['Price Range'] = pd.cut(df['Close'], bins=[0, 50, 100, 200, 400], labels=['0-50', '51-100', '101-200', '201-400'])

# Group by Price Range and calculate average Close price
avg_close_by_range = df.groupby('Price Range')['Close'].mean().reset_index().rename(columns={'Close': 'Average Close Price'})

# Plot
plt.figure(figsize=(6, 8))
ax = sns.barplot(x='Price Range', y='Average Close Price', data=avg_close_by_range, palette='Blues')

# Annotate values on bars
for index, row in avg_close_by_range.iterrows():
    ax.text(index, row['Average Close Price'], round(row['Average Close Price'], 2),
            color='black', ha="center", va="bottom", fontsize=10)

plt.title("Average Closing Price by Price Range")
plt.xlabel("Price Range (₹)")
plt.ylabel("Average Closing Price (₹)")
plt.show()

##### 1. Why did you pick the specific chart?

I used a bar chart for bivariate analysis because it effectively shows the relationship between two variables — in this case, price ranges (categorical) and their corresponding average closing prices (numerical). Bar charts make it easy to compare magnitudes across categories and are well-suited for visualizing trends in grouped financial data. Grouping by price range allowed me to identify how average stock performance differs across valuation zones.

##### 2. What is/are the insight(s) found from the chart?

The average closing price increases steadily as we move from lower to higher price ranges, as expected.

The highest average occurs in the ₹201–₹400 price range. Because the bins are defined on the closing price itself, this ordering is expected and mainly confirms that the binning works as intended.

The lowest average is found in the ₹0–₹50 range, reflecting lower investor sentiment and market valuation.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights are useful for creating a positive business and financial impact:

Investors and analysts can use this data to strategically time entries and exits based on historically strong price bands.

The upward trend in average close price by range shows that the stock has growth potential and responds well when trading above ₹100, boosting investor confidence.

However, the analysis also highlights areas for caution:

If the stock remains stuck in the ₹0–₹50 range, it might indicate reduced market confidence, poor fundamentals, or external pressures.

Prolonged trading in this low band could lead to negative growth, both in terms of stock value and investor trust.

#### Chart - 7

In [None]:
# Chart - 7 visualization code
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Create necessary columns
df['Sentiment'] = df.apply(lambda row: 'Stock Up' if row['Close'] > row['Open'] else 'Stock Down', axis=1)
df['Price Movement'] = df['High'] - df['Low']  # Daily price swing

# Filter outliers into a separate copy so that df keeps all rows for later charts
movement_df = df[df['Price Movement'] < 100].copy()

# Bin price movement into categories (intervals are right-closed: (0, 5], (5, 10], ...)
movement_df['Movement Range'] = pd.cut(movement_df['Price Movement'],
                               bins=[0, 5, 10, 20, 50, 100],
                               labels=['0–5₹', '6–10₹', '11–20₹', '21–50₹', '51–100₹'])

# Plot countplot
plt.figure(figsize=(9, 7))
sns.countplot(x='Movement Range', hue='Sentiment', data=movement_df, palette='dark')
plt.title('Frequency of Daily Price Movement Ranges by Stock Sentiment', bbox={'facecolor': '0.8', 'pad': 3})
plt.xlabel('Daily Price Movement Range')
plt.ylabel('Number of Trading Days')
plt.show()

##### 1. Why did you pick the specific chart?

I chose a grouped bar chart (countplot) because it clearly shows the distribution of a numerical feature (daily price movement) across categories (Sentiment: Stock Up or Stock Down).
By binning the daily price change (High – Low) into ranges, we can analyze how often the stock experienced certain levels of volatility, and how that volatility corresponds to positive or negative sentiment. This kind of bivariate comparison is useful for identifying volatility patterns and market behavior.

##### 2. What is/are the insight(s) found from the chart?

The most frequent price movement range is between ₹0–₹5, and in most of these cases, the sentiment is Stock Up.

As the price movement increases beyond ₹10 or ₹20, the number of both Stock Up and Stock Down days decreases, but Stock Down sentiment becomes more frequent in high-volatility bands.

Days with larger swings (₹20 and above) tend to have mixed sentiment, indicating uncertainty or speculative trading.
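
Raw counts shrink as the movement range grows, so comparing shares within each range is a fairer way to check how sentiment changes with volatility. The sketch below uses made-up OHLC rows and ASCII bin labels; the names `demo` and `pct` are hypothetical.

```python
import pandas as pd

# Made-up OHLC rows covering every movement range
demo = pd.DataFrame({
    "Open":  [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 100.0],
    "High":  [13.0, 24.0, 45.0, 70.0, 52.0, 68.0, 160.0],
    "Low":   [ 9.0, 19.0, 28.0, 35.0, 48.0, 59.0,  95.0],
    "Close": [12.0, 23.0, 29.0, 36.0, 51.0, 66.0,  98.0],
})
demo["Sentiment"] = (demo["Close"] > demo["Open"]).map(
    {True: "Stock Up", False: "Stock Down"}
)
demo["Price Movement"] = demo["High"] - demo["Low"]
demo["Movement Range"] = pd.cut(
    demo["Price Movement"],
    bins=[0, 5, 10, 20, 50, 100],
    labels=["0-5", "6-10", "11-20", "21-50", "51-100"],
)

# Row-normalized table: each movement range sums to 100%
pct = pd.crosstab(
    demo["Movement Range"], demo["Sentiment"], normalize="index"
).mul(100)
print(pct.round(1))
```

Reading the table row by row shows whether the Stock Down share really rises in the wider movement ranges, independent of how many records each range contains.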

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights are valuable for trading strategy and risk management:

Investors can focus on trading during low volatility periods, which historically show more positive sentiment and lower risk.

Portfolio managers can use this to predict risk zones and apply tighter stop-loss strategies when the stock enters higher volatility.

However, certain insights could indicate negative growth risks:

The presence of high price swings with negative sentiment might reflect market instability, external economic concerns, or speculative selling.

Repeated large downward movements may shake investor confidence and signal poor market control or news sensitivity, which is harmful for long-term valuation.

#### Chart - 8

In [None]:
# Chart - 8 visualization code
import pandas as pd

# Create sentiment column if not already present
df['Sentiment'] = df.apply(lambda row: 'Stock Up' if row['Close'] > row['Open'] else 'Stock Down', axis=1)

# Group by sentiment and calculate percentage
sentiment_group = df.groupby('Sentiment')
sentiment_percentage = pd.DataFrame(round((sentiment_group.size() / df.shape[0]) * 100, 2)).reset_index().rename(columns={0: 'Day_%'})

# Display the result
sentiment_percentage

In [None]:
import matplotlib.pyplot as plt

# Pie chart for Sentiment distribution
plt.figure(figsize=(9, 9))
# Use the variable 'sentiment_percentage' which holds the calculated data
data = sentiment_percentage['Day_%']           # Percentage values
labels = sentiment_percentage['Sentiment']     # 'Stock Up', 'Stock Down'

plt.pie(
    x=data,
    labels=labels,
    autopct="%0.2f%%",
    explode=[0.04] * len(data),
    pctdistance=0.8,
    shadow=True
)

plt.title("Stock Sentiment Distribution (%)", fontsize=15)
plt.show()



##### 1. Why did you pick the specific chart?

I selected a pie chart because it is highly effective for showing percentage-based comparisons in a simple and visual way.
In this case, we’re analyzing how often the stock closed higher or lower than it opened — represented as Stock Up and Stock Down. The circular format makes it easy to communicate the proportion of positive vs negative trading days at a glance. This is perfect for summarizing sentiment and providing high-level insights in dashboards or reports.

##### 2. What is/are the insight(s) found from the chart?

The chart reveals that approximately 67.7% of trading days resulted in a positive movement (Stock Up).

Only 32.3% of trading days had a negative movement (Stock Down).

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Business Impact:

A higher percentage of Stock Up days builds investor confidence, helping to attract long-term investors and institutional interest.

This insight can guide portfolio strategies, encouraging traders to participate more actively in a stock that historically closes higher more often.

Financial analysts and business stakeholders can use this sentiment metric to support marketing, investor relations, and strategic decisions.

Potential for Negative Growth:

While 32.3% of the days are Stock Down, if these days involve sharp declines, it may indicate periods of high volatility or market sensitivity.

A shift in trend (i.e., if Stock Down % begins to rise in future data) could be an early warning signal for declining performance, investor distrust, or macroeconomic pressure.

#### Chart - 9

In [None]:
# Chart - 9 visualization code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Label each day: 'Stock Down' when Close < Open, otherwise 'Stock Up'
# (flat days, where Close == Open, are counted as 'Stock Up')
df['Sentiment'] = (df['Close'] < df['Open']).map({True: 'Stock Down', False: 'Stock Up'})

# Create Price Range bins
df['Price Range'] = pd.cut(df['Close'], bins=[0, 50, 100, 200, 400], labels=['0–50', '51–100', '101–200', '201–400'])

# Group by Price Range and calculate Stock Down %
# The mean of a boolean flag is the share of True values;
# observed=False keeps empty price bands in the result
d1 = (
    df['Sentiment'].eq('Stock Down')
    .groupby(df['Price Range'], observed=False)
    .mean()
    .mul(100)
    .to_frame('Down_%')
)


# Plotting
plt.figure(figsize=(8, 6))
sns.barplot(x=d1.index, y=d1['Down_%'], palette='Oranges')

plt.title('Stock Down % Among All Price Ranges', bbox={'facecolor': '0.8', 'pad': 3})
plt.xlabel('Price Range (₹)')
plt.ylabel('Stock Down Percentage (%)')
plt.ylim(0, 100)
plt.show()

##### 1. Why did you pick the specific chart?

I chose a bar chart for this analysis because it effectively shows the percentage of Stock Down days across different price ranges.
Bar charts are ideal for comparing discrete categories (price bands here) and make it easier to see which price ranges are more prone to daily losses. The chart is a bivariate view (sentiment conditioned on price band) and offers insight into the riskiness of certain price zones.



##### 2. What is/are the insight(s) found from the chart?

The lowest price band (₹0–₹50) tends to have a higher percentage of Stock Down days, suggesting increased volatility or selling pressure at lower prices.

As the stock moves to higher price ranges (₹101–₹400), the Stock Down % decreases, indicating more price strength and stability at higher valuations.

The ₹201–₹400 range shows the lowest Stock Down percentage, which could imply better investor sentiment or institutional buying support in that band.
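
Percentages per band are only as reliable as the number of days behind them. As a hedged sketch (hypothetical `sample` values; the band labels use plain hyphens here), the Stock Down share can be tabulated next to the day count for each band so that thinly populated bands are easy to spot.

```python
import pandas as pd

# Hypothetical sample rows standing in for the notebook's `df`
sample = pd.DataFrame({
    "Open":  [25.0, 38.0, 50.0, 55.0, 160.0, 290.0],
    "Close": [20.0, 40.0, 45.0, 60.0, 150.0, 300.0],
})

bins = [0, 50, 100, 200, 400]
labels = ["0-50", "51-100", "101-200", "201-400"]
sample["Price Range"] = pd.cut(sample["Close"], bins=bins, labels=labels)

# True on days that closed below the open
sample["Down"] = sample["Close"] < sample["Open"]

# Day count and Stock Down share per band
summary = sample.groupby("Price Range", observed=False)["Down"].agg(
    days="count", down_pct="mean"
)
summary["down_pct"] *= 100

print(summary.round(1))
```

A band with only a handful of days can show 0% or 100% purely by chance, so the `days` column should be read before drawing conclusions from `down_pct`.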

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Business Impact:

Traders and investors can use this analysis to identify safer price zones (e.g., ₹101 and above) where downside risk is lower.

It helps build strategies around stop-loss and entry levels, maximizing returns and reducing exposure.

Analysts can use this to explain stock behavior during earnings calls, market reports, or investment advisory sessions.

Insights That May Indicate Negative Growth:

If the stock frequently falls when it's priced below ₹50, this could reflect weak market sentiment, external risk factors, or lack of institutional support.

Sustained high Stock Down % in lower price bands could discourage retail and institutional participation, leading to decreased liquidity and negative valuation perception.



#### Chart - 10

In [None]:
# Chart - 10 visualization code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Create Price Range from Close Price
df['Price Range'] = pd.cut(df['Close'], bins=[0, 50, 100, 200, 400], labels=['0–50', '51–100', '101–200', '201–400'])

# Setup subplots
fig, axes = plt.subplots(1, 2, figsize=(18, 9))

# Left plot: number of trading days in each price range
plot = sns.countplot(ax=axes[0], x=df['Price Range'], palette='Set2')
axes[0].set_title('Number of Trading Days per Price Range')
axes[0].set_xlabel('Price Range (₹)')
axes[0].set_ylabel('Number of Days')
for bar in plot.containers:
    plot.bar_label(bar)

# Right plot: distribution of Close within each price range
sns.boxplot(ax=axes[1], x=df['Price Range'], y=df['Close'], palette='Set3')
axes[1].set_title('Close Price Distribution by Price Range')
axes[1].set_xlabel('Price Range (₹)')
axes[1].set_ylabel('Closing Price')

plt.tight_layout()
plt.show()


##### 1. Why did you pick the specific chart?

The countplot (on the left) shows how many trading days fall into each price range, indicating where the stock most frequently trades.

The boxplot (on the right) shows the distribution of closing prices within each price range.

##### 2. What is/are the insight(s) found from the chart?

The 51–100 range had the most trading days, suggesting the stock was most stable or popular in that range.

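The Close price boxplot in this range shows a narrow interquartile range (IQR) and fewer outliers, indicating that closing prices cluster tightly within the band. Because the bands are built from Close itself, each box is bounded by its band edges, so this describes clustering rather than a direct measure of volatility.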

The 201–400 range had the fewest trading days, meaning the stock rarely reached or stayed at those levels — indicating a less accessible or overvalued zone during the observed period.

The lowest range (0–50) shows more outliers and wider spread, implying higher volatility at very low price points.
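
The boxplot reading can be backed by numbers. The sketch below, on hypothetical values, tabulates the day count and the quartiles of Close per band; `band_stats` and the column names are illustrative assumptions. Bands with no observations are dropped via `observed=True`.

```python
import pandas as pd

# Hypothetical closing prices standing in for the notebook's `df['Close']`
sample = pd.DataFrame({"Close": [10.0, 20.0, 30.0, 40.0, 60.0, 70.0, 250.0]})

bins = [0, 50, 100, 200, 400]
labels = ["0-50", "51-100", "101-200", "201-400"]
sample["Price Range"] = pd.cut(sample["Close"], bins=bins, labels=labels)

# observed=True keeps only the bands that actually contain data
grouped = sample.groupby("Price Range", observed=True)["Close"]

band_stats = pd.DataFrame({
    "days": grouped.count(),
    "q1": grouped.quantile(0.25),
    "q3": grouped.quantile(0.75),
})
band_stats["iqr"] = band_stats["q3"] - band_stats["q1"]

print(band_stats)
```

Note that the IQR of Close inside a band can never exceed the band width, so the wider bands (101-200, 201-400) allow a larger spread by construction.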



##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Business Impact:

Helps investors and analysts identify safe trading bands (e.g., ₹51–₹100) where the stock tends to be stable.

Traders may set buy zones and sell targets using these insights, for example entering in the ₹51–₹100 range where activity is dense and price is consistent.

Companies or financial planners can use this to understand investor behavior and plan communication or strategic buybacks.

Potential Indicators of Negative Growth:

If the stock frequently appears in lower price bands (0–50) with high variance, this could indicate investor uncertainty, deteriorating fundamentals, or market pressure.

Reduced activity in higher ranges (₹200+) might signal low market confidence in high valuations.



#### Chart - 11

In [None]:
# Chart - 11 visualization code
import matplotlib.pyplot as plt
import seaborn as sns

# Scatter plot of Open vs Close price
plt.figure(figsize=(15, 9))
# Optionally set plt.ylim(...) here to match the actual price range
sns.scatterplot(x='Open', y='Close', data=df, color='coral')

# Styling
plt.title('Open Price vs Closing Price', bbox={'facecolor': '0.8', 'pad': 3})
plt.xlabel('Opening Price (₹)')
plt.ylabel('Closing Price (₹)')
plt.grid(True)
plt.show()


##### 1. Why did you pick the specific chart?

A scatter plot is ideal for showing the relationship between two continuous variables — in this case, Open and Close prices of Yes Bank stock.
This chart helps you:

Understand how closely the stock's closing price tracks its opening price.

Identify daily volatility, trends, or consistency in stock performance.

##### 2. What is/are the insight(s) found from the chart?

If points lie close to the diagonal (y = x), it shows that prices remain stable during the day.

If there is significant spread, it suggests intraday volatility — useful for day traders.

Points above the diagonal = days when stock closed higher than it opened (bullish).

Points below the diagonal = bearish trading days.
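
The reading of the diagonal can be turned into numbers. As a minimal sketch on hypothetical values (the variable names are assumptions), the share of points above, below, and on the line y = x, together with the typical distance from it, can be computed as follows.

```python
import pandas as pd

# Hypothetical sample rows standing in for the notebook's `df`
sample = pd.DataFrame({
    "Open":  [10.0, 20.0, 40.0, 50.0],
    "Close": [11.0, 19.0, 40.0, 55.0],
})

# Signed distance from the diagonal, as a percentage of the open
diff_pct = (sample["Close"] - sample["Open"]) / sample["Open"] * 100

above_pct = (diff_pct > 0).mean() * 100    # closed higher than opened
below_pct = (diff_pct < 0).mean() * 100    # closed lower than opened
on_line_pct = (diff_pct == 0).mean() * 100  # closed exactly at the open

# Typical size of the intraday move, ignoring direction
median_abs_move = diff_pct.abs().median()

print(above_pct, below_pct, on_line_pct, median_abs_move)
```

A small `median_abs_move` corresponds to points hugging the diagonal; a large one corresponds to the visible spread described above.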



##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
If the stock shows consistent upward closing prices, it reflects investor confidence and can attract more short-term traders or intraday investors. It also indicates price momentum, which is useful for planning trading strategies or issuing recommendations.

Negative Impact:
If most days show closing prices lower than the opening, it may reflect negative market sentiment or poor performance, which could drive away investors and lead to stock value depreciation. This can impact the company’s market image and trading volume.

#### Chart - 12

In [None]:
# Chart - 12 visualization code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Step 1: Convert the 'Date' column to datetime format
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')

# Step 2: Extract the year and month in 'YYYY-MM' format
df['Month'] = df['Date'].dt.to_period('M').astype(str)

# Step 3: Group by Month and calculate average closing price
monthly_activity = df.groupby('Month')['Close'].mean().sort_values(ascending=False).head(10)

# Step 4: Plotting
plt.figure(figsize=(12, 6))
bargraph = sns.barplot(x=monthly_activity.index, y=monthly_activity.values, palette='Blues_d')

# Title and labels
plt.title('Top 10 Months by Average Closing Price (YesBank StockPrices)', bbox={'facecolor': '0.8', 'pad': 3})
plt.xlabel('Month')
plt.ylabel('Average Closing Price (₹)')

# Step 5: Add value labels on top of each bar
for container in bargraph.containers:
    bargraph.bar_label(container, fmt='%.2f')

# Final layout
plt.tight_layout()
plt.show()


##### 1. Why did you pick the specific chart?

A bar chart is ideal for comparing discrete categories — in this case, monthly average closing prices.

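It helps identify which specific year-month periods had the highest average closing price. Because each bar is a single month of a single year, the chart points to high-price episodes rather than recurring seasonal patterns.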



##### 2. What is/are the insight(s) found from the chart?

Shows the top 10 months with the highest average closing prices.

Can indicate bullish periods or times of market optimism.

Might correspond to key financial events like quarterly results, policy changes, or sector-specific rallies.
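
A top-10 list of year-month periods does not by itself show seasonality. One hedged way to look for it is to average by calendar month across all years, as in the sketch below (hypothetical dates and prices; `by_period` and `by_calendar_month` are illustrative names).

```python
import pandas as pd

# Hypothetical rows: the same two calendar months in two different years
sample = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2018-07-02", "2018-08-01", "2019-07-01", "2019-08-01"]
    ),
    "Close": [300.0, 350.0, 90.0, 60.0],
})

# Average per year-month period (what the bar chart ranks)
by_period = sample.groupby(sample["Date"].dt.to_period("M"))["Close"].mean()

# Average per calendar month, pooled across years (a seasonality view)
by_calendar_month = sample.groupby(sample["Date"].dt.month)["Close"].mean()

print(by_period)
print(by_calendar_month)
```

When the price level changes a lot between years, as in this toy sample, the pooled calendar-month averages are dominated by the high-price years, so comparing months within each year is a safer follow-up.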



##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
Investors and traders can plan better by knowing the months that historically perform well.

Helps analysts predict seasonal behavior in stock performance.

Financial advisors can optimize entry/exit timing based on such insights.

Negative Insight:
If only a few months perform well and others don’t, it may indicate volatility or inconsistency, which is risky for long-term investors.

A sharp decline after these top-performing months might suggest speculative bubbles or overreactions.




#### Chart - 13

In [None]:
# Chart - 13 visualization code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Step 1: Ensure Date column is in datetime format
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')

# Step 2: Extract Month-Year from Date
df['Month'] = df['Date'].dt.to_period('M').astype(str)

# Step 3: Calculate daily volatility (High - Low)
df['Volatility'] = df['High'] - df['Low']

# Step 4: Group by Month and compute average volatility
monthly_volatility = df.groupby('Month')['Volatility'].mean().sort_values(ascending=False).head(10)

# Step 5: Plotting
plt.figure(figsize=(12, 6))
sns.barplot(x=monthly_volatility.index, y=monthly_volatility.values, palette='coolwarm')
plt.title('Top 10 Months by Average Daily Volatility (Yes Bank Stock)', bbox={'facecolor': '0.8', 'pad': 3})
plt.xlabel('Month')
plt.ylabel('Average Daily Volatility (₹)')
plt.xticks(rotation=45)

# Add value labels
for container in plt.gca().containers:
    plt.gca().bar_label(container, fmt='%.2f')

plt.tight_layout()
plt.show()



##### 1. Why did you pick the specific chart?

A bar chart is ideal for comparing average values across different time periods.

In this case, it highlights the months in which Yes Bank stock had the widest average intraday range (High minus Low).

It’s useful for risk analysis, trading strategies, and volatility-based decision-making.



##### 2. What is/are the insight(s) found from the chart?

The chart reveals the top 10 months with the highest average volatility.

These spikes in volatility may correspond to:

Market reactions to quarterly earnings

Regulatory changes, or

Investor sentiment shifts

Months with higher volatility indicate periods of greater uncertainty or rapid price movement.
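
Because the range is measured in rupees, months in which the stock traded at a high price tend to rank higher regardless of how turbulent they were. A minimal sketch of a price-relative alternative follows; the `Volatility_%` column and the sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows: one high-priced month and one low-priced month
sample = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2018-07-02", "2018-07-03", "2020-03-02", "2020-03-03"]
    ),
    "High":  [306.0, 303.0, 22.0, 24.0],
    "Low":   [300.0, 297.0, 18.0, 20.0],
    "Close": [300.0, 300.0, 20.0, 20.0],
})

sample["Month"] = sample["Date"].dt.to_period("M").astype(str)

# Absolute range in rupees (as in the chart) and range relative to price
sample["Volatility"] = sample["High"] - sample["Low"]
sample["Volatility_%"] = sample["Volatility"] / sample["Close"] * 100

monthly = sample.groupby("Month")[["Volatility", "Volatility_%"]].mean()

top_absolute = monthly["Volatility"].idxmax()
top_relative = monthly["Volatility_%"].idxmax()

print(monthly)
print("Top by rupee range:", top_absolute)
print("Top by percentage range:", top_relative)
```

In this toy sample the two rankings disagree: the high-priced month has the larger rupee range, while the low-priced month moves far more in percentage terms.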



##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
Traders can capitalize on high-volatility months using swing or intraday strategies.

Financial advisors can use this insight to guide clients on optimal timing based on risk appetite.

Investors can plan risk management strategies like stop-loss or hedge positions during volatile periods.

Negative Impact:
High volatility often reflects market uncertainty or instability, which may reduce investor confidence.

If consistent, it may signal speculative activity or lack of faith in fundamentals, leading to negative long-term perception of the stock.



#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Select the numeric price columns for correlation
corr_df = df[['Open', 'High', 'Low', 'Close']]

# Compute the correlation matrix
correlation_matrix = corr_df.corr()

# Plotting the heatmap
plt.figure(figsize=(8, 8))
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm', square=True,
            annot_kws={'size': 10}, linewidths=0.5, cbar_kws={'shrink': 0.8})

plt.title('Correlation Heatmap of Yes Bank Stock Prices', fontsize=14, bbox={'facecolor': '0.9', 'pad': 3})
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A correlation heatmap is the best tool to visualize relationships between numerical variables.

In stock data, this chart helps determine how strongly variables like Open, High, Low, and Close prices are related.

It provides a quick overview of linear relationships in a compact, visual form.

##### 2. What is/are the insight(s) found from the chart?

The Close price shows a very strong positive correlation with Open, High, and Low prices (correlation coefficients close to 1.00).

This indicates that these prices tend to move together, which is expected in financial data where daily price movements are tightly coupled.

High and Low prices also have a near-perfect correlation, showing that the trading range remains proportionally consistent.
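
Price levels that share a common trend are almost always highly correlated, so a near-1.00 coefficient says little about day-to-day behaviour. A common complementary check, sketched here on hypothetical values, is to correlate period-over-period percentage changes instead of levels.

```python
import pandas as pd

# Hypothetical trending prices standing in for the notebook's `df`
sample = pd.DataFrame({
    "Open":  [100.0, 110.0, 120.0, 130.0, 140.0, 150.0],
    "Close": [105.0, 108.0, 125.0, 128.0, 146.0, 149.0],
})

# Correlation of price levels (what the heatmap shows)
level_corr = sample["Open"].corr(sample["Close"])

# Correlation of percentage changes from one row to the next
returns = sample[["Open", "Close"]].pct_change().dropna()
return_corr = returns["Open"].corr(returns["Close"])

print(f"Level correlation:  {level_corr:.3f}")
print(f"Return correlation: {return_corr:.3f}")
```

In this toy sample the levels are almost perfectly correlated while the changes are not, which illustrates why the heatmap on levels should be interpreted with care.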



#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code
# No `palette` argument: it only takes effect together with `hue`
sns.pairplot(
    df,
    vars=['Open', 'High', 'Low', 'Close'],
    height=2.5
)

plt.suptitle("Pairwise Relationships of Stock Price Features",
             y=1.02, fontsize=14, fontweight='bold')

plt.show()


##### 1. Why did you pick the specific chart?

A pair plot is ideal when you want to explore the relationship between multiple numerical features in a dataset.

It allows visual inspection of pairwise correlations, linear trends, and potential outliers between Open, High, Low, and Close prices.

This chart is particularly valuable in financial datasets where price attributes are often interdependent.

##### 2. What is/are the insight(s) found from the chart?

The plot shows strong positive linear relationships among all price columns — especially:

Open vs Close

High vs Low

Close vs High

This confirms that Yes Bank’s daily prices are tightly correlated, as expected in stock data.

The diagonal plots (histograms) also reveal the distribution of each feature, showing that:

Close prices are more concentrated in a certain range,

While High and Low prices show more spread, indicating variability in daily highs and lows.



## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

1. Monitor Seasonal Stock Volatility
Insights from monthly average volatility suggest that some months experience higher price fluctuations.

  Management should align communication and investor engagement strategies during high-volatility periods to maintain investor confidence.

2. Focus on Bullish Momentum
The majority of days indicate “Stock Up” sentiment (Close ≥ Open under the labelling rule used in Chart 9, where flat days count as up).

 This positive trend can be used to:

 Highlight growth in investor reports.

 Market products like mutual funds or ETFs that include Yes Bank stock to retail investors.

3. Targeted Investor Education
Based on pair plots and correlation heatmaps, stock prices show strong internal consistency (Open, High, Low, Close are highly correlated).

 Use this data to educate new investors, reinforcing Yes Bank’s price behavior and reducing panic selling.

4. Promote Long-Term Holding
If short-term price swings (volatility) are high, encourage long-term holding by:

 Offering dividend clarity or bonus/share splits.

 Publishing transparent financial goals to build trust with shareholders.

5. Data-Driven Marketing and Outreach
Use regional investor trends and historical price movements to target digital campaigns.

 Promote Yes Bank stock via:

  Investor webinars

  Retail brokerage tie-ups

  Finance influencer networks

# **Conclusion**

Most trading days close at or above the opening price: the ‘Stock Up’ label covers ~67.7% of days and ‘Stock Down’ the remaining ~32.3%, suggesting an overall bullish intraday tendency.

High correlation exists between Open, High, Low, and Close prices, showing strong consistency in daily stock movements.

Price ranges (High – Low) are wider in certain months, indicating increased volatility — particularly in mid-year periods.

Trading volumes and price range do not always move together, suggesting that volume surges may be driven by external events (e.g., announcements, market news) rather than just price movement.

Monthly average closing price analysis reveals that some months (e.g., July and August) show consistently higher stock prices, indicating seasonal trading interest or positive market sentiment during those times.

Open vs Close scatter plot shows a strong linear trend, reinforcing that daily closing prices closely follow opening trends, helpful for short-term prediction strategies.

Long-term trends can be identified by grouping and analyzing monthly or quarterly averages, which assist in investment planning and portfolio timing.

The data supports using simple technical indicators and trend-following strategies, as the stock exhibits structured price behavior.

Overall, Yes Bank stock data provides meaningful insights that can be leveraged for trading strategy development, investor communication, and market positioning.
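
The quarterly averages and simple trend indicators mentioned above can be sketched in a few lines of pandas. The dates, prices, and the `SMA_3` column name below are hypothetical, and a three-period simple moving average is only one possible choice of indicator.

```python
import pandas as pd

# Hypothetical monthly closing prices standing in for the notebook's `df`
dates = pd.date_range("2020-01-01", periods=6, freq="MS")
sample = pd.DataFrame({
    "Date": dates,
    "Close": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# Average closing price per calendar quarter
quarterly_avg = sample.groupby(sample["Date"].dt.to_period("Q"))["Close"].mean()

# Three-period simple moving average as a basic trend-following indicator
sample["SMA_3"] = sample["Close"].rolling(window=3).mean()

print(quarterly_avg)
print(sample)
```

The first two `SMA_3` values are missing because a three-period window needs three observations before it produces a value.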


