# Project Name - Yes Bank Stock Closing Price Prediction

- Project Type - Yes Bank Stock Closing Price Prediction
- Contribution - Group
- Team Member 1 - Vipil Khapre
- Team Member 2 - Pranay Kuthe

# GitHub Link 

# Problem Statement (Data Set Information)

- In recent years, Yes Bank, a prominent bank in the Indian financial sector, has garnered significant attention due to a high-profile fraud case involving its former CEO Rana Kapoor. The objective of this project is to explore the impact of such events on the stock prices of the bank and determine whether time series models or other predictive models can effectively capture the dynamics of stock price movements. The dataset at hand contains monthly stock price information for Yes Bank, spanning from its inception. This dataset includes the closing, opening, highest, and lowest stock prices for each month. The primary goal is to develop predictive models that can accurately forecast the closing stock price for the upcoming months.

# Let's Begin !

## ***1. Know Your Data***

### Import Libraries

In [3]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
from datetime import date
%matplotlib inline

### Dataset Loading

In [4]:
df = pd.read_csv('data_YesBank_StockPrices.csv')

### Dataset First and last View

In [5]:
# First five rows of the dataset
df.head()

Unnamed: 0,Date,Open,High,Low,Close
0,Jul-05,13.0,14.0,11.25,12.46
1,Aug-05,12.58,14.88,12.55,13.42
2,Sep-05,13.48,14.87,12.27,13.3
3,Oct-05,13.2,14.47,12.4,12.99
4,Nov-05,13.35,13.88,12.88,13.41


In [6]:
df.tail()

Unnamed: 0,Date,Open,High,Low,Close
180,Jul-20,25.6,28.3,11.1,11.95
181,Aug-20,12.0,17.16,11.85,14.37
182,Sep-20,14.3,15.34,12.75,13.15
183,Oct-20,13.3,14.01,12.11,12.42
184,Nov-20,12.41,14.9,12.21,14.67


### Dataset Rows & Columns count

In [7]:
# Dataset Rows and Columns
df.shape

(185, 5)

### Dataset Information

In [8]:
# Dataset info
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 185 entries, 0 to 184
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Date    185 non-null    object 
 1   Open    185 non-null    float64
 2   High    185 non-null    float64
 3   Low     185 non-null    float64
 4   Close   185 non-null    float64
dtypes: float64(4), object(1)
memory usage: 7.4+ KB


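### What did you know about your dataset?

The dataset has 185 rows and 5 columns. `Date` is stored as a string (object), while `Open`, `High`, `Low`, and `Close` are float64. Every column has 185 non-null entries.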

# Checking missing values

In [9]:
# Checking missing values in the dataset
df.isnull().sum()

Date     0
Open     0
High     0
Low      0
Close    0
dtype: int64

## ***2. Understanding Your Variables***

In [10]:
# Dataset columns
df.columns

Index(['Date', 'Open', 'High', 'Low', 'Close'], dtype='object')

#### Description of variables of the dataframe

In [11]:
# Showing summary statistics for the dataset
df.describe()

Unnamed: 0,Open,High,Low,Close
count,185.0,185.0,185.0,185.0
mean,105.541405,116.104324,94.947838,105.204703
std,98.87985,106.333497,91.219415,98.583153
min,10.0,11.24,5.55,9.98
25%,33.8,36.14,28.51,33.45
50%,62.98,72.55,58.0,62.54
75%,153.0,169.19,138.35,153.3
max,369.95,404.0,345.5,367.9


### Variables Description

1. Date  : Month and year of the record (for example, Jul-05)
2. Open  : Opening price for the month
3. High  : Highest price during the month
4. Low   : Lowest price during the month
5. Close : Closing price for the month

### Check Unique Values for each variable.

In [12]:
# Check Unique Values for each variable.
for i in df.columns.tolist():
  print("No. of unique values in",i,"is",df[i].nunique())

No. of unique values in Date is 185
No. of unique values in Open is 183
No. of unique values in High is 184
No. of unique values in Low is 183
No. of unique values in Close is 185


### Check duplicate values.

In [63]:
# Count duplicate rows (the result below shows there are none)
df.duplicated().sum()

0

# Creating new columns

In [13]:
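# Create 'Day' and 'Month' columns from the 'Date' column.
# Note: 'Date' holds strings such as 'Jul-05' (month-year), so the part after the
# hyphen that is stored in 'Day' is the two-digit year (05 = 2005), not a day of the month.
df['Day'] = df['Date'].str.split('-').str[1]
df['Month'] = df['Date'].apply(lambda x: datetime.strptime(x, '%b-%y').month)
df = df.drop('Date', axis=1)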

# Changing the data type of columns in the dataset


In [14]:
# changing the datatype of the Day column from object to int
df['Day'] = df['Day'].astype(int)
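
Because the `Date` strings follow a month-year pattern, the split above yields a two-digit year. As a minimal sketch of an alternative, assuming the same `'%b-%y'` format used in the cell above, `pd.to_datetime` can parse the strings once and expose both the full year and the month. The `sample` frame and the `Year` column name are hypothetical and are not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample that mirrors the 'Mon-YY' strings seen in the Date column
sample = pd.DataFrame({"Date": ["Jul-05", "Aug-05", "Nov-20"]})

# Parse 'Mon-YY' into timestamps (each one falls on the first day of the month)
parsed = pd.to_datetime(sample["Date"], format="%b-%y")

# Derive a four-digit year and a numeric month from the parsed timestamps
sample["Year"] = parsed.dt.year
sample["Month"] = parsed.dt.month

print(sample)
```

Keeping a full year avoids the ambiguity of a two-digit value such as 5, and both derived columns are already integers, so no separate type conversion is needed.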

# Exploratory Data Analysis (EDA)

In [70]:
df.head()

Unnamed: 0,Open,High,Low,Close,Day,Month
0,13.0,14.0,11.25,12.46,5,7
1,12.58,14.88,12.55,13.42,5,8
2,13.48,14.87,12.27,13.3,5,9
3,13.2,14.47,12.4,12.99,5,10
4,13.35,13.88,12.88,13.41,5,11


## Top 5 share values as per Days and Months

In [76]:
# Top 5 opening prices, shown with their Day and Month
top_opening_values_by_day_month = df.sort_values(by='Open', ascending=False).groupby('Month')[['Month','Day','Open']].head()
top_opening_values_by_day_month = top_opening_values_by_day_month.head(5)
top_opening_values_by_day_month

Unnamed: 0,Month,Day,Open
157,8,18,369.95
145,8,17,363.0
154,5,18,362.85
151,2,18,355.0
147,10,17,354.6
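
The sort, group, and `head` chain in the cell above ends up keeping the five largest `Open` values overall. A shorter way to express that, offered here only as a sketch, is `DataFrame.nlargest`, which should return the same rows when there are no ties at the cut-off. The `sample` frame below is a hypothetical stand-in built from rows that appear in this notebook's outputs.

```python
import pandas as pd

# Hypothetical stand-in for df: six rows taken from outputs shown in this notebook
sample = pd.DataFrame(
    {
        "Month": [3, 10, 8, 2, 5, 8],
        "Day": [9, 17, 17, 18, 18, 18],
        "Open": [10.0, 354.6, 363.0, 355.0, 362.85, 369.95],
    },
    index=[44, 147, 145, 151, 154, 157],
)

# Keep the five rows with the largest Open values, largest first
top5 = sample.nlargest(5, "Open")[["Month", "Day", "Open"]]

print(top5)
```

The same call with `"Low"`, `"High"`, or `"Close"` as the column argument would cover the other price columns.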


In [81]:
# Top 5 low prices, shown with their Day and Month
top_low_values_by_day_month = df.sort_values(by='Low',ascending=False).groupby('Month')[['Month','Day','Low']].head()
top_low_values_by_day_month = top_low_values_by_day_month.head(5)
top_low_values_by_day_month

Unnamed: 0,Month,Day,Low
146,9,17,345.5
157,8,18,338.0
145,8,17,337.37
156,7,18,332.45
155,6,18,327.35


In [83]:
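# Rows with the 5 highest 'High' prices, shown with their Day and Month.
# Note: the selected price column is 'Low', so the table shows each row's Low price, not its High.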
top_high_values_by_day_month = df.sort_values(by='High',ascending=False).groupby('Month')[['Month','Day','Low']].head()
top_high_values_by_day_month = top_high_values_by_day_month.head(5)
top_high_values_by_day_month

Unnamed: 0,Month,Day,Low
157,8,18,338.0
156,7,18,332.45
146,9,17,345.5
147,10,17,299.0
144,7,17,290.78


In [82]:
# Top 5 closing prices, shown with their Day and Month
top_close_values_by_day_month = df.sort_values(by='Close',ascending=False).groupby('Month')[['Month','Day','Close']].head()
top_close_values_by_day_month = top_close_values_by_day_month.head(5)
top_close_values_by_day_month

Unnamed: 0,Month,Day,Close
156,7,18,367.9
153,4,18,362.05
144,7,17,361.96
150,1,18,354.45
145,8,17,351.15


## 5 minimum share values as per Days and Months

In [102]:
# 5 lowest opening prices, shown with their Day and Month
min_opening_values_by_day_month = df.sort_values(by='Open', ascending=True).groupby('Month')[['Month','Day','Open']].head()
min_opening_values_by_day_month = min_opening_values_by_day_month.head(5)
min_opening_values_by_day_month

Unnamed: 0,Month,Day,Open
44,3,9,10.0
45,4,9,10.04
181,8,20,12.0
43,2,9,12.19
41,12,8,12.4


In [103]:
# Rows with the 5 lowest 'High' prices, shown with their Day and Month.
# Note: the selected price column is 'Low', so the table shows each row's Low price, not its High.
min_high_values_by_day_month = df.sort_values(by='High',ascending=True).groupby('Month')[['Month','Day','Low']].head()
min_high_values_by_day_month = min_high_values_by_day_month.head(5)
min_high_values_by_day_month

Unnamed: 0,Month,Day,Low
44,3,9,8.16
43,2,9,9.9
4,11,5,12.88
0,7,5,11.25
183,10,20,12.11


In [104]:
# 5 lowest closing prices, shown with their Day and Month
min_close_values_by_day_month = df.sort_values(by='Close',ascending=True).groupby('Month')[['Month','Day','Close']].head()
min_close_values_by_day_month = min_close_values_by_day_month.head(5)
min_close_values_by_day_month

Unnamed: 0,Month,Day,Close
44,3,9,9.98
43,2,9,10.26
180,7,20,11.95
42,1,9,12.24
40,11,8,12.26


In [105]:
# 5 lowest low prices, shown with their Day and Month
min_low_values_by_day_month = df.sort_values(by='Low',ascending=True).groupby('Month')[['Month','Day','Low']].head()
min_low_values_by_day_month = min_low_values_by_day_month.head(5)
min_low_values_by_day_month

Unnamed: 0,Month,Day,Low
176,3,20,5.55
44,3,9,8.16
43,2,9,9.9
45,4,9,9.94
39,10,8,11.01
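
The four minimum-value cells above repeat the same pattern with a different column each time. One possible way to reduce that repetition, sketched here under the assumption that only the n smallest rows per column are needed, is a small helper built on `DataFrame.nsmallest`. The helper name `lowest_rows` and the `sample` frame are hypothetical. The sample reuses the five rows shown by `df.head()` earlier in this notebook.

```python
import pandas as pd

# First five rows of the dataset, copied from the df.head() output earlier in this notebook
sample = pd.DataFrame({
    "Open":  [13.00, 12.58, 13.48, 13.20, 13.35],
    "High":  [14.00, 14.88, 14.87, 14.47, 13.88],
    "Low":   [11.25, 12.55, 12.27, 12.40, 12.88],
    "Close": [12.46, 13.42, 13.30, 12.99, 13.41],
})

def lowest_rows(frame, price_columns, n=5):
    """Return {column: the n rows with the smallest values in that column}."""
    return {col: frame.nsmallest(n, col) for col in price_columns}

# Two lowest rows per price column for the small sample
lowest = lowest_rows(sample, ["Open", "High", "Low", "Close"], n=2)

for col, rows in lowest.items():
    print(f"Lowest {col} values:")
    print(rows[[col]])
```

Selecting the sorted column itself in each result also avoids showing `Low` when the rows were ranked by `High`.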
