<a href="https://colab.research.google.com/github/gaurav4601/capstoneproject2/blob/master/Yes_Bank_Price_ML_Project_Regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img title="Almabetter" alt="Almabetter" src="https://pbs.twimg.com/profile_images/1649033540149866497/tg4B3SVf_400x400.jpg" width=70px>

## Yes Bank Stock Closing Price Predictor
<img title="Yes Bank" alt="Yes Bank Logo" src="https://logos-download.com/wp-content/uploads/2016/06/Yes_Bank_logo.png" width=120px>


#### **About Project** :
>Yes Bank is a prominent Indian bank that has been in the news since 2018 because of a fraud case involving Rana Kapoor. This raises the question of how the event affected the bank's stock price, and whether predictive models, such as time series models, can capture such situations.
Our dataset contains monthly stock prices of Yes Bank since its inception, with the opening, highest, lowest, and closing price for each month. The main objective of this study is to forecast the stock's monthly closing price, comparing a range of analytical methods to find the most accurate one.

<br>
<hr>
<br>

#### **A Little Bit 😶‍🌫️ About the Domain**

>Stocks represent ownership in a publicly traded company, which individuals and institutions can purchase in the form of shares. When you buy a share, you become a shareholder in that company and can potentially benefit from its success. The stock market is where these shares are bought and sold, and where investors try to profit from fluctuations in a stock's price.
>
>The stock price is the current trading price of a particular stock. It changes with market demand, trading volume, the company's financial performance, and macroeconomic factors, which makes the stock market a volatile and uncertain environment. The price reflects the market's perceived value of the company, based on factors such as its financial performance, leadership, and growth potential.
>
>Companies, in turn, benefit from issuing stock by raising capital to fund expansion, operations, or new projects.

<img alt='illustration' src='https://img.freepik.com/free-vector/hand-drawn-stock-market-concept-with-analysts_23-2149163670.jpg?w=900&t=st=1683438353~exp=1683438953~hmac=8b0a3826b6bc810e5f5cc124fc72171ef9b0f84ccffa76196b2fd6a0ab0e3a7e' width=350px>

In [1]:
# importing all required libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px


#additional as required

>**Data Loading from CSV**

In [3]:
# loading csv file directly from the github-raw

data = pd.read_csv('https://raw.githubusercontent.com/gaurav4601/capstoneproject2/master/data_YesBank_StockPrices.csv')

#copy data to df

df = data.copy()

### 🏛️  **First Look at the Data**

In [4]:
# head
df.head()


| | Date | Open | High | Low | Close |
|---|---|---|---|---|---|
| 0 | Jul-05 | 13.0 | 14.0 | 11.25 | 12.46 |
| 1 | Aug-05 | 12.58 | 14.88 | 12.55 | 13.42 |
| 2 | Sep-05 | 13.48 | 14.87 | 12.27 | 13.3 |
| 3 | Oct-05 | 13.2 | 14.47 | 12.4 | 12.99 |
| 4 | Nov-05 | 13.35 | 13.88 | 12.88 | 13.41 |


In [5]:
#last rows
df.tail()

| | Date | Open | High | Low | Close |
|---|---|---|---|---|---|
| 180 | Jul-20 | 25.6 | 28.3 | 11.1 | 11.95 |
| 181 | Aug-20 | 12.0 | 17.16 | 11.85 | 14.37 |
| 182 | Sep-20 | 14.3 | 15.34 | 12.75 | 13.15 |
| 183 | Oct-20 | 13.3 | 14.01 | 12.11 | 12.42 |
| 184 | Nov-20 | 12.41 | 14.9 | 12.21 | 14.67 |


In [6]:
#shape of data
df.shape

(185, 5)

In [9]:
# sample

df.sample(5)

| | Date | Open | High | Low | Close |
|---|---|---|---|---|---|
| 142 | May-17 | 326.0 | 330.3 | 275.15 | 286.38 |
| 135 | Oct-16 | 253.41 | 265.5 | 245.8 | 253.52 |
| 166 | May-19 | 163.3 | 178.05 | 133.05 | 147.95 |
| 81 | Apr-12 | 73.62 | 76.1 | 69.11 | 70.07 |
| 120 | Jul-15 | 169.0 | 175.58 | 156.45 | 165.74 |


In [11]:
# duplicate values

df.duplicated().sum()

0

In [12]:
# check for null or missing values
# (isnull() is an alias of isna(), so a single call is enough)
df.isna().sum()

Date     0
Open     0
High     0
Low      0
Close    0
dtype: int64

In [13]:
# info about the data
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 185 entries, 0 to 184
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Date    185 non-null    object 
 1   Open    185 non-null    float64
 2   High    185 non-null    float64
 3   Low     185 non-null    float64
 4   Close   185 non-null    float64
dtypes: float64(4), object(1)
memory usage: 7.4+ KB


In [14]:
# summary statistics of the numeric columns

df.describe()

| | Open | High | Low | Close |
|---|---|---|---|---|
| count | 185.0 | 185.0 | 185.0 | 185.0 |
| mean | 105.541405 | 116.104324 | 94.947838 | 105.204703 |
| std | 98.87985 | 106.333497 | 91.219415 | 98.583153 |
| min | 10.0 | 11.24 | 5.55 | 9.98 |
| 25% | 33.8 | 36.14 | 28.51 | 33.45 |
| 50% | 62.98 | 72.55 | 58.0 | 62.54 |
| 75% | 153.0 | 169.19 | 138.35 | 153.3 |
| max | 369.95 | 404.0 | 345.5 | 367.9 |


### **Notes from the First Look at the Data**

- The dataset has 185 rows and 5 columns.
- There are no null, missing, or duplicate values in the dataset.
- The four price columns already have a proper numeric dtype (`float64`). The `Date` column is stored as `object` and can be converted to `datetime` if required.
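
The notes above can also be checked programmatically. The following is a minimal sketch, not part of the original notebook: `sample` is a small stand-in for `df` built from the five rows shown by `df.head()`, and `quality_report` is a hypothetical helper name introduced only for illustration.

```python
import pandas as pd

# Stand-in for df: the five rows shown by df.head() earlier in the notebook
sample = pd.DataFrame({
    "Date": ["Jul-05", "Aug-05", "Sep-05", "Oct-05", "Nov-05"],
    "Open": [13.0, 12.58, 13.48, 13.2, 13.35],
    "High": [14.0, 14.88, 14.87, 14.47, 13.88],
    "Low": [11.25, 12.55, 12.27, 12.4, 12.88],
    "Close": [12.46, 13.42, 13.3, 12.99, 13.41],
})


def quality_report(frame):
    """Hypothetical helper: collect the basic checks from the notes in one dict."""
    price_cols = ["Open", "High", "Low", "Close"]
    return {
        "shape": frame.shape,
        # total number of missing cells across all columns
        "null_cells": int(frame.isna().sum().sum()),
        # number of fully duplicated rows
        "duplicate_rows": int(frame.duplicated().sum()),
        # True when every price column has a numeric dtype
        "prices_numeric": all(
            pd.api.types.is_numeric_dtype(frame[col]) for col in price_cols
        ),
        # False until the Date column is converted with pd.to_datetime
        "date_is_datetime": pd.api.types.is_datetime64_any_dtype(frame["Date"]),
    }


report = quality_report(sample)
print(report)
```

Calling `quality_report(df)` on the full dataset should, under the same assumptions, report the `(185, 5)` shape seen earlier.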



<br>
<br>

> **Columns Information**

In the stock market, the terms "open", "high", "low", and "close" refer to different prices of a stock or security at various points in time.

> **Open** refers to the price of a security at the beginning of a trading session, typically the first few minutes of trading.

> **High** refers to the highest price that a security reached during a trading session, whether it was in the opening minutes or later in the day.

> **Low** refers to the lowest price that a security reached during a trading session, whether it was in the opening minutes or later in the day.

> **Close** refers to the price of a security at the end of a trading session, typically at the close of the stock market.

These four terms describe the performance of a stock or security over a particular period, such as a day, week, or month. In this dataset each row covers one month. The change from the open to the close price gives the return over that period, while the difference between the high and low prices gives the trading range, a simple measure of volatility over the same period.
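
As a rough illustration of how these four columns combine, the sketch below derives a monthly range and two simple percentage changes. It is not taken from the original notebook: the column names `Range`, `OpenToClosePct`, and `CloseToClosePct` are assumptions introduced here, and `sample` reuses the first three rows shown by `df.head()`.

```python
import pandas as pd

# First three rows from df.head(), used as a small stand-in for df
sample = pd.DataFrame({
    "Date": ["Jul-05", "Aug-05", "Sep-05"],
    "Open": [13.0, 12.58, 13.48],
    "High": [14.0, 14.88, 14.87],
    "Low": [11.25, 12.55, 12.27],
    "Close": [12.46, 13.42, 13.3],
})

# Trading range of the month: highest minus lowest price
sample["Range"] = sample["High"] - sample["Low"]

# Percentage change from the month's open to its close
sample["OpenToClosePct"] = (sample["Close"] - sample["Open"]) / sample["Open"] * 100

# Percentage change from the previous month's close (NaN for the first row)
sample["CloseToClosePct"] = sample["Close"].pct_change() * 100

print(sample[["Date", "Range", "OpenToClosePct", "CloseToClosePct"]])
```

The same three lines should work unchanged on the full `df`, since they only rely on the `Open`, `High`, `Low`, and `Close` columns.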


In [19]:
# Convert Date column to datetime dtype (preview only; the result is not assigned back to df)

pd.to_datetime(df["Date"], format='%b-%y')

0     2005-07-01
1     2005-08-01
2     2005-09-01
3     2005-10-01
4     2005-11-01
         ...    
180   2020-07-01
181   2020-08-01
182   2020-09-01
183   2020-10-01
184   2020-11-01
Name: Date, Length: 185, dtype: datetime64[ns]
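
The cell above only displays the parsed dates. To keep them, the result has to be assigned back to the frame. The following is a minimal sketch of one way to do that, assuming the same `'%b-%y'` format. Using the dates as a sorted index is a design choice made here for illustration, not something the notebook has done at this point, and `sample` is a small stand-in for `df`.

```python
import pandas as pd

# Stand-in for df, with rows deliberately out of chronological order
sample = pd.DataFrame({
    "Date": ["Nov-20", "Jul-05", "Aug-05"],
    "Close": [14.67, 12.46, 13.42],
})

# Parse 'Mon-YY' strings and store the result back in the column
sample["Date"] = pd.to_datetime(sample["Date"], format="%b-%y")

# Use the dates as the index and sort chronologically,
# which is convenient for time series plots and models
sample = sample.set_index("Date").sort_index()

print(sample)
```

Each date is set to the first day of its month, as in the output above, because the source strings carry no day component.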