
<div align=center>
    <img src=https://i.imgur.com/lbVlAYr.png>
</div>

<div style="font-family:verdana; 
            text-align:center; 
            font-weight:bold; 
            display:fill; 
            background-color:#5642C5;
            color:white;
            border-radius:5px">
    <h1 style="padding:20px 10px">Reliance Industries Stock Prices - EDA</h1>
</div>

<div align=center>
    <img src=https://upload.wikimedia.org/wikipedia/en/thumb/9/99/Reliance_Industries_Logo.svg/375px-Reliance_Industries_Logo.svg.png>
</div>

<div style="font-family:verdana;">
    <blockquote>
        Reliance Industries Limited (RIL) is an Indian multinational conglomerate headquartered in Mumbai. Reliance owns businesses across India engaged in energy, petrochemicals, textiles, natural resources, retail, and telecommunications. Reliance is one of the most profitable companies in India, the largest publicly traded company in India by market capitalisation, and the largest company in India as measured by revenue after recently surpassing the government-controlled Indian Oil Corporation. It is also the eighth largest employer in India with nearly 195,000 employees. On 10 September 2020, Reliance Industries became the first Indian company to cross $200 billion in market capitalisation.
    </blockquote>
</div>

*****************

<div align=center>
    <img src=https://upload.wikimedia.org/wikipedia/commons/thumb/3/37/Plotly-logo-01-square.png/330px-Plotly-logo-01-square.png>
</div>

<div style="font-family:verdana;">
    <blockquote>
        Plotly is a technical computing company headquartered in Montreal, Quebec, that develops online data analytics and visualization tools. Plotly provides online graphing, analytics, and statistics tools for individuals and collaboration, as well as scientific graphing libraries for Python, R, MATLAB, Perl, Julia, Arduino, and REST.
    </blockquote>
    <blockquote>
        Plotly was founded by Alex Johnson, Jack Parmer, Chris Parmer, and Matthew Sundquist.
    </blockquote>
    <blockquote>
        The Boston Globe and Washington Post newsrooms have produced data journalism using Plotly. In 2020, Plotly was named a Best Place to Work by the Canadian SME National Business Awards, and nominated as Business of the Year.
    </blockquote>
</div>

*************

<div style="font-family:verdana;">
    <center><h1>About the data 📃</h1></center>
    <br>
    <blockquote>
        The dataset contains one file per stock listed on the BSE (Bombay Stock Exchange), sampled at 15-minute intervals. Each row holds the stock information for one interval, i.e. <code>high</code>, <code>low</code>, <code>open</code>, <code>close</code> and <code>volume</code>. This notebook focuses on the stock data for Reliance Industries Limited.
    </blockquote>
    <blockquote>
        <ul>
            <li>high - Highest price in the interval</li>
            <li>low - Lowest price in the interval</li>
            <li>open - Price at the start of the interval</li>
            <li>close - Price at the end of the interval</li>
            <li>volume - Number of shares traded in the interval</li>
        </ul>
    </blockquote>
</div>
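As a quick sanity check on these column definitions, here is a minimal sketch that flags bars where `high` and `low` do not bracket `open` and `close`. The helper name `find_inconsistent_bars` is hypothetical and not part of the notebook, and the sample values are just a few illustrative rows.

```python
import pandas as pd

# A few illustrative 15-minute bars using the dataset's column names.
bars = pd.DataFrame({
    "open":   [458.48, 458.35, 457.03],
    "high":   [459.90, 458.48, 457.88],
    "low":    [457.03, 457.00, 456.45],
    "close":  [458.25, 457.03, 457.25],
    "volume": [587446, 288842, 232376],
})


def find_inconsistent_bars(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows that violate low <= open/close <= high or have negative volume.

    Hypothetical helper for illustration only.
    """
    ok = (
        (df["high"] >= df[["open", "close", "low"]].max(axis=1))
        & (df["low"] <= df[["open", "close", "high"]].min(axis=1))
        & (df["volume"] >= 0)
    )
    return df[~ok]


bad = find_inconsistent_bars(bars)
print(f"{len(bad)} inconsistent bars found")
```

An empty result means every bar is internally consistent. It says nothing about gaps or duplicated timestamps, which would need separate checks.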

-----------------

# Importing Libraries 📚

In [1]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import re

# Loading Data ⏳

In [2]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python

file_paths = []

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        file_paths.append(os.path.join(dirname, filename))
        print(os.path.join(dirname, filename))


/kaggle/input/bse-stocks-data-15-minute-interval-historical/JUSTDIAL-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/UPL-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/APOLLOHOSP-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/JINDALSTEL-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/BATAINDIA-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/HINDUNILVR-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/TORNTPHARM-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/CUMMINSIND-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/NESTLEIND-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/COLPAL-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/MM-15minute-Hist
/kaggle/input/bse-stocks-data-15-minute-interval-historical/CENTURYTEX-15minute-Hist
/ka

<div class="alert alert-info">
    <strong>📌 Merging Dataframes</strong>
    <p>Read the dataframes of a few selected companies and merge them with the <code>Reliance</code> data. The selected stocks are <code>TCS</code>, <code>Wipro</code>, <code>Asian Paints</code> and <code>HDFC</code>, which trade in a price range similar to Reliance. The merged dataset is used to compare these stocks with Reliance.</p>
</div>

In [3]:
def merge_files(file_paths=file_paths):
    """Aggregate each stock's 15-minute bars into daily OHLCV rows and stack them."""
    daily_frames = []
    for file_path in file_paths:
        df = pd.DataFrame(pd.read_pickle(file_path))
        
        # Getting Stock Name (raw string avoids invalid escape sequences)
        pattern = r"historical/(.*?)-"
        stock_name = re.search(pattern, file_path).group(1)
        
        # Preprocessing: split the timestamp into date and time
        df['date'] = pd.to_datetime(df['date'])
        df['time'] = df['date'].dt.time
        df['date'] = df['date'].dt.date
        
        # One row per day: first open, last close, extreme high/low, total volume
        temp = df.groupby(['date']).agg({'low':'min', 'high':'max', 'open':'first', 'close':'last', 'volume':'sum'})
        temp = temp.reset_index()
        temp['stock_name'] = stock_name
        
        daily_frames.append(temp)
    
    # DataFrame.append was removed in pandas 2.0; concatenate once instead
    return pd.concat(daily_frames, ignore_index=True)
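
The stock-name extraction inside `merge_files` can be checked on its own. The sketch below wraps the same regular expression in a small helper. The name `stock_name_from_path` and the explicit error for non-matching paths are assumptions added for illustration.

```python
import re

# Captures the text between "historical/" and the first "-" in the path.
STOCK_NAME_PATTERN = re.compile(r"historical/(.*?)-")


def stock_name_from_path(file_path: str) -> str:
    """Extract the ticker from a dataset path (hypothetical helper)."""
    match = STOCK_NAME_PATTERN.search(file_path)
    if match is None:
        # Fail loudly instead of raising AttributeError on None.group(1)
        raise ValueError(f"Cannot find a stock name in: {file_path}")
    return match.group(1)


path = '/kaggle/input/bse-stocks-data-15-minute-interval-historical/RELIANCE-15minute-Hist'
print(stock_name_from_path(path))  # RELIANCE
```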

In [4]:
file_paths = [
    '/kaggle/input/bse-stocks-data-15-minute-interval-historical/RELIANCE-15minute-Hist',
    '/kaggle/input/bse-stocks-data-15-minute-interval-historical/TCS-15minute-Hist',
    '/kaggle/input/bse-stocks-data-15-minute-interval-historical/WIPRO-15minute-Hist',
    '/kaggle/input/bse-stocks-data-15-minute-interval-historical/ASIANPAINT-15minute-Hist',
    '/kaggle/input/bse-stocks-data-15-minute-interval-historical/HDFC-15minute-Hist'
]

In [5]:
df_merged = merge_files(file_paths=file_paths)

In [6]:
ril_data = pd.read_pickle('/kaggle/input/bse-stocks-data-15-minute-interval-historical/RELIANCE-15minute-Hist')
ril_data = pd.DataFrame(ril_data)

In [7]:
ril_data.head()

Unnamed: 0,date,open,high,low,close,volume
0,2015-02-02 09:15:00+05:30,458.48,459.9,457.03,458.25,587446
1,2015-02-02 09:30:00+05:30,458.35,458.48,457.0,457.03,288842
2,2015-02-02 09:45:00+05:30,457.03,457.88,456.45,457.25,232376
3,2015-02-02 10:00:00+05:30,457.25,457.75,455.75,457.0,208256
4,2015-02-02 10:15:00+05:30,457.0,457.48,456.5,457.0,90870


In [8]:
ril_data.tail()

Unnamed: 0,date,open,high,low,close,volume
26226,2019-05-08 15:00:00+05:30,1303.0,1306.7,1302.05,1304.75,650447
26227,2019-05-08 15:15:00+05:30,1304.7,1305.0,1292.5,1298.0,1993463
26228,2019-05-08 15:30:00+05:30,1297.75,1299.45,1297.75,1299.45,5877
26229,2019-05-08 15:45:00+05:30,1299.45,1299.45,1299.45,1299.45,1479
26230,2019-05-08 16:00:00+05:30,1299.45,1299.45,1299.45,1299.45,0


In [9]:
ril_data.isnull().sum()

date      0
open      0
high      0
low       0
close     0
volume    0
dtype: int64

In [10]:
ril_data.describe()

Unnamed: 0,open,high,low,close,volume
count,26231.0,26231.0,26231.0,26231.0,26231.0
mean,739.320535,740.641894,737.925143,739.308286,311516.2
std,286.90364,287.457618,286.298588,286.904062,381680.1
min,398.75,400.23,398.28,398.75,0.0
25%,495.53,496.35,494.75,495.5,127623.0
50%,644.0,645.1,642.85,643.88,206062.0
75%,944.725,946.05,942.775,944.725,359350.0
max,1416.85,1417.5,1412.85,1417.0,10937790.0


# Preprocessing ⚒

In [11]:
ril_data['date'] = pd.to_datetime(ril_data['date'])
ril_data.head()

Unnamed: 0,date,open,high,low,close,volume
0,2015-02-02 09:15:00+05:30,458.48,459.9,457.03,458.25,587446
1,2015-02-02 09:30:00+05:30,458.35,458.48,457.0,457.03,288842
2,2015-02-02 09:45:00+05:30,457.03,457.88,456.45,457.25,232376
3,2015-02-02 10:00:00+05:30,457.25,457.75,455.75,457.0,208256
4,2015-02-02 10:15:00+05:30,457.0,457.48,456.5,457.0,90870


In [12]:
# Splitting date into date and time
df = ril_data.copy()
df['time'] = df['date'].dt.time
df['date'] = df['date'].dt.date
df.head()

Unnamed: 0,date,open,high,low,close,volume,time
0,2015-02-02,458.48,459.9,457.03,458.25,587446,09:15:00
1,2015-02-02,458.35,458.48,457.0,457.03,288842,09:30:00
2,2015-02-02,457.03,457.88,456.45,457.25,232376,09:45:00
3,2015-02-02,457.25,457.75,455.75,457.0,208256,10:00:00
4,2015-02-02,457.0,457.48,456.5,457.0,90870,10:15:00


<div class="alert alert-info">
    <strong>📌 Converting the data with 15 min interval to 1 day interval</strong>
    <p>
        <li>df['open'] = Open price at start of day | when time==9:15
        <li>df['high'] = Max value in 'high' for the whole day
        <li>df['low'] = Min value in 'low' for the whole day
        <li>df['close'] = Close price at end of day | last interval of the day
        <li>df['volume'] = Sum of volume for the entire day
    </p>
</div>
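
The aggregation rules above can be tried on a small synthetic frame first. This is a minimal sketch, assuming a `DatetimeIndex` and made-up prices. It uses `resample("D")` as an alternative to grouping on a separate date column, which is what the notebook itself does.

```python
import pandas as pd

# Synthetic 15-minute bars over two trading days (values are made up).
idx = pd.to_datetime([
    "2015-02-02 09:15", "2015-02-02 09:30", "2015-02-02 09:45",
    "2015-02-03 09:15", "2015-02-03 09:30",
])
intraday = pd.DataFrame({
    "open":   [100.0, 101.0, 102.0, 103.0, 104.0],
    "high":   [101.5, 102.5, 103.0, 104.5, 105.0],
    "low":    [99.5, 100.5, 101.0, 102.5, 103.5],
    "close":  [101.0, 102.0, 102.5, 104.0, 104.5],
    "volume": [10, 20, 30, 40, 50],
}, index=idx)

# Same rules as listed above: first open, max high, min low, last close, summed volume.
ohlcv_rules = {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}

# resample("D") also emits empty calendar days (weekends, holidays);
# dropping rows without an open price removes them again.
daily = intraday.resample("D").agg(ohlcv_rules).dropna(subset=["open"])
print(daily)
```

Unlike the groupby approach, `resample` keeps a proper `DatetimeIndex` on the result, so later date-based filtering needs no extra conversion.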

In [13]:
temp1 = df.groupby(['date']).agg({'low':'min', 'high':'max', 'open':'first', 'close':'last', 'volume':'sum'})


<div class="alert alert-info">
    <strong>📌 Converting 15 min interval data to mean data for 1 day</strong>
    <p>
        Convert the 15-minute interval data into per-day means for each column. This is not the correct way to aggregate price data, because averaging hides the day's actual open, close, and extremes; the previous <code>temp1</code> dataframe is more appropriate.
    </p>
</div>
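
To see why the mean is misleading here, the following sketch compares both aggregations on three made-up bars from one day. The frame and variable names are illustrative only.

```python
import pandas as pd

# Three illustrative bars from a single day.
bars = pd.DataFrame({
    "date": ["2015-02-02"] * 3,
    "high": [459.90, 458.48, 457.88],
    "low":  [457.03, 457.00, 456.45],
})

# Averaging each column per day, as the mean-based frame does.
by_mean = bars.groupby("date")[["high", "low"]].mean()

# Taking the true extremes per day, as the OHLC aggregation does.
by_extremes = bars.groupby("date").agg({"high": "max", "low": "min"})

print(by_mean)
print(by_extremes)
```

The averaged `high` sits below the day's true high and the averaged `low` sits above the true low, so the mean-based frame understates the daily trading range.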

In [14]:
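# numeric_only=True skips the non-numeric 'time' column, which pandas 2.x would otherwise reject
temp = df.groupby(['date']).mean(numeric_only=True)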

# EDA 📈

In [15]:
fig = px.line(temp, x=temp.index, y='high')

fig.update_layout(title='Reliance Stock Mean Data', xaxis_title='Date', yaxis_title='Mean of High for the day')

fig.add_hline(y=np.average(temp['high']),
              line={
                  'color':'Orange',
                  'dash':'dot'
              },
              annotation_text=f"Mean: {np.average(temp['high']):.2f}")

fig.add_vrect(x0="2018-06-28", x1="2018-08-29",
              annotation_position="bottom right",
              annotation_text="Rise",
              fillcolor="Green", opacity=0.2,
              line_width=0
             )

fig.show()

In [16]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=temp1.index, y=temp1['low'],
                        mode='lines',
                        name='Low',
                        ))

fig.add_trace(go.Scatter(x=temp1.index, y=temp1['high'],
                        mode='lines',
                        name='High',
                        fill='tonexty'))



fig.update_layout(title="Reliance Daily Stock Prices(2015-2019)")

fig.show()

In [17]:
temp1.reset_index(inplace=True)
temp1['date'] = temp1['date'].apply(pd.to_datetime)

<div class="alert alert-info">
    <strong>📌 Add missing dates</strong>
    <p>Add the missing calendar dates for each year so that every yearly plot shares a uniform date axis. Then fill the missing values with the value from the previous day.</p>
</div>
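
The same idea can be sketched more compactly with `reindex` and `ffill`. This is an alternative shown on made-up data, not the notebook's own implementation, and the frame and variable names are assumptions.

```python
import pandas as pd

# Daily data with a gap: 2015-02-04 and 2015-02-05 are missing (values are made up).
daily = pd.DataFrame(
    {"high": [459.9, 462.0, 465.5]},
    index=pd.to_datetime(["2015-02-02", "2015-02-03", "2015-02-06"]),
)

# Build the complete calendar range, align the data to it,
# then carry the last known value forward into the gaps.
full_range = pd.date_range("2015-02-02", "2015-02-06", freq="D")
filled = daily.reindex(full_range).ffill()

print(filled)
```

Dates before the first observation stay `NaN` after a forward fill, because there is no earlier value to carry forward.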

In [18]:
def add_missing_data(year):
    
    df = temp1.copy()
    df = df[df['date'].dt.year == year]
#     df['day_of_year'] = df['date'].dt.strftime('%j')
    
    dates = pd.date_range(start=f'{year}-01-01', end=f'{year}-12-31').to_frame()
    dates = dates[~dates.index.isin(df['date'])]
    dates = dates.reset_index()
    
    df = pd.merge(df, dates, left_on='date', right_on=0, how='outer').sort_values(by=['date'])
    df = df.reset_index()
    df = df.drop([0, 'level_0', 'index'], axis=1)
    df.index += 1    
    
    return df

In [19]:
df_15 = add_missing_data(2015)
df_16 = add_missing_data(2016)
df_17 = add_missing_data(2017)
df_18 = add_missing_data(2018)
df_19 = add_missing_data(2019)

In [20]:
# The 2019 data ends on 2019-05-08, the 128th day of the year, so forward-fill only up to that row
df_19[:128] = df_19[:128].ffill()

In [21]:
fig = make_subplots(rows=5, cols=1, start_cell="top-left", vertical_spacing=0.02)

fig.add_trace(go.Scatter(x=df_15['date'], y=df_15['high'].ffill(), name=2015), row=1, col=1)

fig.add_trace(go.Scatter(x=df_16['date'], y=df_16['high'].ffill(), name=2016), row=2, col=1)

fig.add_trace(go.Scatter(x=df_17['date'], y=df_17['high'].ffill(), name=2017), row=3, col=1)

fig.add_trace(go.Scatter(x=df_18['date'], y=df_18['high'].ffill(), name=2018), row=4, col=1)

fig.add_trace(go.Scatter(x=df_19['date'], y=df_19['high'], name=2019), row=5, col=1)

fig.update_layout(height=1000, width=800, 
                  title="Reliance Stock Prices Yearly",
                  xaxis1=dict(
                      showticklabels=False
                  ),
                  xaxis2=dict(
                      showticklabels=False
                  ),
                  xaxis3=dict(
                      showticklabels=False
                  ),
                  xaxis4=dict(
                      showticklabels=False
                  ),
                  xaxis5_tickformat='%B',
                 )

In [22]:
df_year = temp1.copy()
df_year['year'] = df_year['date'].dt.year
df_year = df_year.groupby(['year']).agg({'volume': 'sum'})
df_year

year,volume
2015,1571097940
2016,1775671062
2017,2099157497
2018,1939419623
2019,786035522


In [23]:
fig = go.Figure()

fig.add_trace(go.Bar(x=df_year.index, y=df_year.volume,
                     marker_color=['#636EFA','#636EFA','#636EFA','#636EFA','#EF553B']
                    ))

fig.add_annotation(x=2019, y=700000000, 
                   text="Till May", 
                   showarrow=False, 
                   opacity=0.7,
                   font_color="white"
                  )

fig.update_layout(title="Total Volume traded", yaxis_title="Volume")
fig.show()

In [24]:
fig = go.FigureWidget(make_subplots(specs=[[{"secondary_y": True}]]))

fig.add_trace(go.Candlestick(x=temp1['date'],
                             open=temp1['open'],
                             high=temp1['high'],
                             low=temp1['low'],
                             close=temp1['close'],
                             name="Stock Prices"
                            ), secondary_y=True)

fig.add_trace(go.Bar(x=temp1['date'],
                     y=temp1['volume'],
                     name="Volume"
                     ), secondary_y=False)

fig.update_layout(height=650, width=1000, title="Candlestick Chart", yaxis1_title="Volume", yaxis2_title="Stock Prices(₹)")

# def zoom(layout, xrange):
#     in_view = temp1.loc[fig.layout.xaxis.range[0]:fig.layout.xaxis.range[1]]
#     fig.layout.yaxis.range = [in_view['volume'].min() - 10, in_view['volume'].max() + 10]

# fig.layout.on_change(zoom, 'xaxis.range')

fig.show()

In [25]:
fig = px.line(df_merged, x='date', y='high', color='stock_name')

fig.update_layout(title="Stock Prices of Selected Stocks", 
                  xaxis_title="Date", 
                  yaxis_title="Daily High", 
                  legend_title="Stocks", 
                 )

fig.show()

In [26]:
def convert_df_to_yearly(file_paths=file_paths):
    """Aggregate each stock's 15-minute bars into yearly OHLCV rows and stack them."""
    yearly_frames = []
    for file_path in file_paths:
        df = pd.DataFrame(pd.read_pickle(file_path))
        
        # Getting Stock Name (raw string avoids invalid escape sequences)
        pattern = r"historical/(.*?)-"
        stock_name = re.search(pattern, file_path).group(1)
        
        # Preprocessing: only the year is needed for the grouping
        df['date'] = pd.to_datetime(df['date'])
        df['year'] = df['date'].dt.strftime('%Y')
        
        # One row per year: first open, last close, extreme high/low, total volume
        temp = df.groupby(['year']).agg({'low':'min', 'high':'max', 'open':'first', 'close':'last', 'volume':'sum'})
        # A single reset_index is enough; a second one would add a stray 'index' column
        temp = temp.reset_index()
        temp['stock_name'] = stock_name
        
        yearly_frames.append(temp)
    
    # DataFrame.append was removed in pandas 2.0; concatenate once instead
    return pd.concat(yearly_frames, ignore_index=True)

In [27]:
df_yearly = convert_df_to_yearly()

In [28]:
fig = px.bar(df_yearly, x=df_yearly['stock_name'], y=df_yearly['low'], 
             color=df_yearly['stock_name'], 
             animation_frame='year',
             range_y=[0, 2500]
            )

fig.update_layout(title="Yearly change in Stock Prices",
                  showlegend=False,
                  yaxis_title="Low(₹)",
                  xaxis_title="Stocks"
                 )

fig.show()

<div class="alert alert-info">
    <strong>📌 Reliance Financial (2015-2019)</strong>
    <p>Financial figures for Reliance, collected from public sources on the web. These fundamentals influence the company's stock price.</p>
</div>

In [29]:
data = {
        'Total Revenue':[383732.00, 301494.00, 255573.00, 236808.00, 335854.00,],
        'Gross Profit':[57925.00, 55305.00, 49242.00, 44606.00, 37956.00],
        'Net Profit':[35163.00, 33612.00, 31425.00, 27384.00, 22719.00]
       }
financials = pd.DataFrame(data, index=[2019, 2018, 2017, 2016, 2015])
financials

Unnamed: 0,Total Revenue,Gross Profit,Net Profit
2019,383732.0,57925.0,35163.0
2018,301494.0,55305.0,33612.0
2017,255573.0,49242.0,31425.0
2016,236808.0,44606.0,27384.0
2015,335854.0,37956.0,22719.0


In [30]:
fig = go.Figure()

# Revenue is divided by 10 so that it fits on the same axis as the other bars
fig.add_trace(go.Bar(x=financials.index, y=financials['Total Revenue'] / 10,
                     name="Total Revenue (units of ₹10 crore)",
                    ))

fig.add_trace(go.Bar(x=financials.index, y=financials['Net Profit'],
                     name="Net Profit(₹ crores)",
                    ))

fig.add_trace(go.Bar(x=df_yearly[df_yearly['stock_name'] == 'RELIANCE']['year'], 
                     y=df_yearly[df_yearly['stock_name'] == 'RELIANCE']['high'],
                     name="Highest Stock Price(₹)",
                    ))

fig.show()

<center><h3><br>Thank you! Upvote if you liked this notebook or learned something from it.<br><br> Feel free to comment with any suggestions or improvements.</h3></center>