## 1. Importing the libraries:

In [3]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio
pio.templates.default = "plotly_white"

## 2. Reading the dataset:

In [6]:
stocks_dataset = pd.read_csv("stocks.csv")
stocks_dataset.head(2)

Unnamed: 0,Ticker,Date,Open,High,Low,Close,Adj Close,Volume
0,AAPL,2023-02-07,150.639999,155.229996,150.639999,154.649994,154.41423,83322600
1,AAPL,2023-02-08,153.880005,154.580002,151.169998,151.919998,151.6884,64120100


In [7]:
stocks_dataset['Ticker'].value_counts()

AAPL    62
MSFT    62
NFLX    62
GOOG    62
Name: Ticker, dtype: int64

The dataset covers four major stocks (AAPL, MSFT, NFLX, and GOOG), with 62 rows each. For every trading day it
records the opening and closing prices, the day's high and low, the adjusted closing price (which accounts for
corporate actions such as dividends and stock splits), and the volume, i.e. the number of shares traded that day.
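The column meanings above imply a few consistency rules that can be checked before any analysis. The snippet below is a minimal sketch, not part of the original notebook: it rebuilds the two AAPL rows shown in the `head(2)` output, and `check_ohlc` is a hypothetical helper name introduced only for illustration.

```python
import pandas as pd

# The two AAPL rows printed by stocks_dataset.head(2) above.
sample = pd.DataFrame({
    "Ticker": ["AAPL", "AAPL"],
    "Date": ["2023-02-07", "2023-02-08"],
    "Open": [150.639999, 153.880005],
    "High": [155.229996, 154.580002],
    "Low": [150.639999, 151.169998],
    "Close": [154.649994, 151.919998],
    "Adj Close": [154.41423, 151.6884],
    "Volume": [83322600, 64120100],
})


def check_ohlc(df):
    """Hypothetical helper: basic consistency checks for daily price data."""
    price_cols = ["Open", "High", "Low", "Close"]
    return {
        # No missing values anywhere in the frame.
        "no_missing_values": not df.isna().any().any(),
        # At most one row per stock per day.
        "unique_ticker_date": not df.duplicated(subset=["Ticker", "Date"]).any(),
        # The daily high is the largest of the four prices.
        "high_is_highest": bool((df["High"] >= df[price_cols].max(axis=1)).all()),
        # The daily low is the smallest of the four prices.
        "low_is_lowest": bool((df["Low"] <= df[price_cols].min(axis=1)).all()),
    }


checks = check_ohlc(sample)
print(checks)
```

Running the same function on the full `stocks_dataset` would flag rows that violate any of these rules.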

## 3. Descriptive Statistics:

Let's compute the descriptive statistics (mean, median, standard deviation, and so on) for each stock.

#### 3.1 Descriptive stats for the opening price:

In [22]:
desc_stats_opening_price = stocks_dataset.groupby('Ticker')['Open'].describe().reset_index()
desc_stats_opening_price

Unnamed: 0,Ticker,count,mean,std,min,25%,50%,75%,max
0,AAPL,62.0,157.779839,7.224608,144.380005,151.489998,158.400002,164.702503,170.979996
1,GOOG,62.0,100.381919,6.197598,89.540001,94.532499,102.68,105.859999,107.800003
2,MSFT,62.0,274.735969,17.324808,246.550003,257.410004,277.110001,285.825005,307.76001
3,NFLX,62.0,328.110643,18.467142,287.339996,317.137497,325.649994,340.674995,372.410004


#### 3.2 Descriptive stats for the closing price:

In [49]:
desc_stats_closing_price = stocks_dataset.groupby('Ticker')['Close'].describe().reset_index()
desc_stats_closing_price

Unnamed: 0,Ticker,count,mean,std,min,25%,50%,75%,max
0,AAPL,62.0,158.240645,7.360485,145.309998,152.077499,158.055,165.162506,173.570007
1,GOOG,62.0,100.631532,6.279464,89.349998,94.702501,102.759998,105.962503,109.459999
2,MSFT,62.0,275.039839,17.676231,246.270004,258.7425,275.810013,287.217506,310.649994
3,NFLX,62.0,327.614677,18.554419,292.76001,315.672493,325.600006,338.899994,366.829987


These two tables show the common descriptive statistics for the opening and closing prices:

1. count: The number of observations (trading days) for each stock in the period.
2. mean: The average opening/closing price.
3. std: The standard deviation, which measures how much the opening/closing prices vary around their mean.
4. min: The lowest opening/closing price the stock reached.
5. 25%: The 25th percentile, meaning 25% of the prices are at or below the shown value.
6. 50%: The 50th percentile (the median), meaning 50% of the prices are at or below the shown value.
7. 75%: The 75th percentile, meaning 75% of the prices are at or below the shown value.
8. max: The highest opening/closing price the stock reached.
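To make the definitions above concrete, the following sketch computes each statistic individually and compares the result with `describe()`. The five prices are made-up illustrative values, not taken from `stocks.csv`.

```python
import pandas as pd

# Hypothetical closing prices, chosen so the statistics are easy to verify by eye.
prices = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0])

# Each entry mirrors one row of the describe() output.
manual = pd.Series({
    "count": float(prices.count()),  # number of non-missing observations
    "mean": prices.mean(),
    "std": prices.std(),             # sample standard deviation (ddof=1)
    "min": prices.min(),
    "25%": prices.quantile(0.25),
    "50%": prices.median(),
    "75%": prices.quantile(0.75),
    "max": prices.max(),
})

described = prices.describe()

# Largest absolute difference between the manual values and describe().
max_difference = (manual - described).abs().max()
print(manual)
print("max difference vs describe():", max_difference)
```

Note that pandas uses the sample standard deviation (dividing by n - 1) and linear interpolation for percentiles by default.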


#### Box Plots focusing on the closing price:

In [71]:
px.box(data_frame=stocks_dataset[stocks_dataset['Ticker'] == 'AAPL'], 
       x='Ticker', y='Close', points='all', height=600)

In [72]:
px.box(data_frame=stocks_dataset[stocks_dataset['Ticker'] == 'GOOG'], 
       x='Ticker', y='Close', points='all', height=600)

In [74]:
px.box(data_frame=stocks_dataset[stocks_dataset['Ticker'] == 'MSFT'], 
       x='Ticker', y='Close', points='all', height=600)

In [73]:
px.box(data_frame=stocks_dataset[stocks_dataset['Ticker'] == 'NFLX'], 
       x='Ticker', y='Close', points='all', height=600)

## 4. Time Series Analysis:

In this section, we examine trends and patterns over time, focusing on the closing price.

In [79]:
stocks_dataset['Date'] = pd.to_datetime(stocks_dataset['Date'])

#### This pivot table holds the closing price of each stock, indexed by date:

In [94]:
pivot_data_closing_price = stocks_dataset.pivot(index='Date', columns='Ticker', values='Close')

In [95]:
pivot_data_closing_price.head(2)

Date,AAPL,GOOG,MSFT,NFLX
2023-02-07,154.649994,108.040001,267.559998,362.950012
2023-02-08,151.919998,100.0,266.730011,366.829987
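As a small illustration of what `pivot` does, the sketch below reshapes a four-row long table into the wide layout used above. It uses only the AAPL and GOOG closing prices from the two dates shown in the output; the names `long_df` and `wide_df` are introduced here for the example.

```python
import pandas as pd

# Long format: one row per (Date, Ticker) pair.
long_df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2023-02-07", "2023-02-07", "2023-02-08", "2023-02-08"]
    ),
    "Ticker": ["AAPL", "GOOG", "AAPL", "GOOG"],
    "Close": [154.649994, 108.040001, 151.919998, 100.0],
})

# Wide format: one row per date, one column per ticker.
wide_df = long_df.pivot(index="Date", columns="Ticker", values="Close")
print(wide_df)
```

`pivot` raises a `ValueError` if any (Date, Ticker) pair appears more than once; in that situation `pivot_table` with an explicit aggregation function is the usual alternative.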


In [96]:
# Creating the sub-plot:
fig = make_subplots(rows=1, cols=1)


# Adding the traces for each stock:
for column in pivot_data_closing_price.columns:
    fig.add_trace(
        go.Scatter(x=pivot_data_closing_price.index, y=pivot_data_closing_price[column], name=column),
        row=1, col=1)
    

# Update the layout:
fig.update_layout(
    title_text = "Time Series Visualization for Closing Prices",
    xaxis_title='Date',
    yaxis_title='Closing Price',
    legend_title='Stocks',
    showlegend=True
)


fig

##### From the above chart, we can see that MSFT and AAPL show a general upward trend in the closing price.
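One way to back a visual impression of a trend with a number is the percentage change between the first and last closing price of each stock. The sketch below assumes a wide, date-indexed table like the pivot table; the tickers `AAA` and `BBB` and their prices are hypothetical placeholders, not values from the dataset.

```python
import pandas as pd

# Hypothetical wide table: dates as index, one column per ticker.
closing = pd.DataFrame(
    {"AAA": [100.0, 104.0, 110.0], "BBB": [50.0, 48.0, 45.0]},
    index=pd.to_datetime(["2023-02-07", "2023-02-08", "2023-02-09"]),
)

# Percentage change from the first to the last row, per ticker.
# A positive value means the stock closed higher at the end of the period.
period_change_pct = (closing.iloc[-1] / closing.iloc[0] - 1) * 100
print(period_change_pct.sort_values(ascending=False))
```

Applied to `pivot_data_closing_price`, the same expression would give one number per stock. It only compares the two endpoints, so it complements the chart rather than replacing it.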

## 5. Volatility Analysis:

Next, we measure how volatile each stock's closing price is using the standard deviation, i.e. how much the
closing prices deviate from their mean.

In [102]:
volaitlity_on_closing_price = pivot_data_closing_price.std().sort_values(ascending=False).reset_index()

In [104]:
volaitlity_on_closing_price.rename(columns={0:'Standard Deviation'}, inplace=True)

In [105]:
volaitlity_on_closing_price

Unnamed: 0,Ticker,Standard Deviation
0,NFLX,18.554419
1,MSFT,17.676231
2,AAPL,7.360485
3,GOOG,6.279464


In [116]:
fig = px.bar(data_frame=volaitlity_on_closing_price, x='Ticker',
             y='Standard Deviation', title='Volatility on Closing Price (Standard Deviation)',
             text='Standard Deviation')

fig.update_traces(texttemplate='%{text:.2f}', textposition='outside')

From this bar chart, we can see that NFLX (18.55) and MSFT (17.68) have the most volatile closing prices in
absolute terms, compared to AAPL (7.36) and GOOG (6.28).
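Because the standard deviation is expressed in price units, stocks that trade at higher prices tend to show larger values. A complementary check, which the original analysis does not perform, is the coefficient of variation (standard deviation divided by mean). The sketch below is an assumption-labelled illustration that reuses the closing-price means and standard deviations from the table in section 3.2.

```python
import pandas as pd

# Mean and standard deviation of the closing price, copied from section 3.2.
closing_stats = pd.DataFrame(
    {
        "mean": [158.240645, 100.631532, 275.039839, 327.614677],
        "std": [7.360485, 6.279464, 17.676231, 18.554419],
    },
    index=["AAPL", "GOOG", "MSFT", "NFLX"],
)

# Coefficient of variation: standard deviation relative to the mean price.
closing_stats["cv"] = closing_stats["std"] / closing_stats["mean"]

relative_volatility = closing_stats["cv"].sort_values(ascending=False)
print(relative_volatility.round(4))
```

The ordering by this ratio can differ from the ordering by raw standard deviation, so the two measures are best read together.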

## 6. Correlation Analysis:

Now, let's see whether the closing prices of these stocks are correlated with each other.

- Values close to +1 represent a strong positive correlation.
- Values close to -1 represent a strong negative correlation.
- Values close to 0 show little or no linear correlation.
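The interpretation of these values can be seen on a toy example. The sketch below computes the Pearson correlation coefficient by hand and compares it with pandas, whose `corr()` uses Pearson by default. The function name `pearson` and the three short series are hypothetical and chosen only to hit the +1 and -1 extremes.

```python
import numpy as np
import pandas as pd


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Hypothetical series, not taken from the dataset.
up = [1.0, 2.0, 3.0, 4.0]
also_up = [2.0, 4.0, 6.0, 8.0]   # rises exactly in step with `up`
down = [8.0, 6.0, 4.0, 2.0]      # falls exactly as `up` rises

r_pos = pearson(up, also_up)
r_neg = pearson(up, down)
r_pandas = pd.Series(up).corr(pd.Series(down))

print(r_pos, r_neg, r_pandas)
```

Real price series rarely reach these extremes; values such as the one reported for GOOG and AAPL below sit between them.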

In [133]:
# Creating the correlation matrix:
corr_closing_price = pivot_data_closing_price.corr()

fig = go.Figure(data=go.Heatmap(
        z=corr_closing_price,
        x=corr_closing_price.columns,
        y=corr_closing_price.columns,
        colorscale='blues',
        colorbar=dict(title='Correlation')
))

fig.update_layout(
    title='Correlation Matrix for Closing Prices',
    xaxis_title='Ticker',
    yaxis_title='Ticker'
    )

fig

Here, GOOG and AAPL have a strong positive correlation of 0.901. This means that when the closing price of AAPL is
high, GOOG also tends to have a higher closing price, and vice versa.