# Project By Ahsan Zafar

# Stock Analysis In Python

### Introduction:

- In the dynamic world of finance and investment, understanding the behavior of stock prices is of paramount importance. The stock market reflects the ever-changing economic landscape and is influenced by many factors, such as market sentiment, company performance, and global events. In this project, we will analyze the stocks of four big tech companies: Amazon, Google, Apple, and Microsoft. I will use Python and the Plotly graphing library for visualizations, as Plotly provides interactivity.

### Exploring Amazon Stocks:

- Our initial focus will be on Amazon, one of the most prominent companies in the tech and e-commerce sectors. We will delve into its stock performance throughout the year 2022, aiming to uncover insights that can inform investment decisions.

- Here are the key aspects we'll analyze for Amazon stocks in 2022:

- **Change in Stock Price:** Understanding the fluctuations in Amazon's stock price over the year.
- **Change in Stock Volume:** Examining the trading volume of Amazon shares and its significance.
- **Simple Moving Average:** Utilizing this widely-used technical indicator to identify trends.
- **Daily Return of Stock:** Calculating the daily returns to gauge volatility and assess risk.
- **Average Daily Return:** Evaluating the average daily returns for a broader perspective.
- **Trend Frequency:** Identifying trends in Amazon's stock price movements and their frequencies.
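- The daily return, simple moving average, and average daily return listed above can be illustrated on a tiny example before applying them to real data. The sketch below uses made-up closing prices (not Amazon data) and a 3-day window purely for readability:

```python
import pandas as pd

# Hypothetical closing prices for five trading days (illustrative values only)
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 105.0], name="Close")

# Daily return: fractional change from the previous day's close
daily_return = prices.pct_change()

# Simple moving average over a 3-day window (the first two entries are NaN)
sma_3 = prices.rolling(window=3).mean()

# Average daily return over the period (the NaN in the first row is skipped)
avg_daily_return = daily_return.mean()

summary = pd.DataFrame({"Close": prices, "Daily return": daily_return, "SMA_3": sma_3})
print(summary)
print("Average daily return:", avg_daily_return)
```

- The first daily return is NaN because there is no previous day to compare against, and the moving average only starts once a full window of prices is available.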

### Comparative Analysis of Tech Giants:

- Our exploration won't stop at Amazon. We'll broaden our scope to include Google, Microsoft, and Apple, collectively known as the technology giants, and analyze the 2022 stock data of all four companies to gain a comprehensive view of the tech sector.

- For each of these companies, we will delve into the following aspects:

- **Change in Stock Price:** How did the stock prices of these giants evolve in 2022?
- **Change in Stock Volume:** What were the trading patterns for their shares?
- **Simple Moving Average:** Did these companies exhibit similar or distinct trends?
- **Daily Return of Stock:** What were the daily return profiles for each company?
- **Average Daily Return:** What was the average daily return for investors in each of these giants?
- **Trend Frequency:** Were there significant trends in their stock price movements, and how often did they occur?

### Daily Return Correlation:
- Lastly, we will explore the interplay between these tech giants by examining the daily return correlations among Amazon, Google, Microsoft, and Apple. This analysis will provide insights into how closely their stock prices move in relation to one another, a critical consideration for portfolio diversification.

- Throughout this project, our goal is to empower investors and financial enthusiasts with valuable insights and data-driven decisions. The world of stocks is complex, but with Python and Plotly, we have the tools to navigate it, uncover patterns, and make informed investment choices.

In [109]:
# importing libraries
import pandas as pd
import plotly.subplots as sp
import plotly.express as px
import plotly.graph_objects as go
import yfinance as yf

In [110]:
# I have made functions to perform some tasks on the combined stock data of all companies

# function to read data
def get_stock_data(ticker,start_date,end_date):
  """Take the ticker symbols (the companies to fetch data for), a start date and an end date, and return the downloaded data for that period."""

  # downloading the data using yahoo finance
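  # auto_adjust=False keeps the separate "Adj Close" column that later functions rely on
  df = yf.download(ticker, start=start_date, end=end_date, auto_adjust=False)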

  # returning the downloaded data
  return df

# making a function to clean data
def clean_data(df):
  """Take a dataframe as input and remove the rows with NaN values and the duplicate rows from it."""

  # dropping the NaN values
  df.dropna(inplace=True)

  # dropping the duplicate values
  df.drop_duplicates(inplace=True)

  # returning the cleaned dataframe
  return df

# making a function to plot change in stocks over time
def stock_change(df,ticker):
  """Take the dataframe and the symbols of the companies and make a line chart showing the stock price change for each company."""

  # making a figure with a title
  fig = go.Figure(layout_title_text="Change in Stock Price Over Time")

  # applying a for loop that iterates on the companies symbol and plot the data of that company stocks
  for tick in ticker:
    fig.add_trace(go.Scatter(x=df.index, y=df["Close"][tick], name=tick))

  # adding x axis label
  fig.update_xaxes(title_text="Date")

  # adding y axis label
  fig.update_yaxes(title_text="Stock Price")

  # adding a slider at the bottom
  fig.update_xaxes(rangeslider_visible=True)

  # controlling the scaling of the y-axis relative to the x-axis
  fig.update_yaxes(scaleanchor='x', scaleratio=0.1)
  fig.show()

# making a function to plot change in stocks volume over time
def volume_change(df,ticker):

  """ This Function will take a dataframe and symbol of companies and will plot the change in stocks volume for each company"""

  # making a figure object with a title
  fig = go.Figure(layout_title_text="Change in Stock Volume Over Time")

  # iterating over the companies symbols to plot data for each company
  for tick in ticker:
    fig.add_trace(go.Scatter(x=df.index, y=df["Volume"][tick], name=tick))

  # adding the x axis label
  fig.update_xaxes(title_text="Date")

  # adding the y-axis label
  fig.update_yaxes(title_text="Stock Volume")

  # adding the slider
  fig.update_xaxes(rangeslider_visible=True)

  # controlling the scaling of the y-axis relative to the x-axis
  fig.update_yaxes(scaleanchor='x', scaleratio=0.1)

  # finally displaying the final plot
  fig.show()


# making a function for calculating moving averages and plotting them
def moving_average(df,ticker,window):
  """ This function plots the Simple Moving Average of the stocks of each company.
  It takes in a dataframe, the symbols of the companies and the window over which to calculate the Simple Moving Average"""

  # making a figure with a title
  fig = go.Figure(layout_title_text="Simple Moving Average Of Stocks ({} window)".format(window))

  # iterating on the symbol of each company, calculating the defined window Simple Moving Average for each company and plotting them
  for tick in ticker:

    # making a new column of moving averages for each company
    df["moving_average_{}".format(tick)] = df["Adj Close"][tick].rolling(window=window).mean()

    # plotting the data
    fig.add_trace(go.Scatter(x=df.index, y=df["moving_average_{}".format(tick)], name=tick))

  # adding x axis label
  fig.update_xaxes(title_text="Date")

  # adding y axis label
  fig.update_yaxes(title_text="Simple Moving Average")

  # adding a slider
  fig.update_xaxes(rangeslider_visible=True)

  # controlling the scaling of the y-axis relative to the x-axis
  fig.update_yaxes(scaleanchor='x', scaleratio=0.1)

  # displaying the plot finally
  fig.show()

# making a function to calculate Daily Return of Stock and Plotting them
def daily_return(df,ticker):

  """ This function calculates the daily return (in percent) for the stocks of each company and plots it for further analysis.
  It takes in the dataframe containing the data and the symbols of the companies, calculates the daily returns and plots them as a line plot"""

  # initiating a figure with a title
  fig = go.Figure(layout_title_text="Daily Return Of Stock")

  # iterating on each company data, calculating the daily return and plotting them
  for tick in ticker:

    # making a new column of daily return for each company
    df["Daily_return_{}".format(tick)] = df["Close"][tick].pct_change() * 100

    # plotting the daily return of each company
    fig.add_trace(go.Scatter(x=df.index, y=df["Daily_return_{}".format(tick)], name=tick))

  # adding x axis label
  fig.update_xaxes(title_text="Date")

  # adding y axis label
  fig.update_yaxes(title_text="Daily Return")

  # adding a slider
  fig.update_xaxes(rangeslider_visible=True)

  # controlling the scaling of the y-axis relative to the x-axis
  fig.update_yaxes(scaleanchor='x', scaleratio=0.1)

  # displaying the plot finally
  fig.show()


# making a function to calculate the average daily return of stocks of each company and visualizing the averages through a bar chart
def average_daily_return(df):

  """ This function calculates the daily returns of each company stock, then takes the average of the daily returns and plots them using a bar chart.
  It takes in a dataframe only and does the calculations for daily return, average daily return and the visualization through a bar chart"""

  # calculating the daily returns
  daily_return = df["Adj Close"].pct_change()

  # calculating average of daily return
  avg_daily_return = daily_return.mean()

  # making a dataframe for the average daily returns of each company
  average_returns_df = pd.DataFrame({'Stock': avg_daily_return.index, 'Average Daily Return': avg_daily_return.values})

  # Creating a bar plot using Plotly Express
  fig = px.bar(average_returns_df, y='Stock', x='Average Daily Return', color="Stock",title="Average Daily Returns")

  # displaying the plot
  fig.show()


# making a function to visualize the trends for the stocks of each company
def trend_pie(uptrend_value,downtrend_value,ticker,df):
  """ This function makes a trend column for each company stock, which has three values (Flat, Uptrend and Downtrend), and makes a pie plot for each company.
  It takes in the uptrend threshold value, the downtrend threshold value, the symbols of the companies and the dataframe."""

  # defining uptrend threshold value
  uptrend_threshold = uptrend_value

  # defining downtrend threshold value
  downtrend_threshold = downtrend_value

  # iterating on each company data and creating trend column
  for tick in ticker:

    # setting all the values initially to Flat
    df["Trend_{}".format(tick)] = 'Flat'  # Initialize with 'Flat'

    # applying the uptrend threshold
    df.loc[df["Daily_return_{}".format(tick)] > uptrend_threshold, "Trend_{}".format(tick)] = 'Uptrend'

    # applying the downtrend threshold
    df.loc[df["Daily_return_{}".format(tick)] < downtrend_threshold, "Trend_{}".format(tick)] = 'Downtrend'

  # making the title for each subplot
  subplot_title = [i + " Trend" for i in ticker]
  # making 4 subplots
  fig = sp.make_subplots(rows=2,cols=2,subplot_titles=subplot_title,specs=[[{"type": "pie"}, {"type": "pie"}],[{"type": "pie"}, {"type": "pie"}]])

  # defining the positions of subplots
  index_positions = [[1,1],[1,2],[2,1],[2,2]]

  # iterating on the companies symbol and the position to create pie plots
  for tick,position in zip(ticker,index_positions):

    # calculating the counts of each category in the trend column for each company
    value_counts = df["Trend_{}".format(tick)].value_counts()

    # saving labels in list
    labels = value_counts.index.tolist()

    # saving values in list
    values = value_counts.values.tolist()

    # making pie plot for each companies trend
    fig.add_trace(go.Pie(values=values,labels=labels,domain=dict(x=[0.1, 1]),),row=position[0],col=position[1])

  # displaying the figure finally
  fig.show()


# making a function to visualize the correlation coefficient of daily return of each company stock
def daily_return_correlation(df):

  """ This function calculates the Pearson correlation coefficient of the daily returns of each company stock and visualizes it through a heatmap.
  It takes in the dataframe, calculates the Pearson correlation coefficients and visualizes them."""

  # calculating the Pearson correlation coefficient for daily returns of each company stocks
  corr_matrix = df[["Daily_return_AMZN", "Daily_return_AAPL", "Daily_return_GOOGL", "Daily_return_MSFT"]].corr()

  # defining the names of each company in the same order as the data
  stock_labels = ["AMAZON", "APPLE", "GOOGLE", "MICROSOFT"]

  # Creating a heatmap using Plotly Express
  fig = px.imshow(corr_matrix,
                x=stock_labels,
                y=stock_labels,
                color_continuous_scale='viridis',
                #zmin=-1,
                #zmax=1,
                text_auto=True)

  # Customizing the layout,adding title to the heatmap, adding x-axis label and y-axis label
  fig.update_layout(title="Correlation Heatmap",
    xaxis=dict(title="Stocks"),
    yaxis=dict(title="Stocks"))

  # Finally displaying the figure
  fig.show()




# Stock Analysis of Amazon

In [111]:
# Downloading the stock data for Amazon
# auto_adjust=False keeps the separate "Adj Close" column used in later cells
df = yf.download("AMZN", start='2022-01-01', end='2022-12-31', auto_adjust=False)

[*********************100%%**********************]  1 of 1 completed


In [112]:
# checking the data retrieved
df.head()

Date,Open,High,Low,Close,Adj Close,Volume
2022-01-03,167.550003,170.703506,166.160507,170.404495,170.404495,63520000
2022-01-04,170.438004,171.399994,166.349503,167.522003,167.522003,70726000
2022-01-05,166.882996,167.126495,164.356995,164.356995,164.356995,64302000
2022-01-06,163.4505,164.800003,161.936996,163.253998,163.253998,51958000
2022-01-07,163.839005,165.2435,162.031006,162.554001,162.554001,46606000


In [113]:
# getting basic statistics for amazon stocks
df.describe()

Statistic,Open,High,Low,Close,Adj Close,Volume
count,251.0,251.0,251.0,251.0,251.0,251.0
mean,126.31808,128.419815,123.914148,126.098819,126.098819,76080700.0
std,23.91848,24.0516,23.701656,23.904315,23.904315,34025410.0
min,82.800003,83.480003,81.690002,81.82,81.82,35088600.0
25%,108.330498,112.38625,106.459999,108.8895,108.8895,55086950.0
50%,122.699997,124.400002,120.629997,122.769997,122.769997,66538000.0
75%,146.57,149.645004,143.752251,145.857506,145.857506,85083900.0
max,170.438004,171.399994,167.8685,170.404495,170.404495,272662000.0


In [114]:
# getting info about the columns
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 251 entries, 2022-01-03 to 2022-12-30
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Open       251 non-null    float64
 1   High       251 non-null    float64
 2   Low        251 non-null    float64
 3   Close      251 non-null    float64
 4   Adj Close  251 non-null    float64
 5   Volume     251 non-null    int64  
dtypes: float64(5), int64(1)
memory usage: 13.7 KB


In [115]:
# checking for null values
df.isna().sum()

Open         0
High         0
Low          0
Close        0
Adj Close    0
Volume       0
dtype: int64

- No null values found in the data.

In [116]:
# checking for duplicate values
df[df.duplicated()]

Date,Open,High,Low,Close,Adj Close,Volume


- No duplicate values are found in the data.

In [117]:
# making a lineplot to show change in amazon stock price over time
px.line(df,x=df.index,y="Close",title="Amazon Stock Price over time(Jan 2022-Dec 2022)")

- The line chart shows that Amazon's stock price declined from January 2022 to December 2022.

In [118]:
# making a lineplot to show change in amazon stock Volume over time
px.line(df,x=df.index,y="Volume",title="Amazon Stock Volume Over Time (Jan 2022-Dec 2022)")

- The line plot shows occasional spikes in trading volume, but the stock volume in December 2022 is lower than the stock volume in January 2022.

In [119]:
# computing the 50-day moving average
df["moving_average"] = df["Adj Close"].rolling(window=50).mean()

In [120]:
# making a lineplot to show the 50-day moving average
px.line(df,x=df.index,y="moving_average",title="Amazon Stock Simple Moving Average (50 Day window)")

In [121]:
# Calculating daily returns as a fraction (multiply by 100 for a percentage)
df["Daily return"] = df["Close"].pct_change()

In [122]:
# making a lineplot to show the daily return
px.line(df,x=df.index,y="Daily return",title="Amazon Stock Daily Return")

In [123]:
# average daily return of amazon stock
avg_daily_return = df["Daily return"].mean()
print("Average Daily Return of Amazon Stock is",avg_daily_return)

Average Daily Return of Amazon Stock is -0.002329137756859738


In [124]:
# defining uptrend and downtrend
# uptrend is when daily return is greater than 0, downtrend is when daily return is less than 0 and Flat is when daily return is 0
uptrend_threshold = 0
downtrend_threshold = 0

# creating trend column
df['Trend'] = 'Flat'  # Initialize with 'Flat'
df.loc[df['Daily return'] > uptrend_threshold, 'Trend'] = 'Uptrend'
df.loc[df['Daily return'] < downtrend_threshold, 'Trend'] = 'Downtrend'
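- The thresholding above can also be written as a small reusable helper. The sketch below is only an illustration: `label_trend` is a hypothetical name, the returns are made-up values, and `numpy.select` is swapped in for the two `df.loc` assignments. It also makes one detail visible: the first daily return is NaN, and a NaN is neither above nor below the thresholds, so that day ends up labelled `Flat` unless it is dropped first.

```python
import numpy as np
import pandas as pd

def label_trend(returns, up=0.0, down=0.0):
    """Label each return as 'Uptrend', 'Downtrend' or 'Flat' (hypothetical helper)."""
    conditions = [(returns > up).to_numpy(), (returns < down).to_numpy()]
    # Rows matching neither condition (including NaN) fall back to 'Flat'
    labels = np.select(conditions, ["Uptrend", "Downtrend"], default="Flat")
    return pd.Series(labels, index=returns.index, name="Trend")

# Made-up daily returns; the leading NaN mimics the output of pct_change()
returns = pd.Series([np.nan, 0.02, -0.01, 0.0, 0.03])

trend = label_trend(returns)                  # NaN row is counted as 'Flat'
trend_clean = label_trend(returns.dropna())   # NaN row removed before labelling

print(trend.value_counts())
print(trend_clean.value_counts())
```

- Whether the first day should count as `Flat` or be excluded is a judgment call; with 251 trading days it changes the pie chart only marginally.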


In [125]:
# making a pie chart for amazon stocks trend
px.pie(df,names= "Trend",title="Amazon Stocks Trend Distribution" )

# Stock Analysis Of Amazon, Apple, Google and Microsoft

In [126]:
# making a list with symbols of companies to retrieve data for from yahoo finance
ticker = ["AMZN","AAPL","GOOGL","MSFT"]

# defining the starting date for the stocks
start_date = '2022-01-01'

# defining the ending date for the stocks
end_date = '2022-12-31'


# Downloading the stock data for the companies
stocks = get_stock_data(ticker,start_date,end_date)

[*********************100%%**********************]  4 of 4 completed


In [127]:
# checking the stocks data
stocks.head()

,Adj Close,Adj Close,Adj Close,Adj Close,Close,Close,Close,Close,High,High,...,Low,Low,Open,Open,Open,Open,Volume,Volume,Volume,Volume
Date,AAPL,AMZN,GOOGL,MSFT,AAPL,AMZN,GOOGL,MSFT,AAPL,AMZN,...,GOOGL,MSFT,AAPL,AMZN,GOOGL,MSFT,AAPL,AMZN,GOOGL,MSFT
2022-01-03,180.190979,170.404495,144.991501,329.394897,182.009995,170.404495,144.991501,334.75,182.880005,170.703506,...,143.712997,329.779999,177.830002,167.550003,145.054993,335.350006,104487900,63520000,28646000,28865100
2022-01-04,177.904053,167.522003,144.399506,323.746704,179.699997,167.522003,144.399506,329.01001,182.940002,171.399994,...,143.716507,326.119995,182.630005,170.438004,145.395996,334.829987,99310400,70726000,28400000,32674300
2022-01-05,173.171844,164.356995,137.774994,311.318756,174.919998,164.356995,137.774994,316.380005,180.169998,167.126495,...,137.688004,315.980011,179.610001,166.882996,144.419998,325.859985,94537600,64302000,54618000,40054300
2022-01-06,170.281006,163.253998,137.747498,308.858734,172.0,163.253998,137.747498,313.880005,175.300003,164.800003,...,136.558502,311.48999,172.699997,163.4505,136.998505,313.149994,96904000,51958000,37348000,39646100
2022-01-07,170.449341,162.554001,137.016998,309.016174,172.169998,162.554001,137.016998,314.040009,174.139999,165.2435,...,135.766495,310.089996,172.889999,163.839005,138.145493,314.149994,86709100,46606000,29760000,32720000


In [128]:
# getting basic statistics about stocks data
stocks.describe()

,Adj Close,Adj Close,Adj Close,Adj Close,Close,Close,Close,Close,High,High,...,Low,Low,Open,Open,Open,Open,Volume,Volume,Volume,Volume
Statistic,AAPL,AMZN,GOOGL,MSFT,AAPL,AMZN,GOOGL,MSFT,AAPL,AMZN,...,GOOGL,MSFT,AAPL,AMZN,GOOGL,MSFT,AAPL,AMZN,GOOGL,MSFT
count,251.0,251.0,251.0,251.0,251.0,251.0,251.0,251.0,251.0,251.0,...,251.0,251.0,251.0,251.0,251.0,251.0,251.0,251.0,251.0,251.0
mean,153.726056,126.098819,114.760371,265.72924,154.83506,126.098819,114.760371,268.917091,156.907809,128.419815,...,113.201064,265.289721,154.802709,126.31808,114.879681,269.10502,87910380.0,76080700.0,34767530.0,31219320.0
std,12.79099,23.904315,16.109141,24.843659,13.056081,23.904315,16.109141,25.761774,12.937389,24.0516,...,15.952845,25.460905,13.063034,23.91848,16.268895,26.059584,23656990.0,34025410.0,13619270.0,11483870.0
min,125.504547,81.82,83.43,212.199982,126.040001,81.82,83.43,214.25,129.949997,83.480003,...,83.339996,213.429993,127.989998,82.800003,85.400002,217.550003,35195900.0,35088600.0,9701400.0,9200800.0
25%,143.846016,108.8895,100.879997,244.753212,144.645004,108.8895,100.879997,247.18,146.709999,112.38625,...,99.134998,243.900002,144.330002,108.330498,100.005001,245.82,72297400.0,55086950.0,26012150.0,23334300.0
50%,152.968796,122.769997,113.891998,262.797943,154.089996,122.769997,113.891998,265.899994,155.830002,124.400002,...,112.480003,262.399994,154.009995,122.699997,113.404999,265.679993,83737200.0,66538000.0,31696000.0,29043900.0
75%,164.56942,145.857506,129.2155,285.799667,165.915001,145.857506,129.2155,289.744995,167.989998,149.645004,...,127.320999,285.464996,166.189995,146.57,130.606255,289.959991,96937050.0,85083900.0,39972000.0,35292600.0
max,180.190979,170.404495,148.0,329.394897,182.009995,170.404495,148.0,334.75,182.940002,171.399994,...,145.522507,329.779999,182.630005,170.438004,151.25,335.350006,182602000.0,272662000.0,123200000.0,90428900.0


In [129]:
# getting info about stocks data
stocks.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 251 entries, 2022-01-03 to 2022-12-30
Data columns (total 24 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   (Adj Close, AAPL)   251 non-null    float64
 1   (Adj Close, AMZN)   251 non-null    float64
 2   (Adj Close, GOOGL)  251 non-null    float64
 3   (Adj Close, MSFT)   251 non-null    float64
 4   (Close, AAPL)       251 non-null    float64
 5   (Close, AMZN)       251 non-null    float64
 6   (Close, GOOGL)      251 non-null    float64
 7   (Close, MSFT)       251 non-null    float64
 8   (High, AAPL)        251 non-null    float64
 9   (High, AMZN)        251 non-null    float64
 10  (High, GOOGL)       251 non-null    float64
 11  (High, MSFT)        251 non-null    float64
 12  (Low, AAPL)         251 non-null    float64
 13  (Low, AMZN)         251 non-null    float64
 14  (Low, GOOGL)        251 non-null    float64
 15  (Low, MSFT)         251 non-null    fl

In [130]:
# cleaning the retrieved data of the stocks using the clean_data function
clean_stocks = clean_data(stocks)

In [131]:
# visualizing the change in stock price using the function stock change
stock_change(clean_stocks,ticker)

- A slider is added at the bottom of the plot to investigate the change in stock price around a particular point in time.

In [132]:
# visualizing the change in stock volume using the function volume change
volume_change(clean_stocks,ticker)

- A slider is added at the bottom of the plot to investigate the change in stock volume around a particular point in time.

In [133]:
# visualizing the moving averages of stocks using the moving average function
moving_average(clean_stocks,ticker,50)

- The gap at the start of the plot corresponds to NaN values: a 50-day moving average cannot be computed until 50 days of data are available.
- A slider is added at the bottom to study the plot over a particular period of time.
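- The leading NaN values can be reproduced on a small series. This is a minimal sketch with made-up prices and a 3-day window instead of 50; it also shows the pandas `min_periods` argument, which is one possible way to fill the gap with partial-window averages (not something the analysis above does):

```python
import pandas as pd

# Made-up prices; a short window keeps the output readable
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
window = 3

# Default behaviour: the first (window - 1) values are NaN
sma_strict = prices.rolling(window=window).mean()

# With min_periods=1 the average is taken over however many values are available so far
sma_partial = prices.rolling(window=window, min_periods=1).mean()

print(pd.DataFrame({"price": prices, "strict": sma_strict, "partial": sma_partial}))
```

- Partial-window averages at the start are based on fewer observations, so they are noisier than the rest of the curve; leaving the gap, as done here, avoids mixing the two.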

In [134]:
# visualizing the daily return of stocks for each company using the daily return function
daily_return(clean_stocks,ticker)

- A slider is added at the bottom to zoom in on a particular part of the plot.

In [135]:
# visualizing the average daily return of stocks for all companies using the average daily return function
average_daily_return(clean_stocks)

- The barplot shows the average daily return value of stocks of each company.

In [136]:
# creating pie charts to visualize the trends for each company stock
# the first zero refers to up trend threshold value and second zero refers to down trend threshold value
trend_pie(0,0,ticker,clean_stocks)

- Each pie chart shows the trend frequency of stocks of each company.

In [137]:
# calculating the correlation coefficient between the daily return of stocks and visualizing them using heatmap, using the function daily return correlation
daily_return_correlation(clean_stocks)

- A correlation coefficient is a number between -1 and 1 that describes the strength and direction of the linear relationship between two variables. A correlation of +1 means the two variables are perfectly positively correlated: when one increases, the other increases as well. A correlation of -1 means they move in exactly opposite directions, and a value near 0 indicates little or no linear relationship.
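- The two extreme cases can be checked with a small synthetic example. The returns below are made up for illustration: column `B` is constructed to move with `A` and column `C` to move against it, so `DataFrame.corr()` (Pearson by default) should give +1 and -1 respectively, up to floating-point error.

```python
import pandas as pd

# Made-up daily returns for a hypothetical stock A
returns = pd.DataFrame({"A": [0.01, -0.02, 0.03, 0.00, -0.01]})

# B always moves in the same direction as A (twice as far)
returns["B"] = 2 * returns["A"]

# C always moves in the opposite direction to A
returns["C"] = -returns["A"]

# Pairwise Pearson correlation of the three return series
corr = returns.corr()
print(corr.round(2))
```

- Real stocks rarely reach these extremes; the heatmap above reports where each pair of companies falls between them.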