# Stock Market Analysis 

<img src="https://img.freepik.com/free-vector/gradient-stock-market-concept_23-2149166910.jpg" width = 800px>

# Introduction

Success in any financial market requires one to identify solid investments. When a stock or derivative is undervalued, it makes sense to buy. If it's overvalued, perhaps it's time to sell. While these financial decisions were historically made manually by professionals, technology has ushered in new opportunities for retail investors. Data scientists, specifically, may be interested in exploring quantitative trading, where decisions are executed programmatically based on predictions from analysis and trained models.

Stocks can be a valuable part of your investment portfolio. Owning stocks in different companies can help you build your savings, protect your money from inflation and taxes, and maximize income from your investments. It's important to know that there are risks when investing in the stock market.

Historically, long-term equity returns have been better than returns from cash or fixed-income investments such as bonds. However, stock prices tend to rise and fall over time. Investors may want to consider a long-term perspective for their equity portfolio because these stock-market fluctuations do tend to smooth out over longer periods of time.

Taxes and inflation can impact your wealth. Equity investments can give investors better tax treatment over the long term, which can help slow or prevent the negative effects of both taxes and inflation.
Some companies pay shareholders dividends or special distributions. These payments can provide you with regular investment income and enhance your return.

### S&P 500 - Microsoft

As one of the most diversified companies, Microsoft has strong positions in operating systems, video games, cloud computing, productivity software, and even social media with LinkedIn. This year, the company will also expand its digital advertising business through a partnership with Netflix.

Microsoft's varied revenue streams proved their strength in 2022, with segments less affected by economic challenges able to maintain earnings growth despite declines in specific markets. The company's stock is down 22% year over year. However, its growth of 157% over the last five years proves it is a reliable long-term investment.

In mid-January, news broke that Microsoft is considering investing USD 10 billion in OpenAI, an artificial intelligence (AI) company best known for its AI chatbot ChatGPT. The deal would expand Microsoft's stake in the company after it initially invested USD 1 billion in 2019. Because Microsoft holds a share in the company, and ChatGPT now offers a paid service for access to the chatbot, this could increase its revenue.

### S&P 500 - Amazon

Brand Finance recognized Amazon as the most valuable brand in the world in 2023. That distinction highlights the tremendous popularity its marketplace enjoys among consumers. Indeed, Amazon draws more visitors each month than any other digital shopping destination, and it accounted for 38% of online retail sales in North America and Western Europe last year. Its Prime membership program creates even more value for consumers, and its massive logistics network adds value for merchants, accelerating the network effects inherent to its business model.

Looking ahead, Ameco Research estimates that global e-commerce sales will grow at 13.6% annually to reach USD 15 trillion by 2030. Amazon is exceptionally well positioned to benefit from that trend.

Amazon Web Services (AWS) was the first hyperscale public cloud, and it still dominates the market for cloud infrastructure and platform services (CIPS). AWS accounted for 32% of CIPS spending in the fourth quarter last year, putting it nine percentage points ahead of the runner-up Microsoft Azure.

### Cryptocurrency - Bitcoin

The blockchain technology underlying bitcoin and other cryptocurrencies has been hailed as a potential gamechanger for a large number of industries, from shipping and supply chains to banking and healthcare. By removing intermediaries and trusted actors from computer networks, distributed ledgers can facilitate new types of economic activity that were not possible before.

This potential makes for an attractive investment to people who believe in the future of digital currencies. For people who believe in that promise, investing in cryptocurrency represents a way to earn high returns while supporting the future of technology.

Another common reason to invest in cryptocurrency is the desire for a reliable, long-term store of value. Unlike fiat money, most cryptocurrencies have a limited supply, capped by mathematical algorithms. This makes it impossible for any political body or government agency to dilute their value through inflation. Moreover, due to the cryptographic nature of cryptocurrencies, it is impossible for a government body to tax or confiscate tokens without the cooperation of the owner.

Bitcoin has grabbed the attention of the world over the last decade, as it could represent a new form of decentralized money. The ability to have a trustless payment system without a third-party intermediary has many people betting on a bright future for it. Bitcoin is the fastest growing cryptocurrency in the world. This makes Bitcoin an important cryptocurrency and a good investment option to consider.


### Inflation

Inflation occurs when the supply of money increases relative to the level of productive output in the economy. Prices tend to rise because more dollars are chasing relatively fewer goods. Another way of stating this phenomenon is that the purchasing power of each money unit declines.

This means that inflation can have a huge impact on the way we save and invest our money: it can either reduce the value of your investment portfolio over time, or you could possibly use it to your advantage to help your investments grow.
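
To make the loss of purchasing power concrete, the short sketch below discounts a nominal amount by a constant annual inflation rate. The function name `real_value` and the 5% rate are illustrative assumptions, not figures taken from the data used later in this notebook.

```python
def real_value(nominal: float, annual_inflation: float, years: int) -> float:
    """Purchasing power of `nominal` after `years` of constant inflation."""
    return nominal / (1 + annual_inflation) ** years


# Illustrative assumption: 5% inflation every year for 10 years
value = real_value(100.0, 0.05, 10)
print(f"R100 kept as cash for 10 years buys what R{value:.2f} buys today")
```

Under these assumed numbers, money that earns nothing loses a little under 40% of its purchasing power in a decade, which is why an investment has to be compared against inflation rather than against zero.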


# **IMPORTING PACKAGES**

In [1]:
# packages to install before running notebook
#!pip install -q yfinance
#!pip install pandas-datareader
#!pip install yahoo-finance

In [12]:
# for data access and manipulation
import requests
import pandas as pd
import numpy as np

# visualizations
import plotly.express as px
import plotly.graph_objects as go
import matplotlib.pyplot as plt
import seaborn as sns

#sns.set_style('whitegrid')
#plt.style.use("fivethirtyeight")
#%matplotlib inline

# For reading stock data from yahoo
from pandas_datareader.data import DataReader
import yfinance as yf
from yahoo_finance import Share
from yahoo_finance import Currency
from pandas_datareader import data as pdr
yf.pdr_override()

# For time stamps
from datetime import datetime

#ignoring warnings
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

# LOADING DATA

Data is extracted from different sources. Yahoo Finance is a rich resource of financial market data and tools to find compelling investments. We use the Yahoo Finance API for the Microsoft and Amazon data, and the Alpha Vantage API for the Bitcoin and inflation data.
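
The notebook does not show how the Alpha Vantage CSV files were downloaded, so the following is only a sketch of one way it might be done. It builds the download URLs without making a network request. The helper `alpha_vantage_csv_url` and the `YOUR_API_KEY` placeholder are hypothetical, and the function names `DIGITAL_CURRENCY_MONTHLY` and `INFLATION` are taken from Alpha Vantage's public documentation; check them against the current documentation before relying on them.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"


def alpha_vantage_csv_url(function: str, api_key: str, **params: str) -> str:
    """Build a CSV download URL (hypothetical helper; no request is made here)."""
    query = {"function": function, **params, "apikey": api_key, "datatype": "csv"}
    return f"{BASE_URL}?{urlencode(query)}"


API_KEY = "YOUR_API_KEY"  # placeholder, replace with a real key

# Monthly BTC prices quoted in ZAR, and the annual inflation series
btc_url = alpha_vantage_csv_url(
    "DIGITAL_CURRENCY_MONTHLY", API_KEY, symbol="BTC", market="ZAR"
)
inflation_url = alpha_vantage_csv_url("INFLATION", API_KEY)

print(btc_url)
print(inflation_url)
```

With a valid key, either URL could be passed directly to `pd.read_csv` instead of saving the file to disk first.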

### MICROSOFT AND AMAZON DATA

We start from 2019, before the pandemic, to see how the market was performing and how the pandemic shaped it going forward. This helps with decision making by showing how sensitive these stocks are (or were) to the pandemic.

In [13]:
# we specify our start and end date
end = datetime.now()
start = datetime(2019, 1, 1)

In [14]:
# Amazon data
amazon_df = yf.download('AMZN', start, end)
amazon_df = amazon_df.reset_index(drop=False)
amazon_df['company_name'] = 'AMAZON'  # a scalar is broadcast to every row

# Microsoft data
msft_df = yf.download('MSFT', start, end)
msft_df = msft_df.reset_index(drop=False)
msft_df['company_name'] = 'MSFT'


[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed


In [15]:
#How the data looks
msft_df.head()

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume,company_name
0,2019-01-02,99.550003,101.75,98.940002,101.120003,96.632668,35329300,MSFT
1,2019-01-03,100.099998,100.190002,97.199997,97.400002,93.077728,42579100,MSFT
2,2019-01-04,99.720001,102.510002,98.93,101.93,97.406708,44060600,MSFT
3,2019-01-07,101.639999,103.269997,100.980003,102.059998,97.530952,35656100,MSFT
4,2019-01-08,103.040001,103.970001,101.709999,102.800003,98.238098,31514400,MSFT


Both datasets are in USD, so we need to convert them to our currency, ZAR. We assume the current conversion rate, **1 USD = 18.71 ZAR**, for the whole period.

In [None]:
# Microsoft: convert the USD price columns to ZAR (Volume is a share count, so it is left as is)
price_cols = ['Open', 'High', 'Low', 'Close', 'Adj Close']
msft_df[price_cols] = msft_df[price_cols] * 18.71
msft_df.head()

In [None]:
# Amazon: convert the USD price columns to ZAR (Volume is a share count, so it is left as is)
price_cols = ['Open', 'High', 'Low', 'Close', 'Adj Close']
amazon_df[price_cols] = amazon_df[price_cols] * 18.71
amazon_df.head()
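
One thing to watch with the conversion cells above is that they overwrite the columns in place, so running a cell twice applies the rate twice. A minimal sketch of a helper that returns a converted copy instead is shown below. The names `to_zar`, `USD_TO_ZAR` and `PRICE_COLS` are illustrative, and the sample rows are the first two MSFT rows from the output shown earlier.

```python
import pandas as pd

USD_TO_ZAR = 18.71  # fixed rate assumed in the text above
PRICE_COLS = ["Open", "High", "Low", "Close", "Adj Close"]


def to_zar(df: pd.DataFrame, rate: float = USD_TO_ZAR) -> pd.DataFrame:
    """Return a copy of `df` with the price columns converted from USD to ZAR."""
    out = df.copy()
    out[PRICE_COLS] = out[PRICE_COLS] * rate
    return out


# First two MSFT rows from the output shown earlier (prices in USD)
sample = pd.DataFrame({
    "Open": [99.550003, 100.099998],
    "High": [101.75, 100.190002],
    "Low": [98.940002, 97.199997],
    "Close": [101.120003, 97.400002],
    "Adj Close": [96.632668, 93.077728],
    "Volume": [35329300, 42579100],
})

sample_zar = to_zar(sample)
print(sample_zar[["Open", "Close", "Volume"]].round(2))
```

Because the original frame is left untouched, the conversion can be re-run safely, and a different rate can be tried by passing `rate=...`.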

### BITCOIN

In [32]:
# csv downloaded using api key

bitcoin_df = pd.read_csv('currency_monthly_BTC_ZAR.csv')

In [33]:
# first 5 rows of df
bitcoin_df.head()

Unnamed: 0,timestamp,open (ZAR),high (ZAR),low (ZAR),close (ZAR),open (USD),high (USD),low (USD),close (USD),volume,market cap (USD)
0,2023-05-10,545462.863464,556411.9764,508682.20324,515165.466329,29233.2,29820.0,27262.0,27609.46,476512.3,476512.3
1,2023-04-30,531135.721547,578429.62,502726.617236,545463.050054,28465.36,31000.0,26942.82,29233.21,1626746.0,1626746.0
2,2023-03-31,431799.017461,544557.527814,364766.861292,531135.721547,23141.57,29184.68,19549.09,28465.36,9516189.0,9516189.0
3,2023-02-28,431492.263173,471140.255,398390.042151,431799.017461,23125.13,25250.0,21351.07,23141.57,8642691.0,8642691.0
4,2023-01-31,308653.217265,447080.195071,307855.35757,431492.263173,16541.77,23960.54,16499.01,23125.13,7977029.0,7977029.0


We will focus on the ZAR currency.

In [34]:
# taking columns with ZAR
ZAR = [i for i in bitcoin_df.columns if 'ZAR' in i]
cols = ['timestamp'] + ZAR + ['volume']

bitcoin_df = bitcoin_df[cols]
bitcoin_df.head()

Unnamed: 0,timestamp,open (ZAR),high (ZAR),low (ZAR),close (ZAR),volume
0,2023-05-10,545462.863464,556411.9764,508682.20324,515165.466329,476512.3
1,2023-04-30,531135.721547,578429.62,502726.617236,545463.050054,1626746.0
2,2023-03-31,431799.017461,544557.527814,364766.861292,531135.721547,9516189.0
3,2023-02-28,431492.263173,471140.255,398390.042151,431799.017461,8642691.0
4,2023-01-31,308653.217265,447080.195071,307855.35757,431492.263173,7977029.0


Our data now conforms to our currency.
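
Note that the Bitcoin rows arrive newest first, whereas the stock data runs oldest first. If the series are to be compared over time, it may help to parse the timestamps and sort them chronologically, in the same way the inflation data is sorted by `timestamp` below. The sketch uses three rows copied from the output above and is only one possible approach.

```python
import pandas as pd

# Three rows copied from the output above (ZAR close only, for brevity)
btc_sample = pd.DataFrame({
    "timestamp": ["2023-05-10", "2023-04-30", "2023-03-31"],
    "close (ZAR)": [515165.466329, 545463.050054, 531135.721547],
})

# Parse the strings into real dates, then order the rows oldest first
btc_sample["timestamp"] = pd.to_datetime(btc_sample["timestamp"])
btc_sample = btc_sample.sort_values("timestamp").reset_index(drop=True)

print(btc_sample)
```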

### INFLATION

In [21]:
# inflation csv downloaded using api key

inflation_df = pd.read_csv('INFLATION.csv')
inflation_df = inflation_df.sort_values(by = 'timestamp')

In [22]:
inflation_df.head()

Unnamed: 0,timestamp,value
62,1960-01-01,1.457976
61,1961-01-01,1.070724
60,1962-01-01,1.198773
59,1963-01-01,1.239669
58,1964-01-01,1.278912


# Descriptive Statistics about the Data
Descriptive statistics include those that summarize the central tendency, dispersion, and shape of a dataset’s distribution, excluding `NaN` values. This should provide high level statistics about our data.
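
The statistics described above are what pandas' `describe()` reports, whereas `info()` reports column types and non-null counts. As a small illustration, the sketch below runs `describe()` and `skew()` on the `Close` and `Volume` values of the five MSFT rows shown earlier (in USD, before conversion). It is a toy sample, not the full dataset.

```python
import pandas as pd

# Close and Volume for the five MSFT rows shown earlier
sample = pd.DataFrame({
    "Close": [101.120003, 97.400002, 101.93, 102.059998, 102.800003],
    "Volume": [35329300, 42579100, 44060600, 35656100, 31514400],
})

# count, mean, std, min, quartiles and max for every numeric column
summary = sample.describe()
print(summary)

# Skewness summarizes the shape (asymmetry) of each column's distribution
print(sample.skew())
```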


In [23]:
amazon_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1096 entries, 0 to 1095
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   Date          1096 non-null   datetime64[ns]
 1   Open          1096 non-null   float64       
 2   High          1096 non-null   float64       
 3   Low           1096 non-null   float64       
 4   Close         1096 non-null   float64       
 5   Adj Close     1096 non-null   float64       
 6   Volume        1096 non-null   int64         
 7   company_name  1096 non-null   object        
dtypes: datetime64[ns](1), float64(5), int64(1), object(1)
memory usage: 68.6+ KB


Reviewing the content of our data, we can see that every column is numeric except for the date and company name, that the records are daily, and that `Date` is a regular column while the index is a default integer range (a result of `reset_index`). Notice also that weekends and holidays are missing from the records.
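
The missing weekends can be checked directly rather than by eye. The sketch below does so for the dates of the five MSFT rows shown earlier. On the full data the same two checks could be applied to `msft_df['Date']`.

```python
import pandas as pd

# Dates of the five MSFT rows shown earlier
dates = pd.Series(pd.to_datetime(
    ["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07", "2019-01-08"]
))

# dayofweek runs from Monday=0 to Sunday=6, so 5 and 6 are weekend days
has_weekend = (dates.dt.dayofweek >= 5).any()

# Gaps longer than one day mark weekends or market holidays
gaps = dates.diff().dropna()

print(has_weekend)
print(gaps.max())
```

Here the largest gap is the three days between Friday 4 January and Monday 7 January 2019.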

In [24]:
msft_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1096 entries, 0 to 1095
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   Date          1096 non-null   datetime64[ns]
 1   Open          1096 non-null   float64       
 2   High          1096 non-null   float64       
 3   Low           1096 non-null   float64       
 4   Close         1096 non-null   float64       
 5   Adj Close     1096 non-null   float64       
 6   Volume        1096 non-null   int64         
 7   company_name  1096 non-null   object        
dtypes: datetime64[ns](1), float64(5), int64(1), object(1)
memory usage: 68.6+ KB


Both datasets have an equal number of rows because we chose the same time frame.

# Exploratory Data Analysis (EDA)

This includes methods that:

- Maximize our natural pattern-recognition abilities to extract insights.
- Uncover underlying structure (e.g., skewness).
- Extract important variables and find interesting relations among the variables.
- Detect anomalies.
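
As a rough illustration of two of these goals, the sketch below measures skewness and flags anomalies in a series of daily returns. The prices are synthetic (made-up values with one deliberately large jump), and the 3-standard-deviation z-score rule is just one common convention, not necessarily the method used later in this analysis.

```python
import numpy as np
import pandas as pd

# Synthetic returns: small alternating moves plus one large jump (made-up data)
returns_in = [0.01, -0.01] * 24
returns_in.insert(30, 0.20)
prices = pd.Series(100 * np.cumprod([1.0] + [1 + r for r in returns_in]))

# Daily percentage returns recovered from the price series
returns = prices.pct_change().dropna()

# Skewness: a positive value means a longer right tail
skewness = returns.skew()

# Z-score rule: flag returns more than 3 standard deviations from the mean
z_scores = (returns - returns.mean()) / returns.std()
anomalies = returns[z_scores.abs() > 3]

print(f"skewness: {skewness:.2f}")
print(anomalies)
```

On real price data the same steps could be applied to the `Close` column of each DataFrame.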
