# Stock Market Anomaly Detection: Process We Can Follow

Anomalies in the stock market are important because they can indicate opportunities or risks. For example, a sudden spike in a stock's price could be due to positive news about the company or its industry, which signals a potential investment opportunity. Conversely, an unexpected price drop could warn of underlying issues or market sentiment changes, which signals a risk that investors may need to manage.

Below is the process we can follow for the task of stock market anomaly detection:

+ Gather historical stock market data, including prices (open, high, low, close, adjusted close) and trading volumes.
+ Develop additional features that may help in detecting anomalies, such as moving averages, relative strength index (RSI), or percentage changes over specific periods.
+ Visualize the data to identify potential outliers or unusual patterns across time.
+ Employ statistical methods like Z-score analysis, where data points that are a certain number of standard deviations away from the mean are flagged as anomalies.
+ Use the insights gained from anomaly detection to inform investment decisions, risk management, and strategic planning.
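
As a rough illustration of the percentage-change and Z-score steps listed above, here is a minimal sketch. It runs on synthetic prices rather than real market data, and the function name `flag_zscore_anomalies`, the threshold of 3 standard deviations, and the injected 15% jump are all assumptions made for the example, not part of the workflow described here.

```python
import numpy as np
import pandas as pd


def flag_zscore_anomalies(series, threshold=3.0):
    """Return a boolean Series that is True where |z-score| > threshold.

    Hypothetical helper for illustration only.
    """
    z_scores = (series - series.mean()) / series.std()
    return z_scores.abs() > threshold


# Synthetic daily returns (assumed ~1% daily volatility) with one injected shock
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0, 0.01, 250))
returns.iloc[100] = 0.15  # artificial 15% one-day jump

# Build a price path, then recover daily percentage changes as a feature
prices = 100 * (1 + returns).cumprod()
daily_return = prices.pct_change().dropna()

# Flag days whose return lies more than 3 standard deviations from the mean
flags = flag_zscore_anomalies(daily_return, threshold=3.0)
anomalies = daily_return[flags]
print(anomalies)
```

A global mean and standard deviation are the simplest choice; a rolling window may be more appropriate when volatility changes over time.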



## Collecting Real-time Stock Market Data using Python

Before getting started with stock market anomaly detection, we will collect the most recent year of daily stock market data for several companies using the yfinance API. If you don't already have it, you can install it in your Python virtual environment by running the command below in your terminal or command prompt:

+ pip install yfinance

In [4]:
import pandas as pd
import numpy as np
import yfinance as yf
from datetime import date, timedelta
import matplotlib.pyplot as plt
import seaborn as sns



In [10]:
# define the time period
end_date = date.today().strftime('%Y-%m-%d')
start_date = (date.today() - timedelta(days = 365)).strftime("%Y-%m-%d")

# list of stock tickers to download
tickers = ['AAPL', 'MSFT', 'NFLX', 'GOOG', 'TSLA']

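# auto_adjust=False keeps a separate 'Adj Close' column
# (recent yfinance releases adjust prices by default and drop it)
market_data = yf.download(tickers, start=start_date, end=end_date,
                          progress=False, auto_adjust=False)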

# reset index to bring date into columns for the melt function
market_data = market_data.reset_index()

# melt the dataframe into long format, where each row
# is a unique combination of Date, Ticker, and Attribute
market_data_melted = market_data.melt(id_vars=['Date'], var_name=['Attribute', 'Ticker'])

# pivot the melted data to have attributes (Open, High, Low etc.) as columns
market_data_pivoted = market_data_melted.pivot_table(index = ['Date', 'Ticker'],
                                                    columns = 'Attribute',
                                                    values = 'value',
                                                    aggfunc='first')
# reset index to turn multi-index into columns
stock_Data = market_data_pivoted.reset_index()

stock_Data.head()

| Attribute | Date | Ticker | Adj Close | Close | High | Low | Open | Volume |
|---|---|---|---|---|---|---|---|---|
| 0 | 2023-06-21 | AAPL | 183.236389 | 183.960007 | 185.410004 | 182.589996 | 184.899994 | 49515700.0 |
| 1 | 2023-06-21 | GOOG | 121.122169 | 121.260002 | 123.410004 | 120.860001 | 123.235001 | 22612000.0 |
| 2 | 2023-06-21 | MSFT | 331.567871 | 333.559998 | 337.730011 | 332.070007 | 336.369995 | 25117800.0 |
| 3 | 2023-06-21 | NFLX | 424.450012 | 424.450012 | 434.549988 | 422.540009 | 432.649994 | 5146400.0 |
| 4 | 2023-06-21 | TSLA | 259.459991 | 259.459991 | 276.98999 | 257.779999 | 275.130005 | 211797100.0 |
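
The melt-and-pivot reshaping above could also be sketched with `DataFrame.stack`, which moves one column level into the rows. The example below is only an illustration on a small hand-made frame: the two-level column layout (named `Attribute` and `Ticker` here), the prices, and the volumes are all assumptions, so check the actual column structure returned by your yfinance version before relying on it.

```python
import pandas as pd

# Toy wide-format frame with two-level columns (Attribute, Ticker).
# The values are made up for illustration.
dates = pd.date_range("2024-01-02", periods=3, freq="D", name="Date")
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAPL", "MSFT"]],
    names=["Attribute", "Ticker"],
)
wide = pd.DataFrame(
    [
        [185.0, 370.0, 1000.0, 2000.0],
        [186.0, 371.0, 1100.0, 2100.0],
        [184.0, 369.0, 1200.0, 2200.0],
    ],
    index=dates,
    columns=columns,
)

# Move the Ticker level from the columns into the rows,
# then turn the (Date, Ticker) index back into ordinary columns
long_format = wide.stack(level="Ticker").reset_index()
print(long_format)
```

Each row of `long_format` is one (Date, Ticker) pair with the attributes as columns, which is the same shape as `stock_Data`.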
