# Statistical Fundamentals

In this activity, students code along with the instructor to practice using statistical fundamentals to build a report for a group of stocks. For each stock, the report states whether it is overvalued or undervalued and whether it is more or less volatile than the market.

## Importing Required Modules

In [None]:
# Import modules
import pandas as pd
from pathlib import Path
import matplotlib.pyplot as plt

## Loading the Stocks Data

  - Each CSV file contains a stock's closing price and the date of the closing price.

  - Create a `Path` object for each CSV filepath.

In [None]:
# Set paths to CSV files
hd_csv_path = Path("../Resources/HD.csv")
intc_csv_path = Path("../Resources/INTC.csv")
mu_csv_path = Path("../Resources/MU.csv")
nvda_csv_path = Path("../Resources/NVDA.csv")
tsla_csv_path = Path("../Resources/TSLA.csv")
sp500_path = Path("../Resources/sp500.csv")

For each CSV file, read the data into a `pandas` `DataFrame`.

  - Set the index column to be the date.

  - Infer the date time format.

  - Parse all dates when the CSV file is loaded.
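
The loading options above map onto arguments of `pd.read_csv`. The following is a minimal sketch, assuming the date is the first column of each file. The in-memory sample and its column names (`date`, `close`) are made up for illustration and stand in for a real file. In pandas 2.0 and later, `infer_datetime_format` is deprecated because format inference is the default, so it is left out here.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample standing in for a CSV file on disk
sample_csv = StringIO("date,close\n2019-05-14,191.62\n2019-05-15,191.12\n")

# index_col=0 uses the first column (assumed to hold the dates) as the index;
# parse_dates=True asks pandas to parse that index as dates
sample_df = pd.read_csv(sample_csv, index_col=0, parse_dates=True)

print(sample_df)
print(type(sample_df.index).__name__)
```

Passing a `Path` object such as `hd_csv_path` in place of `sample_csv` should work the same way, since `pd.read_csv` accepts path objects.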

In [None]:
# Read in CSV files


## Coding Statistical Measures in Python

### Create a function named `calculate_mean`

   - `calculate_mean` should return the average value for a given `list` or `Series`.

   - $\mu = \frac{\sum{x_{i}}}{n}$

   - Choose a function name that will not conflict with any modules that may have been imported.
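
One possible way to write the function described above is sketched here, using the built-in `sum` and `len`, which work on both a `list` and a `Series`. It assumes the input is non-empty; an empty input would raise `ZeroDivisionError`.

```python
def calculate_mean(data_set):
    """Return the arithmetic mean of a list or Series of numbers."""
    # Sum of all values divided by the number of values
    return sum(data_set) / len(data_set)


print(calculate_mean([1, 2, 3, 4, 5]))  # 3.0
```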

In [None]:
# Create a function named 'calculate_mean'.
def calculate_mean(data_set):
    

In [None]:
# Test the `calculate_mean` function
data = [1, 2, 3, 4, 5]
print(calculate_mean(data))

### Create a function named `calculate_variance`
   - Variance is the average squared deviation from the mean. The formula below is the sample variance, which divides by $n - 1$ rather than $n$.

   - $S^{2} = \frac{\sum{(x_{i} - \mu)^{2}}}{n - 1}$
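
The formula can be translated almost term by term. This sketch computes the mean inline so that it runs on its own; in the notebook, calling `calculate_mean` instead would be the natural choice. It assumes at least two values, since the denominator is $n - 1$.

```python
def calculate_variance(data_set):
    """Return the sample variance (n - 1 denominator) of a list or Series."""
    n = len(data_set)
    mean = sum(data_set) / n
    # Squared deviation of each value from the mean
    squared_deviations = [(value - mean) ** 2 for value in data_set]
    return sum(squared_deviations) / (n - 1)


print(calculate_variance([1, 2, 3, 4, 5]))  # 2.5
```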


In [None]:
# Create a function named 'calculate_variance'.
def calculate_variance(data_set):
    

In [None]:
# Test the `calculate_variance` function
data = [1, 2, 3, 4, 5]
print(calculate_variance(data))

### Create a function named `calculate_standard_deviation`

 - The standard deviation is the square root of the variance.

 - $S = \sqrt{S^{2}}$
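
A minimal sketch of this step follows. The variance is recomputed inline only to keep the example self-contained; in the notebook, the function would more likely call `calculate_variance` and take the square root of the result.

```python
import math


def calculate_standard_deviation(data_set):
    """Return the sample standard deviation of a list or Series."""
    n = len(data_set)
    mean = sum(data_set) / n
    # Sample variance with an n - 1 denominator
    variance = sum((value - mean) ** 2 for value in data_set) / (n - 1)
    return math.sqrt(variance)


print(calculate_standard_deviation([1, 2, 3, 4, 5]))  # about 1.58
```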

In [None]:
# Create a function named 'calculate_standard_deviation'.
def calculate_standard_deviation(data_set):
    

In [None]:
# Test the `calculate_standard_deviation` function
data = [1, 2, 3, 4, 5]
print(calculate_standard_deviation(data))

## Coding Helper Functions

### Create a function named `check_value`

   - The function should compare the most recent price of the asset to its mean price.

   - If the most recent price is greater than the mean price, the asset is overvalued.

   - If the most recent price is less than the mean price, the asset is undervalued.

   - If neither case is true, the most recent price must be equal to the mean price.
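
The three cases above can be sketched as a simple `if`/`elif`/`else` chain. The returned strings are an assumption made for illustration; the activity does not say whether the function should return a label or print a message.

```python
def check_value(current_price, mean_price):
    """Classify an asset by comparing its latest price to its mean price."""
    if current_price > mean_price:
        return "overvalued"
    elif current_price < mean_price:
        return "undervalued"
    else:
        # Neither greater nor less, so the price equals the mean
        return "at the mean price"


print(check_value(105.0, 100.0))  # overvalued
```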

In [None]:
# Create a function to check the most recent price against the mean price to determine if the stock is overvalued or undervalued.
def check_value(current_price, mean_price):
    

### Create a function named `compare_volatility`
   
   - The function should compare the standard deviation of an asset's daily percent change to the market's.

   - If the asset's standard deviation is greater than the market's, the stock is more volatile; otherwise, it is less volatile.
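
A sketch of that comparison follows. As with `check_value`, returning a string label is an assumption. Following the rule as stated, equal standard deviations fall into the "less volatile" branch.

```python
def compare_volatility(stock_std, market_std):
    """Label a stock as more or less volatile than the market."""
    if stock_std > market_std:
        return "more volatile"
    # Covers both a smaller and an equal standard deviation
    return "less volatile"


print(compare_volatility(0.03, 0.01))  # more volatile
```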

In [None]:
# Create a function to compare the volatility with the underlying market
def compare_volatility(stock_std, market_std):
    

## Coding the Stocks Report

### Calculate the Daily Percent Change for the SP500

In [None]:
# Calculate the daily percent changes for sp500 and drop n/a values


### Calculate the Standard Deviation for the SP500

In [None]:
# Calculate the standard deviation for the sp500


### Create a Python Dictionary of Stocks to Run the Report On
   
   - Map each stock name to its `DataFrame`.

   - Do not include the SP500

   - Example: `stocks_to_check = {"stock_name": stock_df}`

In [None]:
# Create a dictionary for all stocks except the sp500


### Generate the Report

  - Loop through the dictionary of stocks.

  - **Hint**: Use the `items()` method for dictionaries. You can read more on the [documentation page](https://docs.python.org/3/tutorial/datastructures.html#looping-techniques).

  - For each stock:
    * Calculate the daily percent change.
    * Get the most recent price.
    * Calculate the mean and standard deviation using the functions you created.
    * Print the stock's name.
    * Print the statistics that you calculated.
    * Using `check_value`, see if the stock is overvalued or undervalued.
    * Using `compare_volatility`, check if the stock is more or less volatile than the SP500.
    * Plot a box plot of the daily percent change.
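
The loop described above can be sketched on made-up data as follows. The tickers `STOCK_A` and `STOCK_B`, the prices, the helper `sample_std`, and the `report` dictionary are all hypothetical. The sketch also assumes that the mean is taken over prices (to match `check_value`) and that the standard deviation is taken over daily percent changes (to match `compare_volatility`). It uses a `Series` of closing prices per stock; with the loaded `DataFrame`s, the closing-price column would be selected first.

```python
import math

import pandas as pd

# Made-up closing prices standing in for the loaded data
market_close = pd.Series([100.0, 101.0, 100.0, 102.0])
stocks_to_check = {
    "STOCK_A": pd.Series([10.0, 12.0, 9.0, 12.0]),
    "STOCK_B": pd.Series([50.0, 50.5, 50.0, 50.5]),
}


def sample_std(values):
    """Sample standard deviation with an n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


# pct_change() leaves NaN in the first row, so dropna() removes it
market_std = sample_std(market_close.pct_change().dropna())

report = {}
for stock_name, prices in stocks_to_check.items():
    daily_returns = prices.pct_change().dropna()
    most_recent_price = prices.iloc[-1]  # last row is the latest price
    mean_price = sum(prices) / len(prices)
    returns_std = sample_std(daily_returns)
    report[stock_name] = {
        "most_recent_price": most_recent_price,
        "mean_price": mean_price,
        "returns_std": returns_std,
        "above_mean": most_recent_price > mean_price,
        "more_volatile": returns_std > market_std,
    }
    print(stock_name, report[stock_name])
```

The box plot is left out so the sketch runs without a display; inside the loop, `daily_returns.plot(kind="box")` is one way to draw it.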

In [None]:
# Loop through the stocks in the dictionary and compare their performance with the sp500.
for stock_name, dataframe in stocks_to_check.items():

    # Calculate the daily percent change for each stock
    

    # Get most recent price
    

    # Calculate the mean price percent change
    
    
    # Calculate the standard deviation of the percent change
    
    
    # Print the stock name and calculated statistics
    

    # Using check_value, check if the stock is overvalued or not
    

    # Compare the stock's volatility with the market
    