<a href="https://colab.research.google.com/github/StevenMW11/stocks-dashboard/blob/main/stocks_dashboard.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Prompt


Create an automated data dashboard which reads data about stock prices, and produces a corresponding chart of the prices over time. The dashboard should be produced by a reusable function, which, if written correctly, can be used in the provided "Stocks Dashboard" section at the very bottom of this notebook. The function will produce a dataviz of the stock prices over time, as well as a summary report of key metrics. 

To satisfy the basic requirements, the program can read data from one of the provided CSV files.  Each CSV file contains some (outdated) stock price data about the company name mentioned in the CSV file name ('AAPL', 'DIS', 'GOOGL', 'MSFT', 'NFLX', 'SBUX', or 'TSLA'). You can assume all these provided CSV files will conform to the same file naming conventions and will have the same columns and row structure.

For further exploration, the program can alternatively fetch real-time stock market data from the Internet. In this case, the dashboard will be able to work with a broader range of stock symbols.



# Evaluation

Submissions will be evaluated according to their ability to meet all requirements (see sections below), as summarized by the following rubric:

  + The **Function and Paramter Requirements** are worth 10%.

  + The **Symbol Validation Requirements** are worth 10%.

  + **Part I (Data Extraction)** is worth 20%. For full points, you're encouraged to fetch real-time data from the Internet instead of using the provided CSV files. If there is an attempt to fetch data from the API, the API Key will need to be handled securely, otherwise compromised keys will lead to security deductions.

  + **Part II (Data Processing)** is worth 30%. For full points, make sure you answer all questions correctly. Each question is worth around the same weight. The last two "further exploration" questions are optional, and deliberately harder / more involved, and may earn bonus points.

  + **Part III (Data Visualization)** is worth 30%. For full points, make sure you produce a polished chart, with title and axis labels and prices formatted as USD. The further exploration challenges may earn bonus points.



# Requirements



## Function and Parameter Requirements

Define a function called `generate_stocks_report()`.

The function should accept a stock symbol as a parameter input, called `symbol`, which is expected to be a string datatype. Example valid invocation: `generate_stocks_report("MSFT")`.

If the symbol parameter is not supplied by the user during function invokation, the function should use `"NFLX"` as the default `symbol`. Example valid invocation: `generate_stocks_report()`.



## Symbol Validation Requirements

If an invalid stock symbol is passed in, the report generation function should gracefully handle any errors and display only a friendly error message like "OOPS, couldn't find that stock. Please check your symbol and try again."

If your solution uses the provided CSV files, the only valid inputs are the symbols corresponding to the provided files. If the user tries to input a symbol that doesn't have a corresponding CSV file, that symbol is invalid.


If your solution uses the AlphaVantage API, the only valid inputs are ones that lead to successful responses. If the user tries to input a symbol that that doesn't lead to a successful corresponding API response, that symbol is invalid.

> HINT: in either case, try wrapping your data extraction method around a `try... except` block, or using conditional logic to detect whether the extraction method has produced the data we need to move forward.

In either case, the program should not try to process data that doesn't exist, and the program should avoid crashed / red cells.

## Part I (Data Extraction Requirements)


**Basic Requirements**

Run the **Setup Cell #1**, below, to download the provided CSV files into the colab filesystem. Optionally download them and inspect them in spreadsheet software to get familiar with the structure. Notice these files represent the price of a given stock on an example day in the past (i.e. outdated data).

> NOTE: you can assume the provided CSV files adhere to the same naming conventions, so for the symbol "NFLX", the corresponding CSV file would be named "daily_adjusted_nflx.csv".

**Further Exploration**

Later, if you'd like to make the dashboard more dynamic, instead of reading the stock data from local CSV file(s), fetch the stock data from the Internet instead. Use the AlphaVantage API's ["daily adjusted" endpoint](https://www.alphavantage.co/documentation/#dailyadj). This is a "premium" endpoint, so first obtain a premium API Key from the professor.

> SECURITY NOTE: Remember to use `getpass` to securely ask for the user's API Key, to keep this credential secure and private. We shouldn't see it's value hard-coded or displayed / printed out anywhere. See **Setup Cell #4** below, for an example of using `getpass`.

## Part II (Data Processing Requirements)

**Basic Requirements**

When invoked, the report generation function should reference the stock data obtained in Part I to display a report of details about the stock, including answers to the questions below.

> NOTE: the provided answers below are for NFLX, so use NFLX when checking your answers...

A) Print the **column names** / available fields (i.e. `['timestamp', 'open', 'high', 'low', 'close', 'adjusted_close', 'volume', 'dividend_amount', 'split_coefficient']`).

B) Print the **number of rows** / available days (i.e. `100`).

C) Print the **latest day** available (i.e. `2021-10-18`).

D) Print the **earliest day** available (i.e. `2021-05-27`).

E) Print the (adjusted) **closing price on the latest day**, formatted as dollars and cents with a dollar sign and two decimal places (i.e. `$637.97`).

F) Print the (adjusted) **closing price on the earliest day**, formatted as dollars and cents with a dollar sign and two decimal places (i.e. `$503.86`).

G) Print the **100-day high price**, formatted as dollars and cents with a dollar sign and two decimal places (i.e. `$646.84`). NOTE: the 100-day high price is equal to the maximum of all the available high prices.


H) Print the **100-day low price**, formatted as dollars and cents with a dollar sign and two decimal places (i.e. `$482.14`). NOTE: the 100-day low price is equal to the minimum of all the available low prices.

**Further Exploration**

I) Print the **percentage change** between the earliest closing price and the latest closing price, as identified in parts G and H, above, formatted with a percent sign and rounded to four decimal places (i.e. `26.6165%`). NOTE: percent change is defined as `(latest - earliest) / earliest`. HINT: leverage the `to_pct` function provided in the setup cell when printing the final percentage.

J) Print the **50 day moving average price for the latest day** (i.e. something around `$581.32`, depending on your methodology). The 50-day moving average for each given day is calculated by averaging the closing prices of the previous 50 daily periods. HINT: create a separate column to store the 50-day moving average for each day, using the [`rolling` method](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html) on the closing prices column, with a window of 50, and then taking the mean of those values.



## Part III (Dataviz Requirements)


When invoked, the report generation function should also display a chart of the closing prices over time, including the following components:

  1. **Chart Title**, which includes the selected stock symbol (i.e. `"Daily Stock Prices (NFLX)"`)
  2. **Axis labels** (i.e. "Closing Price" and "Date" respectively).
  3. **Prices formatted with a dollar signs**, wherever they appear (see axis ticks).


Example:

<img width="1266" alt="Screen Shot 2021-10-21 at 10 08 39 AM" src="https://user-images.githubusercontent.com/1328807/138295257-c285e730-721a-445f-868d-fab55588dab1.png">

> NOTE: It doesn't matter whether the chart comes before or after the stock details.

**Further Exploration**

Consider displaying a [candlestick chart](https://plotly.com/python/ohlc-charts/) instead of a line chart. 

If you do, consider also displaying a 50-day moving average trend line as well. HINT: if you do both, you might need to plot two different graph objects (see [example](https://github.com/prof-rossetti/intro-to-python/blob/main/notes/python/packages/plotly.md#charting-multiple-graph-objects)).



# Setup

In [69]:
# SETUP CELL 1
# ... run this cell to download some CSV files into the colab filesystem
# ... for use in the basic requirements

import os
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import requests #ADDED IN FOR REQUEST BELOW (TO DOWNLOAD CSV'S TO LOCAL ENVIRONMENT)

symbols = ["googl", "msft", "nflx", "dis", "sbux", "aapl", "tsla"]

for symbol in symbols:
    csv_filename = f"daily_adjusted_{symbol}.csv"
    if not os.path.isfile(csv_filename):
        print("DOWNLOADING", "...", csv_filename)

        #ADJUSTED THE FUNCTION TO WRITE TO A CSV LOCALLY - AS WORK WAS DONE IN LOCAL ENVIRONMENT (NOT COLAB)
        file_url = requests.get(f"https://raw.githubusercontent.com/prof-rossetti/intro-to-python/main/data/{csv_filename}")
        # file_url = f"https://raw.githubusercontent.com/prof-rossetti/intro-to-python/main/data/{csv_filename}"
        
        with open(f"daily_adjusted_{symbol}.csv", "wb") as file:
            file.write(file_url.content)
        #!wget -q $file_url

In [63]:
# SETUP CELL 2
# ... run this cell to define a dollar-sign formatting function, so you can use it later

def to_usd(my_price):
    """
    Converts a numeric value to usd-formatted string, for printing and display purposes.
    
    Param: my_price (int or float) like 4000.444444
    
    Example: to_usd(4000.444444)
    
    Returns: $4,000.44
    """
    return f"${my_price:,.2f}" 

#print(to_usd(4.5))
#print(to_usd(200000.9999))

In [64]:
# SETUP CELL 3
# ... run this cell to define a percent-sign formatting function, so you can use it later

def to_pct(my_number):
    """
    Formats a decimal number as a percentage, rounded to 4 decimal places, with a percent sign.
    
    Param my_number (float) like 0.95555555555
    
    Returns (str) like '95.5556%'
    """
    return f"{(my_number * 100):.4f}%"
    
#print(to_pct(0.5))
#print(to_pct(.955555555))

In [65]:
# SETUP CELL 4
# ... uncomment the code below and run this cell to securely ask for an API key
# ... (for use in the further exploration only)

from getpass import getpass
api_key = getpass("Please input your AlphaVantage API Key: ")

# Solution

## Report Generation Function 


In [81]:
#STOCK DATA USING API AND ALPHAVANTAGE

def stock_report_analysis_api_source(symbol):
#GET THE CORRECT DATA (USING SYMBOL) & CREATE PANDAS DF
    url = f"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol={symbol}&apikey={api_key}&datatype=csv"
    df = pd.read_csv(url)
        
    print(f"Stock Report: {symbol}")
    #PRINT COLUMN NAMES 
    headers = df.columns
    header_list = []
    for header in headers:
        header_list.append(header)
    print (f"...Available Information: {header_list}")
    
    #PRINT NUMBER OF ROWS
    record_count = len(df.index)
    print(f"...Total Records: {record_count}")
    
    #PRINT LATEST DAY AVAILABLE - SORT DATAFRAME BY DATE (DESCENDING)
    df.sort_values(by='timestamp', ascending=False)
    last_date = df.iloc[0]['timestamp']
    print(f"...Latest Date: {last_date}")
    
    #PRINT EARLIEST DAY AVAILABLE
    first_date = df.iloc[record_count-1]['timestamp']
    print(f"...Earliest Date: {first_date}")
    
    #PRINT LATEST DAY CLOSING PRICE
    last_date_closing_price = df.iloc[0]['close']
    last_date_closing_price_usd = to_usd(last_date_closing_price)
    print(f"...Adjusted Closing Price on Latest Date ({last_date}): {last_date_closing_price_usd}")
    
    #PRINT EARLIEST DAY CLOSING PRICE
    first_date_closing_price = df.iloc[record_count-1]['close']
    first_date_closing_price_usd = to_usd(first_date_closing_price)
    print(f"...Adjusted Closing Price on Earliest Date ({first_date}): {first_date_closing_price_usd}")
    
    #PRINT 100 DAY HIGHEST/LOWEST PRICE (MAX/MIN)
    hundred_day_max = to_usd(max(df['high']))
    hundred_day_min = to_usd(min(df['low']))
    print(f"...100 Day High: {hundred_day_max}")       
    print(f"...100 Day Low: {hundred_day_min}")       
    
    #PRINT %CHANGE OF CLOSING PRICES
    pct_change = to_pct((int(last_date_closing_price)-int(first_date_closing_price))/(int(first_date_closing_price)))
    print(f"...Closing Price Percentage Change: {pct_change}")         
    
    #CREATE 50 DAY MOVING AVERAGE
    df['moving_average'] = df['close'].rolling(50).mean()
    fifty_day_ma = to_usd(df['moving_average'].mean())
    print(f"...50 Day Moving Average: {fifty_day_ma}")
    
    #CREATE DATA VISUALIZATION THROUGH DATAVIZ
    viz = px.line(data_frame=df,x="timestamp",y="close",title=f"Daily Stock Prices ({symbol})",labels={"timestamp":"Date","close":"Closing Price"})
    viz.update_yaxes(nticks=10)
    viz.update_layout(yaxis_tickprefix = '$', yaxis_tickformat = ',.')
    viz.show()
    return


# use of =None taken from stackoverflow: https://stackoverflow.com/questions/55025149/how-to-check-for-no-argument-sent-to-a-function-in-python

def generate_stocks_report(symbol=None):
    if symbol is None:
        return stock_report_analysis_api_source("NFLX")
    elif type(symbol) == str:
        try:
            stock_report_analysis_api_source(symbol)
        except:
            print ("OOPS, that ticker symbol could not be found. Please review and try again.")
    else:
        print ("OOPS, that ticker symbol could not be found. Please review and try again.")

In [82]:
# todo: define the generate_stocks_report function here

#DEFINE A NEW FUNCTION TO DO ALL THE WORK... SO YOUR CONDITIONAL CHECKING THE SYMBOL INPUT IS CLEAN

#def stock_report_analysis(symbol):
##GET THE CORRECT DATA (USING SYMBOL) & CREATE PANDAS DF
#    stock_data = f"daily_adjusted_{symbol.lower()}.csv"
#    df = pd.read_csv(stock_data)
#        
#    print(f"Stock Report: {symbol}")
#    #PRINT COLUMN NAMES 
#    headers = df.columns
#    header_list = []
#    for header in headers:
#        header_list.append(header)
#    print (f"...Available Information: {header_list}")
#    
#    #PRINT NUMBER OF ROWS
#    record_count = len(df.index)
#    print(f"...Total Records: {record_count}")
#    
#    #PRINT LATEST DAY AVAILABLE - SORT DATAFRAME BY DATE (DESCENDING)
#    df.sort_values(by='timestamp', ascending=False)
#    last_date = df.iloc[0]['timestamp']
#    print(f"...Latest Date: {last_date}")
#    
#    #PRINT EARLIEST DAY AVAILABLE
#    first_date = df.iloc[record_count-1]['timestamp']
#    print(f"...Earliest Date: {first_date}")
#    
#    #PRINT LATEST DAY CLOSING PRICE
#    last_date_closing_price = df.iloc[0]['adjusted_close']
#    last_date_closing_price_usd = to_usd(last_date_closing_price)
#    print(f"...Adjusted Closing Price on {last_date}: {last_date_closing_price_usd}")
#    
#    #PRINT EARLIEST DAY CLOSING PRICE
#    first_date_closing_price = df.iloc[record_count-1]['adjusted_close']
#    first_date_closing_price_usd = to_usd(first_date_closing_price)
#    print(f"...Adjusted Closing Price on {first_date}: {first_date_closing_price_usd}")
#    
#    #PRINT 100 DAY HIGHEST/LOWEST PRICE (MAX/MIN)
#    hundred_day_max = to_usd(max(df['high']))
#    hundred_day_min = to_usd(min(df['low']))
#    print(f"...100 Day High: {hundred_day_max}")       
#    print(f"...100 Day Low: {hundred_day_min}")       
#    
#    #PRINT %CHANGE OF CLOSING PRICES
#    pct_change = to_pct((int(last_date_closing_price)-int(first_date_closing_price))/(int(first_date_closing_price)))
#    print(f"...Closing Price Percentage Change: {pct_change}")         
#    
#    #CREATE 50 DAY MOVING AVERAGE
#    df['moving_average'] = df['adjusted_close'].rolling(50).mean()
#    fifty_day_ma = to_usd(df['moving_average'].mean())
#    print(f"...50 Day Moving Average: {fifty_day_ma}")
#    
#    #CREATE DATA VISUALIZATION THROUGH DATAVIZ
#    viz = px.line(data_frame=df,x="timestamp",y="close",title=f"Daily Stock Prices ({symbol})",labels={"timestamp":"Date","close":"Closing Price"})
#    viz.update_yaxes(nticks=10)
#    viz.update_layout(yaxis_tickprefix = '$', yaxis_tickformat = ',.')
#    viz.show()
#    return
#
## use of =None taken from stackoverflow: https://stackoverflow.com/questions/55025149/how-to-check-for-no-argument-sent-to-a-function-in-python
#
#def generate_stocks_report(symbol=None):
#    if symbol is None:
#        return stock_report_analysis("NFLX")
#    elif type(symbol) == str:
#        if symbol.lower() in symbols:
#            return stock_report_analysis(symbol)
#        else:
#            return "OOPS, that ticker symbol could not be found. Please review and try again."
#    else:
#        return "OOPS, that ticker symbol could not be found. Please review and try again."


In [83]:
# If your function works, we should be able to uncomment the lines below and use it like this:

#generate_stocks_report()
generate_stocks_report("F")

Stock Report: F
...Available Information: ['timestamp', 'open', 'high', 'low', 'close', 'volume']
...Total Records: 100
...Latest Date: 2022-07-28
...Earliest Date: 2022-03-07
...Adjusted Closing Price on Latest Date (2022-07-28): $14.00
...Adjusted Closing Price on Earliest Date (2022-03-07): $15.97
...100 Day High: $17.80
...100 Day Low: $10.61
...Closing Price Percentage Change: -6.6667%
...50 Day Moving Average: $13.74


## Stocks Data Dashboard

If your function works, we should be able to use it in the dashboard below:

 1. Use the dropdown to select a stock symbol.
 2. Run run the cell to generate a report and chart chart of prices over time.

In [80]:
# @title Stock Selection Form
symbol = "AAPL" # @param ['MSFT', 'GOOGL', 'AAPL', 'NFLX', "SBUX", "TSLA", "DIS"]
generate_stocks_report(symbol)

Stock Report: AAPL
...Available Information: ['timestamp', 'open', 'high', 'low', 'close', 'volume']
...Total Records: 100
...Latest Date: 2022-07-28
...Earliest Date: 2022-03-07
...Adjusted Closing Price on Latest Date (2022-07-28): $157.35
...Adjusted Closing Price on Earliest Date (2022-03-07): $159.30
...100 Day High: $179.61
...100 Day Low: $129.04
...Closing Price Percentage Change: -1.2579%
...50 Day Moving Average: $151.80
OOPS, that ticker symbol could not be found. Please review and try again.



# Scratch Work



Optionally use these cells as scratch-work to practice your ability to produce the desired dataviz. 

> NOTE: These practice cells won't be evaluated. Only the report generation function above will be evaluated. So make sure that your final work ends up in the report generation function!



