# Notes for our stock project

## Goals:

* We will create a program that imports the stock data for a given, user-entered ticker.
* This stock data can be shown in graphs, or specific variables can be retrieved on request. The program will automatically display the most relevant statistics.
* We will then build a function that correlates two tickers based on their historical movements.
* Finally, given a risk preference of low, medium, or high & a given $ amount to invest, we will have a function that automatically builds a portfolio for an investor. There could also be some detailed information on the portfolio's composition, covering each stock and its relevant stats.

## STEP 0: Interface

Before beginning all my data collection & manipulation, I wanted to work on the initial interface where the user would be given information about the program, could input a ticker to collect & review its data, and could build their own portfolio. Thus, the initial phases of this project were simply creating a skeleton function for each piece: the intro, the "main" page, and the two main functions for the tickers & the portfolio builder.


In [5]:
def main():
    print("Hello!")
    print("For info about the program, type: 'info'")
    print("To look up different assets' information, type: 'tick'")
    print("To build a diversified portfolio, type: 'build'")
    print("To exit at any time, type: 'exit'")

    while True:
        program_info = input()
        if program_info == "info":
            info()
        elif program_info == "tick":
            tick()
        elif program_info == "build":
            build()
        elif program_info == "exit":
            print("Have a great day!")
            return False
        
def info() -> None:
    print("This program will allow you to do the following: ")
    print("- See different assets, their relevant statistics, and accompanying graphs")
    print("- Cross compare multiple tickers & see their approximated correlation")
    print("- Based on an inputted risk preference of 'low', 'medium', or 'high' and a given $ amount, this program will auto-generate a diversified portfolio for you.")

## STEP 1.1: Retrieving Stock Data

We make use of the third-party yfinance library. However, I quickly found out there were a lot of issues when trying to use `.info` on any given ticker, which made the initial phase of the project much more difficult than anticipated. I browsed the internet to read about different ways to download stock data, and `yf.download()` and the `.history()` method seemed to be the most effective choices. The first part of this is shown below, where the two functions allow someone to input a ticker. The difficult part I had to solve was handling a ticker that does not exist on the stock market; "APAL" is an example.

In [6]:
import yfinance as yf

def tick() -> None:
    ticker: str = input("Enter 1 or more tickers, separated by commas: ")
    if tick_tester(ticker):
        tickers = undelimit(ticker)
        statistics(tickers)
        stock_data(tickers)
    else:
        print(f"'{ticker}' is not a valid ticker symbol.")


def tick_tester(given_tick: str) -> bool:
    start_date = '2022-03-14'
    end_date = '2023-03-14'
    list_of_tickers = undelimit(given_tick)
    for ticker_symbol in list_of_tickers:
        try:
            ticker = yf.Ticker(ticker_symbol.strip())
            df = ticker.history(start=start_date, end=end_date)
            if df.empty:
                return False
        except Exception:  # treat any lookup failure as an invalid ticker
            return False
    return True

This code was fine to start with, but I quickly realized after creating it that I would need to test multiple inputted tickers so that an investor can view them all in a table, side by side. Thus, I had to rework the code to convert the given input (a single string) into a list of strings, so that I can go through each ticker and determine whether it exists in yfinance, same as before. The `undelimit` function is one from my comp110 class that we made & reviewed on a quiz (its definition is not reproduced here). I struggled a lot with this part initially, but once I realized I had to convert the `str` to a `list[str]`, I was able to tackle it much more quickly.
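
Since the class version of `undelimit` is not shown, here is a minimal sketch of what such a helper might look like. The name `undelimit_sketch`, the default comma delimiter, and the choice to uppercase each symbol are all assumptions for illustration, not the original comp110 code.

```python
def undelimit_sketch(raw: str, delimiter: str = ",") -> list[str]:
    """Split a delimited string of tickers into a clean list of symbols.

    Hypothetical stand-in for the course's `undelimit` helper.
    """
    symbols: list[str] = []
    for piece in raw.split(delimiter):
        # Assumption: surrounding whitespace is ignored and case is normalized
        symbol = piece.strip().upper()
        if symbol:  # skip empty pieces, e.g. from a trailing comma
            symbols.append(symbol)
    return symbols


print(undelimit_sketch("aapl, MSFT ,goog,"))  # ['AAPL', 'MSFT', 'GOOG']
```

Dropping empty pieces means an input like `"AAPL,"` yields one ticker rather than one ticker plus an empty string that would later fail validation.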

## STEP 2: Relevant statistics, graphs, & display

In the second part of this phase, my goal was to have the program print all of a stock's relevant statistics to an investor by default, and retrieve/exclude certain stats if directed to do so. Thus, the first step was to determine which default stats I wanted a prospective investor to see. I knew from my financial economics class that standard deviation, expected return, and risk premium were all very important.

Further, I went searching online to see what other statistics might be of relevance. I determined for the default settings, we would also include: (list...)

This initial function takes in the inputted tickers (which we have already determined to exist, see code above) & gives the user a default table of the statistics most relevant to an investor.

In [9]:
def list_of_lists(*args):
    return [list(item) for item in zip(*args)]

def statistics(x: list[str]) -> None:
    """Given a list of tickers, we can gather all their relevant stats and put them through list_of_lists."""

def stock_data(tickers: list[str]) -> None:
    """Given tickers, create default statistics displayed."""
    list_of_lists(tickers)


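For calculating expected return, my old financial economics notes tell me that we need to sum, over all scenarios, each scenario's holding-period return (HPR) weighted by its probability:
* $E[r] = \sum_i p_i \cdot \mathrm{HPR}_i$, where $\mathrm{HPR}_i = \dfrac{P_{\text{end},i} - P_{\text{begin}} + D_i}{P_{\text{begin}}}$ (ending share price for the year minus beginning share price, plus dividend, all over beginning price) and $p_i$ is the probability of scenario $i$.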

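For calculating standard deviation, we need to take the square root of the probability-weighted sum of squared deviations from the expected return:
* $\sigma = \sqrt{\sum_i p_i \, (\mathrm{HPR}_i - E[r])^2}$, where $p_i$ is the probability of scenario $i$, $\mathrm{HPR}_i$ is its holding-period return, and $E[r]$ is the expected return across all scenarios (a single value, not one per scenario).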
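
The two scenario-based formulas can be sketched in plain Python as follows. The function names and the three scenarios (probabilities, prices, dividends) are made up purely for illustration; they are not data from this project.

```python
import math


def holding_period_return(begin_price: float, end_price: float, dividend: float = 0.0) -> float:
    """HPR = (ending price - beginning price + dividend) / beginning price."""
    return (end_price - begin_price + dividend) / begin_price


def expected_return(scenarios: list[tuple[float, float]]) -> float:
    """Probability-weighted sum of HPRs; each scenario is (probability, hpr)."""
    return sum(p * hpr for p, hpr in scenarios)


def standard_deviation(scenarios: list[tuple[float, float]]) -> float:
    """Square root of the probability-weighted squared deviations from E[r]."""
    mean = expected_return(scenarios)
    variance = sum(p * (hpr - mean) ** 2 for p, hpr in scenarios)
    return math.sqrt(variance)


# Illustrative scenarios only: (probability, HPR)
scenarios = [
    (0.25, holding_period_return(100.0, 126.0, 4.0)),  # HPR = 0.30
    (0.50, holding_period_return(100.0, 108.0, 2.0)),  # HPR = 0.10
    (0.25, holding_period_return(100.0, 80.0, 0.0)),   # HPR = -0.20
]

print(f"Expected return:    {expected_return(scenarios):.4f}")
print(f"Standard deviation: {standard_deviation(scenarios):.4f}")
```

Note that the probabilities are expected to sum to 1; the sketch does not check this.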

## STEP 3: Correlation function

In this step, the goal was to create a correlation function that uses Pearson's correlation coefficient to determine the correlation between two tickers, as a step toward creating a diversified portfolio. Using historical data & comparing statistics gives an approximation of the correlations.

In [None]:
import numpy as np

def pearson_correlation(x, y):
    """Calculate Pearson correlation coefficient between two variables."""
    x = np.array(x).astype(float)
    y = np.array(y).astype(float)
    N = len(x)
    mean_x = np.mean(x)
    mean_y = np.mean(y)
    # Use the sample standard deviation (ddof=1) so it matches the (N - 1) in the denominator
    std_x = np.std(x, ddof=1)
    std_y = np.std(y, ddof=1)
    num = np.sum((x - mean_x) * (y - mean_y))
    denom = (N - 1) * std_x * std_y
    return num / denom
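
`pearson_correlation` expects two equal-length numeric series, but these notes do not say which series to feed it. One common choice, assumed here, is to correlate period-over-period returns rather than raw prices. The sketch below shows that preparation step along with a dependency-free Pearson calculation that can serve as a cross-check; the helper names and the toy prices are hypothetical.

```python
import math


def simple_returns(prices: list[float]) -> list[float]:
    """Convert a price series into period-over-period simple returns."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def pearson_plain(x: list[float], y: list[float]) -> float:
    """Pearson correlation in plain Python (no NumPy), for cross-checking."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(ss_x * ss_y)


# Toy closing prices (illustrative only); prices_b is prices_a scaled by 0.5,
# so the two return series move together and the result should be close to 1.
prices_a = [100.0, 102.0, 101.0, 105.0, 107.0]
prices_b = [50.0, 51.0, 50.5, 52.5, 53.5]

returns_a = simple_returns(prices_a)
returns_b = simple_returns(prices_b)
print(round(pearson_plain(returns_a, returns_b), 6))
```

Because the (N - 1) factors cancel, this form gives the same coefficient as the sample-based formula without needing to choose a `ddof`.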

## STEP 4: Portfolio Generator

The final part of this project was to create a diversified portfolio generator based on an individual's risk preference of low, medium, or high, & the $ amount they want to invest. The previous correlation function is instrumental in this step because, to create an optimized portfolio, we want stocks with minimal correlation to one another.
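
As a rough sketch of the "minimal correlation" idea, the code below picks the group of tickers with the lowest average pairwise correlation and splits the investment equally among them. Everything here is an assumption for illustration: the function names, the brute-force search, the equal-dollar split, the placeholder tickers, and the made-up correlation values. How a low, medium, or high risk preference should change the selection is left open, since these notes do not define it.

```python
from itertools import combinations


def least_correlated_subset(corr: dict[frozenset, float], tickers: list[str], k: int) -> tuple[str, ...]:
    """Return the k tickers (k >= 2) with the lowest average pairwise correlation.

    Brute force over all combinations, so only suitable for small ticker lists.
    """
    def average_correlation(group: tuple[str, ...]) -> float:
        pairs = list(combinations(group, 2))
        return sum(corr[frozenset(pair)] for pair in pairs) / len(pairs)

    return min(combinations(tickers, k), key=average_correlation)


def equal_dollar_split(tickers: tuple[str, ...], amount: float) -> dict[str, float]:
    """Allocate the same dollar amount to every selected ticker."""
    share = round(amount / len(tickers), 2)
    return {ticker: share for ticker in tickers}


# Placeholder tickers and made-up correlations, keyed by unordered pair
tickers = ["AAA", "BBB", "CCC", "DDD"]
corr = {
    frozenset(("AAA", "BBB")): 0.9,
    frozenset(("AAA", "CCC")): 0.2,
    frozenset(("AAA", "DDD")): 0.1,
    frozenset(("BBB", "CCC")): 0.3,
    frozenset(("BBB", "DDD")): 0.4,
    frozenset(("CCC", "DDD")): -0.1,
}

picked = least_correlated_subset(corr, tickers, k=3)
print(picked)
print(equal_dollar_split(picked, 3000.0))
```

In the real program the `corr` values would come from the correlation function in STEP 3 rather than being typed in by hand.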