# Chapter 7 Notes

- This chapter will focus on developing a workflow to pull and analyze financial data
- The material in this chapter will rely on the object-oriented programming functionality of Python
  - Classes, decorators, etc. will be developed to accomplish the goal of analyzing the data
    - The main classes built include the following:
      - `StockReader`
      - `Visualizer` (multiple)
      - `StockAnalyzer`
      - `StockModeler`
    - These classes will be housed in a Python package
    - Each class will have a single purpose; this practice makes each class easier to build
- Building a Python package
  - Packages are good coding practice as they allow for modular code and reuse
  - A module is a single file of Python code that can be imported
  - A package is a collection of modules organized into directories
    - We've imported packages as a whole, but we've also imported specific modules from packages without importing the entire thing
  - To turn a module into a package, the following steps are performed:
    1. Create a directory with the name of the package
    2. Place the modules in the new directory
    3. Add an `__init__.py` file containing any Python code to run upon package import
    4. Make a `setup.py` file at the same level as the package's top-level directory
       - This will give `pip` instructions on how to install the package
  - A package can contain as many subpackages as you want
    - Subpackages are set up the same way as the main package, just without a `setup.py` file
    - There is an example directory hierarchy for a package on Pg. 398
  - It's also helpful to do the following:
    - Include a README file for the package repository
    - Lint the code so it conforms to standards and is checked for errors
    - Add tests to make sure changes don't break the code
- Stock Analysis Package
  - The stock analysis package contains many classes, covering multiple facets of financial analysis
  - Pg. 399 to 400 discuss UML diagrams and show the diagram for the stock analysis package
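
The four packaging steps above can be tried out without installing anything. The following is a minimal sketch, assuming a throwaway package built in a temporary directory; the names `demo_pkg`, `greetings`, and `hello()` are hypothetical and not part of the chapter's `stock_analysis` package. The `setup.py` file is only written to show where it sits; it is never executed here.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# All names here (demo_pkg, greetings, hello) are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Steps 1 and 2: a directory named after the package, holding a module.
    package_dir = root / "demo_pkg"
    package_dir.mkdir()
    (package_dir / "greetings.py").write_text(
        "def hello():\n    return 'hello from demo_pkg'\n"
    )

    # Step 3: __init__.py runs on import; here it re-exports hello().
    (package_dir / "__init__.py").write_text("from .greetings import hello\n")

    # Step 4: setup.py sits beside the package directory and tells pip
    # what to install. It is only written here, never executed.
    (root / "setup.py").write_text(
        "from setuptools import setup, find_packages\n"
        "setup(name='demo_pkg', version='0.1', packages=find_packages())\n"
    )

    # Import straight from the temporary directory instead of installing.
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo_pkg = importlib.import_module("demo_pkg")
        message = demo_pkg.hello()
        layout = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))
    finally:
        sys.path.remove(tmp)

print(message)
print(layout)
```

In a real project the same layout would live in a repository, and `pip install -e .` run next to `setup.py` would make the package importable from anywhere.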

### Stock Analysis Package Breakdown

- Collecting financial data
  - The StockReader class
    - Web scraping is used to extract data from the HTML page itself
    - Data for various assets across the same date range will be collected, so it makes sense to create a class to prevent repeat code
      - These assets include bitcoin, stocks, and stock market indices
    - The UML diagram for this class can be found on pg. 401
    - The code for this class can be found in `stock_reader.py`
      - The first line of the module is the docstring for the module
        - This is pulled up when `help()` is called on the module and helps describe the purpose of the module
      - The imports follow next, following PEP 8 style guidelines
        - Standard library imports are first, third-party libraries second, and relative imports from another module in the `stock_analysis` package last
      - The definition of the `StockReader` class comes next
        - The class also has a docstring
      - The first item defined in the class is a dictionary, `_index_ticker`, mapping indices to their tickers
        - This attribute starts with an underscore to mark it as private by convention, so it won't be listed when attributes and methods of the class are displayed
        - `_index_ticker` isn't defined in the `__init__()` method as we don't want it to be initialized for every object created, but shared between all objects
        - To see what is available in `_index_ticker` without giving users the ability to modify it, a property for the keys in `_index_ticker` will be created
          - A property takes the form of a decorator: a function that wraps around other functions, allowing extra code to run before and/or after the inner function executes
      - The next item is an `__init__()` method which initializes an object when using the class
        - This `__init__()` method holds on to the start and end dates of the data we're interested in
          - The dates are parsed so that any date separator can be used
          - A `ValueError` is raised if the start date is equal to or after the end date
      - Decorators are used frequently in this class
        - Decorators begin with '@' signs and are placed above the function or method definition
        - The `@label_sanitizer` decorator is designed to reformat column names into one consistent format, removing any character that is neither a word character nor whitespace
          - A private function that simply cleans an input string is set up first; the decorator then uses it to clean column names specifically
          - In addition, the `@wraps` decorator from the `functools` module in the standard library is used to give the decorated function/method the same docstring it had beforehand
      - A class method was used in this module to get the tickers, as they are stored as class variables
        - The class method receives `cls` as the first argument while instance methods, the typical method, receive `self`
        - A `try`/`except` block is used to add information to raised errors
      - The foundation method is `get_ticker_data()`
        - This pulls financial data from Yahoo! Finance
        - The method relies on the `pandas_datareader` package, but updates to the Yahoo! Finance API can cause issues, so check the documentation when using it
        - Note that this method has the `@label_sanitizer` decorator on it
      - The next important method is `get_index_data()`
        - This looks up an index's ticker and calls the `get_ticker_data()` method on it
        - The decorator applied to `get_ticker_data()` also applies to this method
      - Bitcoin data is available through Yahoo! Finance and can be retrieved using the `get_bitcoin_data()` method
        - A currency code must be input
        - This method also relies on the `get_ticker_data()` method
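
The pieces described for `StockReader` (a shared class attribute, a read-only property, a class method, date parsing with validation, and a label-cleaning decorator) can be sketched together without touching the network. This is an illustration under stated assumptions, not the code in `stock_reader.py`: `ReaderSketch` is a hypothetical stand-in, the ticker symbols are illustrative, the returned prices are made up, and a plain dictionary of columns stands in for the dataframe the real method returns.

```python
import re
from functools import wraps


def _sanitize_label(label):
    """Drop characters that are neither word characters nor whitespace, then snake_case."""
    return re.sub(r"[^\w\s]", "", label).lower().replace(" ", "_")


def label_sanitizer(method):
    """Decorator that cleans the column labels of whatever the method returns."""

    @wraps(method)  # keep the wrapped method's name and docstring
    def method_wrapper(self, *args, **kwargs):
        columns = method(self, *args, **kwargs)
        return {_sanitize_label(name): values for name, values in columns.items()}

    return method_wrapper


class ReaderSketch:
    """Hypothetical stand-in for StockReader; no data is downloaded."""

    # Class attribute: one dictionary shared by every instance.
    _index_ticker = {"S&P 500": "^GSPC", "Dow Jones": "^DJI", "NASDAQ": "^IXIC"}

    def __init__(self, start, end):
        # Dropping every non-digit lets '2019-01-01' and '2019/01/01' both work.
        self.start, self.end = (re.sub(r"\D", "", date) for date in (start, end))
        if self.start >= self.end:
            raise ValueError("`start` must be before `end`")

    @property
    def available_tickers(self):
        """Read-only list of the supported index names."""
        return list(self._index_ticker.keys())

    @classmethod
    def get_index_ticker(cls, index):
        """Look up a ticker on the class itself (note `cls`, not `self`)."""
        try:
            return cls._index_ticker[index]
        except KeyError as err:
            # Re-raise with a more helpful message.
            raise KeyError(f"Unknown index: {index!r}") from err

    @label_sanitizer
    def get_ticker_data(self, ticker):
        """Return made-up price columns for a ticker."""
        return {"Adj Close": [1.0, 2.0], "Volume (M)": [10, 20]}


reader = ReaderSketch("2019-01-01", "2019/12/31")
columns = reader.get_ticker_data("AAPL")
print(list(columns))
print(ReaderSketch.get_index_ticker("S&P 500"))
```

Because of `@wraps`, `help(ReaderSketch.get_ticker_data)` still shows the method's own docstring rather than the wrapper's.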

- Exploratory data analysis
  - The Visualizer class family
    - This group is designed to develop visualizations of the data
    - The module starts with the docstring and imports
    - The base class is `Visualizer`
      - There are two subclasses under this, `AssetGroupVisualizer` and `StockVisualizer`
      - This class holds the data for visualization, which is stored by the `__init__()` method
      - The `add_reference_line()` method, a static method, is defined to add reference lines to plots
        - Static methods don't depend on the class or its instances for data
        - Takes scalar x and/or y values
      - The `shade_region()` static method adds a shaded region to a plot
        - Takes tuple x or y values, with the min and max bounds for either the horizontal or vertical box
      - The `_iter_handler(items)` static method creates a list out of an item that isn't a list or tuple, for easy plotting
      - The next method defined is the `_window_calc()` abstract method
        - This method has no implementation; instead, subclasses override it with their own
        - The next two methods, `moving_average()` and `exp_smoothing()`, use this abstract method to add moving average lines and exponentially smoothed moving average lines to the plot, respectively
        - Since `_window_calc()` isn't implemented here, each subclass provides its own implementation and inherits `moving_average()` and `exp_smoothing()` from the parent without overriding them
      - The subclasses of `Visualizer` will inherit, override, or define unique methods for their own implementation
    - The first subclass of `Visualizer` is `StockVisualizer`, used to handle single assets only
      - `StockVisualizer` doesn't override the `__init__()` method of the parent class
      - Implementations for the needed methods will be added or overridden
      - The methods in this subclass include:
        - `evolution_over_time()`
          - Overrides the abstract method defined in the parent class, creating a line plot
        - `candlestick()`
          - Unique to the subclass, creating candlestick plots from an individual asset
        - `after_hours_trades()`
          - Helps visualize the effects of after-hours trading on individual assets, with bars colored red for losses and green for gains
        - `fill_between()`
          - A static method that uses `plt.fill_between()` to color the area between two curves
          - The color is dependent on which curve is higher
        - `open_to_close()`
          - Uses the `fill_between()` static method to visualize the daily difference between opening and closing price, coloring the area green if the closing price is higher than the opening and red vice versa
        - `fill_between_other()`
          - Uses `fill_between()` again to visualize the difference between the asset used to create the visualization and another asset, with the area colored green when the visualizer's asset is higher
        - `_window_calc()`
          - Overrides the abstract method defined in the parent class
          - Uses the `pipe()` and `_iter_handler()` methods to add reference lines based on window calculations for a single asset
        - `jointplot()`
          - Shows the relationship between two assets
          - Based on the `jointplot()` function from `seaborn`
        - `correlation_heatmap()`
          - Another way of showing relationships between two assets
          - Creates a matrix for the `sns.heatmap()` function and uses a mask to only show the diagonals with correlation coefficients
          - Uses the daily percentage change to account for differences in scale
      - Examples of `StockVisualizer` class usage are provided at the end of this section
    - The second subclass of `Visualizer` is `AssetGroupVisualizer`, used to visualize groups of assets in a single dataframe
      - `AssetGroupVisualizer` overrides the `__init__()` method for its own use and calls the `__init__()` method of the superclass as well
        - The new `__init__()` method tracks the column used for the `groupby()` operation
      - The methods defined in this subclass include:
        - `evolution_over_time()`
          - Plots the same column for all assets in the group in a single plot for comparison
          - Uses `seaborn` as the data comes in different shapes
        - `_get_layout()`
          - Automatically determines a reasonable subplot layout
          - This wasn't necessary in the previous subclass as it was only plotting a single asset
        - `_window_calc()`
          - Helper method for plotting a series and adding reference lines using a window calculation
        - `after_hours_trades()`
          - Overrides the parent class method to visualize the effects of after-hours trading on a group of assets using subplots
        - `pairplot()`
          - Overrides the parent class method to show relationships between closing prices across the assets in a group
        - `heatmap()`
          - Generates a heatmap of the correlations between closing prices of all assets in a group
          - Handles differences in scale between assets, as some assets will have significant differences in prices
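
The relationship between `_window_calc()`, `moving_average()`, and the subclasses can be sketched as follows. This is a simplified illustration, not the module's code: the class names are hypothetical, a dictionary of lists stands in for the dataframe, and the method returns the computed lines instead of drawing them, so no plotting library is assumed.

```python
class VisualizerSketch:
    """Hypothetical base class: holds the data and defines the shared methods."""

    def __init__(self, data):
        self.data = data

    @staticmethod
    def _iter_handler(items):
        """Wrap a single item in a list so callers can always loop."""
        if not isinstance(items, (list, tuple)):
            items = [items]
        return items

    def _window_calc(self, column, periods, name, func):
        """Abstract: each subclass decides how the window lines are produced."""
        raise NotImplementedError("To be implemented by subclasses.")

    def moving_average(self, column, periods):
        """Defined once here; relies on the subclass's _window_calc()."""
        return self._window_calc(
            column, periods, name="MA", func=lambda window: sum(window) / len(window)
        )


class SingleAssetSketch(VisualizerSketch):
    """Hypothetical subclass for one asset; only the abstract method is overridden."""

    def _window_calc(self, column, periods, name, func):
        series = self.data[column]
        lines = {}
        for period in self._iter_handler(periods):
            # Apply func to every full window of `period` consecutive values.
            lines[f"{period}D {name}"] = [
                func(series[end - period + 1 : end + 1])
                for end in range(period - 1, len(series))
            ]
        return lines


viz = SingleAssetSketch({"close": [1, 2, 3, 4, 5]})
one_line = viz.moving_average("close", 2)
two_lines = viz.moving_average("close", [2, 3])
print(one_line)
print(two_lines)
```

Calling `moving_average()` on the base class directly raises `NotImplementedError`, which is what signals that the method must be supplied by a subclass.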

- Technical analysis of financial instruments
  - The Analysis class family
    - This module is designed to calculate metrics to compare various assets to each other
    - The first class is `StockAnalyzer`
      - For technical analysis of a single asset
      - An instance of this class will be initialized with the data for the asset on which we want to perform a technical analysis
        - The `__init__()` method will need to accept the data as a parameter
      - A series of properties are defined to make pulling specific values from the data easier. These include:
        - `close`
          - Simplifies access to an asset's closing price
        - `pct_change`
          - Gets the percent change of the close column
        - `pivot_point`
          - Calculates the pivot point
        - `last_close`
          - Gets the value of the last close in the data
        - `last_high`
          - Gets the value of the last high in the data
        - `last_low`
          - Gets the value of the last low in the data
        - `_max_periods`
          - Returns the number of rows in the data
      - Next, methods are defined to calculate metrics using the properties above
        - `resistance()` and `support()`
          - Support and resistance can be calculated at one of three levels, as specified; the default is 1
            - Level 1 is closest to the closing price and the most restrictive
            - Level 3 is furthest from the closing price and the least restrictive
      - The following few methods are used to assess asset volatility
        - `daily_std()`
          - Calculates and returns the standard deviation of the percent change
          - Max `periods` value is 252
        - `annualized_volatility()`
          - Multiplies daily standard deviation by the square root of the number of trading periods in the year
        - `volatility()`
          - Rolling volatility is calculated for shorter trading periods
        - `corr_with()`
          - Calculates correlations between daily percentage change
        - `cv()`
          - Used for comparing the level of dispersion of assets
          - Calculates the coefficient of variation for an asset
        - `qcd()`
          - Calculates the quantile coefficient of dispersion
        - `beta()`
          - Quantifies the volatility of an asset compared to an index
          - Ratio of the covariance of the asset's return and the index's return to the variance of the index's return
          - The benchmark index for comparison is user supplied
        - `cumulative_returns()`
          - Calculates cumulative returns of an asset as a series
          - Defined as the cumulative product of one plus the percent change in closing price
        - `portfolio_return()`
          - Defined as a static method since it also needs to be calculated for an index, not just for the data stored in `self.data`
          - It is assumed that there is no distribution per share, therefore the return is calculated as the percent change from the starting price to ending price over the time period covered by the data
        - `alpha()`
          - Compares the returns of an asset to those of an index
          - Uses the risk-free rate of return, which is the rate of return of an investment that has no risk of financial loss, e.g. U.S. Treasury Bills
          - Requires calculating the portfolio return of the index and the asset along with beta
        - `is_bear_market()`
          - Determines if an asset is in a bear market
          - A bear market is a decline of 20% or more in a stock price over the last two months
        - `is_bull_market()`
          - Determines if an asset is in a bull market
          - A bull market is the opposite of a bear market, with the same increase over the same period
        - `sharpe_ratio()`
          - Calculates the return received in excess of the risk-free rate of return for the volatility taken on with an investment
    - The second class is `AssetGroupAnalyzer`
      - For technical analysis of a group of assets
      - Handles running the `StockAnalyzer` calculations for a group of assets
      - Shares much of the functionality of `StockAnalyzer` through composition
        - Composition: when an object contains instances of other classes
        - This greatly simplifies the `AssetGroupAnalyzer` class
      - Class input is the dataframe for the assets and the name of the grouping column
      - The only public method is `analyze()`, which runs the named `StockAnalyzer` method on every asset in the group
        - The built-in `getattr()` function lets `analyze()` look up a `StockAnalyzer` method by its name
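
Several of the metrics above, along with the composition pattern used by `AssetGroupAnalyzer`, can be sketched with the standard library alone. This is a hedged illustration rather than the package's implementation: the class names are hypothetical, lists of closing prices stand in for the dataframe, the prices are made up, and 252 trading days per year is assumed for annualizing.

```python
import math
import statistics


class AnalyzerSketch:
    """Hypothetical stand-in for StockAnalyzer, working on a list of closing prices."""

    def __init__(self, close):
        self.close = close

    @property
    def pct_change(self):
        """Daily percent change of the closing price."""
        return [curr / prev - 1 for prev, curr in zip(self.close, self.close[1:])]

    def annualized_volatility(self):
        """Daily standard deviation scaled by the square root of 252 trading days."""
        return statistics.stdev(self.pct_change) * math.sqrt(252)

    def cumulative_returns(self):
        """Running product of (1 + daily percent change)."""
        total, returns = 1.0, []
        for change in self.pct_change:
            total *= 1 + change
            returns.append(total)
        return returns

    def beta(self, index_close):
        """Covariance of asset and index returns over the variance of index returns."""
        asset = self.pct_change
        index = AnalyzerSketch(index_close).pct_change
        asset_mean, index_mean = statistics.mean(asset), statistics.mean(index)
        covariance = sum(
            (a - asset_mean) * (i - index_mean) for a, i in zip(asset, index)
        ) / (len(asset) - 1)
        return covariance / statistics.variance(index)

    @staticmethod
    def portfolio_return(close):
        """Percent change from the first price to the last (no distributions assumed)."""
        return (close[-1] - close[0]) / close[0]


class GroupAnalyzerSketch:
    """Hypothetical stand-in for AssetGroupAnalyzer: composed of AnalyzerSketch objects."""

    def __init__(self, closes_by_asset):
        self.analyzers = {
            name: AnalyzerSketch(close) for name, close in closes_by_asset.items()
        }

    def analyze(self, func_name, **kwargs):
        """Run the named AnalyzerSketch method on every asset in the group."""
        if not hasattr(AnalyzerSketch, func_name):
            raise ValueError(f"AnalyzerSketch has no '{func_name}' method.")
        return {
            name: getattr(analyzer, func_name)(**kwargs)
            for name, analyzer in self.analyzers.items()
        }


index_close = [100.0, 102.0, 101.0, 105.0]
group = GroupAnalyzerSketch(
    {"index_copy": index_close, "other": [50.0, 50.5, 50.0, 52.5]}
)
betas = group.analyze("beta", index_close=index_close)
growth = group.analyze("cumulative_returns")
print(betas)
```

An asset that moves exactly like the index has a beta of 1, which is why `index_copy` is included as a sanity check.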

- Model performance using historical data
  - `StockModeler` class
    - Helps serve as the middleman to the `statsmodels` package
    - Has no attributes and is a static class
    - Raises an error if an attempt to instantiate is made
    - The methods created in this class are as follows:
      - `decompose()`
        - Decompose closing price of a stock into trend, seasonal, and remainder components
        - Utilizes `seasonal_decompose()` from `statsmodels`
      - `arima()` and `arima_predictions()`
        - `arima()` creates an ARIMA model using a `statsmodels` ARIMA object
        - `arima_predictions()` evaluates the ARIMA model's predictions
      - `regression()` and `regression_predictions()`
        - `regression()` builds a linear regression model for the closing price of a stock with a lag of 1 and `regression_predictions()` evaluates the model's predictions
        - Uses the `statsmodels` package
      - `plot_residuals()`
        - Plots the residuals for a model and is used to visualize errors in the ARIMA and linear regression predictions
  - Time series decomposition
    - Time series can be decomposed into trend, seasonal, and remainder components with the `statsmodels` package and `StockModeler.decompose()`
  - ARIMA
    - Review statistics and documentation for ARIMA modeling
  - Linear Regression
    - Like ARIMA, review statistics and documentation
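
The "static class" idea and the lag-1 regression can be sketched as below. This is an illustration under stated assumptions and not the `StockModeler` code: the class and its behavior are hypothetical, ordinary least squares is written out by hand in place of the `statsmodels` call the real class uses, an intercept term is assumed, and the prediction helper simply feeds each prediction back in as the next input.

```python
class ModelerSketch:
    """Hypothetical stand-in for StockModeler: a namespace of static methods."""

    def __init__(self):
        # Instantiation is blocked; every method is called on the class.
        raise NotImplementedError("This class is used statically; don't instantiate it.")

    @staticmethod
    def regression(close):
        """Fit close[t] = intercept + slope * close[t - 1] by least squares."""
        lagged, current = close[:-1], close[1:]
        n = len(lagged)
        lag_mean, cur_mean = sum(lagged) / n, sum(current) / n
        slope = sum(
            (x - lag_mean) * (y - cur_mean) for x, y in zip(lagged, current)
        ) / sum((x - lag_mean) ** 2 for x in lagged)
        intercept = cur_mean - slope * lag_mean
        return intercept, slope

    @staticmethod
    def regression_predictions(model, last_close, horizon):
        """Roll the fitted line forward, using each prediction as the next input."""
        intercept, slope = model
        predictions = []
        for _ in range(horizon):
            last_close = intercept + slope * last_close
            predictions.append(last_close)
        return predictions


# Made-up prices that double every period, so the fit should find slope 2.
model = ModelerSketch.regression([1.0, 2.0, 4.0, 8.0, 16.0])
predictions = ModelerSketch.regression_predictions(model, last_close=16.0, horizon=2)
print(model)
print(predictions)
```

Grouping the functions in a class that refuses to be instantiated keeps them under one name without implying that any state is stored.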