In [7]:
from pathlib import Path
import seaborn as sns
import os
import json

import pandas as pd
import numpy as np
from scipy.sparse import csr_matrix

## nice trick to increase dpi of exported images -> higher quality png's for presentation
import matplotlib.pyplot as plt
from IPython.display import set_matplotlib_formats
set_matplotlib_formats('retina', quality=100)

# Objective

Explore the effects of supply-side shocks on the U.S. economy through the lens of graph networks.

# Definitions

- **Supply tables:** The value of goods and services available in the U.S. economy, whether produced by domestic industries or imported. 
- **Use (IO) tables:** How the supply of goods and services is used. Includes purchases by U.S. industries, individuals, and government, and exports to foreign purchasers. This is the table we are actually interested in, because each entry $a_{ij}x_{j}$ gives the total value (in millions of dollars) of commodity $i$ used by industry $j$.
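
As a minimal sketch of how $a_{ij}x_{j}$ relates to a Use table, the toy example below divides each column of an intermediate-use matrix by the corresponding industry's total output and then multiplies back to recover the dollar flows. The numbers, the two-industry setup, and the names `Z`, `x`, and `A_toy` are made up for illustration and are not BEA data. The column-wise convention is the textbook one and is an assumption here; which axis to normalize depends on how a given table is oriented.

```python
import numpy as np

# Hypothetical 2-industry intermediate-use matrix (millions of dollars).
# Z[i, j] = value of commodity i used as an input by industry j.
Z = np.array([[10.0, 30.0],
              [20.0, 40.0]])

# Hypothetical total output of each industry (millions of dollars).
x = np.array([100.0, 200.0])

# a_ij = Z_ij / x_j: input from i needed per dollar of output of j.
# Broadcasting divides column j of Z by x[j].
A_toy = Z / x

# Multiplying a_ij by x_j recovers the original dollar flows.
flows = A_toy * x
```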

When it comes to applying probabilistic graph models to societal concerns, one need look no further than public economic policy.

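1. **Open Model:** Part of each industry's output is consumed by external (final) demand outside the production system.
2. **Closed Model:** All output is consumed within the production system itself; there is no external demand.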

In fact, these methods have proved particularly effective in the analysis of sudden and large changes, as in the case of military mobilization or other far-reaching transformations of an economy. The method has also been applied in studies of how cost and price changes are transmitted through the various sectors of an economy.
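
The transmission of cost changes mentioned above is commonly formalized with the Leontief price model, $p = A^{T}p + v$, where $v$ is value added per unit of output. The sketch below is one way to illustrate it, assuming a made-up 3-sector coefficient matrix `A_toy` and a hypothetical helper `leontief_prices` (neither is part of `IO.py`). It raises value added in one sector by 10% and reports how prices move in every sector.

```python
import numpy as np

# Hypothetical 3-sector coefficient matrix (column j = inputs per dollar of
# sector j's output). Values are illustrative only.
A_toy = np.array([[0.10, 0.20, 0.00],
                  [0.30, 0.10, 0.20],
                  [0.10, 0.30, 0.10]])

def leontief_prices(A, value_added):
    """Solve p = A.T @ p + v for the price index vector p."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) - A.T, value_added)

# Value added per dollar of output, chosen so baseline prices all equal 1.
v_base = 1.0 - A_toy.sum(axis=0)
p_base = leontief_prices(A_toy, v_base)

# Cost shock: value added per unit of output in sector 0 rises by 10%.
v_shock = v_base.copy()
v_shock[0] *= 1.10
p_shock = leontief_prices(A_toy, v_shock)

# Price change in each sector caused by the shock to sector 0.
price_change = p_shock - p_base
```

In this toy economy every sector's price rises, including sector 2, which buys nothing from sector 0 directly but is reached through sector 1.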

Among recent developments of the method is its extension to include residuals of the production system (smoke, water pollution, scrap, etc.) and their further processing. In this way, the effects of production on the environment can be studied.

# Setup

**Required:**
1. reqs_master $\coloneqq$ Total Requirements Table from 1997-2020 from BEA database
2. mapping.json $\coloneqq$ Dictionary containing shortened industry names
3. 'IO.py' $\coloneqq$ Python module containing IO class & functions for IO-dataframe -> networkX.G object

Please run the cells below and ensure that there are no errors; note that file paths may need to be changed to the directory where the data is stored on your local machine. More information on the data, mapping dictionary, and IO module may be found in `EDA-Andrew.ipynb`, where the module is built and we conduct a preliminary EDA with visualizations.

In [11]:
## import BEA total requirements tables (one CSV file per year)
data_path = f'{str(Path(os.getcwd()).parent)}/data/processed'

# list the files in the requirements-table directory, sorted by name
path, dirs, files = next(os.walk(f'{data_path}/../io-reqs-table'))
file_count = len(files)
files.sort()

# read each file into a DataFrame indexed by industry description
reqs_master = []
for file_name in files:
    temp_df = pd.read_csv(os.path.join(path, file_name), sep=',',
                skiprows=4, header=[1]).iloc[:72, 1:].set_index('Industry Description')
    reqs_master.append(temp_df)

In [12]:
## import our mapping.json dictionary 
with open("./../data/dict.json", 'r') as myfile:
    data=myfile.read()
    
mapping = json.loads(data)

In [13]:
## import IO module containing IO() class and pre-built functions from EDA notebook
from IO import IO
io = IO.IO()
io ## sanity check

<IO.IO.IO at 0x7f90f16f2f10>

# Model Implementation & Deployment

From a big-picture perspective, we will implement a variety of economic network models proposed in published scientific journals. We start with a relatively simple one, the open Leontief model (Model 1 below).

From the algebra, one could easily mistake this for a trivial task; however, remember that our IO tables are large (71 industries in the examples below). The following concerns must be investigated and addressed:

- How can we guarantee the nonsingularity of our input-output matrix A?
- Should we use the pseudoinverse, and how will that affect the predictive power of our model?
- How do we deal with the computational complexity of possibly inverting a ~100x100 matrix?
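
The three concerns above can be explored with a few standard NumPy diagnostics. The sketch below is a hedged illustration on a random nonnegative matrix, not on the BEA tables; the helpers `check_leontief_matrix` and `solve_production` are hypothetical names introduced here. It relies on a standard fact: for a nonnegative $A$ with spectral radius below 1, $I - A$ is invertible with a nonnegative inverse, and column sums all below 1 are sufficient for that.

```python
import numpy as np

def check_leontief_matrix(A):
    """Report simple diagnostics for the solvability of (I - A) x = b."""
    A = np.asarray(A, dtype=float)
    identity = np.eye(A.shape[0])
    # Spectral radius: largest eigenvalue magnitude of A.
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    return {
        "spectral_radius": spectral_radius,
        # Nonnegative A with radius < 1 means (I - A) is invertible.
        "productive": bool(np.all(A >= 0) and spectral_radius < 1),
        # Large condition numbers signal numerical near-singularity.
        "condition_number": np.linalg.cond(identity - A),
    }

def solve_production(A, b):
    """Solve (I - A) x = b without forming the inverse explicitly."""
    A = np.asarray(A, dtype=float)
    return np.linalg.solve(np.eye(A.shape[0]) - A, b)

# Random nonnegative 100x100 matrix rescaled so every column sums to 0.5
# (illustrative only).
rng = np.random.default_rng(0)
A_toy = rng.random((100, 100))
A_toy *= 0.5 / A_toy.sum(axis=0)
b_toy = rng.random(100)

diagnostics = check_leontief_matrix(A_toy)
x_solve = solve_production(A_toy, b_toy)

# The pseudoinverse gives the same answer whenever (I - A) is nonsingular.
x_pinv = np.linalg.pinv(np.eye(100) - A_toy) @ b_toy
```

Two observations follow from this sketch. When $I - A$ is nonsingular, the pseudoinverse coincides with the ordinary inverse, so it only changes results in the singular or near-singular case. A dense solve at this size is also cheap on modern hardware, so numerical conditioning is likely a bigger concern than run time.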

For those interested, these articles are linked in the model summaries below and listed in the 'References' section at the end.

## Model 1: Open Leontief Model

We first attempt to fit the open Leontief model, the simplest (and least powerful) of the models considered here.

If Leontief's economic model was good enough for a Nobel prize, let us assume that it is sufficient for this project.

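- $A$ = input-output coefficient matrix (`A` in the code below)
- $B$ = external demand vector (`B` in the code below)
- $X$ = production vector, i.e. the predicted production level of each industry (`X` in the code below)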
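
Under the standard open Leontief assumptions, total production must cover both inter-industry use and external demand. This gives the following balance equation and solution, valid when $I - A$ is invertible:

$$
X = AX + B
\quad\Longrightarrow\quad
(I - A)\,X = B
\quad\Longrightarrow\quad
X = (I - A)^{-1}B
$$

When the spectral radius of $A$ is below 1, the inverse can also be read as a sum of direct and indirect requirements:

$$
(I - A)^{-1} = I + A + A^{2} + A^{3} + \cdots
$$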

In [134]:
## 5x5 example for year 1997
df = reqs[0].iloc[0:5,1:].set_index('Name')
## coefficient matrix A: each row scaled by its total commodity output
A = df.div(df['Total Commodity Output'],axis=0).iloc[0:5,1:6].to_numpy()
## Leontief inverse (I - A)^-1
hat = np.linalg.inv(np.identity(n=5) - A)
B = np.matrix(df.iloc[0:5,-1] - df.iloc[0:5,-2]).transpose()

## predicted production levels for each industry for year 1997
## (B is an np.matrix, so * is a matrix product here)
X = hat * B
X
hat

array([[1.00254971e+00, 1.95798722e-05, 1.33701000e-04, 1.90794614e-05,
        4.06402433e-05],
       [1.87084566e-01, 1.00017658e+00, 1.22018689e-03, 6.21807382e-06,
        3.46179727e-04],
       [2.72836078e-02, 1.45860989e-01, 1.00814187e+00, 2.24217306e-03,
        2.85594208e-01],
       [4.32790949e-03, 2.29327011e-02, 7.16768409e-02, 1.00158351e+00,
        1.67697036e-01],
       [1.60762916e-02, 8.59441684e-02, 2.75440698e-02, 7.91974276e-03,
        1.00895953e+00]])
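
One way to read an inverse like the one printed above is as the sum of direct and indirect requirements, $I + A + A^{2} + \cdots$, which is why its diagonal entries are at least 1. The sketch below checks this on a made-up 3-sector matrix (`A_toy` and `B_toy` are hypothetical values, not taken from the BEA tables) and also confirms that the solved production vector satisfies the balance equation $X = AX + B$.

```python
import numpy as np

# Hypothetical coefficient matrix and external demand (illustrative only).
A_toy = np.array([[0.10, 0.20, 0.00],
                  [0.30, 0.10, 0.20],
                  [0.10, 0.30, 0.10]])
B_toy = np.array([50.0, 30.0, 20.0])

leontief_inverse = np.linalg.inv(np.eye(3) - A_toy)

# Truncated power series I + A + A^2 + ... + A^50.
series = np.zeros((3, 3))
term = np.eye(3)
for _ in range(51):
    series += term
    term = term @ A_toy

# Production vector and the residual of the balance equation X = AX + B.
X_toy = leontief_inverse @ B_toy
residual = X_toy - (A_toy @ X_toy + B_toy)
```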

In [135]:
## 71 industries
df = io_current.iloc[0:71,1:].set_index('Name')
A = df.div(df['Total Commodity Output'],axis=0).iloc[0:71,1:72].to_numpy()
hat = np.linalg.inv(np.identity(n=len(A)) - A)
B = np.matrix(df.iloc[0:71,-1] - df.iloc[0:71,-2]).transpose()

## predicted production vector for 1997
X = hat * B
X

matrix([[ 372808.84868509],
        [ 316229.91658944],
        [ 979866.423247  ],
        [ 470676.4705198 ],
        [  70075.313456  ],
        [ 367184.39803225],
        [ 137447.05024431],
        [ 441677.6044396 ],
        [ 558841.31873542],
        [1098139.44327332],
        [ 758777.79719959],
        [ 352488.24535692],
        [ 489352.00347287],
        [ 464827.39257288],
        [ 415840.45085714],
        [ 235812.99776296],
        [ 130385.41312336],
        [ 177354.11157603],
        [ 337637.74764115],
        [ 467652.14028669],
        [ 148353.56562355],
        [ 507506.1939532 ],
        [ 502092.51629033],
        [ 426054.194965  ],
        [ 674141.92348152],
        [ 566487.09024152],
        [ 595270.77899877],
        [  48367.71171283],
        [   5796.34402459],
        [  15783.00267202],
        [ 132779.05486704],
        [ 181946.2618579 ],
        [ 325132.32228899],
        [ 140254.62385792],
        [ 313227.10180916],
        [ 233664.238

## Model 2: John B. Long, Jr. et al. (U. of Chicago, 1983) & Acemoglu et al. (Harvard, 2015)

Next, Acemoglu et al. suggest including taxation and the federal government as an industry in the overall economic network. **In fact, one could argue that the methods proposed by John B. Long and by Acemoglu are simply variants of the closed Leontief model.**
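
For comparison, the textbook closed Leontief model can be sketched as follows. This illustrates the closed model only, not the specific methods of Long or Acemoglu et al., and the matrix `A_closed` is made up. Because every column sums to 1, $I - A$ is singular and cannot be inverted; the production vector is instead an eigenvector of $A$ for eigenvalue 1, determined only up to scale.

```python
import numpy as np

# Hypothetical closed 3-sector economy: every column sums to 1, i.e. all
# output is consumed inside the system (illustrative numbers only).
A_closed = np.array([[0.2, 0.3, 0.5],
                     [0.5, 0.3, 0.1],
                     [0.3, 0.4, 0.4]])
assert np.allclose(A_closed.sum(axis=0), 1.0)

# Closed model: x = A x, so x is an eigenvector of A for eigenvalue 1.
eigenvalues, eigenvectors = np.linalg.eig(A_closed)
k = np.argmin(np.abs(eigenvalues - 1.0))
x = np.real(eigenvectors[:, k])

# Normalise so the entries are shares of total output (this also fixes the
# arbitrary sign of the eigenvector).
x = x / x.sum()
```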


In [136]:
df = io_current.iloc[0:71,1:].set_index('Name')
A = df.div(df['Total Commodity Output'],axis=0).iloc[0:71,1:72].to_numpy()
hat = np.linalg.inv(np.identity(n=len(A)) - A)
B = np.matrix(df.iloc[0:71,-1] - df.iloc[0:71,-2]).transpose()

## predicted production vector for 1997
X = hat * B
X

matrix([[ 372808.84868509],
        [ 316229.91658944],
        [ 979866.423247  ],
        [ 470676.4705198 ],
        [  70075.313456  ],
        [ 367184.39803225],
        [ 137447.05024431],
        [ 441677.6044396 ],
        [ 558841.31873542],
        [1098139.44327332],
        [ 758777.79719959],
        [ 352488.24535692],
        [ 489352.00347287],
        [ 464827.39257288],
        [ 415840.45085714],
        [ 235812.99776296],
        [ 130385.41312336],
        [ 177354.11157603],
        [ 337637.74764115],
        [ 467652.14028669],
        [ 148353.56562355],
        [ 507506.1939532 ],
        [ 502092.51629033],
        [ 426054.194965  ],
        [ 674141.92348152],
        [ 566487.09024152],
        [ 595270.77899877],
        [  48367.71171283],
        [   5796.34402459],
        [  15783.00267202],
        [ 132779.05486704],
        [ 181946.2618579 ],
        [ 325132.32228899],
        [ 140254.62385792],
        [ 313227.10180916],
        [ 233664.238

## Model 3: Something Special

Finally, for those that. As of 2022, research into probabilistic graph networks.  **This will be left as an exercise for the reader.**

# Visualizations

This section will contain a couple of fancy network plots created to accompany the analysis and write-ups due March 17, 2022.

# Analysis

# Conclusion

## Limitations

One clear limitation we faced during this project was the inability to guarantee nonsingularity of our IO matrix A.

## Next Steps

As stated in the model portion above,

# References

1. Choi, Jason, and Andrew Foerster. The Changing Input-Output Network Structure of the U.S. Economy. Economic Review, Q II, 2017, pp. 23-49.
2. Sekhon, Rupinder, and Roberta Bloom. Applications – Leontief Models. De Anza College, 4 Sept. 2021, https://math.libretexts.org/@go/page/37851.