# Investor - Flow of Funds - US

Check out [Investor Flow of Funds Exercises Video Tutorial](https://youtu.be/QG6WbOgC9QE) to watch a data scientist go through the exercises

### Introduction:

Special thanks to: https://github.com/rgrp for sharing the dataset.

### Step 1. Import the necessary libraries

In [1]:
import pandas as pd

### Step 2. Import the dataset from this [address](https://raw.githubusercontent.com/datasets/investor-flow-of-funds-us/master/data/weekly.csv). 

### Step 3. Assign it to a variable called 

In [2]:
url = 'https://raw.githubusercontent.com/datasets/investor-flow-of-funds-us/master/data/weekly.csv'
df = pd.read_csv(url)
df.head()

Unnamed: 0,Date,Total Equity,Domestic Equity,World Equity,Hybrid,Total Bond,Taxable Bond,Municipal Bond,Total
0,2011-10-05,-4002,-4499.0,497,-1354.0,-5828,-6258.0,430,-11184.0
1,2011-10-12,-7397,-5842.0,-1555,512.0,3954,3927.0,28,-2931.0
2,2011-10-19,-3292,-3466.0,174,1399.0,5652,5102.0,550,3759.0
3,2011-10-26,-3696,-2998.0,-698,2631.0,4910,4070.0,841,3846.0
4,2011-12-07,-7956,-5761.0,-2196,1089.0,3523,2068.0,1456,-3343.0


In [4]:
df.dtypes

Date                   str
Total Equity         int64
Domestic Equity    float64
World Equity         int64
Hybrid             float64
Total Bond           int64
Taxable Bond       float64
Municipal Bond       int64
Total              float64
dtype: object

In [8]:
df.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Total Equity,159.0,7816.427673,14772.255905,-23041.0,-3416.0,6003.0,17217.5,54173.0
Domestic Equity,50.0,-4908.44,4426.957647,-15411.0,-6394.0,-4591.5,-2748.75,5629.0
World Equity,159.0,6526.301887,11120.78905,-16192.0,-968.5,3322.0,11025.5,41572.0
Hybrid,50.0,-648.54,1508.122623,-5602.0,-1576.25,-567.0,388.25,2631.0
Total Bond,159.0,6209.238994,10455.112795,-17891.0,-212.0,4945.0,11387.0,40408.0
Taxable Bond,50.0,1170.0,6180.386825,-17264.0,-2036.0,2996.0,4685.5,17101.0
Municipal Bond,159.0,1328.345912,1913.66758,-3263.0,251.0,1130.0,2077.0,7412.0
Total,50.0,-4621.96,9175.230856,-37421.0,-10846.75,-2773.5,305.0,10590.0


### Step 4.  What is the frequency of the dataset?

In [32]:
# weekly data

### Step 5. Set the column Date as the index.

In [14]:
df = df.set_index('Date')
df.head()

KeyError: "None of ['Date'] are in the columns"

### Step 6. What is the type of the index?

In [15]:
df.index
# it is a 'object' type

DatetimeIndex(['2011-10-05', '2011-10-12', '2011-10-19', '2011-10-26',
               '2011-12-07', '2011-12-14', '2011-12-21', '2011-12-28',
               '2012-12-05', '2012-12-12',
               ...
               '2024-06-26', '2024-07-02', '2024-07-10', '2024-07-17',
               '2024-07-24', '2024-12-04', '2024-12-11', '2024-12-18',
               '2024-12-24', '2024-12-31'],
              dtype='datetime64[us]', name='Date', length=159, freq=None)

### Step 7. Set the index to a DatetimeIndex type

In [16]:
df.index = pd.to_datetime(df.index)
type(df.index)

pandas.DatetimeIndex

### Step 8.  Change the frequency to monthly, sum the values and assign it to monthly.

In [17]:
monthly = df.resample('M').sum()
monthly

ValueError: Invalid frequency: M. Failed to parse with error message: ValueError("'M' is no longer supported for offsets. Please use 'ME' instead.")

### Step 9. You will notice that it filled the dataFrame with months that don't have any data with NaN. Let's drop these rows.

In [37]:
monthly = monthly.dropna()
monthly

Unnamed: 0_level_0,Total Equity,Domestic Equity,World Equity,Hybrid,Total Bond,Taxable Bond,Municipal Bond,Total
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
2012-12-31,-26156.0,-23126.0,-3031.0,526.0,9848.0,12613.0,-2765.0,-15782.0
2013-01-31,3661.0,-1627.0,5288.0,2730.0,12149.0,9414.0,2735.0,18540.0
2014-04-30,10842.0,1048.0,9794.0,4931.0,8493.0,7193.0,1300.0,24267.0
2014-05-31,-2203.0,-8720.0,6518.0,3172.0,13767.0,10192.0,3576.0,14736.0
2014-06-30,2319.0,-6546.0,8865.0,4588.0,9715.0,7551.0,2163.0,16621.0
2014-07-31,-7051.0,-11128.0,4078.0,2666.0,7506.0,7026.0,481.0,3122.0
2014-08-31,1943.0,-5508.0,7452.0,1885.0,1897.0,-1013.0,2910.0,5723.0
2014-09-30,-2767.0,-6596.0,3829.0,1599.0,3984.0,2479.0,1504.0,2816.0
2014-11-30,-2753.0,-7239.0,4485.0,729.0,14528.0,11566.0,2962.0,12502.0
2015-01-31,3471.0,-1164.0,4635.0,1729.0,7368.0,2762.0,4606.0,12569.0


### Step 10. Good, now we have the monthly data. Now change the frequency to year.

In [38]:
year = monthly.resample('AS-JAN').sum()
year

Unnamed: 0_level_0,Total Equity,Domestic Equity,World Equity,Hybrid,Total Bond,Taxable Bond,Municipal Bond,Total
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
2012-01-01,-26156.0,-23126.0,-3031.0,526.0,9848.0,12613.0,-2765.0,-15782.0
2013-01-01,3661.0,-1627.0,5288.0,2730.0,12149.0,9414.0,2735.0,18540.0
2014-01-01,330.0,-44689.0,45021.0,19570.0,59890.0,44994.0,14896.0,79787.0
2015-01-01,15049.0,-10459.0,25508.0,7280.0,26028.0,17986.0,8041.0,48357.0


### BONUS: Create your own question and answer it.