# Mutual Fund Calculator <br>
In this notebook, we will learn how to calculate the potential returns on a mutual fund. <br>

In [68]:
# Importing the libraries we will use
import pandas_datareader as pdr
import datetime
import pandas as pd

### Find the historical performance of the Fund <br>
First, we need to find the historical performance of the fund. To get historical data, we will use Yahoo Finance. We will measure performance with the *Average Rate of Return* and *Rolling Returns*.
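To make the idea of a rolling return concrete before loading real data, here is a minimal sketch. It assumes a rolling return is the percentage change over every span of `window` consecutive periods; the helper name `rolling_returns` and the price list are hypothetical and are not part of the notebook or of any library.

```python
def rolling_returns(closes, window):
    """Percentage return over every span of `window` consecutive periods."""
    return [
        (closes[i + window] - closes[i]) / closes[i] * 100
        for i in range(len(closes) - window)
    ]


# Hypothetical closing prices, for illustration only (not real fund data)
closes = [100.0, 110.0, 121.0, 108.9, 130.68]

# One return per overlapping two-period span
two_period = rolling_returns(closes, window=2)
print([round(r, 2) for r in two_period])
```

Because the spans overlap, each period contributes to several returns, which is what makes rolling returns useful for judging consistency over time.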

First, let's import the data for a mutual fund. <br>
We will only be working with closing values.

In [69]:
fund = pdr.get_data_yahoo('VFIAX', 
                          start=datetime.datetime(2000, 12, 1), 
                          end=datetime.date.today())
fund.reset_index(inplace=True)

In [70]:
# Using only the close information of the fund.
# .copy() makes fund_close an independent DataFrame, so the assignment below
# does not trigger pandas' SettingWithCopyWarning
fund_close = fund[['Date', 'Close']].copy()
# Converting the Date column to pandas datetime
fund_close['Date'] = pd.to_datetime(fund_close['Date'], format='%Y-%m-%d')


In [71]:
# Adding a Year column to better separate the data.
# assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning
fund_close = fund_close.assign(Year=fund_close['Date'].dt.year)
# Grouping the data by years
years = fund_close.groupby('Year')
list_years = list(years.groups.keys())


### Calculating the Average Rate of Return <br>
The Average Rate of Return (ARR) is typically shown as a percentage. Note that the ARR does not show how consistently an investment produces the stated percentage: because it is an average, it can hide large outliers in the data.
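The point about hidden outliers can be illustrated with a small sketch. The two return series below are made-up numbers chosen so that their averages match; they are not fund data, and the helper `average_rate_of_return` is a hypothetical name used only for this example.

```python
from statistics import pstdev


def average_rate_of_return(yearly_returns):
    """Arithmetic mean of yearly returns given in percent."""
    return sum(yearly_returns) / len(yearly_returns)


# Hypothetical yearly returns in percent (illustrative only)
steady = [6.0, 7.0, 6.5, 7.5, 6.0]
volatile = [30.0, -20.0, 25.0, -15.0, 13.0]

arr_steady = average_rate_of_return(steady)
arr_volatile = average_rate_of_return(volatile)

# Both series have the same ARR, but very different spread around it
print(round(arr_steady, 2), round(pstdev(steady), 2))
print(round(arr_volatile, 2), round(pstdev(volatile), 2))
```

Reporting a measure of spread, such as the standard deviation, next to the ARR is one way to see how much the individual years deviate from the average.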

In [100]:
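# Yearly rate of return (in percent) for every year in the data,
# measured from the first to the last closing price of that year
yrr = []
for year in list_years:
    group = years.get_group(year)
    first_close = group.iloc[0]['Close']
    last_close = group.iloc[-1]['Close']
    yrr.append((last_close - first_close) / first_close * 100)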
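The same first-to-last calculation can also be written without an explicit loop. The sketch below is an alternative under stated assumptions, not the notebook's method: it uses a small synthetic table (hypothetical prices, not VFIAX data) and the pandas `groupby(...).agg(['first', 'last'])` aggregation.

```python
import pandas as pd

# Small synthetic price table (hypothetical values, not real fund data)
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-02", "2020-06-30", "2020-12-31",
                            "2021-01-04", "2021-06-30", "2021-12-31"]),
    "Close": [100.0, 105.0, 110.0, 110.0, 120.0, 121.0],
})
prices["Year"] = prices["Date"].dt.year

# First and last close of each year, taken in date order
bounds = (prices.sort_values("Date")
                .groupby("Year")["Close"]
                .agg(["first", "last"]))

# Percentage change from the first to the last close of each year
yearly_return = (bounds["last"] - bounds["first"]) / bounds["first"] * 100
print(yearly_return.round(2))
```

The result is a Series indexed by year, so `yearly_return.mean()` would give the average rate of return directly.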


In [101]:
# Average Rate of Return: arithmetic mean of the yearly returns (in percent)
arr = sum(yrr) / len(yrr)
round(arr, 4)

6.5658

In [88]:
years.get_group(list_years[0])

|   | Date | Close | Year |
|---|------|-------|------|
| 0 | 2000-12-01 | 121.650002 | 2000 |
| 1 | 2000-12-04 | 122.550003 | 2000 |
| 2 | 2000-12-05 | 127.330002 | 2000 |
| 3 | 2000-12-06 | 125.029999 | 2000 |
| 4 | 2000-12-07 | 124.309998 | 2000 |
| 5 | 2000-12-08 | 126.75 | 2000 |
| 6 | 2000-12-11 | 127.720001 | 2000 |
| 7 | 2000-12-12 | 126.900002 | 2000 |
| 8 | 2000-12-13 | 125.889999 | 2000 |
| 9 | 2000-12-14 | 124.120003 | 2000 |


In [91]:
list_years[-5:]

[2018, 2019, 2020, 2021, 2022]