<div style="display:block">
    <div style="width: 20%; display: inline-block; text-align: left;">
    </div>
    <div style="width: 59%; display: inline-block">
        <h1  style="text-align: center">Calculating Percent Time Change</h1><br>
        <div style="width: 90%; text-align: center; display: inline-block;"><i>Author:</i> <strong>Anjana Ranjan</strong> </div>
    </div>
    <div style="width: 20%; text-align: right; display: inline-block;">
        <div style="width: 100%; text-align: left; display: inline-block;">
            <i>Created: </i>
            <time datetime="Enter Date" pubdate>June, 2018</time>
        </div>
    </div>
</div>

# Function name
PercentTimeChange

# Functionality
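## This function calculates the percentage growth or decline of an attribute:
* Year over year (YoY)
* Quarter over quarter (QoQ)
* Month over month (MoM)
* Week over week (WoW)

The dataset is grouped into periods by its date column, and each period's total is compared with the total of an earlier period.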
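
A minimal sketch of this calculation on made-up numbers may help. The column name `Value` and the data are hypothetical; the formula mirrors the one used in the class later in this notebook, which divides the difference by the current period's total.

```python
import pandas as pd

# Hypothetical observations; "Value" is an illustrative column name.
df = pd.DataFrame(
    {"Value": [10, 20, 30, 60]},
    index=pd.to_datetime(["2016-03-01", "2016-09-01", "2017-03-01", "2017-09-01"]),
)

# Total per calendar year: 2016 sums to 30, 2017 sums to 90.
yearly = df["Value"].groupby(df.index.year).sum()

# Difference to the previous year, expressed as a percentage of the current year.
change = (yearly - yearly.shift(1)) / yearly * 100

print(change)
```

The first year has no earlier period to compare with, so its entry is NaN; for 2017 the result is (90 - 30) / 90 * 100.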

# Required parameters
* A dataset (pandas DataFrame) with a column named `Date`, in any format that `pd.to_datetime` can parse

# Input parameters
 
* Start date (yyyy-dd-mm)
* End date (yyyy-dd-mm)
* Attribute: the name of the column to analyse
* Timeframe: `Year`, `Quarter`, `Month` or `Week`
* Method to call: `YoY`, `QoQ`, `MoM` or `WoW`
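
The yyyy-dd-mm order is unusual, so a short sketch of how such strings are parsed may be useful. It uses the same `'%Y-%d-%m'` format string as the class; the constant name `DATE_FORMAT` is introduced here for illustration only.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%d-%m"  # year, then day, then month

start = datetime.strptime("2016-12-10", DATE_FORMAT)  # 12 October 2016
end = datetime.strptime("2018-29-05", DATE_FORMAT)    # 29 May 2018

# An ISO-style string (yyyy-mm-dd) with a day above 12 is rejected,
# because the last field is read as the month.
try:
    datetime.strptime("2018-05-29", DATE_FORMAT)
    iso_accepted = True
except ValueError:
    iso_accepted = False

print(start.date(), end.date(), iso_accepted)
```

An ISO-style string whose day is 12 or lower would parse without error but with day and month swapped, so it is worth double-checking the order of the arguments.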

# Default parameters
* The location where the output CSV file is saved


# Return
## Percentage change in the chosen attribute:
* ### YoY based on the grouping of the attribute in terms of:
      * Years
      * Quarters
      * Months
      * Weeks
* ### QoQ based on the grouping of the attribute in terms of:
      * Quarters
      * Months
      * Weeks
* ### MoM based on the grouping of the attribute in terms of:
      * Months
      * Weeks
* ### WoW based on the grouping of the attribute in terms of:
      * Weeks      
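
The combinations above can be summarised as a lookup from comparison and grouping to the number of periods to look back. This is only a sketch: the names `LAGS` and `lag_for` are hypothetical, and the lags assume 4 quarters, 12 months and 52 weeks per year, 3 months and roughly 13 weeks per quarter, and roughly 4 weeks per month.

```python
# Number of grouped periods between the current period and the one it is
# compared with. Calendar approximations (13 weeks per quarter, 4 weeks per
# month) are assumptions of this sketch.
LAGS = {
    "YoY": {"Year": 1, "Quarter": 4, "Month": 12, "Week": 52},
    "QoQ": {"Quarter": 1, "Month": 3, "Week": 13},
    "MoM": {"Month": 1, "Week": 4},
    "WoW": {"Week": 1},
}


def lag_for(comparison, timeframe):
    """Return the look-back lag, or raise ValueError for an unsupported pair."""
    try:
        return LAGS[comparison][timeframe]
    except KeyError:
        raise ValueError(
            f"Unsupported combination: {comparison} grouped by {timeframe}"
        ) from None


print(lag_for("YoY", "Quarter"))
```

A table like this keeps the valid combinations in one place and makes an unsupported pair, such as WoW grouped by months, fail loudly.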

# Code

In [1]:
import pandas as pd
import numpy as np
from datetime import datetime

df1 = pd.read_csv("USDAProj_Corn_2016to2017.csv")
df2 = pd.read_csv("gun-violence-data_01-2013_03-2018.csv")
df3 = pd.read_csv("results.csv")



In [9]:
class PercentTimeChange:
    
    def __init__(self, df, startDate, endDate, columnName, timeframe):
        self.df = df #dataframe
        self.df['Date'] = pd.to_datetime(df['Date']) #convert the Date column to pandas datetime format
        self.startDate = datetime.strptime(startDate,'%Y-%d-%m') #accept start date in yyyy-dd-mm
        self.endDate = datetime.strptime(endDate,'%Y-%d-%m') #accept end date in yyyy-dd-mm
        self.columnName = columnName #column name of a column on which percent time change operation is performed
        self.timeframe = timeframe #timeframe in years, quarters, months or weeks
        
    def module_masking(self):
        mask = (self.df['Date'] >= self.startDate) & (self.df['Date'] <= self.endDate)#calculate for dates between start and end date
        masked_df = self.df.loc[mask]
        masked_df = masked_df.set_index('Date')#set index of dataframe to date column
        return masked_df
        
    
    def YoY(self):

        if self.timeframe == 'Year':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('Y').apply(sum)#sum of the attribute for each year
            df_temp= ((df_temp - df_temp.shift(1))/ df_temp)*100
            return df_temp

        elif self.timeframe == 'Quarter':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('Q').apply(sum)
            df_temp= ((df_temp - df_temp.shift(4))/ df_temp)*100
            return df_temp

        elif self.timeframe == 'Month':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('M').apply(sum)
            df_temp= ((df_temp - df_temp.shift(12))/ df_temp)*100
            return df_temp

        elif self.timeframe == 'Week':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('W').apply(sum)
            df_temp= ((df_temp - df_temp.shift(52))/ df_temp)*100
            return df_temp
        
        else:
            print('Invalid input')
            
        
        
    def QoQ(self):

        if self.timeframe == 'Quarter':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('Q').apply(sum)
            df_temp= ((df_temp - df_temp.shift(1))/ df_temp)*100
            return df_temp

        elif self.timeframe == 'Month':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('M').apply(sum)
            df_temp= ((df_temp - df_temp.shift(3))/ df_temp)*100 #a quarter spans 3 months
            return df_temp

        elif self.timeframe == 'Week':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('W').apply(sum)
            df_temp= ((df_temp - df_temp.shift(13))/ df_temp)*100 #a quarter spans roughly 13 weeks
            return df_temp

        else:
            print('Invalid input')
                
                
    def MoM(self):

        if self.timeframe == 'Month':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('M').apply(sum)
            df_temp= ((df_temp - df_temp.shift(1))/ df_temp)*100
            return df_temp

        elif self.timeframe == 'Week':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('W').apply(sum)
            df_temp= ((df_temp - df_temp.shift(4))/ df_temp)*100
            return df_temp

        else:
            print('Invalid input')
                
        
        
    def WoW(self):

        if self.timeframe == 'Week':
            masked_df = self.module_masking()
            df_temp= masked_df[self.columnName].resample('W').apply(sum)
            df_temp= ((df_temp - df_temp.shift(1))/ df_temp)*100
            return df_temp

        else:
            print('Invalid input')
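
The methods above divide the difference by the current period's total. A more common convention divides by the earlier period's total, which is what pandas' `Series.pct_change` computes. The sketch below contrasts the two on made-up yearly totals; it is an illustration, not a replacement for the class.

```python
import pandas as pd

# Hypothetical yearly totals.
totals = pd.Series([30.0, 90.0], index=[2016, 2017])

# Convention used in the class: difference relative to the current period.
relative_to_current = (totals - totals.shift(1)) / totals * 100

# Conventional percent change: difference relative to the earlier period.
relative_to_previous = totals.pct_change(periods=1) * 100

print(relative_to_current.loc[2017])   # (90 - 30) / 90 * 100
print(relative_to_previous.loc[2017])  # (90 - 30) / 30 * 100
```

The two conventions can differ considerably, so it helps to state which one a reported figure uses.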

        
    

        
        

# Testing 


In [52]:
import unittest
from pandas.testing import assert_series_equal
from PercentTimeChange import PercentTimeChange

class TimeTests(unittest.TestCase):

    
    def test_YoY(self):
        result = PercentTimeChange(df1, '2016-12-10', '2018-29-05', 'Yield', 'Year')
        test_file_name =  'USDAProj_Corn_2016to2017.csv'
        
        
        try:
            df_temp = pd.read_csv(test_file_name)
            df_temp['Date'] = pd.to_datetime(df_temp['Date'])
            mask = (df_temp['Date'] >= '10-12-2016') & (df_temp['Date'] <= '05-29-2018')
            masked_df = df_temp.loc[mask]
            df_temp = masked_df.set_index('Date')
            df_temp = df_temp['Yield'].resample('Y').apply(sum)
            df_temp = ((df_temp - df_temp.shift(1))/ df_temp)*100
            
            assert_series_equal(result.YoY(), df_temp)
        
        except IOError as e:
            print(e)
            print("TEST FAILED, check column name")
            
    def test_QoQ(self):
        result = PercentTimeChange(df1, '2016-12-10', '2018-29-05', 'Yield', 'Quarter')
        test_file_name =  'USDAProj_Corn_2016to2017.csv'
        
        try:
            df_temp = pd.read_csv(test_file_name)
            df_temp['Date'] = pd.to_datetime(df_temp['Date'])
            mask = (df_temp['Date'] >= '10-12-2016') & (df_temp['Date'] <= '05-29-2018')
            masked_df = df_temp.loc[mask]
            df_temp = masked_df.set_index('Date')
            df_temp = df_temp['Yield'].resample('Q').apply(sum)
            df_temp = ((df_temp - df_temp.shift(1))/ df_temp)*100
            
            assert_series_equal(result.QoQ(), df_temp)
        
        except IOError as e:
            print(e)
            print("TEST FAILED, check column name")
            
    def test_MoM(self):
        result = PercentTimeChange(df1, '2016-12-10', '2018-29-05', 'Yield', 'Month')
        test_file_name =  'USDAProj_Corn_2016to2017.csv'
        
        
        try:
            df_temp = pd.read_csv(test_file_name)
            df_temp['Date'] = pd.to_datetime(df_temp['Date'])
            mask = (df_temp['Date'] >= '10-12-2016') & (df_temp['Date'] <= '05-29-2018')
            masked_df = df_temp.loc[mask]
            df_temp = masked_df.set_index('Date')
            df_temp = df_temp['Yield'].resample('M').apply(sum)
            df_temp = ((df_temp - df_temp.shift(1))/ df_temp)*100
            
            assert_series_equal(result.MoM(), df_temp)
        
        except IOError as e:
            print(e)
            print("TEST FAILED, check column name")
            
    def test_WoW(self):
        result = PercentTimeChange(df1, '2016-12-10', '2018-29-05', 'Yield', 'Week')
        test_file_name =  'USDAProj_Corn_2016to2017.csv'
        
        
        try:
            df_temp = pd.read_csv(test_file_name)
            df_temp['Date'] = pd.to_datetime(df_temp['Date'])
            mask = (df_temp['Date'] >= '10-12-2016') & (df_temp['Date'] <= '05-29-2018')
            masked_df = df_temp.loc[mask]
            df_temp = masked_df.set_index('Date')
            df_temp = df_temp['Yield'].resample('W').apply(sum)
            df_temp = ((df_temp - df_temp.shift(1))/ df_temp)*100
            
            assert_series_equal(result.WoW(), df_temp)
        
        except IOError as e:
            print(e)
            print("TEST FAILED, check column name")

if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], exit=False)
        
        

....
----------------------------------------------------------------------
Ran 4 tests in 0.088s

OK
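
The tests above recompute the expected series with the same steps as the class, so they mainly guard against accidental changes. A complementary approach, sketched below, checks against values worked out by hand on a small in-memory dataset. The helper `wow_change` is a hypothetical stand-in for the method under test, and the data is made up.

```python
import unittest

import pandas as pd


def wow_change(frame, column):
    """Hypothetical stand-in: week-over-week change relative to the current week."""
    weekly = frame.set_index("Date")[column].resample("W").sum()
    return (weekly - weekly.shift(1)) / weekly * 100


class SyntheticDataTest(unittest.TestCase):
    def test_wow_on_known_values(self):
        # Three consecutive Mondays, so each row falls in its own week.
        frame = pd.DataFrame(
            {
                "Date": pd.to_datetime(["2018-01-01", "2018-01-08", "2018-01-15"]),
                "Yield": [10.0, 20.0, 40.0],
            }
        )
        result = wow_change(frame, "Yield")
        # The first week has nothing to compare with.
        self.assertTrue(pd.isna(result.iloc[0]))
        # (20 - 10) / 20 * 100 and (40 - 20) / 40 * 100
        self.assertEqual(result.iloc[1:].tolist(), [50.0, 50.0])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SyntheticDataTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the expected numbers are independent of the implementation, a wrong formula or lag would make this test fail.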


# Output testing

## Test 1

In [54]:
percentage = PercentTimeChange(df1, '2016-12-10', '2018-29-05', 'Yield', 'Year')
percentage.YoY()

Date
2016-12-31          NaN
2017-12-31    67.024628
Freq: A-DEC, Name: Yield, dtype: float64

### Pass/Fail
Pass:
The YoY in terms of years is calculated for the attribute 'Yield' in dataframe1


## Test 2

In [36]:
percentage2 = PercentTimeChange(df2, '2013-01-01', '2016-21-05', 'n_injured', 'Year')
percentage2.YoY()

Date
2013-12-31           NaN
2014-12-31     95.743848
2015-12-31     14.703156
2016-12-31   -142.073609
Freq: A-DEC, Name: n_injured, dtype: float64

### Pass/Fail
Pass:
The YoY in terms of years is calculated for the attribute 'n_injured' in dataframe2


## Test 3

In [35]:
percentage3 = PercentTimeChange(df3, '1873-03-08', '2005-22-12', 'home_score', 'Quarter')
percentage3.YoY()

Date
1874-03-31           NaN
1874-06-30           NaN
1874-09-30           NaN
1874-12-31           NaN
1875-03-31      0.000000
1875-06-30           NaN
1875-09-30           NaN
1875-12-31           NaN
1876-03-31     71.428571
1876-06-30           NaN
1876-09-30           NaN
1876-12-31           NaN
1877-03-31   -600.000000
1877-06-30           NaN
1877-09-30           NaN
1877-12-31           NaN
1878-03-31     93.750000
1878-06-30           NaN
1878-09-30           NaN
1878-12-31           NaN
1879-03-31   -700.000000
1879-06-30    100.000000
1879-09-30           NaN
1879-12-31           NaN
1880-03-31     83.333333
1880-06-30          -inf
1880-09-30           NaN
1880-12-31           NaN
1881-03-31   -500.000000
1881-06-30           NaN
                 ...    
1998-09-30      4.984424
1998-12-31    -60.952381
1999-03-31     -0.353357
1999-06-30    -10.557185
1999-09-30     -4.220779
1999-12-31     19.847328
2000-03-31     51.623932
2000-06-30     32.475248
2000-09-30     26.13

### Pass/Fail
Pass:
The YoY in terms of quarters is calculated for the attribute 'home_score' in dataframe3
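
The `NaN` and `-inf` entries in the output above are consistent with quarters whose total is zero, since the formula divides by the current period's total. The sketch below reproduces that behaviour on made-up totals; whether this is the cause for every such row in the real data is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical period totals, including periods that sum to zero.
totals = pd.Series([5.0, 0.0, 0.0, 3.0])

change = (totals - totals.shift(1)) / totals * 100

# position 0: no earlier period        -> NaN
# position 1: (0 - 5) / 0              -> -inf
# position 2: (0 - 0) / 0              -> NaN
# position 3: (3 - 0) / 3 * 100        -> 100.0
print(change)
```

Dividing by the earlier period's total instead would move the problem to periods that follow a zero total rather than remove it, so zero totals need explicit handling under either convention.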
