# Weekly Challenge 03

*Original URL:* https://community.alteryx.com/t5/Weekly-Challenge/Challenge-3-Running-Averages/td-p/36814. See also [**My Alteryx Approach**](https://github.com/dsmdavid/Alteryx-Weekly-Challenge/tree/master/sub_Challenge%2303).

## Brief

The goal is to create 3-month and 6-month running averages for the values in the columns u.CAGI, d.CAGI, u.IR, d.IR, u.NonIR and d.NonIR. The averages are calculated separately for each horsepower (HP) Category.


In [1]:
import pandas as pd

## Approach I want to follow:
1. Read the data.
1. Create a function to calculate an X-period moving average.
1. Run the function as needed and combine into a single dataframe.

In [2]:
# Read the data; combine columns 1 and 2 into a single date and use (date, HP Category) as a MultiIndex
df = pd.read_csv("./03_files/input.csv", parse_dates=[[1,2]], index_col=[0,1])
# Fill missing values with 0
df.fillna(value = 0, inplace=True) 
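
Recent pandas releases deprecate the nested-list form of `parse_dates` used above, so the read may need adapting on a newer install. Below is a minimal sketch of an equivalent read, not the code I actually ran. It assumes the raw file has columns named `HP Category`, `Year` and `Month` (inferred from the `Year_Month` index name, not confirmed), and a tiny in-memory CSV with made-up layout stands in for `input.csv`.

```python
import io

import pandas as pd

# Hypothetical miniature of the input file; the column names are assumptions.
csv_text = """HP Category,Year,Month,u.CAGI,d.CAGI
10 hp,2009,1,218,1178948
10 hp,2009,2,200,
"""

raw = pd.read_csv(io.StringIO(csv_text))

# Build the date explicitly from the year and month columns (day fixed to 1).
raw["Year_Month"] = pd.to_datetime(
    pd.DataFrame({"year": raw["Year"], "month": raw["Month"], "day": 1})
)

# Same shape as before: MultiIndex of (date, HP Category), missing values as 0.
sample = (
    raw.drop(columns=["Year", "Month"])
    .set_index(["Year_Month", "HP Category"])
    .fillna(0)
)
print(sample)
```

The explicit `pd.to_datetime` step is a little longer than the one-liner, but it makes the date construction visible and does not depend on column positions.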

In [3]:
# Display the first rows
df.head()

Year_Month,HP Category,u.CAGI,d.CAGI,u.IR,d.IR,u.NonIR,d.NonIR
2009-01-01,10 hp,218.0,1178948.0,56.0,319316.0,162.0,859632.0
2009-02-01,10 hp,200.0,1066974.0,44.0,248806.0,156.0,818168.0
2009-03-01,10 hp,272.0,1399758.0,52.0,285730.0,220.0,1114028.0
2009-04-01,10 hp,178.0,996182.0,42.0,235730.0,136.0,760452.0
2009-05-01,10 hp,200.0,1117158.0,60.0,333908.0,140.0,783250.0


In [4]:
def returnMovingAverage(dataframe, grouping, periods):
    '''Return the rolling mean over the given number of periods, computed within each group.

    Expected values:
    dataframe = a pandas DataFrame
    grouping = a list of columns/index levels to group by
    periods = an int, the number of months/rows in the moving window'''
    
    
    # Basic validation of the arguments
    if not isinstance(dataframe, pd.DataFrame):
        raise TypeError('First argument must be a dataframe')
    elif not isinstance(grouping, list):
        raise TypeError('Second argument must be a list of elements to group by')
    elif not isinstance(periods, int):
        raise TypeError('Third argument must be an int: the number of months (rows) in the moving window')
        
    
    #Calculate the moving average
    df = dataframe.groupby(by=grouping).rolling(periods, min_periods = 1).mean()

    # Rename columns to add the prefix, e.g. 'u.CAGI' -> 'r3mo_u_CAGI' for periods=3
    df.columns = [f'r{periods}mo_{column}'.replace('.', '_') for column in df.columns]
    
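    # Rearrange the index: the grouped rolling result carries the group key as an
    # extra level, so drop the duplicated third level and restore the original order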
    df.index = df.index.droplevel(level=2)
    df = df.reorder_levels([1,0])
    
    
    return df
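
As an aside, the index rearrangement can be avoided altogether. The following is a sketch of an alternative, not the function above: `groupby(...).transform` returns a result aligned to the original index, so no levels need to be dropped or reordered. The `toy` frame and its values for the `75 hp` group are made up for illustration.

```python
import pandas as pd

# Toy data with the same index layout as the real dataframe (values partly invented).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2009-01-01", "2009-02-01", "2009-03-01"]), ["10 hp", "75 hp"]],
    names=["Year_Month", "HP Category"],
)
toy = pd.DataFrame({"u.CAGI": [218.0, 10.0, 200.0, 20.0, 272.0, 30.0]}, index=index)

# Rolling mean within each HP Category; transform keeps the original index.
rolled = toy.groupby(level="HP Category").transform(
    lambda s: s.rolling(3, min_periods=1).mean()
)
rolled.columns = [f"r3mo_{c}".replace(".", "_") for c in rolled.columns]

print(pd.concat([toy, rolled], axis=1))
```

The trade-off is speed: a Python lambda inside `transform` is typically slower than the built-in grouped rolling, which should not matter for a dataset of this size.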

In [5]:
list_of_df = [df]
for period in (3,6):
    list_of_df.append(returnMovingAverage(df, ['HP Category'],period))
    

## Differences from the Alteryx Solution:
The Generate Rows tool in the Alteryx solution fills the absent leading values with the closest available value. Thus, the 3-month moving average of the first row is 3 × *first row value* / 3, and that of the second row is (2 × *first row value* + *second row value*) / 3.
Here I considered two different approaches: either return **NaN** when the window is incomplete, or allow the calculation with fewer values, down to a single one. The function above takes the second approach (`min_periods = 1`).
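
To make the difference concrete, here is a small sketch comparing the three behaviours on the first four `u.CAGI` values of the `10 hp` category. The padding trick used to imitate Alteryx is my own assumption about how to reproduce that behaviour in pandas, not a translation of the Alteryx workflow.

```python
import pandas as pd

values = pd.Series([218.0, 200.0, 272.0, 178.0])
periods = 3

# Option 1: NaN until the window is full.
strict = values.rolling(periods).mean()

# Option 2 (used in returnMovingAverage): average whatever is available.
partial = values.rolling(periods, min_periods=1).mean()

# Alteryx-like: pad the start with copies of the first value, then drop the padding.
padding = pd.Series([values.iloc[0]] * (periods - 1))
padded = pd.concat([padding, values], ignore_index=True)
alteryx_like = padded.rolling(periods).mean().iloc[periods - 1:].reset_index(drop=True)

print(pd.DataFrame({"strict": strict, "partial": partial, "alteryx_like": alteryx_like}))
```

For the second row, the partial window gives (218 + 200) / 2 = 209, while the padded version gives (2 × 218 + 200) / 3 = 212. From the third row onwards all three agree.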

In [6]:
output = pd.concat(list_of_df, axis=1)
output.head(10)

Year_Month,HP Category,u.CAGI,d.CAGI,u.IR,d.IR,u.NonIR,d.NonIR,r3mo_u_CAGI,r3mo_d_CAGI,r3mo_u_IR,r3mo_d_IR,r3mo_u_NonIR,r3mo_d_NonIR,r6mo_u_CAGI,r6mo_d_CAGI,r6mo_u_IR,r6mo_d_IR,r6mo_u_NonIR,r6mo_d_NonIR
2009-01-01,10 hp,218.0,1178948.0,56.0,319316.0,162.0,859632.0,218.0,1178948.0,56.0,319316.0,162.0,859632.0,218.0,1178948.0,56.0,319316.0,162.0,859632.0
2009-02-01,10 hp,200.0,1066974.0,44.0,248806.0,156.0,818168.0,209.0,1122961.0,50.0,284061.0,159.0,838900.0,209.0,1122961.0,50.0,284061.0,159.0,838900.0
2009-03-01,10 hp,272.0,1399758.0,52.0,285730.0,220.0,1114028.0,230.0,1215227.0,50.666667,284617.333333,179.333333,930609.3,230.0,1215227.0,50.666667,284617.333333,179.333333,930609.3
2009-04-01,10 hp,178.0,996182.0,42.0,235730.0,136.0,760452.0,216.666667,1154305.0,46.0,256755.333333,170.666667,897549.3,217.0,1160466.0,48.5,272395.5,168.5,888070.0
2009-05-01,10 hp,200.0,1117158.0,60.0,333908.0,140.0,783250.0,216.666667,1171033.0,51.333333,285122.666667,165.333333,885910.0,213.6,1151804.0,50.8,284698.0,162.8,867106.0
2009-06-01,10 hp,258.0,1356456.0,70.0,371568.0,188.0,984888.0,212.0,1156599.0,57.333333,313735.333333,154.666667,842863.3,221.0,1185913.0,54.0,299176.333333,167.0,886736.3
2009-07-01,10 hp,270.0,1358426.0,68.0,343940.0,202.0,1014486.0,242.666667,1277347.0,66.0,349805.333333,176.666667,927541.3,229.666667,1215826.0,56.0,303280.333333,173.666667,912545.3
2009-08-01,10 hp,260.0,1170142.0,48.0,236706.0,212.0,933436.0,262.666667,1295008.0,62.0,317404.666667,200.666667,977603.3,239.666667,1233020.0,56.666667,301263.666667,183.0,931756.7
2009-09-01,10 hp,402.0,1876124.0,74.0,386406.0,328.0,1489718.0,310.666667,1468231.0,63.333333,322350.666667,247.333333,1145880.0,261.333333,1312415.0,60.333333,318043.0,201.0,994371.7
2009-10-01,10 hp,234.0,1234650.0,74.0,370878.0,160.0,863772.0,298.666667,1426972.0,65.333333,331330.0,233.333333,1095642.0,270.666667,1352159.0,65.666667,340567.666667,205.0,1011592.0


In [7]:
output.tail(10)

Year_Month,HP Category,u.CAGI,d.CAGI,u.IR,d.IR,u.NonIR,d.NonIR,r3mo_u_CAGI,r3mo_d_CAGI,r3mo_u_IR,r3mo_d_IR,r3mo_u_NonIR,r3mo_d_NonIR,r6mo_u_CAGI,r6mo_d_CAGI,r6mo_u_IR,r6mo_d_IR,r6mo_u_NonIR,r6mo_d_NonIR
2014-03-01,75 hp,200.0,5063774.0,50.0,1236018.0,150.0,3827756.0,208.666667,5413948.0,52.666667,1364643.0,156.0,4049305.0,210.333333,5485286.0,49.333333,1234984.0,161.0,4250302.0
2014-04-01,75 hp,196.0,5210746.0,68.0,1890194.0,128.0,3320552.0,201.333333,5272311.0,58.0,1566264.0,143.333333,3706047.0,211.0,5545836.0,57.0,1466396.0,154.0,4079441.0
2014-05-01,75 hp,186.0,4572086.0,44.0,1114995.8,142.0,3457090.2,194.0,4948869.0,54.0,1413736.0,140.0,3535133.0,210.0,5436711.0,58.0,1502228.0,152.0,3934483.0
2014-06-01,75 hp,208.0,5458388.0,54.0,1346500.1,154.0,4111887.9,196.666667,5080407.0,55.333333,1450563.0,141.333333,3629843.0,202.666667,5247177.0,54.0,1407603.0,148.666667,3839574.0
2014-07-01,75 hp,0.0,0.0,0.0,0.0,0.0,0.0,131.333333,3343491.0,32.666667,820498.6,98.666667,2522993.0,166.333333,4307901.0,45.333333,1193381.0,121.0,3114520.0
2014-08-01,75 hp,0.0,0.0,0.0,0.0,0.0,0.0,69.333333,1819463.0,18.0,448833.4,51.333333,1370629.0,131.666667,3384166.0,36.0,931284.7,95.666667,2452881.0
2014-09-01,75 hp,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.761021e-11,0.0,0.0,98.333333,2540203.0,27.666667,725281.7,70.666667,1814922.0
2014-10-01,75 hp,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.761021e-11,0.0,0.0,65.666667,1671746.0,16.333333,410249.3,49.333333,1261496.0
2014-11-01,75 hp,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.761021e-11,0.0,0.0,34.666667,909731.3,9.0,224416.7,25.666667,685314.6
2014-12-01,75 hp,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.761021e-11,0.0,0.0,0.0,0.0,0.0,3.880511e-11,0.0,0.0
