In [7]:
import numpy as np
import pandas as pd

## Building a NumPy array by recycling list elements to a predetermined length (optionally repeating each element consecutively a set number of times first)

**<em>R's rep_len function recycles the elements of a given vector until the result reaches a given length.</em>**

<b>Here's how to duplicate this in Python</b>

In [8]:
rep_len = lambda v,length_out: np.resize(v,length_out)

In [9]:
quarters = ['q'+str(i) for i in range(1,5)]
quarters

['q1', 'q2', 'q3', 'q4']

In [10]:
allQuarters = rep_len(quarters,(2016-2000)*4+3)
allQuarters[:10]

array(['q1', 'q2', 'q3', 'q4', 'q1', 'q2', 'q3', 'q4', 'q1', 'q2'], 
      dtype='<U2')

In [12]:
allQuarters[-10:]

array(['q2', 'q3', 'q4', 'q1', 'q2', 'q3', 'q4', 'q1', 'q2', 'q3'], 
      dtype='<U2')

In [17]:
assert len(allQuarters)==16*4+3
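
*As a quick check on the recycling behaviour, here is a minimal sketch that writes rep_len as a named function around the same np.resize call and exercises both the recycling and the truncating case. The short input lists are illustrative only.*

```python
import numpy as np

def rep_len(v, length_out):
    """Recycle the elements of v until the result has length_out items."""
    return np.resize(v, length_out)

# Longer than the input: elements are recycled from the start.
print(rep_len(['q1', 'q2', 'q3', 'q4'], 6).tolist())
# ['q1', 'q2', 'q3', 'q4', 'q1', 'q2']

# Shorter than the input: the result is simply truncated.
print(rep_len([1, 2, 3], 2).tolist())
# [1, 2]
```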

**<em>R's rep function repeats each element of a vector a given number of times to fill a predetermined length. If the predetermined length exceeds the length of the vector multiplied by the number of repeats, the function starts over from the beginning and keeps going until the predetermined length is filled.</em>**

**Here's how to duplicate this in Python**

*<b>Note</b>: the lambda defined in the cell below only works when the length_out param is no greater than len(x) $\times$ reps*

In [29]:
rep = lambda x,length_out,reps:np.repeat(x,reps)[:length_out]

In [30]:
len(rep(list(range(2000,2005)),27,4))

20

*Notice that the array created in the cell above has only 20 elements, because 5 $\times$ 4 = 20 $\lt$ 27. Once the repeated elements were exhausted, nothing was recycled to fill the predetermined length.*

*It works in the cell below because the requested length, 67, is less than 17 $\times$ 4 = 68*
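
*The arithmetic behind those two cases can be sketched as follows. The helper name max_len_without_recycling is hypothetical and exists only to make the len(x) $\times$ reps bound explicit; rep is the same lambda as above.*

```python
import numpy as np

rep = lambda x, length_out, reps: np.repeat(x, reps)[:length_out]

def max_len_without_recycling(x, reps):
    # Hypothetical helper: the most elements rep() can return for x and reps.
    return len(x) * reps

few_years = list(range(2000, 2005))   # 5 years
all_years = list(range(2000, 2017))   # 17 years

# 5 * 4 = 20 < 27, so the result falls short of the requested length.
print(max_len_without_recycling(few_years, 4), len(rep(few_years, 27, 4)))
# 20 20

# 17 * 4 = 68 >= 67, so the requested length is reached.
print(max_len_without_recycling(all_years, 4), len(rep(all_years, 67, 4)))
# 68 67
```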

In [39]:
allYears = rep(list(range(2000,2017)),(2016-2000)*4+3,4).tolist()
allYears

[2000,
 2000,
 2000,
 2000,
 2001,
 2001,
 2001,
 2001,
 2002,
 2002,
 2002,
 2002,
 2003,
 2003,
 2003,
 2003,
 2004,
 2004,
 2004,
 2004,
 2005,
 2005,
 2005,
 2005,
 2006,
 2006,
 2006,
 2006,
 2007,
 2007,
 2007,
 2007,
 2008,
 2008,
 2008,
 2008,
 2009,
 2009,
 2009,
 2009,
 2010,
 2010,
 2010,
 2010,
 2011,
 2011,
 2011,
 2011,
 2012,
 2012,
 2012,
 2012,
 2013,
 2013,
 2013,
 2013,
 2014,
 2014,
 2014,
 2014,
 2015,
 2015,
 2015,
 2015,
 2016,
 2016,
 2016]

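*Create 67 quarters of random normal data scattered about 50 (w/ stdev 5) and index the frame by year and quarter. Note that the cell below calls np.random.normal(50,0.5) with no size argument, so it draws a single value (with stdev 0.5, not 5) and pandas broadcasts that one value to every row, as the output shows. Passing 5 as the scale and size=len(prices) would give one independent draw per quarter.*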

In [40]:
prices = pd.DataFrame({'year':allYears,'quarter':allQuarters})
prices['price'] = np.random.normal(50,0.5)
prices.set_index(['year','quarter'],inplace = True)
prices

year,quarter,price
2000,q1,50.19102
2000,q2,50.19102
2000,q3,50.19102
2000,q4,50.19102
2001,q1,50.19102
2001,q2,50.19102
2001,q3,50.19102
2001,q4,50.19102
2002,q1,50.19102
2002,q2,50.19102
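
*Every row above carries the same price because a single scalar was assigned to the whole column. A minimal sketch of drawing one value per quarter instead is shown here. It rebuilds the year and quarter columns so that it is self-contained, and it uses np.random.default_rng with an arbitrary seed of 0 for reproducibility; both choices are assumptions rather than part of the original cells.*

```python
import numpy as np
import pandas as pd

n = (2016 - 2000) * 4 + 3                          # 67 quarters
years = np.repeat(np.arange(2000, 2017), 4)[:n]    # each year 4 times, cut to n
quarters = np.resize(['q1', 'q2', 'q3', 'q4'], n)  # q1..q4 recycled to n

rng = np.random.default_rng(0)                     # seed chosen arbitrarily
prices = pd.DataFrame({'year': years, 'quarter': quarters})
# size=len(prices) requests one independent draw per row.
prices['price'] = rng.normal(loc=50, scale=5, size=len(prices))
prices = prices.set_index(['year', 'quarter'])

print(len(prices), prices['price'].nunique())
```

*If the two printed numbers agree, every quarter received its own draw.*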


<b><em>Here's a version of R's rep function that recycles values when the parameter length_out exceeds len(x) $\times$ repeat. The parameter 'repeat' plays the role of R's 'each' parameter, and the parameter 'length_out' plays the role of R's 'length.out'.</em></b>

In [60]:
def repeat_recycle(x, repeat, length_out):
    # Repeat each element `repeat` times, then keep at most length_out items.
    repeated = np.repeat(x, repeat)[:length_out]
    if len(x)*repeat >= length_out:
        return repeated
    # Too few elements: cycle through the repeated block until length_out is reached.
    n = len(repeated)
    return np.array([repeated[i%n] for i in range(length_out)])

In [63]:
repeat_recycle(list(range(2000,2002)), repeat = 4, length_out = 15)


array([2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2000, 2000, 2000,
       2000, 2001, 2001, 2001])

*The original lambda would've given us the output below (note that it takes length_out and repeat in the opposite order)*

In [64]:
rep(list(range(2000,2002)), 15, 4)

array([2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001])
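
*The element-by-element fill in repeat_recycle can probably be avoided by composing the two NumPy calls used earlier: np.repeat for the consecutive repeats and np.resize for the recycling. The name repeat_recycle_np is hypothetical, and the sketch assumes x is non-empty.*

```python
import numpy as np

def repeat_recycle_np(x, repeat, length_out):
    # np.repeat handles the 'each' behaviour; np.resize truncates or
    # recycles the repeated block so the result has exactly length_out items.
    return np.resize(np.repeat(x, repeat), length_out)

print(repeat_recycle_np([2000, 2001], repeat=4, length_out=15).tolist())
# [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2000, 2000, 2000, 2000, 2001, 2001, 2001]

# When length_out is small enough, the result is just the truncated repeats.
print(len(repeat_recycle_np(list(range(2000, 2017)), repeat=4, length_out=67)))
# 67
```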