<h1>Problem 1</h1>
Write a function that constructs an ndarray from data in a file and 
returns the n-period percent change of that data after removing any nan values. Your function must make use of the following functions:
<li><a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html">np.genfromtxt</a></li>
<li><a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.pad.html">np.pad</a></li>
<li><a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.column_stack.html">np.column_stack</a></li>
    

<p>Test your function using the attached SBUX.csv file. Your function can assume that each row of the file has the structure (date, price).

<p>Note that percent change is a trailing percent change. At time t, the one-period change is defined as p(t)/p(t-1) - 1, where p(t) is the price at time t; the n-period change is p(t)/p(t-n) - 1.
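As a small check of the definition, the sketch below computes a trailing 2-period change with plain slicing. The prices are made up for illustration and are not taken from SBUX.csv.

```python
import numpy as np

# Hypothetical prices, chosen only to make the arithmetic easy to follow
p = np.array([100.0, 110.0, 121.0, 108.9])
n = 2

# Trailing n-period change: p(t) / p(t-n) - 1, defined only for t >= n
pct = p[n:] / p[:-n] - 1
print(pct)
```

The result has n fewer entries than the input, which is why the procedure pads the shifted series to keep it aligned with the dates.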




<h2>Procedure:</h2>
<li>Read the file using <span style="color:red">np.genfromtxt</span></li>
<li>Separate the array into two vectors, one with dates and the other with prices (prices should be float64)</li>
<li>You will need to make sure that any arrays you create are correctly aligned with the dates array!</li>
<li>Create a new vector "prices_0" that shifts all the data n places to the right: pad the front with n values (so that the date alignment is correct) and drop the last n values (so that the length of the array equals the length of prices). You will need <span style="color:red">np.pad</span> for this</li>
<li>Use element-wise arithmetic operations on prices and prices_0 to create a new array of percent changes</li>
<li>Combine dates and percent changes into a new 2-d array (use <span style="color:red">np.column_stack</span> for this)</li>
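The reading and combining steps of the procedure can be sketched on an in-memory stand-in for the file. This is a minimal illustration, assuming a headerless (date,price) layout. The two-row sample and the names `sample`, `data`, and `combined` are hypothetical.

```python
import io
import numpy as np

# Hypothetical two-line stand-in for a (date,price) file such as SBUX.csv
sample = io.StringIO("2018-09-25,55.830688\n2018-09-26,56.193737\n")

# A structured dtype lets each column keep its own type
data = np.genfromtxt(sample, delimiter=",",
                     dtype=[("date", "U10"), ("price", "float64")])

dates = data["date"]     # unicode strings
prices = data["price"]   # float64

# column_stack needs one common dtype, so the floats are stored as strings
combined = np.column_stack((dates, prices))
print(prices.dtype, combined.dtype, combined.shape)
```

Because the stacked array holds strings, the numeric column has to be converted back with `.astype('float64')` before any further arithmetic.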

In [1]:
import numpy as np
x = np.array([
    ['2018-09-25', '55.830688'],
    ['2018-09-26', '56.193737'],
    ['2018-09-27', '56.262421'],
    ['2018-09-28', '55.771816'],
    ['2018-10-01', '54.535500'],
    ['2018-10-02', '54.545311'],
    ['2018-10-03', '54.427563'],
    ['2018-10-04', '54.839672'],
    ['2018-10-05', '54.712109'],
    ['2018-10-08', '55.477455']])

print(x)

#For n=2, your function should return:

"""
np.array([        nan,         nan,  0.0077329 , -0.00750833, -0.03069404,
       -0.02199148, -0.00197921,  0.00539663,  0.00522798,  0.01162996])
"""

[['2018-09-25' '55.830688']
 ['2018-09-26' '56.193737']
 ['2018-09-27' '56.262421']
 ['2018-09-28' '55.771816']
 ['2018-10-01' '54.535500']
 ['2018-10-02' '54.545311']
 ['2018-10-03' '54.427563']
 ['2018-10-04' '54.839672']
 ['2018-10-05' '54.712109']
 ['2018-10-08' '55.477455']]


'\nnp.array([        nan,         nan,  0.0077329 , -0.00750833, -0.03069404,\n       -0.02199148, -0.00197921,  0.00539663,  0.00522798,  0.01162996])\n'

In [2]:
# np.pad
# np.pad pads a NumPy array with the specified values
# For example, x is a vector of length 4
x = [1.2, 3.1, 2.3, 1.4]

# We can convert it into a vector of length 8 by adding two NaNs before and after the array
np.pad(x, (2, 2), 'constant', constant_values=np.nan)

array([nan, nan, 1.2, 3.1, 2.3, 1.4, nan, nan])
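Padding on one side only and then truncating gives the right shift that Problem 1 asks for. The sketch below reuses the same four values. Slicing with `[:len(prices)]` rather than `[:-n]` is a choice made here so that n = 0 also works.

```python
import numpy as np

prices = np.array([1.2, 3.1, 2.3, 1.4])
n = 1

# Pad n NaNs on the left only, then keep the first len(prices) entries
shifted = np.pad(prices, (n, 0), "constant", constant_values=np.nan)[:len(prices)]
print(shifted)
```

Each entry of `shifted` now holds the value from n positions earlier, and the first n entries are NaN.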

In [3]:
import numpy as np

def get_pct_changes(file_name, n):
    # Read the comma-separated file into a structured array with named, typed columns
    temp = np.genfromtxt(file_name,
                         dtype={'names': ('date', 'prices'),
                                'formats': ('<U13', 'float64')},
                         delimiter=",")

    # Split the structured array into one vector per column
    dates = temp['date']
    prices = temp['prices']

    # Shift prices n places to the right: pad n NaNs at the front and
    # truncate to the original length so prices_0 stays aligned with dates
    prices_0 = np.pad(prices, (n, 0), 'constant', constant_values=np.nan)[:len(prices)]

    # Trailing n-period percent change: p(t) / p(t-n) - 1
    pct_changes = prices / prices_0 - 1

    # Combine dates and percent changes column-wise into a 2-d array
    combined = np.column_stack((dates, pct_changes))

    return combined
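The function leaves the first n rows as nan. If those rows should be dropped, one possible approach is to filter on `np.isnan`. The sketch below uses a hypothetical three-row result in the same shape as the return value, not real SBUX.csv output.

```python
import numpy as np

# Hypothetical result in the same shape as the function's return value:
# column 0 holds dates, column 1 holds percent changes stored as strings
combined = np.array([["2018-09-25", "nan"],
                     ["2018-09-26", "nan"],
                     ["2018-09-27", "0.0077329"]])

# Convert the second column back to float and keep rows that are not NaN
changes = combined[:, 1].astype("float64")
cleaned = combined[~np.isnan(changes)]
print(cleaned)
```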

In [4]:
get_pct_changes("SBUX.csv",2)

array([['9/24/2018', 'nan'],
       ['9/25/2018', 'nan'],
       ['9/26/2018', '0.010231132487091843'],
       ['9/27/2018', '0.007732897721052678'],
       ['9/28/2018', '-0.007508327840876627'],
       ['10/1/2018', '-0.030694039988076627'],
       ['10/2/2018', '-0.021991484014076246'],
       ['10/3/2018', '-0.0019792062051324777'],
       ['10/4/2018', '0.005396632535471291'],
       ['10/5/2018', '0.0052279761267282066'],
       ['10/8/2018', '0.01162995650302201'],
       ['10/9/2018', '0.03497134427773574'],
       ['10/10/2018', '-0.009373861868753663'],
       ['10/11/2018', '-0.04938477767908833'],
       ['10/12/2018', '0.007855754938409953'],
       ['10/15/2018', '0.034451243770345696'],
       ['10/16/2018', '0.02409211955937418'],
       ['10/17/2018', '0.041409748876085395'],
       ['10/18/2018', '0.014357342080123292'],
       ['10/19/2018', '-0.007445039990621627'],
       ['10/22/2018', '0.0044337835848318186'],
       ['10/23/2018', '0.002557211675890736'],
      

<h1>Problem 2</h1>
A naive trader comes up with the following trading strategy:
<li>First calculate the n-period percent change on a price series</li>
<li>If the n-period percent change on day t is positive, then buy the stock and hold for the next day</li>
<li>If the n-period percent change on day t is negative or zero, then do nothing</li>

Write a function that computes the total return from this strategy. You can assume that a constant amount of capital is invested each day (in other words, sum up the one-day percent change following each day on which the n-day percent change is positive).

For the example above (x), your function should return:

0.00294255

For the entire SBUX.csv file, your function should return:

0.28934086191208763

For this function:

<li>Get the n-day percent changes using the function from problem 1</li>
<li>Construct a mask on the 2nd column (pct changes) for values >0.0</li>
<li>Get the 1-day percent changes using the function from problem 1</li>
<li>Shift the mask values one place to the right, keeping the array size constant, so that a signal on day t lines up with the return on day t+1</li>
<li>Apply the shifted mask to the one-day percent changes and add up the values</li>
<li>Return the result!</li>
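The steps above can be sketched directly on the ten prices from the example array x, without reading a file. `trailing_change` is a hypothetical helper introduced only for this illustration. It is not the `get_pct_changes` function from Problem 1.

```python
import numpy as np

# Prices from the 10-row example array x in Problem 1
prices = np.array([55.830688, 56.193737, 56.262421, 55.771816, 54.535500,
                   54.545311, 54.427563, 54.839672, 54.712109, 55.477455])

def trailing_change(p, n):
    """Hypothetical helper: n-period trailing change, NaN for the first n entries."""
    lagged = np.pad(p, (n, 0), "constant", constant_values=np.nan)[:len(p)]
    return p / lagged - 1

n = 2
# Signal: n-day change is positive (comparisons with NaN evaluate to False)
signal = trailing_change(prices, n) > 0.0
# Shift the signal one day right so it selects the following day's return
held = np.pad(signal, (1, 0), "constant", constant_values=False)[:len(signal)]
total = trailing_change(prices, 1)[held].sum()
print(round(total, 8))
```

For n = 2 the total should agree with the 0.00294255 quoted for the example above, up to rounding.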

In [5]:
def p_and_l(filename, n):
    # n-day percent changes (column 0: dates, column 1: changes stored as strings)
    n_pct_changes = get_pct_changes(filename, n)

    # Mask on the second column for values > 0.0
    mask = n_pct_changes[:, 1].astype('float64') > 0.0

    # One-day percent changes
    one_pct_changes = get_pct_changes(filename, 1)

    # Shift the mask one place right: a signal on day t selects the return on day t+1
    mask_shifted = np.pad(mask, (1, 0), 'constant', constant_values=False)[:-1]

    # Sum the one-day percent changes on the days the position is held
    return sum(one_pct_changes[:, 1].astype('float64')[mask_shifted])

p_and_l("SBUX.csv", 2)

  


0.28934086191208763