# Bootstrapping a Linear Regression

In [1]:
include("jlFiles/printmat.jl")

include("jlFiles/lagnPs.jl")
include("jlFiles/excise.jl")
include("jlFiles/OlsFn.jl")

OlsFn (generic function with 1 method)

In [2]:
#xx   = readdlm("Data/FFmFactorsPs.csv",',',header=true)
#x    = xx[1]
#Rme  = x[:,2]
#RSMB = x[:,3]                #small minus big firms
#RHML = x[:,4]                #high minus low book-to-market ratio
#Rf   = x[:,5]                    #interest rate
#
#
#x = readdlm("Data/FF25Ps.csv",',')  #no header line: x is matrix
#R  = x[:,2:end]                    #returns for 25 FF portfolios
#Re = R - repmat(Rf,1,size(R,2))   #excess returns for the 25 FF portfolios
#y = Re[:,[1,25]]                 #use only portfolio 1 (small growth) and 25 (large value)
#x = [ones(size(Re,1),1) Rme RSMB RHML]
#------------------------------------------------------------------------------

xx  = readdlm("Data/BondPremiaPs.csv",',',header=true)
xx  = xx[1]
rx  = xx[:,2:5]                   #bond excess returns
f   = xx[:,6:end]                 #forward rates

x = [ones(size(f,1)) lagnPs(f,12)]

yx = excise([rx[:,4] x])          #to use in regressions 
y  = yx[:,1]
x  = yx[:,2:end]

(T,n) = (size(y,1),size(y,2))       #no. obs and no. test assets
K     = size(x,2)

println("T = $T, n = $n, K = $K")

T = 580, n = 1, K = 6


## Point Estimates

In [3]:
(bLS,u,yhat,Covb,) = OlsFn(y,x)              #OLS estimate and classical std errors
StdbLS = sqrt.(diag(Covb))

println("\nLS coeffs and std")
printmat([bLS';StdbLS'])


LS coeffs and std
    -3.306    -4.209    10.627   -14.397     7.096     1.284
     0.823     0.712     4.509    12.885    15.862     6.898



## Bootstrap


Next comes the bootstrap itself: a block bootstrap of the OLS residuals.

The code loops over NSim simulations.

In each loop, we first draw a random starting point (row number) for each
block (using the rand() function) and create a vector of all rows that belong
to a block. For instance, suppose each block contains $10$ rows and the random
draws say that the blocks should start on rows $27$ and $35$ (assuming only two
blocks in each simulation). The artificial sample then picks out rows $27$ to
$36$ and $35$ to $44$. Clearly, some rows can be in several blocks. A block that
runs past row $T$ wraps around to the beginning of the sample. The number of
blocks is $T$ divided by the block size, rounded up, so we may get more than
$T$ rows and keep only the first $T$. The residuals on these rows define a new
series of residuals, $\tilde{u}_{t}$.
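
The row selection described above can be sketched in isolation as follows. This is only an illustration: BlockRows is a hypothetical helper (it is not one of the functions in jlFiles), and the starting rows are fixed to the values of the example instead of being drawn with rand().

```julia
# Hypothetical helper, not part of the notebook: list the rows covered by
# blocks that start on the rows in `starts`, each `blockSize` rows long,
# wrapping around to row 1 after row T.
function BlockRows(starts,blockSize,T)
  rows = Int[]
  for s in starts, j = 0:blockSize-1
    push!(rows,mod(s+j-1,T)+1)        #mod() maps T+1 to 1, T+2 to 2, etc
  end
  return rows
end

rows = BlockRows([27,35],10,580)      #the two blocks from the example
println(rows)

rowsWrap = BlockRows([575],10,580)    #a block that runs past row T
println(rowsWrap)
```

The first call returns rows 27 to 36 followed by rows 35 to 44, so rows 35 and 36 appear twice. The second call returns rows 575 to 580 followed by rows 1 to 4, which illustrates the wrap-around.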

Then, new values of the dependent variable are created as 
$\tilde{y}_{t}=x_{t}^{\prime}\hat{\beta}+\tilde{u}_{t}$, where $\hat{\beta}$ is
the OLS point estimate (bLS), and we redo the estimation on 
($\tilde{y}_{t},x_{t}$).
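
Repeating these steps NSim times gives NSim coefficient vectors, and their spread across simulations serves as the bootstrapped standard error. As a sketch in notation that is not the notebook's own, let $\tilde{b}_{k}^{(i)}$ denote coefficient $k$ estimated in simulation $i$ and $\bar{b}_{k}$ its average across simulations. Assuming the usual sample standard deviation with an $N_{Sim}-1$ divisor (the default of Julia's std() function), the bootstrapped standard error of coefficient $k$ is

$$
\text{Std}(\tilde{b}_{k})=\sqrt{\frac{1}{N_{Sim}-1}\sum_{i=1}^{N_{Sim}}\left(\tilde{b}_{k}^{(i)}-\bar{b}_{k}\right)^{2}},
\quad
\bar{b}_{k}=\frac{1}{N_{Sim}}\sum_{i=1}^{N_{Sim}}\tilde{b}_{k}^{(i)}.
$$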

In [4]:
BlockSize = 10                  #size of blocks
NSim      = 2000                 #no. of simulations
srand(123)

nBlocks = round(Int,ceil(T/BlockSize))             #number of blocks, rounded up
bBoot   = fill(NaN,(NSim,K*n))                       #vec(b), [beq1 beq2..beqn]
for i = 1:NSim                                       #loop over simulations
  t_i        = rand(1:T,nBlocks,1)                   #nBlocks x 1, random starting row of blocks
  t_i        = t_i .+ collect(0:BlockSize-1)'        #nBlocks x BlockSize, each row is a block
  vv_i       = t_i .> T
  t_i[vv_i]  = t_i[vv_i] .- T                        #wrap around if index > T
  #println(t_i)                                      #uncomment to see which rows are picked out
  t_i        = vec(t_i')                             #column vector of the blocks
  utilde     = u[t_i,:]
  ytilde     = x*bLS + utilde[1:T,:]
  b_i,       = OlsFn(ytilde,x)                       #the trailing comma skips the remaining outputs
  bBoot[i,:] = b_i
end

println("\nAverage bootstrap estimates and bootstrapped std")
printmat([mean(bBoot,1); std(bBoot,1)])

println("\nbootstrapped std/OLS std")
printmat(std(bBoot,1)./StdbLS')


Average bootstrap estimates and bootstrapped std
    -3.318    -4.164    10.636   -14.643     7.371     1.206
     2.072     1.391     8.024    23.175    29.076    12.906


bootstrapped std/OLS std
     2.517     1.955     1.780     1.799     1.833     1.871

