# Summary and plan

Hi folks:

## What is in the notebook and what I did

* I fixed some namespace collisions that were causing confusion.
    * I changed the block size from b to w and started using the term warp in some places.
    * I changed P to M in (6.4).
* Importantly, I fixed a typo in the definition of b in what is now (6.9).
    * The clue was that the reduced model value did not match the non-reduced model value.
* I learned that PSB and SR1 behave very similarly on a quadratic test problem for their first step.
    * I was not surprised, but they are VERY similar.
* I learned that the fit quality is very close on this first step for a quadratic.
    * This has to do with extracting and accurately relaxing the "large" eigenvalues.
    * Still, it is much closer than I thought it would be.
    * We should not expect this on real problems.
* I fixed everything related to these changes that I could find in the manuscript.
    
## Plan

1. I may have misunderstood, but I did not see the functions isolated in a safe place where nobody touches them once we are confident that they are "correct".
    * I think the updates, the linear algebraic builds, etc. should live somewhere safe.
    * I would have an "include" file at the top, but that is probably completely retro.
2. Dan, could you apply the fixes I made and see whether they resolve all your problems?
3. This afternoon I am going to implement a fake version in Mathematica running on the Rosenbrock function.
    * This is so that I can start to think about how to describe the results.
    * I am being guided by the notes on my board from Wednesday (thanks Ben).
    * I am going to start editing the numerical experiment description.
     
 

In [4]:
using LinearAlgebra, TRS

orth (generic function with 1 method)

#  AAS Additions and comments
1. `orth` was undefined. It is defined and tested below.
2. I believe the tests of the updates were confusing inverses. Changed to be consistent with the manuscript. Changes are marked in the cells.
3. Cells split up to increase granularity.
4. I expected the updates to be defined in and exported from a package. I copied in the definitions from the other notebook, which is not good practice. It looks as though they are intended to live in a file util.jl, but I cannot find the file.

In [79]:
# include("util.jl")
function orth(A)
    Matrix(qr(A).Q)
end
(n,s) = (10,4)
S = orth(randn(n,s))
norm(S'*S - I)

7.9289057297445255e-16

# Update Tests

1. The underlying constant Hessian A is symmetric.
1. Per the manuscript, the approximations are H ~ inv(A) and B ~ A.
1. The update data satisfy V = A*U.
    * U is the input.
    * V is the output.
    * U'*V is automatically symmetric.
1. Switched to the new internal block dimension w (see revised manuscript).
1. I expected the updates to be defined in a package or included from a file.
    * I copied in the definitions from the other notebook, which is not good practice.
    * It looks as though they are intended to live in a file util.jl, but I cannot find the file.
1. The Symmetric tag was deleted to check symmetry and then restored.
1. The output is symmetric and satisfies the secant equation (both are checked in the cell below).
1. The updates look like they match the definitions. They should be stored in a single location where they cannot be broken and imported using standard techniques.
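
For reference, the conditions in this list can be written out as below. The update formulas are transcribed from the `bSR1` and `bPSB` definitions in this notebook, not from the manuscript. The symbols $R$, $T_1$, $T_2$ and $H_+$ are shorthand introduced here, and $(\cdot)^{+}$ denotes the pseudoinverse computed with tolerance $\delta$.

$$
V = AU,\quad A = A^{T} \;\Longrightarrow\; U^{T}V = U^{T}AU = (U^{T}V)^{T}
$$

$$
H_+^{\mathrm{SR1}} = H + R\,(R^{T}V)^{+}R^{T}, \qquad R = U - HV
$$

$$
H_+^{\mathrm{PSB}} = H + T_2 + T_2^{T} - T_2 V T_1 V^{T}, \qquad T_1 = (V^{T}V)^{+},\quad T_2 = V T_1 R^{T}
$$

When $R^{T}V$ and $V^{T}V$ are invertible, so that the pseudoinverses are true inverses, both updates satisfy the secant equation $H_+V = U$. Both are symmetric because $R^{T}V = U^{T}V - V^{T}HV$ is symmetric whenever $H$ and $U^{T}V$ are.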

In [77]:
## From Test!
# util.jl 
function bSR1(H::AbstractArray{<:Real}, U::AbstractArray{<:Real}, V::AbstractArray{<:Real}, δ::Float64)
    U_minus_HV = U - H*V
    return Symmetric(H + U_minus_HV * pinv(U_minus_HV'*V, δ) *  U_minus_HV')
    #return H + U_minus_HV * pinv(U_minus_HV'*V, δ) *  U_minus_HV'
end

function bPSB(H::AbstractArray{<:Real}, U::AbstractArray{<:Real}, V::AbstractArray{<:Real}, δ::Float64)
    T₁ = pinv(V'*V, δ)   
    T₂ = V*T₁*(U - H*V)'     
    return Symmetric(H + T₂ + T₂' - T₂*V*T₁*V') 
    #return H + T₂ + T₂' - T₂*V*T₁*V' 
end

# Minimalist Tests
(n,w) = (324,4)
# Define test matrices: symmetric but not SPD
Temp = randn(n,n);
A=Temp+Temp'
Temp = randn(n,n);
H0=Temp+Temp'
# define sample data
U = randn(n,2w); V=A*U;
# Compute updates 
delta=1e-6
Hp = bSR1(H0,U,V,delta);
Hs = bPSB(H0,U,V,delta);
# Check symmetry.
map(norm, (Hp-Hp',Hs-Hs'))
#Check Secant equation
map(norm, (Hp*V-U,Hs*V-U))

(2.822787533668037e-10, 2.750804979956145e-11)
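
One possible way to keep the updates in a single place is sketched below, assuming a hypothetical module named `BlockUpdates` (the name, the file layout, and the seed are illustrative and not taken from the notebook). The function bodies are copied from the cell above, with the `δ` type loosened to `Real`. The secant check is turned into a test with a tolerance. Symmetry is not tested here, because the `Symmetric` wrapper enforces it by construction.

```julia
# Hypothetical contents of util.jl; the module name BlockUpdates is an assumption.
module BlockUpdates

using LinearAlgebra
export orth, bSR1, bPSB

# Orthonormal basis for the columns of A via thin QR (as defined in this notebook).
orth(A::AbstractMatrix) = Matrix(qr(A).Q)

# Block SR1 update of H ~ inv(A) from data with V = A*U.
# δ is forwarded to pinv as its tolerance argument, as in the notebook.
function bSR1(H::AbstractMatrix{<:Real}, U::AbstractMatrix{<:Real},
              V::AbstractMatrix{<:Real}, δ::Real)
    R = U - H*V
    return Symmetric(H + R * pinv(R'*V, δ) * R')
end

# Block PSB update of H ~ inv(A) from the same data.
function bPSB(H::AbstractMatrix{<:Real}, U::AbstractMatrix{<:Real},
              V::AbstractMatrix{<:Real}, δ::Real)
    T₁ = pinv(V'*V, δ)
    T₂ = V*T₁*(U - H*V)'
    return Symmetric(H + T₂ + T₂' - T₂*V*T₁*V')
end

end # module

using .BlockUpdates, LinearAlgebra, Random, Test

Random.seed!(1)                      # fixed seed so the check is reproducible
n, w = 40, 4
A  = (T = randn(n, n); T + T')       # symmetric but not SPD
H0 = (T = randn(n, n); T + T')
U  = randn(n, 2w); V = A*U           # consistent data: V = A*U

@testset "block updates satisfy the secant equation" begin
    for update in (bSR1, bPSB)
        H = update(H0, U, V, 1e-6)
        @test norm(H*V - U) <= 1e-6 * norm(U)
    end
end
```

With the module saved as a file, a notebook would start with `include("util.jl")` followed by `using .BlockUpdates` instead of pasting the definitions into a cell.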

# Trust region subproblem construction

1. The plan is to build out the trust region subproblem and check the derivation.
    * Working from the manuscript.
    * Equation numbers and variable names match.
1. Working from Dan's notebook but testing on general problems. Lots of changes!
    * Changed s to w per the change in the manuscript.
    * Built A for clarity.
    * I may have been confused, but I think there was a missing inverse in the construction of mp (6.1).
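
The derivation being checked can be sketched as follows. This is transcribed from the code in this notebook rather than quoted from the manuscript. Here $g$ stands for the gradient `df0`, $H$ for either update (`Hp` or `Hs`), $Q$ for the basis matrix (`Qp` or `Qs`), and $a$ for the reduced variable; the code comments associate the first two lines with (6.1) and (6.2), and the definitions of $P$ and $b$ with (6.9).

$$
\begin{aligned}
m(p) &= \tfrac12\, p^{T}H^{-1}p + g^{T}p \\
m(Hq) &= \tfrac12\, q^{T}Hq + (Hg)^{T}q && \text{substituting } p = Hq,\ H = H^{T} \\
m(HQa) &= \tfrac12\, a^{T}Pa + b^{T}a, && P = Q^{T}HQ,\quad b = Q^{T}Hg \\
\|p\| \le \Delta \;&\Longleftrightarrow\; a^{T}Ca \le \Delta^{2}, && C = Q^{T}H^{2}Q
\end{aligned}
$$

The inverse in the first line is the one that matters: without it, the substitution $p = Hq$ does not reduce to the second line.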

In [242]:
# Building problem
(n,w) = (32,6);
# Build A and d for the function f(x) = 1/2 x'*A*x + d'*x.
# Avoiding the name b to avoid namespace collisions.
A  = (Temp = rand(n, n); Symmetric(Temp + Temp'))
d = rand(n);
function f(x)
    0.5*x'*A*x + d'*x
end
# Initializing H0 and x0
H0 = (Temp = rand(n, n); Symmetric(Temp + Temp'))
x0 = rand(n)
# Fake data for U and V from A.  In the real world would use gHS
# Naming as in Alg 4.1 with zeros appended.  df is gradf
S0 = orth(rand(n, 2w-1))
Y0 =A*S0
df0 = A*x0 + d
h0 = A*df0
# Build U and V. I think the manuscript needs a numbered equation for this, unless I am missing it.
# Remember U is input and V is output.
U = [S0 df0]
V = [Y0 h0]
# check data is consistent
# println(norm(V-A*U))
# Update Hp (block SR1) and Hs (block PSB)

delta = 1e-6
Hp = bSR1(H0,U,V,delta)
Hs = bPSB(H0,U,V,delta)
# build standard quadratic model which is consciously not named in (6.1)
# There are two here called mSTDs and mSTDp. Variable is full dimensional p to match 6.1
InvHp = inv(Hp)
InvHs = inv(Hs)

function mSTDp(p) 
    0.5*p'*InvHp*p + df0'*p
end
function mSTDs(p) 
    0.5*p'*InvHs*p + df0'*p
end
# build fancy quadratic models named in (6.2)
function mkp(q) 
    0.5*q'*Hp*q + (Hp*df0)'*q
end
function mks(q) 
    0.5*q'*Hs*q + (Hs*df0)'*q
end
# check substitution p = H*q in both variants
q0=rand(n);
(mSTDs(Hs*q0)-mks(q0), mSTDp(Hp*q0)-mkp(q0))

# Build out the arguments for the TRS call
# (6.4) defines explicit representations of the search spaces M.
# Note: These were called P, which I changed to M to avoid a namespace collision with TRS.
# As before, the appended p or s indicates which update is used.
Mp = [df0 Hp*df0 S0]
Ms = [df0 Hs*df0 S0]
# Building out the alternate representations for the spaces in 6.5
# Note I just noticed that this space (matrices) is visibly independent of the update
# I made both for consistency. 
Qs = [h0 df0 Y0]
Qp = [h0 df0 Y0]
# Just checking
# println(size(Qs)); svd(Qs).S
# Making the P and b arguments (6.9) for Julia TRS
# Asserting symmetry to make TRS happy!
# As before, appended p and s indicates which update is used.
Ps = Symmetric(Qs'*Hs*Qs)
Pp = Symmetric(Qp'*Hp*Qp)
# The Ps are not the same. Checking conditioning: it is similar for both.
#[eigen(Ps).values eigen(Pp).values]
# The bs are not the same. There was a typo in the expressions for b in what is now (6.9).
bs = Qs'*Hs*df0
bp = Qp'*Hp*df0
# the Cs are not the same 
Cs = Symmetric(Qs'*Hs*Hs*Qs)
Cp = Symmetric(Qp'*Hp*Hp*Qp)
# checking conditioning.  They are similar
#[eigen(Cs).values eigen(Cp).values]
# calling trs_small per (6.8)
DeltaRadius = 2.7;
as, sFlags = trs_small(Ps,bs,DeltaRadius,Cs);
ap, pFlags = trs_small(Pp,bp,DeltaRadius,Cp);
# The flags return (hard_case, 0, 0, [lambdas]).
# We should almost never be in the hard case. Sampled, and the results match intuition.
# Comparing the constraint from the low-dimensional model with the full-dimensional model
[as'*Cs*as-[DeltaRadius^2] ap'*Cp*ap-[DeltaRadius^2];
norm(Hs*Qs*as)-DeltaRadius norm(Hp*Qp*ap)-DeltaRadius]
# Comparing objective values from the low-dimensional, restricted, and full-dimensional models.
# The two update models are similar but not the same, again matching intuition.
# First col is (6.7), second is (6.6) aka (6.2), third is (6.1), fourth is the actual reduction.
# First row is PSB (the s variables, Hs = bPSB), second row is SR1 (the p variables, Hp = bSR1).
# Note: lots of things return 1x1 matrices. My f returns a scalar, so [f(x0)] is needed to subtract!
# Note: rho is very good here! It should be, since these are quadratics.
[0.5*as'*Ps*as+bs'*as mks(Qs*as) mSTDs(Hs*Qs*as) f(x0+Hs*Qs*as)-[f(x0)];
    0.5*ap'*Pp*ap+bp'*ap mkp(Qp*ap) mSTDp(Hp*Qp*ap) f(x0+Hp*Qp*ap)-[f(x0)]]

2×4 Matrix{Float64}:
 -137.32   -137.32   -137.32   -137.324
 -136.464  -136.464  -136.464  -136.597
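
Reading rho as the usual trust-region ratio of actual to predicted reduction (an assumption; the manuscript's definition is not quoted here), the rounded values printed above give roughly:

$$
\rho = \frac{f(x_0 + p) - f(x_0)}{m(p)}, \qquad
\rho_{\text{row 1}} \approx \frac{-137.324}{-137.32} \approx 1.0000, \qquad
\rho_{\text{row 2}} \approx \frac{-136.597}{-136.464} \approx 1.0010
$$

Here $m(p)$ is the third column and the actual reduction is the fourth. These ratios come from the displayed, rounded output, so only the leading digits are meaningful.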

In [236]:
mSTDs(Hs*Qs*as)

1×1 Matrix{Float64}:
 -63.11371235242522

In [237]:
f(x0)

163.83470881876383