In [2]:
using VMLS
using LinearAlgebra

## Chapter 8
# Linear equations
### 8.1 Linear and affine functions
**Matrix-vector product function.** Let’s define an instance of the matrix-vector
product function, and then numerically check that superposition holds.

In [1]:
A = [-0.1 2.8 -1.6; 2.3 -0.6 -3.6] # Define 2x3 matrix A

2×3 Array{Float64,2}:
 -0.1   2.8  -1.6
  2.3  -0.6  -3.6

In [3]:
f(x) = A*x # Define matrix-vector product function

f (generic function with 1 method)

In [4]:
# Let’s check superposition
x = [1, 2, 3]; y = [-3, -1, 2];
alpha = 0.5; beta = -1.6;
lhs = f(alpha*x+beta*y)
rhs = alpha*f(x)+beta*f(y)
lhs,rhs

([9.47, 16.75], [9.47, 16.75])

In [24]:
norm(lhs-rhs), f([0,1,0]), A[:,2] # f(e_2) should equal the second column of A

(1.7763568394002505e-15, [2.8, -0.6], [2.8, -0.6])
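The check above uses only the second unit vector. Applying $f$ to every unit vector recovers all the columns of $A$, which can be sketched as follows. The helper names `unit_vector` and `matrix_from_function` are made up for this illustration and are not part of VMLS.

```julia
using LinearAlgebra

A = [-0.1 2.8 -1.6; 2.3 -0.6 -3.6]
f(x) = A*x

# j-th unit vector of length n (hypothetical helper, not from VMLS)
unit_vector(j, n) = [i == j ? 1.0 : 0.0 for i in 1:n]

# Stack f(e_1), ..., f(e_n) as columns to rebuild the matrix of a linear function
matrix_from_function(g, n) = hcat([g(unit_vector(j, n)) for j in 1:n]...)

A_rebuilt = matrix_from_function(f, 3)
norm(A_rebuilt - A) # Should be zero up to rounding
```

This only recovers a meaningful matrix when the function is linear; for a nonlinear function the result does not represent the function.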

**De-meaning matrix.** Let’s create a de-meaning matrix, and check that it works
on a vector.

In [25]:
de_mean(n) = eye(n) .- 1/n; # De-meaning matrix
x = [0.2, 2.3, 1.0];
ex1 = de_mean(length(x))*x # De-mean using matrix multiplication
ex2 = x .- avg(x) # De-mean by subtracting mean
ex1, ex2

([-0.966667, 1.13333, -0.166667], [-0.966667, 1.13333, -0.166667])
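Two further properties of the de-meaning matrix are easy to check numerically: the result has zero mean, and de-meaning a second time changes nothing. The sketch below assumes only the standard library, so it builds the identity with `Matrix(1.0I, n, n)` instead of the VMLS `eye`. The name `de_mean_matrix` is hypothetical.

```julia
using LinearAlgebra

# De-meaning matrix I - (1/n) 1 1^T, built without VMLS's eye
de_mean_matrix(n) = Matrix(1.0I, n, n) - ones(n, n) / n

x = [0.2, 2.3, 1.0]
x_tilde = de_mean_matrix(length(x)) * x

mean_after = sum(x_tilde) / length(x_tilde)              # Should be about 0
twice_gap = norm(de_mean_matrix(3) * x_tilde - x_tilde)  # Should be about 0
mean_after, twice_gap
```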

**Examples of functions that are not linear.** The componentwise absolute value
and the sort function are examples of nonlinear functions. In Julia they are
computed with `abs.` (the broadcast form of `abs`) and `sort`. By default, `sort` sorts in increasing
order; the optional keyword argument `rev = true` sorts in decreasing order instead.

In [27]:
f(x) = abs.(x) # componentwise absolute value

f (generic function with 1 method)

In [28]:
x = [1, 0]; y = [0, 1]; alpha = -1; beta = 2;
f(alpha*x + beta*y)

2-element Array{Int64,1}:
 1
 2

In [29]:
alpha*f(x) + beta*f(y)

2-element Array{Int64,1}:
 -1
  2

In [30]:
f(x) = sort(x, rev = true) # sort in decreasing order
f(alpha*x + beta*y)

2-element Array{Int64,1}:
  2
 -1

In [31]:
alpha*f(x) + beta*f(y)

2-element Array{Int64,1}:
 1
 0
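The comparisons above can be collected into one small helper that measures how far a function is from satisfying superposition for a given choice of inputs. This is only a sketch; the name `superposition_gap` is made up here.

```julia
using LinearAlgebra

# Norm of f(alpha*x + beta*y) - (alpha*f(x) + beta*f(y)) for one choice of inputs
superposition_gap(f, x, y, alpha, beta) =
    norm(f(alpha*x + beta*y) - (alpha*f(x) + beta*f(y)))

A = [-0.1 2.8 -1.6; 2.3 -0.6 -3.6]
gap_linear = superposition_gap(x -> A*x, [1, 2, 3], [-3, -1, 2], 0.5, -1.6)
gap_abs = superposition_gap(x -> abs.(x), [1, 0], [0, 1], -1, 2)
gap_sort = superposition_gap(x -> sort(x, rev = true), [1, 0], [0, 1], -1, 2)
gap_linear, gap_abs, gap_sort
```

A nonzero gap proves that a function is not linear. A zero gap for one choice of inputs does not prove linearity, since superposition must hold for all inputs.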

### 8.2 Linear function models
**Price elasticity of demand.** Let’s use a price elasticity of demand matrix to
predict the demand for three products when the prices are changed a bit. Using this
we can predict the change in total profit, given the manufacturing costs.

In [32]:
p = [10, 20, 15]; # Current prices
d = [5.6, 1.5, 8.6]; # Current demand (say in thousands)
c = [6.5, 11.2, 9.8]; # Cost to manufacture
profit = (p-c)'*d # Current total profit

77.51999999999998

In [33]:
# Demand elasticity matrix
E = [-0.3 0.1 -0.1; 0.1 -0.5 0.05 ; -0.1 0.05 -0.4]

3×3 Array{Float64,2}:
 -0.3   0.1   -0.1 
  0.1  -0.5    0.05
 -0.1   0.05  -0.4 

In [34]:
p_new = [9, 21, 14]; # Proposed new prices
delta_p = (p_new-p)./p # Fractional change in prices

3-element Array{Float64,1}:
 -0.1                
  0.05               
 -0.06666666666666667

In [35]:
delta_d = E*delta_p # Predicted fractional change in demand

3-element Array{Float64,1}:
  0.04166666666666667
 -0.03833333333333334
  0.03916666666666667

In [36]:
d_new = d .* (1 .+ delta_d) # Predicted new demand

3-element Array{Float64,1}:
 5.833333333333333
 1.4425           
 8.936833333333333

In [38]:
profit_new = (p_new-c)'*d_new # Predicted new profit
profit_new, profit

(66.25453333333333, 77.51999999999998)

The predicted profit drops from about 77.52 to about 66.25. If we trust the linear
demand elasticity model, we should not make these price changes.
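To try other price proposals, the steps above can be wrapped in a single function. This is a sketch under the same linear elasticity model; `predicted_profit` is a hypothetical name, not a VMLS function.

```julia
# Predicted total profit under the linear elasticity model (hypothetical helper)
function predicted_profit(p_new, p, d, c, E)
    delta_p = (p_new - p) ./ p   # Fractional change in prices
    delta_d = E * delta_p        # Predicted fractional change in demand
    d_new = d .* (1 .+ delta_d)  # Predicted new demand
    return (p_new - c)' * d_new  # Predicted new profit
end

p = [10, 20, 15]; d = [5.6, 1.5, 8.6]; c = [6.5, 11.2, 9.8];
E = [-0.3 0.1 -0.1; 0.1 -0.5 0.05; -0.1 0.05 -0.4];

profit_same = predicted_profit(p, p, d, c, E)              # No price change
profit_proposed = predicted_profit([9, 21, 14], p, d, c, E) # Proposed prices from the text
profit_same, profit_proposed
```

With unchanged prices the function returns the current profit, which is a quick sanity check. The model is only meant for small price changes, so large departures from `p` should not be trusted.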

**Taylor approximation.** Consider the nonlinear function $f : \mathbf{R}^2 \to \mathbf{R}^2$ given by
$$
f(x) =
\begin{bmatrix}
\|x - a\|\\
\|x - b\|
\end{bmatrix}
=
\begin{bmatrix}
\sqrt{(x_1 - a_1)^2 + (x_2 - a_2)^2}\\
\sqrt{(x_1 - b_1)^2 + (x_2 - b_2)^2}
\end{bmatrix}
.
$$

The two components of $f$ give the distance of $x$ to the points $a$ and $b$. The function is differentiable, except when $x = a$ or $x = b$. Its derivative or *Jacobian* matrix is given by

$$
Df(z) =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(z) & \frac{\partial f_1}{\partial x_2}(z) \\
\frac{\partial f_2}{\partial x_1}(z) & \frac{\partial f_2}{\partial x_2}(z)
\end{bmatrix}
 =
\begin{bmatrix}
\frac{z_1 - a_1}{\|z - a\|} & \frac{z_2 - a_2}{\|z - a\|} \\
\frac{z_1 - b_1}{\|z - b\|} & \frac{z_2 - b_2}{\|z - b\|}
\end{bmatrix}
.
$$
Let’s form the Taylor approximation of $f$ for some specific values of $a$, $b$, and $z$,
and then check it against the true value of $f$ at a few points near $z$.

In [39]:
f(x) = [ norm(x-a), norm(x-b) ];
Df(z) = [ (z-a)' / norm(z-a) ; (z-b)' / norm(z-b) ];
f_hat(x) = f(z) + Df(z)*(x-z);
a = [1, 0]; b = [1, 1]; z = [0, 0];
f([0.1, 0.1])

2-element Array{Float64,1}:
 0.9055385138137417
 1.2727922061357855

In [40]:
f_hat([0.1, 0.1])

2-element Array{Float64,1}:
 0.9               
 1.2727922061357857

In [41]:
f([0.5, 0.5])

2-element Array{Float64,1}:
 0.7071067811865476
 0.7071067811865476

In [43]:
f_hat([0.5, 0.5])

2-element Array{Float64,1}:
 0.5               
 0.7071067811865477
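The approximation is good near $z$ and degrades farther away. A small sketch that makes this visible evaluates the error $\|f(x) - \hat f(x)\|$ at points $x = z + t(1,1)$ for shrinking $t$. The values of `ts` are an arbitrary choice for this illustration.

```julia
using LinearAlgebra

a = [1, 0]; b = [1, 1]; z = [0, 0];
f(x) = [ norm(x-a), norm(x-b) ];
Df(z) = [ (z-a)' / norm(z-a) ; (z-b)' / norm(z-b) ];
f_hat(x) = f(z) + Df(z)*(x-z);

# Approximation error at the points z + t*[1, 1]
ts = [0.5, 0.1, 0.01]
errs = [norm(f(z .+ t) - f_hat(z .+ t)) for t in ts]
```

The errors decrease much faster than `t` does, which is what we expect from a first-order Taylor approximation. Along this particular direction the second component has essentially no error, because the points lie on the line through $z$ and $b$, where the distance to $b$ is an affine function of $t$.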

**Regression model.** We revisit the regression model for the house sales data in
Section [2.3](https://web.stanford.edu/~boyd/vmls/vmls.pdf#section*.192). The model is

$$
\hat{y} = x^T\beta + v = \beta_1 x_1 + \beta_2 x_2 + v,
$$

where $\hat{y}$ is the predicted house sale price, $x_1$ is the house area in 1000 square feet, and $x_2$ is the number of bedrooms.

In the following code we construct the $2 \times 774$ data matrix $X$ and vector of outcomes $y^d$, for the $N = 774$ examples in the data set. We then calculate the regression model predictions $\hat{y}^d$, the prediction errors $r^d$, and the RMS prediction error.

In [49]:
# parameters in regression model
beta = [148.73, -18.85]; v = 54.40;
D = house_sales_data();
yd = D["price"]; # vector of outcomes
N = length(yd)
X = [ D["area"] D["beds"] ]';
N, size(X) 

(774, (2, 774))

In [50]:
ydhat = X'*beta .+ v; # vector of predicted outcomes
rd = yd - ydhat; # vector of prediction errors
rms(rd) # RMS prediction error

74.84571862623022

In [51]:
# Compare with standard deviation of prices
stdev(yd)

112.78216159756509
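The same computation can be traced without the VMLS package on a tiny example. In the sketch below, `rms` and `stdev` are written out by hand, assuming the divide-by-$n$ convention for the standard deviation. The matrix `X` and vector `yd` are made-up toy values for three hypothetical houses, not taken from the house sales data set; only `beta` and `v` come from the text.

```julia
using LinearAlgebra

rms(x) = norm(x) / sqrt(length(x))       # Root-mean-square value
stdev(x) = rms(x .- sum(x) / length(x))  # RMS value of the de-meaned vector

beta = [148.73, -18.85]; v = 54.40;      # Model parameters from the text

# Made-up toy data (NOT from house_sales_data): rows are area and beds
X = [1.0 2.0 1.5; 2 3 3]
yd = [170.0, 290.0, 220.0]               # Made-up outcomes

ydhat = X'*beta .+ v                     # Vector of predicted outcomes
rd = yd - ydhat                          # Vector of prediction errors
rms(rd), stdev(yd)
```

Comparing the two numbers is the point of the exercise: the model is useful to the extent that the RMS prediction error is smaller than the standard deviation of the outcomes, which is the error of always predicting the mean.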

### 8.3 Systems of linear equations
**Balancing chemical reactions.** We verify the linear balancing equations on page [155](https://web.stanford.edu/~boyd/vmls/vmls.pdf#section*.192)
of VMLS, for the simple example of electrolysis of water.

In [52]:
R = [2 ; 1] # Reactant H2O: 2 H atoms, 1 O atom
P = [2 0 ; 0 2] # Products H2 and O2 (rows: H, O)
# Check balancing coefficients [2,2,1]
coeff = [2,2,1];
[R -P]*coeff

2-element Array{Int64,1}:
 0
 0
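Instead of verifying a known set of coefficients, we can also compute them, since any balancing vector lies in the null space of `[R -P]`. The following is a sketch using `nullspace` from the standard `LinearAlgebra` library. Scaling so that the last coefficient equals 1 is a choice made here for readability.

```julia
using LinearAlgebra

R = [2 ; 1]       # Reactant H2O: 2 H atoms, 1 O atom
P = [2 0 ; 0 2]   # Products H2 and O2 (rows: H, O)
B = float([R -P]) # 2x3 balancing matrix

N = nullspace(B)             # Orthonormal basis for the solutions of B*c = 0
c = N[:, 1] ./ N[end, 1]     # Rescale so the last coefficient is 1
```

Here the null space is one-dimensional, so the balanced reaction is unique up to scaling, and `c` is close to `[2, 2, 1]`. In general the rescaled coefficients need not be integers and may need a further common multiple.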