---
title: Introduction
author: Anqi Fu and Balasubramanian Narasimhan
date: '2017-10-29'
categories:
- Simple Least Squares
slug: cvxr_intro
---
Consider a simple linear regression problem where we wish to
estimate a set of parameters using a least squares criterion.
We generate some synthetic data where we know the model completely,
that is,
$$ Y = X\beta + \epsilon $$
where $Y$ is a $100\times 1$ vector, $X$ is a $100\times 10$ matrix,
$\beta = [-4,\ldots ,-1, 0, 1, \ldots, 5]$ is a $10\times 1$ vector, and
$\epsilon \sim N(0, 1)$.
```{r}
set.seed(123)
n <- 100
p <- 10
beta <- -4:5 # beta is just -4 through 5.
X <- matrix(rnorm(n * p), nrow=n)
colnames(X) <- paste0("beta_", beta)
Y <- X %*% beta + rnorm(n)
```
Given the data $X$ and $Y$, we can estimate the $\beta$ vector using
the `lm` function in R, which fits a standard linear regression model.
```{r}
ls.model <- lm(Y ~ 0 + X) # There is no intercept in our model above
m <- matrix(coef(ls.model), ncol = 1)
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
library(kableExtra)
knitr::kable(m, format = "html") %>%
kable_styling("striped") %>%
column_spec(1:2, background = "#ececec")
```
These are the least-squares estimates and can be seen to be reasonably
close to the original $\beta$ values -4 through 5.
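A quick way to see this is to look at the largest absolute deviation of
the estimates from the true coefficients (a small sanity check; the exact
value depends on the random seed set above).
```{r}
## Largest absolute deviation of the lm estimates from the true coefficients
max(abs(coef(ls.model) - beta))
```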
## The `CVXR` formulation
The `CVXR` formulation states the above as an optimization problem:
$$
\begin{array}{ll}
\underset{\beta}{\mbox{minimize}} & \|Y - X\beta\|_2^2,
\end{array}
$$
which directly translates into a problem that `CVXR` can solve as shown
in the steps below.
- Step 0. Load the `CVXR` library
```{r}
suppressWarnings(library(CVXR, warn.conflicts=FALSE))
```
- Step 1. Define the variable to be estimated
```{r}
betaHat <- Variable(p)
```
- Step 2. Define the objective to be optimized
```{r}
objective <- Minimize(sum((Y - X %*% betaHat)^2))
```
Notice how the objective is specified using functions such as `sum`,
`%*%` and `^` that are familiar to R users, despite the fact that
`betaHat` is not an ordinary R object but a `CVXR` expression. (An
equivalent way to write this objective using a `CVXR` atom is shown
just after Step 5 below.)
- Step 3. Create a problem to solve
```{r}
problem <- Problem(objective)
```
- Step 4. Solve it!
```{r}
result <- solve(problem)
```
- Step 5. Extract solution and objective value
```{r}
solution <- result$getValue(betaHat)
cat(sprintf("Objective value: %f\n", result$value))
```
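As noted in Step 2, the same objective can also be written more
compactly using the `sum_squares` atom provided by `CVXR`; the two
forms are equivalent (shown here without evaluation).
```{r, eval = FALSE}
## Equivalent objective using CVXR's sum_squares atom
objective <- Minimize(sum_squares(Y - X %*% betaHat))
```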
We can indeed satisfy ourselves that the results we get match those
from `lm`.
```{r}
m <- cbind(result$getValue(betaHat), coef(ls.model))
colnames(m) <- c("CVXR est.", "lm est.")
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m, format = "html") %>%
kable_styling("striped") %>%
column_spec(1:3, background = "#ececec")
```
## Wait a minute! What have we gained?
On the surface, it appears that we have replaced one call to `lm` with
at least five or six lines of new R code. On top of that, the code
actually runs slower, and so it is not clear what was really achieved.
So suppose we know that the $\beta$s are nonnegative and we wish to
take this fact into account. This is
[nonnegative least squares regression](https://en.wikipedia.org/wiki/Non-negative_least_squares),
and `lm` would no longer do the job.
In `CVXR`, the modified problem merely requires the addition of a constraint to the
problem definition.
```{r}
problem <- Problem(objective, constraints = list(betaHat >= 0))
result <- solve(problem)
m <- matrix(result$getValue(betaHat), ncol = 1)
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m, format = "html") %>%
kable_styling("striped") %>%
column_spec(1:2, background = "#ececec")
```
We can verify once again that these values are comparable to those
obtained from another R package,
say [nnls]( https://CRAN.R-project.org/package=nnls).
```{r}
library(nnls)
nnls.fit <- nnls(X, Y)$x
```
```{r}
m <- cbind(result$getValue(betaHat), nnls.fit)
colnames(m) <- c("CVXR est.", "nnls est.")
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m, format = "html") %>%
kable_styling("striped") %>%
column_spec(1:3, background = "#ececec")
```
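The agreement can also be checked numerically; the largest absolute
difference between the two sets of estimates should be negligible (up
to solver tolerance).
```{r}
## Largest absolute difference between the CVXR and nnls estimates
max(abs(result$getValue(betaHat) - nnls.fit))
```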
## Okay that was cool, but...
As you no doubt noticed, we have done nothing that other R packages
could not do.
Now suppose further, for some extraneous reason, that the sum of
$\beta_2$ and $\beta_3$ is known to be negative, but all the other
$\beta$s are positive.
It is clear that this problem would not fit into any standard
package. But in `CVXR`, this is easily done by adding a few
constraints.
To express the fact that $\beta_2 + \beta_3$ is negative, we construct
a $1\times 10$ matrix $A$ with ones in positions 2 and 3 (for
$\beta_2$ and $\beta_3$ respectively) and zeros everywhere else.
```{r}
A <- matrix(c(0, 1, 1, rep(0, 7)), nrow = 1)
colnames(A) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(A, format = "html") %>%
kable_styling("striped") %>%
column_spec(1:10, background = "#ececec")
```
The sum constraint is nothing but
$$
A\beta \leq 0,
$$
where the inequality is non-strict because that is all a convex solver
can enforce. We express this in R as
```{r}
constraint1 <- A %*% betaHat <= 0
```
_NOTE_: The above constraint can also be expressed simply as
```{r, eval = FALSE}
constraint1 <- betaHat[2] + betaHat[3] <= 0
```
but it is generally easier to work with matrices in `CVXR`.
For the positivity of the rest of the variables, we construct a $10\times
10$ matrix $B$ with 1's along the diagonal everywhere except rows 2
and 3, and zeros everywhere else.
```{r}
B <- diag(c(1, 0, 0, rep(1, 7)))
colnames(B) <- rownames(B) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(B, format = "html") %>%
kable_styling("striped") %>%
column_spec(1:11, background = "#ececec")
```
The constraint for positivity, again non-strict, is
$$
B\beta \geq 0,
$$
which we express in R as
```{r}
constraint2 <- B %*% betaHat >= 0
```
Now we are ready to solve the problem just as before.
```{r}
problem <- Problem(objective, constraints = list(constraint1, constraint2))
result <- solve(problem)
```
And we can get the estimates of $\beta$.
```{r}
m <- matrix(result$getValue(betaHat), ncol = 1)
rownames(m) <- paste0("$\\beta_{", 1:p, "}$")
knitr::kable(m, format = "html") %>%
kable_styling("striped") %>%
column_spec(1:2, background = "#ececec")
```
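A quick numerical check confirms that the estimates respect both
constraints (up to solver tolerance): the sum of the second and third
estimates is nonpositive, and the smallest of the remaining estimates
is nonnegative.
```{r}
est <- result$getValue(betaHat)
## Sum of the 2nd and 3rd estimates (should be at most 0)
est[2] + est[3]
## Smallest of the remaining estimates (should be at least 0)
min(est[-c(2, 3)])
```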
This demonstrates the chief advantage of `CVXR`: _flexibility_. Users
can quickly modify and re-solve a problem, making our package ideal
for prototyping new statistical methods. Its syntax is simple and
mathematically intuitive. Furthermore, `CVXR` combines seamlessly with
native R code as well as several popular packages, allowing it to be
incorporated easily into a larger analytical framework. The user is
free to construct statistical estimators that are solutions to a
convex optimization problem where there may not be a closed form
solution or even an implementation. Such solutions can then be
combined with resampling techniques like the bootstrap to estimate
variability.
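As a minimal sketch of that last idea (the number of resamples and the
summary below are purely illustrative), the constrained problem above
can be re-solved on bootstrap resamples of the rows of $(X, Y)$ and the
spread of the estimates summarized.
```{r, eval = FALSE}
nboot <- 50  # illustrative number of bootstrap resamples
boot_est <- sapply(seq_len(nboot), function(i) {
    idx <- sample(n, replace = TRUE)
    Xb <- X[idx, , drop = FALSE]
    Yb <- Y[idx, , drop = FALSE]
    objective_b <- Minimize(sum((Yb - Xb %*% betaHat)^2))
    problem_b <- Problem(objective_b, constraints = list(constraint1, constraint2))
    solve(problem_b)$getValue(betaHat)
})
## Bootstrap standard errors, one per coefficient
apply(boot_est, 1, sd)
```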
## Further Reading
We hope we have whetted your appetite. You may wish to
read [a longer introduction](/cvxr_examples/gentle-intro/)
with more examples.
We also have a number of [tutorial examples](/examples/)
available to study and mimic.
## Session Info
```{r}
sessionInfo()
```
## Source
[R Markdown](https://github.com/bnaras/cvxr_docs/blob/master/content/cvxr_examples/intro.Rmd)