# Information Retrieval in High Dimensional Data
## Lab 7

|     |     |
| --- | --- |
| **Name:** | Uzair Akbar |
| **Matriculation Number:** | 03697290 |
| **E-mail:** | [uzair.akbar@tum.de](mailto:uzair.akbar@tum.de) |

## CVXOPT
### Task 1
Machine learning tasks are typically framed as optimization problems, e.g. minimizing an error function or maximizing a probability. Ideally, the optimization problem turns out to be convex, which implies that any local minimum is also a global minimum and, even more importantly, that we can find it efficiently. In the following, it is assumed that you have some basic knowledge of convex optimization. The intention of this task is to familiarize ourselves with CVXOPT, one of the most widely used convex optimization toolboxes.

**a)**  Go to `cvxopt.org` and follow the installation instructions for your distribution. For conda, you need to run
`conda install -c conda-forge cvxopt`

**b)** Skim through the **Examples** section on `cvxopt.org` to get an overview of the functionality of the different solvers of CVXOPT.

In [4]:
from cvxopt import matrix, solvers
import numpy as np

**c)** Implement a function `minsq` which expects a NumPy array `A` of shape `(m,n)` and a NumPy array `y` of shape `(m,)` as its arguments and returns a NumPy array `x` of shape `(n,)` that solves the following problem.

<center>$\min_{\mathbf{x}} \|\mathbf{A}\mathbf{x}-\mathbf{y}\|$.</center>

Test your function by feeding it with appropriate inputs and comparing the results with the ones you get by using `np.linalg.pinv`. Experiment by adding white Gaussian noise to `y`. If CVXOPT does not accept your NumPy arrays, try casting them to `double`.
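
One possible route to a solution, assuming the norm above is the Euclidean norm, is to rewrite the problem as a quadratic program. Minimizing the norm is equivalent to minimizing its square, and the square expands as follows, where $\mathbf{P}$ and $\mathbf{q}$ are names chosen to match the standard form $\min_{\mathbf{x}} \tfrac{1}{2}\mathbf{x}^\top\mathbf{P}\mathbf{x} + \mathbf{q}^\top\mathbf{x}$ that `solvers.qp` expects:

$$
\begin{aligned}
\|\mathbf{A}\mathbf{x}-\mathbf{y}\|_2^2
&= \mathbf{x}^\top\mathbf{A}^\top\mathbf{A}\mathbf{x} - 2\,\mathbf{y}^\top\mathbf{A}\mathbf{x} + \mathbf{y}^\top\mathbf{y} \\
&= 2\left(\tfrac{1}{2}\,\mathbf{x}^\top\mathbf{P}\mathbf{x} + \mathbf{q}^\top\mathbf{x}\right) + \mathbf{y}^\top\mathbf{y},
\qquad \mathbf{P} = \mathbf{A}^\top\mathbf{A}, \quad \mathbf{q} = -\mathbf{A}^\top\mathbf{y}.
\end{aligned}
$$

The constant term $\mathbf{y}^\top\mathbf{y}$ and the factor $2$ do not change the minimizer, so an unconstrained QP with this $\mathbf{P}$ and $\mathbf{q}$ has the same solution as the least-squares problem.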

In [5]:
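def minsq(A, y):
    # min ||Ax - y||^2 is the QP: min 0.5 x^T P x + q^T x with P = A^T A, q = -A^T y.
    P = matrix(np.dot(A.T, A).astype('double'))
    q = matrix(-np.dot(A.T, y).astype('double'))
    sol = solvers.qp(P, q)
    # sol['x'] is an (n, 1) CVXOPT matrix; flatten it to a NumPy array of shape (n,).
    return np.array(sol['x']).ravel()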

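# Small overdetermined test system; the noise-free y equals A @ [1, 1].
A = np.array([[10, 40], [20, 0], [-30, 40]])
# Add white Gaussian noise with unit variance to y.
y = np.array([50, 20, 10]) + np.random.randn(3)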

print('A:', A)
print('y:', y)
print('x:', minsq(A,y).squeeze())
print('np.dot(pinv(A),y):', np.dot(np.linalg.pinv(A),y))

A: [[ 10  40]
 [ 20   0]
 [-30  40]]
y: [49.09765874 18.33829948 10.79004839]
x: [0.9440985  0.98462096]
np.dot(pinv(A),y): [0.9440985  0.98462096]
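
The noise experiment suggested in the task could be organized along the following lines. This is only a sketch: `pinv_solver` and `noise_sweep` are hypothetical helper names that do not appear in the lab, the noise levels are arbitrary, and the pseudoinverse is used as the solver so that the block runs without CVXOPT. It relies on the observation that the noise-free `y` used above equals `A @ [1, 1]`.

```python
import numpy as np

def pinv_solver(A, y):
    # Reference least-squares solution via the Moore-Penrose pseudoinverse.
    return np.linalg.pinv(A) @ y

def noise_sweep(solver, A, x_true, sigmas, seed=0):
    # Hypothetical helper: for each noise level sigma, solve with a noisy y
    # and record the Euclidean error of the estimate with respect to x_true.
    rng = np.random.default_rng(seed)
    y_clean = A @ x_true
    errors = []
    for sigma in sigmas:
        y_noisy = y_clean + sigma * rng.standard_normal(y_clean.shape)
        # ravel() so that solvers returning shape (n, 1) are handled as well.
        x_hat = np.asarray(solver(A, y_noisy)).ravel()
        errors.append(np.linalg.norm(x_hat - x_true))
    return np.array(errors)

A = np.array([[10.0, 40.0], [20.0, 0.0], [-30.0, 40.0]])
x_true = np.array([1.0, 1.0])      # A @ x_true = [50, 20, 10]
sigmas = [0.0, 0.1, 1.0, 10.0]     # assumed noise standard deviations

errors = noise_sweep(pinv_solver, A, x_true, sigmas)
for sigma, err in zip(sigmas, errors):
    print(f"sigma={sigma:5.1f}  ||x_hat - x_true|| = {err:.2e}")
```

To evaluate the CVXOPT-based solver instead, `minsq` can be passed in place of `pinv_solver`; both should give essentially the same errors for a given seed.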
