Computational mathematics for learning and data analysis [2019-2020]: project non-ML n°12 implementation
(P) is the linear least squares problem min_w ‖X̂w − y‖, where X̂ is the matrix obtained by augmenting the (tall thin) matrix X from the ML-cup dataset by prof. Micheli with a few functions of your choice of the features of the dataset, and y is one of the corresponding output vectors. For instance, if X contains columns [x1, x2], you may add functions such as log(x1), x1², x1*x2, …
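As an illustration of such an augmentation (the chosen functions and the `augment` helper below are hypothetical, not necessarily the ones used in this project):

```python
import numpy as np

def augment(X):
    """Illustrative feature augmentation: append log|x1|, x1^2 and x1*x2
    to the original columns of X (assumed to have at least two columns)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([X, np.log(np.abs(x1) + 1e-12), x1 ** 2, x1 * x2])
```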
(A1) is an algorithm of the class of Conjugate Gradient methods [references: J. Nocedal, S. Wright, Numerical Optimization].
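One common instance of this class is plain CG applied to the normal equations X̂ᵀX̂w = X̂ᵀy. A minimal sketch is shown below; the function name, tolerance and iteration cap are assumptions, not the code in test_cg.py:

```python
import numpy as np

def cg_normal_equations(X, y, tol=1e-10, max_iter=1000):
    """Plain Conjugate Gradient on the normal equations X^T X w = X^T y."""
    A, b = X.T @ X, X.T @ y
    w = np.zeros(X.shape[1])
    r = b - A @ w            # residual of the normal equations
    p = r.copy()             # first search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)        # exact step length along p
        w += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:        # residual small enough: stop
            break
        p = r + (rs_new / rs_old) * p    # next A-conjugate direction
        rs_old = rs_new
    return w
```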
(A2) is thin QR factorization with Householder reflectors [Trefethen, Bau, Numerical Linear Algebra, Lecture 10], in the variant where one does not form the matrix Q, but stores the Householder vectors u_k and uses them to perform (implicitly) products with Q and Qᵀ.
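A minimal sketch of this variant, assuming NumPy and keeping only the Householder vectors u_k (the helper names are illustrative, not the ones used in this repository):

```python
import numpy as np

def householder_qr(X):
    """Thin QR with Householder reflectors: returns the list of Householder
    vectors u_k (Q is never formed) and the n x n upper-triangular factor R."""
    R = X.astype(float).copy()
    m, n = R.shape
    us = []
    for k in range(n):
        x = R[k:, k]
        u = x.copy()
        u[0] += np.copysign(np.linalg.norm(x), x[0])   # stable sign choice
        norm_u = np.linalg.norm(u)
        if norm_u > 0:
            u /= norm_u
        us.append(u)
        # apply the reflector I - 2 u u^T to the trailing submatrix
        R[k:, k:] -= 2.0 * np.outer(u, u @ R[k:, k:])
    return us, R[:n, :]

def apply_Qt(us, b):
    """Implicit product Q^T b using the stored Householder vectors."""
    c = b.astype(float).copy()
    for k, u in enumerate(us):
        c[k:] -= 2.0 * u * (u @ c[k:])
    return c
```

The least squares solution then comes from back substitution on Rw = (Qᵀy)[:n]; the overall cost is O(mn²) flops, i.e. linear in the largest dimension m.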
No off-the-shelf solvers allowed. In particular, you must implement the thin QR factorization yourself, and the computational cost of your implementation should scale linearly with the largest dimension of the matrix X.
Install the requirements with
pip install -r requirements.txt
To see the results of (A1) applied to (P), execute the Conjugate Gradient method with
python test_cg.py
and for (A2) execute the QR factorization method with
python test_qr.py
To test the computational cost of (A1) and (A2) applied to 3 different matrix shapes:
- m = n
- m > n
- m >> n
use
python times.py
This will overwrite the plots
- square.png
- little_m.png
- big_m.png
in the results folder. The numerical results of times.py will be saved in the txt files {method}_{type of matrix}.txt (for example, cg_little_m.txt contains the Conjugate Gradient results for the matrix with m > n). The format is the following:
- the first line contains the matrix sizes,
- the second line contains the corresponding times,
and all values are saved in scientific notation.
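For instance, assuming the file sits in the current working directory, the two rows can be loaded with NumPy as follows (illustrative snippet, not part of the repository):

```python
import numpy as np

# First line: matrix sizes; second line: corresponding times (scientific notation).
data = np.loadtxt("cg_little_m.txt")
sizes, times = data[0], data[1]
```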