ENH: linalg helper: orthogonal complement of matrix #3039

Open

josef-pkt opened this issue Jun 19, 2016 · 10 comments

josef-pkt (Member) commented Jun 19, 2016

(parking another helper function so it doesn't get lost)

Written to understand the orthogonal complement, which is heavily used in VECM/cointegration.
It should also show up in canonical correlation, reduced-rank regression, and similar methods.
It also looks similar to the reparameterization used to impose constraints (which I wrote without understanding the math).

The general story: econometrics books often refer to things without saying how to compute them.
E.g., for constraints: "w.l.o.g. we only consider this constraint, because the other forms can be obtained by reparameterization/transformation."

Related to this issue: normalizing the cointegrating vectors to [eye, beta_2] requires ordering the equations so that the first variables, whose coefficients are normalized to eye, include elements of all cointegrating vectors. (But authors like Lutkepohl don't say how.)
My guess, not verified, is that in the code below this shows up in the inv of the leading block during the normalization.
I need an example to check this.
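
A minimal sketch of that normalization with made-up numbers; it assumes the leading r x r block of beta is nonsingular:

import numpy as np

# hypothetical cointegrating vectors: k=4 variables, r=2 vectors
beta = np.array([[2., 1], [1, 3], [0.5, -1], [1, 0]])
r = beta.shape[1]
# rescale so that the top r x r block becomes eye(r), i.e. beta -> [eye, beta_2]
beta_norm = beta.dot(np.linalg.inv(beta[:r, :]))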

# -*- coding: utf-8 -*-
"""
Created on Fri Jun 17 13:42:36 2016

Author: Josef Perktold
License: BSD-3


example from http://ltcconline.net/greenl/courses/203/Vectors/orthogonalComplements.htm

no reference for the use of SVD, but it should be "obvious"

"""

import numpy as np
import warnings


def orthogonal_complement(x, normalize=True, threshold=1e-15):
    """Compute the orthogonal complement of a matrix

    This works along axis zero, i.e. it returns the complement of the
    column space; the complement is empty unless the number of rows is
    larger than the column rank.

    TODO possibly: use normalize='top' or 'bottom'

    """
    x = np.asarray(x)
    r, c = x.shape
    if r < c:
        warnings.warn('fewer rows than columns', UserWarning)

    # we assume svd returns singular values in decreasing order,
    # otherwise we would need to sort
    u, s, vt = np.linalg.svd(x)
    rank = (s > threshold).sum()

    # the left singular vectors beyond the rank span the orthogonal complement
    oc = u[:, rank:]

    if normalize:
        k_oc = oc.shape[1]
        # rescale so that the top k_oc x k_oc block becomes the identity
        oc = oc.dot(np.linalg.inv(oc[:k_oc, :]))
    return oc


xxx = np.array([[1, 0, 1, 0, 2], [0, 1, 1, 1, 0], [1, 1, 1, 1, 1]]).T
ocn = np.round(orthogonal_complement(xxx, normalize=True), 2)
print(ocn)
print(orthogonal_complement(xxx, normalize=False))
# the complement is orthogonal to the columns of xxx, so this is ~0
print(ocn.T.dot(xxx))

"""
# some math courses seem to normalize at the bottom

>>> x2 = np.array([[1., 3, 0], [2, 1, 4]])
>>> np.round(orthogonal_complement(x2.T, normalize=True), 2)
array([[ 1.  ],
       [-0.33],
       [-0.42]])
>>> orthogonal_complement(x2.T, normalize=True)
array([[ 1.        ],
       [-0.33333333],
       [-0.41666667]])
>>> oc2 = orthogonal_complement(x2.T, normalize=True)
>>> oc2 / oc2[-1]
array([[-2.4],
       [ 0.8],
       [ 1. ]])
"""
@josef-pkt (Member Author)

Bump: I think something like this would be useful for transforming spline basis functions in #5296.

Suppose we have some spline basis and we want to remove some subspace from it:

  • currently, I have the "centering" transformation, replicating patsy/mgcv, which removes the constant and centers the spline basis so that the mean partial prediction is zero.
  • in a spline basis that starts with a constant and a trend column, we can just drop the first one or two columns.
  • in the general case, we want to take out some subspace, e.g. the constant and a linear function if they are not columns already.

In two steps (sketched below) we could do this with

  • get the orthogonal complement, e.g. as residuals of a multivariate OLS
  • get a full-rank subspace by keeping only the singular/eigen vectors with positive singular values

The orthogonal_complement above sounds like it's just getting the "null" space.
Aside: I don't remember what normalize is supposed to do.
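
A minimal sketch of those two steps, with hypothetical names (x is the spline basis, x0 spans the subspace to remove), using partial_project from statsmodels:

import numpy as np
from statsmodels.stats.multivariate_tools import partial_project

def remove_subspace(x, x0, threshold=1e-12):
    # step 1: orthogonal complement of x0 within x, i.e. residuals of the
    # multivariate OLS projection of x on x0
    resid = partial_project(x, x0).resid
    # step 2: full rank subspace, keep only singular vectors that belong
    # to positive singular values
    u, s, vt = np.linalg.svd(resid, full_matrices=False)
    return u[:, :(s > threshold).sum()]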

We want something similar to CCA that includes the projection on a set of conditioning variables.
We have a cca function and a full CCA class which might handle this case (but I haven't checked the math yet).

#5296 adds a matrix_sqrt helper function that includes handling of singular matrices and null space, but is only for symmetric square matrices like covariance matrices.
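
For reference, a minimal sketch of such a symmetric matrix square root via the eigendecomposition; this is my guess at the idea, not the actual matrix_sqrt code from #5296:

import numpy as np

def matrix_sqrt_sym(a, threshold=1e-15):
    # square root of a symmetric (covariance-like) matrix, dropping
    # eigenvectors in the numerical null space if the matrix is singular
    evals, evecs = np.linalg.eigh(np.asarray(a))
    mask = evals > threshold
    # columns scaled by sqrt(eigenvalue); b.dot(b.T) reproduces a
    return evecs[:, mask] * np.sqrt(evals[mask])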

@josef-pkt (Member Author)

See also #4992 for organizing linalg tools.

josef-pkt (Member Author) commented Nov 2, 2018

Another case; I don't know whether it is handled by the above.
Use case: #5364 variable addition score/LM test.

Reduced rank partial projection: I would like partial projection to return the orthogonal space with the minimal number of columns, i.e. so that the orthogonal array has full rank.

>>> bs.basis_.shape
(203, 20)

>>> from statsmodels.stats.multivariate_tools import partial_project
>>> projection_resid = partial_project(bs.basis_, res_glm.model.exog)
>>> np.linalg.matrix_rank(projection_resid.resid), projection_resid.resid.shape[1]
(18, 20)

The pca function can do this, but we need to know keepdim in advance:

>>> from statsmodels.sandbox.tools.tools_pca import pca
>>> red = pca(projection_resid.resid, keepdim=18)[1]
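
The keepdim could instead be computed from the numerical rank first; a small untested sketch:

>>> keepdim = np.linalg.matrix_rank(projection_resid.resid)
>>> red = pca(projection_resid.resid, keepdim=keepdim)[1]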


JaeDukSeo commented Nov 20, 2018

@josef-pkt great code, love it

@JaeDukSeo

@josef-pkt are you a professor?

@JaeDukSeo

If so, sorry for my language; the code is great.

@josef-pkt (Member Author)

(I edited your comment a bit to keep the language here clean.)
When I was an economist, I made it up to assistant professor.

Thanks, it's good to hear that I'm not the only one who likes this kind of code. The code itself is pretty simple, but it's difficult to get right for someone who is not a linalg expert, which I'm not.

@josef-pkt (Member Author)

The docstring is missing information, e.g. what is normalize doing?
We could add a second argument x0=None for the use case of computing the orthogonal complement of x relative to x0 (sketched below). I'm not sure the terminology is correct, but that's what we need in my comment above for the variable addition hypothesis tests.
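
A hedged sketch of that extension; the lstsq-based projection and the early return are my assumptions, not existing statsmodels code:

import numpy as np

def orthogonal_complement(x, x0=None, normalize=True, threshold=1e-15):
    # hypothetical x0 extension of the helper from the first comment
    x = np.asarray(x)
    if x0 is not None:
        # "relative to x0": residuals of x after projecting out x0,
        # reduced to a full rank orthonormal basis
        x0 = np.asarray(x0)
        resid = x - x0.dot(np.linalg.lstsq(x0, x, rcond=None)[0])
        u, s, vt = np.linalg.svd(resid, full_matrices=False)
        return u[:, :(s > threshold).sum()]
    u, s, vt = np.linalg.svd(x)
    oc = u[:, (s > threshold).sum():]
    if normalize:
        # rescale so that the top square block becomes the identity
        oc = oc.dot(np.linalg.inv(oc[:oc.shape[1], :]))
    return oc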

@JaeDukSeo

Great code! Just came back to see it.

@josef-pkt (Member Author)

Checking my local files:

ex_gam_bs_autos_cleaned.ipynb includes a section on score_test using partial_project and pca:
https://gist.github.com/josef-pkt/453de603b019143e831fbdd4dfb6aa30

Aside: I didn't find a GAM notebook in the doc examples.

@josef-pkt josef-pkt modified the milestones: 0.10, 0.15 Jan 14, 2023