
ENH: add squared norm (quadrance) to numpy.linalg. #18250

Open
wants to merge 2 commits into main
Conversation

DPDmancul

The main aim of this pull request is to add a function sqnorm that computes the squared norm of an array (also called the quadrance in a Euclidean space, or the squared Frobenius norm for matrices).
For notational consistency with numpy.linalg.norm, the function can also return one of eight different matrix norms (possibly squared), or one of an infinite number of vector element power sums (described in the documentation), depending on the value of the ord parameter.

Motivation

When computing errors it is very common to need the squared norm. Currently there are two options: compute the norm and square it (very common, but norm applies an unnecessary square root, so the result can lose precision), or sum the squares of the elements (harder to read). With a squared-norm function the code is more compact, more readable, and less prone to rounding error. Some minimal examples:

>>> x = np.array([[1,2],[3,4]])
>>> y = np.array([[2,2],[4,4]])

>>> # RSS
>>> ((x - y)**2).sum()   # using sum of squares
2
>>> LA.norm(x - y)**2    # using norm 
2.0000000000000004
>>> LA.sqnorm(x - y)     # using sqnorm
2.0

>>> # MSE
>>> ((x - y)**2).mean()          # using mean of squares
0.5
>>> LA.norm(x - y)**2 / x.size   # using norm
0.5000000000000001
>>> LA.sqnorm(x - y) / x.size    # using sqnorm 
0.5
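For readers who want to follow along before the function lands, the default (ord=None) behavior described above can be sketched in plain NumPy. This is a hypothetical illustration, not the PR's actual implementation; the name sqnorm and the signature are taken from the proposal:

```python
import numpy as np

def sqnorm(x, axis=None, keepdims=False):
    # Sketch of the proposed default behavior: sum of squared
    # magnitudes, with no square root applied at the end.
    x = np.asarray(x)
    # (x.conj() * x).real equals abs(x)**2 and is exact for real input
    return (x.conj() * x).real.sum(axis=axis, keepdims=keepdims)

x = np.array([[1, 2], [3, 4]])
y = np.array([[2, 2], [4, 4]])
print(sqnorm(x - y))  # 2, with no rounding from a square root
```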

What it can do

The following can be calculated:

| ord   | for matrices                                   | for vectors      |
|-------|------------------------------------------------|------------------|
| None  | squared Frobenius norm                         | squared 2-norm   |
| 'fro' | squared Frobenius norm                         | --               |
| 'nuc' | nuclear norm                                   | --               |
| inf   | max(sum(abs(x), axis=1))                       | max(abs(x))      |
| -inf  | min(sum(abs(x), axis=1))                       | min(abs(x))      |
| 0     | --                                             | sum(x != 0)      |
| 1     | max(sum(abs(x), axis=0))                       | as below         |
| -1    | min(sum(abs(x), axis=0))                       | as below         |
| 2     | squared 2-norm (largest squared singular value) | as below        |
| -2    | smallest squared singular value                | as below         |
| other | --                                             | sum(abs(x)**ord) |
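The vector column of the table maps onto existing NumPy operations, which may help when reviewing the proposal. A quick check of those correspondences using plain NumPy (not the proposed function):

```python
import numpy as np

v = np.array([3.0, -4.0])

# ord=None / ord=2: squared 2-norm
sq2 = (np.abs(v) ** 2).sum()                   # 25.0
# ord=inf and ord=-inf: max and min absolute value
vmax, vmin = np.abs(v).max(), np.abs(v).min()  # 4.0, 3.0
# ord=0: count of nonzero entries
nnz = np.count_nonzero(v)                      # 2
# any other ord p: sum(abs(x)**p), here p=3
p3 = (np.abs(v) ** 3).sum()                    # 27 + 64 = 91.0
print(sq2, vmax, vmin, nnz, p3)
```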

Nomenclature

I called it sqnorm as a contraction of squared norm, but it could be given a more meaningful name, since the squared norm is its main purpose but not everything it does.

Other changes

Since sqnorm does essentially the same work as norm without extracting the root (except in a few cases), in this pull request most of norm's results are computed from sqnorm's result in order to avoid code duplication.
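The delegation described here can be sketched for the default case: norm is just the square root of the squared norm, so norm can call sqnorm and apply sqrt at the end. A minimal illustration (hypothetical helper name, not the PR's actual code):

```python
import numpy as np

def norm_via_sqnorm(x, axis=None, keepdims=False):
    # Default (Frobenius / 2-) norm expressed as sqrt of the
    # squared norm, mirroring how norm could delegate to sqnorm.
    sq = (np.conj(x) * x).real.sum(axis=axis, keepdims=keepdims)
    return np.sqrt(sq)

print(norm_via_sqnorm(np.array([3.0, 4.0])))  # 5.0
```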

some norm cases were rewritten to reuse sqnorm
@DPDmancul
Author

The Build_Test / debug (pull_request) check failed, but not because of my commits: the container could not download a package:

E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/main/g/glibc/libc6-dbg_2.31-0ubuntu9.1_amd64.deb  404  Not Found [IP: 52.252.75.106 80]

Unfortunately I don't know how to restart the test.

@charris
Member

charris commented Jan 29, 2021

Unfortunately I don't know how to restart the test.

Don't worry about it. Because this adds a new function it needs to be run by the mailing list first. My own preference would be to start with an absolute_square ufunc, which ISTR has been discussed before.
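For context, an absolute_square ufunc as suggested would compute |x|**2 directly. A plain-NumPy stand-in (the name absolute_square is the suggestion's, but this implementation is only an illustrative sketch): for complex z this is z.real**2 + z.imag**2, with no square root anywhere, which is the same precision win the PR is after.

```python
import numpy as np

def absolute_square(z):
    # Hypothetical stand-in for the suggested absolute_square ufunc:
    # |z|**2 for real or complex input, computed as conj(z) * z,
    # i.e. z.real**2 + z.imag**2, avoiding the sqrt inside abs.
    z = np.asarray(z)
    return (np.conj(z) * z).real

print(absolute_square(np.array([3 + 4j, 1 - 2j])))  # [25.  5.]
```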

    ret = add.reduce(absx, axis=axis, keepdims=keepdims)
    ret **= (1 / ord)
    return ret
    return ret ** (1 / ord)
Member

norm_sq feels like a weird name when ord != 2 for this case, although I also can't really think of a better name.

@rgommers
Member

I'm not sure that I find the motivation for adding a separate function for this convincing. The accuracy issue in the example given is 2 * eps, which is negligible:

>>> x = np.array([[1, 2], [3, 4]])
>>> y = np.array([[2, 2], [4, 4]])
>>> (np.linalg.norm(x - y)**2 - 2)/ np.finfo(np.float64).eps
2.0

The performance improvement amounts to avoiding one square root of a scalar - on the order of 1 µs.

Neither seems worth going through a lot of trouble for.

Base automatically changed from master to main March 4, 2021 02:05
@mattip
Member

mattip commented Mar 10, 2022

@DPDmancul did this ever hit the mailing list? Also, please respond to this comment:

My own preference would be to start with an absolute_square ufunc, which ISTR has been discussed before.

Projects
Status: Pending authors' response