
DOC: Change vdot docs to mention computing the standard inner product on a space of matrices #26342

Open
rileyjmurray opened this issue Apr 24, 2024 · 1 comment

Comments


rileyjmurray commented Apr 24, 2024

Issue with current documentation:

The docs for vdot in NumPy 1.26 (link here) contain the following verbiage. I've emphasized some problematic text in bold.

> Note that vdot handles multidimensional arrays differently than dot: it does not perform a matrix product, but flattens input arguments to 1-D vectors first. **Consequently, it should only be used for vectors.**

The bolded text is misleading because it doesn't account for the value of this function in computing the standard inner product on vector spaces of matrices. See below for details on this value in the cases of real and complex data.

Real matrices

If $\mathcal{H}$ is the vector space of real $m \times n$ matrices, and $A,B$ are matrices in $\mathcal{H}$, then their (standard) inner product is $\langle A, B \rangle_{\mathcal{H}} = \text{trace}(A^T B)$. The naive way to implement this is to write np.trace(A.T @ B). However, you can get the same result much more efficiently with np.vdot(A, B).
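A quick sketch of the equivalence (array shapes here are arbitrary, chosen just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((300, 200))
B = rng.standard_normal((300, 200))

# Naive: materializes the full 200 x 200 product A.T @ B,
# then reads only its diagonal.
naive = np.trace(A.T @ B)

# Efficient: a single pass over the flattened arrays,
# with no intermediate matrix.
fast = np.vdot(A, B)

assert np.isclose(naive, fast)
```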

I see people use the naive method in their code all the time. That's bad, because the complexity difference of the two methods is stark. Here's a table comparing them. (Small disclaimer: I don't know how numpy implements vdot in C, but the following figures are easily achievable.)

|       | trace(A.T @ B) | vdot(A, B) |
|-------|----------------|------------|
| time  | $O(n m^2)$     | $O(m n)$   |
| space | $O(n^2)$       | $O(1)$     |

Complex matrices

Now suppose $\mathcal{H}$ is the vector space of complex $m \times n$ matrices. The standard inner product of matrices $A,B \in \mathcal{H}$ is $\langle A, B \rangle_{\mathcal{H}} = \text{trace}(A^\dagger B)$, where $A^{\dagger}$ is the conjugate transpose of $A$.

Here's a table comparing three ways one might compute this inner product. As before, I'm assuming that vdot's C implementation doesn't have any obvious inefficiencies. I'm also assuming that calling .conj() returns a new (conjugated) array, rather than a view of an existing array.

|                  | trace(A.T.conj() @ B) | dot(A.conj().ravel(), B.ravel()) | vdot(A, B) |
|------------------|-----------------------|----------------------------------|------------|
| time complexity  | $O(n m^2)$            | $O(mn)$                          | $O(mn)$    |
| space complexity | $O(mn + n^2)$         | $O(m n)$                         | $O(1)$     |
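All three approaches agree numerically; only their resource usage differs. A small check (shapes are arbitrary, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
B = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

ip_trace = np.trace(A.conj().T @ B)             # forms an n x n intermediate matrix
ip_dot = np.dot(A.conj().ravel(), B.ravel())    # allocates a conjugated copy of A
ip_vdot = np.vdot(A, B)                         # conjugates the first argument on the fly

assert np.isclose(ip_trace, ip_vdot)
assert np.isclose(ip_dot, ip_vdot)
```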

Besides the complexity considerations, vdot has a secondary advantage: it accounts for the fact that the standard inner product is conjugate-linear in the first argument and linear in the second argument. If someone isn't mindful of where they put the conjugate, then it's easy to use the return value from something like `dot(A.conj().ravel(), B.ravel())` in a way that's incorrect in the context of a larger project.
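To make the conjugate-linearity concrete, here are the two identities vdot enforces automatically (example values are arbitrary):

```python
import numpy as np

A = np.array([[1 + 2j, 0], [0, 1 - 1j]])
B = np.array([[3j, 1], [2, 1 + 1j]])

# vdot conjugates its FIRST argument, so <A, B> = conj(<B, A>).
assert np.isclose(np.vdot(A, B), np.conj(np.vdot(B, A)))

# Scaling the first argument by c scales the result by conj(c)
# (conjugate-linearity in the first slot).
c = 2 - 3j
assert np.isclose(np.vdot(c * A, B), np.conj(c) * np.vdot(A, B))

# Linearity in the second slot: scaling the second argument by c
# scales the result by c.
assert np.isclose(np.vdot(A, c * B), c * np.vdot(A, B))
```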

Idea or request for content:

The documentation could be updated to something like the following.

> Note that vdot handles multidimensional arrays differently than dot: it does not perform a matrix product, but flattens input arguments to 1-D vectors first. The runtime of this function is linear in a.size and b.size. When (a, b) are 2-D arrays of the same shape, this function returns their Frobenius inner product (also known as the trace inner product, or the standard inner product on a vector space of matrices).

@ngoldbaum
Member

I don’t see a problem with changing the text to note it can be used on stacks of matrices, please send in a PR.
