Description
Is your feature request related to a problem?
tl;dr: `PCA().run()` could be much faster if we computed the covariance matrix in one shot rather than building it incrementally per frame, and it would also benefit a lot from switching from `np.linalg.eig` to `np.linalg.eigh`.
Taking the ADK equilibrium trajectory data as an example workflow, we might do something like:

```python
import MDAnalysis as mda
from MDAnalysisData import datasets
from MDAnalysis.analysis import pca

adk = datasets.fetch_adk_equilibrium()
universe = mda.Universe(adk.topology, adk.trajectory)
pc = pca.PCA(universe, align=True)
pc.run()
```

This takes 881 seconds (~15 min) on a GCloud c2d-standard-4 (only 5.2 s, though, if you restrict the analysis to the Cα atoms or a similarly small selection).
Among the largest contributors to the runtime is the incremental covariance matrix computation (one `np.dot` call per frame):

`mdanalysis/package/MDAnalysis/analysis/pca.py`, line 332 in 35e4334:

```python
self.cov += np.dot(x[:, np.newaxis], x[:, np.newaxis].T)
```
Another is the use of `eig` to calculate the eigenvalues and eigenvectors of that covariance matrix:

`mdanalysis/package/MDAnalysis/analysis/pca.py`, line 336 in 35e4334:

```python
e_vals, e_vects = np.linalg.eig(self.cov)
```
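To illustrate why `eigh` is the right tool here, a minimal sketch with synthetic data (not the MDAnalysis internals): `eigh` exploits symmetry and returns real eigenvalues in ascending order, so a descending sort is needed to match PCA's variance-ranked components.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((50, 5))
cov = np.cov(a, rowvar=False)  # symmetric and real by construction

# eigh: guaranteed-real outputs, eigenvalues in ascending order
e_vals, e_vecs = np.linalg.eigh(cov)
idx = np.argsort(e_vals)[::-1]          # PCA wants descending variance
e_vals, e_vecs = e_vals[idx], e_vecs[:, idx]

# cross-check against the general solver (same spectrum, up to ordering)
g_vals, _ = np.linalg.eig(cov)
assert np.allclose(np.sort(e_vals), np.sort(g_vals.real))
```

Beyond the speed difference, `eigh` avoids the complex-dtype ambiguity of `eig` and removes the need to worry about spurious imaginary parts from floating-point asymmetry.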
Describe the solution you'd like
Two changes here would make a big difference:
- Computing the covariance matrix in one shot during `_conclude()`, which can use the highly optimized BLAS GEMM kernel.
- Using `eigh` instead of `eig`, since the covariance matrix is symmetric and real-valued by construction.
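A sketch of the first change, using synthetic coordinates rather than a real trajectory; the loop below mimics the current per-frame rank-1 update, and the one-shot version replaces it with a single GEMM over the centered coordinate matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_dims = 100, 30
coords = rng.standard_normal((n_frames, n_dims))
mean = coords.mean(axis=0)

# Incremental: one rank-1 outer-product update per frame (current approach)
cov_incr = np.zeros((n_dims, n_dims))
for frame in coords:
    x = frame - mean
    cov_incr += np.dot(x[:, np.newaxis], x[:, np.newaxis].T)
cov_incr /= n_frames - 1

# One shot: a single matrix product over all centered frames (proposed)
centered = coords - mean
cov_once = centered.T @ centered / (n_frames - 1)

assert np.allclose(cov_incr, cov_once)
```

The two are mathematically identical; the one-shot version just lets BLAS do the whole accumulation in one call instead of `n_frames` tiny ones.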
Describe alternatives you've considered
The alternative is to make only one of these changes, or neither, but I think both are fairly straightforward. There will be a modest change in peak memory usage: building the covariance matrix in one shot requires holding more data than the incremental approach, but `eigh` allocates a smaller workspace than `eig`, so it roughly evens out. I'll include memory profiling in my PR.
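For reference, a quick microbenchmark sketch (timings are machine- and BLAS-dependent; `eigh` dispatches to LAPACK's symmetric eigensolver, which is typically several times faster than the general solver behind `eig` on matrices of this size):

```python
import time

import numpy as np

n = 500
rng = np.random.default_rng(2)
m = rng.standard_normal((n, n))
cov = m @ m.T  # symmetric positive semi-definite test matrix

t0 = time.perf_counter()
np.linalg.eig(cov)
t_eig = time.perf_counter() - t0

t0 = time.perf_counter()
np.linalg.eigh(cov)
t_eigh = time.perf_counter() - t0

print(f"eig: {t_eig:.3f}s  eigh: {t_eigh:.3f}s")
```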