Cholesky-based precision calculation #3067

Open
@david-cortes-intel

Note: this is a transcription from the docs section on ideas for contributors.

In line with scikit-learn's EmpiricalCovariance estimator, the Covariance algorithm from scikit-learn-intelex by default also calculates and stores the precision, i.e. the inverse of the covariance. This inverse is obtained by eigendecomposition, and it may be used within the scikit-learn-intelex interface to calculate Mahalanobis distances.

However, for full-rank matrices, it's likely faster to obtain the precision matrix from the covariance by a Cholesky-based inversion, at the expense of slightly reduced numerical accuracy. This could be implemented on the oneDAL side by handling the option to calculate the precision in the C++ interface, storing the precision in the C++ object, and calculating it with Cholesky when possible, falling back to eigendecomposition if Cholesky fails or the result is too inaccurate. Note that implementing the idea about partial eigendecompositions would also be of use here: Cholesky-based inversion is not applicable to rank-deficient matrices, so in that case the calculation should go directly to eigendecomposition.
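The fallback logic described above can be sketched in Python with NumPy. This is only an illustration of the idea, not the oneDAL implementation: the function name `precision_from_covariance`, the acceptance threshold `rtol`, and the residual check used to decide whether the Cholesky result is "too inaccurate" are all assumptions made for this example.

```python
import numpy as np


def precision_from_covariance(cov, rtol=1e-8):
    """Hypothetical helper: invert a covariance matrix, preferring Cholesky.

    Returns (precision, method), where method is "cholesky" or "eigh".
    The tolerance `rtol` is an assumed value, not taken from oneDAL.
    """
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    eye = np.eye(n)
    try:
        # cov = L @ L.T with L lower triangular; raises LinAlgError if cov
        # is not numerically positive definite (e.g. rank-deficient).
        L = np.linalg.cholesky(cov)
        # A real implementation would use a triangular routine such as
        # LAPACK potri; a general solve keeps this sketch NumPy-only.
        L_inv = np.linalg.solve(L, eye)
        precision = L_inv.T @ L_inv
        # Accept the result only if cov @ precision is close to identity.
        residual = np.linalg.norm(cov @ precision - eye) / np.sqrt(n)
        if np.isfinite(residual) and residual <= rtol:
            return precision, "cholesky"
    except np.linalg.LinAlgError:
        pass
    # Fallback: eigendecomposition-based pseudo-inverse, discarding
    # eigenvalues that are negligible relative to the largest one.
    w, V = np.linalg.eigh(cov)
    keep = w > w.max() * n * np.finfo(float).eps
    precision = (V[:, keep] / w[keep]) @ V[:, keep].T
    return precision, "eigh"
```

For a well-conditioned covariance this takes the Cholesky path, while a singular input such as `[[1, 1], [1, 1]]` makes the factorization fail and is handled by the eigendecomposition branch.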

Having a triangular factorization of the precision would also open the possibility of speeding up Mahalanobis distance calculations, which would be faster with triangular matrices than with the dense square-root matrices produced by eigendecomposition. While Mahalanobis distance is typically calculated with the Cholesky of the precision, a different Cholesky-like factorization would also suffice. For example, it would be faster to obtain a factorization of the precision from the Cholesky of the covariance, as suggested in this StackExchange answer; that factorization could then be stored on the C++ object and used for Mahalanobis distance calculations by adding a new method.
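As a rough illustration of that last point: if the covariance factors as L Lᵀ with L lower triangular, then the precision equals inv(L)ᵀ inv(L), so the squared Mahalanobis distance of x is the squared norm of the solution z of L z = x - mean. The factor inv(L)ᵀ is upper triangular, so this is not the standard Cholesky of the precision, but it serves the same purpose. The sketch below assumes NumPy, and the function name is hypothetical; it is not an existing oneDAL or scikit-learn-intelex method.

```python
import numpy as np


def mahalanobis_sq_from_cov_cholesky(X, mean, cov):
    """Hypothetical helper: squared Mahalanobis distances of the rows of X."""
    L = np.linalg.cholesky(cov)  # cov = L @ L.T, L lower triangular
    # precision = inv(L).T @ inv(L), so d^2 = ||inv(L) @ (x - mean)||^2.
    # Solving L @ z = (x - mean) would be a forward substitution in a real
    # implementation; np.linalg.solve is used here to stay NumPy-only.
    Z = np.linalg.solve(L, (np.asarray(X, dtype=float) - mean).T)
    return np.sum(Z * Z, axis=0)
```

The precision matrix itself is never formed here; only the triangular factor of the covariance is needed, which is the quantity that would be stored on the C++ object under this proposal.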
