Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for linear algebra #749

Open
perazz opened this issue Dec 4, 2023 · 6 comments
Open

Support for linear algebra #749

perazz opened this issue Dec 4, 2023 · 6 comments
Labels
idea Proposition of an idea and opening an issue to discuss it

Comments

@perazz
Copy link
Contributor

perazz commented Dec 4, 2023

Hello stdlib developers,
I'm opening this issue to summarize and coalesce upcoming efforts to integrate linear algebra operations in stdlib, in particular:

  1. Accessible interfaces for common linear algebra operations

We want stdlib to be able to solve linear algebra tasks by wrapping against libraries like BLAS, LAPACK, SCALAPACK, with a user-friendly interface. I think this means:

  • To get the best possible user-level API (almost no inputs at all), maintaining familiarity the syntax that other libraries also provide (scipy, etc.) but have an expert interface with all control knobs to tune the algorithms (e.g. via settings stored in derived types). Thanks to Fortran arrays, If this is done well, I believe stdlib could have the best linear algebra API out there;
  • Wrapping against BLAS and LAPACK means that the low-level API should probably not be changed (so that down the road, stdlib could just link against external BLAS libraries or platform-dependent frameworks), but a reference implementation must be provided anyways. Should we aim at maintaining our own Modernized BLAS/LAPACK at fortran-lang, or just automate the download process from netlib? I would personally like the idea to develop a Modernized version once and forall (those reference implementations will almost never change anymore), although those are huge repos that would require significant work.
  1. Support for common IO formats for matrices and tensors
  • To allow efficient working with multidimensional arrays, we should support easy serialization/deserialization and conversion among formats. This could include formats like NPZ, matrixmarket, etc. An easily extendible API should be provided to have plugins for other formats which might or might not fit in the scope of stdlib.

  • I believe an equally important task is to define derived types that are capable of storing temporary information which is not strictly matrix data (e.g. matrix factorization, or working arrays for the matrix solvers), to avoid unnecessary overhead in case of repeated algebra operations. This means that "simple" array storage may need to be replaced with matrix derived types in those cases.

Because there are plenty of options in defining these APIs, It is crucial to the success of this task that as much feedback as possible is given, so I would like to encourage all ideas - and criticisms - to be discussed on this issue, so that we can come up with the best possible version. I am also opening a discussion page on the Fortran-lang Discourse that we can use for more verbose discussions.

Thank you,
Federico

cc @certik @awvwgk @fortran-lang/stdlib @fortran-lang/admins

Linked issues

Linear algebra (BLAS/LAPACK)
#1
#10
#67
#450
#476

-> Regarding dense algebra, I've started a discussion at #450

Sparse algebra
#38

@perazz perazz added the idea Proposition of an idea and opening an issue to discuss it label Dec 4, 2023
@jalvesz
Copy link
Contributor

jalvesz commented Jan 6, 2024

Hi,

In order to help this idea move forward regarding its extension to sparse algebra I thought about moving parts of FSPARSE here.

I thought about doing a first PR to define the sparse types with something like:

Click to open (stdlib_sparse_kinds.fypp)
#:include "common.fypp"
!> The `stdlib_sparse_kinds` module provides derived type definitions for different sparse matrices
!>
!> This code was modified from https://github.com/jalvesz/FSPARSE by its author: Alves Jose
module stdlib_sparse_kinds
    use stdlib_kinds, only: int8, int16, int32, int64, sp, dp, xdp, qp
    implicit none
    
    private
  
    ! -- Global parameters
    enum, bind(C)
        enumerator :: k_NOSYMMETRY !! Full Sparse matrix (no symmetry considerations)
        enumerator :: k_SYMTRIINF  !! Symmetric Sparse matrix with triangular inferior storage
        enumerator :: k_SYMTRISUP  !! Symmetric Sparse matrix with triangular supperior storage
    end enum
    ! -- Classes
    type, abstract :: sparse_t
      integer :: nrows = 0 !! number of rows
      integer :: ncols = 0 !! number of columns
      integer :: nnz   = 0 !! number of non-zero values
      integer :: sym   = k_NOSYMMETRY !! assumed storage symmetry
      integer :: base  = 1 !! index base = 0 for (C) or 1 (Fortran)
    end type
  
    !! COO: COOrdinates compresed format
    type, public, extends(sparse_t) :: COO_t
      logical               :: isOrdered = .false. !! wether the matrix is ordered or not
      integer, allocatable  :: index(:,:) !! Matrix coordinates index(2,nnz)
    contains
      procedure :: malloc => malloc_coo
    end type

    #:for k1, t1 in (REAL_KINDS_TYPES)
    type, public, extends(COO_t) :: COO_${k1}$
        ${t1}$, allocatable :: data(:) !! single precision values
    end type
    #:endfor

    !! CSR: Compressed sparse row or Yale format
    type, extends(sparse_t) :: CSR_t  
      integer, allocatable  :: col(:)    !! matrix column pointer
      integer, allocatable  :: rowptr(:) !! matrix row pointer
    contains
      procedure :: malloc => malloc_csr
    end type
  
    #:for k1, t1 in (REAL_KINDS_TYPES)
    type, public, extends(CSR_t) :: CSR_${k1}$
        ${t1}$, allocatable :: data(:) !! single precision values
    end type
    #:endfor

    !! CSC: Compressed sparse column
    type, extends(sparse_t) :: CSC_t  
      integer, allocatable  :: colptr(:) !! matrix column pointer
      integer, allocatable  :: row(:)    !! matrix row pointer
    contains
      procedure :: malloc => malloc_csc
    end type
  
    #:for k1, t1 in (REAL_KINDS_TYPES)
    type, public, extends(CSC_t) :: CSC_${k1}$
        ${t1}$, allocatable :: data(:) !! single precision values
    end type
    #:endfor
  
    !! Compressed ELLPACK
    type, extends(sparse_t) :: ELL_t 
      integer               :: K = 0 !! maximum number of nonzeros per row
      integer, allocatable  :: index(:,:) !! column indices
    contains
      procedure :: malloc => malloc_ell
    end type
  
    #:for k1, t1 in (REAL_KINDS_TYPES)
    type, public, extends(ELL_t) :: ELL_${k1}$
        ${t1}$, allocatable :: data(:,:) !! single precision values
    end type
    #:endfor

contains

    subroutine malloc_coo(self,num_rows,num_cols,nnz)
        class(COO_t) :: self
        integer, intent(in) :: num_rows
        integer, intent(in) :: num_cols
        integer, intent(in) :: nnz

        integer,  allocatable :: temp_idx(:,:)
        !-----------------------------------------------------

        self%nrows = num_rows
        self%ncols = num_cols
        self%nnz = nnz

        if(.not.allocated(self%index)) then
            allocate(temp_idx(2,nnz) , source = 0 )
        else
            allocate(temp_idx(2,nnz) , source = self%index )
        end if
        call move_alloc(from=temp_idx,to=self%index)

        select type(self)
            #:for k1, t1 in (REAL_KINDS_TYPES)
            type is(COO_${k1}$)
                block
                ${t1}$, allocatable :: temp_data_${k1}$(:)
                if(.not.allocated(self%data)) then
                    allocate(temp_data_${k1}$(nnz) , source = 0._${k1}$ )
                else
                    allocate(temp_data_${k1}$(nnz) , source = self%data )
                end if
                call move_alloc(from=temp_data_${k1}$,to=self%data)
                end block
            #:endfor
        end select
    end subroutine

    subroutine malloc_csr(self,num_rows,num_cols,nnz)
        class(CSR_t) :: self
        integer, intent(in) :: num_rows
        integer, intent(in) :: num_cols
        integer, intent(in) :: nnz

        integer,  allocatable :: temp_idx(:)
        !-----------------------------------------------------

        self%nrows = num_rows
        self%ncols = num_cols
        self%nnz   = nnz

        if(.not.allocated(self%col)) then
            allocate(temp_idx(nnz) , source = 0 )
        else
            allocate(temp_idx(nnz) , source = self%col )
        end if
        call move_alloc(from=temp_idx,to=self%col)

        if(.not.allocated(self%rowptr)) then
            allocate(temp_idx(num_rows+1) , source = 0 )
        else
            allocate(temp_idx(num_rows+1) , source = self%rowptr )
        end if
        call move_alloc(from=temp_idx,to=self%rowptr)

        select type(self)
            #:for k1, t1 in (REAL_KINDS_TYPES)
            type is(CSR_${k1}$)
                block
                ${t1}$, allocatable :: temp_data_${k1}$(:)
                if(.not.allocated(self%data)) then
                    allocate(temp_data_${k1}$(nnz) , source = 0._${k1}$ )
                else
                    allocate(temp_data_${k1}$(nnz) , source = self%data )
                end if
                call move_alloc(from=temp_data_${k1}$,to=self%data)
                end block
            #:endfor
        end select
    end subroutine

    subroutine malloc_csc(self,num_rows,num_cols,nnz)
        class(CSC_t) :: self
        integer, intent(in) :: num_rows
        integer, intent(in) :: num_cols
        integer, intent(in) :: nnz

        integer,  allocatable :: temp_idx(:)
        !-----------------------------------------------------

        self%nrows = num_rows
        self%ncols = num_cols
        self%nnz   = nnz

        if(.not.allocated(self%row)) then
            allocate(temp_idx(nnz) , source = 0 )
        else
            allocate(temp_idx(nnz) , source = self%row )
        end if
        call move_alloc(from=temp_idx,to=self%row)

        if(.not.allocated(self%colptr)) then
            allocate(temp_idx(num_cols+1) , source = 0 )
        else
            allocate(temp_idx(num_cols+1) , source = self%colptr )
        end if
        call move_alloc(from=temp_idx,to=self%colptr)

        select type(self)
            #:for k1, t1 in (REAL_KINDS_TYPES)
            type is(CSC_${k1}$)
                block
                ${t1}$, allocatable :: temp_data_${k1}$(:)
                if(.not.allocated(self%data)) then
                    allocate(temp_data_${k1}$(nnz) , source = 0._${k1}$ )
                else
                    allocate(temp_data_${k1}$(nnz) , source = self%data )
                end if
                call move_alloc(from=temp_data_${k1}$,to=self%data)
                end block
            #:endfor
        end select
    end subroutine

    subroutine malloc_ell(self,num_rows,num_cols,num_nz_rows)
        class(ELL_t) :: self
        integer, intent(in) :: num_rows    !! number of rows
        integer, intent(in) :: num_cols    !! number of columns
        integer, intent(in) :: num_nz_rows !! number of non zeros per row

        integer,  allocatable :: temp_idx(:,:)
        !-----------------------------------------------------

        self%nrows = num_rows
        self%ncols = num_cols
        self%K     = num_nz_rows

        if(.not.allocated(self%index)) then
            allocate(temp_idx(num_rows,num_nz_rows) , source = 0 )
        else
            allocate(temp_idx(num_rows,num_nz_rows) , source = self%index )
        end if
        call move_alloc(from=temp_idx,to=self%index)

        select type(self)
            #:for k1, t1 in (REAL_KINDS_TYPES)
            type is(ELL_${k1}$)
                block
                ${t1}$, allocatable :: temp_data_${k1}$(:,:)
                if(.not.allocated(self%data)) then
                    allocate(temp_data_${k1}$(num_rows,num_nz_rows) , source = 0._${k1}$ )
                else
                    allocate(temp_data_${k1}$(num_rows,num_nz_rows) , source = self%data )
                end if
                call move_alloc(from=temp_data_${k1}$,to=self%data)
                end block
            #:endfor
        end select
    end subroutine
    
end module stdlib_sparse_kinds

Other formats can be added following a similar pattern. Before going any further with the methods, I though that having some comments about the derived types for the data containment is important.

For instance:

  • Should dense matrices be included as well as one of the derived types? or plain allocatables should do?

@jvdp1
Copy link
Member

jvdp1 commented Jan 7, 2024

@jalvesz it seems a good start to me. I developed a library with similar interfaces. There have been already a lot on discussion about sparse matrices, and it is always difficult to find a clear clonclusion. So, I suggest to keep the first draft a simple as possible. Other formats (and dense matrices) could be added later.

@perazz
Copy link
Contributor Author

perazz commented Jan 8, 2024

Would it make sense to open a new branch on the stdlib repo to coalesce linear algebra progress around?

@jvdp1
Copy link
Member

jvdp1 commented Jan 14, 2024

Would it make sense to open a new branch on the stdlib repo to coalesce linear algebra progress around?

Yes, it is probably a good idea. There is also #189 related to this topic that is open since a while (cc @jalvesz )

There is also this page related to sparse matrices

@jalvesz
Copy link
Contributor

jalvesz commented Jan 14, 2024

Let me know when you open the branch and I'll move the PR under that one, indeed it would be easier to coalesce all linear algebra topics under one branch that could move a bit faster and serve as playground

@gnikit
Copy link
Member

gnikit commented Jan 15, 2024

Hey @perazz is this related to the grant money from STF and the work done over at https://github.com/perazz/fortran-lapack or would this be an independent piece of work (to be performed by the community)? Regardless, it would make sense to have an stdlib branch containing the relevant progress.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
idea Proposition of an idea and opening an issue to discuss it
Projects
None yet
Development

No branches or pull requests

4 participants