Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

linalg: least squares #801

Merged
merged 27 commits into from
May 8, 2024
Merged

linalg: least squares #801

merged 27 commits into from
May 8, 2024

Conversation

perazz
Copy link
Contributor

@perazz perazz commented Apr 21, 2024

Least squares solution to to system $Ax=b$, i.e. such that the 2-norm $||b-Ax||$ is minimized., based on LAPACK *GELSD functions.

  • base implementation
  • tests
  • exclude unsupported xdp
  • documentation
  • submodule
  • examples
  • subroutine interface

Prior art:

  • NumPy: lstsq(a, b, rcond='warn')
  • Scipy: lstsq(a, b, cond=None, overwrite_a=False, overwrite_b=False, check_finite=True, lapack_driver=None)
  • IMSL: Result = IMSL_QRSOL(B, [A] [, AUXQR] [, BASIS] [, /DOUBLE] [, QR] [, PIVOT] [, RESIDUAL] [, TOLERANCE])

Proposed implementation: generic interface for either 1 RHS or many RHS's

  • x = lstsq(a,b) -> simplest call
  • x = lstsq(a,b,cond,overwrite_a,rank,err) -> expert call:
    • real(kind(a)), intent(in), optional :: cond = optional cutoff for rank evaluation: singular values s(i)<=cond*maxval(s) are considered 0. Default: 0.0
    • logical(lk), optional, intent(in) :: overwrite_a = option to avoid internal allocation, but destroy input matrix a. Default: .false.
    • integer(ilp), optional, intent(out) :: rank = return rank of A, if requested
    • type(linalg_state_type), optional, intent(out) :: err = return state variable

cc: @jvdp1 @jalvesz @fortran-lang/stdlib I believe this is ready for consideration.

@perazz perazz marked this pull request as ready for review April 22, 2024 15:15
Copy link
Member

@gnikit gnikit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Maybe some comments about the chosen workspace sizes in ...gesv_space... subroutine would help with future maintenance or performance issues.

Is there anything else that is missing, or are we happy to merge?

I would really like to follow the same pattern of issuing a Release after the merge, is there a reason why we should not do that?

src/stdlib_linalg_least_squares.fypp Outdated Show resolved Hide resolved
src/stdlib_linalg_least_squares.fypp Outdated Show resolved Hide resolved
#:if rk!="xdp"

! Compute the least-squares solution to a real system of linear equations Ax = b
module function stdlib_linalg_${ri}$_lstsq_${ndsuf}$(a,b,cond,overwrite_a,rank,err) result(x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a general design question: wouldn't it be better to have the base API be a subroutine like:

stdlib_linalg_${ri}$_lstsq_${ndsuf}$(a,b,x,cond,overwrite_a,rank,err)
...
${rt}$, intent(inout), target :: x${nd}$

And build the function interface out of this? It would enable users to recycle even the memory allocated for the solution vector if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default functional style here is chosen as it is in line with the approach taken by the most used libraries. I definitely support having one more interface using a subroutine. In this case, what should be called like? Do you think it could be a good idea to open a separate PR with related discussion on it?

Copy link
Contributor

@jalvesz jalvesz Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, what should be called like?

I'm not sure if there is already a convention within stdlib regarding an implementation having both a function and a subroutine interfaces simultaneously ... could be something like slstsq ? too many letters I feel... when working on to_num I used *_base where base were all subroutines, which are called by the function interface, so another idea: lstsq_base ?

Do you think it could be a good idea to open a separate PR with related discussion on it?

That can be a good solution to avoid holding this PR ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to see if I could cook a quick subroutine interface. However, we also need to decide what to do with internal allocation (can be significant):

! Allocate working space
call ${ri}$gesv_space(m,n,nrhs,lrwork,liwork,lcwork)
#:if rt.startswith('complex')
allocate(rwork(lrwork),cwork(lcwork),iwork(liwork))
#:else
allocate(rwork(lrwork),iwork(liwork))
#:endif

As long as this is always allocated in place, re-using x will not lead to significant performance benefits. So yes, I would defer this to a separate PR!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if there is already a convention within stdlib regarding an implementation having both a function and a subroutine interfaces simultaneously

I am not aware of such conventions. However, I am in favour of providing both functions and subroutines when it is possible, and especially when large arrays might be involved..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the implementation of subroutine stdlib_linalg_${ri}$_lstsq_space_${ndsuf}$ I'm wondering if it isn't better to just interface directly call ${ri}$gelsd_space(m,n,nrhs,lrwork,liwork,lcwork) as the main purpose of the former seems to be to extract the sizes out of the matrix and the rhs, but given that this interface is meant for an advanced use with more control over data allocation I would say that the latter is the most ideal functionality to be interfaced.

Pushing this further, maybe this work space would be worth encapsulating in a DT, allocation of the working arrays could be interfaced in a function that returns the DT with the arrays properly allocated.

type, public :: linalg_lstsq_work_space
   ...
end type

This DT could be created by encapsulating the operations currently done between lines 262 and 289, which maybe could help manipulating and recycling these data buffers.

Copy link
Contributor Author

@perazz perazz May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your idea @jalvesz. To save us from the hassle of doing this for every single function, should we envision a storage space DT meant to work with all linear algebra functions? Something like:

type :: linalg_storage
   integer(ilp), allocatable :: iwork(:)
   integer(ilp), allocatable :: pivot(:)
   real(sp), allocatable :: diagonals_sp(:) 
   real(dp), allocatable :: diagonals_dp(:)
   real(qp), allocatable :: diagonals_qp(:)
   real(sp), allocatable :: rwork_sp(:) 
   real(dp), allocatable :: rwork_dp(:)
   real(qp), allocatable :: rwork_qp(:)
   complex(sp), allocatable :: cwork_sp(:) 
   complex(dp), allocatable :: cwork_dp(:)
   complex(qp), allocatable :: cwork_qp(:)
   ...
end type

There is no way to avoid embedding variables for all possible datatypes and solvers, but we could at least simplify the initialization:

type(linalg_storage) :: work
work = linalg_storage('solve',A,b)
work = linalg_storage('lstsq',A,b)
...

Copy link
Contributor

@jalvesz jalvesz May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could split it by kind:

type, abstract :: linalg_storage
   integer(ilp), allocatable :: iwork(:)
   integer(ilp), allocatable :: pivot(:)
end type

type, extends(linalg_storage) :: linalg_storage_sp
   real(sp), allocatable :: diagonals(:) 
   real(sp), allocatable :: work(:) 
end type

type, extends(linalg_storage) :: linalg_storage_dp
   real(dp), allocatable :: diagonals(:) 
   real(dp), allocatable :: work(:) 
end type

type, extends(linalg_storage) :: linalg_storage_csp
   complex(sp), allocatable :: diagonals(:) 
   complex(sp), allocatable :: work(:) 
end type

type, extends(linalg_storage) :: linalg_storage_cdp
   complex(sp), allocatable :: diagonals(:) 
   complex(sp), allocatable :: work(:) 
end type

And for the user he would need to select according to the working precession:

type(linalg_storage_sp) :: work
work = linalg_storage('solve',A,b) !> or  linalg_storage('solve',num_rows,num_cols,num_rhs) ?
work = linalg_storage('lstsq',A,b) !> or  linalg_storage('lstsq',num_rows,num_cols,num_rhs) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Respectfully, I believe this looks a bit overengineered. I'm not sure if the user would be happy to check what version of the derived type they're using every time. Keep in mind that internally, the sizes of all necessary arrays will always be checked. So if the user tries to submit an invalid storage variable to a problem, an error will be returned.

However, I also agree with you that having so many allocatable arrays inside it is also pretty dumb/non-optimal (PS: complex cases need 3: real, integer, complex storage). The best way to me would be to have a parameterized derived type. We may try it as we don't need type-bound procedures, but I don't know if compiler support is wide enough:

type :: linalg_storage(wp)
   integer, kind :: wp
   integer(ilp), allocatable :: pivot(:)
   real(wp), allocatable :: diagonals(:) 
   real(wp), allocatable :: rwork(:) 
   complex(wp), allocatable :: cwork(:) 
end type

So at the end of the day, maybe the big derived type is the best tradeoff?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the PDT seems like the ideal solution, I'm also worried about the support in compilers, but keepings its use plain should allow for a broad use.

So at the end of the day, maybe the big derived type is the best tradeoff?

Probably.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR. Here are some suggestions.

doc/specs/stdlib_linalg.md Outdated Show resolved Hide resolved
doc/specs/stdlib_linalg.md Outdated Show resolved Hide resolved
doc/specs/stdlib_linalg.md Outdated Show resolved Hide resolved
doc/specs/stdlib_linalg.md Outdated Show resolved Hide resolved
doc/specs/stdlib_linalg.md Outdated Show resolved Hide resolved
example/linalg/example_lstsq.f90 Outdated Show resolved Hide resolved
test/linalg/test_linalg_lstsq.fypp Show resolved Hide resolved
perazz and others added 9 commits April 26, 2024 08:33
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
@perazz
Copy link
Contributor Author

perazz commented Apr 26, 2024

Thank you all @jvdp1 @gnikit @jalvesz for the comments!
I believe it's polished enough now.
Regarding the allocation question @jalvesz, please let me know. Because there are 3/4 allocations per case, the subroutine interface would look similar to gelsd: something like call lstsq_solve(A,b,x,rwork,iwork,cwork,rank,err) so if one wants to handle the solution fully controlling it, maybe (thinking aloud) they could just use the gelsd LAPACK interface that we also provide? Otherwise, I'd say let's open a separate issue for this, so we can discuss further and find a suitable common ground

@jalvesz
Copy link
Contributor

jalvesz commented Apr 26, 2024

Regarding the allocation question @jalvesz, please let me know. Because there are 3/4 allocations per case, the subroutine interface would look similar to gelsd: something like call lstsq_solve(A,b,x,rwork,iwork,cwork,rank,err) so if one wants to handle the solution fully controlling it, maybe (thinking aloud) they could just use the gelsd LAPACK interface that we also provide?

Indeed, full control on that regard can be done using the gelsd interface directly.

I'd say let's open a separate issue for this, so we can discuss further and find a suitable common ground

Yes, let's keep this idea in mind for latter on.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @perazz . LGTM. I only have a couple of minor suggestions.

src/stdlib_linalg.fypp Outdated Show resolved Hide resolved
#:if rk!="xdp"

! Compute the least-squares solution to a real system of linear equations Ax = b
module function stdlib_linalg_${ri}$_lstsq_${ndsuf}$(a,b,cond,overwrite_a,rank,err) result(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if there is already a convention within stdlib regarding an implementation having both a function and a subroutine interfaces simultaneously

I am not aware of such conventions. However, I am in favour of providing both functions and subroutines when it is possible, and especially when large arrays might be involved..

@perazz
Copy link
Contributor Author

perazz commented Apr 29, 2024

@jvdp1 @jalvesz @gnikit I added the subroutine interface. This interface is now far more complicated, so I would like to ask your help deciding if the expert user interface has now too many control knobs? We now also have to export an interface to let the user compute what's the minimum working array space for all cases.

Here is the new interfaces I've prepared:

pure module subroutine stdlib_linalg_${ri}$_lstsq_space_${ndsuf}$(a,b,lrwork,liwork#{if rt.startswith('c')}#,lcwork#{endif}#)
!> Input matrix a[m,n]
${rt}$, intent(inout), target :: a(:,:)
!> Right hand side vector or array, b[n] or b[n,nrhs]
${rt}$, intent(in) :: b${nd}$
!> Size of the working space arrays
integer(ilp), intent(out) :: lrwork,liwork#{if rt.startswith('c')}#,lcwork#{endif}#
end subroutine stdlib_linalg_${ri}$_lstsq_space_${ndsuf}$

module subroutine stdlib_linalg_${ri}$_solve_lstsq_${ndsuf}$(a,b,x,real_storage,int_storage,&
#{if rt.startswith('c')}#cmpl_storage,#{endif}#cond,singvals,overwrite_a,rank,err)
!> Input matrix a[n,n]
${rt}$, intent(inout), target :: a(:,:)
!> Right hand side vector or array, b[n] or b[n,nrhs]
${rt}$, intent(in) :: b${nd}$
!> Result array/matrix x[n] or x[n,nrhs]
${rt}$, intent(inout), contiguous, target :: x${nd}$
!> [optional] real working storage space
real(${rk}$), optional, intent(inout), target :: real_storage(:)
!> [optional] integer working storage space
integer(ilp), optional, intent(inout), target :: int_storage(:)
#:if rt.startswith('complex')
!> [optional] complex working storage space
${rt}$, optional, intent(inout), target :: cmpl_storage(:)
#:endif
!> [optional] cutoff for rank evaluation: singular values s(i)<=cond*maxval(s) are considered 0.
real(${rk}$), optional, intent(in) :: cond
!> [optional] list of singular values [min(m,n)], in descending magnitude order, returned by the SVD
real(${rk}$), optional, intent(out), target :: singvals(:)
!> [optional] Can A,b data be overwritten and destroyed?
logical(lk), optional, intent(in) :: overwrite_a
!> [optional] Return rank of A
integer(ilp), optional, intent(out) :: rank
!> [optional] state return flag. On error if not requested, the code will stop
type(linalg_state_type), optional, intent(out) :: err
end subroutine stdlib_linalg_${ri}$_solve_lstsq_${ndsuf}$

@jalvesz
Copy link
Contributor

jalvesz commented Apr 29, 2024

I'll try to take a close look at this soon @perazz, great work!! This makes me think about the issues when one wants to solve sparse linear systems with different preconditioners, one also needs to create a work space for the preconditioner. So I think it is worth defining this notion properly now.

perazz and others added 3 commits April 29, 2024 19:07
@jvdp1
Copy link
Member

jvdp1 commented May 7, 2024

Thank you @perazz . It sounds ready to be merged IMO. Do you agree? @jalvesz @gnikit ?

@jalvesz
Copy link
Contributor

jalvesz commented May 8, 2024

Yes, I think this PR is ready. The subroutine & DT discussion can be continued down the road in a new PR, which are not show stoppers IMO ;)

@gnikit
Copy link
Member

gnikit commented May 8, 2024

LGTM, however some CI MacOS tests are failing. Do we know what is causing that?

@jalvesz
Copy link
Contributor

jalvesz commented May 8, 2024

Those are fails from before this merged PR #807 by @perazz, I think running the tests again would be successful.

@jvdp1
Copy link
Member

jvdp1 commented May 8, 2024

Thank you @perazz /all. I will merge this PR now, and I suggest to wait on #806 to create a new release.

@jvdp1 jvdp1 merged commit 33559f5 into fortran-lang:master May 8, 2024
17 checks passed
@perazz
Copy link
Contributor Author

perazz commented May 9, 2024

Great! Thank you @jvdp1 @jalvesz @gnikit. For the new release, how about waiting until most of the linear algebra functions are in? But, your proposal is also ok to me.

@perazz perazz deleted the least_squares branch May 9, 2024 06:17
@jvdp1
Copy link
Member

jvdp1 commented May 9, 2024

For the new release, how about waiting until most of the linear algebra functions are in? But, your proposal is also ok to me.

Which other linear algebra functions do you plan to add? Could they be grouped based on some theme?

@perazz
Copy link
Contributor Author

perazz commented May 9, 2024

If we look at NumPy.linalg for example, good LAPACK-based functions would be:

  • eigenvalues (eig, eigh, eigvals, eigvalsh)
  • condition number (cond)
  • matrix decompositions (svd, cholesky, qr)
  • pseudo-inverse (based on svd)

@jvdp1
Copy link
Member

jvdp1 commented May 9, 2024

We could follow the same subsections ("Matrix and vector products", "decompositions", "matrix eigenvalues", "norms", "solving equations") as in Numpy. Each section could be related with a new release. However, this would need some orders of implementation and impact your plans, which I don't want. So, as you prefer regarding releases.

@gnikit
Copy link
Member

gnikit commented May 9, 2024

For the new release, how about waiting until most of the linear algebra functions are in? But, your proposal is also ok to me.

For me the answer depends on how long it will take for all the linear algebra functions to be completed and the final release to be issued. My original thought was to aim for quick feedback loops from the users/community to inform future linalg API PRs.
The way I see it, incremental releases for high quality, self-contained work, come at a small cost to development. This is how I would classify these last few PRs.

However, the main objective is to deliver the linear algebra API not to follow release best-practices, so I leave this entirely up to you @perazz.

@perazz
Copy link
Contributor Author

perazz commented May 9, 2024

Makes a lot of sense @gnikit. If you'd like to prepare a new release already @jvdp1, I'm in for that, do let me know if/what inputs do you need from my side. For example, could be 0.5.1 so the community can respond to the API on Discourse. Thanks!

@jvdp1 jvdp1 mentioned this pull request May 11, 2024
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants