-
Notifications
You must be signed in to change notification settings - Fork 105
batched api / eti documentation #2643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
355e932
to
2b36852
Compare
* Par_ilut enhancements and fixes 1) Improve documentation of parameters and confusing code segments 2) Do not let leftover junk in L_entries and U_entries to cause the sorted check to fail. Those are output-only parameters. 3) Allow the user to skip the residual computation by setting residual_norm_delta_stop to zero (or < 0). This can save some memory if memory is tight. Signed-off-by: James Foucar <jgfouca@sandia.gov> * Docs Signed-off-by: James Foucar <jgfouca@sandia.gov> * Fix doc build errors Signed-off-by: James Foucar <jgfouca@sandia.gov> --------- Signed-off-by: James Foucar <jgfouca@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: romintomasetti <romin.tomasetti@gmail.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
* Avoid illegal use of View::stride and View::extent in BLAS 1 As the documentation says you can't legally call extent(1)/stride(1) on a rank-1 View. Signed-off-by: Christian Trott <crtrott@sandia.gov> * Avoid illegal use of View::stride and View::extent in Batched As the documentation says you can't legally call extent(1)/stride(1) on a rank-1 View Signed-off-by: Christian Trott <crtrott@sandia.gov> --------- Signed-off-by: Christian Trott <crtrott@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
Including the `std_algorithms/Kokkos_IsSorted.hpp` header is needed for Trilinos builds to compile Addresses issue kokkos#2612 Signed-off-by: Nathan Ellingwood <ndellin@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
This older rocm toolchain is very slow compared to the new rocm 6.2.1 so it is disabled in favor of that newer version. Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
…#2615) Signed-off-by: Daniel Arndt <arndtd@ornl.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: Mitch Murphy <alphataubio@gmail.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
6add37a
to
adf0244
Compare
* osx workflow: removing explicit parallelism levels Switching to implicit parallelism levels for build and installation as it may change over time based on github hardware and we don't want to muck with it everytime... Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Explicitly setting the parallelism level Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Untabify Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Bumping cxx standard to 20 and removing deprecated_code_3 The standard should be raised ahead of the upcoming release 5.0.0 to make sure the compiler is supporting it and also to prepare the code itself if needed. Additionally removing the old Kokkos_ENABLE_DEPRECATED_CODE_3 configuration option as it is no longer supported. Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> --------- Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
…ave changed (kokkos#2568) Signed-off-by: Junchao Zhang <jczhang@anl.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
* Update changelog for 4.6.01 Signed-off-by: Nathan Ellingwood <ndellin@sandia.gov> * workflows: update kokkos version to 4.6.01 Signed-off-by: Nathan Ellingwood <ndellin@sandia.gov> --------- Signed-off-by: Nathan Ellingwood <ndellin@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
* ConjTrans support for batched team gemm Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> * fix: scalar types Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> * use rank() rather than rank to check the rank of View types Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> * Add a helper to avoid illegal use of extent and stride Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> --------- Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> Co-authored-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> Signed-off-by: alphataubio <alphataubio@gmail.com>
Bumps [m2r2](https://github.com/crossnox/m2r2) from 0.3.2 to 0.3.4. - [Changelog](https://github.com/CrossNox/m2r2/blob/development/CHANGES.md) - [Commits](CrossNox/m2r2@v0.3.2...v0.3.4) --- updated-dependencies: - dependency-name: m2r2 dependency-version: 0.3.4 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.16 to 3.28.17. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@28deaed...60168ef) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 3.28.17 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
Bumps [charset-normalizer](https://github.com/jawah/charset_normalizer) from 3.4.1 to 3.4.2. - [Release notes](https://github.com/jawah/charset_normalizer/releases) - [Changelog](https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md) - [Commits](jawah/charset_normalizer@3.4.1...3.4.2) --- updated-dependencies: - dependency-name: charset-normalizer dependency-version: 3.4.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
* Lapack - SVD: Unifying SVector layout to LayoutLeft This reduces ETI and allows users to use any valid layout and still hit the ETI target. Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Lapack - SVD: applying clang-format Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Lapack - SVD: updating the documentation to reflect new requirement Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> --------- Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> Co-authored-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: Brian Kelley <bmkelle@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
* Add certifi==2025.4.26 certifi is required by requests==2.32.2, but requests does not provide a hash. This causes our --require-hashes to fail, but we don't catch it on github because certifi is pre-installed there. Signed-off-by: Carl Pearson <cwpears@sandia.gov> * Add certifi source hash to build_requirements.txt Signed-off-by: Carl Pearson <cwpears@sandia.gov> Co-authored-by: Luc Berger <lberge@sandia.gov> --------- Signed-off-by: Carl Pearson <cwpears@sandia.gov> Co-authored-by: Luc Berger <lberge@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
* Add common test for atexit(finalize) Make sure TPL singletons can clean themselves up when Kokkos::finalize is run at program exit. Signed-off-by: Brian Kelley <bmkelle@sandia.gov> * Rework TPL singletons - introduce simple Singleton class that reduces duplication in the TPL files - this replaces global unique_ptr to store the singleton object. unique_ptr had a a double-free bug when Kokkos::finalize was called as an atexit hook Signed-off-by: Brian Kelley <bmkelle@sandia.gov> * Formatting Signed-off-by: Brian Kelley <bmkelle@sandia.gov> * Update for PR feedback Signed-off-by: Brian Kelley <bmkelle@sandia.gov> * Revert fix for unused var warning Signed-off-by: Brian Kelley <bmkelle@sandia.gov> --------- Signed-off-by: Brian Kelley <bmkelle@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
Bumps [snowballstemmer](https://github.com/snowballstem/snowball) from 2.2.0 to 3.0.1. - [Changelog](https://github.com/snowballstem/snowball/blob/master/NEWS) - [Commits](snowballstem/snowball@v2.2.0...v3.0.1) --- updated-dependencies: - dependency-name: snowballstemmer dependency-version: 3.0.1 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
Bumps [actions/dependency-review-action](https://github.com/actions/dependency-review-action) from 4.6.0 to 4.7.0. - [Release notes](https://github.com/actions/dependency-review-action/releases) - [Commits](actions/dependency-review-action@ce3cf95...38ecb5b) --- updated-dependencies: - dependency-name: actions/dependency-review-action dependency-version: 4.7.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
* Fixup documentation cover page Signed-off-by: Damien L-G <dalg24@gmail.com> * Fix Core "Getting Started" URL Signed-off-by: Damien L-G <dalg24@gmail.com> --------- Signed-off-by: Damien L-G <dalg24@gmail.com> Signed-off-by: alphataubio <alphataubio@gmail.com>
* Clone H100 PR tests onto Lychee Signed-off-by: Carl Pearson <cwpears@sandia.gov> * h100_lychee job cleanups Signed-off-by: Carl Pearson <cwpears@sandia.gov> * h100_lychee: add TPL build Signed-off-by: Carl Pearson <cwpears@sandia.gov> --------- Signed-off-by: Carl Pearson <cwpears@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
* Removing stride_ from blas and sparse Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Switching stride_N() for stride(N) Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Applying clang-format Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> --------- Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: Carl Pearson <cwpears@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
…2640) * Lapack - SVD: fixing build issue, will look at test tomorrow Modifying the ETI specialization to make it consistent with a LayoutLeft vector view. Fixing some issue in RK unit-test. Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Lapack - svd: removing tests for layout right and guarding m < n Cuda and all other TPLs only support layout left so no point in testing for layout right, additionally Cuda only supports m >= n use case so I am guarding the tests with matrices where m < n. Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> * Applying clang-format Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> --------- Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov> Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
Signed-off-by: alphataubio <alphataubio@gmail.com>
adf0244
to
858d8ae
Compare
@lucbv READY FOR REVIEW ( Preview: https://alphataubio.com/kokkos-kernels/ ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some initial comment on the first few files, more to come.
Parameters | ||
========== | ||
|
||
:member: Team execution policy instance (only for team version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's actually not a TeamPolicy
instance but rather a TeamPolicy::member_type
its API is documented here
docs/source/API/batched-index.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is mostly wrong and misleading, it needs to be removed.
Kokkos::View<scalar_type**, Kokkos::LayoutRight, memory_space> A("A", n, n); | ||
|
||
// Initialize matrix on host | ||
auto A_host = Kokkos::create_mirror_view(A); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates a host view with a different memory allocation from that of A
so you can keep it around to verify results later.
auto A_host = Kokkos::create_mirror_view(A); | |
auto Aorig_host = Kokkos::create_mirror(A); |
// Copy results back to host | ||
Kokkos::deep_copy(A_host, A); | ||
|
||
// Verify results |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove everything below as it makes things a bit longer and more complicated. This is not meant to be a unit-test, if we are not testing this properly then we should add an appropriate test. Instead printing the values to screen could be appropriate to demonstrate that the function worked appropriately.
// Copy results back to host | ||
Kokkos::deep_copy(A_host, A); | ||
|
||
// Verify for each batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, there is not need for all of this here.
- :math:`u` is the Householder vector | ||
- :math:`\tau` is the Householder scalar | ||
- :math:`u^T` is the transpose of :math:`u` | ||
- :math:`u^H` is the conjugate transpose of :math:`u` (for complex matrices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also no need for this
Parameters | ||
========== | ||
|
||
:member: Team execution policy instance (only for team version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is a team policy member
========== | ||
|
||
:member: Team execution policy instance (only for team version) | ||
:u2: View containing the Householder vector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really: view containing the entries of u
starting at the second element, so it's size is n-1 if u is of size n
// Copy results back to host | ||
Kokkos::deep_copy(A_host, A); | ||
|
||
// Verify results: Manually compute H*A and compare |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not useful for an example, this is already implemented in our unit-tests with proper checks
// Copy results back to host | ||
Kokkos::deep_copy(A_host, A); | ||
|
||
// Verify results for a few batches |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove everything below
Uh oh!
There was an error while loading. Please reload this page.