More docs formatting #5

Draft
wants to merge 6 commits into
base: develop

cgmb
Owner

@cgmb cgmb commented Oct 1, 2021

Testing...

The docs for 4.1.0 were rebuilt a few months after release, and they
changed in appearance (for the worse). Examining the build logs, it
seems that the packages used for the build changed from Sphinx-3.5.3
and breathe-4.28.0 to Sphinx-4.0.2 and breathe-4.30.0.

Following the ReadTheDocs advice for Reproducible Builds [1], I'm now
specifying exact versions for our key dependencies.

[1]: https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
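
As an illustration of what exact pinning guards against, here is a minimal sketch that compares installed package versions against a pin list. The pin values are an assumption: this page does not show which versions the PR pins, so the pre-regression versions from the build logs (Sphinx 3.5.3 and breathe 4.28.0) are used. `PINS`, `find_mismatches`, and `installed_versions` are hypothetical names, not part of the project.

```python
from importlib import metadata

# Hypothetical pin list. The PR pins exact versions, but this page does not say
# which ones; the pre-regression versions from the build logs are assumed here.
PINS = {"sphinx": "3.5.3", "breathe": "4.28.0"}


def find_mismatches(pins, installed):
    """Return {package: (pinned, found)} for each package whose installed
    version differs from its pin. `found` is None if the package is missing."""
    mismatches = {}
    for name, pinned in pins.items():
        found = installed.get(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


def installed_versions(names):
    """Look up installed versions via importlib.metadata (Python 3.8+)."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = find_mismatches(PINS, installed_versions(PINS))
    for name, (pinned, found) in report.items():
        print(f"{name}: pinned {pinned}, found {found}")
```

In a pip requirements file the same pins would be written with the `==` specifier, for example `sphinx==3.5.3` and `breathe==4.28.0`. A check like this would have flagged the rebuilt 4.1.0 docs, where the environment had moved to Sphinx 4.0.2 and breathe 4.30.0.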
cgmb added a commit that referenced this pull request Feb 8, 2022
* update row swapping methods (laswp)
* 2- and 3-steps recursion/iteration
* add local iamax+ger+scal
* rebase develop / fix merge conflicts
* tuning new blocksizes normal case
* tuning new blocksizes batch cases
* back to 2-steps recursion
* tuning new blocksizes for non-pivoting versions
* remove specialized kernels for small panel matrices
* update workspace requirements
* Changelog and documentation
* GETRF suggestions (#5)
     - Eliminate dynamic allocation
     - Extract swap function
* Changed pivot from a template argument to a function argument (ROCm#6)
     - Changed pivot from a template argument to a function argument
     - Add pivot argument to template functions
* Use new swap helper
* fix workspace-size bug
* add launch bounds
* variable thread-group sizes

Co-authored-by: Cory Bloor <Cordell.Bloor@amd.com>
Co-authored-by: Troy Alderson <58866654+tfalders@users.noreply.github.com>
cgmb pushed a commit that referenced this pull request Feb 17, 2022
* Improved variables_map emulation

* Prototype Arguments class

* Prototype argument model for gesvd

* Addressed review comments

* Prototype argument model for sygv/hegv

* Print names of arguments not consumed by tests

* Addressed review comment

* change format in messages and outputs (#5)

* format messages and outputs

* format messages and outputs

* review corrections

* Addressed review comments

* New argument model for syev/heev

* New argument model for sygst/hegst

* New argument model for sytrd/hetrd

* New argument model for potrf

* New argument model for getrs

* New argument model for getri

* New argument model for getrf

* New argument model for geqrf

* New argument model for geqlf

* New argument model for gels

* New argument model for gelqf

* New argument model for gebrd

* New argument model for bdsqr, sterf, and steqr

* New argument model for labrd, lacgv, laswp, and latrd

* New argument model for larf, larfb, larfg, and larft

* New argument model for orgxx functions

* New argument model for ormxx functions

* New argument model for HMM test

* Final clean-up

* Consistent style for bench arguments

* Updated changelog

* Alphabetize arguments

* Bug fixes

* Addressed review comments

* Apply clang format

* Addressed review comment

* Addressed some review comments

* Remove static defaults from m, n, k, storev, side, k1, k2, and nu

* Default to square matrices

* Updated help string

* Use m as required parameter instead of n

* Adjust laswp and latrd defaults

Co-authored-by: Juan Zuniga-Anaya <50754207+jzuniga-amd@users.noreply.github.com>
cgmb added a commit that referenced this pull request May 3, 2024
* use wave of threads

* use rocthrust for prefix sum

* replace thrust::copy with hipMemcpyAsync

* add __restrict__ and change ifactor

* is_found in shared memory, single loop in copy kernel

* update comments

* add option to use rocprim

* use rocprim, remove rocthrust

* Add rocprim as a build dependency (#5)

* remove #undef HIP_CHECK

* getMemorySize return a status value

* rename LUp to ptrT

* add SPLITLU_SWITCH_SIZE

* removed shared array is_found

* additional comments and assertions that the number of threads in a thread block is a multiple of warpSize

* use 2D grid of threads in thread block

* minor updates to remove extra checks for rocprim and consider using shared memory in splitlu_kernel

* update to use HIP_CHECK and ROCBLAS_CHECK macros

* remove SPLITLU_BS1

* prepare splitlu and sumlu test code

* add random sparse matrix generator to tests

* files and directories re-organization

* files and directories re-organization

* file renaming

* revise code for zeros in diagonals

* use atomic operations

* add more cases for splitlu

* clean up code for debugging

* add option for sorted and unsorted column indices

* simplify code

* minor updates to address concerns

* update with performance improvement for splitlu

* Resolve conflict with 6bcc458

* remove duplicate files

* resolve conflicts with develop branch

* remove unnecessary files

---------

Co-authored-by: Cory Bloor <Cordell.Bloor@amd.com>
Co-authored-by: jzuniga-amd <juan.zuniga-anaya@amd.com>
1 participant