Skip to content

Conversation

@Gadflyii
Copy link
Owner

@Gadflyii Gadflyii commented Oct 7, 2025

Make sure to read the contributing guidelines before submitting a PR

ggerganov and others added 7 commits October 7, 2025 08:21
* metal : ssm_scan minor opts

* metal : get_rows optimize

* metal : cpy optimize

* metal : ssm_conv opt

* metal : ssm_scan simplify

* metal : ssm_Scan opt
* tests : add -INF blocks to the KQ mask in the FA tests

* cont : bump -INF block size to 64

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>

* ggml : prevent division by zero in FA CPU op

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
* metal : pad K, V and Mask when needed

* cont : simplify

* cuda : add TODO about KV padding requirement

* metal : add comments

* metal : remove mask padding requirement
Update the README file to match the newly added functionality of
exposing multiple devices from a single server.

Co-authored-by: Diego Devesa <slarengh@gmail.com>
* webui : added download action (#13552)

* webui : import and export (for all conversations)

* webui : fixed download-format, import of one conversation

* webui : add ExportedConversations type for chat import/export

* feat: Update naming & order

* chore: Linting

* webui : Updated static build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
@Gadflyii Gadflyii merged commit dec0c8d into Gadflyii:master Oct 7, 2025
76 of 118 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants