@slaren slaren commented Nov 12, 2025

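This can make a significant difference for large arrays: in the test below, the time per run drops from 7251.82 us on master to 4128.78 us with this PR, about 1.76x faster.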

  Device description: 13th Gen Intel(R) Core(TM) i9-13900K
  Device memory: 175085 MB (175085 MB free)

PR:
  ARGSORT(type=f32,ne=[65000,16,1,1],order=0):                  1033 runs -  4128.78 us/run -     8125 kB/run -    1.88 GB/s

master:
  ARGSORT(type=f32,ne=[65000,16,1,1],order=0):                  1033 runs -  7251.82 us/run -     8125 kB/run -    1.07 GB/s
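
For context, the case above sorts 16 rows of 65000 f32 values each and produces indices rather than sorted values. The following is a minimal sketch of a row-wise argsort, not the code from this PR (the diff is not shown here). `argsort_rows` is a hypothetical name, and the sketch assumes `order=0` means ascending. Passing the comparator as a template parameter is only one possible way to let the compiler inline the comparison; whether the PR does this is an assumption based on the branch name alone.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not the ggml implementation): for each row of src,
// writes into dst the indices that would sort that row under Compare.
template <typename Compare>
void argsort_rows(const float * src, int32_t * dst, int64_t ncols, int64_t nrows) {
    const Compare cmp{};
    for (int64_t r = 0; r < nrows; ++r) {
        const float * row = src + r * ncols;
        int32_t *     idx = dst + r * ncols;

        // Start from the identity permutation 0, 1, ..., ncols - 1.
        std::iota(idx, idx + ncols, 0);

        // Sort the indices by the values they refer to. The comparator type
        // is known at compile time, so the call can be inlined.
        // Note: rows containing NaN would need extra handling.
        std::sort(idx, idx + ncols,
                  [row, &cmp](int32_t a, int32_t b) { return cmp(row[a], row[b]); });
    }
}
```

With two rows `{3.0, 1.0, 2.0}` and `{0.5, 9.0, 4.0}`, calling `argsort_rows<std::less<float>>(x.data(), idx.data(), 3, 2)` fills `idx` with `{1, 2, 0}` for the first row and `{0, 2, 1}` for the second. Using `std::greater<float>` instead gives the descending order.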

@slaren slaren requested a review from ggerganov as a code owner November 12, 2025 20:23
@slaren slaren force-pushed the sl/cpu-argsort-template branch from f6fcec0 to 456e83d Compare November 12, 2025 20:24
@github-actions github-actions bot added the testing (Everything test related) and ggml (changes relating to the ggml tensor library for machine learning) labels Nov 12, 2025
@ggerganov ggerganov merged commit 879dec3 into master Nov 13, 2025
72 checks passed
@ggerganov ggerganov deleted the sl/cpu-argsort-template branch November 13, 2025 08:59