Finish sdsdot/hsdot extended-precision dot products (C8 part 4) #21
Merged
Both were broken and had no header prototype: sdsdot used `n` instead of `N` and returned an undefined `dot`; hsdot initialized a float32_t accumulator directly from a float16_t alpha. Neither took rndMode.

Rewrite both cleanly: sdsdot accumulates single-precision products in double precision plus the bias alpha; hsdot accumulates half-precision products in single precision. Both handle negative strides, canonicalize their NaN output, take rndMode, gain header prototypes, and join the Makefile.

test_sdsdot.c: 1 + 2*4 + 3*5 = 24 for both. 179/179 tests pass.

This completes every Level-1 routine the README lists as implemented.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
38b8360 to 3961160
Both were broken and had no header prototype: `sdsdot` used `n` instead of `N` and returned an undefined `dot`; `hsdot` initialized a `float32_t` accumulator directly from a `float16_t` alpha. Neither took `rndMode`.

Rewritten cleanly: `sdsdot` accumulates single-precision products in double precision plus the bias `alpha`; `hsdot` accumulates half-precision products in single precision. Both handle negative strides, canonicalize NaN output, take `rndMode`, gain header prototypes, and join the Makefile.

`test_sdsdot.c`: `1 + 2·4 + 3·5 = 24` for both. 179/179.

This completes every Level-1 routine the README lists as implemented.
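For readers without the diff open, the accumulation scheme described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the name `sketch_sdsdot` and its signature are hypothetical, it uses native `float`/`double` rather than the library's `float32_t` types, it omits `rndMode` and NaN canonicalization, and the negative-stride handling assumes the reference-BLAS convention of starting at element `(1 - N) * inc`.

```c
#include <stddef.h>

/* Hypothetical sketch (not the PR's implementation): a dot product of
 * single-precision inputs accumulated in double precision, seeded with
 * the bias alpha. rndMode and NaN canonicalization are left out. */
float sketch_sdsdot(int N, float alpha,
                    const float *x, int incx,
                    const float *y, int incy)
{
    double acc = (double)alpha;   /* accumulator starts at the bias */

    if (N <= 0)
        return (float)acc;

    /* Assumed reference-BLAS convention: with a negative stride the walk
     * starts at the far end of the vector so indices stay non-negative. */
    ptrdiff_t ix = (incx < 0) ? (ptrdiff_t)(1 - N) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (ptrdiff_t)(1 - N) * incy : 0;

    for (int i = 0; i < N; i++) {
        /* Widen both operands before multiplying so the product and the
         * running sum are both computed in double precision. */
        acc += (double)x[ix] * (double)y[iy];
        ix += incx;
        iy += incy;
    }

    return (float)acc;            /* round once, at the end */
}
```

One reading of the test arithmetic is `alpha = 1`, `x = {2, 3}`, `y = {4, 5}`, which gives `1 + 2·4 + 3·5 = 24` under this sketch; the actual inputs in `test_sdsdot.c` may differ.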
🤖 Generated with Claude Code