Skip to content

asm: missing function alignment#31284

Closed
npajkovsky wants to merge 1 commit into
openssl:masterfrom
npajkovsky:missing-fn-align
Closed

asm: missing function alignment#31284
npajkovsky wants to merge 1 commit into
openssl:masterfrom
npajkovsky:missing-fn-align

Conversation

@npajkovsky
Copy link
Copy Markdown

clang-22 reports missing alignment on MacOS.

ld: warning: arm64 function not 4-byte aligned: _asm_aescbc_sha1_hmac from libcrypto.a(libcrypto-lib-aes-sha1-armv8.o)
ld: warning: arm64 function not 4-byte aligned: _asm_aescbc_sha256_hmac from libcrypto.a(libcrypto-lib-aes-sha256-armv8.o)
ld: warning: arm64 function not 4-byte aligned: _asm_sha1_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha1-armv8.o)
ld: warning: arm64 function not 4-byte aligned: _asm_sha256_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha256-armv8.o)

clang-22 reports missing alignment on MacOS.

ld: warning: arm64 function not 4-byte aligned: _asm_aescbc_sha1_hmac from libcrypto.a(libcrypto-lib-aes-sha1-armv8.o)
ld: warning: arm64 function not 4-byte aligned: _asm_aescbc_sha256_hmac from libcrypto.a(libcrypto-lib-aes-sha256-armv8.o)
ld: warning: arm64 function not 4-byte aligned: _asm_sha1_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha1-armv8.o)
ld: warning: arm64 function not 4-byte aligned: _asm_sha256_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha256-armv8.o)

Signed-off-by: Nikola Pajkovsky <nikolap@openssl.org>
@npajkovsky npajkovsky requested a review from a team May 25, 2026 07:43
@npajkovsky npajkovsky self-assigned this May 25, 2026
@github-actions github-actions Bot added the severity: fips change The pull request changes FIPS provider sources label May 25, 2026
Copy link
Copy Markdown
Contributor

@Sashan Sashan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm fine with it. Still have a question for arm folks (@tom-cosgrove-arm)
why some files are using .align 4 for functions for example here in
crypto/aes/asm/aes-sha512-armv8.pl

  2071  
  2072  .global asm_sha512_hmac_aescbc_dec
  2073  .type   asm_sha512_hmac_aescbc_dec,%function
  2074  
  2075  .align  4
  2076  asm_sha512_hmac_aescbc_dec:

and other places OpenSSL uses .align 5 like here for montgomery multiplication in ./bn/asm/armv8-mont.pl

    76  
    77  .globl  bn_mul_mont
    78  .type   bn_mul_mont,%function
    79  .align  5
    80  bn_mul_mont:
    81          AARCH64_SIGN_LINK_REGISTER
    82  .Lbn_mul_mont:
    83          tst     $num,#3

Is .align 5 meaningful? in other words: when .align 4 should be used and when .align 5 should be used? thanks.

@openssl-machine openssl-machine added the approval: review pending This pull request needs review by a committer label May 25, 2026
@esyr
Copy link
Copy Markdown
Member

esyr commented May 25, 2026

.align 4 aligns to 16 bytes and .align 5 aligns to 32 bytes on arm: "For other systems, including ppc, i386 using a.out format, arm and strongarm, it is the number of low-order zero bits the location counter must have after advancement. For example .align 3 advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed. "[1].

[1] https://sourceware.org/binutils/docs/as.html#Align

Copy link
Copy Markdown
Contributor

@tom-cosgrove-arm tom-cosgrove-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@github-project-automation github-project-automation Bot moved this to Waiting Merge in Development Board May 26, 2026
@openssl-machine openssl-machine added approval: done This pull request has the required number of approvals and removed approval: review pending This pull request needs review by a committer labels May 26, 2026
@npajkovsky
Copy link
Copy Markdown
Author

@tom-cosgrove-arm is there any reason to use .align 5 in ./bn/asm/armv8-mont.pl or is it a bug?

@tom-cosgrove-arm
Copy link
Copy Markdown
Contributor

@tom-cosgrove-arm is there any reason to use .align 5 in ./bn/asm/armv8-mont.pl or is it a bug?

Given the use of both .align 4 and .align 5 it's almost certainly intentional, and on earlier AArch64 cores (the code dates back at least as far as 2015) might have made a small performance improvement. Probably not worth changing - the risk of small perf regression on earlier cores vs slight size benefit would suggest making no change.

Copy link
Copy Markdown
Member

@esyr esyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the actual alignment should be.

Comment thread crypto/aes/asm/aes-sha1-armv8.pl
Comment thread crypto/aes/asm/aes-sha256-armv8.pl
@esyr
Copy link
Copy Markdown
Member

esyr commented May 26, 2026

Hmmm, a quick git grep -C 10 '%function' **/*arm*| awk '/^--$/{ if (found == 0) { print p; }; found = 0; p = ""; } /.*/{ p = p "\n" $0; } /\.align/{ found = 1; }' yielded also the following:

crypto/armv4cpuid.pl-.global	OPENSSL_cleanse
crypto/modes/asm/ghash-armv4.pl-.global	gcm_gmult_4bit
crypto/sha/asm/sha512-armv4.pl-.global	sha512_block_data_order

@tom-cosgrove-arm, may I ask to evaluate whether those functions need .align directives as well?

@t8m t8m added branch: master Applies to master branch triaged: bug The issue/pr is/fixes a bug tests: exempted The PR is exempt from requirements for testing branch: 3.5 Applies to openssl-3.5 branch: 3.6 Applies to openssl-3.6 branch: 4.0 Applies to openssl-4.0 labels May 27, 2026
@tom-cosgrove-arm
Copy link
Copy Markdown
Contributor

@tom-cosgrove-arm, may I ask to evaluate whether those functions need .align directives as well?

Thanks for checking. Yes, looks like OPENSSL_cleanse and gcm_gmult_4bit are definitely missing an align. sha512_block_data_order is almost certainly okay, because the data before it starts with .align 5 then has a load of WORD64s, then either .word OPENSSL_armcap_P followed by skip 32-4 or just skip 32. However, adding an align 5 there too wouldn't hurt, and would definitely be more future-proof if anyone goes to add more data.

@openssl-machine
Copy link
Copy Markdown
Collaborator

24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually.

@npajkovsky npajkovsky added approval: ready to merge The 24 hour grace period has passed, ready to merge and removed approval: done This pull request has the required number of approvals labels May 28, 2026
openssl-machine pushed a commit that referenced this pull request May 28, 2026
clang-22 reported missing alignment on MacOS:

    ld: warning: arm64 function not 4-byte aligned: _asm_sha1_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha1-armv8.o)
    ld: warning: arm64 function not 4-byte aligned: _asm_sha256_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha256-armv8.o)

Add ".align 4" directives to the affected functions.

Signed-off-by: Nikola Pajkovsky <nikolap@openssl.org>

Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.foundation>
Reviewed-by: Eugene Syromiatnikov <esyr@openssl.org>
MergeDate: Thu May 28 08:31:59 2026
(Merged from #31284)
@esyr
Copy link
Copy Markdown
Member

esyr commented May 28, 2026

sha512_block_data_order is almost certainly okay, because the data before it starts with .align 5

And that array should be in .rodata, strictly speaking.

@tom-cosgrove-arm
Copy link
Copy Markdown
Contributor

sha512_block_data_order is almost certainly okay, because the data before it starts with .align 5

And that array should be in .rodata, strictly speaking.

Yes, but that's not a change for a "missing function alignment" PR

openssl-machine pushed a commit that referenced this pull request May 28, 2026
clang-22 reported missing alignment on MacOS:

    ld: warning: arm64 function not 4-byte aligned: _asm_sha1_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha1-armv8.o)
    ld: warning: arm64 function not 4-byte aligned: _asm_sha256_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha256-armv8.o)

Add ".align 4" directives to the affected functions.

Signed-off-by: Nikola Pajkovsky <nikolap@openssl.org>

Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.foundation>
Reviewed-by: Eugene Syromiatnikov <esyr@openssl.org>
MergeDate: Thu May 28 08:31:59 2026
(Merged from #31284)

(cherry picked from commit 94fbc02)
openssl-machine pushed a commit that referenced this pull request May 28, 2026
clang-22 reported missing alignment on MacOS:

    ld: warning: arm64 function not 4-byte aligned: _asm_sha1_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha1-armv8.o)
    ld: warning: arm64 function not 4-byte aligned: _asm_sha256_hmac_aescbc_dec from libcrypto.a(libcrypto-lib-aes-sha256-armv8.o)

Add ".align 4" directives to the affected functions.

Signed-off-by: Nikola Pajkovsky <nikolap@openssl.org>

Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.foundation>
Reviewed-by: Eugene Syromiatnikov <esyr@openssl.org>
MergeDate: Thu May 28 08:31:59 2026
(Merged from #31284)

(cherry picked from commit 94fbc02)
@esyr esyr removed the branch: 3.5 Applies to openssl-3.5 label May 28, 2026
@esyr
Copy link
Copy Markdown
Member

esyr commented May 28, 2026

The commit that has introduced the affected files, openssl-3.6.0-alpha1~907 "Implement interleaving aes-cbc-hmac-sha on aarch64", is not present in openssl-3.5, removing the branch label.

@esyr
Copy link
Copy Markdown
Member

esyr commented May 28, 2026

Applied to master, openssl-4.0, and openssl-3.6, thank you for your contribution.

@esyr esyr closed this May 28, 2026
@github-project-automation github-project-automation Bot moved this from Waiting Merge to Done in Development Board May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval: ready to merge The 24 hour grace period has passed, ready to merge branch: master Applies to master branch branch: 3.6 Applies to openssl-3.6 branch: 4.0 Applies to openssl-4.0 severity: fips change The pull request changes FIPS provider sources tests: exempted The PR is exempt from requirements for testing triaged: bug The issue/pr is/fixes a bug

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

7 participants