x86_64/AArch64: Add .size directive to all assembly functions #956

Merged: hanno-becker merged 1 commit into main from asm-size on Feb 7, 2026

@mkannwischer

Add size information to function symbols.
Size information is added through an MLD_ASM_FN_SIZE macro (mapping to .size for ELF targets and to nothing otherwise), which autogen adds automatically. This makes the assembly functions show up with their correct size in the ELF file instead of always having a zero size, making it easier to see how much space each function takes up.
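
For illustration, here is a minimal sketch of how such a macro could be defined. Only the `MLD_ASM_FN_SIZE` name comes from this PR. The `__ELF__` check, the `.size sym, . - sym` expansion, and the `DEMO_*` helpers and `demo_fn` symbol are assumptions made for the example, and the actual definition in the repository may differ.

```c
/* Sketch only: the real macro in mldsa-native may be defined differently. */
#if defined(__ELF__)
/* ELF targets: record the symbol size as "current location minus start". */
#define MLD_ASM_FN_SIZE(sym) .size sym, . - sym
#define DEMO_IS_ELF 1
#else
/* Non-ELF targets (e.g. Mach-O, COFF): expand to nothing. */
#define MLD_ASM_FN_SIZE(sym)
#define DEMO_IS_ELF 0
#endif

/*
 * Assumed usage in a preprocessed assembly file (.S), placed after the
 * last instruction of the function:
 *
 *   demo_fn:
 *       ret
 *   MLD_ASM_FN_SIZE(demo_fn)
 */

/* Hypothetical helpers that stringify the macro expansion for inspection. */
#define DEMO_STR(...) #__VA_ARGS__
#define DEMO_XSTR(...) DEMO_STR(__VA_ARGS__)

/* Returns the directive text the macro expands to for demo_fn
 * (an empty string when the target is not ELF). */
static const char *demo_size_directive(void)
{
  return DEMO_XSTR(MLD_ASM_FN_SIZE(demo_fn));
}

/* Reports which branch of the conditional definition was taken. */
static int demo_is_elf(void) { return DEMO_IS_ELF; }
```

With a directive like this in place, tools such as `nm -S` or `readelf -s` can report a non-zero size for the assembly symbol.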

Co-authored-by: Anders Sonmark <Anders.Sonmark@axis.com>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
mkannwischer added a commit to pq-code-package/mlkem-native that referenced this pull request Feb 7, 2026
Add size information to function symbols.
Size information is added through an MLD_ASM_FN_SIZE macro (mapping to .size
for ELF targets and to nothing otherwise), which autogen adds automatically.
This makes the assembly functions show up with their correct size in the ELF
file instead of always having a zero size, making it easier to see how much
space each function takes up.

- Port of pq-code-package/mldsa-native#956

Co-authored-by: Anders Sonmark <Anders.Sonmark@axis.com>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>

oqs-bot commented Feb 7, 2026

CBMC Results (ML-DSA-44)

Full Results (174 proofs)
Proof Current Previous Change
**TOTAL** 2140s 2088s +2.5%
sign_verify_internal 265s 249s +6%
polyvecl_pointwise_acc_montgomery_c 223s 208s +7%
mld_attempt_signature_generation 216s 213s +1%
poly_pointwise_montgomery_c 153s 140s +9%
rej_uniform_native 130s 130s +0%
mld_invntt_layer 119s 116s +3%
mld_ct_memcmp 89s 84s +6%
mld_ntt_layer 46s 45s +2%
keccak_squeezeblocks_x4 45s 43s +5%
sign_signature_internal 44s 44s +0%
polyvec_matrix_expand 31s 29s +7%
fqmul 21s 19s +11%
rej_uniform_c 21s 19s +11%
rej_uniform 20s 20s +0%
poly_chknorm_c 19s 17s +12%
mld_compute_t0_t1_tr_from_sk_components 17s 12s +42%
poly_uniform_eta_4x 17s 15s +13%
polymat_permute_bitrev_to_custom 16s 15s +7%
polyt0_unpack 16s 16s +0%
poly_uniform_4x 15s 15s +0%
keccakf1600x4_permute_native 13s 12s +8%
polyeta_unpack 13s 12s +8%
keccak_absorb_once_x4 12s 13s -8%
polyz_unpack_c 12s 11s +9%
mld_ntt_butterfly_block 11s 11s +0%
mld_polyvecl_permute_bitrev_to_custom_native 10s 12s -17%
mld_check_pct 9s 10s -10%
polyveck_add 9s 8s +12%
keccakf1600_permute_native 8s 9s -11%
poly_invntt_tomont_c 8s 7s +14%
polyveck_decompose 8s 7s +14%
keccak_absorb 7s 6s +17%
keccakf1600_permute 7s 7s +0%
mld_compute_pack_z 7s 4s +75%
polyw1_pack 7s 4s +75%
ntt_native_x86_64 6s 2s +200%
poly_chknorm_native 6s 4s +50%
poly_pointwise_montgomery_native 6s 4s +50%
poly_use_hint_c 6s 5s +20%
polyvec_matrix_expand_serial 6s 6s +0%
polyveck_ntt 6s 6s +0%
polyveck_pack_eta 6s 4s +50%
polyveck_power2round 6s 6s +0%
polyveck_reduce 6s 4s +50%
polyveck_use_hint 6s 3s +100%
sign_verify_pre_hash_shake256 6s 6s +0%
make_hint 5s 3s +67%
poly_challenge 5s 4s +25%
poly_decompose_native 5s 5s +0%
poly_pointwise_montgomery 5s 4s +25%
polyvec_matrix_pointwise_montgomery 5s 8s -38%
polyveck_caddq 5s 6s -17%
polyveck_chknorm 5s 5s +0%
polyveck_make_hint 5s 4s +25%
polyveck_pointwise_poly_montgomery 5s 5s +0%
polyveck_unpack_eta 5s 2s +150%
polyvecl_chknorm 5s 4s +25%
polyvecl_pointwise_acc_montgomery_native 5s 2s +150%
rej_eta_native 5s 5s +0%
shake128x4_absorb_once 5s 4s +25%
shake256_absorb 5s 2s +150%
sign 5s 6s -17%
sign_pk_from_sk 5s 5s +0%
sign_signature 5s 3s +67%
sign_signature_extmu 5s 6s -17%
unpack_sk 5s 3s +67%
decompose 4s 4s +0%
keccakf1600x4_extract_bytes 4s 2s +100%
mld_ct_cmask_nonzero_u8 4s 4s +0%
mld_sample_s1_s2 4s 6s -33%
pack_pk 4s 4s +0%
pack_sig_z 4s 5s -20%
pack_sk 4s 3s +33%
poly_add 4s 5s -20%
poly_caddq 4s 4s +0%
poly_ntt 4s 4s +0%
poly_ntt_native 4s 4s +0%
poly_shiftl 4s 4s +0%
poly_uniform 4s 5s -20%
polyt1_pack 4s 2s +100%
polyveck_invntt_tomont 4s 5s -20%
polyveck_pack_t0 4s 2s +100%
polyveck_pack_w1 4s 3s +33%
polyveck_shiftl 4s 5s -20%
polyveck_sub 4s 5s -20%
polyvecl_ntt 4s 4s +0%
polyvecl_pointwise_acc_montgomery 4s 3s +33%
polyvecl_uniform_gamma1 4s 3s +33%
polyvecl_unpack_eta 4s 2s +100%
polyz_unpack 4s 2s +100%
polyz_unpack_native 4s 3s +33%
rej_eta 4s 5s -20%
shake128_absorb 4s 3s +33%
shake256x4_squeezeblocks 4s 5s -20%
sign_keypair 4s 4s +0%
sign_keypair_internal 4s 8s -50%
sign_signature_pre_hash_internal 4s 3s +33%
sign_signature_pre_hash_shake256 4s 5s -20%
unpack_hints 4s 6s -33%
caddq 3s 5s -40%
fqscale 3s 2s +50%
keccakf1600_extract_bytes (big endian) 3s 4s -25%
keccakf1600_xor_bytes (big endian) 3s 2s +50%
keccakf1600x4_xor_bytes 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 3s +0%
mld_prepare_domain_separation_prefix 3s 5s -40%
mld_value_barrier_i64 3s 3s +0%
pack_sig_c_h 3s 3s +0%
poly_caddq_c 3s 2s +50%
poly_caddq_native_aarch64 3s 3s +0%
poly_chknorm 3s 2s +50%
poly_decompose 3s 3s +0%
poly_invntt_tomont 3s 5s -40%
poly_ntt_c 3s 2s +50%
poly_power2round 3s 2s +50%
poly_sub 3s 2s +50%
poly_uniform_eta 3s 5s -40%
poly_uniform_gamma1 3s 3s +0%
poly_uniform_gamma1_4x 3s 4s -25%
poly_use_hint_native 3s 4s -25%
polyeta_pack 3s 5s -40%
polyt0_pack 3s 3s +0%
polyvecl_permute_bitrev_to_custom 3s 3s +0%
polyvecl_unpack_z 3s 6s -50%
polyz_pack 3s 3s +0%
reduce32 3s 2s +50%
rej_eta_c 3s 3s +0%
shake128_init 3s 2s +50%
shake128x4_squeezeblocks 3s 3s +0%
shake256_finalize 3s 5s -40%
shake256_init 3s 2s +50%
shake256_squeeze 3s 4s -25%
sign_verify 3s 3s +0%
sign_verify_extmu 3s 6s -50%
sign_verify_pre_hash_internal 3s 4s -25%
keccak_finalize 2s 3s -33%
keccak_init 2s 2s +0%
keccak_squeeze 2s 2s +0%
keccakf1600_xor_bytes 2s 4s -50%
keccakf1600x4_permute 2s 2s +0%
mld_ct_abs_i32 2s 3s -33%
mld_ct_cmask_neg_i32 2s 3s -33%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_ct_get_optblocker_u32 2s 4s -50%
mld_ct_sel_int32 2s 3s -33%
mld_h 2s 5s -60%
mld_keccakf1600_extract_bytes 2s 3s -33%
mld_sample_s1_s2_serial 2s 3s -33%
mld_value_barrier_u32 2s 2s +0%
mld_value_barrier_u8 2s 3s -33%
montgomery_reduce 2s 2s +0%
poly_caddq_native 2s 2s +0%
poly_decompose_c 2s 4s -50%
poly_invntt_tomont_native 2s 4s -50%
poly_make_hint 2s 5s -60%
polyveck_unpack_t0 2s 3s -33%
polyvecl_pack_eta 2s 4s -50%
polyvecl_uniform_gamma1_serial 2s 4s -50%
power2round 2s 2s +0%
shake128_finalize 2s 2s +0%
shake128_release 2s 2s +0%
shake128_squeeze 2s 2s +0%
sign_open 2s 8s -75%
sys_check_capability 2s 3s -33%
unpack_sig 2s 3s -33%
use_hint 2s 3s -33%
mld_ct_get_optblocker_u8 1s 2s -50%
poly_reduce 1s 2s -50%
poly_use_hint 1s 1s +0%
polyt1_unpack 1s 5s -80%
shake256 1s 2s -50%
shake256_release 1s 2s -50%
shake256x4_absorb_once 1s 3s -67%
unpack_pk 1s 4s -75%


oqs-bot commented Feb 7, 2026

CBMC Results (ML-DSA-87)

Full Results (174 proofs)
Proof Current Previous Change
**TOTAL** 2408s 2207s +9.1%
mld_attempt_signature_generation 247s 231s +7%
sign_verify_internal 225s 213s +6%
polyvecl_pointwise_acc_montgomery_c 188s 162s +16%
poly_pointwise_montgomery_c 158s 130s +22%
polyvec_matrix_expand 144s 128s +12%
rej_uniform_native 134s 120s +12%
polyvec_matrix_expand_serial 109s 101s +8%
mld_ct_memcmp 90s 78s +15%
mld_invntt_layer 72s 71s +1%
mld_ntt_layer 55s 49s +12%
keccak_squeezeblocks_x4 44s 44s +0%
sign_signature_internal 42s 40s +5%
mld_compute_t0_t1_tr_from_sk_components 26s 25s +4%
polymat_permute_bitrev_to_custom 26s 24s +8%
fqmul 22s 19s +16%
rej_uniform 22s 22s +0%
poly_chknorm_c 21s 17s +24%
rej_uniform_c 21s 18s +17%
poly_uniform_eta_4x 16s 16s +0%
polyt0_unpack 16s 14s +14%
poly_uniform_4x 15s 15s +0%
polyvec_matrix_pointwise_montgomery 15s 14s +7%
keccakf1600x4_permute_native 14s 13s +8%
polyveck_add 14s 14s +0%
polyveck_power2round 14s 15s -7%
mld_ntt_butterfly_block 13s 11s +18%
polyeta_unpack 12s 13s -8%
keccak_absorb_once_x4 11s 12s -8%
mld_sample_s1_s2_serial 11s 6s +83%
sign 11s 8s +38%
keccakf1600_permute 10s 9s +11%
mld_polyvecl_permute_bitrev_to_custom_native 10s 9s +11%
polyveck_caddq 10s 8s +25%
polyveck_chknorm 10s 6s +67%
polyveck_reduce 10s 10s +0%
poly_decompose_c 9s 7s +29%
poly_invntt_tomont_c 9s 8s +12%
polyveck_invntt_tomont 9s 9s +0%
polyveck_pointwise_poly_montgomery 9s 6s +50%
mld_check_pct 8s 7s +14%
poly_make_hint 8s 3s +167%
polyveck_ntt 8s 7s +14%
mld_compute_pack_z 7s 6s +17%
polyvecl_ntt 7s 9s -22%
polyvecl_uniform_gamma1 7s 4s +75%
sign_pk_from_sk 7s 9s -22%
sign_verify_extmu 7s 4s +75%
keccak_absorb 6s 4s +50%
keccakf1600_permute_native 6s 9s -33%
mld_sample_s1_s2 6s 8s -25%
poly_uniform_gamma1_4x 6s 5s +20%
polyveck_decompose 6s 9s -33%
polyveck_shiftl 6s 7s -14%
polyveck_sub 6s 7s -14%
polyveck_use_hint 6s 6s +0%
polyz_unpack_c 6s 6s +0%
rej_eta_c 6s 5s +20%
sign_keypair_internal 6s 5s +20%
keccak_squeeze 5s 2s +150%
keccakf1600_extract_bytes (big endian) 5s 3s +67%
mld_h 5s 6s -17%
pack_pk 5s 2s +150%
poly_add 5s 5s +0%
poly_challenge 5s 4s +25%
poly_ntt_native 5s 3s +67%
poly_uniform 5s 3s +67%
polyveck_make_hint 5s 5s +0%
polyveck_unpack_eta 5s 7s -29%
polyvecl_chknorm 5s 4s +25%
polyvecl_unpack_eta 5s 4s +25%
polyvecl_unpack_z 5s 6s -17%
polyz_unpack_native 5s 2s +150%
reduce32 5s 2s +150%
rej_eta_native 5s 3s +67%
shake128_finalize 5s 3s +67%
sign_keypair 5s 3s +67%
sign_open 5s 4s +25%
sign_signature 5s 4s +25%
sign_signature_pre_hash_internal 5s 4s +25%
sign_verify 5s 5s +0%
sign_verify_pre_hash_internal 5s 3s +67%
unpack_hints 5s 5s +0%
unpack_sig 5s 4s +25%
unpack_sk 5s 5s +0%
keccakf1600x4_permute 4s 3s +33%
keccakf1600x4_xor_bytes 4s 3s +33%
mld_ct_get_optblocker_i64 4s 1s +300%
mld_ct_get_optblocker_u8 4s 2s +100%
ntt_native_x86_64 4s 3s +33%
pack_sig_z 4s 3s +33%
poly_caddq 4s 3s +33%
poly_caddq_native 4s 2s +100%
poly_decompose_native 4s 4s +0%
poly_invntt_tomont_native 4s 2s +100%
poly_power2round 4s 5s -20%
poly_reduce 4s 3s +33%
poly_shiftl 4s 2s +100%
poly_uniform_gamma1 4s 2s +100%
poly_use_hint_native 4s 3s +33%
polyt0_pack 4s 4s +0%
polyveck_pack_eta 4s 2s +100%
polyz_pack 4s 3s +33%
shake128_absorb 4s 2s +100%
shake128_squeeze 4s 2s +100%
shake256x4_squeezeblocks 4s 2s +100%
sign_signature_pre_hash_shake256 4s 7s -43%
sign_verify_pre_hash_shake256 4s 4s +0%
caddq 3s 4s -25%
keccak_finalize 3s 5s -40%
keccakf1600_xor_bytes 3s 2s +50%
keccakf1600_xor_bytes (big endian) 3s 3s +0%
make_hint 3s 3s +0%
mld_ct_cmask_nonzero_u8 3s 4s -25%
mld_prepare_domain_separation_prefix 3s 5s -40%
montgomery_reduce 3s 4s -25%
pack_sig_c_h 3s 2s +50%
poly_caddq_c 3s 5s -40%
poly_caddq_native_aarch64 3s 4s -25%
poly_chknorm 3s 5s -40%
poly_invntt_tomont 3s 5s -40%
poly_ntt 3s 2s +50%
poly_pointwise_montgomery 3s 2s +50%
poly_sub 3s 4s -25%
poly_uniform_eta 3s 4s -25%
poly_use_hint 3s 2s +50%
polyt1_pack 3s 2s +50%
polyt1_unpack 3s 5s -40%
polyveck_pack_t0 3s 4s -25%
polyveck_pack_w1 3s 1s +200%
polyvecl_pack_eta 3s 2s +50%
polyvecl_permute_bitrev_to_custom 3s 4s -25%
polyvecl_pointwise_acc_montgomery 3s 2s +50%
polyvecl_pointwise_acc_montgomery_native 3s 4s -25%
polyvecl_uniform_gamma1_serial 3s 3s +0%
polyw1_pack 3s 3s +0%
polyz_unpack 3s 4s -25%
power2round 3s 3s +0%
rej_eta 3s 2s +50%
shake128x4_absorb_once 3s 1s +200%
shake256_absorb 3s 1s +200%
shake256_finalize 3s 2s +50%
shake256_init 3s 2s +50%
shake256x4_absorb_once 3s 2s +50%
sign_signature_extmu 3s 5s -40%
sys_check_capability 3s 3s +0%
decompose 2s 4s -50%
fqscale 2s 3s -33%
keccak_init 2s 2s +0%
keccakf1600x4_extract_bytes 2s 5s -60%
mld_ct_get_optblocker_u32 2s 2s +0%
mld_ct_sel_int32 2s 1s +100%
mld_value_barrier_i64 2s 3s -33%
mld_value_barrier_u8 2s 2s +0%
pack_sk 2s 2s +0%
poly_chknorm_native 2s 2s +0%
poly_decompose 2s 2s +0%
poly_ntt_c 2s 2s +0%
poly_pointwise_montgomery_native 2s 2s +0%
poly_use_hint_c 2s 3s -33%
polyeta_pack 2s 4s -50%
polyveck_unpack_t0 2s 4s -50%
shake128_init 2s 5s -60%
shake128_release 2s 2s +0%
shake128x4_squeezeblocks 2s 4s -50%
shake256 2s 4s -50%
shake256_release 2s 5s -60%
shake256_squeeze 2s 2s +0%
unpack_pk 2s 3s -33%
use_hint 2s 5s -60%
mld_ct_abs_i32 1s 2s -50%
mld_ct_cmask_neg_i32 1s 2s -50%
mld_ct_cmask_nonzero_u32 1s 2s -50%
mld_keccakf1600_extract_bytes 1s 2s -50%
mld_value_barrier_u32 1s 3s -67%


oqs-bot commented Feb 7, 2026

CBMC Results (ML-DSA-65)

Full Results (174 proofs)
Proof Current Previous Change
**TOTAL** 2439s 2319s +5.2%
sign_verify_internal 372s 357s +4%
polyvecl_pointwise_acc_montgomery_c 250s 229s +9%
mld_attempt_signature_generation 212s 202s +5%
poly_pointwise_montgomery_c 136s 136s +0%
rej_uniform_native 135s 130s +4%
polyvec_matrix_expand 95s 93s +2%
mld_ct_memcmp 83s 79s +5%
mld_invntt_layer 74s 72s +3%
polyvec_matrix_expand_serial 66s 63s +5%
mld_ntt_layer 55s 55s +0%
sign_signature_internal 46s 44s +5%
keccak_squeezeblocks_x4 45s 44s +2%
mld_compute_t0_t1_tr_from_sk_components 23s 24s -4%
rej_uniform_c 21s 17s +24%
rej_uniform 20s 20s +0%
fqmul 19s 19s +0%
poly_uniform_eta_4x 19s 17s +12%
polymat_permute_bitrev_to_custom 19s 20s -5%
poly_chknorm_c 18s 18s +0%
polyveck_decompose 18s 16s +12%
polyt0_unpack 16s 15s +7%
keccakf1600x4_permute_native 15s 13s +15%
poly_uniform_4x 15s 18s -17%
polyvec_matrix_pointwise_montgomery 15s 14s +7%
keccak_absorb_once_x4 13s 12s +8%
sign 13s 9s +44%
mld_ntt_butterfly_block 12s 12s +0%
mld_polyvecl_permute_bitrev_to_custom_native 11s 10s +10%
polyveck_add 11s 10s +10%
poly_invntt_tomont_c 10s 9s +11%
keccakf1600_permute_native 9s 7s +29%
mld_compute_pack_z 9s 6s +50%
polyveck_invntt_tomont 9s 9s +0%
polyveck_ntt 9s 9s +0%
keccak_absorb 8s 5s +60%
keccakf1600_permute 8s 7s +14%
mld_check_pct 8s 5s +60%
polyveck_chknorm 8s 6s +33%
polyveck_power2round 8s 10s -20%
sign_keypair_internal 8s 5s +60%
mld_sample_s1_s2_serial 7s 5s +40%
poly_decompose_c 7s 8s -12%
polyeta_unpack 7s 5s +40%
polyveck_caddq 7s 6s +17%
polyveck_shiftl 7s 12s -42%
polyveck_sub 7s 9s -22%
polyveck_use_hint 7s 7s +0%
sign_pk_from_sk 7s 8s -12%
poly_uniform 6s 4s +50%
polyt0_pack 6s 6s +0%
polyveck_make_hint 6s 3s +100%
polyveck_reduce 6s 6s +0%
polyveck_unpack_t0 6s 3s +100%
polyvecl_ntt 6s 8s -25%
polyz_unpack_c 6s 5s +20%
sign_verify 6s 3s +100%
mld_h 5s 2s +150%
mld_sample_s1_s2 5s 5s +0%
poly_invntt_tomont 5s 2s +150%
poly_make_hint 5s 4s +25%
poly_ntt_native 5s 2s +150%
poly_power2round 5s 6s -17%
poly_uniform_eta 5s 4s +25%
polyeta_pack 5s 3s +67%
polyveck_pointwise_poly_montgomery 5s 4s +25%
polyvecl_chknorm 5s 6s -17%
polyvecl_pack_eta 5s 4s +25%
polyvecl_pointwise_acc_montgomery_native 5s 3s +67%
polyz_unpack 5s 3s +67%
polyz_unpack_native 5s 2s +150%
power2round 5s 4s +25%
shake256 5s 3s +67%
sign_keypair 5s 2s +150%
sign_signature 5s 5s +0%
sign_verify_extmu 5s 3s +67%
sign_verify_pre_hash_shake256 5s 2s +150%
unpack_hints 5s 6s -17%
keccak_squeeze 4s 3s +33%
mld_ct_abs_i32 4s 2s +100%
mld_ct_cmask_nonzero_u32 4s 3s +33%
mld_ct_get_optblocker_u32 4s 1s +300%
mld_prepare_domain_separation_prefix 4s 3s +33%
mld_value_barrier_i64 4s 1s +300%
mld_value_barrier_u8 4s 3s +33%
ntt_native_x86_64 4s 3s +33%
pack_sig_c_h 4s 1s +300%
pack_sig_z 4s 4s +0%
pack_sk 4s 2s +100%
poly_add 4s 4s +0%
poly_caddq_native 4s 4s +0%
poly_challenge 4s 3s +33%
poly_decompose_native 4s 4s +0%
poly_uniform_gamma1_4x 4s 4s +0%
poly_use_hint 4s 2s +100%
poly_use_hint_c 4s 4s +0%
poly_use_hint_native 4s 3s +33%
polyt1_unpack 4s 3s +33%
polyvecl_pointwise_acc_montgomery 4s 3s +33%
polyvecl_uniform_gamma1_serial 4s 4s +0%
rej_eta_native 4s 4s +0%
shake128x4_squeezeblocks 4s 2s +100%
shake256_init 4s 3s +33%
shake256_squeeze 4s 2s +100%
shake256x4_absorb_once 4s 3s +33%
sign_open 4s 7s -43%
sign_signature_extmu 4s 5s -20%
sign_signature_pre_hash_internal 4s 7s -43%
unpack_sk 4s 3s +33%
use_hint 4s 3s +33%
caddq 3s 3s +0%
decompose 3s 4s -25%
fqscale 3s 3s +0%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_xor_bytes (big endian) 3s 3s +0%
keccakf1600x4_extract_bytes 3s 2s +50%
keccakf1600x4_permute 3s 4s -25%
make_hint 3s 4s -25%
mld_ct_cmask_neg_i32 3s 1s +200%
mld_ct_get_optblocker_i64 3s 3s +0%
mld_ct_get_optblocker_u8 3s 3s +0%
mld_value_barrier_u32 3s 4s -25%
poly_caddq_c 3s 4s -25%
poly_caddq_native_aarch64 3s 2s +50%
poly_decompose 3s 4s -25%
poly_invntt_tomont_native 3s 3s +0%
poly_ntt 3s 2s +50%
poly_pointwise_montgomery_native 3s 3s +0%
poly_reduce 3s 2s +50%
poly_sub 3s 5s -40%
poly_uniform_gamma1 3s 3s +0%
polyt1_pack 3s 3s +0%
polyveck_pack_eta 3s 3s +0%
polyveck_pack_t0 3s 2s +50%
polyveck_pack_w1 3s 4s -25%
polyveck_unpack_eta 3s 4s -25%
polyvecl_uniform_gamma1 3s 5s -40%
polyw1_pack 3s 3s +0%
polyz_pack 3s 1s +200%
rej_eta 3s 4s -25%
rej_eta_c 3s 4s -25%
shake128_absorb 3s 1s +200%
shake128_squeeze 3s 2s +50%
shake256_finalize 3s 4s -25%
shake256x4_squeezeblocks 3s 3s +0%
sign_signature_pre_hash_shake256 3s 4s -25%
sign_verify_pre_hash_internal 3s 2s +50%
sys_check_capability 3s 4s -25%
unpack_sig 3s 3s +0%
keccak_init 2s 1s +100%
keccakf1600_xor_bytes 2s 3s -33%
montgomery_reduce 2s 3s -33%
pack_pk 2s 6s -67%
poly_chknorm_native 2s 5s -60%
poly_ntt_c 2s 5s -60%
poly_pointwise_montgomery 2s 5s -60%
poly_shiftl 2s 4s -50%
polyvecl_permute_bitrev_to_custom 2s 2s +0%
polyvecl_unpack_eta 2s 4s -50%
polyvecl_unpack_z 2s 3s -33%
reduce32 2s 3s -33%
shake128_finalize 2s 1s +100%
shake128_init 2s 2s +0%
shake128_release 2s 3s -33%
shake128x4_absorb_once 2s 3s -33%
shake256_release 2s 2s +0%
unpack_pk 2s 4s -50%
keccak_finalize 1s 3s -67%
keccakf1600x4_xor_bytes 1s 1s +0%
mld_ct_cmask_nonzero_u8 1s 2s -50%
mld_ct_sel_int32 1s 2s -50%
mld_keccakf1600_extract_bytes 1s 3s -67%
poly_caddq 1s 3s -67%
poly_chknorm 1s 3s -67%
shake256_absorb 1s 1s +0%

@mkannwischer mkannwischer marked this pull request as ready for review February 7, 2026 03:31
@mkannwischer mkannwischer requested a review from a team as a code owner February 7, 2026 03:31

@hanno-becker hanno-becker left a comment

Thank you, very clean to integrate it into simpasm

@hanno-becker hanno-becker merged commit d06f9ad into main Feb 7, 2026
371 checks passed
@hanno-becker hanno-becker deleted the asm-size branch February 7, 2026 03:49