Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rsa: use portable intrinsic for 64x64bit multiplication on Win ARM64 #20244

Closed
wants to merge 1 commit into from

Conversation

tomato42
Copy link
Contributor

@tomato42 tomato42 commented Feb 8, 2023

_umul128() is x86_64 (x64) only, while __umulh() works both on x64 and ARM64

fixes #20234

Checklist

@tomato42
Copy link
Contributor Author

tomato42 commented Feb 8, 2023

@slproweb @jbekkema could you check if this fixes #20234 for you?

@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Feb 8, 2023
@slproweb
Copy link

slproweb commented Feb 8, 2023

"Fixes" is a relative term in the case of software. A fix for one problem can unexpectedly break something else.

Does the code once again compile and produce binaries for the VC-WIN64-ARM target? Yes.

Do the binaries load and run on Windows ARM64 hardware? Yes. Tested on Win10 ARM64. openssl.exe loads and runs. I also ran a few select items from the test suite: test\bntest.exe, test\rsa_test.exe, test\rsa_mp_test.exe, test\rsa_sp800_56b_test.exe. All output reported "ok." I don't know whether any of those test executables actually tested the inlined function with the new intrinsic, but the various executables successfully ran to completion.

Is it still functionally correct across all platforms, especially supported platforms? I don't know. Win + ARM64 is largely unsupported. Looking at what Microsoft says about __umulh() to multiply two 128-bit integers vs. the existing code, the change seems logically correct to me for handling multiply overflow. I don't know what the policy is of putting #include statements in the middle of code instead of somewhere nearer to the top of the file. I also don't know if the code should, on Intel x64 use the original _umul128() for any specific performance gains associated with that intrinsic (e.g. one multiply operation) with a fallback to __umulh() for Win + ARM64 (i.e. two multiply operations) and the fallback to the C implementation for everything else.

@t8m t8m added branch: master Merge to master branch triaged: bug The issue/pr is/fixes a bug branch: 3.0 Merge to openssl-3.0 branch branch: 3.1 Merge to openssl-3.1 labels Feb 8, 2023
@tomato42
Copy link
Contributor Author

tomato42 commented Feb 8, 2023

@slproweb Thanks, I was asking about the compilation error specifically.

I'm the original author of the buggy code so I'm aware of the additional requirements of it. But good point about checking if the code is optimal; it's indeed not if the __umulh() is used on x64, msvc generates then two mul instructions (or imul and mul, more precisly) instead of one; _umul128 is necessary for optimal machine code on x64. Will fix it.

_umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
doesn't generate optimal code on x64
@tomato42 tomato42 changed the title rsa: use the portable intrinsic for 64x64bit multiplication rsa: use portable intrinsic for 64x64bit multiplication on Win ARM64 Feb 8, 2023
@t8m t8m added approval: review pending This pull request needs review by a committer approval: otc review pending This pull request needs review by an OTC member labels Feb 8, 2023
@t8m t8m added branch: 1.1.1 Merge to OpenSSL_1_1_1-stable branch and removed approval: otc review pending This pull request needs review by an OTC member labels Feb 8, 2023
Copy link
Member

@beldmit beldmit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@beldmit beldmit added approval: done This pull request has the required number of approvals and removed approval: review pending This pull request needs review by a committer labels Feb 8, 2023
@tomato42
Copy link
Contributor Author

tomato42 commented Feb 8, 2023

The buildbot/master:windows-win10-x86_64 failure looks like a problem with the machine, not the PR

@paulidale
Copy link
Contributor

My preference is for this version of the fix to go in.

@slproweb
Copy link

slproweb commented Feb 9, 2023

Nitpick: The most recent change that was introduced here makes the assumption that Microsoft will never experiment with other 64-bit architectures other than Intel x64 or ARM64.

https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#machine-types

Only lists a fraction of the hardware architectures Microsoft and Microsoft partners have experimented with over the years. Granted, only a select few ever made it as far as full VS toolchain support with real hardware running the Windows OS.

Everything non-Microsoft has the fallback position of the native C implementation. With the native C implementation, at least it'll compile but not run well. Throwing a warning about non-optimal code on all platforms that end up using the native C implementation seems better to me than just failing to compile altogether. But this has already been signed off on.

/EndNitpick

@tomato42
Copy link
Contributor Author

tomato42 commented Feb 9, 2023

That was intentional. Once you have such a system you will be at the best position to decide which option you want to pursue: use one of the existing intrinsics, use a new intrinsic, or fallback to pure C option.

The other platforms don't have the fallback not for new unexpected platforms, but rather for non-gcc, non-clang compilers that don't support 128 bit variables natively. So it will be used when it's both an uncommon platform and an uncommon compiler.

@slproweb
Copy link

slproweb commented Feb 9, 2023

Obviously a difference of opinion here but, IMO, code should aim to always compile successfully with the sole exception of security considerations, which I don't think applies here.

OpenSSL also has a no-asm option in Configure. Under a no-asm build, the code here still uses intrinsics. That certainly makes it more difficult to validate that the native C implementation is functionally correct on all platforms (especially MSVC), which could be problematic for things like FIPS validation.

Side note: There doesn't appear to be a CI test for no-asm.

The point is largely moot since this has already been accepted but not merged yet.

@openssl-machine
Copy link
Collaborator

24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually.

@toonetown
Copy link

Has there been any further discussion on whether this or #20235 is the preferable approach? It seems to me from reading through them that this one probably is - but both PRs have been open and approved for a few days now.

richardlau added a commit to RafaelGSS/node that referenced this pull request Feb 10, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
richardlau added a commit to RafaelGSS/node that referenced this pull request Feb 10, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
richardlau added a commit to RafaelGSS/node that referenced this pull request Feb 10, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
richardlau added a commit to RafaelGSS/node that referenced this pull request Feb 10, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
richardlau added a commit to RafaelGSS/node that referenced this pull request Feb 10, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
@t8m t8m added approval: ready to merge The 24 hour grace period has passed, ready to merge and removed approval: done This pull request has the required number of approvals labels Feb 11, 2023
@t8m
Copy link
Member

t8m commented Feb 11, 2023

Merged to all branches. Thank you for your contribution.

@t8m t8m closed this Feb 11, 2023
openssl-machine pushed a commit that referenced this pull request Feb 11, 2023
_umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
doesn't generate optimal code on x64

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #20244)

(cherry picked from commit 075652f)
openssl-machine pushed a commit that referenced this pull request Feb 11, 2023
_umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
doesn't generate optimal code on x64

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #20244)

(cherry picked from commit 075652f)
openssl-machine pushed a commit that referenced this pull request Feb 11, 2023
_umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
doesn't generate optimal code on x64

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #20244)
openssl-machine pushed a commit that referenced this pull request Feb 11, 2023
_umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
doesn't generate optimal code on x64

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #20244)

(cherry picked from commit 075652f)
@t8m
Copy link
Member

t8m commented Feb 11, 2023

Obviously a difference of opinion here but, IMO, code should aim to always compile successfully with the sole exception of security considerations, which I don't think applies here.

OpenSSL also has a no-asm option in Configure. Under a no-asm build, the code here still uses intrinsics. That certainly makes it more difficult to validate that the native C implementation is functionally correct on all platforms (especially MSVC), which could be problematic for things like FIPS validation.

Side note: There doesn't appear to be a CI test for no-asm.

The point is largely moot since this has already been accepted but not merged yet.

@slproweb Any improvements are welcome. Please submit a PR if you have any changes you'd like to be merged.

@tomato42 tomato42 deleted the 128bit-mul-on-msc-arm64 branch February 11, 2023 14:03
kiyolee pushed a commit to kiyolee/openssl1_1-win-build that referenced this pull request Feb 11, 2023
_umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
doesn't generate optimal code on x64

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from openssl/openssl#20244)
kiyolee pushed a commit to kiyolee/openssl3-win-build that referenced this pull request Feb 11, 2023
_umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
doesn't generate optimal code on x64

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from openssl/openssl#20244)
RafaelGSS pushed a commit to RafaelGSS/node that referenced this pull request Feb 11, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
RafaelGSS pushed a commit to RafaelGSS/node that referenced this pull request Feb 11, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
RafaelGSS pushed a commit to RafaelGSS/node that referenced this pull request Feb 12, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
richardlau added a commit to nodejs/node that referenced this pull request Feb 13, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

PR-URL: #46566
Refs: openssl/openssl#20244
Refs: https://mta.openssl.org/pipermail/openssl-announce/2023-February/000251.html
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bethanyngriggs@gmail.com>
richardlau added a commit to nodejs/node that referenced this pull request Feb 13, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

PR-URL: #46568
Refs: openssl/openssl#20244
Refs: https://mta.openssl.org/pipermail/openssl-announce/2023-February/000251.html
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bethanyngriggs@gmail.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
RafaelGSS pushed a commit to RafaelGSS/node that referenced this pull request Feb 13, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
RafaelGSS pushed a commit to RafaelGSS/node that referenced this pull request Feb 13, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

Refs: openssl/openssl#20244
oraluben pushed a commit to oraluben/node that referenced this pull request Mar 14, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

PR-URL: nodejs/node#46568
Refs: openssl/openssl#20244
Refs: https://mta.openssl.org/pipermail/openssl-announce/2023-February/000251.html
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bethanyngriggs@gmail.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
oraluben pushed a commit to oraluben/node that referenced this pull request Mar 17, 2023
Original commit message:

    rsa: add msvc intrinsic for non x64 platforms

    _umul128() is x86_64 (x64) only, while __umulh() works everywhere, but
    doesn't generate optimal code on x64

PR-URL: nodejs/node#46568
Refs: openssl/openssl#20244
Refs: https://mta.openssl.org/pipermail/openssl-announce/2023-February/000251.html
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bethanyngriggs@gmail.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval: ready to merge The 24 hour grace period has passed, ready to merge branch: master Merge to master branch branch: 1.1.1 Merge to OpenSSL_1_1_1-stable branch branch: 3.0 Merge to openssl-3.0 branch branch: 3.1 Merge to openssl-3.1 severity: fips change The pull request changes FIPS provider sources triaged: bug The issue/pr is/fixes a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VC-WIN64-ARM build fails in VS 2019 (OpenSSL 3.0.8)
7 participants