Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull new HQC implementation from upstream #1585

Merged
merged 14 commits into from
Nov 6, 2023
Merged

Pull new HQC implementation from upstream #1585

merged 14 commits into from
Nov 6, 2023

Conversation

SWilson4
Copy link
Member

@SWilson4 SWilson4 commented Oct 19, 2023

Run copy_from_upstream to pull the recently merged HQC updates from PQClean. Ignore Kyber and Dilithium updates as we don't yet want to move to the "standard" version.

This is unfortunately a bit of a "kitchen sink" PR: it contains

  1. a new version of HQC from Update HQC to 2023-04-30 submission PQClean/PQClean#512,
  2. updates to Falcon and Sphincs+ which landed in PQClean before the HQC code was merged, and
  3. support for the HQC custom PRNG for KAT checking.

Notably, this PR does not contain the latest Kyber and Dilithium updates from PQClean, as this would entail updating liboqs to be in sync with the "standard" track. If/when this is merged, we will be out of sync with PQClean.

When patching HQC, I made a number of minor changes to the upstream code, mostly to address undefined or implementation-defined behaviour and endianness issues. I also wrote a new modular reduction routine to address PQClean/PQClean#482; this can be found in vector.c. It might be easier to review my changes directly against the reference implementation; you can view the patches here.

Fixes #1319.
Fixes #995.

  • Does this PR change the input/output behaviour of a cryptographic algorithm (i.e., does it change known answer test values)? (If so, a version bump will be required from x.y.z to x.(y+1).0.)
  • Does this PR change the list of algorithms available -- either adding, removing, or renaming? Does this PR otherwise change an API? (If so, PRs in fully supported downstream projects dependent on these, i.e., oqs-provider and OQS-OpenSSH will also need to be ready for review and merge by the time this is merged.)

@SWilson4 SWilson4 marked this pull request as ready for review October 24, 2023 16:29
Copy link
Member

@baentsch baentsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting into the kitchen (particularly its sink :) and moving this upgrade through CI @SWilson4 ! I added single comments, one of which I feel strongly about (as usual :) The version of HQC does not seem to have been updated. Am I right in this understanding? If so, can we please fix this before merge, either by patch or upstream update?

@baentsch
Copy link
Member

Oh, one more thing: This PR needs to update the top-level CMakeLists.txtto have "OQS_VERSION_TEXT" read "0.10.0-dev".

@SWilson4
Copy link
Member Author

The version of HQC does not seem to have been updated. Am I right in this understanding? If so, can we please fix this before merge, either by patch or upstream update?

Oh, one more thing: This PR needs to update the top-level CMakeLists.txtto have "OQS_VERSION_TEXT" read "0.10.0-dev".

Both should be fixed now. I also incremented the SOVERSION, as we did when doing a similar merge for McEliece.

@SWilson4
Copy link
Member Author

Pardon the double force-push; I did the rebase before #1595 was merged, so it needed to be done again to avoid a merge conflict with main.

src/common/rand/rand.h Outdated Show resolved Hide resolved
@baentsch
Copy link
Member

baentsch commented Nov 1, 2023

@SWilson4 Please feel free to re-request my review if you feel the changes needed to get CI to pass warrant another glance. Otherwise, Thanks again for this "deep dive"!

@bhess bhess self-requested a review November 1, 2023 16:50
@baentsch
Copy link
Member

baentsch commented Nov 3, 2023

@SWilson4 FYI and FWIW, I was intrigued by the single CI failure and ran some experiments on the CCI MacOS instance via SSH: Result: The shared lib config option -DBUILD_SHARED_LIBS=ON reliably leads to all KEM KATs being incorrect. The following config options are irrelevant: OQS_DIST_BUILD, OQS_USE_OPENSSL; I even compiled using gcc-12 instead of clang-14: To no avail, if liboqs is built shared, the KEM KAT tests fail reliably on that x64 machine. I compared the exact same config on my own x64 machine (not MacOS, but also clang-14 and all CPU extensions reported identically) and all KATs are correct there. I then had the hunch that the shared lib concept of MacOS may play in this here and tested on the M1: And indeed the same config also fails there -- different architecture, different compiler, same shared library concept (MacOS). I still don't understand what's happening here.... Any MacOS experts around? (@dstebila ?)

@SWilson4
Copy link
Member Author

SWilson4 commented Nov 3, 2023

@SWilson4 FYI and FWIW, I was intrigued by the single CI failure and ran some experiments on the CCI MacOS instance via SSH: Result: The shared lib config option -DBUILD_SHARED_LIBS=ON reliably leads to all KEM KATs being incorrect. The following config options are irrelevant: OQS_DIST_BUILD, OQS_USE_OPENSSL; I even compiled using gcc-12 instead of clang-14: To no avail, if liboqs is built shared, the KEM KAT tests fail reliably on that x64 machine. I compared the exact same config on my own x64 machine (not MacOS, but also clang-14 and all CPU extensions reported identically) and all KATs are correct there. I then had the hunch that the shared lib concept of MacOS may play in this here and tested on the M1: And indeed the same config also fails there -- different architecture, different compiler, same shared library concept (MacOS). I still don't understand what's happening here.... Any MacOS experts around? (@dstebila ?)

I think I know what's happening, but I don't know why it would be caused by this particular config... if you look at the failed results, the computed KAT hashes are different at each run. This leads me to believe that either

  1. the system RNG is being used instead of one of the deterministic PRNGs or
  2. the deterministic PRNGs are being used but not seeded properly.

If, for instance, a deterministic PRNG were used with an uninitialized buffer it might explain the apparent randomness. However, the former possibility seems more likely to me. From the point of view of the non-HQC KATs, the only thing that's changed (other than being wrapped in an if/else block) is the build configuration for kat_kem: see https://github.com/open-quantum-safe/liboqs/pull/1585/files#r1381665498.

@SWilson4
Copy link
Member Author

SWilson4 commented Nov 3, 2023

the system RNG is being used instead of one of the deterministic PRNGs

I just stepped through the code in a debugger on the M1 machine and confirmed that this is indeed the case. It's very confusing: the NIST DRBG is indeed set as the randombytes algorithm and used for the first call (from the kat_kem file to generate the seed). On subsequent calls (from the Kyber code) the OpenSSL RNG is used.

@baentsch
Copy link
Member

baentsch commented Nov 3, 2023

On subsequent calls (from the Kyber code) the OpenSSL RNG is used.

If the code is built without OpenSSL the same issue appears. What RNG is used then? Why is this dependent on the linkage mode of the library??

add_executable(kat_kem kat_kem.c)
target_link_libraries(kat_kem PRIVATE ${API_TEST_DEPS})
# KAT KEM needs to call the internal SHA3 functions directly, hence the extra dependencies
add_executable(kat_kem kat_kem.c ${COMMON_OBJS})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm -- this line makes me wonder: If one adds all COMMON_OBJS and links against liboqs (line below, API_TEST_DEPS) which symbols "win"? The ones from the objects list or the ones from the library?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm -- this line makes me wonder: If one adds all COMMON_OBJS and links against liboqs (line below, API_TEST_DEPS) which symbols "win"? The ones from the objects list or the ones from the library?

Maybe the "winner" is picked differently on Mac vs Linux...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this theory: oqs_randombytes_algorithm is a static function pointer in an object file; Linux keeps using that throughout the code after it has been initialized by the call in kat_kem.c. MacOS in turn "overloads" it with a new one when loading the library for all further invocations of library functions. I don't quite understand why the lib doesn't get loaded right at the moment OQS_randombytes_switch_algorithm gets called, though.... So, just a wild theory, still no idea how to fix.

@SWilson4
Copy link
Member Author

SWilson4 commented Nov 3, 2023

If the code is built without OpenSSL the same issue appears. What RNG is used then?

In that case the system RNG (arc4random for Apple) is used.

* fix kat_kem linkage

* remove armhf CI support

* Revert "remove armhf CI support"

This reverts commit af759bb.
@SWilson4
Copy link
Member Author

SWilson4 commented Nov 6, 2023

Thanks very much to @baentsch for getting CI to pass. I'm going to go ahead and merge into main, since there were no HQC-related changes after the approving reviews, and the changes since then are largely authored by one of the approving reviewers.

As a final check, I ran the constant time tests locally just now and HQC passed with flying colours. Hopefully this will also be confirmed in the weekly run.

@SWilson4 SWilson4 merged commit aeac3a4 into main Nov 6, 2023
50 checks passed
@dstebila dstebila deleted the sw-hqc-pull branch November 28, 2023 15:13
pi-314159 added a commit to open-quantum-safe/boringssl that referenced this pull request Dec 2, 2023
SWilson4 added a commit that referenced this pull request Dec 15, 2023
* Update Sphincs+ PQClean patch

* Don't apply PQClean Dilithium and Kyber patches

* Run copy_from_upstream; don't apply Dilithium and Kyber changes

* Run HQC KATs with custom PRNG

* Satisfy astyle

* Add licence for common code

* Fix CI build errors

* Update HQC version, OQS version, and SOVERSION

* Move HQC PRNG into test file

* Satisfy astyle

* Fix SHA3 link error

* Reset HQC issues/passes

* fixup! Fix SHA3 link error

* fix kat_kem linkage to make HQC PR pass CI (#1601)

* fix kat_kem linkage

* remove armhf CI support

* Revert "remove armhf CI support"

This reverts commit af759bb.

---------

Co-authored-by: Michael Baentsch <57787676+baentsch@users.noreply.github.com>
SWilson4 added a commit that referenced this pull request Dec 22, 2023
* Update Sphincs+ PQClean patch

* Don't apply PQClean Dilithium and Kyber patches

* Run copy_from_upstream; don't apply Dilithium and Kyber changes

* Run HQC KATs with custom PRNG

* Satisfy astyle

* Add licence for common code

* Fix CI build errors

* Update HQC version, OQS version, and SOVERSION

* Move HQC PRNG into test file

* Satisfy astyle

* Fix SHA3 link error

* Reset HQC issues/passes

* fixup! Fix SHA3 link error

* fix kat_kem linkage to make HQC PR pass CI (#1601)

* fix kat_kem linkage

* remove armhf CI support

* Revert "remove armhf CI support"

This reverts commit af759bb.

---------

Co-authored-by: Michael Baentsch <57787676+baentsch@users.noreply.github.com>
SWilson4 added a commit that referenced this pull request Jan 19, 2024
* Update Sphincs+ PQClean patch

* Don't apply PQClean Dilithium and Kyber patches

* Run copy_from_upstream; don't apply Dilithium and Kyber changes

* Run HQC KATs with custom PRNG

* Satisfy astyle

* Add licence for common code

* Fix CI build errors

* Update HQC version, OQS version, and SOVERSION

* Move HQC PRNG into test file

* Satisfy astyle

* Fix SHA3 link error

* Reset HQC issues/passes

* fixup! Fix SHA3 link error

* fix kat_kem linkage to make HQC PR pass CI (#1601)

* fix kat_kem linkage

* remove armhf CI support

* Revert "remove armhf CI support"

This reverts commit af759bb.

---------

Co-authored-by: Michael Baentsch <57787676+baentsch@users.noreply.github.com>
SWilson4 added a commit that referenced this pull request Jan 22, 2024
* Update Sphincs+ PQClean patch

* Don't apply PQClean Dilithium and Kyber patches

* Run copy_from_upstream; don't apply Dilithium and Kyber changes

* Run HQC KATs with custom PRNG

* Satisfy astyle

* Add licence for common code

* Fix CI build errors

* Update HQC version, OQS version, and SOVERSION

* Move HQC PRNG into test file

* Satisfy astyle

* Fix SHA3 link error

* Reset HQC issues/passes

* fixup! Fix SHA3 link error

* fix kat_kem linkage to make HQC PR pass CI (#1601)

* fix kat_kem linkage

* remove armhf CI support

* Revert "remove armhf CI support"

This reverts commit af759bb.

---------

Co-authored-by: Michael Baentsch <57787676+baentsch@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update HQC. Known non-constant time behaviour in HQC
4 participants