`lto = "fat"` causes doctest to generate invalid code for Apple M1 (and potentially x86) #116941
Have you run the tests with Miri to make sure that no UB is happening?
Hello @Nilstrieb, not yet. I guess it's the occasion to use it; I'll report back.
Should I try it on the example reproducing the doctest, or is there a way to run it on doctests?
I forgot whether Miri runs doctests. But you can just extract the doctest into a binary and run that.
Hello @Nilstrieb, no undefined behavior found by Miri. It seems rayon does not terminate its threads, so Miri detects that, but the doctest extracted as an example is UB-free (I adapted our crypto parameters to be very small but still trigger the crashing doctest). Cheers
Adding I-unsound, as this looks like a miscompilation.
Agreed. For now it's a bit hard, as we used to hit this somewhat randomly until we identified that it seemed linked to doctests + LTO. We will try to find a way to minify it; in the meantime, the assembly in the original report should already contain the miscompiled code (though I understand it's likely way too big for a reasonable analysis). Cheers
FWIW, a pattern I've seen a few times with doctests + fat LTO is that doctests are unoptimized, so you get optimized IR from LTO fed into the backend with optimizations disabled. This can expose bugs in FastISel that we don't otherwise hit. The key to reproducing it outside doctests may be to use an lto=fat opt-level=0 configuration.
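The suggested configuration can be expressed as a Cargo profile; a sketch, assuming the standard `Cargo.toml` profile syntax (the values come from the comment above):

```toml
# Fat LTO combined with a fully unoptimized backend, to mimic
# optimized LTO IR being fed into codegen with optimizations disabled.
[profile.release]
lto = "fat"
opt-level = 0
```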
Thanks, will give it a shot. I'm surprised doctests don't honor the optimization/configuration from the Cargo profile though 🤔 any reason for this? If the various parts are not supposed to have mismatched opt levels, then yes, I can see how some hypotheses would break at the junction of the optimized and unoptimized code.
For the doctest built as an example binary with the lto=fat opt-level=0 configuration, I cannot get it to crash.
Setting opt-level=0 in release allows the doctest to pass as well, so mixing LTO with a non-zero opt level seems to trigger the issue for the doctest. Trying to minimize the repro case.
Setting opt-level=1 in release with lto=fat crashes the doctest.
Hello again, some news: I have a state where I'm able to trigger the crash at will with a single line being uncommented. The repro branch is here: https://github.com/zama-ai/tfhe-rs/tree/am/doctest_bug Miri is still OK on the doctest taken as an example. The interesting part is in here; the line below nok_function_call.S has both the call to the function and the inlined version:

```rust
use tfhe::core_crypto::prelude::*;
use tfhe::shortint::engine::ShortintEngine;
use tfhe::shortint::parameters::PARAM_MESSAGE_2_CARRY_2_KS_PBS;
use tfhe::shortint::server_key::{MaxDegree, ShortintCompressedBootstrappingKey};
use tfhe::shortint::{
    ClientKey as ShortintClientKey, CompressedServerKey as ShortintCompressedServerKey,
};

fn main() {
    {
        let cks = ShortintClientKey::new(PARAM_MESSAGE_2_CARRY_2_KS_PBS);
        // let compressed_sks = ShortintCompressedServerKey::new(&cks);
        let mut engine = ShortintEngine::new();
        // let compressed_sks = engine.new_compressed_server_key(&cks).unwrap();

        // Plaintext Max Value
        let max_value = cks.parameters.message_modulus().0 * cks.parameters.carry_modulus().0 - 1;

        // The maximum number of operations before we need to clean the carry buffer
        let max_degree = MaxDegree(max_value);

        // UNCOMMENT TO PRODUCE THE MISCOMPILE
        let compressed_sks = engine.new_compressed_server_key_with_max_degree(&cks, max_degree);

        // THIS BELOW IS THE SAME AS THE ABOVE FUNCTION INLINED
        let compressed_sks = {
            let bootstrapping_key = match cks.parameters.pbs_parameters().unwrap() {
                tfhe::shortint::PBSParameters::PBS(pbs_params) => {
                    let bootstrapping_key = allocate_and_generate_new_seeded_lwe_bootstrap_key(
                        &cks.small_lwe_secret_key,
                        &cks.glwe_secret_key,
                        pbs_params.pbs_base_log,
                        pbs_params.pbs_level,
                        pbs_params.glwe_modular_std_dev,
                        pbs_params.ciphertext_modulus,
                        &mut engine.seeder,
                    );
                    ShortintCompressedBootstrappingKey::Classic(bootstrapping_key)
                }
                tfhe::shortint::PBSParameters::MultiBitPBS(pbs_params) => {
                    let bootstrapping_key =
                        par_allocate_and_generate_new_seeded_lwe_multi_bit_bootstrap_key(
                            &cks.small_lwe_secret_key,
                            &cks.glwe_secret_key,
                            pbs_params.pbs_base_log,
                            pbs_params.pbs_level,
                            pbs_params.glwe_modular_std_dev,
                            pbs_params.grouping_factor,
                            pbs_params.ciphertext_modulus,
                            &mut engine.seeder,
                        );
                    ShortintCompressedBootstrappingKey::MultiBit {
                        seeded_bsk: bootstrapping_key,
                        deterministic_execution: pbs_params.deterministic_execution,
                    }
                }
            };

            // Creation of the key switching key
            let key_switching_key = allocate_and_generate_new_seeded_lwe_keyswitch_key(
                &cks.large_lwe_secret_key,
                &cks.small_lwe_secret_key,
                cks.parameters.ks_base_log(),
                cks.parameters.ks_level(),
                cks.parameters.lwe_modular_std_dev(),
                cks.parameters.ciphertext_modulus(),
                &mut engine.seeder,
            );

            // Pack the keys in the server key set:
            ShortintCompressedServerKey {
                key_switching_key,
                bootstrapping_key,
                message_modulus: cks.parameters.message_modulus(),
                carry_modulus: cks.parameters.carry_modulus(),
                max_degree,
                ciphertext_modulus: cks.parameters.ciphertext_modulus(),
                pbs_order: cks.parameters.encryption_key_choice().into(),
            }
        };
    }
    println!("MIRI run done");
}
```
Still feels like the wrong part of the stack gets read in the function, where a value should be 0 and something else is read from who knows where. Edit: will try to minify some more.
With some logging, from the inlined non-bugged version:

```
allocate_and_generate_new_seeded_lwe_bootstrap_key=CiphertextModulus(2^64)
generate_seeded_lwe_bootstrap_key=CiphertextModulus(2^64)
encrypt_constant_seeded_ggsw_ciphertext_with_existing_generator=CiphertextModulus(2^64)
encrypt_constant_seeded_ggsw_ciphertext_with_existing_generator=CiphertextModulus(2^64)
encrypt_constant_seeded_ggsw_ciphertext_with_existing_generator=CiphertextModulus(2^64)
encrypt_constant_seeded_ggsw_ciphertext_with_existing_generator=CiphertextModulus(2^64)
MIRI run done
```

Edit: the "MIRI run done" was a check for the Miri example; I share the code between both. Both runs here are normal runs.

From the function call bugged version:

```
new_compressed_server_key_with_max_degree=CiphertextModulus(2^64)
allocate_and_generate_new_seeded_lwe_bootstrap_key=CiphertextModulus(2^64)
generate_seeded_lwe_bootstrap_key=CiphertextModulus(2^64)
encrypt_constant_seeded_ggsw_ciphertext_with_existing_generator=CiphertextModulus(2^64)
encrypt_constant_seeded_ggsw_ciphertext_with_existing_generator=CiphertextModulus(79395112631681133340136570880)
```

Looks like the first iteration of a loop is fine, but then the data gets corrupted. Here the native modulus 2^64 is encoded as 0 in a u128, so the 2^64 values are valid, but 79395112631681133340136570880 is random data and seems to change from run to run.
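To make the encoding concrete, here is a small sketch; the `Modulus` enum and `decode` function below are hypothetical illustrations, not tfhe-rs API. The only grounded fact is that a raw value of 0 in the u128 denotes the native 2^64 modulus:

```rust
// Hypothetical illustration of the modulus encoding described above:
// 2^64 does not fit in a u64, so the native modulus is stored as 0
// in a u128 field; any other raw value is a custom modulus.
#[derive(Debug, PartialEq)]
enum Modulus {
    Native,       // represents 2^64, encoded as raw 0
    Custom(u128), // any non-zero raw value
}

fn decode(raw: u128) -> Modulus {
    if raw == 0 {
        Modulus::Native
    } else {
        Modulus::Custom(raw)
    }
}

fn main() {
    // The valid log lines correspond to a raw 0:
    assert_eq!(decode(0), Modulus::Native);
    // The corrupted read decodes to a garbage custom modulus, which
    // downstream assertions on the modulus value then reject:
    let garbage = 79395112631681133340136570880u128;
    assert_eq!(decode(garbage), Modulus::Custom(garbage));
    println!("ok");
}
```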
Looks similar in spirit to #112548 with the sensitivity to opt levels, though I'm not familiar with the MIR opt level.
Hello, I minified the example down to an iterator call returning corrupted data when a specific feature (on which the code does not depend) is enabled. Disabling said feature makes the code run properly, and changing the iterator to an immutable iter does not cause the issue (https://github.com/zama-ai/tfhe-rs/blob/73bf8af9ec7eca7b36f016b2bbfeccfd3b1ac7d2/tfhe/src/lib.rs#L103).

The iterator in question is a wrapping lending iterator defined in https://github.com/zama-ai/tfhe-rs/blob/73bf8af9ec7eca7b36f016b2bbfeccfd3b1ac7d2/tfhe/src/core_crypto/commons/traits/contiguous_entity_container.rs#L326; the immutable variant is here: https://github.com/zama-ai/tfhe-rs/blob/73bf8af9ec7eca7b36f016b2bbfeccfd3b1ac7d2/tfhe/src/core_crypto/commons/traits/contiguous_entity_container.rs#L127

Available here: https://github.com/zama-ai/tfhe-rs/tree/am/doctest_bug_minify

Run the doctest and crash it:

Run the doctest and it does not crash:

Tried to take the code out to a different repo; it did not repro. Let me know if there is something more I can do, but I can't seem to minify it any more than that at the moment.
I can reproduce the crash on aarch64-linux after also enabling the seeder_unix feature. I'm not sure how to debug, though -- it doesn't seem like rustdoc has any support for that at all? Passing
I do the following (notice RUSTDOCFLAGS):
then
and copy the only executable found in there. Yes, I agree with the O3 thing; I just don't quite get why rustdoc does not use the configuration from the Cargo profile provided on the command line. Edit: I'm guessing it's not an easy problem and there may be a reason for this.
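For reference, a sketch of the kind of invocation involved: the RUSTDOCFLAGS values are the ones quoted in the original report further down, the doctestbins directory name is arbitrary, and the exact feature flags are omitted here:

```shell
# Nightly-only: persist the compiled doctest binaries instead of
# discarding them, so they can be inspected or re-run afterwards.
RUSTDOCFLAGS="-Z unstable-options --persist-doctests doctestbins" \
cargo +nightly test --profile release --doc -p tfhe
# Then copy the executable(s) found under doctestbins/.
```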
Should there be a rustdoc-specific bug report somewhere? I have to say I posted here mainly because I found an old issue that looked similar.
Thanks. I've used this LLVM patch (https://gist.github.com/nikic/7fd69aef8f3bb8401db508e5ff08324d) to identify
Yes, I think that would be a good idea. I think this is probably a cargo bug, as |
Thanks a lot 🙏 and great that you could find the faulty function!
Alright, then you advise opening an issue on https://github.com/rust-lang/cargo to let them know of the opt level not being forwarded?
Upstream issue: llvm/llvm-project#70207
Yeah. It might be intentional, but it seems suspicious to forward |
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-high |
Update to LLVM 17.0.4 Fixes rust-lang#116668. Fixes rust-lang#116941. Fixes rust-lang#116976. r? `@cuviper`
- following merge of 17.0.4 in rust stable, the bug uncovered by LTO on aarch64 has been fixed (rust-lang/rust#116941), so we remove the hard-coded override
- update nightly toolchain to have fixed LLVM as well
- fix lints linked to latest nightly
We have this code in our https://github.com/zama-ai/tfhe-rs project, on commit f1c21888a762ddf9de017ae52dc120c141ec9c02, from tfhe/docs/how_to/compress.md line 44 and beyond:
I expected to see this happen: running the doctest with the following command should work (note that we modify the release profile to have lto = "fat" enabled):

```shell
RUSTFLAGS="-C target-cpu=native" cargo +nightly-2023-10-17 test --profile release --doc --features=aarch64-unix,boolean,shortint,integer,internal-keycache -p tfhe -- test_user_docs::how_to_compress
```
Instead, this happened: the program crashes. Compiling the same code in a separate example with the same cargo configuration results in an executable that works. Turning LTO off also yields a doctest that compiles properly, indicating LTO is at fault, or is part of the problem when combined with doctests.
It has been happening randomly for doctests on a lot of Rust versions, but we could not identify what the issue was. It looks like enabling LTO creates a miscompile where a value that is provably 0 (as it's never modified by the code) is asserted to be != 0, crashing the program; sometimes different things error out, and it looks like the program is reading at the wrong location on the stack. The value being asserted != 0 is in https://github.com/zama-ai/tfhe-rs/blob/f1c21888a762ddf9de017ae52dc120c141ec9c02/tfhe/src/core_crypto/algorithms/ggsw_encryption.rs#L551
Unfortunately, we are not able to minify this issue at the moment, as it's not happening reliably across doctests.
Meta
`rustc --version --verbose`:

Unfortunately, on nightly (used to recover the doctest binaries via RUSTDOCFLAGS="-Z unstable-options --persist-doctests doctestbins") the crash only shows for the parallel version of an encryption algorithm used with rayon (on current stable we can also get the crash with a serial algorithm, but we don't seem to be able to recover the doctest binary).
doctest_miscompile.zip

The archive contains the `objdump --disassemble` output for the code compiled as an example (running fine) and for the code compiled as a doctest exhibiting the miscompilation. If needed I can provide the binaries, but I would understand if nobody wants to run a binary coming from a bug report.

Here is a snippet of a backtrace with two threads erroring on two different issues (while there is no problem having the same code compiled as an example):
Backtrace
We have also seen some flaky doctests on x86_64 and could not narrow down the issue. We have turned off LTO for all of our doctests for now and will monitor how things evolve. The reason for suspecting an issue on x86 as well is that M1 builds have been running with LTO off for months and have never exhibited the flaky doctests we saw on x86_64; though, given that the compiled code in that case is significantly different (intrinsics usage being one factor), we can't yet be sure a similar issue is happening on x86_64.
Cheers