Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spurious build error "missing preprocessor_successor.h" #242

Closed
kaspar030 opened this issue Feb 5, 2024 · 19 comments · Fixed by #246
Closed

spurious build error "missing preprocessor_successor.h" #242

kaspar030 opened this issue Feb 5, 2024 · 19 comments · Fixed by #246

Comments

@kaspar030
Copy link
Contributor

error: failed to run custom build command for `riot-sys v0.7.9 (https://github.com/RIOT-OS/rust-riot-sys#b4bd4bde)`

Caused by:
  process didn't exit successfully: `/data/riotbuild/riotbase/examples/rust-hello-world/bin/native/target/release/build/riot-sys-8549200ee379e3d6/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-env-changed=BUILDING_RIOT_RS
  cargo:rerun-if-env-changed=RIOT_CC
  cargo:rerun-if-env-changed=RIOT_CFLAGS
  cargo:rerun-if-env-changed=RIOT_COMPILE_COMMANDS_JSON
  cargo:rerun-if-changed=/data/riotbuild/riotbase/examples/rust-hello-world/bin/native/cargo-compile-commands.json
  cargo:rerun-if-env-changed=RIOT_USEMODULE
  cargo:CC=clang
  cargo:CFLAGS=-DDEVELHELP -Werror -Wall -Wextra -pedantic -g3 -Og -U_FORTIFY_SOURCE "-std=gnu11" -m32 -fstack-protector-all -ffunction-sections -fdata-sections -DDEBUG_ASSERT_VERBOSE "-DRIOT_APPLICATION=\"hello-world\"" "-DBOARD_NATIVE=\"native\"" "-DRIOT_BOARD=BOARD_NATIVE" "-DCPU_NATIVE=\"native\"" "-DRIOT_CPU=CPU_NATIVE" "-DMCU_NATIVE=\"native\"" "-DRIOT_MCU=MCU_NATIVE" -fwrapv -Wstrict-overflow -fno-common -ffunction-sections -fdata-sections -Wall -Wextra -Wmissing-include-dirs -fno-delete-null-pointer-checks -fdiagnostics-color -Wstrict-prototypes -Wold-style-definition -gz "-Wformat=2" -Wformat-overflow -Wformat-truncation "-fmacro-prefix-map=/data/riotbuild/riotbase/=" -Wcast-align -include /data/riotbuild/riotbase/examples/rust-hello-world/bin/native/riotbuild/riotbuild.h -Wno-unknown-warning-option -DNATIVE_INCLUDES -I/data/riotbuild/riotbase/boards/native/include -I/data/riotbuild/riotbase/core/lib/include -I/data/riotbuild/riotbase/core/include -I/data/riotbuild/riotbase/drivers/include -I/data/riotbuild/riotbase/cpu/native/include -I/data/riotbuild/riotbase/sys/include -DMODULE_AUTO_INIT -DMODULE_BOARD -DMODULE_BOARD_COMMON_INIT -DMODULE_CORE -DMODULE_CORE_IDLE_THREAD -DMODULE_CORE_INIT -DMODULE_CORE_LIB -DMODULE_CORE_MSG -DMODULE_CORE_PANIC -DMODULE_CORE_THREAD -DMODULE_CPU -DMODULE_LIBC -DMODULE_NATIVE_DRIVERS -DMODULE_PERIPH -DMODULE_PERIPH_COMMON -DMODULE_PERIPH_GPIO -DMODULE_PERIPH_GPIO_LINUX -DMODULE_PERIPH_INIT -DMODULE_PERIPH_INIT_GPIO -DMODULE_PERIPH_INIT_GPIO_LINUX -DMODULE_PERIPH_INIT_LED0 -DMODULE_PERIPH_INIT_LED1 -DMODULE_PERIPH_INIT_LED2 -DMODULE_PERIPH_INIT_LED3 -DMODULE_PERIPH_INIT_LED4 -DMODULE_PERIPH_INIT_LED5 -DMODULE_PERIPH_INIT_LED6 -DMODULE_PERIPH_INIT_LED7 -DMODULE_PERIPH_INIT_LEDS -DMODULE_PERIPH_INIT_PM -DMODULE_PERIPH_INIT_UART -DMODULE_PERIPH_PM -DMODULE_PERIPH_UART -DMODULE_PREPROCESSOR -DMODULE_PREPROCESSOR_SUCCESSOR -DMODULE_STDIO_NATIVE -DMODULE_SYS
  cargo:rerun-if-changed=riot-bindgen.h

  --- stderr
  /data/riotbuild/riotbase/sys/include/auto_init_utils.h:35:10: fatal error: 'preprocessor_successor.h' file not found
  thread 'main' panicked at /data/riotbuild/.cargo/git/checkouts/rust-riot-sys-d12733b89271907c/b4bd4bd/build.rs:224:10:
  Unable to generate bindings: ClangDiagnostic("/data/riotbuild/riotbase/sys/include/auto_init_utils.h:35:10: fatal error: 'preprocessor_successor.h' file not found\n")
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
make: *** [/data/riotbuild/riotbase/makefiles/cargo-targets.inc.mk:53: /data/riotbuild/riotbase/examples/rust-hello-world/bin/native/target/i686-unknown-linux-gnu/release/librust_hello_world.a] Error 101
make[1]: Leaving directory '/home/runner/work/riotdocker/riotdocker/RIOT/examples/rust-hello-world'
make[1]: *** [/home/runner/work/riotdocker/riotdocker/RIOT/makefiles/docker.inc.mk:351: ..in-docker-container] Error 2
failed!
@kaspar030
Copy link
Contributor Author

So this is what I found out so far:

So the file (preprocessor_successor.h) gets created in sys/preprocessor/Makefile.include (or better, the rule for it).
There are two modules, preprocessor and preprocessor_successor. One actually guards the includes. auto_init depends on both.
There's some special treatment of auto_init in dependency_resolution.mk:

https://github.com/RIOT-OS/RIOT/blob/8bf61336a293bda3ccb845131a12cb7617c9f52c/makefiles/dependency_resolution.inc.mk#L40-L46

but, all of auto_init, preprocessor and preprocessor_successor show in native examples/rust-hello-world make info-modules.

hmm

@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

IIRC this is not exclusive to riotdocker, but just only happens when building on GitHub infrastructure (of which riotdocker is one major user as regular CI is on murdock); could be wrong though, failed to find the other build failures so far.

@kaspar030
Copy link
Contributor Author

Agreed. So, what's special about github workers? What comes to mind is that they're comparatively slow, and that they build from a clean slate every time...

@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

I still hope to find a "rational" explanation, but last time I was looking into that I had the suspicion that they might be running on some file system that isn't as strict about operations sequencing as Make would need it.

Right now my next attempt would be to make riot-wrapperssys explicitly wait for this particular file to become present, and to add some debug output around the production of preprocessor_successor.h.

@kaspar030
Copy link
Contributor Author

hm, TBH, github workers being used as CI for half the world, I wouldn't count on the FS being incompatible with make.
I'm trying to repro "at home".

@kaspar030
Copy link
Contributor Author

What I find odd is that a Makefile.include, which is included in every sub-make, creates a rule for a file.

@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

Attempting things at #243 -- it adds up to 20s of sleep (finally failing) before accepting that preprocessor_successor.h is really not there.

chrysn added a commit to chrysn-pull-requests/riotdocker that referenced this issue Feb 5, 2024
@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

I'm trying to repro "at home".

Having a reproduced setup would be awesome. I've been burning through 2x20 minutes of CI time because the Docker image gets rebuilt every time even though the rules have not changed (and I'd need to try out a few things more to see why the upgrade to the debug branch doesn't take).

@kaspar030
Copy link
Contributor Author

Having a reproduced setup would be awesome.

... but I can't catch it locally :(

chrysn added a commit to chrysn-pull-requests/riotdocker that referenced this issue Feb 5, 2024
chrysn added a commit to chrysn-pull-requests/riotdocker that referenced this issue Feb 5, 2024
chrysn added a commit to RIOT-OS/rust-riot-sys that referenced this issue Feb 5, 2024
@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

Waaah, all the time I've been testing and 01750a5e [edit: link fixed; branch is https://github.com/RIOT-OS/rust-riot-sys/tree/debug-preprocessor_successor-race] was not committed yet -- still, the last test wen through. I don't see the old runs, but this must have been the first in ~8 builds that went through. Actually pushed the race resolving sleep loop now, retrying ... if it passes on first try, it's either externally stateful (load of ... something?) or really fixed.

@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

That did pass on first try. Re-running #241 now. If that does pass, I'll assume there is some time-of-day-ish influence and let it be for now until the problem resurfaces. If that does not pass, I'll give this here a re-run (possibly with some extra debug output), and consider it preliminarily successful (in which case the next step is cleaning up the race workaround so that it is less brittle).

chrysn added a commit to RIOT-OS/rust-riot-sys that referenced this issue Feb 5, 2024
@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

The #241 rerun did fail, so let's try with a more verbose #243...

[edit: keeping a tally] So far counted 3 successes on 243 and 2 failures on 241 up to 19:54CET.

[edit] I'm a bit discouraged in the approach because I don't see the "preprocessor_successor.h not found,sleeping suspecting a race condition" warnings, but those may be silenced because it's not a local package.

[edit] Did more tests: Indeed build system warnings from git-included modules are only shown when the whole thing fails.

chrysn added a commit to RIOT-OS/rust-riot-sys that referenced this issue Feb 5, 2024
@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

So far, this looks good. I'm running one more test with

diff --git a/build.rs b/build.rs
index 05c216f..eead5d6 100644
--- a/build.rs
+++ b/build.rs
@@ -185,6 +185,32 @@ fn main() {
         })
         .collect();
 
+    // Debug helper for https://github.com/RIOT-OS/riotdocker/issues/242
+    let preprocessor_include = cflags
+        .iter()
+        .filter_map(|f| f.strip_prefix("-I"))
+        .filter(|f| f.ends_with("/preprocessor"))
+        .next();
+    if let Some(preprocessor_include) = preprocessor_include {
+        let filename = format!("{}/preprocessor_successor.h", preprocessor_include);
+        let mut countdown: usize = 10;
+        loop {
+            if std::path::PathBuf::from(&filename).exists() {
+                if countdown != 10 {
+                    panic!("File reappeared. I would have caught a race condition, but chose not to for visibility");
+                }
+                break;
+            }
+            countdown = countdown.checked_sub(1).unwrap_or_else(|| {
+                panic!("Preprocessor include path {} was present but preprocessor_successor.h did not show up there after 20 seconds", filename);
+            });
+
+            // See also https://github.com/RIOT-OS/riotdocker/issues/242
+            println!("cargo:warning=preprocessor_successor.h not found,sleeping suspecting a race condition");
+            std::thread::sleep(std::time::Duration::from_secs(2));
+        }
+    }
+
     let bindings = builder()
         .header("riot-bindgen.h")
         .clang_args(&cflags)

which, while failing, will clearly state that there was a race condition. If that happens, I'll remove the panic again, and leave this workaround in until we manage to reproduce this on a system that we can debug.

[edit: Great, now this passed, ie. the race condition did not occur. Retrying the build hoping for a failure -- both "yes I avoided the race condition" and "no race condition, the file never showed" would be valuable, but passing at this point is just annoying...]

@kaspar030
Copy link
Contributor Author

[edit: Great, now this passed, ie. the race condition did not occur. Retrying the build hoping for a failure -- both "yes I avoided the race condition" and "no race condition, the file never showed" would be valuable, but passing at this point is just annoying...]

uff...

I tried reproducing this locally, did probably 500 "buildtest" runs, trying to inch closer to how the actions start the build. Tried "make --shuffle". "make -j". "make -j1". delaying both creating the preprocessor.h and using it, surrounding it with printouts. ... It didn't fail a single time. (I'm open to the "file system buggy" idea now. 😄 )

@chrysn
Copy link
Member

chrysn commented Feb 6, 2024

Hah, found it! It's about the sequence of files in a directory listing.

This was fixed in RIOT-OS/rust-riot-sys#36, which comes back to haunt us because riotdocker builds with the last stable RIOT version (currently 2023.10-branch).

Shall we backport the riot-sys update, or just bump the branch we're using to test the images, given the 2024.01 release is imminent?

@kaspar030
Copy link
Contributor Author

I would say, bump branch!

@chrysn
Copy link
Member

chrysn commented Feb 6, 2024

So far these have been "increment the branch and declare that we're building the next tag of CI systems". Can we do that already, or do we first set the RIOT branch to 2024.01 but still keep building for the 2024.01 release?

@kaspar030
Copy link
Contributor Author

Wow, I'm confused about this every time. 😏
I think we can update the branch but still build for 2024.01. As soon as 2024.01 is released, we can raise the version tag.

@kaspar030
Copy link
Contributor Author

I've opened a PR that bumps the branch: #246

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants