tests/test-sysroot.js intermittently failing on s390x #2527

Open
smcv opened this issue Jan 25, 2022 · 5 comments

@smcv
Contributor

smcv commented Jan 25, 2022

The unit test tests/test-sysroot.js has been failing intermittently on Debian's s390x port since October 2021 (ostree 2021.5). It doesn't always fail, and after a failure it has consistently succeeded when the build is retried.

This is happening in a transient chroot environment on autobuilders that are not accessible to ordinary Debian developers, so I am unable to get any information about the failed builds beyond what's in the logs.

The failing assertion is this one:

/// TEST: We can delete the deployment, going back to empty
sysroot.write_deployments([], null);

print("OK empty deployments");

assertEquals(deploymentPath.query_exists(null), false);

I have never had any success with taking s390x-specific issues to Debian's s390x architecture porting team (which might in fact not contain any people), but I hear several ostree developers now work for an IBM subsidiary, so perhaps someone there is better-placed than me to know about s390x-specific issues or see whether this is reproducible in a development environment?

We've seen this with gjs 1.68.4 and 1.70.0. Full logs for some recent versions: 2022.1, 2021.6

@dbnicholson
Member

That's interesting. So, the call to ostree_sysroot_write_deployments succeeds, but it's either not cleaning up the old deployments or g_file_query_exists is lying. Or maybe deploymentPath isn't what's expected?

Pursuing the "g_file_query_exists is lying" angle: it appears that GIO uses either statx or lstat, preferring statx if it was available at build time. Perhaps 2021.5 is when statx started being used in GIO, and it's flaky on the s390x builder? A way to cross-check is to use g_file_test, which uses access to test for existence. You could try adding this to the test:

diff --git a/tests/test-sysroot.js b/tests/test-sysroot.js
index d4f67ef4..d9a78dc3 100755
--- a/tests/test-sysroot.js
+++ b/tests/test-sysroot.js
@@ -93,6 +93,8 @@ sysroot.write_deployments([], null);
 
 print("OK empty deployments");
 
+print("Deployment path: " + deploymentPath.get_path());
+assertEquals(GLib.file_test(deploymentPath.get_path(), GLib.FileTest.EXISTS), false);
 assertEquals(deploymentPath.query_exists(null), false);
 
 //// Ok, redeploy, then add a new revision upstream and pull it
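
The same cross-check can also be tried outside of GLib and gjs. What follows is only a minimal sketch in plain C, assuming a Linux system with glibc: it asks lstat, access and (where the headers provide it) statx whether a path exists, and reports whether the answers agree. The function names (classify, exists_via_lstat, exists_via_access, exists_via_statx, probes_agree) are made up for this illustration and are not part of ostree or GLib. Note that access follows symlinks while lstat does not, so a dangling symlink is a legitimate case where the answers differ.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <stdio.h>
#include <sys/stat.h>   /* lstat; statx on glibc >= 2.28 */
#include <unistd.h>     /* access */

/* Map a syscall result to 1 (exists), 0 (does not exist), -1 (other error). */
static int
classify (int rc)
{
  if (rc == 0)
    return 1;
  return (errno == ENOENT || errno == ENOTDIR) ? 0 : -1;
}

/* Existence according to lstat (does not follow a final symlink). */
static int
exists_via_lstat (const char *path)
{
  struct stat st;
  return classify (lstat (path, &st));
}

/* Existence according to access(F_OK) (follows symlinks). */
static int
exists_via_access (const char *path)
{
  return classify (access (path, F_OK));
}

/* Existence according to statx; returns -1 if statx is not available. */
static int
exists_via_statx (const char *path)
{
#ifdef STATX_TYPE
  struct statx stx;
  return classify (statx (AT_FDCWD, path, AT_SYMLINK_NOFOLLOW, STATX_TYPE, &stx));
#else
  (void) path;
  errno = ENOSYS;
  return -1;
#endif
}

/* Print all three answers; return 1 if they agree, 0 otherwise.
 * A statx result of -1 (unavailable or blocked) is not counted as
 * a disagreement. */
int
probes_agree (const char *path)
{
  assert (path != NULL);

  int via_lstat = exists_via_lstat (path);
  int via_access = exists_via_access (path);
  int via_statx = exists_via_statx (path);

  printf ("%s: lstat=%d access=%d statx=%d\n",
          path, via_lstat, via_access, via_statx);

  return via_lstat == via_access
         && (via_statx == -1 || via_statx == via_lstat);
}
```

Calling probes_agree on the deployment path right after the deployments have been written would show whether the three system calls disagree on the affected builder. It would not explain why, and it says nothing about whether the directory was actually removed.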

And here's a hack to get a little more info about cleaning up deployments:

diff --git a/src/libostree/ostree-sysroot-cleanup.c b/src/libostree/ostree-sysroot-cleanup.c
index 3471cac7..9ca7fcc6 100644
--- a/src/libostree/ostree-sysroot-cleanup.c
+++ b/src/libostree/ostree-sysroot-cleanup.c
@@ -325,8 +325,12 @@ cleanup_old_deployments (OstreeSysroot       *self,
       g_autofree char *deployment_path = ostree_sysroot_get_deployment_dirpath (self, deployment);
 
       if (g_hash_table_lookup (active_deployment_dirs, deployment_path))
-        continue;
+        {
+          g_print ("Skipping cleanup of active deployment %s\n", deployment_path);
+          continue;
+        }
 
+      g_print ("Cleaning up deployment %s\n", deployment_path);
       if (!_ostree_sysroot_rmrf_deployment (self, deployment, cancellable, error))
         return FALSE;
     }

@smcv
Contributor Author

smcv commented Jan 26, 2022

Debian is currently rebuilding half the archive to recover from a binutils regression, so I am probably not going to be able to test this until the autobuilders recover, sorry.

Because this is intermittent, I can't know that a successful build is really a success, and because the autobuilders are production infrastructure, I can't just keep hitting rebuild. I'll try doing manual builds on a s390x "porter box" when I get a chance, but there's no guarantee that that will match the autobuilder's behaviour.
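
To put a rough number on that, here is a back-of-the-envelope estimate. It assumes that each build fails independently with the same probability p; both that assumption and the example value of p are hypothetical, since the real failure rate on the autobuilders is not known. If the bug is still present, n consecutive passes still happen with probability (1 - p)^n, so being wrong with probability at most α requires:

$$
(1 - p)^n \le \alpha
\quad\Longleftrightarrow\quad
n \ge \frac{\ln \alpha}{\ln (1 - p)}
$$

For example, with p = 0.3 and α = 0.05:

$$
n \ge \frac{\ln 0.05}{\ln 0.7} \approx \frac{-2.996}{-0.357} \approx 8.4
$$

so it would take 9 consecutive successful builds before a run of passes could be read as reasonable evidence of a fix, and more than that if failures are rarer.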

> Perhaps 2021.5 is when statx started being used in GIO and it's flaky on the s390x builder?

Use of statx seems to have been new in 2.66.x, and we had several consecutive successful builds of ostree on s390x after 2.66.x was introduced, so I think it's probably not that... but because it's intermittent, I can't be sure.

> That's interesting. So, the call to ostree_sysroot_write_deployments succeeds, but it's either not cleaning up the old deployments or g_file_query_exists is lying. Or maybe deploymentPath isn't what's expected?

In some older builds, like 2021.1-1, we seem to have had other tests failing when they asserted that a directory should not exist, but it did - and those assertions were in shell scripts using test -d, so probably not statx? (But I don't know, maybe bash genuinely does use statx for builtins.)

@dbnicholson
Member

Oh, I didn't mean to try to debug it right now. I can imagine that s390x debugging is nowhere near the top of your queue. Just that if you do get around to it, it would be helpful to try to narrow down the issue.

@nikita-dubrovskii
Contributor

nikita-dubrovskii commented Jan 31, 2022

Hi all, I've tried make && make check many times on Fedora 35:

  • GLib version 2.70.3
  • GNU C Library (GNU libc) stable release version 2.34.
  • Linux 5.15.17-200.fc35.s390x

and wasn't able to reproduce the issue:

PASS: tests/test-sysroot.js 1 test-sysroot

============================================================================
Testsuite summary for libostree 2022.2
============================================================================
# TOTAL: 1005
# PASS:  962
# SKIP:  43
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

raspbian-autopush pushed a commit to raspbian-packages/ostree that referenced this issue Mar 12, 2022
@smcv
Contributor Author

smcv commented Dec 6, 2022

I was unable to reproduce this on the s390x machine that Debian developers can access, which is meant to be the closest available equivalent to interactive access to an autobuilder: build and tests succeeded in 2/2 attempts. However, 2022.7 failed in this way in 3/3 attempts on Debian's official s390x autobuilders, so there might be something about the official autobuilder infrastructure that makes this test more likely to fail.

My ability to debug that is extremely limited, because only sysadmins have any sort of interactive access to the autobuilder machines, so this is unlikely to go further without someone else picking this up.

raspbian-autopush pushed a commit to raspbian-packages/ostree that referenced this issue Dec 13, 2022
This test regularly fails on the buildds, but I cannot reproduce the
failure on a porterbox.

Bug: ostreedev/ostree#2527
Bug-Debian: https://bugs.debian.org/1025532
Forwarded: not-needed

Gbp-Pq: Topic debian
Gbp-Pq: Name test-sysroot-Skip-on-s390x-by-default.patch