machine: Volume ops test: statfs /private/tmp/ci/ginkgoNNN: no such file or directory #22569

edsantiago · 2024-05-01T20:09:31Z

Not quite as frequent or annoying as #22551, but still causing wasted runs:

run basic podman commands
  Volume ops
....
  Trying to pull quay.io/libpod/alpine_nginx:latest...
  ...
  Writing manifest to image destination
  WARNING: image platform (linux/amd64) does not match the expected platform (linux/arm64)
  Error: statfs /private/tmp/ci/ginkgo630638030: no such file or directory

darwin : machine-mac podman darwin rootless host sqlite
- PR ExitWithError() - pod_xxx tests #22552
  - 05-01 12:16 in run basic podman commands Volume ops
- PR ExitWithError() - yet more low-hanging fruit #22489
  - 04-24 17:00 in run basic podman commands Volume ops
  - 04-24 12:06 in run basic podman commands Volume ops
- PR ExitWithError() - more low-hanging fruit #22486
  - 04-24 10:09 in run basic podman commands Volume ops
- PR fix(deps): update module github.com/docker/docker to v26.1.0+incompatible #22461
  - 04-22 18:43 in run basic podman commands Volume ops

x	x	x	x	x	x
machine-mac(5)	podman(5)	darwin(5)	rootless(5)	host(5)	sqlite(5)

edsantiago · 2024-05-02T20:24:51Z

@cevich is there any chance whatsoever that https://github.com/containers/podman/blob/c9644ebccf14309a77769cba00833cd139509e4a/contrib/cirrus/mac_cleanup.sh is getting invoked in the middle of a running CI job? I just can't understand this bug and am grasping at straws.

Luap99 · 2024-05-03T11:15:10Z

Unlikely; there is an extra level of indirection here, given that the dir is mounted in the machine VM. As such, maybe the machine mount failed silently?
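For anyone poking at this by hand, here is a minimal Go sketch of the failing check. It assumes nothing about podman's internals: `statfsPath` is a hypothetical helper, and the path in `main` is a placeholder, not a real ginkgo directory. It only shows how a path that is missing (for example because a mount never appeared) produces an error of the same `statfs <path>: no such file or directory` shape as in the log.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// statfsPath is a hypothetical helper, not part of podman. It calls
// statfs(2) on path and wraps any failure in an *os.PathError, so the
// message takes the "statfs <path>: <reason>" form seen in the CI log.
func statfsPath(path string) error {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return &os.PathError{Op: "statfs", Path: path, Err: err}
	}
	return nil
}

func main() {
	// Placeholder path; the real tests use a random ginkgoNNN directory.
	if err := statfsPath("/private/tmp/ci/ginkgo-example"); err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("path is visible")
}
```

Run inside the VM, a check like this would fail whenever the host directory is not visible there, whatever the underlying reason.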

cevich · 2024-05-03T19:21:36Z

is there any chance whatsoever that

What Paul said. And the Macs are single-task/single-user. Is it possible the running VM really is x86_64 via some emulation, and/or is the pull command specifying --arch or --platform (just double-checking)?

Sorry, no hack/get_ci_vm.sh support here; that's just way too complex with this environment to do safely. But in case it helps and is supported (I never checked), the re-run in terminal may be an option (with cleanup temporarily disabled).

Otherwise, there is a way to isolate (for a few hours) one of the Macs and dedicate it to servicing a single PR. In that PR, the end-of-task cleanup could be disabled, so that a human may ssh in and check out the state of things. This is all manual, and a bit of a chore to pull off, but it's technically possible.

Luap99 · 2024-05-15T11:29:05Z

@edsantiago Not sure if you are testing machine in your non-flake-retry testing PR, but if you are, could you give this a go:

diff --git a/pkg/machine/apple/apple.go b/pkg/machine/apple/apple.go
index 93201407e..04db7638b 100644
--- a/pkg/machine/apple/apple.go
+++ b/pkg/machine/apple/apple.go
@@ -124,7 +124,7 @@ func GenerateSystemDFilesForVirtiofsMounts(mounts []machine.VirtIoFs) ([]ignitio
        mountPrep.Add("Service", "Type", "oneshot")
        mountPrep.Add("Service", "ExecStartPre", "chattr -i /")
        mountPrep.Add("Service", "ExecStart", "mkdir -p '%f'")
-       mountPrep.Add("Service", "ExecStopPost", "chattr +i /")
+       // mountPrep.Add("Service", "ExecStopPost", "chattr +i /")
 
        mountPrep.Add("Install", "WantedBy", "remote-fs.target")
        mountPrepFile, err := mountPrep.ToString()

edsantiago · 2024-05-15T11:35:15Z

Oops, no, I long ago disabled machine tests in #17831. I will look into reenabling this one.

FWIW here's the current flake list. I don't think there's any useful info in this list, i.e., I haven't seen any logs that look different or provide interesting new data, but am posting anyway.

darwin : machine-mac podman darwin rootless host sqlite
- PR libpod: wait for healthy on main thread #22658
  - 05-13 16:18 in run basic podman commands Volume ops
- PR ExitWithError() - s files #22582
  - 05-02 20:13 in run basic podman commands Volume ops
- PR ExitWithError() - pod_xxx tests #22552
  - 05-01 12:16 in run basic podman commands Volume ops
- PR ExitWithError() - yet more low-hanging fruit #22489
  - 04-24 17:00 in run basic podman commands Volume ops
  - 04-24 12:06 in run basic podman commands Volume ops
- PR ExitWithError() - more low-hanging fruit #22486
  - 04-24 10:09 in run basic podman commands Volume ops
- PR fix(deps): update module github.com/docker/docker to v26.1.0+incompatible #22461
  - 04-22 18:43 in run basic podman commands Volume ops

x	x	x	x	x	x
machine-mac(7)	podman(7)	darwin(7)	rootless(7)	host(7)	sqlite(7)

Luap99 · 2024-05-15T11:46:07Z

The alternative is that I instrument the tests to do some checks. Basically, the test would have to ssh into the machine VM and run systemctl status on all the mount units. I think the race here is the most likely cause.
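A rough sketch of what such instrumentation could look like in Go, under some assumptions: `mountUnitStatusArgs` is a hypothetical helper (not an existing function in the machine e2e tests), the machine name is the default `podman-machine-default`, and it uses `systemctl list-units --type=mount` as a compact stand-in for running `systemctl status` on each unit.

```go
package main

import (
	"fmt"
	"os/exec"
)

// mountUnitStatusArgs is a hypothetical helper. It builds the arguments
// for "podman machine ssh <name> systemctl list-units ..." so that the
// state of every mount unit inside the VM can be dumped on failure.
func mountUnitStatusArgs(machineName string) []string {
	return []string{
		"machine", "ssh", machineName,
		"systemctl", "list-units", "--type=mount", "--all", "--no-pager",
	}
}

func main() {
	// Assumption: the machine under test uses the default name.
	args := mountUnitStatusArgs("podman-machine-default")
	out, err := exec.Command("podman", args...).CombinedOutput()
	if err != nil {
		fmt.Println("could not query mount units:", err)
	}
	fmt.Print(string(out))
}
```

Calling something like this right after a failed volume test would show whether any mount unit is in a failed or inactive state at that moment.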

One interesting data point would be the new machine init with volume test: if it never fails, then I am sure this is a race due to chattr -i and chattr +i running in parallel in different units. The reason is that this test mounts only one path, so there cannot be a race, whereas the default volumes are several paths, hence the chance for the race.
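To make the suspected interleaving concrete, here is a toy Go model. It is only a sketch of the hypothesis, not podman or systemd code: `rootFS` and its methods are made-up names, and the two paths are illustrative. Each unit does `chattr -i /`, `mkdir`, then `chattr +i /`; if one unit's `chattr +i` lands between another unit's `chattr -i` and its `mkdir`, that `mkdir` fails in this model.

```go
package main

import (
	"errors"
	"fmt"
)

// rootFS is a toy model of "/" with an immutable flag, as toggled by chattr.
type rootFS struct {
	immutable bool
	dirs      map[string]bool
}

var errImmutable = errors.New("mkdir: / is immutable")

func newRootFS() *rootFS { return &rootFS{immutable: true, dirs: map[string]bool{}} }

func (r *rootFS) chattrMinusI() { r.immutable = false }
func (r *rootFS) chattrPlusI()  { r.immutable = true }

// mkdir only succeeds while the immutable flag is cleared.
func (r *rootFS) mkdir(path string) error {
	if r.immutable {
		return errImmutable
	}
	r.dirs[path] = true
	return nil
}

// racyInterleaving replays one possible ordering of two mount-prep units
// (A and B) that run in parallel.
func racyInterleaving() (*rootFS, error) {
	fs := newRootFS()
	fs.chattrMinusI()            // unit A: ExecStartPre
	_ = fs.mkdir("/Users")       // unit A: ExecStart
	fs.chattrMinusI()            // unit B: ExecStartPre
	fs.chattrPlusI()             // unit A: ExecStopPost
	errB := fs.mkdir("/private") // unit B: ExecStart, "/" is immutable again
	fs.chattrPlusI()             // unit B: ExecStopPost
	return fs, errB
}

// serialOrder runs the same two units strictly one after the other.
func serialOrder() (*rootFS, error) {
	fs := newRootFS()
	fs.chattrMinusI()
	_ = fs.mkdir("/Users")
	fs.chattrPlusI()
	fs.chattrMinusI()
	errB := fs.mkdir("/private")
	fs.chattrPlusI()
	return fs, errB
}

func main() {
	_, racyErr := racyInterleaving()
	_, serialErr := serialOrder()
	fmt.Println("interleaved, unit B mkdir error:", racyErr)
	fmt.Println("serial, unit B mkdir error:", serialErr)
}
```

With a single mount path there is only one unit, so the bad ordering cannot occur in this model, which matches the reasoning about the machine init with volume test.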

edsantiago added the flakes (Flakes from Continuous Integration), macos (MacOS (OSX) related), and machine labels on May 1, 2024

edsantiago added commits to edsantiago/libpod that referenced this issue, each titled "EXPERIMENTAL: reenable Mac machine test, with patch from Paul, for debugging containers#22569" (Signed-off-by: Ed Santiago <santiago@redhat.com>):

- 4c080ee (May 15, 2024)
- 12059b3 (May 15, 2024)
- 6bf01be (May 23, 2024)
- 7910900 (Jun 3, 2024)
- 8672231 (Jun 3, 2024)
- 28882ca (Jun 5, 2024)
