
update images #311

Closed
wants to merge 1 commit into from

Conversation

edsantiago (Collaborator)

Signed-off-by: Ed Santiago <santiago@redhat.com>


github-actions bot commented Nov 6, 2023

Cirrus CI build successful. Found built image names and IDs:

Stage  Image Name                 IMAGE_SUFFIX
base   debian                     do-not-use
base   fedora                     do-not-use
base   fedora-aws                 do-not-use
base   fedora-aws-arm64           do-not-use
base   image-builder              do-not-use
base   prior-fedora               do-not-use
cache  build-push                 c20231106t160529z-f39f38d13
cache  debian                     c20231106t160529z-f39f38d13
cache  fedora                     c20231106t160529z-f39f38d13
cache  fedora-aws                 c20231106t160529z-f39f38d13
cache  fedora-netavark            c20231106t160529z-f39f38d13
cache  fedora-netavark-aws-arm64  c20231106t160529z-f39f38d13
cache  fedora-podman-aws-arm64    c20231106t160529z-f39f38d13
cache  fedora-podman-py           c20231106t160529z-f39f38d13
cache  prior-fedora               c20231106t160529z-f39f38d13
cache  rawhide                    c20231106t160529z-f39f38d13
cache  win-server-wsl             c20231106t160529z-f39f38d13

Requires emergency override of containers.conf SNAFU with zstd:chunked

containers/common#1730

Signed-off-by: Ed Santiago <santiago@redhat.com>
edsantiago (Collaborator, Author)

@cevich if you have a spare moment could you look at the fedora-aws Base Image failure please?

==> fedora-aws: Stopping the source instance...
    fedora-aws: Stopping instance
==> fedora-aws: Waiting for the instance to stop...
==> fedora-aws: Error waiting for instance to stop: ResourceNotReady: exceeded wait attempts
==> fedora-aws: Provisioning step had errors: Running the cleanup provisioner, if present...
==> fedora-aws: Terminating the source AWS instance...
==> fedora-aws: ResourceNotReady: failed waiting for successful resource state

I can't find the string "Waiting for the instance to stop" anywhere in the likely source trees, so I have no idea what is running or what the bug is.
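
One thing that might narrow it down, assuming the message comes from the Amazon builder plugin rather than Packer core (the builders were split out into their own repositories a while back), is a grep of that plugin's tree; this is only a sketch:

    # Assumption: the message lives in the split-out Amazon plugin, not Packer core
    git clone https://github.com/hashicorp/packer-plugin-amazon
    grep -rn "Waiting for the instance to stop" packer-plugin-amazon/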

FWIW, the python-3.12 bug is a red herring; my last build threw the same error but worked anyway.

cevich (Member) commented Nov 10, 2023

I believe Urvashi sorted out the podman-py stuff. There's an actual bug in pylint, and she found a workaround.

The error you got is coming from Packer. I've seen similar things before; it looks like a flake to me. It probably orphaned a VM (we can worry about that later). I restarted the task and will keep an eye on it as I'm able today...

cevich (Member) commented Nov 10, 2023

...uggg. Amazon is having a bad day, re-running again...

edsantiago (Collaborator, Author)

It doesn't seem to be a flake. I restarted it four times yesterday.

cevich (Member) commented Nov 10, 2023

I don't think we've changed the Packer version recently, so it must be something on the Amazon side, perhaps triggering a bug in Packer.

In my last attempt, I found the line:
fedora-aws: Instance ID: i-0ac23dd69f36c7d41

I looked that instance up on the AWS EC2 console, and it shows the status as "terminated", which is correct.

I found a few other instances in a "stopped" state, which shouldn't happen.
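
For anyone else poking at the leftovers, the stray instances can also be listed from the command line; a minimal sketch using the standard EC2 state filter (region/profile flags omitted):

    # List instances sitting in the "stopped" state, with ID and launch time
    aws ec2 describe-instances \
        --filters Name=instance-state-name,Values=stopped \
        --query 'Reservations[].Instances[].[InstanceId,LaunchTime]' \
        --output table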

If you'd like to try figuring out and bumping up the Packer timeout, that may get you past the hump. Otherwise we may need a newer Packer version (which may not accept our current cloud.yml).

cevich (Member) commented Nov 10, 2023

Looking again:

==> fedora-aws: Waiting for the instance to stop...
==> fedora-aws: Error waiting for instance to stop: ResourceNotReady: exceeded wait attempts

I bet Amazon changed some timings on their end, such that (for example) it tries an ACPI shutdown, waits, tries again, waits, then "yanks the plug". If the timings of any of that collide with what Packer is expecting, we'd get this problem.

It's highly likely there's a timeout setting for this, which probably needs to be added to the cloud.yml. Sometimes HashiCorp exposes these as a CLI option or via environment variables, but I doubt it in this case.
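
For the record, Packer's Amazon builders do document a pair of waiter environment variables, so a quick experiment could be to export them before the build; the values below are purely illustrative, and whether the failing step actually honors them is an assumption:

    # Illustrative values only: total wait is roughly max_attempts * delay_seconds
    export AWS_MAX_ATTEMPTS=90
    export AWS_POLL_DELAY_SECONDS=20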

cevich (Member) commented Nov 10, 2023

If we need to dig deeper, there are options here as well. AWS keeps a log of basically every API request per user, so it's pretty easy to see if and when a request came in. In this case, it does look like a StopInstances request is received, runs for ~10 minutes, and then there's a TerminateInstances call.
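
The per-request log mentioned here is presumably CloudTrail; a minimal sketch of pulling recent events for the instance from the failed run (instance ID taken from the log above):

    # Look up recent API events that touched the orphaned/terminated instance
    aws cloudtrail lookup-events \
        --lookup-attributes AttributeKey=ResourceName,AttributeValue=i-0ac23dd69f36c7d41 \
        --max-results 20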

edsantiago (Collaborator, Author) commented Nov 14, 2023

Closing in favor of #312. Hoping all these timeouts and errors go away.
