
Proxied update connections are broken in F38 #1477

Closed
fifofonix opened this issue Apr 19, 2023 · 18 comments

@fifofonix

Describe the bug

F37 servers configured behind a corporate proxy can no longer apply rpm-ostree updates once they have upgraded to F38. Newly provisioned F38 servers are similarly affected.

Related to: https://bugzilla.redhat.com/show_bug.cgi?id=2185433

Reproduction steps

  1. Configure a server as described here: https://docs.fedoraproject.org/en-US/fedora-coreos/proxy/
  2. Zincati detects available updates correctly via the proxy (sudo systemctl status zincati)
  3. rpm-ostreed times out when attempting to stage the release (sudo systemctl status rpm-ostreed)
rpm-ostree[52378]: libostree HTTP error from remote fedora for <https://ostree.fedoraproject.org/mirrorlist>: Timeout was reached
rpm-ostree[52378]: Txn Deploy on /org/projectatomic/rpmostree1/fedora_coreos failed: While pulling bfbc0cd30068bd5a7eaac5bac2f0420d01652f073fee64c8d2b0b37868c801e7: While fetching mirrorlist 'https://ostree.fedoraproject.org/mirrorlist'
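
For anyone triaging a node in this state, a rough way to confirm the symptoms from a shell (a sketch: it assumes the proxy was configured via systemd drop-ins for zincati.service and rpm-ostreed.service, as in the linked docs):

# Confirm the proxy drop-ins are in effect for both services:
systemctl cat zincati.service rpm-ostreed.service | grep -i proxy

# Zincati should report that an update was detected, while rpm-ostreed logs the mirrorlist timeout:
sudo systemctl status zincati rpm-ostreed --no-pager
sudo journalctl -u rpm-ostreed --since "1 hour ago" | grep -iE 'mirrorlist|timeout'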

Expected behavior

OS updates should continue to be applied unattended as previously.

Actual behavior

The node is stuck, unable to apply any future updates.

It is possible to roll back to F37 on nodes that have upgraded. However, critically, watch out for: #1473
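
For reference, the rollback on an upgraded node would look roughly like this (standard rpm-ostree commands, shown as a sketch; see #1473 before relying on it):

rpm-ostree status                  # check that the previous F37 deployment is still present
sudo rpm-ostree rollback --reboot  # make it the default deployment again and reboot into it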

System details

  • Nodes behind a corporate proxy
  • VMware (version 38.20230414.1.0 (2023-04-14T10:19:13Z))

Butane or Ignition config

No response

Additional information

No response

@travier
Member

travier commented Apr 19, 2023

We paused the F38 rollout: coreos/fedora-coreos-streams#700

@fifofonix
Author

Not at all sure yet what the root cause behind the BZ is.

But I wonder whether some of the historical means of configuring proxies might work, to avoid having to roll back or re-provision a server (as a temporary solution): coreos/rpm-ostree#762

@jmarrero
Member

Thank you so much for the detailed bug report @fifofonix!

It looks like the issue is with the curl/libcurl packages: downgrading to libcurl/curl 7.86.0-4 solves the issue, and upgrading to 8.0.1-2 works as expected too.

I can reproduce the issue consistently with the first curl/libcurl 7.87 build (7.87.0-1) and with the last two F38 builds, 7.87.0-6 and 7.87.0-7.

I have reached out to the curl maintainer in the BZ (https://bugzilla.redhat.com/show_bug.cgi?id=2185433) and we are trying to pinpoint the fix that introduced the regression in the curl code.
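
To check which build a given node is running (a sketch; curl and libcurl-minimal are the package names used elsewhere in this thread):

rpm -q curl libcurl-minimal   # 7.87.0-* builds show the problem; 7.86.0-4 and 8.0.1-2 reportedly do not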

@travier
Member

travier commented Apr 21, 2023

@jmarrero
Member

Verified that the update works for me with:

rpm-ostree override replace https://bodhi.fedoraproject.org/updates/FEDORA-2023-eec1379708

@dustymabe
Member

@dustymabe dustymabe changed the title OS Updates initiated from behind a Corporate Proxy Stop Working With F38 (at least up to 38.20230417.1.0) Proxied update connections are broken in F38 Apr 21, 2023
@dustymabe dustymabe added status/pending-testing-release Fixed upstream. Waiting on a testing release. status/pending-next-release Fixed upstream. Waiting on a next release. labels Apr 21, 2023
@jlebon
Member

jlebon commented Apr 21, 2023

So how do we want to tackle the semi-broken testing release (38.20230414.2.0)? Presumably, there are nodes on there that use a proxy and will stay stuck there. Should we send a coreos-status email with instructions to either roll back, reprovision, or do an override replace with the fixed libcurl?

We should also consider marking the release as a deadend some time after the rollout for the fixed testing release has finished. Nodes that aren't stuck will have upgraded. Nodes that are stuck will get the MOTD. (We could mark as a deadend immediately; IIRC it doesn't prevent upgrades from happening, but the MOTD would be incorrect on nodes that aren't actually stuck.)

@dustymabe
Member

The fix for this went into testing stream release 38.20230414.2.1. Please try out the new release and report issues.

@dustymabe dustymabe removed the status/pending-testing-release Fixed upstream. Waiting on a testing release. label Apr 21, 2023
@dustymabe
Member

dustymabe commented Apr 21, 2023

So how do we want to tackle the semi-broken testing release (38.20230414.2.0)? Presumably, there are nodes on there that use a proxy and will stay stuck there. Should we send a coreos-status email with instructions to either roll back, reprovision, or do an override replace with the fixed libcurl?

Yeah, we should probably send a status post. I hesitate to instruct people to roll back at this point because it's a major rollback and there might be some things that don't work as a result (even if they were just going to immediately re-upgrade).

override replace sounds the nicest, but we'd have to make sure they removed the override after upgrading. I guess we could just give them instructions to make it ephemeral so the override wouldn't have to be removed later.
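
For completeness, the non-ephemeral variant and the cleanup it would need later might look like this (a sketch built around the override replace command posted above; run the reset only once a release containing the fix is available):

# Persistent override: keeps the fixed curl/libcurl build across deployments:
sudo rpm-ostree override replace https://bodhi.fedoraproject.org/updates/FEDORA-2023-eec1379708
# Later, after a fixed release is out, drop all active overrides and upgrade normally:
sudo rpm-ostree override reset --all
sudo rpm-ostree upgrade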

We should also consider marking the release as a deadend some time after the rollout for the fixed testing release has finished. Nodes that aren't stuck will have upgraded. Nodes that are stuck will get the MOTD. (We could mark as a deadend immediately; IIRC it doesn't prevent upgrades from happening, but the MOTD would be incorrect on nodes that aren't actually stuck.)

If the MOTD is all we'd get from that, I think I'd vote not to do this. We'd also have to dead-end every F38 next release so far.

I imagine nodes behind a proxy aren't a huge part of our user base, or else we would have had a user report this problem before it got to testing? Though maybe none of them are running next like @fifofonix is (thank you @fifofonix!).

@fifofonix
Author

Interested in bringing a handful of next nodes (and one testing node) back into the fold. What is the ephemeral way to do this rpm-ostree override replace? I presume one advantage of this method is that it would also address newly provisioned nodes on the now-defunct release, as well as nodes that have upgraded into this state (of which I have a few)?

@dustymabe
Member

dustymabe commented Apr 24, 2023

On x86_64, maybe try something like this (untested):

sudo systemctl stop zincati
sudo rpm-ostree usroverlay
sudo rpm -Uvh https://kojipkgs.fedoraproject.org//packages/curl/7.87.0/8.fc38/x86_64/curl-7.87.0-8.fc38.x86_64.rpm https://kojipkgs.fedoraproject.org//packages/curl/7.87.0/8.fc38/x86_64/libcurl-minimal-7.87.0-8.fc38.x86_64.rpm
sudo systemctl start zincati

The update rollout window for the new release starts this morning so you might not see an update happen immediately.

@fifofonix
Author

A slight modification to the above did yield an update on a 'stuck' node. For the time being, pending the start of the new rollout, the update was to an equally 'stuck' 38.20230417.1.0, but the point is that this proves how to move forward.

sudo systemctl stop zincati
sudo rpm-ostree usroverlay
sudo rpm -Uvh https://kojipkgs.fedoraproject.org//packages/curl/7.87.0/8.fc38/x86_64/curl-7.87.0-8.fc38.x86_64.rpm https://kojipkgs.fedoraproject.org//packages/curl/7.87.0/8.fc38/x86_64/libcurl-minimal-7.87.0-8.fc38.x86_64.rpm
sudo systemctl restart rpm-ostreed
sudo systemctl start zincati

@dustymabe
Member

For the time being, pending the start of the new rollout, the update was to an equally 'stuck' 38.20230417.1.0

Indeed. We haven't rolled out this fix to next yet.

@fifofonix
Author

fifofonix commented Apr 25, 2023

Note that for some reason this solution did not work for me on the sole testing node that I had let upgrade to the latest 'stuck' version. The sudo rpm step just hung as if it was having connection issues. I ended up rolling back the node via the GRUB prompt, after which the server promptly upgraded to the latest testing version without issue. A side effect of the rollback/upgrade is that it validates the closure of #1473.

@dustymabe
Member

dustymabe commented Apr 25, 2023

The sudo rpm step just hung as if it was having connection issues.

This could make sense if you require a proxy to get to the internet. I guess we'd need to modify the steps to say "download and copy over x,y RPMs to the affected nodes" before running the rpm -Uvh.
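
In practice, that modified procedure might look roughly like this (a sketch, untested: core@affected-node is a placeholder, and the Koji URLs are the ones from the earlier comments):

# On a machine that can still reach the internet, fetch the fixed packages:
curl -LO https://kojipkgs.fedoraproject.org//packages/curl/7.87.0/8.fc38/x86_64/curl-7.87.0-8.fc38.x86_64.rpm
curl -LO https://kojipkgs.fedoraproject.org//packages/curl/7.87.0/8.fc38/x86_64/libcurl-minimal-7.87.0-8.fc38.x86_64.rpm

# Copy them to the affected node (core@affected-node is a placeholder):
scp curl-7.87.0-8.fc38.x86_64.rpm libcurl-minimal-7.87.0-8.fc38.x86_64.rpm core@affected-node:

# Then, on the affected node, install from the local files instead of over the network:
sudo systemctl stop zincati
sudo rpm-ostree usroverlay
sudo rpm -Uvh ./curl-7.87.0-8.fc38.x86_64.rpm ./libcurl-minimal-7.87.0-8.fc38.x86_64.rpm
sudo systemctl restart rpm-ostreed
sudo systemctl start zincati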

@dustymabe
Member

The fix for this went into next stream release 38.20230430.1.0. Please try out the new release and report issues.

@dustymabe
Member

This issue never affected our stable stream.

@dustymabe dustymabe removed the status/pending-next-release Fixed upstream. Waiting on a next release. label May 3, 2023
@fifofonix
Author

Note that if trying to fix/patch an affected next server, you may need to follow the steps outlined above more than once. For example, upgrading from 38.20230322.1.0 will yield the broken 38.20230417.1.0, but repeating the steps above will then reach the fixed 38.20230430.1.0.
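
A quick way to tell whether another pass is needed is to check where the node landed after each upgrade (standard commands, shown as a sketch):

rpm-ostree status               # lists the booted and staged deployments with their versions
sudo systemctl status zincati   # confirm Zincati is active and waiting for the next rollout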
