
Warning: Failed to download action 'https://api.github.com/repos/microsoft/powerplatform-actions/zipball/0c80aacb61f9bfdcb930b880febc420ec4ef2f3e'. Error: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing. #416

Closed
Joseph-Duty-VA opened this issue Aug 5, 2023 · 18 comments
Labels
enhancement New feature or request

Comments

@Joseph-Duty-VA

I have started seeing this error repeatedly over the past several days across a number of my workflow calls into this repo:

Warning: Failed to download action 'https://api.github.com/repos/microsoft/powerplatform-actions/zipball/0c80aacb61f9bfdcb930b880febc420ec4ef2f3e'. Error: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.

The download is then retried; sometimes the retry succeeds, but other times it keeps failing and the entire workflow run errors out. Actions from other external repos (actions/checkout, some internal ones I have in my customer's org, etc.) all seem to be working just fine.

I am using v0 (e.g. - uses: microsoft/powerplatform-actions/import-data@v0 )
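
For context, a stripped-down version of one of the affected jobs looks roughly like this (the environment URL, secrets, and data file are placeholders, and the input names are from memory, so check them against the action's documentation):

jobs:
  import-data:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v3
      - name: Import data
        uses: microsoft/powerplatform-actions/import-data@v0
        with:
          environment-url: https://myorg.crm.dynamics.com   # placeholder
          user-name: ${{ secrets.PP_USERNAME }}             # placeholder secret
          password-secret: ${{ secrets.PP_PASSWORD }}       # placeholder secret
          data-file: data/import.zip                        # placeholder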

@tehcrashxor
Member

I've seen this issue occur intermittently in one of our own workflows. It used to fail less than once a day, but the failure rate has been increasing lately.

The failure itself occurs during the "Set up job" step, which is an autogenerated step created by GitHub Actions itself, so there's nothing apparent that we can do on our side to affect it.

I've opened up a support ticket with GitHub to see if they will assist.

@Joseph-Duty-VA
Author

This makes sense. I had opened a ticket here because it seems to only happen when I call actions from this repo. Other actions (from other repos) don't seem to have this same issue.

@camiloqp

It is happening to me as well, but only when using actions from repos/microsoft/powerplatform-actions.

@tehcrashxor
Member

No word back from GitHub support on the ticket since Friday.

Our current hypothesis is that this has been occurring more frequently due to the steady size increase of our action's tarball/zipball: each version of PAC is larger than the last, and those binaries are checked into the repo via git-lfs. Reducing that size should lead to fewer timeouts, which may simply be happening during periods of higher GitHub server load.

#424 is an option to do just that - remove PAC from the repo and install via Nuget at action execution time. If we go this route, it will be a breaking change with a major version bump, as all Jobs would need a new install step.
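
To give a rough sense of what installing via NuGet at execution time means in practice, the PAC CLI can be pulled down at runtime along these lines (just an illustration of the idea, not necessarily the exact mechanism #424 ends up using):

$ dotnet tool install --global Microsoft.PowerApps.CLI.Tool
$ pac help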

@Joseph-Duty-VA
Author

Thanks for the follow-up. To clarify, when you say "it will be a breaking change with a major version bump, as all Jobs would need a new install step" - you mean that each job with a "- uses: microsoft/powerplatform-actions/xxxxxx@v0" style step would first need to call a "- uses: microsoft/powerplatform-actions/actions-install@v0" step - is that right?

@tehcrashxor
Member

Correct.
New releases would have their version numbers in the v1.X.Y range instead of the current v0.X.Y, the tag v0 would point to the last v0.X.Y release, and the tags v1 and latest would point to the most recent v1.X.Y release.

@matthewborne13

@tehcrashxor
Thank you for looking into this issue. My organization's applications rely on these actions for deployment, and we are concerned that our production rollouts will be unstable until this issue is resolved. One of our deployments this week failed 4 times over the course of 2.5 hours before succeeding on the 5th try. I don't have any additional information to add, but I am adding this comment in case it helps the teams prioritize this higher.

@petrochuk
Contributor

Our current plan is to release a new version next week.

@tehcrashxor
Member

We heard back from GitHub on our support ticket - they're still looking into why the download speed within the Set up job step drops so drastically, but they recommended what we're already doing via #424: reducing the size of the tarball/zipball downloaded by the Actions runner.

Preliminary findings from that work are that the tarball's size has been cut from 53 MB down to 5.2 MB, and the Set up job step usually takes 10 seconds or less. However, I've seen a couple of runs that took closer to 3 minutes, triggering the 100-second timeout but succeeding on a retry before failing the entire job. Hopefully GitHub will be able to resolve that download issue entirely, but at the very least this should reduce the number of job failures.

@larsxschneider

👋 Hello from GitHub!

You have enabled "Git LFS objects in archives" for this repo (the default is "off").

The commit mentioned in this issue (0c80aacb61f9bfdcb930b880febc420ec4ef2f3e) references 800 files tracked by LFS, and all of those files are added to your archive, which seems to cause a timeout for your repo.

$ git checkout 0c80aacb61f9bfdcb930b880febc420ec4ef2f3e
$ git lfs ls-files | wc -l
     800

We are looking into this issue!

Do you need the LFS files in your archive? If not, then disabling the "Git LFS objects in archives" option for this repo should fix the download of older archives. Archives of the latest main commit seem to work just fine either way, because it only tracks 74 files with LFS.

@petrochuk added the enhancement (New feature or request) label Aug 21, 2023
@tehcrashxor
Member

Do you need the LFS files in your archive?

Yep, the Node code is just a thin shim that reads the parameters passed in the user's YAML and hands them to the binaries stored in LFS.
Even with our plans for the upcoming version to download those binaries from NuGet instead of storing them in LFS, we can't disable the LFS archive inclusion, as that would break all previous versions of our Actions.

@tehcrashxor
Member

Released the new version as v1.0.0, which should alleviate most of the "Set up job" failures, as the tarball/zipball is significantly smaller.

As noted in the release notes, pipelines upgrading to the new version will need to update the @v0 version specifier to @v1 and add the actions-install step before any of our other actions; that step handles grabbing PAC from nuget.org at runtime.
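
A minimal upgraded job looks roughly like this (environment URL and secrets are placeholders; check each action's documentation for its actual inputs):

jobs:
  who-am-i:
    runs-on: windows-latest
    steps:
      - uses: microsoft/powerplatform-actions/actions-install@v1
      - uses: microsoft/powerplatform-actions/who-am-i@v1
        with:
          environment-url: https://myorg.crm.dynamics.com   # placeholder
          user-name: ${{ secrets.PP_USERNAME }}             # placeholder secret
          password-secret: ${{ secrets.PP_PASSWORD }}       # placeholder secret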

@markwong-synechron

markwong-synechron commented Sep 4, 2023

No word back from GitHub support on the ticket since Friday.

Our current hypothesis is that this has been occurring more frequently due to the steady size increase of our action's tarball/zipball: each version of PAC is larger than the last, and those binaries are checked into the repo via git-lfs. Reducing that size should lead to fewer timeouts, which may simply be happening during periods of higher GitHub server load.

#424 is an option to do just that - remove PAC from the repo and install via Nuget at action execution time. If we go this route, it will be a breaking change with a major version bump, as all Jobs would need a new install step.

Hi, this method won't work for my clients. Many large, highly regulated companies do not allow direct access to nuget.org, so grabbing the package from NuGet at runtime would be a dead end for them. Their runners are behind firewalls, and security won't open access to nuget.org given the risk, so they would be stuck on v0 and with this timeout issue. Many of my clients' production deployments are heavily impacted.

@petrochuk
Contributor

Some highly regulated companies use self-hosted GitHub runners.

@danmcpherson

danmcpherson commented Sep 5, 2023

I'm seeing these timeout errors more often than not. I've moved to v1 and added the actions-install step. If I try to manually download the file "https://api.github.com/repos/microsoft/powerplatform-actions/zipball/09afea19cc361004739641ee6dda4ee7d7fac716" from my browser, it gets to 5.6 MB quite quickly, but then it just sits there waiting, which I'm sure is also what's happening inside the GitHub Action.
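
The same stall is reproducible from a shell, for anyone who wants to time it outside the browser (curl here simply downloads the same zipball and prints the size and elapsed time):

$ curl -sL -o pp-actions.zip -w 'downloaded %{size_download} bytes in %{time_total}s\n' \
    https://api.github.com/repos/microsoft/powerplatform-actions/zipball/09afea19cc361004739641ee6dda4ee7d7fac716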


@tehcrashxor
Member

I'm seeing these timeout errors more often than not. I've moved to v1

Our monitoring job showed a fair number of v1 timeouts over Sunday and Monday, but it's still failing significantly less often for us than the v0 runs. We've updated our GitHub support ticket with that info, and we know of at least one large customer that has their own ticket open as well to investigate the timeouts, but we haven't had anything further from GitHub yet.

Hi, this method won't work for my clients. Many large, highly regulated companies do not allow direct access to nuget.org, so grabbing the package from NuGet at runtime would be a dead end for them.

We're looking into updating the actions-install step to support taking the package from either internal feeds or directly from a .nupkg file already on the box, so that such customers can obtain the package and provide it on the build agent / action runner without a direct call to nuget.org.

Obtaining PAC another way, installing it into the expected location that actions-install/index.ts uses, and setting an environment variable named POWERPLATFORMTOOLS_PACINSTALLED to true is sufficient to get the other v1 actions running without our actions-install step, though that is a bit hacky and shouldn't be necessary after we update the install step.
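
As a very rough sketch of that workaround (the feed URL, output path, and other inputs below are placeholders; the real expected install location is whatever actions-install/index.ts resolves, so treat this as illustrative only):

jobs:
  who-am-i:
    runs-on: windows-latest
    env:
      POWERPLATFORMTOOLS_PACINSTALLED: "true"
    steps:
      - name: Provide PAC from an internal feed instead of nuget.org
        # placeholder feed URL and output directory
        run: nuget install Microsoft.PowerApps.CLI -Source https://my-internal-feed/nuget/v3/index.json -OutputDirectory D:\pac
      - uses: microsoft/powerplatform-actions/who-am-i@v1
        with:
          environment-url: https://myorg.crm.dynamics.com   # placeholder
          user-name: ${{ secrets.PP_USERNAME }}             # placeholder secret
          password-secret: ${{ secrets.PP_PASSWORD }}       # placeholder secret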

@tehcrashxor
Member

tehcrashxor commented Sep 18, 2023

We have a monitoring workflow that runs every hour and spins up four jobs to check that the actions download and run correctly on the agent: two v0 who-am-i jobs (one Windows, one Ubuntu) and two jobs consisting of v1 actions-install followed by who-am-i (also one Windows, one Ubuntu).
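
Schematically (inputs and credentials elided; this is a sketch rather than the actual workflow file), that looks something like:

name: monitor
on:
  schedule:
    - cron: '0 * * * *'   # every hour
jobs:
  v0-who-am-i:
    strategy:
      matrix:
        os: [windows-latest, ubuntu-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: microsoft/powerplatform-actions/who-am-i@v0
        # environment-url / credentials elided
  v1-who-am-i:
    strategy:
      matrix:
        os: [windows-latest, ubuntu-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: microsoft/powerplatform-actions/actions-install@v1
      - uses: microsoft/powerplatform-actions/who-am-i@v1
        # environment-url / credentials elided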

In the last week we've seen only a single v0 job failure, so it looks like things are working better now.

GitHub Support suggests that this resolved incident, https://www.githubstatus.com/incidents/frdfpnnt85s8, was likely the fix for the download slowness and failures, but we're not 100% convinced: it was marked Resolved on September 5th, yet we saw a series of failures afterwards, from September 8th through September 10th.

@tehcrashxor
Member

GitHub Support elaborated that the incident marked Resolved on Sept 5th wasn't fully remediated until Sept 10th, which explains the failures we saw from the 8th through the 10th.

As we've only had a single failure in the monitoring job for the last week, it looks like they may well have fixed the issue. We'll monitor for further failures and reopen this issue and our GitHub Support Ticket if we see it start to fail again.
