Spurious Failures in download:plugins #13902

Closed
tsmaeder opened this issue Jul 9, 2024 · 3 comments

@tsmaeder
Contributor

tsmaeder commented Jul 9, 2024

Bug Description:

Currently, many of our CI jobs fail during the `yarn download:plugins` task. I was able to debug this a bit, and it seems that when we request the plugin metadata for a URI like `https://open-vsx.org/api/v2/-/query?extensionId=vscode.go&includeAllVersions=true&targetPlatform=universal&offset=0`, an error is thrown (which is unfortunately swallowed silently). The stack trace is:

Error: write ECONNABORTED
    at WriteWrap.onWriteComplete [as oncomplete] (node:internal/stream_base_commons:95:16)
    at WriteWrap.callbackTrampoline (node:internal/async_hooks:130:17)
    at handleWriteReq (node:internal/stream_base_commons:59:21)
    at writeGeneric (node:internal/stream_base_commons:150:15)
    at Socket._writeGeneric (node:net:953:11)
    at Socket._write (node:net:965:8)
    at doWrite (node:internal/streams/writable:590:12)
    at clearBuffer (node:internal/streams/writable:774:7)
    at Writable.uncork (node:internal/streams/writable:523:7)
    at ClientRequest._flushOutput (node:_http_outgoing:1177:10)

A typical scenario where this might happen is when a write occurs to the request after request.end() has been called. Another possibility is a timeout on the request, but we don't set a timeout on the request 🤷. Unfortunately, the problem comes and goes, so debugging is hard.

@tsmaeder
Contributor Author

With some more logging, we now see, for example, "Server returned code 503." (Service Unavailable). In local testing I've also seen 429 (Too Many Requests). It could be that the server is simply overloaded.
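
Both 429 and 503 usually indicate a temporary condition, so one possible mitigation is to retry with a growing delay. This is not something this issue implements; the following is only a minimal sketch. All names (`isTransient`, `retryDelayMs`, `withRetry`) and the default delays are assumptions made for illustration, not code from the download script.

```typescript
// Hypothetical helpers for illustration only.

/** Status codes treated as temporary: 429 Too Many Requests, 503 Service Unavailable. */
const TRANSIENT_STATUS = new Set([429, 503]);

function isTransient(status: number): boolean {
    return TRANSIENT_STATUS.has(status);
}

/**
 * Delay in milliseconds before the next attempt.
 * Uses a Retry-After value given in seconds when one is present; otherwise
 * (header missing, or given as an HTTP date) falls back to exponential
 * backoff capped at `maxMs`.
 */
function retryDelayMs(attempt: number, retryAfter?: string, baseMs = 500, maxMs = 30000): number {
    const seconds = retryAfter !== undefined && retryAfter.trim() !== '' ? Number(retryAfter) : NaN;
    if (Number.isFinite(seconds) && seconds >= 0) {
        return seconds * 1000;
    }
    return Math.min(maxMs, baseMs * 2 ** attempt);
}

/** Repeats `request` while it returns a transient status, up to `maxAttempts` calls in total. */
async function withRetry<T extends { status: number; retryAfter?: string }>(
    request: () => Promise<T>,
    maxAttempts = 4,
    sleep: (ms: number) => Promise<void> = ms => new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<T> {
    let response = await request();
    for (let attempt = 0; attempt < maxAttempts - 1 && isTransient(response.status); attempt++) {
        await sleep(retryDelayMs(attempt, response.retryAfter));
        response = await request();
    }
    return response;
}
```

The `sleep` parameter is injectable so the retry loop can be exercised without real waiting.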

@tsmaeder
Contributor Author

Here are some more findings:

  • I added a size parameter of 5 to the requests, because the first response is usually the one we want anyway: no need to load a hundred versions.
  • This morning, I ran the download:plugins script with the default rate limit of 15 requests/second. I got a couple of 429 errors, then some requests succeeded.
  • I reran the script numerous times with the same rate limit, but no errors occurred.
  • I ran the script with a rate limit of 1000 requests/second, which worked fine.
  • Then, all of a sudden, I got a couple of 429 errors again (rate limit === 15), but only on one run.
  • There seems to be no Retry-After header on those 429 errors.

I can't really make out a pattern here: the errors seem to happen totally randomly.
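
To make the first finding concrete, here is a rough sketch of how the query URL quoted in the bug description could be built with the extra size parameter. The endpoint and the other parameters are copied from that URL; the function name `buildQueryUrl` is hypothetical, and the actual script may assemble the request differently.

```typescript
// Hypothetical helper; endpoint and parameters are taken from the URL in the issue.
function buildQueryUrl(extensionId: string, size = 5): string {
    const url = new URL('https://open-vsx.org/api/v2/-/query');
    url.searchParams.set('extensionId', extensionId);
    url.searchParams.set('includeAllVersions', 'true');
    url.searchParams.set('targetPlatform', 'universal');
    url.searchParams.set('offset', '0');
    // Limit the page size so only the first few versions are returned.
    url.searchParams.set('size', String(size));
    return url.toString();
}

console.log(buildQueryUrl('vscode.go'));
```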

tsmaeder added a commit that referenced this issue Jul 12, 2024
Attempted mitigation for #13902 

- do not eat exceptions and properly log errors
- reduces request rate to 3/sec.

Contributed on behalf of STMicroelectronics

Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
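
As an illustration of the first bullet in the commit message, a sketch of logging failures instead of swallowing them might look like the following. It is not the code from the commit: `describeError`, `runAll`, and the message format are assumptions made for this example.

```typescript
// Hypothetical helpers for illustration only.

/** Node.js system errors such as ECONNABORTED carry a string `code`. */
interface ErrorWithCode extends Error {
    code?: string;
}

/** Formats an error with its context so it can be logged instead of dropped. */
function describeError(context: string, e: unknown): string {
    if (e instanceof Error) {
        const err: ErrorWithCode = e;
        const code = err.code ? ` [${err.code}]` : '';
        return `${context}: ${err.message}${code}`;
    }
    return `${context}: ${String(e)}`;
}

/** Runs `task` for every id, logs each failure, and returns the ids that failed. */
async function runAll(
    ids: string[],
    task: (id: string) => Promise<void>,
    log: (message: string) => void = message => console.error(message)
): Promise<string[]> {
    const failed: string[] = [];
    for (const id of ids) {
        try {
            await task(id);
        } catch (e) {
            log(describeError(id, e));
            failed.push(id);
        }
    }
    return failed;
}
```

Returning the failed ids lets the caller decide whether the overall job should fail, rather than hiding the problem.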
@tsmaeder
Contributor Author

The CI build jobs seem to be fixed after reducing the rate limit via CLI parameter to 3/sec. As for the 503s, there does seem to be a problem on the Open VSX side: EclipseFdn/open-vsx.org#2752
Closing this one.
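
For readers unfamiliar with client-side rate limiting, the idea behind a limit such as 3 requests/second can be sketched as evenly spaced send slots. The class below is a hypothetical illustration (`IntervalLimiter` and `reserve` are made-up names), not the mechanism the download script actually uses.

```typescript
// Hypothetical sketch: spaces requests evenly at `perSecond` requests per second.
class IntervalLimiter {
    private readonly intervalMs: number;
    private readonly now: () => number;
    private nextSlot = 0;

    constructor(perSecond: number, now: () => number = () => Date.now()) {
        this.intervalMs = 1000 / perSecond;
        this.now = now;
    }

    /** Reserves the next free slot and returns how many ms the caller should wait first. */
    reserve(): number {
        const t = this.now();
        const slot = Math.max(t, this.nextSlot);
        this.nextSlot = slot + this.intervalMs;
        return slot - t;
    }
}

// Usage: wait for the reserved slot before each request.
const limiter = new IntervalLimiter(3);
const waitMs = limiter.reserve(); // 0 for the first request
console.log(`wait ${waitMs} ms before sending`);
```

The clock is injectable so the spacing can be checked without real delays.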
