SLOW OR FAILED (500 ERROR) NODE.JS DOWNLOADS #4495
For some reason, iojs.org redirected to nodejs.org, causing errors with
I wonder if the site/Cloudflare are just getting overwhelmed with traffic from people downloading the latest release? Node v16.14.1 was released about an hour ago... #4494 / https://github.com/nodejs/node/releases/tag/v16.14.1 It's the latest LTS release, so basically everybody in the world looking for the default version of Node is downloading it uncached right now (other than whatever is apparently cached by Cloudflare). It should settle down in a few hours, or by tomorrow?? Note: I have nothing to do with Node's web servers or site, I'm just a regular person speculating at this point.
Closing as the issue isn't happening right now, but also pinging @nodejs/build in case there's anything to add here or something to be aware of.
People are still reporting this over in the OpenJS Slack, so I'm going to re-open this until that gets sorted.
I have CloudFlare LB error notices spanning from ~2:40 am UTC to ~3:40 am UTC, which I suppose correlates with the latest 16.x release: overloaded primary server, switching to the backup. Users may have got the 500 while hitting the primary before it switched. I don't have a good explanation beyond that; I'm not sure what caused the overload. I don't believe that server actually has trouble serving, but we have witnessed some weird I/O issues connected with rsync that maybe all happened at the same time. Perhaps during the next release someone should be on the server watching for weirdness.
This seems to be happening again; https://iojs.org/dist/index.json is 500ing, and v12.22.11 just went out. (after about 5-10 minutes, the 500s seemed to stop, but it's still worth looking into)
FWIW I am still seeing either "500 internal server error" or very slow file downloads (less than 50 kilobytes a second, often less than 10 kilobytes a second, especially at the beginning of a download). More details: The "slow download" symptom is less obvious for small files, because they complete quickly anyway. I have seen a tarball download start, take a long time (over a minute), and ultimately fail mid-way through. I hope that's useful for diagnosing the problem. (Results may vary across the globe, since the path through the CDN is probably not identical everywhere?) Edit to add: My experience is basically identical (as described in this comment) for nodejs.org/dist/ and iojs.org/dist, and identical today compared to 2 days ago.
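To put numbers on reports like the one above, a minimal measurement sketch; the release URL is only an example, and the `-w` write-out variables are standard curl ones:

```bash
# Report the HTTP status, average download speed, and total time for one fetch
# of a release tarball. Swap in whichever file is failing for you.
curl -o /dev/null -sS \
  -w 'status=%{http_code}  avg_speed=%{speed_download} bytes/s  total=%{time_total}s\n' \
  https://nodejs.org/dist/v16.14.1/node-v16.14.1-linux-x64.tar.gz
```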
After promoting 12.22.11 I went to rerun the same release script to promote 14.19.1 and it just hung during the bit when it fetches the promotable builds. Same behaviour from another machine on a different network. Weirdly I was able to ssh into the machine in an interactive session and manually run the command the promotion script was trying to run 😕.
I logged into the machine, ran
The first of those, running as root, is from the backup server -- I've left that alone. I killed the second one, which is the coverage data sync, and suddenly my other terminal running the promotion script got unstuck. That would suggest the coverage data syncing is partly, or wholly, responsible. In the past either @mhdawson or I would do an annoying manual clean-up of old coverage data to reduce the volume of stuff being synced, but I'm now going to recommend we turn off running coverage on Jenkins (and the associated data shuffling that populates coverage.nodejs.org) and switch exclusively to codecov.io.

Actually, as I typed the above paragraph my promotion run broke 😢 so there's still something up with the machine:
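For reference, a rough sketch of that triage using only standard ps/kill; which rsync is which has to be read off the listing, as described above:

```bash
# List the rsync processes with owner and elapsed time; the bracketed pattern
# keeps the grep itself out of the results.
ps -eo pid,user,etime,args | grep '[r]sync'

# Leave the backup-server sync (running as root) alone and kill only the
# coverage-data sync, using the PID read from the listing above.
# kill <coverage-rsync-pid>
```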
Currently there are no rsync processes -- but running the promotion script is "hung" again for me 😞.
I think I'm going to bounce nginx.
Same happening here. I know this type of comment isn't useful in most cases, but in this case it at least serves to show that there are users still being affected 😬 Edit: it worked for me now
I did
This is being reported again, probably due to an increase in traffic from the Node.js 12 release that went out earlier. Grafana for the main server (DO):
No idea if this is related - but wanted to report it here just in case. I've found the download server has been incredibly slow for the past 30 mins or so, with some requests hanging and never completing. I spotted this during some
Also seen in https://ci.nodejs.org/job/citgm-smoker-nobuild/1190/nodes=rhel8-s390x/console. I also tried locally and was getting 20-30 KB/s 🐢 . After ~30 mins or so new requests started completing as normal. I did have to abort the jobs in Jenkins to stop them hanging.
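One way to keep CI jobs from hanging forever on a stalled transfer, as happened above, is to make curl give up on its own; a sketch, assuming curl is available on the agent (the URL is only an example):

```bash
# Abort if the transfer runs slower than 10 KB/s for 30 seconds, or takes more
# than 10 minutes overall, instead of hanging until the job is killed.
NODE_TARBALL_URL=https://nodejs.org/dist/v16.14.2/node-v16.14.2-linux-x64.tar.gz  # example URL
curl -fLO --speed-limit 10240 --speed-time 30 --max-time 600 "$NODE_TARBALL_URL"
```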
I have a flurry of notifications from CF around 6 hours ago about timeouts. The interesting thing this time is that both servers are getting timeout notifications. I'm not sure what that means. We should probably do a test to push all traffic to the backup server and invalidate cache to see what sort of load it gets and whether it can handle it. It'd be surprising to me if it can't. This might be deeper, or could be a CF issue.
And having just written that, it's flapping again right now. I'm on the main server and it's pretty sluggish, load <3 though. What I can see happening in parallel is an

I've reniced the

Another thing I'm noticing as I tail the logs is that the Cloudflare-Traffic-Manager is getting really chatty. Each of their edge locations pings both servers to check their health, which means that as Cloudflare expands their network, and we maintain a fairly frequent health check, our servers are serving quite a bit of traffic even at idle. Nowhere near their capacity, but it does mean background traffic is quite high, and then cache invalidation shoots it right up.

So, I'm going to try something. Currently our health check is: every 30 seconds, timeout after 4 seconds, retry 3 times. I'm bumping it to: every 60 seconds, timeout after 8 seconds, retry 2 times. Anyone got a better suggestion?
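One way to sanity-check how chatty the health checks really are is to count them in the access log; a rough sketch, where the log path and the combined log format are assumptions, and the "Cloudflare-Traffic-Manager" user-agent substring is taken from the comment above:

```bash
LOG=/var/log/nginx/access.log   # assumed path

# Share of requests coming from the Cloudflare health checker.
total=$(wc -l < "$LOG")
checks=$(grep -c 'Cloudflare-Traffic-Manager' "$LOG")
echo "health checks: $checks of $total requests"

# Health-check requests per minute, bucketed on the [day/month/year:HH:MM]
# part of the timestamp.
grep 'Cloudflare-Traffic-Manager' "$LOG" \
  | awk -F'[][]' '{ print substr($2, 1, 17) }' \
  | sort | uniq -c | tail
```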
I seem to have run into this a few minutes ago. It's working again now.
Same; it was happening a bunch for me within the last hour.
unable to get to any of the documentation now
Seems to be resolved now, if having a comment like this with a timestamp helps anyone look through relevant logs or whatever.
@Trott a few links are still 500ing for me
It's interesting we did not get reports after 18.x was launched. I'd expect if it was driven by downloads we might have seen something yesterday.
It definitely seems more driven by the build/release process itself than by downloads later.
Problem still exists, downloading is too slow:
Hi. I'm getting 500s on https://nodejs.org/dist/v16.13.2/node-v16.13.2-win-x64.7z
We're seeing 500s on https://nodejs.org/dist/v18.17.1/node-v18.17.1-darwin-arm64.tar.gz
Seen on an Azure DevOps agent:
gyp ERR! stack Error: 500 status code downloading checksum
Please reopen as it still happens multiple times a day!
Same, still happening
We were facing this as well for quite some time. The only way we found to mitigate it is to use the cached Node.js that ships with the container :-/ Not the best thing in the world, but it is what it is. microsoft/fluentui#29552
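For CI that can't fall back to a bundled Node.js, a hedged workaround sketch: retry the download instead of failing on the first 500, then verify the checksum. It assumes curl 7.71+ (for --retry-all-errors), and the version and platform in the URLs are only examples:

```bash
VERSION=v16.16.0                      # example version
FILE=node-$VERSION-darwin-x64.tar.gz  # example platform

# Retry transient failures (including HTTP 500s) a few times before giving up.
curl -fLO --retry 5 --retry-delay 10 --retry-all-errors \
  "https://nodejs.org/dist/$VERSION/$FILE"
curl -fLO --retry 5 --retry-delay 10 --retry-all-errors \
  "https://nodejs.org/dist/$VERSION/SHASUMS256.txt"

# Verify the tarball against the published checksums (use sha256sum -c on Linux).
grep " $FILE\$" SHASUMS256.txt | shasum -a 256 -c -
```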
Have been facing this intermittently too: Downloading: https://nodejs.org/dist/v16.16.0/node-v16.16.0-win-x64.7z
and happening again :( Downloading: https://nodejs.org/dist/v16.16.0/node-v16.16.0-darwin-x64.tar.gz
This was happening continually today at https://nodejs.org/dist/v20.8.1/node-v20.8.1-darwin-arm64.tar.gz Any update on why this is closed?
This is closed because it's a known issue, and we're working on it.
I've been getting this on GitHub Actions (
I think it would better communicate your intent if you closed this issue after you have solved it.
Edited by the Node.js Website Team
Learn more about this incident at https://nodejs.org/en/blog/announcements/node-js-march-17-incident
tl;dr: The Node.js website team is aware of ongoing issues with intermittent download instability. More details: nodejs/build#1993 (comment)
Original Issue Below
When trying to get files off of nodejs.org/dist/... or nodejs.org/download/..., I get a server error (error page served by nginx).
Full error message page (HTML snippet)
Browsing around the dirs, like https://nodejs.org/dist/latest-v16.x/, seems to work. Also, downloading really small files such as https://nodejs.org/dist/latest-v16.x/SHASUMS256.txt seems to work sporadically, whereas downloading tarballs doesn't seem to work.
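The sporadic pattern described above (small files mostly fine, tarballs failing) can be observed with a quick loop; a sketch, where the tarball URL is just an example:

```bash
small=https://nodejs.org/dist/latest-v16.x/SHASUMS256.txt
big=https://nodejs.org/dist/v16.13.2/node-v16.13.2-linux-x64.tar.gz  # example tarball

# Hit each URL a few times and log the status code; a mix of 200s and 500s for
# the same URL points at something intermittent behind the CDN.
for url in "$small" "$big"; do
  for attempt in 1 2 3 4 5; do
    code=$(curl -o /dev/null -sS -w '%{http_code}' "$url")
    echo "$(date -u +%H:%M:%S)  $code  $url"
  done
done
```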
Given that the outage seems sporadic: Maybe it's a resource exhaustion issue over at the server? Running out of RAM or something?? I don't know.
Edit to add: The error message page seems to be served by Cloudflare (according to the `server: cloudflare` response header, when looking in browser dev tools). So I guess this is a Cloudflare issue? Actually that's probably not what that means.