stream: retry write on EPROTOTYPE on OS X #482

Closed

mscdex
Contributor

@mscdex mscdex commented Aug 16, 2015

At least on OS X 10.10 "Yosemite", an EPROTOTYPE can occur when trying to write to a stream that was abruptly closed. By retrying the write on EPROTOTYPE, we correctly get EPIPE.

Related to nodejs/node#2382

Also, this issue is documented elsewhere in the wild.

@mscdex
Contributor Author

mscdex commented Aug 16, 2015

FWIW I also tested this change on OS X 10.9 (where EPIPE was already occurring before the changes) and it was not negatively affected.

@saghul
Member

saghul commented Aug 17, 2015

I followed the links you mention and the change looks ok. If I'm reading the blog post correctly, this happens when the send is processed while the socket is being torn down, not after it was abruptly closed; can you please clarify that in the commit message?

Any chance you could write a test for this?

@bnoordhuis
Member

Can you add some in-source comments as well? It's a rather obscure bug, it's not obvious from just looking at the code what the EPROTOTYPE check is for.

At least on OS X 10.10 "Yosemite", an EPROTOTYPE can occur
when trying to write to a socket that is shutting down.
By retrying the write after EPROTOTYPE, we correctly get EPIPE.
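
For readers following along, the retry described in the commit message can be sketched roughly as below. This is a minimal illustration and not the patch itself: `write_retry_eprototype`, `write_fn` and the `fake_write` stand-in are hypothetical names, and retrying on EINTR as well is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Signature shared by write(2) and the stand-in below. */
typedef ssize_t (*write_fn)(int fd, const void* buf, size_t len);

/* Hypothetical helper, not libuv's code: retry while the write is
 * interrupted (EINTR) or fails with EPROTOTYPE. Nothing bounds the
 * loop if EPROTOTYPE never goes away. */
static ssize_t write_retry_eprototype(write_fn do_write, int fd,
                                      const void* buf, size_t len) {
  ssize_t n;

  do
    n = do_write(fd, buf, len);
  while (n == -1 && (errno == EINTR || errno == EPROTOTYPE));

  return n;
}

/* Stand-in for a socket that is being torn down: fails with
 * EPROTOTYPE `transient_failures` times, then with EPIPE. */
static int transient_failures;
static int attempts;

static ssize_t fake_write(int fd, const void* buf, size_t len) {
  (void) fd;
  (void) buf;
  (void) len;
  attempts++;
  errno = (transient_failures-- > 0) ? EPROTOTYPE : EPIPE;
  return -1;
}
```

With a real socket, `write` itself can be passed as `do_write`. With the stand-in, the caller only ever sees the final EPIPE, which is the behaviour the commit message describes.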
@mscdex
Contributor Author

mscdex commented Aug 17, 2015

@bnoordhuis Comments added.

@saghul To be honest, I'm not sure how to trigger this at the libuv layer.

@saghul
Member

saghul commented Aug 17, 2015

@mscdex From the blog post linked in the issue:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <strings.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <pthread.h>
#include <errno.h>
#include <signal.h>

int do_server() {
  int fd;
  struct sockaddr_in server_addr;

  fd = socket(AF_INET, SOCK_STREAM, 0);

  if (fd == -1) {
    perror("error socket server");
    exit(1);
  }

  bzero((char*) &server_addr, sizeof(server_addr));
  server_addr.sin_family = AF_INET;
  server_addr.sin_addr.s_addr = INADDR_ANY;
  server_addr.sin_port = htons(9600);

  if (bind(fd, (struct sockaddr *) &server_addr, sizeof(server_addr)) < 0) {
    perror("error binding");
    exit(1);
  }

  return fd;
}

void* do_child_thread(void* unused) {
  struct sockaddr_in client_addr;
  int fd;

  fd = socket(AF_INET, SOCK_STREAM, 0);

  if (fd == -1) {
    perror("error socket client");
    exit(1);
  }

  bzero((char*) &client_addr, sizeof(client_addr));
  client_addr.sin_family = AF_INET;
  client_addr.sin_addr.s_addr = INADDR_ANY;
  client_addr.sin_port = htons(9600);

  if (connect(fd, (struct sockaddr *) &client_addr, sizeof(client_addr)) < 0) {
    perror("error connect");
    exit(1);
  }

  fprintf(stderr, "closing client socket\n");

  if (close(fd) < 0) {
    perror("error close client socket");
    exit(1);
  }

  fprintf(stderr, "closed client socket\n");

  return NULL;
}

int main(int argc, char** argv) {
  int server_fd, client_fd;
  socklen_t client_len;
  struct sockaddr_in client_addr;
  char buf[] = { 'a', '\n' };
  pthread_t child_thread;
  int rc;

  signal(SIGPIPE, SIG_IGN);

  server_fd = do_server();
  rc = listen(server_fd, 5);
  if (rc < 0) {
    perror("error listen");
    return 1;
  }

  rc = pthread_create(&child_thread, NULL, do_child_thread, NULL);
  if (rc != 0) {
    /* pthread_create returns the error code instead of setting errno. */
    fprintf(stderr, "error pthread_create: %d\n", rc);
    return 1;
  }

  client_len = sizeof(client_addr);
  client_fd = accept(server_fd, (struct sockaddr *) &client_addr, &client_len);
  if (client_fd < 0) {
    perror("error accept");
    return 1;
  }

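  /* Keep sending until the write fails. EPIPE is the expected outcome once
   * the client has closed; any other error (such as EPROTOTYPE) is reported. */
  while (1) {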
    fprintf(stderr, "before send\n");
    rc = send(client_fd, buf, sizeof(buf), 0);
    fprintf(stderr, "after send: %d\n", rc);

    if (rc < 0) {
      if (errno == EPIPE) {
        break;
      } else {
        int so_type;
        socklen_t so_len = sizeof(so_type);
        getsockopt(client_fd, SOL_SOCKET, SO_TYPE, &so_type, &so_len);
        fprintf(stderr, "type: %d %d\n", so_type, SOCK_STREAM);

        perror("error send");
        return 1;
      }
    }
  }

  fprintf(stderr, "before server closing client fd\n");
  if (close(client_fd) < 0) {
    perror("error close client");
    return 1;
  }
  fprintf(stderr, "after server closing client fd\n");


  fprintf(stderr, "before server closing fd\n");
  if (close(server_fd) < 0) {
    perror("error close server");
    return 1;
  }
  fprintf(stderr, "after server closing fd\n");

  rc = pthread_join(child_thread, NULL);
  if (rc != 0 && rc != ESRCH) {
    fprintf(stderr, "error pthread_join: %d\n", rc);
    return 1;
  }

  return 0;
}

Maybe you can try to port this one to libuv.

@mscdex
Contributor Author

mscdex commented Aug 18, 2015

@saghul Right, but I couldn't get the timing right when I tried my hand at it. I'm not very familiar with the libuv internals. I always get EBADF or EAGAIN, never EPROTOTYPE.

@saghul
Member

saghul commented Aug 18, 2015

Since it seems inherently racy, maybe we could let this one slide :-) @bnoordhuis WDYT?

@bnoordhuis
Member

I guess so. LGTM.

saghul pushed a commit that referenced this pull request Aug 19, 2015
At least on OS X 10.10 "Yosemite", an EPROTOTYPE can occur
when trying to write to a socket that is shutting down.
By retrying the write after EPROTOTYPE, we correctly get EPIPE.

PR-URL: #482
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
@saghul
Member

saghul commented Aug 19, 2015

Thanks @mscdex, landed in 4069613 👍

@jvimal-eg

jvimal-eg commented Jan 2, 2022

Hey all, I think this workaround might be the cause of a bigger issue on newer versions of macOS (I've tested it on Big Sur and Monterey). If a certain kind of network extension (AppProxy or TransparentProxy extension) is enabled, any sockets that were opened prior to the extension becoming active will return an EPROTOTYPE error that is not transient.

See https://developer.apple.com/forums/thread/681135 for more info. This is not a libuv specific issue -- see the ssh fix committed upstream referenced in the above post.

This affects Electron based apps such as Slack and Signal, and it's pretty easily reproducible. With a flow-based VPN, merely starting/stopping an audio call on Slack, or just starting the network extension after Signal has started, will cause a 100% CPU usage pattern. On Signal, the activity monitor and CPU profile clearly show most of the time is spent inside libuv / uv_write2. Running dtruss on the process shows a large number of write() syscalls failing with errno=41 (EPROTOTYPE).

For Slack, however, I haven't been able to get a clean indication from dtruss / Instruments as to why it's spinning. It's more of a conjecture that the two issues are related.

Should the fix in this PR at least be conditional on the OS X version? Looking at the comment tokio-rs/mio#1364 (comment), newer macOS versions might not be hit by this issue.

Since documentation is very scarce on this, and this fix has been copied onto other projects (Ruby, dotnet/runtime, etc.), I believe there should at least be a guard against a potential infinite loop here. :) What do you all think?
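
As a rough illustration of what gating on the OS version could look like, here is a minimal sketch that checks for the Darwin 14.x kernel, which is the kernel OS X 10.10 "Yosemite" shipped with. `is_yosemite_kernel` and `running_on_yosemite` are hypothetical names. Whether the transient error is confined to that release is an open question in this thread, so the check itself is an assumption.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/utsname.h>

/* Returns 1 when the kernel identifies as Darwin 14.x (the kernel
 * shipped with OS X 10.10 "Yosemite"), 0 otherwise. */
static int is_yosemite_kernel(const char* sysname, const char* release) {
  char* end;
  long major;

  if (strcmp(sysname, "Darwin") != 0)
    return 0;

  major = strtol(release, &end, 10);
  if (end == release || *end != '.')
    return 0;  /* Not in the expected "major.minor.patch" form. */

  return major == 14;
}

/* Hypothetical runtime check a caller could use to gate the retry. */
static int running_on_yosemite(void) {
  struct utsname u;

  if (uname(&u) != 0)
    return 0;

  return is_yosemite_kernel(u.sysname, u.release);
}
```

A runtime check is used here rather than a compile-time macro because a binary built on one release can run on another.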

@bnoordhuis
Member

macOS never ceases to be a clusterfuck of software bugs... For a company with the resources of Apple you'd think they could at least keep something as basic as an operating system in working shape.

Let me summarize and see if my understanding is correct:

  1. Some macOS versions return a transient EPROTOTYPE error on teardown, which libuv handles

  2. Said bug may or may not have been fixed but we don't know when / in what version

  3. Some software causes permanent EPROTOTYPE errors. Unknown when that bug was introduced

Is that an accurate summary?

@jvimal-eg

@bnoordhuis I believe that's accurate based on my reading so far. Re (3): I would reword that as "If some OS features are turned on, write() calls on sockets that were connected before the OS feature was enabled return EPROTOTYPE errors."

That said, while I share your frustration with Apple's lack of documentation and transparency about these bugs, the workaround in this PR could have been better in the sense that it could have guarded against the infinite loop, rather than make a strong assumption about the (undocumented) transient nature of the EPROTOTYPE error. But I see the tension here, it's unclear how many retries are needed before we give up and bubble up the error! :)

@stephentoub was spot on with his review comment on dotnet/corefx repository: https://github.com/dotnet/corefx/pull/37208/files/116dd13f10017f2991aefa9ad21273395526f1ff?w=1#diff-6c0d8c7b971aa7240f2b70aeb9fb718ed207b0fee11b8bb4c2bb95496a818187.

@bnoordhuis
Member

> it's unclear how many retries are needed before we give up and bubble up the error

Indeed, that turns it into a flaky bug and that's arguably worse.

I'm unsure what the best way forward is. Give up after x amount of wall clock time has passed? That fixes the infinite loop but introduces a throughput choke point.

@jvimal-eg

@bnoordhuis Some ideas:

  1. Keep the infinite loop, but only on OS X Yosemite 10.10.*, where the bug was first observed.
  2. Have a loop that gives up after a few seconds of wall clock time, and returns either EPIPE or maybe just EPROTOTYPE to the caller.
  3. Wait for Apple to fix it (I highly doubt they will, or that a fix would arrive in time!). Perhaps the fix is to not return EPROTOTYPE (such a strange choice of error).

It might not be a throughput choke point because the error seems to indicate that the socket must be closed and reopened, so it's better to bubble this error up... After reopening the socket, things ought to succeed. This typically corresponds to VPN enable/disable system events that usually disrupt live connections anyway.

@bnoordhuis
Member

@libuv/collaborators This needs your input, see ^. I'm inclined to just remove the workaround and see what happens.

bnoordhuis added a commit to bnoordhuis/libuv that referenced this pull request Jan 5, 2022
It's been reported in the past that OS X 10.10, because of a race
condition in the XNU kernel, sometimes returns a transient EPROTOTYPE
error when trying to write to a socket. Libuv handles that by retrying
the operation until it succeeds or fails with a different error.

Recently it's been reported that current versions of the operating
system formerly known as OS X fail permanently with EPROTOTYPE under
certain conditions, resulting in an infinite loop.

Because Apple isn't exactly forthcoming with bug fixes or even details,
I'm opting to simply remove the workaround and have the error bubble up.

Refs: libuv#482
@bnoordhuis
Member

I gave in to my inclination and opened #3405.

bnoordhuis added a commit to bnoordhuis/libuv that referenced this pull request Jan 5, 2022
We can't realistically claim to support 10.7 or any version that Apple
no longer supports so let's bump the baseline to something more
realistic.

Refs: libuv#482
Refs: libuv#3405
bnoordhuis added a commit that referenced this pull request Jan 9, 2022
It's been reported in the past that OS X 10.10, because of a race
condition in the XNU kernel, sometimes returns a transient EPROTOTYPE
error when trying to write to a socket. Libuv handles that by retrying
the operation until it succeeds or fails with a different error.

Recently it's been reported that current versions of the operating
system formerly known as OS X fail permanently with EPROTOTYPE under
certain conditions, resulting in an infinite loop.

Because Apple isn't exactly forthcoming with bug fixes or even details,
I'm opting to simply remove the workaround and have the error bubble up.

Refs: #482
bnoordhuis added a commit to bnoordhuis/libuv that referenced this pull request Jan 12, 2022
macOS versions 10.10 and 10.15 - and presumably 10.11 to 10.14, too -
have a bug where a race condition causes the kernel to return EPROTOTYPE
because the socket isn't fully constructed.

It's probably the result of the peer closing the connection and that is
why libuv translates it to ECONNRESET.

Previously, libuv retried until the EPROTOTYPE error went away but some
VPN software causes the same behavior except the error is permanent, not
transient, turning the retry mechanism into an infinite loop.

Refs: libuv#482
Refs: libuv#3405
@bnoordhuis
Member

I reproduced the EPROTOTYPE on macOS 10.15 today:

not ok 312 - tcp_try_write_error
2: # exit code 6
2: # Output from process `tcp_try_write_error`:
2: # uv_try_write error: -41 protocol wrong type for socket
2: # Assertion failed in /Users/runner/work/libuv/libuv/test/test-tcp-try-write-error.c on line 51: r == UV_EPIPE || r == UV_ECONNABORTED || r == UV_ECONNRESET

(error code -41 is EPROTOTYPE negated, that's how libuv passes around errors.)

I've opened #3413 to map the error to ECONNRESET, which is an expected error and probably appropriate given the circumstances.
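
The mapping can be pictured with a small sketch like the one below. It is only an illustration of the idea, not the code in #3413: `write_error_code` is a hypothetical name, and the sketch applies the translation unconditionally rather than limiting it to macOS.

```c
#include <errno.h>

/* Hypothetical helper: turn the errno of a failed write into a
 * negated, libuv-style error code, reporting EPROTOTYPE as
 * ECONNRESET so callers see an error they already expect. */
static int write_error_code(int err) {
  if (err == EPROTOTYPE)
    err = ECONNRESET;

  return -err;
}
```

A caller would invoke it right after the failed write, for example `return write_error_code(errno);`, before anything else has a chance to overwrite `errno`.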

bnoordhuis added a commit that referenced this pull request Jan 12, 2022
macOS versions 10.10 and 10.15 - and presumably 10.11 to 10.14, too -
have a bug where a race condition causes the kernel to return EPROTOTYPE
because the socket isn't fully constructed.

It's probably the result of the peer closing the connection and that is
why libuv translates it to ECONNRESET.

Previously, libuv retried until the EPROTOTYPE error went away but some
VPN software causes the same behavior except the error is permanent, not
transient, turning the retry mechanism into an infinite loop.

Refs: #482
Refs: #3405
bnoordhuis added a commit that referenced this pull request Feb 8, 2022
We can't realistically claim to support 10.7 or any version that Apple
no longer supports so let's bump the baseline to something more
realistic.

Refs: #482
Refs: #3405
nodejs-github-bot pushed a commit to nodejs/node that referenced this pull request Jul 25, 2022
Original commit log follows:

darwin: remove EPROTOTYPE error workaround (libuv/libuv#3405)

It's been reported in the past that OS X 10.10, because of a race
condition in the XNU kernel, sometimes returns a transient EPROTOTYPE
error when trying to write to a socket. Libuv handles that by retrying
the operation until it succeeds or fails with a different error.

Recently it's been reported that current versions of the operating
system formerly known as OS X fail permanently with EPROTOTYPE under
certain conditions, resulting in an infinite loop.

Because Apple isn't exactly forthcoming with bug fixes or even details,
I'm opting to simply remove the workaround and have the error bubble up.

Refs: libuv/libuv#482
Fixes: #43916

PR-URL: #43950
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
nodejs-github-bot pushed a commit to nodejs/node that referenced this pull request Jul 25, 2022
Original commit log follows:

darwin: translate EPROTOTYPE to ECONNRESET (libuv/libuv#3413)

macOS versions 10.10 and 10.15 - and presumably 10.11 to 10.14, too -
have a bug where a race condition causes the kernel to return EPROTOTYPE
because the socket isn't fully constructed.

It's probably the result of the peer closing the connection and that is
why libuv translates it to ECONNRESET.

Previously, libuv retried until the EPROTOTYPE error went away but some
VPN software causes the same behavior except the error is permanent, not
transient, turning the retry mechanism into an infinite loop.

Refs: libuv/libuv#482
Refs: libuv/libuv#3405
Fixes: #43916

PR-URL: #43950
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>