Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

retry on cancelling of getaddrinfo #11037

Merged
merged 1 commit into from
Jun 21, 2024
Merged

Conversation

ko1
Copy link
Contributor

@ko1 ko1 commented Jun 21, 2024

When the registerred unblock function is called, it should retry the cancelled blocking function if possible after checkints.

For example, SIGCHLD can cancel this method, but it should not raise any exception if there is no trap handlers.

The following is repro-code:

require 'socket'
PN = 10_000

1000000.times{
  p _1
  PN.times{
    fork{
      sleep rand(0.3)
    }
  }
  i = 0
  while i<PN
    cpid = Process.wait -1, Process::WNOHANG
    if cpid
      # p [i, cpid]
      i += 1
    end

    begin
      TCPServer.new(nil, 0).close
    rescue
      p $!
      exit!
    end
  end
}

When the registerred unblock function is called, it should retry
the cancelled blocking function if possible after checkints.

For example, `SIGCHLD` can cancel this method, but it should not
raise any exception if there is no trap handlers.

The following is repro-code:

```ruby
require 'socket'
PN = 10_000

1000000.times{
  p _1
  PN.times{
    fork{
      sleep rand(0.3)
    }
  }
  i = 0
  while i<PN
    cpid = Process.wait -1, Process::WNOHANG
    if cpid
      # p [i, cpid]
      i += 1
    end

    begin
      TCPServer.new(nil, 0).close
    rescue
      p $!
      exit!
    end
  end
}
```
@ko1 ko1 enabled auto-merge (rebase) June 21, 2024 12:59
@ko1
Copy link
Contributor Author

ko1 commented Jun 21, 2024

For example,

trap(:USR1){ task }
blocking_operation

In this case, task will be invoked by interrupting blocking_operation (called by rb_thread_check_ints()) when SIGUSR1 is received. However user doesn't expect that blocking_operation is cancelled by SIGUSR1. So blocking_operation should be rerun after rb_thread_check_ints() is called if it is retriable.

BTW SIGINT or Thread#raise raise an exception at rb_thread_check_ints(), so it can be no problem without retrying. So this issue is easy to overlook.

@@ -511,7 +511,7 @@ rb_getaddrinfo(const char *hostp, const char *portp, const struct addrinfo *hint
if (err == 0) *ai = arg->ai;
}
else if (arg->cancelled) {
err = EAI_AGAIN;
retry = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
retry = 1;
// If canceled due to signal delivery, we need to retry the process after the trap handler is executed.
retry = 1;

@@ -734,7 +734,7 @@ rb_getnameinfo(const struct sockaddr *sa, socklen_t salen,
}
}
else if (arg->cancelled) {
err = EAI_AGAIN;
retry = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
retry = 1;
// If canceled due to signal delivery, we need to retry the process after the trap handler is executed.
retry = 1;

@ko1 ko1 merged commit bd583ca into ruby:master Jun 21, 2024
107 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants