
Ruby: Provide reset mechanism for low-level resources #8798

Open
blowmage opened this issue Nov 19, 2016 · 30 comments · Fixed by #33430

Comments

@blowmage
Contributor

Please add a mechanism to be called on the Ruby GRPC client to reset low-level resources. This will be useful when the Ruby GRPC client is in a process that has been forked and resources need to be reset. This will help when Ruby GRPC is used in Rails apps using Puma or Unicorn, which regularly fork processes.

In #7951 @soltanmm-google said:

Forking processes and using gRPC across processes is not supported behavior due to very low-level resource issues. Short of having a way to introspect the kernel resources we're using and micromanaging them from some wrapped call to fork in a system-dependent way (which... just... no, heck no) this isn't tractable.

The Ruby GRPC client does not need to micromanage kernel resources or detect on its own that its process has been forked; the user can invoke the reset on process fork. All we need is a mechanism to do so.

@murgatroid99
Member

@ctiller @nicolasnoble What would we need to do to create this?

@soltanmm-google
Contributor

soltanmm-google commented Dec 13, 2016

If Ruby provides a way to just drop the module s.t. it's entirely unloaded, and if Ruby's wrapper calls grpc_shutdown deterministically on module clean-up, then that should be sufficient, and gRPC doesn't need to do anything (or really even should), right?

And that isn't something gRPC should provide as an explicit capability if the language supports it.

@jrun

jrun commented Feb 2, 2017

Is there anything I can do to help move this issue forward? This issue prevents me from using the grpc-dependent google-cloud-ruby libraries in my internal services.

@apolcyn
Contributor

apolcyn commented Feb 2, 2017

@jrun this is under discussion but there are some difficulties around it.

But getting some more specific use cases and problems would be helpful. For example, do you need "pre-fork" and "post-fork" hooks to reset an active gRPC library, or would deferred library startup be sufficient?

@jrun

jrun commented Feb 2, 2017

Thanks for following up. In our specific use case a "post-fork" callback is needed to reset active gRPC connections. Unfortunately a deferred library startup isn’t sufficient.

Our backend Ruby services use a prefork model. A post-fork callback is used to re-establish network connections (e.g. database, messaging, caching services) in the forked process. A deferred library startup isn’t sufficient because the master process often needs to access the same APIs as the child processes.

For example, the master may initially read its configuration from Cloud Datastore to determine the set of child workers to spawn. The child workers may then need to read from and write to Cloud Datastore as part of their operation.

Another example is Stackdriver Error Reporting. The master and child workers need to be able to write to the Error Reporting service.

@jrun

jrun commented Feb 2, 2017

I want to clarify one point in an effort to avoid any miscommunication. We don't need the gRPC library to be provided a "post-fork" callback. We need a method to call that re-establishes the underlying gRPC connections which will be called from our own "post-fork" callback.

@jrun

jrun commented Feb 6, 2017

@apolcyn Is there anything additional I can provide? Does the use case I provide make sense?

@apolcyn
Contributor

apolcyn commented Feb 6, 2017

@jrun thanks for the data point, this is helpful and makes sense. AFAICS, supporting the case described here will require some complicated changes to the core C library that grpc-ruby wraps. But this is important, so I'm taking a look at how feasible it is.

@Gubbi

Gubbi commented Aug 2, 2017

@apolcyn Is there a temporary workaround recommended until this is fixed at the C-library level?

@apolcyn
Contributor

apolcyn commented Aug 2, 2017

@Gubbi avoiding use of the grpc library in the parent process before forking is the best thing to do, AFAIK. Note that since the change in #10670, the library won't initialize until the first grpc object (e.g. a channel/stub or server) is created.
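A pure-Ruby sketch of that workaround: wrap stub creation so nothing gRPC-related is built until first use, and make sure first use happens only inside the forked child. `LazyStub` is an illustrative name, and the builder block stands in for creating a real stub (e.g. `Helloworld::Greeter::Stub.new(...)`), which is what would trigger gRPC initialization.

```ruby
# Defer gRPC object creation until after fork, so the parent process
# never initializes the C-core before forking.
class LazyStub
  def initialize(&builder)
    @builder = builder
    @stub = nil
  end

  # First call runs the builder (with real gRPC, this initializes the
  # library); later calls in the same process reuse the result.
  def stub
    @stub ||= @builder.call
  end
end

builds = 0
lazy = LazyStub.new { builds += 1; :fake_stub }

# Parent: no gRPC work has happened yet, so forking is safe.
pid = Process.fork { lazy.stub } # the child builds its own stub after fork
Process.wait(pid)

lazy.stub # the parent's first use builds an independent copy
lazy.stub # reused; the builder is not called again
puts builds # => 1 (the child's build happened in its own address space)
```

The key property is that each process that calls `stub` first does so in its own post-fork state, so no process inherits another's initialized gRPC internals.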

@ctiller ctiller removed their assignment Dec 13, 2017
@ebenoist

ebenoist commented Apr 4, 2018

Having an explicit reset or shutdown and start hook would make this issue much simpler to deal with. Is it not possible to stop the underlying event loop and call the grpc_init() function again? Would exposing grpc_rb_shutdown and grpc_ruby_once_init_internal allow the underlying library to be reset?

@apolcyn
Contributor

apolcyn commented Apr 5, 2018

@ebenoist something along the lines of what you described should be possible now, and it's actually something I've been meaning to do. What I'm thinking is we can expose global "before fork" and "after fork" hooks which applications are responsible for calling before and after forking, and which will themselves basically shut down and restart the grpc library. We can probably get such an API available within the next couple of releases (not the soon-to-come 1.11 release, but probably in the 1.12 or 1.13 releases).

@ebenoist

ebenoist commented Apr 5, 2018

@apolcyn Thank you so much for your prompt response. That API makes a lot of sense to me and I'd be eager to test it out for you folks. Please let me know if there is anything I can do to help.

@stale

stale bot commented Sep 5, 2019

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 180 days. It will be closed automatically if no further update occurs in 1 day. Thank you for your contributions!

@blowmage
Contributor Author

blowmage commented Sep 5, 2019

This issue is still outstanding, AFAIK. We still want to be able to reset the Ruby GRPC client's low-level resource after a fork.

@stale stale bot removed the disposition/stale label Sep 5, 2019
@jeremywadsack

jeremywadsack commented Jan 3, 2020

This affects us as well, using Google Cloud Monitoring (Stackdriver). I am trying to write metrics from background Resque jobs. Resque forks from the main process for each job, and the jobs all fail with this:

grpc cannot be used before and after forking

Using grpc 1.25.0.

I've looked through the discussions here and I'm not sure that the "fix" in #16332 (https://github.com/grpc/grpc/pull/16332/files#diff-40f6e37e5d9670d49001d5551bc9da82R275) is correct. It checks whether the PID has changed, which happens when the process forks.

If GRPC is now lazily loaded (I think googleapis/google-cloud-ruby#2917 (comment) says that), then shouldn't loading it after each fork work without hanging?

@blowmage
Contributor Author

blowmage commented May 6, 2020

This issue is still outstanding.

@stale stale bot removed the disposition/stale label May 6, 2020
@stale

stale bot commented Aug 5, 2020

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 days. Thank you for your contributions!

@anujbiyani

Copying @blowmage 's bump from earlier:

This issue is still outstanding, AFAIK. We still want to be able to reset the Ruby GRPC client's low-level resource after a fork.

@stale stale bot removed the disposition/stale label Aug 6, 2020
@stale

stale bot commented Nov 5, 2020

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 days. Thank you for your contributions!

@blowmage
Contributor Author

blowmage commented Nov 5, 2020

This issue is still outstanding, AFAIK. We still want to be able to reset the Ruby GRPC client's low-level resource after a fork.

@stale stale bot removed the disposition/stale label Nov 5, 2020
@fabirydel

Is there any update regarding this issue? I'm using Google Cloud Logging and absolutely must send logs from both the parent and child processes. As it stands, I can't use the stackdriver gem because of this issue. Has anyone come up with a workaround?

@jrun

jrun commented Dec 22, 2020

@fabirydel If your systems have google-fluentd installed, you can configure a forward source that listens on 127.0.0.1. The fluentd out_google_cloud output plugin documents what the JSON payload should be.

@ianks

ianks commented Mar 2, 2021

We are running into this as well. Really painful.

@stale

stale bot commented Jun 2, 2021

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 days. Thank you for your contributions!

@matthewford

bump

@syed-mohsin

bump :)

@apolcyn
Contributor

apolcyn commented Jul 7, 2023

FYI there is work in progress on this in #33430 (should get into the 1.57 release)

apolcyn added a commit that referenced this issue Jul 10, 2023
Adds experimental fork support to gRPC/Ruby

Works towards #8798 (see caveats for why this wasn't marked fixed yet)
Works towards #33578 (see caveats for why this wasn't marked fixed yet)

This leverages existing `pthread_atfork` based C-core support for
forking that python/php use, but there's a bit extra involved mainly
because gRPC/Ruby has additional background threads.

New tests under `src/ruby/end2end` show example usage.

Based on #33495

Caveats:
- Bidi streams are not yet supported (bidi streams spawn background
threads which are not yet fork safe)
- Servers not supported
- Only linux supported
@apolcyn apolcyn reopened this Jul 10, 2023
@apolcyn
Contributor

apolcyn commented Jul 10, 2023

Reopening because #33430 isn't a complete fix (forking with bidi streams is not yet supported with that, in particular)
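For reference, usage of the experimental fork support from #33430 looks roughly like the following. This is a sketch based on the PR description: the hook names (`GRPC.prefork`, `GRPC.postfork_parent`, `GRPC.postfork_child`) and the `GRPC_ENABLE_FORK_SUPPORT` environment variable are taken from that work and should be treated as illustrative; the helpers degrade to a plain fork when the gem or the feature is unavailable, so the sketch still runs anywhere fork exists.

```ruby
# Experimental gRPC/Ruby fork support (grpc >= 1.57, Linux only,
# opt-in via GRPC_ENABLE_FORK_SUPPORT=1). Hook names come from #33430.
begin
  require 'grpc'
rescue LoadError
  # grpc gem unavailable; fall through and skip the hooks below
end

def grpc_fork_hooks?
  ENV['GRPC_ENABLE_FORK_SUPPORT'] == '1' &&
    defined?(GRPC) && GRPC.respond_to?(:prefork)
end

def fork_with_grpc_hooks
  GRPC.prefork if grpc_fork_hooks?          # quiesce background threads pre-fork
  pid = Process.fork do
    GRPC.postfork_child if grpc_fork_hooks? # restart gRPC state in the child
    yield
  end
  GRPC.postfork_parent if grpc_fork_hooks?  # restart gRPC state in the parent
  pid
end

pid = fork_with_grpc_hooks { puts "child #{Process.pid} may now create stubs" }
Process.wait(pid)
puts "parent #{Process.pid} may keep using its channels"
```

Per the caveats above, this should not yet be relied on with bidi streams or servers.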

mario-vimal pushed a commit to mario-vimal/grpc that referenced this issue Jul 13, 2023