@carlpett
Contributor

@carlpett carlpett commented Apr 5, 2021

Change Description

This PR adds support for running with Type=notify under systemd. The benefit is that service startup is not marked as successful until the readiness signal has been sent, meaning other services with appropriate Requires and After directives won't try to start until the proxy is ready to handle connections.

Checklist

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea. (Kinda? There already was one...)
  • Ensure the tests and linter pass
  • Appropriate documentation is updated (if necessary)

Relevant issues:

@google-cla

google-cla bot commented Apr 5, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Apr 5, 2021

logging.Infof("Ready for new connections")

go func() {
Contributor Author

I debated this a bit, but ended up running the notifies in separate goroutines to guard against any potential weirdness. I'd gladly discuss whether this is a good idea or not.
The call dials a unix socket (but only if NOTIFY_SOCKET is set), and I'm wary that under the right set of unfortunate circumstances that could hang. But I don't know exactly what those circumstances would be (maybe if systemd itself is hanging)

Contributor

Do you have any idea what the impact on systems without systemd might be?

Is this actually ready at this point in time? Or does it need to wait for proxyClient.Run() to be called below?

Contributor

Looking at the docs, looks like this only kicks in if $NOTIFY_SOCKET is set (which you referred to above). So this shouldn't have any effect on systems without systemd present? Have you done any testing?

Contributor Author

Yes, exactly, the code bails early if the environment variable isn't set. So non-systemd users will not get any change in behaviour unless they set that themselves.

Is this actually ready at this point in time?

Probably not. You are right that it should hold until after proxyClient.Run, at least. In my testing the connection was set up fast enough that I didn't notice the log statement above my addition was a bit premature.

But yes, I've done testing, and apart from that embarrassing part, the happy cases work well. I'm not sure what the less happy ones would be, though. From digging a bit, it looks like even proxyClient.Run might not be entirely correct, since it just kicks off the connections starting?

Contributor

I think we're actually okay, because we start receiving connections once connSrc is initialized, even if they aren't handled until Run() is called. However, it's probably logically better to move this down closer to .Run() and add a comment, to help minimize the chance of a potentially long-running step sneaking in between the two sections.

@carlpett
Contributor Author

carlpett commented Apr 5, 2021

I'm not sure why the CLA doesn't work. I'm fairly sure I've signed it previously, but now it appears to try to match it up against my corporate identity when I open the link? (Not US-based, and my employer does not have rights over my spare time contributions)

Contributor

@kurtisvg kurtisvg left a comment

We may want to consider putting it behind a flag as well - users without systemd might be


@kurtisvg
Contributor

kurtisvg commented Apr 5, 2021

@carlpett also RE: CLA bot, it uses the commit email. Is that address correct for the CLA?

@carlpett
Contributor Author

carlpett commented Apr 5, 2021

Yes, should be. But signing in with my private account, it doesn't seem to find the old CLA, so 🤷 Re-signed it now.
@googlebot I signed it!

@google-cla google-cla bot added cla: yes and removed cla: no labels Apr 5, 2021
Contributor

@kurtisvg kurtisvg left a comment

LGTM. We are planning a release tomorrow, but we're gonna wait until it's complete before merging this in.

Also @enocom is gonna run a couple of smoke tests on:

  • linux w/ systemd
  • windows (w/o systemd)

@kurtisvg
Contributor

kurtisvg commented Apr 5, 2021

Also, this would have fixed #290, but I'm not sure it's addressing #234 as a whole. We should follow up on that before closing it.

@enocom
Member

enocom commented Apr 15, 2021

Sorry for the slow response. I'll be double-checking this today.

@enocom
Member

enocom commented Apr 15, 2021

I ran the following test to confirm that this PR works with systemd where Type=notify.

First I used the following unit file:

eno@scratch:~/cloudsql-proxy$ cat /etc/systemd/system/cloud-sql-proxy.service 
[Unit]
Description=GCP CloudSQL Proxy
After=network.target

[Service]
Type=notify
User=eno
Restart=always
ExecStart=/home/eno/cloudsql-proxy/cloud_sql_proxy -instances=PROJECT_ID_OMITTED:us-central1:postgres-instance=tcp:5432

[Install]
WantedBy=multi-user.target

And then I built the proxy from this PR and confirmed the service started:

eno@scratch:~$ sudo systemctl status cloud-sql-proxy
● cloud-sql-proxy.service - GCP CloudSQL Proxy
   Loaded: loaded (/etc/systemd/system/cloud-sql-proxy.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-04-15 17:33:28 UTC; 5min ago
 Main PID: 5273 (cloud_sql_proxy)
    Tasks: 7 (limit: 4665)
   Memory: 13.2M
   CGroup: /system.slice/cloud-sql-proxy.service
           └─5273 /home/eno/cloudsql-proxy/cloud_sql_proxy -instances=PROJECT_ID_OMITTED:us-central1:postgres-instance=tcp:5432

Apr 15 17:33:17 scratch systemd[1]: Starting GCP CloudSQL Proxy...
Apr 15 17:33:17 scratch cloud_sql_proxy[5273]: 2021/04/15 17:33:17 Rlimits for file descriptors set to {Current = 8500, Max = 524288}
Apr 15 17:33:18 scratch cloud_sql_proxy[5273]: 2021/04/15 17:33:18 Listening on 127.0.0.1:5432 for PROJECT_ID_OMITTED:us:us-central1:postgres-instance
Apr 15 17:33:18 scratch cloud_sql_proxy[5273]: 2021/04/15 17:33:18 Ready for new connections
Apr 15 17:33:28 scratch systemd[1]: Started GCP CloudSQL Proxy.

Next, I rebuilt the binary with the call to daemon.SdNotify omitted, and confirmed the service failed to start.

@enocom
Member

enocom commented Apr 15, 2021

I tried this change out on a Windows machine as well.

Here's what I did:

  1. Built the proxy from this PR
  2. Ran it
  3. Made sure I could connect to my instance as usual

All works as expected.

@enocom
Member

enocom commented Apr 15, 2021

As for #234, I think this PR won't entirely fix the issue and so we should keep the issue open after merging this.

@enocom enocom added the automerge Merge the pull request once unit tests and other checks pass. label Apr 15, 2021
@enocom
Member

enocom commented Apr 15, 2021

Thanks @carlpett for this PR. Looks great.

@gcf-merge-on-green
Contributor

Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot.

@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Apr 16, 2021

5 participants