Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow connection_timeout to be set for MySQL checks #2729

Merged
merged 2 commits into from Aug 8, 2016

Conversation

scottgeary
Copy link
Contributor

What does this PR do?

Allow a connection timeout to be used on when connecting to MySQL instances.

Motivation

When monitoring more than one instance of MySQL, if the host becomes unavailable, it'll block all the other MySQL instance checks. This block can be longer than the polling period, and you get false alerts and service checks.

Use case: Monitoring many RDS instances from a single host. A failover event occurs, causing connection issues to a single box temporarily. This connection hang (while the Master is being rotated), also blocks all the other monitored instances.

Testing Guidelines

Unsure how to a reasonable test case. Other than pull a remote host down, or disable networking.

Additional Notes

Defaults to the original no-timeout 'None' behaviour. When the original behaviour is used, the OS socket defaults on sock.settimeout() are used. Which could be a long time.

Also, read_timeout + write_timeout could also occur (once connected, and issuing SQL), but the bundled version of pymysql (v0.6.6) isn't new enough to contain these parameters. (v0.7.3 required).

Defaults to the original 'None' behaviour: falling back to OS socket defaults on sock.settimeout()
@@ -330,9 +330,10 @@ def _get_config(self, instance):
options = instance.get('options', {})
queries = instance.get('queries', [])
ssl = instance.get('ssl', {})
connect_timeout = instance.get('connect_timeout', None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should connect_timeout be cast as an int ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a question of what the default setting should be:

  • Setting it to None preserves the original check behaviour (leaving it to system socket defaults).
  • Explicitly setting it to 1 second changes the current behaviour, and I'm not sure that assuming 1 second would be appropriate for everyone.

The actual code that changes the socket timeout accepts any value types integer/float as well as None.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if we had to cast the instance.get('connect_timeout') to an int when it wasn't None, but it doesn't seem to be needed.

We have to keep the default to None for backward compatibility, even if it would probably better to set one at 5 seconds or so. Thanks for the details!

@degemer
Copy link
Member

degemer commented Aug 5, 2016

Thank you for the PR @scottgeary ! I added one comment, and I was also wondering why you didn't add the connect_timeout to pymysql.connect(read_default_file=defaults_file, ssl=ssl): is it already part of the config file ?

@degemer degemer merged commit 4cba70a into DataDog:master Aug 8, 2016
@degemer
Copy link
Member

degemer commented Aug 8, 2016

Thanks again for the PR, it's merged now and will be part of the next minor release!

scottgeary added a commit to vend/dd-agent that referenced this pull request Aug 9, 2016
* Allow connection_timeout to be set for pymysql instances

Defaults to the original 'None' behaviour: falling back to OS socket defaults on sock.settimeout()

* Pass in connect_timeout when using default file too
@truthbk truthbk added this to the 5.9.0 milestone Aug 10, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants