Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[forwarder] allow multiple endpoints/api_keys #2564

Merged
merged 7 commits into from Jun 15, 2016
Merged

Conversation

degemer
Copy link
Member

@degemer degemer commented Jun 1, 2016

[config] remove disable_dd CLI option

It simplifies the config logic.

[forwarder] allow multiple endpoints

In order to do so, the config was modified:
other_dd_urls and other_api_keys were added.
They both accept a a comma-separated "list", which is
transformed in a endpoints dict. (merged with dd_url and api_key)
This one has the following format:
{
'https://app.datadoghq.com': ['api_key_abc', 'api_key_def'],
'https://app.example.com': ['api_key_xyz']
}

The forwarder is the only one using this new configuration.
It loops through all endpoints and sends the same payload.

Config tests and update transaction tests.
Set default max_parallelism to 1 in transaction.py. (it's the real
default)

[forwarder] handle errors for each endpoint

Previously everything was dismissed except for the last result.
This commit duplicates payload for each endpoint, so that the transaction
manager handles them properly.

Another solution would have been to make the transaction aware of the errors
for each of the endpoints, but the Transaction Manager and the Transaction were
not created to do this, and thus is/would be a huge pain to update them.
(because of the async nature of the flush)

[forwarder] enable multiple parallel requests

Throttling is still there, so this should only improve the situation where
there is a long running request (especially for mutliple endpoints, if one of
them is timeouting, this should improve the situation of the other endpoint)

[forwarder] add a configurable timeout

Default timeout is 20s, same as before (it's the default timeout of
tornado).
It is now possible to configure it.

[forwarder] limit errors per endpoint per flush

When an endpoint fails 4 times, it is considered down (or at least
unreachable), and all transactions targetting this endpoint are dropped
from the current flush. (they are rescheduled for a future flush)

This also changes the way transactions are scheduled in each flush: new
transactions without any error will be tried first.

It also cleans up a little logs.

@degemer degemer added this to the 5.9.0 milestone Jun 1, 2016
@degemer degemer force-pushed the quentin/rainbow-emitter branch 2 times, most recently from 41ed060 to fa0f09f Compare June 1, 2016 22:26
@degemer degemer changed the title WIP - [forwarder] allow multiple endpoints/api_keys [forwarder] allow multiple endpoints/api_keys Jun 1, 2016
@degemer degemer force-pushed the quentin/rainbow-emitter branch 3 times, most recently from ee692f5 to 510307a Compare June 3, 2016 18:39
@gmmeyer gmmeyer self-assigned this Jun 14, 2016
@degemer degemer force-pushed the quentin/rainbow-emitter branch 2 times, most recently from bcefdc0 to 792dcef Compare June 15, 2016 17:29
@@ -88,6 +93,11 @@ use_mount: no
# histogram_aggregates: max, median, avg, count
# histogram_percentiles: 0.95

# Multiple endpoints/api_keys
Copy link
Contributor

@gmmeyer gmmeyer Jun 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I set a second endpoint and associate keys with it? (I can see how to do it in the tests, you just don't have it here.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it's just that our customers shouldn't be using it. But you're right, it should be specified anyway. I added it. ✅

@gmmeyer
Copy link
Contributor

gmmeyer commented Jun 15, 2016

I think this looks good, though we definitely want to test it extensively to be sure there's no memory issues. 👍

In order to do so, the config was modified:
other_dd_urls and other_api_keys were added.
They both accept a a comma-separated "list", which is
transformed in a endpoints dict. (merged with dd_url and api_key)
This one has the following format:
{
  'https://app.datadoghq.com': ['api_key_abc', 'api_key_def'],
  'https://app.example.com': ['api_key_xyz']
}

The forwarder is the only one using this new configuration.
It loops through all endpoints and sends the same payload.

[tests] multiple endpoints logic

Config tests and update transaction tests.
Set default max_parallelism to 1 in transaction.py. (it's the real
default)
Previously everything was dismissed except for the last result.
This commit duplicates payload for each endpoint, so that the transaction
manager handles them properly.

Another solution would have been to make the transaction aware of the errors
for each of the endpoints, but the Transaction Manager and the Transaction were
not created to do this, and thus is/would be a huge pain to update them.
(because of the async nature of the flush)
Throttling is still there, so this should only improve the situation where
there is a long running request (especially for mutliple endpoints, if one of
them is timeouting, this should improve the situation of the other endpoint)
This is only enabled if multiple endpoints are configured in the config.
Default timeout is 20s, same as before (it's the default timeout of
tornado).
It is now possible to configure it.
When an endpoint fails 4 times, it is considered down (or at least
unreachable), and all transactions targetting this endpoint are dropped
from the *current* flush. (they are rescheduled for a future flush)

This also changes the way transactions are scheduled in each flush: new
transactions without any error will be tried first.

It also cleans up a little logs.
Also add a multiple endpoints test.
@degemer
Copy link
Member Author

degemer commented Jun 15, 2016

Thanks for the review @gmmeyer! I'll be sure to double check this before releasing.

@degemer degemer merged commit f091c0d into master Jun 15, 2016
@degemer degemer deleted the quentin/rainbow-emitter branch June 15, 2016 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants