Startup checks for sources and sinks #67

lukesteensen · 2019-01-04T03:10:21Z

We should allow sources and sinks to run a health check on their dependencies at startup to avoid situations where we boot up and then immediately fail or go into a surprising retry loop. This could potentially tie into #66.

One thing to keep in mind is that there may be cases (e.g. intermittent connectivity issues or recovering after an incident) where starting up in spite of certain types of issues is desirable. A simple solution would be a flag to skip these checks, but a better (and more difficult) one would be some way of differentiating between errors that can be retried and those that are fatal.

michaelfairley · 2019-01-15T16:27:36Z

"but a better (and more difficult) one would be some way of differentiating between errors that can be retried and those that are fatal."

I don't think this is possible. For instance, a connection refused from a Splunk HEC sink could mean either that the HEC server is down (a temporary error) or that the URL for the HEC server in the config file is wrong (a permanent error). I think we'll want a subcommand that validates the config file and runs the healthchecks (but doesn't actually start the server) as a tool to use while getting router set up. For a smoothly running router setup, though, the healthchecks are informational rather than something that affects how the router operates.
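A rough sketch of that split, for illustration only: healthcheck failures are always logged, and they only block startup when the operator opts in. The `CheckResult` type, the `evaluate` function, and the `require_healthy` parameter are hypothetical names, not anything that exists in the codebase.

```rust
/// Outcome of one sink healthcheck. Hypothetical type for this sketch.
pub struct CheckResult {
    pub sink: String,
    pub outcome: Result<(), String>,
}

/// What the process should do once all healthchecks have run.
#[derive(Debug, PartialEq, Eq)]
pub enum StartupDecision {
    Proceed,
    Abort,
}

/// Report every failed healthcheck, then decide whether to start.
///
/// `require_healthy` stands in for an opt-in command line flag (assumed
/// name). When it is false, failures are informational only.
pub fn evaluate(results: &[CheckResult], require_healthy: bool) -> StartupDecision {
    let mut any_unhealthy = false;
    for result in results {
        if let Err(reason) = &result.outcome {
            // Always surface the failure, even when it does not block startup.
            eprintln!("healthcheck failed for sink {}: {}", result.sink, reason);
            any_unhealthy = true;
        }
    }

    if any_unhealthy && require_healthy {
        StartupDecision::Abort
    } else {
        StartupDecision::Proceed
    }
}
```

With this shape, a validation subcommand could call `evaluate` with `require_healthy` set to true and exit without starting anything, while a normal run could leave it false and just log.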

lukesteensen · 2019-01-17T17:04:12Z

Yeah, that's probably true. You'd have to treat every error as potentially permanent and then mark specific cases as retriable. Even that would probably be very limited because of cases like the one you mentioned, where the same error could be either.
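To make the "fatal by default, retriable by exception" idea concrete, here is a minimal sketch using `std::io::ErrorKind`. The `Severity` enum and `classify` function are made up for illustration, and the particular kinds marked retriable are an assumption, not a vetted list.

```rust
use std::io::{Error, ErrorKind};

/// How a startup check failure should be treated. Hypothetical type.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Severity {
    Retriable,
    Fatal,
}

/// Classify an I/O error, treating everything as fatal unless it is
/// explicitly listed as retriable.
pub fn classify(err: &Error) -> Severity {
    match err.kind() {
        // Assumed to be transient in most setups.
        ErrorKind::TimedOut | ErrorKind::Interrupted | ErrorKind::WouldBlock => {
            Severity::Retriable
        }
        // Everything else stays fatal, including ConnectionRefused, which
        // is ambiguous: the server may be down or the URL may be wrong.
        _ => Severity::Fatal,
    }
}
```

The connection refused case from above lands in the fatal bucket here, which shows how little the allowlist ends up covering in practice.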

michaelfairley self-assigned this Jan 10, 2019

michaelfairley mentioned this issue Jan 15, 2019

Add sink healthchecks and command line flag to shut down if any of them are unhealthy #69

Merged

michaelfairley closed this as completed in #69 Jan 17, 2019

binarylogic added this to the 0.1 milestone Mar 20, 2019

syedriko referenced this issue in syedriko/vector Jun 13, 2022

Merge pull request #67 from vimalk78/fix-protoc

ea69c61

Fix PROTOC env variable value
