Handle validation failures on startup more gracefully #1113

Closed
adamconnelly opened this issue Jun 27, 2020 · 1 comment · Fixed by #1114 or #1144

@adamconnelly
Contributor

At the moment Promitor relies on throwing a ValidationFailedException to crash the application if the configuration isn't valid. This isn't ideal because it adds noise to the console output, meaning you have to scroll back up past the stack trace to get to the validation error.

Here's an example of the output you receive:

[13:56:38 INF] Starting validation of Promitor setup
[13:56:38 INF] Start Validation step 1/6: Metrics Declaration Path
[13:56:38 INF] Scrape configuration found at '/home/adam/github.com/adamconnelly/promitor/config/promitor/scraper/metrics.yaml'
[13:56:38 INF] Validation step 1/6 succeeded
[13:56:38 INF] Start Validation step 2/6: Azure Authentication
[13:56:38 INF] Validation step 2/6 succeeded
[13:56:38 INF] Start Validation step 3/6: Metrics Declaration
[13:56:38 INF] Metrics declaration is using spec version v1
[13:56:38 ERR] The following problems were found with the metric configuration:
Error 1:1: 'metrics' is a required field but was not found.
Warning 12:1: Unknown field 'metric'. Did you mean 'metrics'?
[13:56:38 WRN] Validation step 3/6 failed. Error(s): Errors were found while deserializing the metric configuration.
[13:56:38 INF] Start Validation step 4/6: Resource Discovery
[13:56:38 INF] Validation step 4/6 succeeded
[13:56:38 INF] Start Validation step 5/6: StatsD Metric Sink
[13:56:38 INF] Validation step 5/6 succeeded
[13:56:38 INF] Start Validation step 6/6: Prometheus Scraping Endpoint Metric Sink
[13:56:38 INF] Validation step 6/6 succeeded
[13:56:38 FTL] Promitor is not configured correctly. Please fix validation issues and re-run.
[13:56:38 FTL] Host terminated unexpectedly
Promitor.Agents.Scraper.Validation.Exceptions.ValidationFailedException: Validation Failed. Errors:- Metrics Declaration: Errors were found while deserializing the metric configuration.

   at Promitor.Agents.Scraper.Validation.RuntimeValidator.ProcessValidationResults(List`1 validationResults) in /home/adam/github.com/adamconnelly/promitor/src/Promitor.Agents.Scraper/Validation/RuntimeValidator.cs:line 61
   at Promitor.Agents.Scraper.Validation.RuntimeValidator.Run() in /home/adam/github.com/adamconnelly/promitor/src/Promitor.Agents.Scraper/Validation/RuntimeValidator.cs:line 47
   at Promitor.Agents.Scraper.Startup.ValidateRuntimeConfiguration(IServiceCollection services) in /home/adam/github.com/adamconnelly/promitor/src/Promitor.Agents.Scraper/Startup.cs:line 87
   at Promitor.Agents.Scraper.Startup.ConfigureServices(IServiceCollection services) in /home/adam/github.com/adamconnelly/promitor/src/Promitor.Agents.Scraper/Startup.cs:line 58
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Microsoft.AspNetCore.Hosting.ConfigureServicesBuilder.InvokeCore(Object instance, IServiceCollection services)
   at Microsoft.AspNetCore.Hosting.ConfigureServicesBuilder.<>c__DisplayClass9_0.<Invoke>g__Startup|0(IServiceCollection serviceCollection)
   at Microsoft.AspNetCore.Hosting.ConfigureServicesBuilder.Invoke(Object instance, IServiceCollection services)
   at Microsoft.AspNetCore.Hosting.ConfigureServicesBuilder.<>c__DisplayClass8_0.<Build>b__0(IServiceCollection services)
   at Microsoft.AspNetCore.Hosting.GenericWebHostBuilder.UseStartup(Type startupType, HostBuilderContext context, IServiceCollection services)
   at Microsoft.AspNetCore.Hosting.GenericWebHostBuilder.<>c__DisplayClass12_0.<UseStartup>b__0(HostBuilderContext context, IServiceCollection services)
   at Microsoft.Extensions.Hosting.HostBuilder.CreateServiceProvider()
   at Microsoft.Extensions.Hosting.HostBuilder.Build()
   at Promitor.Agents.Scraper.Program.Main(String[] args) in /home/adam/github.com/adamconnelly/promitor/src/Promitor.Agents.Scraper/Program.cs:line 23

Failing validation isn't really exceptional: it's part of the normal lifecycle of the application, and we expect that it can happen (which is why we validate in the first place!), so relying on an exception doesn't feel appropriate. The stack trace doesn't add value for Promitor users, since the validation output already tells them what's wrong. It isn't necessary for developers either, since we know where the validation code lives and can see which step failed from the error message.

Specification

The application should follow this lifecycle:

  1. Load configuration and register any services.
  2. Run validation to make sure we can start successfully.
  3. Start the web server and scraping jobs.

We can use a similar approach to the one outlined here to achieve this: https://andrewlock.net/running-async-tasks-on-app-startup-in-asp-net-core-part-1/#4-manually-running-tasks-in-program-cs.

This will allow us to run validation, exiting with a non-zero exit code if validation fails.
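
As a rough sketch of that lifecycle (not the actual Promitor code), validation can run from `Main()` and its result can decide the exit code. The `StepResult` type, the `StartupSketch` class, the `key=value` argument parsing and the exit code `2` are all assumptions made up for illustration; the real implementation would build the host and resolve the validator from it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result of a single validation step (not Promitor's real type).
public sealed class StepResult
{
    public StepResult(string step, string error = null)
    {
        Step = step;
        Error = error;
    }

    public string Step { get; }
    public string Error { get; }              // null means the step passed
    public bool Succeeded => Error == null;
}

public static class StartupSketch
{
    // Arbitrary non-zero value chosen for this sketch.
    public const int ValidationFailedExitCode = 2;

    // Step 1: load configuration (stubbed as "key=value" command line arguments).
    public static Dictionary<string, string> LoadConfiguration(string[] args)
    {
        var config = new Dictionary<string, string>();
        foreach (var arg in args)
        {
            var parts = arg.Split(new[] { '=' }, 2);
            if (parts.Length == 2) config[parts[0]] = parts[1];
        }
        return config;
    }

    // Step 2: validate and report problems as data instead of throwing.
    public static List<StepResult> Validate(IDictionary<string, string> config)
    {
        var results = new List<StepResult>();
        results.Add(config.ContainsKey("metrics")
            ? new StepResult("Metrics Declaration")
            : new StepResult("Metrics Declaration", "'metrics' is a required field but was not found."));
        return results;
    }

    // Step 3: stand-in for starting the web server and scraping jobs.
    public static void Start()
    {
        Console.WriteLine("Starting web server and scraping jobs");
    }

    public static int Run(string[] args)
    {
        var config = LoadConfiguration(args);
        var failures = Validate(config).Where(result => !result.Succeeded).ToList();
        if (failures.Count > 0)
        {
            foreach (var failure in failures)
            {
                Console.Error.WriteLine($"{failure.Step}: {failure.Error}");
            }
            // Exit without an exception, so no stack trace is printed.
            return ValidationFailedExitCode;
        }

        Start();
        return 0;
    }

    public static int Main(string[] args) => Run(args);
}
```

Because `Main` returns an `int`, the process exit code reflects the validation result and the last thing on the console is the validation error itself.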

Question

Currently validation continues even if a previous step has failed. Is this deliberate, and is it the behaviour we want?
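
To make the two options concrete, here is a minimal sketch of a step runner with a switch between "run every step and collect all errors" (the current behaviour) and "stop at the first failure". The `ValidationModes` class, the `stopOnFirstFailure` flag and the sample steps are hypothetical and only illustrate the trade-off.

```csharp
using System;
using System.Collections.Generic;

public static class ValidationModes
{
    // Each step returns null on success or an error message on failure.
    public static List<string> RunSteps(
        IEnumerable<KeyValuePair<string, Func<string>>> steps,
        bool stopOnFirstFailure)
    {
        var errors = new List<string>();
        foreach (var step in steps)
        {
            var error = step.Value();
            if (error == null) continue;

            errors.Add(step.Key + ": " + error);
            if (stopOnFirstFailure) break;   // skip the remaining steps
        }
        return errors;
    }

    // Made-up steps: the first passes, the next two fail.
    public static List<KeyValuePair<string, Func<string>>> SampleSteps()
    {
        return new List<KeyValuePair<string, Func<string>>>
        {
            new KeyValuePair<string, Func<string>>("Metrics Declaration Path", () => null),
            new KeyValuePair<string, Func<string>>("Metrics Declaration", () => "'metrics' is a required field but was not found."),
            new KeyValuePair<string, Func<string>>("Resource Discovery", () => "hypothetical second failure"),
        };
    }

    public static void Main()
    {
        Console.WriteLine("Run all steps: " + RunSteps(SampleSteps(), false).Count + " error(s)");
        Console.WriteLine("Stop on first failure: " + RunSteps(SampleSteps(), true).Count + " error(s)");
    }
}
```

With these sample steps, running everything reports two errors in a single pass, while stopping early reports only the first. Continuing lets users fix several problems before re-running, at the cost of later steps possibly failing only because an earlier one did.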

adamconnelly added a commit to adamconnelly/promitor that referenced this issue Jun 27, 2020
I've tweaked the way that the startup process for Promitor works so that it runs the validation in the `Main()` method. This gives us the opportunity to exit gracefully if validation fails instead of throwing an exception.

I've also added a new enum to track the possible exit statuses, and made sure that unhandled exceptions continue to use an exit code of `1`.

Fixes tomkerkhove#1113
adamconnelly added a commit to adamconnelly/promitor that referenced this issue Jun 28, 2020
I've tweaked the way that the startup process for Promitor works so that it runs the validation in the `Main()` method. This gives us the opportunity to exit gracefully if validation fails instead of throwing an exception.

Also:

- Added a new enum to track the possible exit statuses, and made sure that unhandled exceptions continue to use an exit code of `1`.
- Updated the unhandled exception message to direct people to raise an issue.
- Altered the check to make sure the config folder is set so that it exits gracefully instead of ending up in the unhandled exception block.
- Moved the logging about whether or not the configuration is valid from RuntimeValidator into the main method. It seemed more appropriate for the logging to be there since the main method now has logic for exiting if the config is invalid.

Fixes tomkerkhove#1113
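
For readers following along, the changes described in this commit message could be sketched roughly as below. Only the exit code `1` for unhandled exceptions comes from the commit message; the other `ExitStatus` members and values, the `ExitStatusSketch` class, the `CONFIG_FOLDER` variable name and the exact log wording are assumptions for illustration.

```csharp
using System;

// Hypothetical members and values, except that unhandled exceptions use 1.
public enum ExitStatus
{
    Success = 0,
    UnhandledException = 1,
    ConfigurationFolderNotSpecified = 2,
    ValidationFailed = 3
}

public static class ExitStatusSketch
{
    public static ExitStatus Run(string configFolder, Func<bool> validate)
    {
        try
        {
            // Expected misconfiguration: exit gracefully instead of throwing.
            if (string.IsNullOrWhiteSpace(configFolder))
            {
                Console.Error.WriteLine("The configuration folder is not set.");
                return ExitStatus.ConfigurationFolderNotSpecified;
            }

            // The "is the configuration valid?" logging lives next to the exit logic.
            if (!validate())
            {
                Console.Error.WriteLine("Promitor is not configured correctly. Please fix validation issues and re-run.");
                return ExitStatus.ValidationFailed;
            }

            Console.WriteLine("Promitor configuration is valid, starting up.");
            return ExitStatus.Success;
        }
        catch (Exception exception)
        {
            // Anything unexpected is a bug, so point users at the issue tracker.
            Console.Error.WriteLine("An unexpected error occurred, please consider raising an issue: " + exception.Message);
            return ExitStatus.UnhandledException;
        }
    }

    public static int Main(string[] args)
    {
        // Hypothetical environment variable name used only in this sketch.
        var configFolder = Environment.GetEnvironmentVariable("CONFIG_FOLDER");
        return (int)Run(configFolder, () => true);
    }
}
```

Keeping the statuses in an enum means scripts and orchestrators can tell a misconfiguration apart from a crash by looking at the exit code alone.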
tomkerkhove added this to the v2.0.0 milestone Jun 29, 2020
tomkerkhove added the enhancement label Jun 29, 2020
@tomkerkhove
Owner

Re-opening for resource discovery agent

tomkerkhove reopened this Jun 29, 2020
adamconnelly added a commit to adamconnelly/promitor that referenced this issue Jul 3, 2020
- Moved `ExitStatus` into the agents core and updated the discovery agent to use it.
- Updated the unhandled exception message for the discovery agent to match the format of the scraper agent.
- Added some additional validation to both agents to check that their required config files exist. This is to avoid us ending up in the unhandled exception block and directing users to create an issue.

Fixes tomkerkhove#1113
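
The extra "required config file exists" validation mentioned in this commit could look something like the sketch below. The `ConfigFileCheck` class, its messages, the default `runtime.yaml` file name and the exit code `2` are hypothetical; the point is only that a missing file becomes an ordinary validation error rather than an exception.

```csharp
using System;
using System.IO;

public static class ConfigFileCheck
{
    // Returns null when the file exists, otherwise a message for the validation output.
    public static string Validate(string description, string path)
    {
        if (string.IsNullOrWhiteSpace(path))
        {
            return description + " path is not configured.";
        }

        if (!File.Exists(path))
        {
            return description + " was not found at '" + path + "'.";
        }

        return null;
    }

    public static int Main(string[] args)
    {
        // Hypothetical default file name, used when no path is passed in.
        var path = args.Length > 0 ? args[0] : "runtime.yaml";
        var error = Validate("Runtime configuration", path);
        if (error != null)
        {
            Console.Error.WriteLine(error);
            return 2;   // arbitrary non-zero code for this sketch
        }

        Console.WriteLine("Runtime configuration found at '" + path + "'");
        return 0;
    }
}
```

Since the check has no dependency on either agent, the same helper could sit in the shared core and be called by both the scraper and the discovery agent.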