[HUDI-6399] Warn when datadog api key is wrong #8997
parisni wants to merge 1 commit into apache:master
@xushiyan maybe?
-    } catch (IOException e) {
-      throw new IllegalStateException("Failed to connect to Datadog to validate API key.", e);
+    } catch (IOException | IllegalStateException e) {
+      LOG.warn(String.format("Failed to connect to Datadog to validate API key. %s", e.getMessage()));
Should we still fail the job if the metrics collector does not work?
We should just show a warning, as proposed here.
Do you mean catching any exception? That makes sense.
Should we still fail the job if the metrics collector does not work?
+1
What I meant is that we should not change the behavior of throwing an exception here. If metric collection does not work because of the API key, the job should fail so that the user knows and fixes it before proceeding.
I don't understand the rationale here. Shouldn't the user know that the API key is not working and fix it before running the job again, so that the metrics are generated properly? It's not a good idea to fail silently here.
To me, the metrics provider is responsible for contacting the user if metrics stop working (email alarm, on-call, ...). But the ingestion jobs should not stop working. Missing metrics is a minor problem compared with having all the company's pipelines broken because of a token renewal issue.
Also, users configure the metrics provider to raise an alarm when no metrics arrive.
At the very least, I assume some users won't want their nightly jobs broken because of a token. The same would hold for an API or any metrics collection: an outage there is a minor problem compared with the jobs stopping.
Currently the same applies to pushing metrics: if it does not work, it is only a warning, see
Got it. I suggest adding a feature flag that controls whether the job fails if the metrics provider does not work. It would be on by default, i.e. the job fails because of the metrics provider, the same behavior as before, and users could turn it off when metrics can be skipped.
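A minimal sketch of what such a flag could look like, to make the two behaviors concrete. This is not Hudi code: the class `ApiKeyValidationSketch`, the `KeyValidator` interface, and the `failOnError` parameter are all hypothetical names made up for illustration, and the real validation call to Datadog is replaced by a stand-in.

```java
import java.io.IOException;
import java.util.logging.Logger;

/** Hypothetical sketch; none of these names are Hudi APIs. */
class ApiKeyValidationSketch {
    private static final Logger LOG =
        Logger.getLogger(ApiKeyValidationSketch.class.getName());

    /** Stand-in for the HTTP call that checks the key with Datadog. */
    interface KeyValidator {
        boolean isValid(String apiKey) throws IOException;
    }

    /**
     * Returns true if the key was validated. When validation fails, either
     * throws (failOnError = true, the behavior before this PR) or logs a
     * warning and returns false (failOnError = false).
     */
    static boolean validateApiKey(String apiKey, KeyValidator validator, boolean failOnError) {
        try {
            if (validator.isValid(apiKey)) {
                return true;
            }
            throw new IllegalStateException("API key is invalid.");
        } catch (IOException | IllegalStateException e) {
            if (failOnError) {
                // Fail the job so the user notices and fixes the key.
                throw new IllegalStateException("Failed to validate the Datadog API key.", e);
            }
            // Keep the job running; only the metrics are lost.
            LOG.warning("Failed to validate the Datadog API key. " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        KeyValidator alwaysInvalid = key -> false;
        // Flag off: the job continues and a warning is logged.
        System.out.println("validated: " + validateApiKey("bad-key", alwaysInvalid, false));
        // Flag on: the job fails, as before.
        try {
            validateApiKey("bad-key", alwaysInvalid, true);
        } catch (IllegalStateException e) {
            System.out.println("job fails: " + e.getMessage());
        }
    }
}
```

Defaulting `failOnError` to true would preserve the existing behavior, which is the point of the suggestion above.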
https://hudi.apache.org/docs/configurations#hoodiemetricsdatadogapikeyskipvalidation
Shame on me, the option to skip validation already exists. Then this PR is useless.
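For reference, the existing option could be set roughly as follows. This is only a sketch: the key `hoodie.metrics.datadog.api.key.skip.validation` is inferred from the anchor of the docs link above, and the class `DatadogMetricsProps` and the way the properties would be handed to a Hudi writer are assumptions, so check the linked configuration page before relying on it.

```java
import java.util.Properties;

/** Hypothetical helper; only the property key comes from the linked docs. */
class DatadogMetricsProps {
    // Key name inferred from the docs anchor; treat it as an assumption.
    static final String SKIP_VALIDATION_KEY =
        "hoodie.metrics.datadog.api.key.skip.validation";

    /** Builds properties that ask Hudi to skip the Datadog API key check. */
    static Properties skipValidationProps() {
        Properties props = new Properties();
        props.setProperty(SKIP_VALIDATION_KEY, "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = skipValidationProps();
        System.out.println(SKIP_VALIDATION_KEY + "=" + props.getProperty(SKIP_VALIDATION_KEY));
    }
}
```

With validation skipped, a wrong key would no longer stop the job at startup, which covers the case this PR was aiming at.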
Change Logs
Currently, when the Datadog API key is wrong, the job fails. We should probably log a warning instead of failing, so that the whole pipeline does not break.