Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Docs] Add telemetry section to chaos strategies documentation pages #2071

Conversation

peter-csala
Copy link
Contributor

@peter-csala peter-csala commented Apr 23, 2024

Pull Request

The issue or feature being addressed

Follow up of the following PRs:

Details on the issue fix or feature implementation

  • Unified Behavior page
  • Unified Fault page
  • Unified Latency page
  • Unified Outcome page
  • Updated chaos index page
  • Updated telemetry page as well

Confirm the following

  • I started this PR by branching from the head of the default branch
  • I have targeted the PR to merge into the default branch
  • I have included unit tests for the issue/feature
  • I have successfully run a local build

@peter-csala
Copy link
Contributor Author

peter-csala commented Apr 23, 2024

@vany0114

I was playing with the Behavior Chaos and I've just realized if the BehaviorGenerator throws an exception then neither the telemetry will be reported nor the OnBehaviorInjected will be called.

var builder = new ResiliencePipelineBuilder<int>() 
{ 
	Name = "MyPipeline",
	InstanceName = "MyPipelineInstance"
}; 
var pipeline = builder
.AddChaosBehavior(new ChaosBehaviorStrategyOptions()
{
	InjectionRate = 1,
	Enabled = true,
	BehaviorGenerator = static args =>
        {
                Console.WriteLine("Custom Behavior Injected");
		//throw new CustomException(); // uncomment to reproduce
		return default;
        },
	OnBehaviorInjected = static args =>
   	{
		Console.WriteLine("OnBehaviorInjected, Operation: {0}.", args.Context.OperationKey);
		return default;
	}
})
.ConfigureTelemetry(new TelemetryOptions
{
	LoggerFactory = LoggerFactory.Create(builder => builder.AddConsole())
}) 
.Build();

ResilienceContext resilienceContext = ResilienceContextPool.Shared.Get("MyBehaviorInjectedOperation");

try
{
	await pipeline.ExecuteAsync<int>(ctx => { return ValueTask.FromResult<int>(1); }, resilienceContext);
}
catch(Exception ex)
{
	Console.WriteLine(ex.Message);
}

ResilienceContextPool.Shared.Return(resilienceContext);

Dotnet fiddle link: https://dotnetfiddle.net/EaycSE

The standard strategies do report telemetry even in a case of exception. They also call the OnXYZ callback as well.

Does this strategy behave differently intentional?

cc: @martincostello , @martintmk

Copy link

codecov bot commented Apr 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.69%. Comparing base (b89ce23) to head (975a8f3).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2071   +/-   ##
=======================================
  Coverage   83.69%   83.69%           
=======================================
  Files         312      312           
  Lines        7114     7114           
  Branches     1054     1054           
=======================================
  Hits         5954     5954           
  Misses        789      789           
  Partials      371      371           
Flag Coverage Δ
linux 83.69% <ø> (ø)
macos 83.69% <ø> (ø)
windows 83.69% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

docs/chaos/behavior.md Outdated Show resolved Hide resolved
docs/chaos/latency.md Outdated Show resolved Hide resolved
@peter-csala
Copy link
Contributor Author

I hope I do something really silly but for me the Outcome Chaos strategy does not seem to inject exception.

var pipeline = new ResiliencePipelineBuilder<int>()
.AddChaosOutcome(new ChaosOutcomeStrategyOptions<int>()
{
	InjectionRate = 1,
	Enabled = true,
	OutcomeGenerator = static args =>
        {
		//throw new Exception("A");
		return ValueTask.FromResult<Outcome<int>?>(Outcome.FromException<int>(new Exception()));
        },
	//OutcomeGenerator =  new OutcomeGenerator<int>().AddException<Exception>(),
})
.ConfigureTelemetry(new TelemetryOptions
{
	LoggerFactory = LoggerFactory.Create(builder => builder.AddConsole())
}) 
.Build();

var resilienceContext = ResilienceContextPool.Shared.Get();
try
{
	var result = await pipeline.ExecuteAsync<int>(ctx => { return ValueTask.FromResult<int>(1); }, resilienceContext);
	//var result = await pipeline.ExecuteAsync<int>(ctx => { return ValueTask.FromException<int>(new CustomException()); }, resilienceContext);
	result.Dump();
}
catch(Exception ex)
{
	Console.WriteLine(ex.Message);
}
ResilienceContextPool.Shared.Return(resilienceContext);

Then the output is

info: Polly[0]
      Resilience event occurred. EventName: 'Chaos.OnOutcome', Source: '(null)/(null)/Chaos.Outcome', Operation Key: '', Result: ''
Dumping object(Int32)
 0

Dotnet fiddle: https://dotnetfiddle.net/82Chpg

cc: @vany0114

@peter-csala
Copy link
Contributor Author

Based on my experience the Result part of the telemetry events are never populated for the chaos strategies. IMHO we could populate it in case of Fault and Outcome.

What do you guys think?
cc: @martincostello , @martintmk

@peter-csala peter-csala marked this pull request as ready for review April 24, 2024 08:51
Copy link
Member

@martincostello martincostello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy but let's see what @vany0114 thinks before merging.

@vany0114
Copy link
Contributor

Sorry for the delay here guys, will take a look as soon as I can.

@vany0114
Copy link
Contributor

vany0114 commented Apr 26, 2024

I hope I do something really silly but for me the Outcome Chaos strategy does not seem to inject exception.
[...]

@peter-csala you're right! And I it is because of the way we're handling the outcome when it is an exception since originally the outome strategy wasn't designed to return an exception as an outcome, so instead it is returning the default value ot T.

I added the OnOutcomeInjected delegate to the fiddle you shared and look at the results:

info: Polly[0]
      Resilience event occurred. EventName: 'Chaos.OnOutcome', Source: '(null)/(null)/Chaos.Outcome', Operation Key: '', Result: ''
OnOutcomeInjected, Outcome: 0, Operation: .
OnOutcomeInjected, Exception: Exception of type 'System.Exception' was thrown., Operation: .
Dumping object(Int32)
 0

We would need to do a change in the Outcome strategy to fully support it, I can help with that PR, something like this:
image

One caveat there that underscores this usage as an anti-pattern is that the telemetry event was registered as an empty result, instead of the exception, we could also do something like this before returning the exception to mitigate it:

_telemetry.Report(new(ResilienceEventSeverity.Information, ChaosOutcomeConstants.OnOutcomeInjectedEvent), context, outcome.Value.Exception);

@vany0114
Copy link
Contributor

vany0114 commented Apr 26, 2024

I was playing with the Behavior Chaos and I've just realized if the BehaviorGenerator throws an exception then neither the telemetry will be reported nor the OnBehaviorInjected will be called.
The standard strategies do report telemetry even in a case of exception. They also call the OnXYZ callback as well.

Does this strategy behave differently intentional?

@peter-csala Yep that's intended, if the generator throws an exception we're not handling that, we just let the exception to bubble up, I spot-checked a couple of Polly strategies and I couldn't see that they are handling errors in the generators and tracking those as telemetry events 🤔 can you please provide an example when you say that the standard strategies report telemetry even in a case of exception?

@peter-csala
Copy link
Contributor Author

peter-csala commented Apr 26, 2024

I will create a dedicated discussion thread for each question just to keep the PR focused.

I'll update this comment with the links:

@peter-csala
Copy link
Contributor Author

@peter-csala Yep that's intended, if the generator throws an exception we're not handling that, we just let the exception to bubble up, I spot-checked a couple of Polly strategies and I couldn't see that they are handling errors in the generators and tracking those as telemetry events 🤔 can you please provide an example when you say that the standard strategies report telemetry even in a case of exception?

Here is an example in case of Fallback, both the telemetry is reported as well as the OnFallback is called

var pipeline = new ResiliencePipelineBuilder<int>()
.AddFallback(new FallbackStrategyOptions<int>
{
	ShouldHandle = static args => PredicateResult.True(),
	FallbackAction = static args => throw new Exception("Fallback boom"),
	OnFallback = static args => 
	{
		Console.WriteLine("OnFallback is called");
		return default;
	}
})
.ConfigureTelemetry(new TelemetryOptions
{
	LoggerFactory = LoggerFactory.Create(builder => builder.AddConsole())
}) 
.Build();

var resilienceContext = ResilienceContextPool.Shared.Get();
try
{
	var result = await pipeline.ExecuteAsync<int>(ctx => { return ValueTask.FromResult<int>(1); }, resilienceContext);
	result.Dump();
}
catch(Exception ex)
{
	Console.WriteLine(ex.Message);
}
ResilienceContextPool.Shared.Return(resilienceContext);

�[1mwarn: Polly[0]
      Resilience event occurred. EventName: 'OnFallback', Source: '(null)/(null)/Fallback', Operation Key: '', Result: '1'
OnFallback is called
Fallback boom

https://dotnetfiddle.net/S7AjzT

@peter-csala
Copy link
Contributor Author

@vany0114 Are you fine with this documentation PR?

Copy link
Contributor

@vany0114 vany0114 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

---

> [!IMPORTANT]
> Please note that if the calculated latency is negative (regardless if it's fixed or dynamic) then it will not be injected.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@martincostello martincostello merged commit 5441dc9 into App-vNext:main Apr 27, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants