Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polly.Core Caching policy #1127

Open
martincostello opened this issue Apr 14, 2023 · 19 comments
Open

Polly.Core Caching policy #1127

martincostello opened this issue Apr 14, 2023 · 19 comments

Comments

@martincostello
Copy link
Member

Investigate implementing a new caching policy for Polly v8 that depends on Microsoft.Extensions.Caching, rather than implementing a whole cache provider ourselves.

@martincostello martincostello added the v8 Issues related to the new version 8 of the Polly library. label Apr 14, 2023
@martincostello martincostello added this to the v8.0.0 milestone Apr 14, 2023
@martincostello martincostello self-assigned this Apr 14, 2023
@martincostello
Copy link
Member Author

Just did a quick spike of hacking around at this locally and found the following:

  • Polly v7 has a sync and an async cache, which are the consumer is expected to implement themselves as-appropriate using the provided interface. There's also some basic wrapping to handle serializing the value before caching.
  • Microsoft.Extensions.Caching.Abstractions provides the sync IMemoryCache and the async IDistributedCache - this is effectively the same as having a sync and async implementation in Polly v7.

I'm not sure what the best way to go about doing this natively for Polly v8 would be. Off the top of my head, it feels like we'd need a bridge interface that has sync and async methods so the strategy can just pick the right method and what happens under-the-hood is up to the implementation.

Doing the above means the policy can live in the core, but by default it wouldn't actually do anything out of the box (which I think is also true of v7). We could then possibly have an additional assembly that will adapt from IMemoryCache/IDistributedCache to the policy cache interface. There'd possibly be some space for unintentional mistakes here though if someone setup a cache policy associated with IDistributedCache and then did a sync operation on it.

Using it via the extension would then be something like:

// IMemoryCache
builder.AddCaching(
    MemoryCache.Default,
    (context, state) => /* some object */,
    (cacheEntry) => cacheEntry.SlidingExpiration = TimeSpan.FromSeconds(3));

// IDistributedCache
IDistributedCache distributedCache = ...;
builder.AddCaching(
    distributedCache,
    (context, state) => "my-cache-key",
    async (value) => await serializeToByteArray(value));

The other bit I wasn't 100% sure on was on exactly how to get the callbacks for generating the cache key and value to preserve TResult and TState.

@martincostello
Copy link
Member Author

Oh, and this was as far as I got with the provider just noodling around with IMemoryCache. I didn't get to IDistributedCache, or do anything with getting TState passed through.

using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;
using Polly.Strategy;

namespace Polly.Caching;

internal sealed class CachingResilienceStrategy : ResilienceStrategy
{
    private readonly ResilienceStrategyTelemetry _telemetry;

    public CachingResilienceStrategy(
        IMemoryCache cache,
        ResilienceStrategyTelemetry telemetry)
    {
        Cache = cache;
        _telemetry = telemetry;
    }

    public IMemoryCache Cache { get; }

    public Func<OnCacheHitArguments, ValueTask>? OnCacheHit { get; }

    public Func<OnCacheMissedArguments, ValueTask>? OnCacheMissed { get; }

    public Func<OnCachePutArguments, ValueTask>? OnCachePut { get; }

    public Func<KeyGeneratorArguments, ValueTask<object>>? KeyGenerator { get; }

    protected override async ValueTask<TResult> ExecuteCoreAsync<TResult, TState>(
        Func<ResilienceContext, TState, ValueTask<TResult>> callback,
        ResilienceContext context,
        TState state)
    {
        object key = await KeyGenerator(new(context, state));

        if (Cache.TryGetValue(key, out TResult? result))
        {
            if (OnCacheHit is { } onCacheHit)
            {
                await onCacheHit(default).ConfigureAwait(context.ContinueOnCapturedContext);
            }

            return result!;
        }

        if (OnCacheMissed is { } onCacheMissed)
        {
            await onCacheMissed(default).ConfigureAwait(context.ContinueOnCapturedContext);
        }

        result = await callback(context, state).ConfigureAwait(context.ContinueOnCapturedContext);

        var entry = Cache.CreateEntry(key);

        if (OnCachePut is { } onCachePut)
        {
            await onCachePut(new(context, entry)).ConfigureAwait(context.ContinueOnCapturedContext);
        }

        return result;
    }
}

@martintmk
Copy link
Contributor

Hey @martincostello, nice job :) just a couple of thoughts about the caching:

Microsoft.Extensions.Caching.Abstractions provides the sync IMemoryCache and the async IDistributedCache - this is effectively the same as having a sync and async implementation in Polly v7.

I believe the semantics are that IDistributedCache does not hold the cached locally while IMemoryCache does. The IDistributedCache is inherently slower and better scalable. When accessing/updating the IMemoryCache you don't need any async methods because all work is local. For IDistributedCache we have async overloads because it can indeed block the calling thread when making a request. However, IDistributedCache also exposes sync methods so we can use those when ResilienceContext is in synchronous execution.

I'm not sure what the best way to go about doing this natively for Polly v8 would be. Off the top of my head, it feels like we'd need a bridge interface that has sync and async methods so the strategy can just pick the right method and what happens under-the-hood is up to the implementation.

My naive thinking is that we expose two strategies Distributed Caching Strategy and Memory Caching Strategy each with their respective options. That way, the customer can compose these strategies and create 2-layerer caching strategies (memory first, followed by distributed caching).

Doing the above means the policy can live in the core, but by default it wouldn't actually do anything out of the box (which I think is also true of v7)

I think the main concern is imposing the Microsft.Extensions.Caching.Abstractions dependency on everyone if this was directly in the Polly.Core. My vote would be to have these strategies in Polly.Caching package (similar thing that we did with Polly.RateLimiting). That way the Polly.Core can stay slim.

The other bit I wasn't 100% sure on was on exactly how to get the callbacks for generating the cache key and value to preserve TResult and TState.

I think we don't really need to propagate TState, it is reserved for advanced scenarios to reduce lambda allocations and I cannot image having anything caching-related in it.

Regarding the API surface do you think we need these events?

    public Func<OnCacheHitArguments, ValueTask>? OnCacheHit { get; }

    public Func<OnCacheMissedArguments, ValueTask>? OnCacheMissed { get; }

    public Func<OnCachePutArguments, ValueTask>? OnCachePut { get; }

I think OnCachePut is nice to have so the caller can override any caching per-item options but the others might be not necesssary. We can still generate the resilience events from these though.

@martintmk
Copy link
Contributor

Based on the recent sync, the caching strategy won't be a part of the initial V8 release. If they're is a demand from community we might induce it in a next minor release.

The reason to not include:

  • There is not much data about usability of such strategy or whatever it really belongs in a core resilience package.
  • There is a work pending on the next set of caching libraries and we do not want to duplicate the work or be based on older abstractions.

cc @martincostello ,@joelhulen

@martintmk martintmk added enhancement feature and removed v8 Issues related to the new version 8 of the Polly library. labels Jun 10, 2023
@martincostello martincostello removed this from the v8.0.0 milestone Jun 10, 2023
@martincostello martincostello removed their assignment Jun 10, 2023
@joelhulen
Copy link
Member

Based on the recent sync, the caching strategy won't be a part of the initial V8 release. If they're is a demand from community we might induce it in a next minor release.

The reason to not include:

  • There is not much data about usability of such strategy or whatever it really belongs in a core resilience package.
  • There is a work pending on the next set of caching libraries and we do not want to duplicate the work or be based on older abstractions.

cc @martincostello ,@joelhulen

Per our conversation on Thursday, I agree that we can exclude the Polly v7 caching policy from v8. If we find that excluding the caching policy impacts a lot of people (no data on whether this is the case), then we can either find a way to implement it or provide some alternatives like the upcoming set of caching libraries.

@martincostello martincostello changed the title Polly v8 Caching policy Polly.Core Caching policy Jun 16, 2023
@martincostello
Copy link
Member Author

martincostello commented Jun 16, 2023

If/when we get to this issue, consider the use case in #834 and #660.

@github-actions
Copy link
Contributor

This issue is stale because it has been open for 60 days with no activity. It will be automatically closed in 14 days if no further updates are made.

@github-actions github-actions bot added the stale Stale issues or pull requests label Aug 16, 2023
@martincostello martincostello removed the stale Stale issues or pull requests label Aug 16, 2023
@martincostello martincostello closed this as not planned Won't fix, can't repro, duplicate, stale Sep 29, 2023
@TallBaldGeek
Copy link

I was directed here by Martin Tomka after I commented on the blog post Building resilient cloud services with .NET 8.

I work on a large enterprise application that is currently ASP.NET using Framework 4.8, with an effort underway to migrate to .NET 8+. We’re using the IHttpClientFactory to maintain a registry of named http clients with Polly v7 policies defined.

We use a Polly AsyncCachePolicy on 6 of the ~75 named http clients we have defined for connecting to external APIs. This policy is backed by an IAsyncCacheProvider which is wired to use a Singleton MemoryCacheProvider in our DI container. The cache policy will serialize an HttpResponseMessage and cache that serialized representation in memory per-server. In the case of a cache hit, we deserialize back to an HttpResponseMessage and return that.

We knew there were trade-offs in this approach such as having to still ReadAsAsync<T> on the resulting HttpResponseMessage to further deserialize it into a DTO, but we felt the convenience of being able to define a generic caching policy to apply to a named http client was worth it in these cases. (In other words, we were aware of the guidance “Is caching at the HttpResponseMessage level the right fit?” )

In addition, we do have a handful of areas in our application where we use Polly’s Policy.Execute() syntax with a caching policy to cache a DTO at a higher level than the http pipeline.

To further complicate things, on the clients where we’re using the AsyncCachePolicy, it’s part of a Policy Wrap with Polly.Contrib.DuplicateRequestCollapser in order to avoid cache repopulation storms. There is an issue on that package that is in the process of being resolved where it initially didn’t support Polly v8. It looks like support for v8 is being added, but not Polly.Core.

To recap, if my team wants to migrate from v7 to Polly.Core, in addition to the standard stuff outlined in the migration guide we’ll need to account for:

  • Polly.Core not currently supporting a Cache strategy
  • Polly.Core not providing an equivalent of Polly.Contrib.DuplicateRequestCollapser

Thanks,
-Brett.

@martintmk
Copy link
Contributor

I was directed here by Martin Tomka after I commented on the blog post Building resilient cloud services with .NET 8.

I work on a large enterprise application that is currently ASP.NET using Framework 4.8, with an effort underway to migrate to .NET 8+. We’re using the IHttpClientFactory to maintain a registry of named http clients with Polly v7 policies defined.

We use a Polly AsyncCachePolicy on 6 of the ~75 named http clients we have defined for connecting to external APIs. This policy is backed by an IAsyncCacheProvider which is wired to use a Singleton MemoryCacheProvider in our DI container. The cache policy will serialize an HttpResponseMessage and cache that serialized representation in memory per-server. In the case of a cache hit, we deserialize back to an HttpResponseMessage and return that.

We knew there were trade-offs in this approach such as having to still ReadAsAsync<T> on the resulting HttpResponseMessage to further deserialize it into a DTO, but we felt the convenience of being able to define a generic caching policy to apply to a named http client was worth it in these cases. (In other words, we were aware of the guidance “Is caching at the HttpResponseMessage level the right fit?” )

In addition, we do have a handful of areas in our application where we use Polly’s Policy.Execute() syntax with a caching policy to cache a DTO at a higher level than the http pipeline.

To further complicate things, on the clients where we’re using the AsyncCachePolicy, it’s part of a Policy Wrap with Polly.Contrib.DuplicateRequestCollapser in order to avoid cache repopulation storms. There is an issue on that package that is in the process of being resolved where it initially didn’t support Polly v8. It looks like support for v8 is being added, but not Polly.Core.

To recap, if my team wants to migrate from v7 to Polly.Core, in addition to the standard stuff outlined in the migration guide we’ll need to account for:

  • Polly.Core not currently supporting a Cache strategy
  • Polly.Core not providing an equivalent of Polly.Contrib.DuplicateRequestCollapser

Thanks, -Brett.

@TallBaldGeek, the caching strategy is relative simple to implement. For example:

public static class CachingKey
{
    public static readonly ResiliencePropertyKey<string> Id = new("Resilience.CachingKey");
}

internal class CachingStrategy(IMemoryCache cache) : ResilienceStrategy
{
    protected override async ValueTask<Outcome<TResult>> ExecuteCore<TResult, TState>(
        Func<ResilienceContext, TState, ValueTask<Outcome<TResult>>> callback,
        ResilienceContext context,
        TState state)
    {
        if (!context.Properties.TryGetValue(CachingKey.Id, out var key))
        {
            return await callback(context, state);
        }

        if (cache.TryGetValue<TResult>(key, out var cachedValue))
        {
            return Outcome.FromResult(cachedValue!);
        }

        var outcome = await callback(context, state);

        if (outcome.Exception is null)
        {
            cache.Set(key, outcome.Result);
        }

        return outcome;
    }
}

And to use it:

ResilienceContext context = ResilienceContextPool.Shared.Get();
context.Properties.Set(CachingKey.Id, "some-key");
resiliencePipeline.ExecuteAsync(context, c => SomeExpensiveCacheableStringOperation(), context);

The strategy is based on pro-reactive implementation guidelines:
https://www.pollydocs.org/extensibility/proactive-strategy.html

So if you are blocked by this, you can do your own caching strategy in your project. If there is more Community-feedback and demand, we might introduce built-in caching strategy later. The challenge is not to duplicate and reinvent other caching API such as:

Instead, what we would want is to expose APIs that allows integrating various caches implementations easily, without us implementing our own caching layers. I am thinking of something very similar of what we did with RateLimiting, where we just created a thin layer over System.Threading.RateLimiting APIs.

@jodydonetti
Copy link

If you are interested I may implement a policy based on FusionCache, which in turn can be configured to seamlessly work with both in memory and distributed caches (it has full support for both sync/async programming model, without the usual sync-over-async trick which does have its issues).

Honestly it was on my todo list for some time (I mean adding FusionCache support to Polly), but first I was waiting for Polly V8 to be released and then I've been sidetracked with some other stuff recently.

@martintmk
Copy link
Contributor

@jodydonetti

Mi thinking about cache is to introduce some low-level layer into Polly.Core:

public readonly struct CacheStoreArguments<TValue>
{
    public Outcome<TValue> Outcome { get; }

    public ResilienceContext Context { get; }

    public string Key { get; }
}

public readonly struct CacheRetrieveArguments
{
    public ResilienceContext Context { get; }

    public string Key { get; }
}

public class CachingOptions<TValue> :ResilienceStrategyOptions
{
    [Required]
    public Func<CacheRetrieveArguments, ValueTask<Outcome<TValue>?>>? Retrieve { get; set; }

    [Required]
    public Func<CacheStoreArguments<TValue>, ValueTask<bool>>? Store { get; set; }
}

public static class CacheResiliencePipelineBuilderExtensions
{
    public static ResiliencePipelineBuilder AddCache<TValue>(
        this ResiliencePipelineBuilder<TValue> builder,
        CachingOptions<TValue> options)
    {
        throw new NotImplementedException();
    }
}

The strategy will use the delegates to retrieve and store items into cache. It will also reports hit misses and any cache actions. The implementation would reside in Polly.Core. It is really a minimal bare-bones resilience strategy.

Then we can expose some convenience overloads in Polly.Extensions that just specify their own retrieve/store delegates:

public static class CacheResiliencePipelineBuilderExtensions
{
    public static ResiliencePipelineBuilder AddMemoryCache<TValue>(
        this ResiliencePipelineBuilder<TValue> builder,
        IMemoryCache memoryCache)
    {
        return builder.AddCaching(new()
        {
            Retrieve = args =>
            {
                if (memoryCache.TryGetValue<TValue>(args.Key, out var cachedValue))
                {
                    return new ValueTask<Outcome<TValue>?>(Outcome.FromResult(cachedValue));
                }

                return ValueTask.FromResult<Outcome<TValue>?>(null!);
            },
            Store = args =>
            {
                if (args.Outcome.Exception is null)
                {
                    memoryCache.Set(args.Key, args.Outcome.Result);
                    return ValueTask.FromResult(true);
                }

                return ValueTask.FromResult(false);

            }
        });
    }

    public static ResiliencePipelineBuilder AddDistributedCache<TValue>(
        this ResiliencePipelineBuilder<TValue> builder,
        IDistributedCache distributedCache)
    {
        return builder.AddCaching(new()
        {
            Retrieve = async args =>
            {
                if (await distributedCache.GetAsync(args.Key) is { } cachedData)
                {
                    // deserialize and return the result
                }

                return null;
            },
            Store = async args =>
            {
                if (args.Outcome.Exception is null)
                {
                    // Serialize the result
                    byte[] data = ... ;

                    await distributedCache.SetAsync(args.Key, data, args.Context.CancellationToken);
                    return ValueTask.FromResult(true);
                }

                return ValueTask.FromResult(false);
            }
        });
    }
}

This way we can reduce our coupling to particular caching technology and make it really easy to integrate their own cache into Polly. This applies to FusionCache as well, then can just expose their own AddFusionCache extensions that leverages the core infra.

Wdyt?

cc @martincostello, @joelhulen

@martintmk martintmk reopened this Dec 4, 2023
@jodydonetti
Copy link

jodydonetti commented Dec 4, 2023

Hi @martintmk , I totally agree, I was not suggesting to add a hard link to FusionCache, that would be crazy.

The idea iwould be to have a caching abstraction specific for Polly with the specific requirements needed for Polly itself, so that various impl (including but of course not limited to FusionCache) can be made.

If, btw, there's currently no time to create such abstraction now, another possibility would be to start without the abstraction and try to see if it would be possible to get the caching feature as an external third party package, in this case tightly coupled with FusionCache, to get the feature out. Then, if and when the common abstraction will arrive, the package would be updated to work with the common abstraction. Just an idea.

Of course I would prefer the first approach, if possible.

ps: I'll look at your sample code asap, will let you know.

@TallBaldGeek
Copy link

@martintmk & @jodydonetti - Thanks for the quick responses and for reopening this issue for discussion.

Would your proposed approaches be something that would eventually make their way into the Microsoft.Extensions.Http.Resilience package so they could be added to an HTTP Client definition? Or is the intent only to allow a cache strategy to be defined and incorporated into a standalone resilience pipeline that would be defined outside of the HTTP Client factory?

I was hoping to have both approaches available so that we could continue to do caching at either the http message level or "one level up" to operate on a DTO.

@martintmk
Copy link
Contributor

Would your proposed approaches be something that would eventually make their way into the Microsoft.Extensions.Http.Resilience package so they could be added to an HTTP Client definition? Or is the intent only to allow a cache strategy to be defined and incorporated into a standalone resilience pipeline that would be defined outside of the HTTP Client factory?

Any new API added to Polly is automatically available in Microsoft.Extensions.Http.Resilience. This applies for caching too. When using AddResilienceHandler extension, the AddCache extension would be visible for ResiliencePipelineBuilder<HttpResponseMessage>.

One gotcha is that we can't serve the same instance of HttpResponseMessage so some cloning/custom handling would possibly be required.

@jodydonetti
Copy link

One gotcha is that we can't serve the same instance of HttpResponseMessage so some cloning/custom handling would possibly be required.

Makes sense, and I think the approach would be similar to the one taken by OutputCache which basically does more or less the same thing (it caches http responses and serve a copy of them later).

Anyway if a discussion starts on the shape of the generic cache api for Polly, I'm available to discuss the details of it and provide a first implementation based on FusionCache as a third-party package.

Copy link
Contributor

github-actions bot commented Feb 7, 2024

This issue is stale because it has been open for 60 days with no activity. It will be automatically closed in 14 days if no further updates are made.

@github-actions github-actions bot added the stale Stale issues or pull requests label Feb 7, 2024
@jodydonetti
Copy link

Hi, since this has been marked as stale and I'm on the verge of releasing v1.0 of FusionCache, I feel it can be a good time to give it a try and make an attempt at creating a first caching policy for Polly V8, at least to see how it would work out.
Is there anybody willing to have a chat in the next weeks about how to better approach that?
Thanks!

@github-actions github-actions bot removed the stale Stale issues or pull requests label Feb 8, 2024
Copy link
Contributor

github-actions bot commented Apr 8, 2024

This issue is stale because it has been open for 60 days with no activity. It will be automatically closed in 14 days if no further updates are made.

@github-actions github-actions bot added the stale Stale issues or pull requests label Apr 8, 2024
@OskarKlintrot
Copy link

Not stale, I hope?

@github-actions github-actions bot removed the stale Stale issues or pull requests label Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants