Allow (middleware) components to define internal actors #6538

ItalyPaleAle · 2023-06-18T17:31:54Z

Note this is about extending the capabilities of existing components, not creating a new component class.

Internal actors have been added in Dapr 1.10 to enable workflows. They allow the runtime itself to act as actor host for some actor types.

It would be helpful to extend the ability to register internal actor types to components, so that components can host actors when needed to perform certain tasks.

The most immediate need for this is to enable the Rate Limiter middleware to perform rate-limiting across multiple instances of the same service. Right now, the state is kept in-memory, so if the service is scaled horizontally, each instance has to manage its own rate-limiting.

From an implementation standpoint, this could be enabled by exposing the method actors.RegisterInternalActor to components, for example adding it to the context that’s passed to the components’ Init method

EDIT: Updated the proposal to indicate this is meant for middleware components only. At least at this stage, it may not make much sense to allow this for other kinds of components.

The text was updated successfully, but these errors were encountered:

ItalyPaleAle · 2023-06-30T17:37:17Z

Another issue in components-contrib that could benefit from internal actors: dapr/components-contrib#2635

yaron2 · 2023-07-05T21:51:01Z

Why not extend this to all component types and also to other parts of the runtime? For example, making the entire Resiliency feature distributed (global counts for circuits, retries) is a very lucrative feature that could benefit from this (It may be that internal actors are abstracted well enough to be used already by other parts of the runtime).

ItalyPaleAle · 2023-07-05T22:04:00Z

Why not extend this to all component types and also to other parts of the runtime? For example, making the entire Resiliency feature distributed (global counts for circuits, retries) is a very lucrative feature that could benefit from this (It may be that internal actors are abstracted well enough to be used already by other parts of the runtime).

I think that's a good idea for things like circuits (maybe less for retries since that would require sharing the entire request's state).

The downside of this approach of relying on actors (and why I am limiting it to middleware components for now) is that it requires having an actor runtime set up. So, a compatible actor state store, a placement service running, etc.

Perhaps there could be a separate issue for other places where the actor runtime could be leveraged more?

dapr-bot · 2023-09-03T22:13:52Z

This issue has been automatically marked as stale because it has not had activity in the last 60 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

ItalyPaleAle · 2023-09-06T19:21:49Z

I am very interested in this issue to enable more internal actors.

Components, especially middleware, can greatly benefit from them
- Examples include rate limiting, OAuth2
In the runtime, we can use them for other background tasks too, so they are coordinated even when multiple apps are running. Some examples could include:
- Workflow: cleaning up old workflow history
- Migrations, for example if we need to change how things like reminders are stored (hypothetically). This can be done in conjunction with Include runtime version in Dapr placement data to ease with upgrades #6838

I have investigated this and it seems that the work that's already been done for Workflow covers most of what's needed.

The InternalActor interface is all that a struct needs to implement:

dapr/pkg/actors/internal_actor.go

Lines 34 to 44 in e57a91f

    
           // InternalActor represents the interface for invoking an "internal" actor (one which is built into daprd directly). 
        
           type InternalActor interface { 
        
           	SetActorRuntime(actorsRuntime Actors) 
        
           	InvokeMethod(ctx context.Context, actorID string, methodName string, data []byte) (any, error) 
        
           	DeactivateActor(ctx context.Context, actorID string) error 
        
           	// InvokeReminder invokes reminder logic for an internal actor. 
        
           	// Note that the DecodeInternalActorReminderData function should be used to decode the [data] parameter. 
        
           	InvokeReminder(ctx context.Context, actorID string, reminderName string, data []byte, dueTime string, period string) error 
        
           	InvokeTimer(ctx context.Context, actorID string, timerName string, params []byte) error 
        
           }

The only "problem" is that to expose this to components in components-contrib, we have another issue with circular dependencies.

The biggest problem in the interface above is SetActorRuntime(actorsRuntime Actors) which involves setting the entire actor runtime's interface, so the internal actor can perform operations.

The Actors interface contains links to a lot of objects that are defined in dapr/dapr:

dapr/pkg/actors/actors.go

Lines 76 to 96 in e57a91f

    
           // Actors allow calling into virtual actors as well as actor state management. 
        
           // 
        
           //nolint:interfacebloat 
        
           type Actors interface { 
        
           	io.Closer 
        
           	Call(ctx context.Context, req *invokev1.InvokeMethodRequest) (*invokev1.InvokeMethodResponse, error) 
        
           	Init(context.Context) error 
        
           	GetState(ctx context.Context, req *GetStateRequest) (*StateResponse, error) 
        
           	GetBulkState(ctx context.Context, req *GetBulkStateRequest) (BulkStateResponse, error) 
        
           	TransactionalStateOperation(ctx context.Context, req *TransactionalRequest) error 
        
           	GetReminder(ctx context.Context, req *GetReminderRequest) (*internal.Reminder, error) 
        
           	CreateReminder(ctx context.Context, req *CreateReminderRequest) error 
        
           	DeleteReminder(ctx context.Context, req *DeleteReminderRequest) error 
        
           	RenameReminder(ctx context.Context, req *RenameReminderRequest) error 
        
           	CreateTimer(ctx context.Context, req *CreateTimerRequest) error 
        
           	DeleteTimer(ctx context.Context, req *DeleteTimerRequest) error 
        
           	IsActorHosted(ctx context.Context, req *ActorHostedRequest) bool 
        
           	GetActiveActorsCount(ctx context.Context) []*runtimev1pb.ActiveActorsCount 
        
           	RegisterInternalActor(ctx context.Context, actorType string, actor InternalActor) error 
        
           }

Some of these methods are internal and don't need to be exposed to an internal actor, but even simplifying the interface, we have a lot of types that need to be exposed. (#6892)

The most obvious solution to this problem would be to move the code to dapr/kit, but that would mean essentially moving the entire actors (or at least most of the types, such as CreateReminderRequest etc) runtime to dapr/kit, with possible maintenance issues in the future. A big issue is also that Call requires the invokev1 package.

What are your thoughts @dapr/maintainers-dapr ?

yaron2 · 2023-09-07T21:28:40Z

Moving to /kit will cause governance issues. Moving to a new repository to host the actors code with the same governance as dapr/dapr is doable, but we need to be entirely convinced first about why this can't be done via interface/package refactoring in this repository.

ItalyPaleAle · 2023-09-07T21:59:00Z

An external repo repo, be it kit or a new one, would also be a bit of a nightmare from a management standpoint.

A package refactoring would not work due to the circular dependency.

An interface would not be enough since we have all the *Request and *Response structs, but more crucially we have the invokev1.InvokeMethodRequest and invokev1.InvokeMethodResponse types.

I'm not sure what the right answer is.

artursouza · 2023-10-23T16:31:27Z

I am not in favor of adding a circular dependency between other building blocks and actors for a few reasons.

Actors is a special building block that is built on top of others, at the runtime. The interface might solve a circular dependency at compilation time but it still creates a circular dependency at runtime - which is way worse. Imagine trying to apply a middleware to an actors API, now it becomes a circular dependency.

Regarding OAuth and rate limiting, there are ways to share state without actors and we should favor those first. I will share here in this issue some references on how Redis, Memcached and similar stack can be used. The abstraction here might be simply an optional reference to a state store that a middleware component can add.

Regarding workflows, bringing logic to components contrib seems odd too. Cleaning up workflow history can perfectly fit in runtime.

ItalyPaleAle · 2023-10-23T16:43:33Z

The benefits of using actors are:

It includes support for storing state without having to explicitly add another component. For example, if we were to allow any state store, we'd need users to create a new component specifically for that, and then a way to reference that.
It allows keeping state in-memory, which avoids the need to make requests to persistent storage every time (which are slower). Of course, this state can be lost at any time, but in some cases that is acceptable.
It offers the ability to use reminders to perform garbage collection, both in the state store and in external services as needed.
It offers concurrency control without the need to mess with etags
It's a one-stop solution that would fix a lot of problems we have today and in the future, and doesn't require one-off implementations that will need to be maintained by contrib maintainers in the future
It allows us to dogfood actors more, which is probably something we need to do

Regarding workflows, bringing logic to components contrib seems odd too. Cleaning up workflow history can perfectly fit in runtime.

I don't believe bringing this logic to components-contrib was something that was proposed. The case above was about using an internal actor for cleanups, but wasn't about using it in components-contrib.

artursouza · 2023-10-23T16:48:14Z

The benefits of using actors are:

It includes support for storing state without having to explicitly add another component. For example, if we were to allow any state store, we'd need users to create a new component specifically for that, and then a way to reference that.

It allows keeping state in-memory, which avoids the need to make requests to persistent storage every time (which are slower). Of course, this state can be lost at any time, but in some cases that is acceptable.

It offers the ability to use reminders to perform garbage collection, both in the state store and in external services as needed.

It offers concurrency control without the need to mess with etags

It's a one-stop solution that would fix a lot of problems we have today and in the future, and doesn't require one-off implementations that will need to be maintained by contrib maintainers in the future

It allows us to dogfood actors more, which is probably something we need to do

Regarding workflows, bringing logic to components contrib seems odd too. Cleaning up workflow history can perfectly fit in runtime.

I don't believe bringing this logic to components-contrib was something that was proposed. The case above was about using an internal actor for cleanups, but wasn't about using it in components-contrib.

I agree that actors has advantages but we can solve the problems on each scenario without using actors. As I said, it is dangerous circular dependency that will cause more issues than it will solve. I am not in favor of this direction for the project.

ItalyPaleAle · 2023-10-23T16:49:36Z

Looking forward hearing your suggestions. We are currently blocked by this on things like dapr/components-contrib#2967 (the component isn't really usable today in some scenarios)

artursouza · 2023-10-23T17:03:38Z

These are solved problems in the industry today and don't use actors. I am doing a simple Google search and rate limiting is a common scenario for it. It makes total sense to have a Redis-specific implementation for rate limiter: https://redis.io/commands/incr/#pattern-rate-limiter

If someone finds a way to write one with SQL, that would not be as common, so I don't think we need an interface. Just let the middleware be specific per stack.

I will be doing more Google searching and find more about OAuth. Again, it is a solved problem in the industry - we are not the first ones to solve this.

ItalyPaleAle · 2023-10-23T17:08:15Z

It makes total sense to have a Redis-specific implementation for rate limiter

Yes, that's what as a contrib maintainer I do not want, because it will lead to others making (or asking for) implementations for N more components. The burden of maintaining that will be on us.

artursouza · 2023-10-23T17:12:50Z

It makes total sense to have a Redis-specific implementation for rate limiter

Yes, that's what as a contrib maintainer I do not want, because it will lead to others making (or asking for) implementations for N more components. The burden of maintaining that will be on us.

Well, the industry has moved towards that direction already. I doubt there will be many more. SQL is not widely used for rate limiting. I doubt you will have more than a couple of stateful rate limiters. Having a native Redis implementation benefits the entire community and is a common implementation each company has to do. It seems a perfect fit for a specific component. As a contrib maintainer, I vote to have it. (Are we tracking this specific proposal somewhere? PR? Issue?)

artursouza · 2023-10-23T17:20:06Z

Regarding, dapr/components-contrib#2967 - it seems to be solvable by using the state store interface directly, using the TTL feature to auto-delete old keys. Then, add an in-memory cache on top, respecting the ttl value (@JoshVanL added this recently) and you have the tokens saved.

If you want to improve reuse, there is an opportunity to create a generic cached state store implementation that can take any state store and add a cache layer on top with a cache replace policy (LRU, FIFO, etc). With the advantage of being easily adopted with existing Dapr APIs.

ItalyPaleAle · 2023-10-23T17:20:34Z

SQL is not widely used for rate limiting. I doubt you will have more than a couple of stateful rate limiters.

Maybe, but it's not just SQL. What about Memcached? CosmosDB? etcd? You can see where I'm going and where the community will go.

Assuming we do go down that road, it also solves the rate-limiter issue, but we'll have the OAuth2 middleware too, and possibly other scenarios in the future.

I really do not believe the additional overhead on maintainers and contributors is worth the effort just because leveraging an existing feature of the Dapr runtime seems odd to you.

Regarding, dapr/components-contrib#2967 - it seems to be solvable by using the state store interface directly, using the TTL feature to auto-delete old keys

That requires users to provide their own state store, so we need to ask the runtime to give us the initialized component. You still have the "circular dependency".

PS: this issue was opened on June 18 and I really wish we had heard those concerns before. There was some conversation already. We did delay fixing dapr/components-contrib#2967 because we were waiting for this, and it is impacting users...

stuartleeks · 2023-11-02T18:32:54Z

There are some discussions on #6935 which is aiming to give distributed rate-limited capabilities (rather than the current per-process rate-limiting).

The thinking for that is that having the rate-limit state be handle directly by Dapr would be beneficial as it simplifies the user configuration as well as removing the need to add support for multiple stores.

Internal actors feel like they would be a good candidate for this scenario (assuming they were made available)

sujitdmello · 2023-11-02T19:09:54Z

Specifically for the rate-limiting scenario, using Redis is the fastest and most efficient option. While using internal Actors is a convenient option, it appears to have more hops to get to the Actor to get/set token bucket and that is potentially done frequently across multiple Pods. Using the built-in features of Redis for this specific use-case makes sense IMHO.

ItalyPaleAle · 2023-11-02T19:26:28Z

@sujitdmello i am afraid that any implementation in a component (such as the rate limiter middleware) that is specific to a service/store (such as just redis) would not work. The design needs to leverage existing components to prevent having to re-implement support for various state stores again.

actors still feel the most appropriate solution to me. They already interface with most state stores, have built-in concurrency control that is more reliable than using etags (opportunistic concurrency), and have built-in routines for caching in-memory.

additionally each dapr runtime already comes with an actor state store when configured. If we had to reference components without using actors, we’d need to worry about referencing external components, initialization orders, etc.

berndverst · 2023-11-09T23:07:45Z

So in summary my understanding is:

The component interfaces could be updated such that they receive an optional reference to the actor runtime (an easy change).
In runtime you can pass nil for this reference, and only pass in the actor runtime for the components that really require it (e.g. generic rate limiter component). This is how runtime maintainers can be certain that components do not arbitrarily start relying on actors.
Independently, there can still be native Redis rate limiter and other component implementations. None of this work is preventing any of that, and I will support such efforts.

How does that sound @artursouza @yaron2 @mukundansundar @daixiang0?

I agree with the technologies you pointed our @artursouza, but I also think that this doesn't have have to be mutually exclusive at all. So I'd like us to make both happen!

artursouza · 2023-11-09T23:09:32Z

I don't think that implementing rate limiting with Actors is the right thing for the project. It creates a circular dependency that will be hard to debug in runtime. At this point, I am not in favor of this change.

berndverst · 2023-11-09T23:12:07Z

I don't think that implementing rate limiting with Actors is the right thing for the project. It creates a circular dependency that will be hard to debug in runtime. At this point, I am not in favor of this change.

@artursouza can you please go into more detail why this is harder to debug than say debugging a Redis implementation?

I'd like to especially understand where the circular dependency comes into play in making this unreasonably difficult.

artursouza · 2023-11-09T23:12:07Z

In addition, actors are not designed for high availability and can be downtime when rebalancing. Causing downtime to rate limiting.

ItalyPaleAle · 2023-11-09T23:16:45Z

In addition, actors are not designed for high availability and can be downtime when rebalancing. Causing downtime to rate limiting.

This is factually incorrect.

Yes, actors can go down at any time. For a rate-limiter, losing the state that is stored in-memory is generally not an issue, as it would just reset the rate-limiting counter. For other situations where state matters, state can be persisted on the actor store.

But an actor can always be invoked, even during rebalancing, on some host. This is a behavior we have tests (E2E and integration) for.

At this point, I am not in favor of this change.

Given the problem that we are facing, which is impacting actual Dapr users that have been waiting for months, what would your recommended alternative be? Even using state stores (not actors) does add a circular dependency, if nothing else because of the component registry.

berndverst · 2023-11-09T23:18:29Z

In addition, actors are not designed for high availability and can be downtime when rebalancing. Causing downtime to rate limiting.

Note that distributed rate limiters do not offer exact counting generally, so some inaccuracies are acceptable with all rate limiters - e.g. some events not getting counted, or getting counted too late. Of course poor latency or a complete loss of the current rate limiting bucket values would not be ok.

artursouza · 2023-11-09T23:20:41Z

I am repeating myself here. I proposed already, maybe ignored, to implement it via exposing the state store interface to middlewares. A horizontal dependency in contrib.

Users have asked this for redis, which would have been implemented already if there is an actual rush.

"No is temporary, yes is forever."

yaron2 · 2023-11-09T23:24:43Z

Essentially what we need for a distributed rate limiter is an atomic incrementer. We can use etags to achieve that. If we can do that we'll get the following benefits:

Contrib depends on contrib only and we avoid a circular dependency
Users have more options for state stores
User experience is somewhat better as users do not have to wonder why they need an actor state store for rate limiters when not using actors
Debugging would be easier, as etags don't carry the same amount of issues that actors might bring

ItalyPaleAle · 2023-11-09T23:24:57Z

I proposed already, maybe ignored, to implement it via exposing the state store interface to middlewares. A horizontal dependency in contrib.

When you think about this, this also has a dependency on the runtime.

Yes, the component's code lives in contrib, but the registry and initialization live in the runtime. The runtime must be the one initializing components because it needs to fetch secrets from the Operator.

Unless you are suggesting that each component initializes other components. This would be horizontal indeed, but then we'd need to support the metadata for those components into the "parent" components. For example, a middleware would have to add fields for all the fields supported by the state store it wants to use. This is a mess from the perspective of the user.

Users have asked this for redis, which would have been implemented already if there is an actual rush.

The most pressing issue is actually OAuth2, which today doesn't work. Either you are limited to 1 instance of your app (keeping token in-memory), or it sotres in a cookie and it can't work with Azure AD because the tokens don't fit in a cookie.

berndverst · 2023-11-09T23:25:38Z

Users have asked this for redis, which would have been implemented already if there is an actual rush.

"No is temporary, yes is forever."

I am personally interested in saving users compute resources and money in various clouds or even their own self hosted deployments. For this reason as I mentioned it is preferable to reuse an existing database server / container that is already deployed / provisioned. Hope that makes sense.

A separate Redis deployment as a rate limiter may appeal to some users. I agree - but I personally am very turned off by this world where we make everyone deploy the whole kitchen sink for tiny features from the respective services.

artursouza · 2023-11-09T23:28:24Z

Users have asked this for redis, which would have been implemented already if there is an actual rush.

"No is temporary, yes is forever."

I am personally interested in saving users compute resources and money in various clouds or even their own self hosted deployments. For this reason as I mentioned it is preferable to reuse an existing database server / container that is already deployed / provisioned. Hope that makes sense.

A separate Redis deployment as a rate limiter may appeal to some users. I agree - but I personally am very turned off by this world where we make everyone deploy the whole kitchen sink for tiny features from the respective services.

Yes. For that reason, we can expose the state store API / which is a much simpler solution and easier to debug and operate than actors. Which is my proposal to solve this problem.

yaron2 · 2023-11-09T23:28:41Z

If we can achieve rate limiting like this (#6538 (comment)) it will further reduce consumption of resources as both actor hosts and placement are known to consume memory as scale increases. In addition you could use rate limiters and skip installing placement altogether

ItalyPaleAle · 2023-11-09T23:30:30Z

Please, let's not get carried away too much with rate-limiting. The problem is more generic. I mentioned OAuth2 is the biggest issue right now.

I have also explained above why I don't think state stores are the solution. something needs to initialize the component, so it's either a circular dependency or a hell of a messy job: #6538 (comment)

Additionally, actors can keep data in-memory while they're running, so we avoid I/O with the state store on every call

berndverst · 2023-11-09T23:36:18Z

Users have asked this for redis, which would have been implemented already if there is an actual rush.

"No is temporary, yes is forever."

I am personally interested in saving users compute resources and money in various clouds or even their own self hosted deployments. For this reason as I mentioned it is preferable to reuse an existing database server / container that is already deployed / provisioned. Hope that makes sense.
A separate Redis deployment as a rate limiter may appeal to some users. I agree - but I personally am very turned off by this world where we make everyone deploy the whole kitchen sink for tiny features from the respective services.

Yes. For that reason, we can expose the state store API / which is a much simpler solution and easier to debug and operate than actors. Which is my proposal to solve this problem.

Let's say we can find a solution using state stores (like etags and such), wouldn't we still need a reference from runtime with an initialized state store? I think there is still a circular dependency no matter what.

yaron2 · 2023-11-09T23:39:24Z

The middleware (in contrib) would need to accept a state.StateStore interface that also lives in contrib.

ItalyPaleAle · 2023-11-09T23:43:24Z

The middleware (in contrib) would need to accept a state.StateStore interface that also lives in contrib.

But the state store object (which complies with that interface) is instantiated by the runtime. So you have your circular dependency there too? My understanding is that @artursouza doesn't want any circular dependency, so we don't really have options.

He defined circular dependencies at runtime too:

Actors is a special building block that is built on top of others, at the runtime. The interface might solve a circular dependency at compilation time but it still creates a circular dependency at runtime - which is way worse.

Then you'd need to figure a way in the metadata of the middleware to know which state store you're referencing.

Using actors:

We could still have an interface in contrib. This interface is for an abstraction on top of actors (that doesn't depend on other internals in the runtime)
The runtime passes a concrete object that implements that interface

I don't see much differences there, except that with actors we can keep data in-memory and avoid (expensive) I/O on every call...

That said, using state stores would still be acceptable in the case you're describing above (although I like them much less because that always requires I/O and it's a lot slower for things like OAuth tokens). But this doesn't allow getting away of the "circular dependency" Artur is concerned with.

berndverst · 2023-11-09T23:44:58Z

The middleware (in contrib) would need to accept a state.StateStore interface that also lives in contrib.

I get what you are saying. Let me see if I can write this up so everybody understands :)

artursouza · 2023-11-09T23:47:59Z

The middleware (in contrib) would need to accept a state.StateStore interface that also lives in contrib.

But the state store object (which complies with that interface) is instantiated by the runtime. So you have your circular dependency there too? My understanding is that @artursouza doesn't want any circular dependency, so we don't really have options.

Then you'd need to figure a way in the metadata of the middleware to know which state store you're referencing.

Using actors:

We could still have an interface in contrib. This interface is for an abstraction on top of actors (that doesn't depend on other internals in the runtime)

The runtime passes a concrete object that implements that interface

I don't see much differences there, except that with actors we can keep data in-memory and avoid (expensive) I/O on every call...

That said, using state stores would still be acceptable in the case you're describing above (although I like them much less because that always requires I/O and it's a lot slower for things like OAuth tokens). But this doesn't allow getting away of the "circular dependency" Artur is concerned with.

That is a much reduced circular dependency and we can eventually refactor that into contrib 100%. Having OAuth or rate limiter to depend on actors is a much bigger pill to swallow, with ops concerns.

We can implement a cache layer too and avoid read for each request on oauth. Also, rate limiter can be eventual consistent, which allows caching there too. A combination of token bucket with async reads and writes for state store can be really fast with a quick time to market for users.

berndverst · 2023-11-09T23:54:08Z

In runtime during the component initialization we somehow need to get a reference to the state store to be used. The name can be obtained from the Middleware metadata. Then we either init that state store, or load the existing reference of it and pass it into the constructor for the middleware.

dapr/cmd/daprd/components/middleware_http_oauth2.go

Lines 28 to 34 in 770a047

    
           func init() { 
        
           	httpMiddlewareLoader.DefaultRegistry.RegisterComponent(func(log logger.Logger) httpMiddlewareLoader.FactoryMethod { 
        
           		return func(metadata middleware.Metadata) (httpMiddleware.Middleware, error) { 
        
           			return oauth2.NewOAuth2Middleware(log).GetHandler(context.TODO(), metadata) 
        
           		} 
        
           	}, "oauth2") 
        
           }

return oauth2.NewOAuth2Middleware(log).GetHandler(context.TODO(), statestore, metadata)

statestore would be the reference here.

And in components-contrib we'd update the interface to take the state store reference:

https://github.com/dapr/components-contrib/blob/8fe74b15aa5c89c730076bfa735fa57f7e4d0058/middleware/middleware.go#L21-L24

This would become

// Middleware is the interface for a middleware.
type Middleware interface {
	GetHandler(ctx context.Context, statestore state.Store, metadata Metadata) (func(next http.Handler) http.Handler, error)
}

For a strong reference to the state store name to be used we should add this to the middleware metadata.

https://github.com/dapr/components-contrib/blob/8fe74b15aa5c89c730076bfa735fa57f7e4d0058/middleware/metadata.go#L19-L21

type Metadata struct {
	metadata.Base `json:",inline"`
        Statestore string `json:"statestore"`
}

Does that sound right?

That would not change anything about imports in either contrib or runtime.

yaron2 · 2023-11-09T23:57:29Z

The circular dependency is a secondary win/concern.

In addition to what I described above, we should consider a lot of users use Redis and other NoSQL DBs for rate limiting/session tokens because of their performance. As we know, Redis as an actor state store is deprecated and not recommended, which will further limit users of using a state store that's fit for purpose for this scenario.

So for me, the main benefits are:

Reliablity and better troubleshooting. Actors bring way more complexity than etags
Reduced resource consumption and footprint
Architectural decoupling from actors and it's subsystem
Larger choice of fit for purpose state stores for users

berndverst · 2023-11-10T00:06:39Z

The circular dependency is a secondary win/concern.

In addition to what I described above, we should consider a lot of users use Redis and other NoSQL DBs for rate limiting/session tokens because of their performance. As we know, Redis as an actor state store is deprecated and not recommended, which will further limit users of using a state store that's fit for purpose for this scenario.

So for me, the main benefits are:

Reliablity and better troubleshooting. Actors bring way more complexity than etags

Reduced resource consumption and footprint

Architectural decoupling from actors and it's subsystem

Larger choice of fit for purpose state stores for users

Another possibility this enables (not saying we have to go down that route):
We could expand the state store interface to include efficient methods that can be easily implemented by every state store. Something like Increment, Get and Set which do not use the Dapr envelope in use today.

I'm a bit worried about the overhead of the Dapr envelope for all of these operations.

yaron2 · 2023-11-10T00:12:10Z

The circular dependency is a secondary win/concern.
In addition to what I described above, we should consider a lot of users use Redis and other NoSQL DBs for rate limiting/session tokens because of their performance. As we know, Redis as an actor state store is deprecated and not recommended, which will further limit users of using a state store that's fit for purpose for this scenario.
So for me, the main benefits are:

Reliablity and better troubleshooting. Actors bring way more complexity than etags

Reduced resource consumption and footprint

Architectural decoupling from actors and it's subsystem

Larger choice of fit for purpose state stores for users

Another possibility this enables (not saying we have to go down that route): We could expand the state store interface to include efficient methods that can be easily implemented by every state store. Something like Increment, Get and Set which do not use the Dapr envelope in use today.

I'm a bit worried about the overhead of the Dapr envelope for all of these operations.

For sure, we can create granular interfaces for internal use and as these grow to critical mass we can even look at these as candidates for state store API expansion

ItalyPaleAle · 2023-11-10T01:33:13Z

@berndverst your design has some valid thoughts, although:

Rather than adding a property to the constructor you may want to add it as a property to the Metadata struct (alongside Name and Properties). This way we can extend it to pass other kinds of components in the future, and it is also simpler to change as an API (you know, the usual dance with contrib, runtime, and stuff)
We need to solve the problem of the initialization order. The runtime needs to be aware of when a component depends on another component and initialize them in order (sort of what we do for referencing secrets in secret stores).

However, I still think this is a very much sub-par idea than using actors. This requires re-implementing caching, worrying about when the cache needs to be consistent or not, does more I/O with storage, and OCC with etags means that operations more often than not need to be retried (see what's going on with reminders that fail after a certain load due to too many etag conflicts). This is gonna hurt us badly down the road like it's hurting us with reminders, and in the best case scenario requests to state store will be retried many times (adding lots of latency).

Why re-implementing the wheel when we already have it done, and have it done well (and is getting better)?

In addition to what I described above, we should consider a lot of users use Redis and other NoSQL DBs for rate limiting/session tokens because of their performance. As we know, Redis as an actor state store is deprecated and not recommended, which will further limit users of using a state store that's fit for purpose for this scenario.

As mentioned before many times, the choice of actor state store is really irrelevant for rate-limiting, as with actors the state would be kept in-memory and there'd be no operation on the state store at all...

Heck, in this specific example (rate-limiting), assuming you don't use actors for anything else, you could even initialize the actor runtime with the in-memory component as actor state store, as it doesn't matter at all.

Actors aren't just for storage but they have business logic too. In the case of rate-limiting, we wouldn't just store the state in the actor: we'd send the caller IP and ask the actor to tell us if the call is authorized.

This allows performing the rate-limiting with 1 network call only instead of 2 (read and set). So this should offer half the latency for rate limiting (assuming calls to persistent store have the same latency as calls to actors within the cluster)

This also means the same actor can be re-used, for example for a rate-limiter binding too!

Lastly, while actors a bit more complex, if we use the actors APIs directly (no SDK) it's not too bad. And if we are scared of using actors ourselves, what message are sending to our users??

yaron2 · 2023-11-10T02:47:53Z

Actors with their single threaded nature can become a pretty big bottleneck even when saving state in-memory in terms of the waited I/O calls. However, this doesn't concern me so much for the same reason i'm not concerned too much about the I/O operations to the database - both incrementing and reading can be made asynchronously and outside the path of the main call, as we've concluded that counting can both exceed the threshold and be updated in an eventually consistent way without real time coordination between all instances. More so, if there's an I/O bottleneck waiting for entry into an actor,that might create a noisy neighber problem for other functionality of the Dapr sidecar hosting that actor instance.

The only selling point for me for using actors is the in-memory serving, so now its a question of tradeoffs - does this feature justify all the disadvantages I mentioned above, also considering we can with little work still implement in-memory serving at the middleware component level.

Re: message to users, to me its more about sending the message that we do our best to avoid The Law of the Instrument and use the right tool for the right job, and not simply because a certain tool can also do that job.

ItalyPaleAle · 2023-11-10T03:08:52Z

Actors with their single threaded nature can become a pretty big bottleneck even when saving state in-memory in terms of the waited I/O calls

Your suggestion of using etags is pretty much limited to single-threaded nature too. If you need to read from the state store, get the etag, and then save with the etag, you're still limited to one concurrent transactions.

If there were more than one concurrent transaction, you'd have an etag conflicts, which means the read-write loop needs to be repeated until it succeeds - like we do with reminders (and the performance of reminders is just abysmal).

as we've concluded that counting can both exceed the threshold and be updated in an eventually consistent way without real time coordination between all instances.

Surely there's a trade-off here with how "eventual" the consistency is. With limit of X rps and N replicas, you may have bursts where you're allowing N * X requests, which may or may not be acceptable. Ideally this would have strong consistency, with the etag conflict issues.

if there's an I/O bottleneck waiting for entry into an actor,that might create a noisy neighber problem for other functionality of the Dapr sidecar hosting that actor instance.

I'm not following this. Isn't the queue per-actor?

use the right tool for the right job, and not simply because a certain tool can also do that job.

I'm actually convinced actors are the right tool for this job. I have explained why multiple times, and it's the most generic and universal solution for the problems we have today and in the future.

Bernd almost convinced me that state stores were a viable options until I realized how they have issues with etag conflicts, how they require re-implementing caches everywhere, and how they involve making trade-off between performance and consistency.

Most critically, I really worry using OCC (etags) we will box ourselves with having to deal with the same issues we have with reminders today.

artursouza · 2023-11-10T17:35:38Z

Handling eTags is common when implementing distributed systems. The problem with reminders is that we are trying to persist a list on a limited set of keys, not because we are handling eTags.

I think we have plenty of data to reach a decision among maintainers of runtime. We don't need to have unanimous agreement. People have different views on architecture principles and that is OK.

I vote (binding vote) for using state store interface to implement OAuth and rate limiter middlewares.

yaron2 · 2023-11-10T17:46:48Z

I definitely agree @artursouza that the isue with reminders is not etag but with the underlying data structure we chose.

I would like a little more time to examine this issue.

@ItalyPaleAle:

Your suggestion of using etags is pretty much limited to single-threaded nature too

Yes, but the main difference is that a Dapr sidecar is not blocking connections due to the single threaded execution as that happens in the DB.

I'm not following this. Isn't the queue per-actor?

It is, but an actor instance is always a representation of a resource (for example, a URL path for a given app ID). If the rate limiting representation of actors means there will be concurrent hits on the same resource (actor ID) this will lead to somewhat of an undesirable way of using actors.

ItalyPaleAle · 2023-11-10T17:54:06Z

The problem with reminders is that we are trying to persist a list on a limited set of keys, not because we are handling eTags.

That is true, but it's also true that etags are a big part of the problem too. I have been working on actors a lot lately and I can confirm this is the case. For example, the reason why the Workflow perf test fails 100% of times with 3 replicas of the app (and frequently with 2 replicas too) is because of etag conflicts while storing reminders. Essentially, 2 instances try to store reminders at the same time and they are stuck in a loop where they fetch data with one etag, and by the time they go write it, the etag has changed and the write fails. This is the typical problem with "distributed systems" that rely on OCC. The solution is to either fail (causing errors to be cascaded) or retry, possibly for an infinite amount of time, which increases latency significantly.

Yes, but the main difference is that a Dapr sidecar is not blocking connections due to the single threaded execution as that happens in the DB.

There can be workarounds to this. I have some ideas (that will be on a separate proposal) such as being able to run an actor concurrently if we know that it doesn't need single-threaded execution (like in the case of tokens - but it will need to be single-threaded for rate-limiting)

I think we have plenty of data to reach a decision among maintainers of runtime

This impacts components-contrib too, probably more than runtime (for the benefits), and contrib maintainers should be involved in the decision as well.

ItalyPaleAle mentioned this issue Jul 3, 2023

[WIP] OAuth2 Middleware: Store token in a cookie dapr/components-contrib#2963

Closed

ItalyPaleAle changed the title ~~Allow components to define internal actors~~ Allow (middleware) components to define internal actors Jul 3, 2023

This was referenced Jul 3, 2023

New OAuth2 middlewares with token persistence dapr/components-contrib#2964

Closed

[1.13] New OAuth2 middleware with token persistence dapr/components-contrib#2967

Draft

ItalyPaleAle mentioned this issue Jul 13, 2023

Dapr middleware support microsoft/azure-container-apps#53

Open

dapr-bot added the stale Issues and PRs without response label Sep 3, 2023

ItalyPaleAle added the pinned label Sep 5, 2023

ItalyPaleAle added this to the v1.13 milestone Sep 5, 2023

ItalyPaleAle removed the stale Issues and PRs without response label Sep 5, 2023

This was referenced Sep 6, 2023

Split Actors interface betweeen runtime and internal #6892

Merged

Decrease Internal Actor Idle Timeout #6898

Merged

yaron2 modified the milestones: v1.13, v1.14 Feb 26, 2024

Allow (middleware) components to define internal actors #6538

Allow (middleware) components to define internal actors #6538

Comments

ItalyPaleAle commented Jun 18, 2023 • edited Loading

ItalyPaleAle commented Jun 30, 2023

yaron2 commented Jul 5, 2023

ItalyPaleAle commented Jul 5, 2023

dapr-bot commented Sep 3, 2023

ItalyPaleAle commented Sep 6, 2023 • edited Loading

yaron2 commented Sep 7, 2023

ItalyPaleAle commented Sep 7, 2023

artursouza commented Oct 23, 2023

ItalyPaleAle commented Oct 23, 2023

artursouza commented Oct 23, 2023

ItalyPaleAle commented Oct 23, 2023

artursouza commented Oct 23, 2023 • edited Loading

ItalyPaleAle commented Oct 23, 2023

artursouza commented Oct 23, 2023

artursouza commented Oct 23, 2023

ItalyPaleAle commented Oct 23, 2023 • edited Loading

stuartleeks commented Nov 2, 2023

sujitdmello commented Nov 2, 2023

ItalyPaleAle commented Nov 2, 2023

berndverst commented Nov 9, 2023 • edited Loading

artursouza commented Nov 9, 2023

berndverst commented Nov 9, 2023

artursouza commented Nov 9, 2023

ItalyPaleAle commented Nov 9, 2023

berndverst commented Nov 9, 2023

artursouza commented Nov 9, 2023

yaron2 commented Nov 9, 2023

ItalyPaleAle commented Nov 9, 2023

berndverst commented Nov 9, 2023

artursouza commented Nov 9, 2023

yaron2 commented Nov 9, 2023

ItalyPaleAle commented Nov 9, 2023 • edited Loading

berndverst commented Nov 9, 2023

yaron2 commented Nov 9, 2023

ItalyPaleAle commented Nov 9, 2023 • edited Loading

berndverst commented Nov 9, 2023

artursouza commented Nov 9, 2023

berndverst commented Nov 9, 2023 • edited Loading

yaron2 commented Nov 9, 2023

berndverst commented Nov 10, 2023

yaron2 commented Nov 10, 2023 • edited Loading

ItalyPaleAle commented Nov 10, 2023 • edited Loading

yaron2 commented Nov 10, 2023

ItalyPaleAle commented Nov 10, 2023

artursouza commented Nov 10, 2023 • edited Loading

yaron2 commented Nov 10, 2023

ItalyPaleAle commented Nov 10, 2023

ItalyPaleAle commented Jun 18, 2023 •

edited

Loading

ItalyPaleAle commented Sep 6, 2023 •

edited

Loading

artursouza commented Oct 23, 2023 •

edited

Loading

ItalyPaleAle commented Oct 23, 2023 •

edited

Loading

berndverst commented Nov 9, 2023 •

edited

Loading

ItalyPaleAle commented Nov 9, 2023 •

edited

Loading

ItalyPaleAle commented Nov 9, 2023 •

edited

Loading

berndverst commented Nov 9, 2023 •

edited

Loading

yaron2 commented Nov 10, 2023 •

edited

Loading

ItalyPaleAle commented Nov 10, 2023 •

edited

Loading

artursouza commented Nov 10, 2023 •

edited

Loading