[Feature Request] Scaling on custom query #6

Open
ankitbko opened this issue Jun 22, 2021 · 17 comments

Comments

@ankitbko

ankitbko commented Jun 22, 2021

Any plans to support scaling based on a custom query on Cosmos DB, as the MSSQL and MongoDB scalers do?

@tomkerkhove
Member

You can do that as part of KEDA core - http://keda.sh/

@ankitbko
Author

@tomkerkhove I will need to write an external scaler similar to this one, right?

@tomkerkhove
Member

No, you can just use KEDA which supports both systems.

@ankitbko
Author

ankitbko commented Jun 22, 2021

Does it support scaling based on a Cosmos DB query? I don't see a scaler for that. Apologies if I am not understanding this correctly; I am new to KEDA.

@tomkerkhove
Member

Oh I misunderstood your original question - Sorry!

I thought you wanted to read a custom query from MSSQL & MongoDB.

@JatinSanghvi
Collaborator

@ankitbko - The Cosmos DB scaler will work by monitoring the change feed that is being (or will be) processed by the target application. From the Cosmos DB documentation, it does not look like we can have filtered change feeds. The estimator documented here does not support filtering through a query either.

It should not be possible for the scaler to read the content of these change feeds, apply the custom query, and mark the external event as 'active' only if there are non-zero results after filtering. That would require taking over the leases on these change feeds, which would conflict with the leases held by the target application and may result in missed events.

Let me know if you have any suggestions for enabling custom queries using existing support in Cosmos DB.
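
For illustration, here is a minimal sketch of the kind of activity check described above. It assumes the scaler receives only a per-lease estimate of pending changes from the change-feed estimator. The function names and the shape of the input mapping are hypothetical and are not taken from the scaler's code.

```python
from typing import Mapping


def is_active(estimated_lag_by_lease: Mapping[str, int]) -> bool:
    """Return True when any lease reports pending (unprocessed) changes.

    `estimated_lag_by_lease` is a hypothetical mapping from a lease token to
    the estimated number of changes not yet processed for that lease.
    """
    return any(lag > 0 for lag in estimated_lag_by_lease.values())


def total_lag(estimated_lag_by_lease: Mapping[str, int]) -> int:
    """Sum the pending changes across all leases, treating negative estimates as zero."""
    return sum(max(lag, 0) for lag in estimated_lag_by_lease.values())
```

Note that the input carries counts only, not document contents, which is why a filter on document fields cannot be applied at this level.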

@tomkerkhove
Member

What would be a good improvement, though, would be to call it the "Azure Cosmos DB Changefeed scaler" to emphasize this @JatinSanghvi

@tomkerkhove
Member

The original request is for the data plane for which I can see value as well, but that would rather be a different scenario/scaler

@JatinSanghvi
Collaborator

@tomkerkhove, I am unable to understand. I was talking about the data plane actually. Change feeds are the only way new changes to the Cosmos DB container can be processed, and we expect all target applications to use change feeds for change processing.

The number of change feeds cannot be scaled on demand; it is the same as the count of underlying physical partitions of the Cosmos DB container, and the number of physical partitions depends on the storage size of the container (and also on provisioned throughput).

If there are multiple instances of processor apps running (maybe because they were scaled out by KEDA), each of these instances will acquire a lease on one or more of these feeds and start consuming the changes. This caps the number of useful app instances at the count of physical partitions in the Cosmos DB container. The leases ensure that two different apps don't process the same data. But this also limits what the scaler can possibly do. For example, it can estimate the size of data pending for processing, but it cannot steal leases from the scaled-target apps to read data, say to apply a filter through a custom query.

Let me know if I misunderstood your comment.
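
As a rough sketch of the cap described in this comment, assuming one lease per physical partition as stated above (the function name and parameters are hypothetical):

```python
def effective_replicas(desired_replicas: int, physical_partitions: int) -> int:
    """Cap the replica count at the number of physical partitions.

    Assumption from the comment above: there is one change feed (and one
    lease) per physical partition, so any instance beyond that count would
    hold no lease and have nothing to process.
    """
    if desired_replicas < 0 or physical_partitions < 0:
        raise ValueError("counts must be non-negative")
    return min(desired_replicas, physical_partitions)
```

For example, asking for 10 instances against a container with 4 physical partitions leaves at most 4 instances doing work.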

@tomkerkhove
Member

Changefeed is indeed for data processing of changes, but there is more than that.

If somebody wants to scale based on the # of docs returned from a query, that is a valid scenario as well (which is the request here), but it is not the focus of this scaler.

Hence why I suggested making it explicit that it's an Azure Cosmos DB Changefeed Scaler

@ankitbko
Author

ankitbko commented Sep 20, 2021

Yes, the ask here was to run a query on Cosmos DB and scale based on the returned result set, similar to the image below. Will this feature be added to this scaler (as it's named the Cosmos DB scaler), or should another one be created to handle that scenario?
image

@tomkerkhove
Member

It should; the question is whether it will be this scaler or another one.

That's why the name of this one should be very explicit about it @JatinSanghvi @pragnagopa

@JatinSanghvi
Collaborator

JatinSanghvi commented Sep 21, 2021

The ask here is specifically to scale the Cosmos DB container listeners based on the result of running a query. As I explained in earlier comments, Cosmos DB does not support this, as it would require the scaler to steal leases from the listener app. If this were possible, @ankitbko's ask could be addressed by the current scaler itself.

@tomkerkhove
Member

The ask here is specifically to scale the Cosmos DB container listeners based on the result of running a query. As I explained in earlier comments, Cosmos DB does not support this, as it would require the scaler to steal leases from the listener app. If this were possible, @ankitbko's ask could be addressed by the current scaler itself.

That is not correct; the ask above is fully unrelated to changefeed and leases. The ask is to scale based on the result of a query and a target value.

For example, what is the count of documents where the field status is Unprocessed? The rest is up to the application.

So the bottom line here is, this scaler is only for changefeed and should be named accordingly
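
To make the requested scenario concrete, here is a minimal sketch, not an existing scaler. The query text, the function name, and the `max_replicas` parameter are illustrative assumptions; only the `status` field and the `Unprocessed` value come from the example above. The replica calculation follows the ceil(metric / target) shape that the Kubernetes Horizontal Pod Autoscaler uses for average-value targets.

```python
import math

# Hypothetical Cosmos DB SQL query returning a single number:
# the count of documents whose `status` field is 'Unprocessed'.
EXAMPLE_QUERY = "SELECT VALUE COUNT(1) FROM c WHERE c.status = 'Unprocessed'"


def desired_replicas(query_result: int, target_value: int, max_replicas: int) -> int:
    """Derive a replica count from a query result and a per-replica target.

    `query_result` is the number returned by a query such as EXAMPLE_QUERY,
    `target_value` is how many matching documents one replica should handle,
    and `max_replicas` is an upper bound chosen by the operator.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    if query_result <= 0:
        return 0  # nothing matches the query, so no replicas are needed
    return min(math.ceil(query_result / target_value), max_replicas)
```

With 25 unprocessed documents and a target value of 10, this yields 3 replicas. Nothing here touches the change feed or its leases; the query runs against the container like any other read.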

@pragnagopa

cc @lpapudippu @brettsam @ahmelsayed

In general I agree with the feature request.

@tomkerkhove - regarding

So the bottom line here is, this scaler is only for changefeed and should be named accordingly

If the team agrees to support this feature, we should extend the current scaler implementation to support scaling based on a query. Let's not rename anything yet!

@tomkerkhove
Member

That's possible for sure. The question is what the user experience would be then, but I'm happy to wait as long as we don't have to "break" things.

@ankitbko
Author

I am willing to contribute to this functionality, and some members of my team may be interested as well. If the team agrees to support the feature, let me know what the plan is.
