Need PredictionFunction to provide object-pooling or to be thread-safe #1718

Closed
CESARDELATORRE opened this issue Nov 26, 2018 · 13 comments
Labels: API (issues pertaining the friendly API) · documentation (related to documentation of ML.NET) · enhancement (new feature or request) · P2 (priority of the issue for triage purposes: needs to be fixed at some point)

Comments

@CESARDELATORRE
Contributor

CESARDELATORRE commented Nov 26, 2018

This issue suggests an enhancement, since both of the current approaches/workarounds have drawbacks: we're not getting the scalability and performance we could in multithreaded scenarios like ASP.NET Core web apps or Web API services.

Context:
In multiple scenarios, but especially in ASP.NET Core web apps, the model-object (ITransformer) and the prediction-function object (PredictionFunction) should be re-used, because they are "expensive" to initialize and that cost matters when there are many concurrent users.

In the case of the model-object (ITransformer), it is thread-safe, so the model loaded from the .zip file can be registered as a singleton. That way it will be re-used by any thread or HTTP request in the ASP.NET Core app. See this code as an example: https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Startup.cs#L53
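
For illustration, a minimal sketch of that singleton registration (API shape follows later ML.NET releases, where models load via MLContext; the file path and type names are illustrative):

```csharp
using System.IO;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.ML;

// In Startup.ConfigureServices: load the model once and share it everywhere.
services.AddSingleton<ITransformer>(serviceProvider =>
{
    var mlContext = new MLContext();
    using (var stream = File.OpenRead("MLModels/model.zip")) // illustrative path
    {
        // ITransformer is thread-safe, so one instance can serve all HTTP requests.
        return mlContext.Model.Load(stream, out DataViewSchema inputSchema);
    }
});
```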

Main issue with the prediction function:
The prediction function is also "expensive": creating the prediction-function object from the model-object with the model.MakePredictionFunction() method takes a significant number of milliseconds.
Ideally, because it is "expensive", it should be re-used across executions in the same app. But since it is not thread-safe, the current options in an ASP.NET Core web app are the following:

- OPTION 1: Register the PredictionFunction object as Scoped (AddScoped()) for its DI/IoC object lifetime, as in this code:
https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Startup.cs#L61
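
A sketch of that scoped registration (again using the later API shape, where MakePredictionFunction was renamed CreatePredictionEngine; SampleInput/SampleOutput are illustrative input/output classes):

```csharp
// (usings as in the previous sketch)
// In Startup.ConfigureServices: one prediction engine per HTTP request scope.
services.AddScoped(serviceProvider =>
{
    var mlContext = new MLContext();
    var model = serviceProvider.GetRequiredService<ITransformer>(); // the singleton above
    // This is the "expensive" call, repeated for (almost) every HTTP request.
    return mlContext.Model.CreatePredictionEngine<SampleInput, SampleOutput>(model);
});
```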

The problem with this approach, though, is that in most cases you don't get much benefit unless you make multiple calls to the .Predict() method within the same HTTP request. Since a scoped object is only re-used within a single HTTP request, most HTTP requests will still need to call the model.MakePredictionFunction() method.

The only way to share an object across HTTP requests in .NET Core DI/IoC is with a singleton, but that requires the object to be thread-safe.

The possible service/object lifetimes in .NET Core IoC/DI are:

  • Singleton (one shared instance across all threads)
  • Transient (a new instance every time the object is requested)
  • Scoped (one instance per HTTP request / per scope)

See the available object lifetimes here:
https://docs.microsoft.com/en-us/aspnet/core/fundamentals/dependency-injection?view=aspnetcore-2.1#service-lifetimes
https://blogs.msdn.microsoft.com/cesardelatorre/2017/01/26/comparing-asp-net-core-ioc-service-life-times-and-autofac-ioc-instance-scopes/ (Autofac does support Thread Scope, though..)
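
In code, the three built-in lifetimes look like this (a generic sketch; SomeService is a placeholder):

```csharp
// In Startup.ConfigureServices:
services.AddSingleton<SomeService>(); // one shared instance for the app's lifetime, all threads
services.AddScoped<SomeService>();    // one instance per HTTP request (per DI scope)
services.AddTransient<SomeService>(); // a new instance every time the service is resolved
```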

- OPTION 2: Register the PredictionFunction object as Singleton (AddSingleton()) for its DI/IoC object lifetime, as in this code (Commented line):
https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Startup.cs#L60

Benefits: If you register the prediction-function object as a singleton (for its ASP.NET Core DI/IoC object lifetime), any new HTTP request (except the first one after the app starts) will just use the prediction function by calling the Predict() method. The performance of a single HTTP request is therefore significantly better.

Problem: However, the issue with this approach is that since the prediction function is not thread-safe, you need a mechanism such as a critical section so that only a single thread uses the prediction-function object while executing the Predict() method, as in this code (commented line):
https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Controllers/CountrySalesForecastController.cs#L57
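
A sketch of what that critical section looks like in a controller (type names are illustrative; only the lock pattern matters):

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.ML;

public class SalesForecastController : ControllerBase
{
    // Singleton engine injected by DI; NOT thread-safe by itself.
    private readonly PredictionEngine<SampleInput, SampleOutput> _engine;
    private static readonly object _predictLock = new object();

    public SalesForecastController(PredictionEngine<SampleInput, SampleOutput> engine)
    {
        _engine = engine;
    }

    [HttpPost]
    public IActionResult Predict([FromBody] SampleInput input)
    {
        SampleOutput prediction;
        lock (_predictLock) // only one thread at a time may run Predict()
        {
            prediction = _engine.Predict(input);
        }
        return Ok(prediction);
    }
}
```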

The issue with this approach is that limiting "Predict()" to a single thread becomes a bottleneck when scaling. With many concurrent HTTP requests, many of them trying to run the Predict() method, scalability won't be as good as it could be, because only a single thread at a time is able to run the Predict() method.

Basically, with this approach you might be significantly limiting the scalability of the app (with regard to model execution/scoring) across threads when handling many HTTP requests.

Workarounds: Currently, use either of the two methods mentioned above, being aware of the drawbacks of each:

  • PredictionFunction as Scoped object
  • PredictionFunction as a singleton, using a critical section when running "Predict()"

Long term solutions:
I see two possible solutions:

  1. Create an object pool of "prediction function objects" (see the sketch after this list). That way, most HTTP requests won't need to call the "expensive" .CreatePredictionFunction() method, while remaining scalable, since many threads could be using the multiple objects available in the pool.

  2. Make the prediction function thread-safe. If the prediction function were thread-safe while remaining scalable enough, it could simply be registered as a singleton in the DI/IoC system in ASP.NET Core, without needing a critical section or comparable approaches.
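
A sketch of what solution 1 might look like, built on ConcurrentBag (all type names are illustrative; this is user code, not an existing ML.NET API):

```csharp
using System.Collections.Concurrent;
using Microsoft.ML;

// Register this as a singleton; renting/returning engines lets many threads
// predict in parallel without sharing a non-thread-safe instance.
public class SimplePredictionPool<TInput, TOutput>
    where TInput : class
    where TOutput : class, new()
{
    private readonly ConcurrentBag<PredictionEngine<TInput, TOutput>> _engines =
        new ConcurrentBag<PredictionEngine<TInput, TOutput>>();
    private readonly MLContext _mlContext;
    private readonly ITransformer _model;

    public SimplePredictionPool(MLContext mlContext, ITransformer model)
    {
        _mlContext = mlContext;
        _model = model;
    }

    public TOutput Predict(TInput input)
    {
        // Reuse a pooled engine when available; pay the creation cost only
        // when all engines are busy (or on the very first predictions).
        if (!_engines.TryTake(out PredictionEngine<TInput, TOutput> engine))
        {
            engine = _mlContext.Model.CreatePredictionEngine<TInput, TOutput>(_model);
        }

        try
        {
            return engine.Predict(input);
        }
        finally
        {
            _engines.Add(engine); // make the engine available to the next caller
        }
    }
}
```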

Is there any other possible approach available?

Once these scenarios are further discussed in this thread or in our API reviews, I'll document the currently possible approaches for consuming/running an ML.NET model in an ASP.NET Core web app or Web API service.

Related issues:
#421

@CESARDELATORRE CESARDELATORRE added the enhancement New feature or request label Nov 26, 2018
@CESARDELATORRE CESARDELATORRE added question Further information is requested API Issues pertaining the friendly API documentation Related to documentation of ML.NET labels Nov 26, 2018
@shmoradims shmoradims self-assigned this Nov 26, 2018
@shmoradims shmoradims added the need info This issue needs more info before triage label Nov 26, 2018
@shmoradims
Contributor

@eerhardt @TomFinley @Zruty0 could you please discuss a concrete action path for this issue so it can be triaged?

@eerhardt
Member

Another option is to use a [ThreadStatic] field in your ASP.NET Core application. Since threads are pooled by default, if you re-use a thread from the thread pool, the field will already be populated. And you are guaranteed that only the current thread can access the field at the current time (since it is thread static).

However, you need to be careful not to cache the variable across async boundaries, or other places where you may jump between threads. If you did, you could end up with multiple threads using the same PredictionFunction.
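
A rough sketch of the idea (type names are illustrative; note the caveat above about async boundaries):

```csharp
using System;
using Microsoft.ML;

public static class PerThreadPrediction
{
    // Each thread-pool thread gets (and keeps) its own engine instance.
    [ThreadStatic]
    private static PredictionEngine<SampleInput, SampleOutput> _engine;

    public static SampleOutput Predict(MLContext mlContext, ITransformer model, SampleInput input)
    {
        // The first call on a given pooled thread pays the creation cost;
        // later requests served by the same thread reuse the cached engine.
        if (_engine == null)
        {
            _engine = mlContext.Model.CreatePredictionEngine<SampleInput, SampleOutput>(model);
        }

        // Do not 'await' between reading _engine and using it, or execution
        // may resume on a different thread holding a different engine.
        return _engine.Predict(input);
    }
}
```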

@TomFinley
Contributor

Hi @shmoradims and @CESARDELATORRE, and thank you @CESARDELATORRE for such a complete description of your problem. Stateful objects in libraries that are not somehow intrinsically thread-safe generally should not handle their own synchronization, because what constitutes "thread safe" is very much an application-specific question. Complicating this further, this is especially true in ML.NET, where these stateful objects can themselves sometimes be quite substantial in terms of memory/compute footprint (e.g., the intermediate buffers needed for something like evaluating a DNN).

This is of course why so few items in .NET that are stateful handle their own synchronization. Indeed, the only exceptions I am aware of are those basic primitive structures that enable basic multithreading scenarios. But as far as a simple object itself being thread-safe? E.g., would anyone ever try to make System.Random thread-safe? No, never. Not because the .NET team is lazy, but because it is simply impossible to know how it is being used.

This is quite evident in this specific scenario: I think what might serve the interests of this project would be to register a singleton object pool that creates/holds prediction functions, i.e., make the object pool a singleton, and have the serving method (1) fetch an instance, (2) use it, (3) return it.

...Unless the model in question is somehow a time series model in a way which is stateful (e.g., the scenario we're trying to unblock with the work in #1618), in which case a singleton instance of a function with a global lock might be appropriate, maybe.

...Unless ASP.NET eventually gets the ability to run something akin to Django worker servers (assuming I remember how that works correctly, it's been some years), which would be even better than what ASP.NET does now, since then there could be some "worker" with a bunch of assets it personally owns, and it could just natively do setup and teardown itself.

...Unless what you're doing has nothing whatsoever to do with ASP.NET or web services at all, e.g., batch processing of a large file with many instances, in which case even if you wanted multithreading you could just spawn a bunch of separate objects and pass them to separate tasks using Parallel.ForEach or somesuch.

...Unless the model happens to be something like evaluating DNNs where it would be more harmful than helpful to try to achieve throughput through multithreading, and batch prediction would be a better approach.

The point, of course, is that this sort of object is not made thread-safe because what constitutes thread safety is very, very application- and model-dependent. This is quite obvious in this case, as we see from the list above. (And since writing it, I see @eerhardt has come up with yet another way to do it.) You'll note that a significant part of the issue is steeped, both implicitly and explicitly, in the precise method by which ASP.NET itself divvies up work across multiple threads. This of course is a huge red flag for anyone designing an API that they'd best not get involved, since if things were to change just a little bit, or we were to have a different scenario, a totally different solution would be appropriate.

Indeed, the .NET team itself learned this lesson with their collections once upon a time. They used to have a thing called SyncRoot. Then they correctly realized that basic objects trying to handle their own synchronization was worse than useless and more harmful than helpful. So they stopped. The closest thing we have now is System.Collections.Concurrent, but those are meant from the ground up to handle a very specific form of synchronization (we make good use of them for this throughout the ML.NET codebase), since, again, what constitutes thread safety is very specific to the scenario.

Even now I occasionally run across a library that, with the best of intentions, tried to handle synchronization for the user. E.g., an implementation of object pooling at some level of their stack, for everyone's "convenience" -- followed immediately, inevitably, by a lengthy description of how and why to turn it off for the 75% of scenarios where such a simplistic approach is insufficient or leads to bad perf.

We did in fact make precisely this same mistake in ML.NET v0.1, where we made our "prediction models" thread safe. What a disaster that was.

So, the action item might be one of education. Maybe we explain what objects are and are not thread safe, maybe even go into why. (Though, on the other hand, this seemed to be known in this case?) So maybe something for the documentation folks to take up? I'm not sure.

@CESARDELATORRE
Contributor Author

CESARDELATORRE commented Nov 26, 2018

@eerhardt - Agree, but the problem is not just how to protect/lock the prediction function. The other problem is how to re-use the prediction function so that CreatePredictionFunction() is called as few times as possible in a web/HTTP-based application.

My point is that in most cases, for most HTTP requests, whether using [ThreadStatic] or Scoped in DI/IoC, the CreatePredictionFunction() method is going to be called once per HTTP request. If the app does a single prediction per HTTP request, that "cost" seems high.

What we need is a way to share PredictionFunction objects across HTTP requests, and the two ways I see are these:

  • A. Singleton and making PredictionFunction thread safe. --> Tom is explaining above why this might not be such a good idea, so we might want to discard this option.

  • B. Having an object pool of PredictionFunction objects. --> I don't think regular .NET developers will build their own object pooling system; we need to make it easy for them. My question is whether we should create such an object pool for PredictionFunction objects.

About the "...Unless" cases mentioned by Tom: sure, there might be cases like those where you don't need it, but a large percentage of .NET developers will be scoring the model in .NET web apps or Web API services, where our main guidance is ASP.NET Core, so we need a prescriptive way of doing it efficiently.

Additionally, it could be improved from an "easier to use" point of view by simplifying it:

If the model-object and the object pool lived under a single object that can be a singleton, it would be a lot easier for the web developer to score a model by interacting with that single object.

Right now, the developer has to differentiate between the model-object and the prediction-function object, which have different recommended lifetimes (singleton vs. scoped/thread-safe). The implementation currently required in ASP.NET Core is certainly not straightforward.

PRIORITY?
In any case, with this object pooling system we're talking about a possible improvement. It is not a blocking issue, since the approaches explained above do work okay.
But it could be improved from an "easier to use" point of view and from a performance perspective by re-using prediction functions when possible. :)

@eerhardt
Member

My point is that in most cases, for most HTTP requests, whether using [ThreadStatic] or Scoped in DI/IoC, the CreatePredictionFunction() method is going to be called once per HTTP request. If the app does a single prediction per HTTP request, that "cost" seems high.

CreatePredictionFunction() won't be called once per HTTP request when using [ThreadStatic]. The point of the [ThreadStatic] solution is to cache the PredictionFunction on the thread. When the thread is re-used to service the next HTTP request, the PredictionFunction gets re-used as well. Thus, on every subsequent HTTP request that the thread services, a new PredictionFunction isn't created; the one created the first time is used instead.

ASP.NET pools threads by default. A new thread isn't created to service each Http request. That would be a massive overhead.

@TomFinley
Contributor

TomFinley commented Nov 27, 2018

I imagine it's pretty clear by now just from this discussion, @CESARDELATORRE, why we won't be doing this, but just to be explicit: you and @eerhardt are having a discussion about how best to synchronize this resource for just this one library, ASP.NET. And that's ignoring the fact, again, that this particular scheme would not be appropriate for different classes of models, in the set I just enumerated. (In particular: I could see how either of your approaches might be situationally correct, depending on the circumstances. Or situationally incorrect.) Could we imagine a one-size-fits-all, or even one-size-fits-most, solution to this problem?

they will be scoring the model in .NET web apps or Web API services where our main guidance is ASP.NET Core, so we need a prescriptive way of doing it efficiently.

So, at the risk of stating the obvious, ML.NET absolutely did not invent the concept of an object that would benefit from pooling and sharing. ML.NET is certainly not the only library we'd expect ASP.NET to call. If ASP.NET handles this sort of poolable, sharable resource as awfully as your message seems to indicate (does it really?), this is something for them to fix, not something for every library that might conceivably be called through them to fix for them.

@CESARDELATORRE
Contributor Author

CESARDELATORRE commented Nov 29, 2018

@eerhardt - We can have a discussion offline about [ThreadStatic]. I said for "most" HTTP requests (the ones served by threads that are not reused, or that are reused but still haven't created a PredictionFunction object).

I'm saying that because in an ASP.NET Core app, the thread pool has by default the following available threads:
https://docs.microsoft.com/en-us/dotnet/api/system.threading.threadpool.getavailablethreads?view=netcore-2.1
// Worker threads: 32,767
// Asynchronous I/O threads: 1,000
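
Those numbers come from a snippet like this one:

```csharp
using System;
using System.Threading;

class ThreadPoolInfo
{
    static void Main()
    {
        // Queries the defaults shown above at runtime.
        ThreadPool.GetAvailableThreads(out int workerThreads, out int ioThreads);
        Console.WriteLine($"Worker threads: {workerThreads:N0}");       // e.g. 32,767
        Console.WriteLine($"Asynchronous I/O threads: {ioThreads:N0}"); // e.g. 1,000
    }
}
```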

Effectively, until the app has been used quite a lot, the probability of getting a reused thread that has also already used/created the PredictionFunction object is rather low.

In many cases MakePredictionFunction() will be called when an HTTP request comes in, because the app has 32,767 threads available in the pool, and the chances of getting a re-used thread that has already created a prediction function are low until the application has been in use for a significant time.

If we had an object pool of PredictionFunction objects, it would be the other way around: most threads would take an already-available PredictionFunction object from the pool.

There's an existing case in ASP.NET Core with HttpClientFactory which uses an object pool of HttpMessageHandler objects:

https://docs.microsoft.com/en-us/aspnet/core/fundamentals/http-requests?view=aspnetcore-2.1#httpclient-and-lifetime-management

It is a comparable case, because of the issues that arise when HttpClient is used as a singleton or as transient, and the best choice there was object pooling.

@TomFinley - About "just this one library ASP.NET": it is not really a single library; ASP.NET is a higher-level framework on top of .NET. But the important point is the percentage of .NET developers using ASP.NET. ASP.NET accounts for a very large portion of development in .NET, and an even larger share of new/modern .NET development.

About "ASP.NET handles this sort of poolable sharable resources as awfully" I don't understand why you say that. Either thread pooling and DI/IoC work very well in ASP.NET Core. The issue is about how to quickly reuse the non-threadsafe PredictionFunction across Http requests. I think this problem will happen with any web framework in .NET, either ASP.NET Core, Nancy, etc.

I believe that with [ThreadStatic] alone, the application might take a long time before PredictionFunction objects are re-used, because of the large number of available threads.

In any case, this feature might not be a priority; I just want to highlight the possibilities for improvement. :)

@shmoradims
Contributor

Great discussions everyone. Based on all the comments, two conclusions seem clear to me:

  1. ML.NET is a general-purpose ML framework and shouldn't handle thread-safety, which is application-specific. So we shouldn't add PredictionFunction pools to ML.NET.

  2. ML.NET is geared toward .NET developers after all, and a lot of them use ASP.NET for request-response scenarios, which can benefit from pooling PredictionFunction objects. So it's a great value-add if we show .NET developers how to do that with the ML.NET PredictionFunction.

I suggest the following action items:

  1. Close this issue, since we don't want to add object pools to ML.NET.
  2. Create a new enhancement issue, "Provide guidance for PredictionFunction object-pooling for ASP.NET applications". We could further discuss the how-to and the location of that code in that issue, but it wouldn't be part of ML.NET. Adding it to the samples is one option. Also, since this is a common use case for ASP.NET, they seem to have provided some built-in object pools as part of the framework (see the sketch below): https://docs.microsoft.com/en-us/dotnet/api/microsoft.extensions.objectpool.objectpool-1?view=aspnetcore-2.1
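
For reference, a sketch of how those built-in pools could wrap a prediction engine (SampleInput/SampleOutput are illustrative; this is user code, not an ML.NET API):

```csharp
using Microsoft.Extensions.ObjectPool;
using Microsoft.ML;

// A pooling policy: how to create an engine, and whether to keep a returned one.
public class PredictionEnginePolicy
    : IPooledObjectPolicy<PredictionEngine<SampleInput, SampleOutput>>
{
    private readonly MLContext _mlContext;
    private readonly ITransformer _model;

    public PredictionEnginePolicy(MLContext mlContext, ITransformer model)
    {
        _mlContext = mlContext;
        _model = model;
    }

    public PredictionEngine<SampleInput, SampleOutput> Create()
        => _mlContext.Model.CreatePredictionEngine<SampleInput, SampleOutput>(_model);

    // Returning true keeps the engine in the pool for the next request.
    public bool Return(PredictionEngine<SampleInput, SampleOutput> engine) => engine != null;
}
```

The pool itself (e.g. a DefaultObjectPool built from this policy) would be registered as a singleton, and each request would Get() an engine, call Predict(), and Return() it in a finally block.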

@CESARDELATORRE
Contributor Author

CESARDELATORRE commented Nov 29, 2018

@shmoradims - I was just chatting with Eric about this.
First, we think that the [ThreadStatic] approach, or even a thread-scope lifetime in DI/IoC (if the default IoC container in .NET Core had one, as Autofac does; see the blog post I wrote a while ago comparing the DI/IoC lifetimes in the default .NET Core container and Autofac), could be dangerous. In asynchronous calls in ASP.NET Core, the continuation could run on a different thread and could end up using the wrong object, etc.

The ASP.NET team also told me a while ago that a thread-scope lifetime in DI/IoC could be dangerous in async web/service apps, which is why they don't have it in the default IoC container in .NET Core.

In any case, this is not a blocking issue. It is just an area I think we should investigate more in order to provide the best developer experience in the future for one of the most important workloads in .NET, and certainly the most important workload in .NET Core (ASP.NET Core accounts for some 90% of .NET Core apps).

I agree that for now, the best action to do is to implement the object pooling approach in our sample applications, document it in docs.microsoft.com and get feedback from users.

I personally think it is not a very straightforward experience that, just to score a model in a web app or HTTP service, you need to understand multiple concepts that can be complex and even confusing on first exposure:

  1. The model object (ITransformer)
    • It can and must be a singleton: it is expensive to create, so it must be shared.
  2. The PredictionFunction object (which is not a function but an object; it might need to be renamed)
    • It is expensive to create, so it should be shared, but it is not thread-safe, so it cannot be a singleton and you must find another way to share it between HTTP requests...
  3. You can build your own object pooling system for PredictionFunction objects and use it whenever you score. This would initially be the best-performing/most optimized choice.

I think the experience should be simplified in the future, but for now, I agree, let's implement sample code with the object pooling approach, document it in docs.microsoft.com and when we get further feedback from customers, we can evolve accordingly. 👍

@shmoradims
Contributor

@CESARDELATORRE I created a concrete work item, #1789. I assigned it to you and to project 0.9, but feel free to change those. Based on the feedback we get, we can revisit this issue.

@TomFinley @eerhardt thanks for all your comments.

@CESARDELATORRE
Contributor Author

CESARDELATORRE commented Dec 19, 2018

@shmoradims - I just implemented sample code on how to use object pooling for the PredictionFunction / PredictionEngine objects in this PR for the ASP.NET Core web app eShopDashboardML, here:
dotnet/machinelearning-samples#184

However, after writing the code, which I find very generic and usable in any end-user app, but which is also a significant amount of code to create and understand, I still think this kind of implementation should be part of the ML.NET API rather than the user's code.

We need to have a further discussion about it, reviewing the code I wrote for this approach. Since I still think this should be part of the ML.NET API, I'm re-opening the issue, OK? 👍

@shmoradims shmoradims added the P2 Priority of the issue for triage purpose: Needs to be fixed at some point. label May 21, 2019
@michaelgsharp
Member

@eerhardt does the PredictionEnginePool resolve this issue?

@eerhardt
Member

Yes. I believe this can now be closed.
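
For anyone finding this later, a rough sketch of how PredictionEnginePool (from the Microsoft.Extensions.ML package) is wired up; type names and the model path are illustrative:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.ML;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // One managed pool of prediction engines for the whole app.
        services.AddPredictionEnginePool<SampleInput, SampleOutput>()
            .FromFile("MLModels/model.zip"); // illustrative path
    }
}

public class PredictController : ControllerBase
{
    private readonly PredictionEnginePool<SampleInput, SampleOutput> _pool;

    public PredictController(PredictionEnginePool<SampleInput, SampleOutput> pool)
    {
        _pool = pool;
    }

    [HttpPost]
    public ActionResult<SampleOutput> Post([FromBody] SampleInput input)
    {
        // The pool rents an engine, runs the prediction, and returns the engine;
        // no locks or per-request engine creation in user code.
        return _pool.Predict(input);
    }
}
```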

@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022