Performance design issue in CNTKLibraryManagedDll, chatty interop interfaces #2374

Closed
vermorel opened this issue Sep 18, 2017 · 2 comments

vermorel commented Sep 18, 2017

It's thrilling to see that training is coming to .NET. However, the current design of CNTKLibraryManagedDll is going to suffer from severe performance issues.

Basically, Helper.AsFloatVector<T>(IEnumerable<T> input) relies on iterative calls to FloatVector.Add(float) (when T is float), which makes a call to CNTKLibPINVOKE.FloatVector_Add() for every single value. Interop calls are expensive. This is the 101 guidance from Improving Interop Performance: avoid chatty interfaces.

In the present case, it is of primary importance to expose an interface that would look like: FloatVector.AddRange(float[] buffer, int index, int count). Note that buffer is a reusable buffer, as the name suggests. In .NET, recycling and pooling arrays is critical for most high-performance code. Thus, the CNTK lib should not force the client C# code to allocate many short-lived arrays just to push data to CNTK.
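To make the difference concrete, here is a minimal, self-contained sketch of chatty versus chunky calls. Everything in it is hypothetical: FakeNativeFloatVector is a pure-managed stand-in that only counts how often the managed/native boundary would be crossed, and it is not the SWIG-generated FloatVector nor any CNTK API. The chunky path rents its staging array from ArrayPool<float> so the caller does not allocate a new array per push.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;

// Hypothetical stand-in for a native-backed vector (NOT a CNTK type).
// CrossingCount simulates the number of interop calls that would be made.
public sealed class FakeNativeFloatVector
{
    private readonly List<float> _storage = new List<float>();

    public int CrossingCount { get; private set; }
    public int Count => _storage.Count;

    // Chatty shape: one simulated interop call per value.
    public void Add(float value)
    {
        CrossingCount++;
        _storage.Add(value);
    }

    // Chunky shape: one simulated interop call per slice of the buffer.
    public void AddRange(float[] buffer, int index, int count)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (index < 0 || count < 0 || index > buffer.Length - count)
            throw new ArgumentOutOfRangeException();

        CrossingCount++;
        for (int i = 0; i < count; i++) _storage.Add(buffer[index + i]);
    }
}

public static class ChunkyInteropSketch
{
    // Pushes values one by one; returns the number of simulated crossings.
    public static int PushChatty(FakeNativeFloatVector target, IEnumerable<float> values)
    {
        foreach (float v in values) target.Add(v);
        return target.CrossingCount;
    }

    // Stages values in a pooled array and pushes them chunk by chunk.
    public static int PushChunky(FakeNativeFloatVector target, IEnumerable<float> values, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        // Rent may return a larger array; only the first chunkSize slots are used.
        float[] buffer = ArrayPool<float>.Shared.Rent(chunkSize);
        try
        {
            int filled = 0;
            foreach (float v in values)
            {
                buffer[filled++] = v;
                if (filled == chunkSize)
                {
                    target.AddRange(buffer, 0, filled);
                    filled = 0;
                }
            }
            // Flush the final, partially filled chunk.
            if (filled > 0) target.AddRange(buffer, 0, filled);
        }
        finally
        {
            ArrayPool<float>.Shared.Return(buffer);
        }
        return target.CrossingCount;
    }
}
```

With 1000 values and a chunk size of 256, the chatty path makes 1000 simulated crossings and the chunky path makes 4. The sketch only counts calls; actual savings depend on the real marshalling cost per call.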

See the code at /pull/2271 for an example of better design in .NET/C#.

As it stands, I am nearly 100% sure that in practice it will be 10x faster (or more) to dump the data from C# in binary format to the local filesystem and have cntk.exe re-read it than to push the data through the managed lib.

liqunfu commented Sep 20, 2017

Thanks @vermorel! We will fix this performance issue. FloatVector.AddRange(float[] buffer, int index, int count) is a very good suggestion.

liqunfu commented Oct 24, 2017

The fix to CreateBatch is checked in. It allows using a C# buffer with an offset and size, which avoids copying the data into a FloatVector. The original API is kept. It uses AsFloatVector to do the conversion, which also improves performance.
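As an illustration of the offset-and-size idea, the sketch below walks one reusable buffer as a sequence of (offset, count) windows without copying any values. It is a hypothetical helper, not the CreateBatch signature or any other CNTK API; the names BatchSlicingSketch, SliceIntoBatches, validLength and batchLength are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class BatchSlicingSketch
{
    // Hypothetical helper (NOT a CNTK API): yields windows over the first
    // validLength elements of buffer. Each ArraySegment references the
    // original array, so no element is copied.
    public static IEnumerable<ArraySegment<float>> SliceIntoBatches(
        float[] buffer, int validLength, int batchLength)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (validLength < 0 || validLength > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(validLength));
        if (batchLength <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchLength));

        return Iterate(buffer, validLength, batchLength);
    }

    // Separate iterator so the argument checks above run eagerly.
    private static IEnumerable<ArraySegment<float>> Iterate(
        float[] buffer, int validLength, int batchLength)
    {
        for (int start = 0; start < validLength; start += batchLength)
        {
            // The last window may be shorter than batchLength.
            int count = Math.Min(batchLength, validLength - start);
            yield return new ArraySegment<float>(buffer, start, count);
        }
    }
}
```

A caller could hand each window's Array, Offset and Count to an API that accepts a buffer with a start index and length, and then refill the same array for the next round instead of allocating a new one.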

BTW, FloatVector is SWIG-generated, so we avoid modifying it.

liqunfu closed this as completed Oct 24, 2017