Update gather to use multiple threads #11524

Merged: RyanUnderhill merged 5 commits into master from ryanunderhill/gather_perf2 on May 17, 2022
@RyanUnderhill
Contributor

Description: GatherElements did not distribute its work across multiple threads; this change parallelizes it.

Motivation and Context
A user comparing the performance of ONNX Runtime and PyTorch saw that the latest PyTorch was 2x faster than ONNX Runtime. A profile showed that PyTorch was using all CPU cores while ONNX Runtime was limited to one.

There is a separate issue: the user was calling ONNX Runtime inefficiently. Every Run() call did a memcpy to copy the output tensor, which using io-bindings avoids.

Here's some performance data comparing the old and new versions. Note that there is a slight perf hit in the single-threaded case (3.47s vs 2.91s in the numbers below), because the new code has to divide the work into independent chunks, whereas the old code used a slightly faster incremental calculation from one chunk to the next.

New version using 8 threads:

onnx model: 0.770249s after 10000 iterations
pytorch model: 3.669060s after 10000 iterations

New version limited to one thread:

onnx model: 3.474726s after 10000 iterations
pytorch model: 3.579055s after 10000 iterations

Old version:

onnx model: 2.905542s after 10000 iterations
pytorch model: 3.634197s after 10000 iterations
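
The chunking trade-off described above can be illustrated with a minimal sketch. This is not the code from this PR: the function name GatherElementsAxis1, the restriction to a 2-D row-major tensor gathered along axis 1, and the use of std::thread in place of ONNX Runtime's own thread pool are all assumptions made for illustration. The point it shows is that each worker recomputes its base offsets from the row number instead of carrying a running offset over from the previous chunk, which costs a little extra arithmetic but makes the chunks independent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper (not ONNX Runtime's kernel): GatherElements along axis 1
// of a row-major [rows x cols] input with a [rows x index_cols] index tensor.
// output[r][c] = input[r][indices[r][c]]
std::vector<float> GatherElementsAxis1(const std::vector<float>& input,
                                       const std::vector<int64_t>& indices,
                                       std::size_t rows, std::size_t cols,
                                       std::size_t index_cols,
                                       std::size_t num_threads) {
  assert(num_threads > 0);
  assert(input.size() == rows * cols);
  assert(indices.size() == rows * index_cols);
  std::vector<float> output(rows * index_cols);

  // Processes rows [first, last). Base offsets are derived from r directly
  // (r * cols, r * index_cols), so a chunk never depends on the previous one.
  auto process_rows = [&](std::size_t first, std::size_t last) {
    for (std::size_t r = first; r < last; ++r) {
      const float* in_row = input.data() + r * cols;
      const int64_t* idx_row = indices.data() + r * index_cols;
      float* out_row = output.data() + r * index_cols;
      for (std::size_t c = 0; c < index_cols; ++c) {
        int64_t idx = idx_row[c];
        if (idx < 0) idx += static_cast<int64_t>(cols);  // negative counts from the end
        assert(idx >= 0 && idx < static_cast<int64_t>(cols));
        out_row[c] = in_row[idx];
      }
    }
  };

  // Split the rows into contiguous, equally sized chunks, one per worker.
  const std::size_t workers = std::min(num_threads, std::max<std::size_t>(rows, 1));
  const std::size_t rows_per_worker = (rows + workers - 1) / workers;
  std::vector<std::thread> threads;
  for (std::size_t w = 1; w < workers; ++w) {
    const std::size_t first = std::min(w * rows_per_worker, rows);
    const std::size_t last = std::min(first + rows_per_worker, rows);
    if (first < last) threads.emplace_back(process_rows, first, last);
  }
  process_rows(0, std::min(rows_per_worker, rows));  // calling thread takes chunk 0
  for (auto& t : threads) t.join();
  return output;
}
```

Because each worker writes to a disjoint range of output rows, no locking is needed. Calling the function with num_threads set to 1 runs everything on the calling thread, which corresponds to the "limited to one thread" configuration measured above.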

hariharans29 previously approved these changes May 16, 2022
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

#include <string>
Member

Just curious: why was this header inclusion required now?

Contributor Author

The linter warned when it wasn't included. I forget the exact text, but it was something about including the libraries for the types you use.

RyanUnderhill merged commit deef214 into master May 17, 2022
RyanUnderhill deleted the ryanunderhill/gather_perf2 branch May 17, 2022 02:31