
Conversation

@nvpohanh
Contributor

@nvpohanh nvpohanh commented Jun 15, 2020

This is built on top of #618

Changes:

  • Added a new TestSettings flag: server_num_issue_query_threads.
  • Implemented multi-threaded IssueQuery() functionality in issue_query_controller.cc/h. This feature is enabled only in the Server scenario, when server_num_issue_query_threads is set to a positive number, and when there are threads calling RegisterIssueQueryThread().
  • Moved the SampleMetadata, QueryMetadata, ResponseDelegate, and QueryScheduler implementations, and part of IssueQueries(), to issue_query_controller.cc/h with very minor modifications.

TODO:

  • Add Python binding for RegisterIssueQueryThread()
  • Add small Python repro program to test this new binding function.

If you want to see a pure incremental diff for the second feature, see: nvpohanh#1

@christ1ne
Contributor

christ1ne commented Jun 16, 2020

How do I set the threads as a user? Is it in my own run.py in the benchmarks? Or do I need to recompile loadgen with the number of threads that I want to set? @nvpohanh

@nvpohanh
Contributor Author

How do I set the threads as a user?

You create a thread and call RegisterIssueQueryThread(). This call blocks until the entire test ends. You can see the MultiBasicSUT in repro.cpp as an example.

Is it in my own run.py in the benchmarks?

I haven't thought about how we can do this in Python (not a Python expert 😢). My proposal would be:
(1) Add Python binding for RegisterIssueQueryThread().
(2) In your run.py, do something like

import threading

import loadgen

def start_issue_query_thread():
    # Hypothetical binding for RegisterIssueQueryThread(); it blocks
    # until the entire test ends.
    loadgen.register_issue_query()

def main():
    x = threading.Thread(target=start_issue_query_thread)
    x.start()
    # Create SUT, QSL, test_settings
    test_settings.server_num_issue_query_threads = 1
    # Configure other test settings
    loadgen.StartTest(...)
    x.join()

Or do I need to recompile loadgen with the number of threads that I want to set?

No recompile is needed. You only need to change the test settings and make sure the correct number of threads call RegisterIssueQueryThread().

Note that I haven't added the Python binding yet, but I can do it. Let me add a TODO.

@christ1ne
Contributor

christ1ne commented Jun 18, 2020

What's the performance increase going from 1 thread to 2 on a typical GPU on any inference model (e.g. resnet50) with loadgen in terms of percentage? 10%? 50%? 100%? What's the host CPU usage for 1 and 2 threads?

@DilipSequeira
Collaborator

Unfortunately we're not able to disclose measured scaling factors, since that would to some extent reveal overall performance. However, the current implementation represents a significant scaling bottleneck on one of our systems containing multiple high-end GPUs, and the submission scores will illustrate this.

If Intel or any other submitter believes that these MRs inhibit their ability to demonstrate the performance of which their system is capable, we're happy to take input on alternative approaches for improving scalability.

Regarding CPU utilization... it will depend on the SUT. If the SUT and QSL are empty and loadgen is operating at maximum throughput, loadgen will consume 100% of the thread(s). However, using multiple threads is opt-in, and we expect submitters will opt in only if it increases their score. If opting in is helpful for Intel on any of the benchmarks and you have performance improvements for this path, we'll be happy to review.

@christ1ne
Contributor

This change looks good to me, @nvpohanh. Thank you for your contribution.

@aaronzhongii
Contributor

aaronzhongii commented Jun 25, 2020

Tried the changes locally with the default behavior and observed no performance impact. I did not enable the issue-query threads, as the IssueQuery callback in my implementation is a very thin function, shown below:

Query* q = new Query;
q->qid = qid;
q->qs = qs;
q->cntQs = cntQs;
q->offset = 0;

// Push the query onto the per-queue work list and return right away;
// a scoped lock guards only the queue mutation.
{
    std::lock_guard<std::mutex> lock(qQueueMutexes[qid]);
    qQueues[qid].push(q);
}
qQueueCVs[qid].notify_one();
cntTotalQs += cntQs;

There are multiple worker threads in the implementation that process the queries in the queues, which is where the real work is done. Since the IssueQuery function is so thin in my implementation, I don't think there is any benefit to making it multi-threaded.

Thus I am curious why NVIDIA can't employ the same approach: keep IssueQuery thin and small, and use multi-threaded workers in the background?

@christ1ne christ1ne merged commit 3c6e6d1 into mlcommons:master Jun 30, 2020
arjunsuresh pushed a commit to GATEOverflow/inference that referenced this pull request Apr 29, 2024

Implement LoadGen RegisterIssueQueryThread()
