
How to define infer/eval configs to accelerate inference when nginx exposes multiple API URLs for model inference calls? #138

Answered by Leymore
hanjr92 asked this question in Q&A

Yes, max_num_workers can be used for parallel inference. However, I would suggest doing a round-robin over the URLs inside the Model class, which is easier to implement and more intuitive in concept.
You may find this document helpful
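For illustration, here is a minimal sketch of that round-robin idea in Python. The class name RoundRobinAPIModel, the generate signature, and the response schema are all assumptions made up for this example, not the actual OpenCompass Model API; the point is only that each request picks the next URL in turn, and that the selection stays thread-safe when max_num_workers runs several workers in parallel.

```python
import itertools
import threading

import requests


class RoundRobinAPIModel:
    """Hypothetical model wrapper that cycles over several inference URLs.

    The constructor arguments and the generate() signature are illustrative
    assumptions, not the real OpenCompass Model interface.
    """

    def __init__(self, urls):
        # itertools.cycle yields urls[0], urls[1], ..., then wraps around.
        self._urls = itertools.cycle(urls)
        # A lock keeps the cycle consistent when max_num_workers > 1 causes
        # several threads to call generate() concurrently.
        self._lock = threading.Lock()

    def _next_url(self):
        with self._lock:
            return next(self._urls)

    def generate(self, prompt: str) -> str:
        # Each call is dispatched to the next backend behind nginx.
        url = self._next_url()
        resp = requests.post(url, json={"prompt": prompt}, timeout=60)
        resp.raise_for_status()
        return resp.json()["text"]  # response schema is an assumption


if __name__ == "__main__":
    model = RoundRobinAPIModel([
        "http://127.0.0.1:8001/v1/generate",
        "http://127.0.0.1:8002/v1/generate",
    ])
    print(model.generate("Hello"))
```

Compared with pointing every worker at a single nginx endpoint, rotating URLs inside the Model class lets the client spread load deterministically across backends without any extra nginx configuration.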

Answer selected by hanjr92