5 changes: 3 additions & 2 deletions README.md
@@ -1,6 +1,6 @@
# Batch Inference Toolkit

- Batch Inference Toolkit(batch-inference) is a Python package that batches model input tensors coming from multiple users dynamically, executes the model, un-batches output tensors and then returns them back to each user respectively. This will improve system throughput because of better compute parallelism and better cache locality. The entire process is transparent to developers.
+ Batch Inference Toolkit (batch-inference) is a Python package that dynamically batches model input tensors coming from multiple requests, executes the model, un-batches the output tensors, and returns them to each request. This improves system throughput through better compute parallelism and better cache locality. The entire process is transparent to developers.

## When to use

@@ -59,7 +59,7 @@ from batch_inference.batcher.concat_batcher import ConcatBatcher
@batching(batcher=ConcatBatcher(), max_batch_size=32)
class MyModel:
    def __init__(self, k, n):
-        self.weights = np.random.randn((k, n)).astype("f")
+        self.weights = np.random.randn(k, n).astype("f")

    # shape of x: [batch_size, m, k]
    def predict_batch(self, x):
@@ -75,6 +75,7 @@ def process_request(x):
    y = host.predict(x)
    return y

+ host.stop()
```
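The `randn` change above is a real bug fix: `np.random.randn` takes its dimensions as separate positional arguments, so passing a tuple raises a `TypeError`. A minimal check of both forms:

```python
import numpy as np

k, n = 4, 3

# Correct: dimensions as separate arguments.
weights = np.random.randn(k, n).astype("f")
print(weights.shape)  # (4, 3), dtype float32

# Incorrect: a tuple argument is rejected.
try:
    np.random.randn((k, n))
except TypeError:
    print("randn((k, n)) raises TypeError")
```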

**Batcher** is responsible for merging queries and splitting outputs. In this case, ConcatBatcher concatenates the input tensors into one batched tensor along the first dimension. We provide a set of built-in Batchers for common scenarios, and you can also implement your own Batcher. See [What is Batcher](https://microsoft.github.io/batch-inference/batcher/what_is_batcher.html) for more information.
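Conceptually, the merge/split behavior a ConcatBatcher provides can be sketched in plain NumPy. The function names here are illustrative, not the library's actual API: per-request tensors are concatenated along the first dimension for a single model call, and the batched output is split back by each request's original size.

```python
import numpy as np

def batch(tensors):
    # Concatenate per-request tensors along the first (batch) dimension,
    # remembering each request's size so outputs can be split back.
    sizes = [t.shape[0] for t in tensors]
    return np.concatenate(tensors, axis=0), sizes

def unbatch(batched, sizes):
    # Split the batched output back into one tensor per request.
    return np.split(batched, np.cumsum(sizes)[:-1], axis=0)

# Two requests with 1 and 2 rows each.
a = np.ones((1, 4))
b = np.zeros((2, 4))
merged, sizes = batch([a, b])
outputs = unbatch(merged, sizes)  # one output tensor per request
```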