Add GenerationMixin class #29

Merged
merged 12 commits into main on Jul 18, 2022
Conversation

artek0chumak
Collaborator

Add a generation abstraction that uses inference_session.
Added modes:

  • Greedy and top-k/top-p sampling
  • Multibatch generation
  • Constraint abstraction

In the future, I'll add prefix-tuned generation, beam search, and more HF-like features.
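
A minimal sketch of what such a decoding loop looks like, not the code from this PR: `step_fn` is a hypothetical stand-in for one call to the real inference_session, and `sample_next_token`/`generate` are illustrative helpers showing how greedy and top-k/top-p sampling plug into the same loop (multibatch generation simply works row-wise over the batch dimension).

```python
# Sketch only. Assumptions: `step_fn(tokens)` stands in for one inference_session
# step and returns next-token logits of shape [batch, vocab]; helper names are hypothetical.
import torch


def sample_next_token(logits: torch.Tensor, do_sample: bool = False,
                      top_k: int = 0, top_p: float = 1.0,
                      temperature: float = 1.0) -> torch.Tensor:
    """Pick one next token per sequence from [batch, vocab] logits."""
    if not do_sample:
        return logits.argmax(dim=-1, keepdim=True)  # greedy decoding
    logits = logits / temperature
    if top_k > 0:  # keep only the k most likely tokens
        kth_best = torch.topk(logits, top_k, dim=-1).values[..., -1, None]
        logits = logits.masked_fill(logits < kth_best, float("-inf"))
    if top_p < 1.0:  # nucleus sampling: drop tokens once cumulative prob reaches top_p
        sorted_logits, sorted_idx = torch.sort(logits, descending=True, dim=-1)
        sorted_probs = torch.softmax(sorted_logits, dim=-1)
        remove_sorted = sorted_probs.cumsum(dim=-1) - sorted_probs >= top_p
        remove = remove_sorted.scatter(-1, sorted_idx, remove_sorted)
        logits = logits.masked_fill(remove, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)


@torch.no_grad()
def generate(step_fn, input_ids: torch.Tensor, max_new_tokens: int = 20,
             **sampling_kwargs) -> torch.Tensor:
    """Autoregressive loop: one inference step per new token, works for any batch size."""
    tokens = input_ids  # [batch, prefix_len]
    for _ in range(max_new_tokens):
        logits = step_fn(tokens)  # next-token logits for every sequence in the batch
        next_token = sample_next_token(logits, **sampling_kwargs)
        tokens = torch.cat([tokens, next_token], dim=-1)
    return tokens


# Toy usage with a random "model" (2 sequences generated in one batch):
# vocab = 128
# step_fn = lambda tokens: torch.randn(tokens.shape[0], vocab)
# out = generate(step_fn, torch.zeros(2, 1, dtype=torch.long), max_new_tokens=5,
#                do_sample=True, top_k=40, top_p=0.9)
```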

```diff
@@ -23,7 +23,7 @@ def __init__(self, *args, memory_cache: MemoryCache, **kwargs):
         for name, buf in self.module.named_buffers():
             assert not buf.requires_grad, f"Bloom layer parameters must not accumulate gradients, but {name} does"

-        self.inference_pool = TaskPool(self.inference_step, max_batch_size=1, name=f"{self.name}_inference")
+        self.inference_pool = TaskPool(self.inference_step, max_batch_size=4096, name=f"{self.name}_inference")
```
Collaborator
This can have the adverse effect of grouping concurrent inference requests into one PyTorch call. The current inference code will break in that case.

Collaborator

@justheuristic left a comment

Setting a max batch size other than 1 in TaskPool for inference will cause it to merge requests from different users, which is not supported on the backend (both queries will fail).

There are several options to work around that:

  • implement multi-source inference on the backend side
  • make a custom task pool for inference
  • set batch size 1 for now, fix in a subsequent PR
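
A toy illustration of the failure mode, not hivemind's actual TaskPool: the `Request`/`form_batches`/`run_inference` helpers below are made up for this sketch. A generic pool with max_batch_size > 1 can group requests that belong to different users' inference sessions (each with its own attention cache) into one call, while max_batch_size=1 keeps every call within a single session.

```python
# Sketch only: a simplified stand-in for a batching task pool, to show why merging
# requests from different inference sessions into one call is unsafe.
from dataclasses import dataclass
from typing import Callable, List
import torch


@dataclass
class Request:
    session_id: int              # each user's inference session has its own cache
    hidden_states: torch.Tensor  # [1, seq_len, hidden_size]


def form_batches(queue: List[Request], max_batch_size: int) -> List[List[Request]]:
    """Greedy batching, as a generic pool would do it: fill each batch up to max_batch_size."""
    batches, current = [], []
    for req in queue:
        current.append(req)
        if len(current) == max_batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


def run_inference(batch: List[Request], step: Callable[[int, torch.Tensor], torch.Tensor]) -> None:
    sessions = {req.session_id for req in batch}
    if len(sessions) > 1:
        # The failure mode from the review: requests with different session state
        # were merged into a single call, which the backend cannot serve.
        raise RuntimeError(f"cannot merge inference requests from sessions {sessions}")
    merged = torch.cat([req.hidden_states for req in batch], dim=0)
    step(next(iter(sessions)), merged)


queue = [Request(0, torch.randn(1, 1, 16)), Request(1, torch.randn(1, 1, 16))]
dummy_step = lambda session_id, hidden: hidden  # stand-in for the real inference step

for batch in form_batches(queue, max_batch_size=1):   # safe: one request per call
    run_inference(batch, dummy_step)
# form_batches(queue, max_batch_size=4096) would group both users into one batch,
# and run_inference would raise.
```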

@artek0chumak
Collaborator Author

I fixed tests/black/isort and removed the max_batch_size change in server/handler.py.

@justheuristic merged commit 6ee942e into main on Jul 18, 2022