Added database to inference server #1446
Conversation
await asyncio.sleep(1)
continue

chat.message_request_state = MessageRequestState.in_progress
chat_repository.set_chat_state(chat.id, interface.MessageRequestState.in_progress)
The "dequeue" operation needs to set the chat immediately into a state that prevents others clients from also dequeuing it. Imagine we have 5k connected inference clients and all want to handle chats... this dequeue op will be a massive congestion point. A possible strategy could be to find and update in one sql-statement sent to the db .. or to use a dedicated queue service.
yes absolutely, this part will have to be reworked anyway; probably best to do it with Redis, too.
I'll merge this one in and improve it later
Looks good. I need to actually test it to get a feeling for it. My major concern is the chat-dequeue/worker-handshake op: as currently implemented, the DB session would probably have to run in serializable mode. The dequeue op (find and assign a matching chat) IMO must be atomic from a db-client perspective so that even 10k concurrent requests are no problem.
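For reference, a sketch of forcing serializable isolation with SQLAlchemy (the connection URL is a placeholder); the single-statement SKIP LOCKED approach sketched above would avoid the need for it:

```python
from sqlalchemy import create_engine

# isolation_level applies to every connection this engine creates.
engine = create_engine(
    "postgresql://user:pass@localhost/inference",  # placeholder URL
    isolation_level="SERIALIZABLE",
)
```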
inference/worker/utils.py
if time.time() > time_limit:
    raise
logger.warning("Inference server not ready. Retrying...")
time.sleep(5)
Maybe we could randomize this delay to de-synchronize client retry requests. Otherwise all clients will try to reconnect at roughly the same time after a server restart etc. Something like the sketch below.
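A minimal sketch of the jittered retry, assuming the `time_limit` and `logger` from the diff above; `check_server_ready` and the jitter bounds are hypothetical:

```python
import random
import time

while True:
    try:
        check_server_ready()  # hypothetical readiness probe
        break
    except Exception:
        if time.time() > time_limit:
            raise
        # Uniform jitter spreads reconnect attempts across clients, so a
        # server restart does not trigger a synchronized thundering herd.
        delay = random.uniform(2.5, 7.5)
        logger.warning(f"Inference server not ready. Retrying in {delay:.1f}s...")
        time.sleep(delay)
```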
good idea
Also includes some robustness improvements to the inference code and some cleanup.