feat: `gRPC`-based backends #743

mudler · 2023-07-14T20:31:47Z

Description

This PR does several things:

- Fixes the falcon backend, which now uses https://github.com/cmp-nct/ggllm.cpp
- Gets rid of the hacks that worked around duplicate symbols caused by libraries using different versions of ggml
- Converts the backends to gRPC services
- Various refactors: merges back an old branch I had lying around that refactors and breaks down the packages. I couldn't get to it before because of other compilation issues, which now seem to have gone away
- Adds tests for:
  - token stream
  - stablediffusion
  - tts
  - functions

Coverage is now quite good; we are only missing 1:1 tests of the backends. We do, however, already test openllama, rwkv and gpt4all.

Notes for Reviewers

Moving to gRPC increases code complexity but reduces maintenance overall. The hacks needed to compile everything into a single fat binary are now gone, and if a backend crashes it no longer crashes the main process (which will attempt to recover the gRPC service automatically).
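
To illustrate the crash-isolation idea, here is a minimal sketch of a supervisor loop that restarts a backend after it fails. It is not the code from this PR: the `backend` interface, `supervise`, and `flakyBackend` are hypothetical names, and the backend is simulated in-process rather than started as a real gRPC service.

```go
package main

import (
	"errors"
	"fmt"
)

// backend is a hypothetical stand-in for a gRPC backend service.
// Run blocks until the backend exits; a non-nil error means it crashed.
type backend interface {
	Run() error
}

// supervise runs b and restarts it after each crash, up to maxRestarts times.
// It returns the number of restarts performed and the final error
// (nil if the backend eventually exited cleanly).
func supervise(b backend, maxRestarts int) (int, error) {
	restarts := 0
	for {
		err := b.Run()
		if err == nil {
			return restarts, nil
		}
		if restarts >= maxRestarts {
			return restarts, fmt.Errorf("giving up after %d restarts: %w", restarts, err)
		}
		restarts++
	}
}

// flakyBackend simulates a backend that crashes a fixed number of times
// before exiting cleanly. It exists only for this illustration.
type flakyBackend struct {
	crashesLeft int
}

func (f *flakyBackend) Run() error {
	if f.crashesLeft > 0 {
		f.crashesLeft--
		return errors.New("simulated backend crash")
	}
	return nil
}

func main() {
	// The caller survives two simulated crashes and reports them.
	restarts, err := supervise(&flakyBackend{crashesLeft: 2}, 5)
	fmt.Println(restarts, err) // prints "2 <nil>"
}
```

The point of the sketch is that the failure stays inside `Run` and is reported to the caller as an error, so the supervising process keeps running and decides whether to retry.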

The downsides are that the resulting binary is bigger and starting the internal services is a bit convoluted.
The gain is notable despite the cons, since we are now free to ship different versions of the same backend with relative ease. We can also now support multiple requests in parallel by allocating more services per model, but this can be done in a follow-up.
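
As a rough idea of what "more services per model" could look like, the sketch below hands out service addresses for one model in round-robin order. This is future work in the PR, so everything here is an assumption: `servicePool`, `newServicePool`, `pick`, and the addresses are hypothetical and do not reflect any existing code.

```go
package main

import (
	"fmt"
	"sync"
)

// servicePool is a hypothetical pool of backend service addresses
// that all serve the same model.
type servicePool struct {
	mu    sync.Mutex
	addrs []string
	next  int
}

// newServicePool builds a pool from the given addresses.
func newServicePool(addrs ...string) *servicePool {
	return &servicePool{addrs: addrs}
}

// pick returns the next address in round-robin order,
// or an empty string if the pool has no services.
// It is safe to call from multiple goroutines.
func (p *servicePool) pick() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.addrs) == 0 {
		return ""
	}
	addr := p.addrs[p.next]
	p.next = (p.next + 1) % len(p.addrs)
	return addr
}

func main() {
	// Two made-up local services for the same model.
	pool := newServicePool("127.0.0.1:50051", "127.0.0.1:50052")
	for i := 0; i < 3; i++ {
		fmt.Println(pool.pick())
	}
}
```

Round-robin is only the simplest choice; a real implementation might instead track which services are busy and route each request to an idle one.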

Signed commits

Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>


mudler · 2023-07-15T07:50:39Z

I've been playing with this here. I'm going to merge it, run a few rounds of tests, and fix things on master with follow-ups if necessary.

tmc · 2023-07-18T01:45:21Z

@mudler This is a great change, but I don't see a good example of how to use the new gRPC-based models. Can you point me in the right direction to run falcon7b via gRPC?

mudler marked this pull request as draft July 14, 2023 20:32

mudler force-pushed the grpc branch from 503ac46 to 0130bd5 Compare July 14, 2023 20:35

mudler linked an issue Jul 14, 2023 that may be closed by this pull request

Issue regarding falcon-7b quantized #728

Closed

mudler force-pushed the grpc branch 2 times, most recently from a68c474 to dcc3a90 Compare July 14, 2023 20:48

mudler changed the title ~~[wip] grpc~~ [wip] feat: gRPC-based backends Jul 14, 2023

mudler added 12 commits July 15, 2023 01:19

feat: add falcon ggllm via grpc client

b816009

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat: move llama to a grpc

58f6aab

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat: move gpt4all to a grpc service

ae533ca

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat: use gRPC for transformers

f2f1d7f

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat: various refactorings

5dcfdbe

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat: move other backends to grpc

1d0ed95

This finally makes everything more consistent

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat: run all tests

189cb3a

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

fix: fix makefile error

7f3de3c

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

fix: CI fixes

98e73ed

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

fix: Makefile

26e510b

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

fix: fix LDFLAGS for rwkv.cpp

c0a91ab

Previously the libs were added by other deps that made the linker add those as well (by chance).

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

fix: fix copy

f193f56

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

mudler force-pushed the grpc branch from dcb1060 to f193f56 Compare July 14, 2023 23:19

mudler changed the title ~~[wip] feat: gRPC-based backends~~ feat: gRPC-based backends Jul 15, 2023

mudler marked this pull request as ready for review July 15, 2023 07:43

mudler merged commit e3cabb5 into master Jul 15, 2023
14 checks passed

mudler deleted the grpc branch July 15, 2023 07:50

mudler added the enhancement New feature or request label Jul 16, 2023
