Document inference pipeline #1473

Closed
yk opened this issue Feb 11, 2023 · 10 comments · Fixed by #3119

Comments

yk (Collaborator) commented Feb 11, 2023

Currently, the inference system lives under /inference. After installing the various dependencies, plus tmux, you can run it using the full-dev-setup.sh script (or use the docker compose inference profile).

The inference system consists of multiple parts:

  • one central inference server that connects to a Postgres and a Redis database
  • many external workers that perform the actual inference
    • each worker again consists of two parts: a Python connector and a text-generation backend
    • the text-generation backend exposes an HTTP API
    • the Python connector connects to that HTTP API and also to the central server via websocket (see the sketch after this list)
  • optionally, the text client; this one is mainly for testing
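
For illustration, here is a minimal, hypothetical sketch of a worker's two connections. This is not the actual worker code; the URLs, endpoint names, and message fields are assumptions made up for the example.

```python
# Hypothetical sketch of one worker: an HTTP client for the text-generation
# backend plus a websocket connection to the central inference server.
# All URLs, endpoints, and message fields below are illustrative only.
import asyncio
import json

import requests
import websockets  # pip install websockets

TEXT_GENERATION_URL = "http://localhost:8080"    # text-generation backend (assumed)
CENTRAL_SERVER_WS = "ws://localhost:8000/work"   # central inference server (assumed)


def generate(prompt: str) -> str:
    # The text-generation backend exposes an HTTP API; the /generate endpoint
    # and payload shape used here are assumptions.
    resp = requests.post(f"{TEXT_GENERATION_URL}/generate", json={"inputs": prompt})
    resp.raise_for_status()
    return resp.json().get("generated_text", "")


async def work_loop() -> None:
    # The Python connector keeps a websocket open to the central server,
    # receives work requests, runs them against the backend (a blocking call
    # here, which is fine for a sketch), and sends the completions back.
    async with websockets.connect(CENTRAL_SERVER_WS) as ws:
        async for message in ws:
            request = json.loads(message)
            completion = generate(request["prompt"])
            await ws.send(json.dumps({"id": request["id"], "text": completion}))


if __name__ == "__main__":
    asyncio.run(work_loop())
```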

The goal of this issue is to document the inference pipeline: both how the individual parts are built (e.g. how the central server manages users, workers, etc.) and how they connect. This can be done using text, diagrams, or whatever works best; if a diagram, then preferably a mermaid diagram directly in Markdown.

alando46 (Contributor) commented Feb 17, 2023

I've actually been working on this independently, just as a way to understand how the inference works. I'd be happy to get a PR up with what I understand thus far (I can do mermaid markdown) and we can iterate. Where should the documentation live? Inside /inference? @yk

yk (Collaborator, Author) commented Feb 24, 2023

Hey @alando46, sorry, just seeing this now :) I think inside docs/ would be cool, as that gets deployed to https://projects.laion.ai/Open-Assistant/docs/intro

andrewm4894 (Collaborator) commented:

@alando46 I'm happy to help too with adding anything into /docs.

Check out this: https://github.com/LAION-AI/Open-Assistant/blob/main/docs/README.md#contributing

But feel free to ping me if you need any help making a PR.

alando46 (Contributor) commented Mar 7, 2023

Thanks @andrewm4894 for the offer to help. Things are going well on the documentation: I've been able to make my way through the bulk of the Open-Assistant inference code (and can expand to HF's text-generation-inference next); I just want to run some tests with print statements to confirm my notes on the control flow.

I've tried following inference development variants 2 and 3 (variant 1, the docker-compose workflow, seemed to be missing some required services), and with both I get the following error when invoking the inference worker:

(inf) gitpod /workspace/Open-Assistant/inference/worker (feature/inference_documentation) $ API_KEY=0000 python __main__.py
2023-03-07 02:07:55.853 | INFO     | __main__:main:16 - Inference protocol version: 1
Traceback (most recent call last):
  File "/workspace/Open-Assistant/inference/worker/__main__.py", line 149, in <module>
    main()
  File "/workspace/Open-Assistant/inference/worker/__main__.py", line 18, in main
    tokenizer = Tokenizer.from_pretrained(settings.model_id)
Exception: Model "distilgpt2" on the Hub doesn't have a tokenizer

I should note that (inf) is a virtualenv I created with all required dependencies.

In order to get the tokenizer, do I need to authenticate with Hugging Face? Their documentation seems to suggest (unless I'm missing something) that this workflow should be allowed: https://huggingface.co/docs/hub/models-downloading#integrated-libraries

Any suggestions? Thanks in advance.

yk (Collaborator, Author) commented Mar 7, 2023

That doesn't really make sense; it should just let you download it. Maybe there's another problem, like a network issue (i.e. you get a 404), a cert issue, or HF being temporarily down... it could be any number of things.
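
One way to narrow it down is to run the same tokenizers call in isolation, outside the worker; distilgpt2 is a public model, so no Hugging Face authentication should be needed. A minimal check:

```python
# Minimal check that the tokenizer for a public model can be fetched from the
# Hub, independent of the worker code. This mirrors the Tokenizer.from_pretrained
# call in the worker's __main__.py; if it fails with the same error, the problem
# is the network/environment rather than the worker.
from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("distilgpt2")
print(tokenizer.encode("Hello world").tokens)
```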

alando46 (Contributor) commented:

OK, thanks for the clarification @yk. I was able to download the tokenizer on a different machine, so it was indeed some weird connection issue with Gitpod. Working on the PR now...

andreaskoepf (Collaborator) commented:

Proper documentation of the inference pipeline is very important. @alando46 Is there any progress? Are you still working on it, or should we look for someone else? Do you already have intermediate results?

alando46 (Contributor) commented May 5, 2023

@andreaskoepf Yup, I've made a good amount of progress. Thanks for checking in; let me finalize what I have and I can get something up soon.

andrewm4894 (Collaborator) commented May 6, 2023

I have made a PR here to add the inference server FastAPI docs to the docs site: #3059
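
For context, FastAPI applications expose a machine-readable OpenAPI schema by default, which is the kind of output a docs-site page can be built from. A minimal sketch of the mechanism (the real inference server's app, routes, and metadata differ):

```python
# Minimal sketch of how a FastAPI app exposes schema/docs endpoints by default.
# The route and title below are illustrative, not the actual inference server.
from fastapi import FastAPI

app = FastAPI(title="Example Inference Server")


@app.get("/health")
def health() -> dict:
    """Illustrative route; the actual server's endpoints differ."""
    return {"status": "ok"}

# With the app running (e.g. `uvicorn app_module:app`), FastAPI serves the
# OpenAPI schema at /openapi.json and interactive docs at /docs; that JSON
# is what can be rendered on a docs site.
```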

alando46 (Contributor) commented May 10, 2023

@andreaskoepf here is the WIP: #3119

I need to wrap up that final section; it is mostly complete, but the codebase has been updated, so I need to review and verify things are correct.

Let me know what you think.

olliestanley linked a pull request Jun 12, 2023 that will close this issue
olliestanley added a commit that referenced this issue Jun 12, 2023
This is a mostly done (although not totally complete) PR with a
technical overview of the inference architecture. I'm looking forward to
high-level feedback (general layout, flow of documentation) or specific
suggestions (I'm sure I made some errors or missed some details). I will
try to wrap up the final section soon.

See related discussion on the issue:
#1473 (comment)

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
Co-authored-by: Oliver Stanley <olivergestanley@gmail.com>