
Fix plugins #3017

Merged
merged 22 commits into main from plugins-hotfix on May 4, 2023

Conversation

@olliestanley (Collaborator)

The plugins feature was compatible only with text-generation-inference based workers and therefore worked on Dragan's machines but did not work on OA prod. This resolves the incompatibility.

@olliestanley olliestanley marked this pull request as draft May 2, 2023 23:35
@olliestanley (Collaborator, Author)

Marking this as a draft, as Dragan has identified another problem which will need to be remedied for plugins to work. There is no point merging this until that is fixed.

@andreaskoepf (Collaborator) left a comment

lgtm!

…an be used in chat_chain.py for proper truncation so its safe with usage of basic hf inference

Added torch.manual_seed to basic_hf_worker.py
@pevogam (Contributor) left a comment

@olliestanley Can you elaborate a bit more on what is being fixed or extended here relative to the original pull request? I noticed the merged original pull request no longer introduces the sample calculator plugin; what are the plans regarding this?

inference/worker/chat_chain_utils.py (review thread resolved)
@olliestanley (Collaborator, Author)

> @olliestanley Can you elaborate a bit more on what is being fixed or extended here regarding the original pull request? I noticed the merged original pull request no longer introduces the sample calculator plugin, what are the plans regarding this?

There were three problems in prod which did not show up in local testing, because prod uses the basic_hf_server worker whereas testing used the text-generation-inference worker:

  • the chain code uses the generate endpoint, which only exists in text-generation-inference, rather than generate_stream
  • the basic HF server does not support custom stopping sequences
  • the basic HF server does not support fixed random seeds

This PR resolves all three of those.
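Since the basic HF server cannot enforce custom stopping sequences server-side, truncation has to happen on the worker side after generation. A minimal sketch of that idea, assuming post-hoc truncation of the generated text (the helper name is hypothetical, not the PR's actual code in chat_chain.py):

```python
def truncate_at_stop_sequences(text: str, stop_sequences: list[str]) -> str:
    """Cut generated text at the earliest occurrence of any stop sequence.

    A client-side workaround for inference servers that cannot enforce
    custom stopping sequences themselves.
    """
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# e.g. stop the chain before the model hallucinates a tool observation:
truncate_at_stop_sequences("Answer.\nObservation: tool output", ["Observation:", "</s>"])
# returns "Answer.\n"
```

The same helper also covers end-of-sequence markers the server leaves in the text, since the earliest match wins.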

@olliestanley olliestanley marked this pull request as ready for review May 4, 2023 22:19
@olliestanley olliestanley enabled auto-merge (squash) May 4, 2023 22:31
@olliestanley olliestanley merged commit 5dd1025 into main May 4, 2023
1 check passed
@olliestanley olliestanley deleted the plugins-hotfix branch May 4, 2023 22:34
@flozi00 commented May 5, 2023

[image attachment]
Am I missing something to get it working?

{
  "model_config": {
    "model_id": "OpenAssistant/oasst-sft-7-llama-30b",
    "max_input_length": 1024,
    "max_total_length": 1792,
    "quantized": false
  },
  "sampling_parameters": {
    "top_k": 50,
    "top_p": null,
    "typical_p": null,
    "temperature": 0.35,
    "repetition_penalty": 1.1111111111111112,
    "max_new_tokens": 1024
  },
  "do_sample": true,
  "plugins": [
    {
      "url": "https://www.klarna.com/.well-known/ai-plugin.json",
      "enabled": true,
      "plugin_config": {
        "schema_version": "v1",
        "name_for_model": "KlarnaProducts",
        "name_for_human": "Klarna Shopping",
        "description_for_human": "Search and compare prices from thousands of online shops.",
        "description_for_model": "Assistant uses the Klarna plugin to get relevant product suggestions for any shopping or product discovery purpose. Assistant will reply with the following 3 paragraphs 1) Search Results 2) Product Comparison of the Search Results 3) Followup Questions. The first paragraph contains a list of the products with their attributes listed clearly and concisely as bullet points under the product, together with a link to the product and an explanation. Links will always be returned and should be shown to the user. The second paragraph compares the results returned in a summary sentence starting with \"In summary\". Assistant comparisons consider only the most important features of the products that will help them fit the users request, and each product mention is brief, short and concise. In the third paragraph assistant always asks helpful follow-up questions and end with a question mark. When assistant is asking a follow-up question, it uses it's product expertise to provide information pertaining to the subject of the user's request that may guide them in their search for the right product.",
        "api": {
          "type": "openapi",
          "url": "https://www.klarna.com/us/shopping/public/openai/v0/api-docs/",
          "has_user_authentication": false,
          "is_user_authenticated": false
        },
        "auth": {
          "type": "none"
        },
        "logo_url": "https://www.klarna.com/assets/sites/5/2020/04/27143923/klarna-K-150x150.jpg",
        "contact_email": "openai-products@klarna.com",
        "legal_info_url": "https://www.klarna.com/us/legal/",
        "endpoints": null
      },
      "trusted": false
    }
  ]
}
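The plugin entry above embeds the fields of an ai-plugin.json manifest under plugin_config. A minimal sketch of extracting the fields a worker would need, with names taken from the JSON above (the class and function names are hypothetical, not from the Open-Assistant codebase):

```python
import json
from dataclasses import dataclass


@dataclass
class PluginManifest:
    name_for_model: str
    description_for_model: str
    api_url: str
    auth_type: str


def parse_plugin_manifest(raw: str) -> PluginManifest:
    """Pull the model-facing fields out of a raw ai-plugin.json string."""
    cfg = json.loads(raw)
    return PluginManifest(
        name_for_model=cfg["name_for_model"],
        description_for_model=cfg["description_for_model"],
        api_url=cfg["api"]["url"],
        auth_type=cfg["auth"]["type"],
    )
```

The description_for_model field is what gets injected into the prompt, so a parse step like this is where a worker would also enforce length limits on untrusted plugin text.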

layterz pushed a commit to layterz/Open-Assistant that referenced this pull request May 11, 2023
The plugins feature was compatible only with `text-generation-inference` based
workers and therefore worked on Dragan's machines but did not work on OA
prod. This resolves the incompatibility.

---------

Co-authored-by: draganjovanovich <draganele@gmail.com>
5 participants