Commit
Add migration guide. Pull model name from config dict for LangChain models. Update usage example readme.
rmitsch committed Jun 28, 2023
1 parent b9305f4 commit 502697f
Showing 7 changed files with 116 additions and 24 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -1038,3 +1038,7 @@ PRs are always welcome!
If you have questions regarding the usage of `spacy-llm`, or want to give us feedback after giving it a spin, please use the
[discussion board](https://github.com/explosion/spaCy/discussions).
Bug reports can be filed on the [spaCy issue tracker](https://github.com/explosion/spaCy/issues). Thank you!

## Migration guides

Please refer to our [migration guide](migration_guide.md).
85 changes: 85 additions & 0 deletions migration_guide.md
@@ -0,0 +1,85 @@
# Migration guides

<details open>
<summary>0.3.x to 0.4.x</summary>

## `0.3.x` to `0.4.x`

`0.4.x` significantly refactors the code to make it more robust and the config more intuitive. 0.4.0 changes the config
paradigm from backend- to model-centric, which is reflected in a different config structure in the external API.

Remember that there are three different types of models: the first uses the native REST implementation to communicate
with hosted LLMs, the second builds on HuggingFace's `transformers` library to run models locally, and the third leverages
`langchain` to operate on hosted or local models. While the config for all three is rather similar (especially in
0.4.x), there are differences in how each type has to be configured. We show how to migrate your config from 0.3.x
to 0.4.x for each of these model types.

For all model types:
- The registry name has changed - instead of `llm_backends`, use `llm_models`.
- The `api` attribute has been removed.

### Models using REST

This is the default method to communicate with hosted models. Whenever you don't explicitly use LangChain models
(see section at the bottom) or run models locally, you are using this kind of model.

In `0.3.x`:
```ini
[components.llm.backend]
@llm_backends = "spacy.REST.v1"
api = "OpenAI"
config = {"model": "gpt-3.5-turbo", "temperature": 0.3}
```
In `0.4.x`:
```ini
[components.llm.model]
@llm_models = "spacy.gpt-3-5.v1"
name = "gpt-3-5-turbo"
config = {"temperature": 0.3}
```
Note that the factory function (marked with `@`) now refers to the model itself. Variants of the same model can be
specified with the `name` attribute - for `gpt-3.5` this could be `"gpt-3-5-turbo"` or `"gpt-3-5-turbo-16k"`.
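
For example, switching to the larger-context variant mentioned above only requires changing `name`; a minimal sketch reusing the registry entry from the example (the variant string is taken from the list above):

```ini
[components.llm.model]
@llm_models = "spacy.gpt-3-5.v1"
name = "gpt-3-5-turbo-16k"
config = {"temperature": 0.3}
```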

### Models using HuggingFace

On top of the changes described in the section above, HF models like `spacy.Dolly.v1` now accept `config_init` and
`config_run` to reflect that different arguments can be passed at initialization or run time.

In `0.3.x`:
```ini
[components.llm.backend]
@llm_backends = "spacy.Dolly_HF.v1"
model = "databricks/dolly-v2-3b"
config = {}
```
In `0.4.x`:
```ini
[components.llm.model]
@llm_models = "spacy.Dolly.v1"
name = "dolly-v2-3b" # or databricks/dolly-v2-3b - the prefix is optional
config_init = {} # Arguments passed to HF model at initialization time
config_run = {} # Arguments passed to HF model at inference time
```
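
As a sketch of how the two dicts might be used - the specific arguments below are assumptions for illustration, not part of this commit; they are simply forwarded to the underlying HF model at the respective stage:

```ini
[components.llm.model]
@llm_models = "spacy.Dolly.v1"
name = "dolly-v2-3b"
config_init = {"device": "cuda:0"}    # Assumed example: initialization-time argument
config_run = {"max_new_tokens": 256}  # Assumed example: inference-time argument
```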

### Models using LangChain

LangChain models are now accessible via `langchain.[API].[version]`, e.g. `langchain.OpenAI.v1`. Other than that, the
changes from 0.3.x to 0.4.x are identical to those for REST-based models.

In `0.3.x`:
```ini
[components.llm.backend]
@llm_backends = "spacy.LangChain.v1"
api = "OpenAI"
config = {"temperature": 0.3}
```

In `0.4.x`:
```ini
[components.llm.model]
@llm_models = "langchain.OpenAI.v1"
name = "gpt-3-5-turbo"
config = {"temperature": 0.3}
```
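
Once migrated, the config is used as before. A minimal sketch - assuming the block above lives in a `config.cfg` together with a task, and the relevant API key is set in the environment:

```python
from spacy_llm.util import assemble

nlp = assemble("config.cfg")
doc = nlp("Jack and Jill went up the hill.")
```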

</details>
4 changes: 2 additions & 2 deletions spacy_llm/models/langchain/model.py
@@ -75,6 +75,7 @@ def register_models() -> None:
for class_id, cls in type_to_cls_dict.items():

    def langchain_model(
+       name: str,
        query: Optional[
            Callable[
                ["langchain.llms.base.BaseLLM", Iterable[str]], Iterable[str]
@@ -83,11 +84,10 @@ def langchain_model(
        config: Dict[Any, Any] = SimpleFrozenDict(),
        langchain_class_id: str = class_id,
    ) -> Optional[Callable[[Iterable[Any]], Iterable[Any]]]:
-       model = config.pop("model")
        try:
            return LangChain(
                langchain_model=type_to_cls_dict[langchain_class_id](
-                   model_name=model, **config
+                   model_name=name, **config
                ),
                query=query_langchain() if query is None else query,
            )
3 changes: 2 additions & 1 deletion spacy_llm/tests/models/test_langchain.py
@@ -6,7 +6,8 @@
PIPE_CFG = {
    "model": {
        "@llm_models": "langchain.OpenAI.v1",
-       "config": {"temperature": 0.3, "model": "ada"},
+       "name": "ada",
+       "config": {"temperature": 0.3},
        "query": {"@llm_queries": "spacy.CallLangChain.v1"},
    },
    "task": {"@llm_tasks": "spacy.NoOp.v1"},
2 changes: 1 addition & 1 deletion spacy_llm/tests/test_combinations.py
@@ -39,7 +39,7 @@ def test_combinations(model: str, task: str, n_process: int):
    config = copy.deepcopy(PIPE_CFG)
    config["model"]["@llm_models"] = model
    if "langchain" in model:
-       config["model"]["config"] = {"model": "ada"}
+       config["model"]["name"] = "ada"
    config["task"]["@llm_tasks"] = task

    # Configure task-specific settings.
39 changes: 20 additions & 19 deletions usage_examples/README.md
@@ -7,8 +7,8 @@ configuration and an optional `examples.yml` file for few-shot annotation.
## The configuration file

Each configuration file contains an `llm` component that takes in a `task` and a
-`backend` as its parameters. The `task` defines how the prompt is structured and
-how the corresponding LLM output will be parsed whereas the `backend` defines
+`model` as its parameters. `task` defines how the prompt is structured and
+how the corresponding LLM output will be parsed whereas `model` defines
which model to use and how to connect to it.

```ini
@@ -24,8 +24,8 @@ factory = "llm"
...

# Defines which model to use (open-source or third-party API) and how to connect
-# to it (e.g., REST, MiniChain, LangChain).
-[components.llm.backend]
+# to it (e.g., REST, LangChain, locally via HuggingFace, ...).
+[components.llm.model]
...
```

@@ -53,7 +53,7 @@ need to implement two functions:
- **`generate_prompts(docs: Iterable[Doc]) -> Iterable[str]`**: a function that
takes in a list of spaCy [`Doc`](https://spacy.io/api/doc) objects and transforms
them into a list of prompts. These prompts will then be sent to the LLM in the
-`backend`.
+`model`.
- **`parse_responses(docs: Iterable[Doc], responses: Iterable[str]) -> Iterable[Doc]`**: a function for parsing the LLM's outputs into spaCy
[`Doc`](https://spacy.io/api/doc) objects. You also have access to the input
`Doc` objects so you can store the outputs into one of its attributes.
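
To make these two functions concrete, here is a minimal, hypothetical task sketch (the registry handle and class are made up for illustration and not part of this commit):

```python
from typing import Iterable

from spacy.tokens import Doc
from spacy_llm.registry import registry


class EchoTask:
    def generate_prompts(self, docs: Iterable[Doc]) -> Iterable[str]:
        # One prompt per doc - here we simply ask the LLM to echo the text.
        for doc in docs:
            yield f"Echo the following text: {doc.text}"

    def parse_responses(
        self, docs: Iterable[Doc], responses: Iterable[str]
    ) -> Iterable[Doc]:
        # Store the raw response on the doc so downstream code can inspect it.
        for doc, response in zip(docs, responses):
            doc.user_data["llm_response"] = response
            yield doc


@registry.llm_tasks("my.EchoTask.v1")  # Hypothetical registry handle
def make_echo_task() -> EchoTask:
    return EchoTask()
```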
@@ -97,43 +97,44 @@ You can check sample tasks for Named Entity Recognition and text categorization
in the `spacy_llm/tasks/` directory. We also recommend checking out the
`spacy.NoOp.v1` task for a barebones implementation to pattern your task from.

-## Using LangChain and other integrated third-party prompting libraries
+## Using LangChain

`spacy-llm` integrates bindings to a number of libraries centered on prompt management and LLM usage to allow users
to leverage their functionality in their spaCy workflows. A built-in example for this is [LangChain](https://github.com/hwchase17/langchain)

-An integrated third-party library can be used by configuring the `llm` component to use the respective backend, e. g.:
+An integrated third-party library can be used by configuring the `llm` component to use the respective model, e.g.:

```ini
-[components.llm.backend]
-@llm_models = "spacy.LangChain.v1"
+[components.llm.model]
+@llm_models = "langchain.OpenAI.v1"
+name = "gpt-3.5-turbo"
```


<!-- The `usage_examples` directory contains example for all integrated third-party -->

-## Writing your own backend
+## Writing your own model

-In `spacy-llm`, the [**backend**](../README.md#backend) is responsible for the
+In `spacy-llm`, the [**model**](../README.md#models) is responsible for the
interaction with the actual LLM model. The latter can be an
[API-based service](../README.md#spacyrestv1), or a local model - whether
you [downloaded it from the Hugging Face Hub](../README.md#spacydollyhfv1)
directly or finetuned it with proprietary data.

-`spacy-llm` lets you implement your own custom backend so you can try out the
+`spacy-llm` lets you implement your own custom model so you can try out the
latest LLM interface out there. Bear in mind that tasks are responsible for
creating the prompt and parsing the response – and both can be arbitrary objects.
-Hence, a backend's call signature should be consistent with that of the task you'd like it to run.
+Hence, a model's call signature should be consistent with that of the task you'd like it to run.

In other words, `spacy-llm` roughly performs the following pseudo-code behind the scenes:

```python
prompts = task.generate_prompts(docs)
-responses = backend(prompts)
+responses = model(prompts)
docs = task.parse_responses(docs, responses)
```

-Let's write a dummy backend that provides a random output for the
+Let's write a dummy model that provides a random output for the
[text classification task](../README.md#spacytextcatv1).

```python
@@ -158,18 +159,18 @@ def random_textcat(labels: str):
labels = LABEL1,LABEL2,LABEL3


-[components.llm.backend]
+[components.llm.model]
@llm_models = "RandomClassification.v1"
labels = ${components.llm.task.labels} # Make sure to use the same label
...
```
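
The body of `random_textcat` is collapsed in the diff above; as a purely hypothetical sketch (the actual implementation in the repository may differ), such a dummy model could look like this:

```python
import random
from typing import Iterable

from spacy_llm.registry import registry


@registry.llm_models("RandomClassification.v1")
def random_textcat(labels: str):
    label_list = labels.split(",")

    def _classify(prompts: Iterable[str]) -> Iterable[str]:
        # Ignore the prompts and return a random label per input.
        for _ in prompts:
            yield random.choice(label_list)

    return _classify
```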

-Of course, this particular backend is not very realistic
+Of course, this particular model is not very realistic
(it does not even interact with an actual LLM model!).
But it does show how you would go about writing custom
and arbitrary logic to interact with any LLM implementation.

-Note that in all built-in tasks prompts and responses are expected to be of type `str`, while all built-in backends
-support `str` (or `Any`) types. All built-in tasks and backends are therefore inter-operable. It's possible to work with
+Note that in all built-in tasks prompts and responses are expected to be of type `str`, while all built-in models
+support `str` (or `Any`) types. All built-in tasks and models are therefore inter-operable. It's possible to work with
arbitrary objects instead of `str` though - which might be useful if you want some third-party abstractions for prompts
or responses.
3 changes: 2 additions & 1 deletion usage_examples/ner_langchain_openai/ner.cfg
@@ -14,4 +14,5 @@ examples = null

[components.llm.model]
@llm_models = "langchain.OpenAI.v1"
config = {"model": "text-davinci-002"}
name = "text-davinci-002"
config = {}
