Commit
Add migration guide. Pull model name from config dict for LangChain models. Update usage example readme.
rmitsch committed Jun 28, 2023
1 parent b9305f4 commit 502697f
Showing 7 changed files with 116 additions and 24 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -1038,3 +1038,7 @@ PRs are always welcome!
If you have questions regarding the usage of `spacy-llm`, or want to give us feedback after giving it a spin, please use the
[discussion board](https://github.com/explosion/spaCy/discussions).
Bug reports can be filed on the [spaCy issue tracker](https://github.com/explosion/spaCy/issues). Thank you!

## Migration guides

Please refer to our [migration guide](migration_guide.md).
85 changes: 85 additions & 0 deletions migration_guide.md
@@ -0,0 +1,85 @@
# Migration guides

<details open>
<summary>0.3.x to 0.4.x</summary>

## `0.3.x` to `0.4.x`

`0.4.x` significantly refactors the code to make it more robust and the config more intuitive. 0.4.0 changes the config
paradigm from backend- to model-centric, which is reflected in a different config structure in the external API.

Remember that there are three different types of models: the first uses the native REST implementation to communicate
with hosted LLMs, the second builds on HuggingFace's `transformers` library to run models locally, and the third leverages
`langchain` to operate on hosted or local models. While the config for all three is rather similar (especially in
0.4.x), there are differences in how each type has to be configured. We show how to migrate your config from 0.3.x
to 0.4.x for each of these model types.

For all model types:
- The registry name has changed - instead of `llm_backends`, use `llm_models`.
- The `api` attribute has been removed.

### Models using REST

This is the default method to communicate with hosted models. Whenever you don't explicitly use LangChain models
(see section at the bottom) or run models locally, you are using this kind of model.

In `0.3.x`:
```ini
[components.llm.backend]
@llm_backends = "spacy.REST.v1"
api = "OpenAI"
config = {"model": "gpt-3.5-turbo", "temperature": 0.3}
```
In `0.4.x`:
```ini
[components.llm.model]
@llm_models = "spacy.gpt-3-5.v1"
name = "gpt-3-5-turbo"
config = {"temperature": 0.3}
```
Note that the factory function (marked with `@`) now refers to the model itself. Variants of the same model can be
specified with the `name` attribute - for `gpt-3.5` this could be `"gpt-3-5-turbo"` or `"gpt-3-5-turbo-16k"`.
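
For example, switching to the larger-context variant mentioned above only requires changing `name`; a minimal sketch reusing the registry entry from the example (the variant string is taken from the list above):

```ini
[components.llm.model]
@llm_models = "spacy.gpt-3-5.v1"
name = "gpt-3-5-turbo-16k"
config = {"temperature": 0.3}
```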

### Models using HuggingFace

On top of the changes described in the section above, HF models like `spacy.Dolly.v1` now accept `config_init` and
`config_run` to reflect that different arguments can be passed at initialization or run time.

In `0.3.x`:
```ini
[components.llm.backend]
@llm_backends = "spacy.Dolly_HF.v1"
model = "databricks/dolly-v2-3b"
config = {}
```
In `0.4.x`:
```ini
[components.llm.model]
@llm_models = "spacy.Dolly.v1"
name = "dolly-v2-3b" # or databricks/dolly-v2-3b - the prefix is optional
config_init = {} # Arguments passed to HF model at initialization time
config_run = {} # Arguments passed to HF model at inference time
```
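
As a sketch of how the two dicts might be used - the specific arguments below are assumptions for illustration, not part of this commit; they are simply forwarded to the underlying HF model at the respective stage:

```ini
[components.llm.model]
@llm_models = "spacy.Dolly.v1"
name = "dolly-v2-3b"
config_init = {"device": "cuda:0"}    # Assumed example: initialization-time argument
config_run = {"max_new_tokens": 256}  # Assumed example: inference-time argument
```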

### Models using LangChain

LangChain models are now accessible via `langchain.[API].[version]`, e.g. `langchain.OpenAI.v1`. Other than that, the
changes from 0.3.x to 0.4.x are identical to those for REST-based models.

In `0.3.x`:
```ini
[components.llm.backend]
@llm_backends = "spacy.LangChain.v1"
api = "OpenAI"
config = {"temperature": 0.3}
```

In `0.4.x`:
```ini
[components.llm.model]
@llm_models = "langchain.OpenAI.v1"
name = "gpt-3-5-turbo"
config = {"temperature": 0.3}
```
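
Once migrated, the config is used as before. A minimal sketch - assuming the block above lives in a `config.cfg` together with a task, and the relevant API key is set in the environment:

```python
from spacy_llm.util import assemble

nlp = assemble("config.cfg")
doc = nlp("Jack and Jill went up the hill.")
```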

</details>
4 changes: 2 additions & 2 deletions spacy_llm/models/langchain/model.py
@@ -75,6 +75,7 @@ def register_models() -> None:
for class_id, cls in type_to_cls_dict.items():

    def langchain_model(
+       name: str,
        query: Optional[
            Callable[
                ["langchain.llms.base.BaseLLM", Iterable[str]], Iterable[str]
@@ -83,11 +84,10 @@ def langchain_model(
        config: Dict[Any, Any] = SimpleFrozenDict(),
        langchain_class_id: str = class_id,
    ) -> Optional[Callable[[Iterable[Any]], Iterable[Any]]]:
-       model = config.pop("model")
        try:
            return LangChain(
                langchain_model=type_to_cls_dict[langchain_class_id](
-                   model_name=model, **config
+                   model_name=name, **config
                ),
                query=query_langchain() if query is None else query,
            )
3 changes: 2 additions & 1 deletion spacy_llm/tests/models/test_langchain.py
@@ -6,7 +6,8 @@
PIPE_CFG = {
    "model": {
        "@llm_models": "langchain.OpenAI.v1",
-       "config": {"temperature": 0.3, "model": "ada"},
+       "name": "ada",
+       "config": {"temperature": 0.3},
        "query": {"@llm_queries": "spacy.CallLangChain.v1"},
    },
    "task": {"@llm_tasks": "spacy.NoOp.v1"},
2 changes: 1 addition & 1 deletion spacy_llm/tests/test_combinations.py
@@ -39,7 +39,7 @@ def test_combinations(model: str, task: str, n_process: int):
    config = copy.deepcopy(PIPE_CFG)
    config["model"]["@llm_models"] = model
    if "langchain" in model:
-       config["model"]["config"] = {"model": "ada"}
+       config["model"]["name"] = "ada"
    config["task"]["@llm_tasks"] = task

    # Configure task-specific settings.
39 changes: 20 additions & 19 deletions usage_examples/README.md
@@ -7,8 +7,8 @@ configuration and an optional `examples.yml` file for few-shot annotation.
## The configuration file

Each configuration file contains an `llm` component that takes in a `task` and a
-`backend` as its parameters. The `task` defines how the prompt is structured and
-how the corresponding LLM output will be parsed whereas the `backend` defines
+`model` as its parameters. `task` defines how the prompt is structured and
+how the corresponding LLM output will be parsed whereas `model` defines
which model to use and how to connect to it.

```ini
@@ -24,8 +24,8 @@ factory = "llm"
...

# Defines which model to use (open-source or third-party API) and how to connect
-# to it (e.g., REST, MiniChain, LangChain).
-[components.llm.backend]
+# to it (e.g., REST, LangChain, locally via HuggingFace, ...).
+[components.llm.model]
...
```

@@ -53,7 +53,7 @@ need to implement two functions:
- **`generate_prompts(docs: Iterable[Doc]) -> Iterable[str]`**: a function that
takes in a list of spaCy [`Doc`](https://spacy.io/api/doc) objects and transforms
them into a list of prompts. These prompts will then be sent to the LLM in the
-`backend`.
+`model`.
- **`parse_responses(docs: Iterable[Doc], responses: Iterable[str]) -> Iterable[Doc]`**: a function for parsing the LLM's outputs into spaCy
[`Doc`](https://spacy.io/api/doc) objects. You also have access to the input
`Doc` objects so you can store the outputs into one of its attributes.
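
To make these two functions concrete, here is a minimal, hypothetical task sketch (the registry handle and class are made up for illustration and not part of this commit):

```python
from typing import Iterable

from spacy.tokens import Doc
from spacy_llm.registry import registry


class EchoTask:
    def generate_prompts(self, docs: Iterable[Doc]) -> Iterable[str]:
        # One prompt per doc - here we simply ask the LLM to echo the text.
        for doc in docs:
            yield f"Echo the following text: {doc.text}"

    def parse_responses(
        self, docs: Iterable[Doc], responses: Iterable[str]
    ) -> Iterable[Doc]:
        # Store the raw response on the doc so downstream code can inspect it.
        for doc, response in zip(docs, responses):
            doc.user_data["llm_response"] = response
            yield doc


@registry.llm_tasks("my.EchoTask.v1")  # Hypothetical registry handle
def make_echo_task() -> EchoTask:
    return EchoTask()
```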
@@ -97,43 +97,44 @@ You can check sample tasks for Named Entity Recognition and text categorization
in the `spacy_llm/tasks/` directory. We also recommend checking out the
`spacy.NoOp.v1` task for a barebones implementation to pattern your task from.

-## Using LangChain and other integrated third-party prompting libraries
+## Using LangChain

`spacy-llm` integrates bindings to a number of libraries centered on prompt management and LLM usage to allow users
to leverage their functionality in their spaCy workflows. A built-in example for this is [LangChain](https://github.com/hwchase17/langchain)

-An integrated third-party library can be used by configuring the `llm` component to use the respective backend, e. g.:
+An integrated third-party library can be used by configuring the `llm` component to use the respective model, e.g.:

```ini
-[components.llm.backend]
-@llm_models = "spacy.LangChain.v1"
+[components.llm.model]
+@llm_models = "langchain.OpenAI.v1"
+name = "gpt-3.5-turbo"
```


<!-- The `usage_examples` directory contains example for all integrated third-party -->

-## Writing your own backend
+## Writing your own model

-In `spacy-llm`, the [**backend**](../README.md#backend) is responsible for the
+In `spacy-llm`, the [**model**](../README.md#models) is responsible for the
interaction with the actual LLM model. The latter can be an
[API-based service](../README.md#spacyrestv1), or a local model - whether
you [downloaded it from the Hugging Face Hub](../README.md#spacydollyhfv1)
directly or finetuned it with proprietary data.

-`spacy-llm` lets you implement your own custom backend so you can try out the
+`spacy-llm` lets you implement your own custom model so you can try out the
latest LLM interface out there. Bear in mind that tasks are responsible for
creating the prompt and parsing the response – and both can be arbitrary objects.
-Hence, a backend's call signature should be consistent with that of the task you'd like it to run.
+Hence, a model's call signature should be consistent with that of the task you'd like it to run.

In other words, `spacy-llm` roughly performs the following pseudo-code behind the scenes:

```python
prompts = task.generate_prompts(docs)
-responses = backend(prompts)
+responses = model(prompts)
docs = task.parse_responses(docs, responses)
```

-Let's write a dummy backend that provides a random output for the
+Let's write a dummy model that provides a random output for the
[text classification task](../README.md#spacytextcatv1).

```python
@@ -158,18 +159,18 @@ def random_textcat(labels: str):
labels = LABEL1,LABEL2,LABEL3


-[components.llm.backend]
+[components.llm.model]
@llm_models = "RandomClassification.v1"
labels = ${components.llm.task.labels} # Make sure to use the same label
...
```
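
The body of `random_textcat` is collapsed in the diff above; as a purely hypothetical sketch (the actual implementation in the repository may differ), such a dummy model could look like this:

```python
import random
from typing import Iterable

from spacy_llm.registry import registry


@registry.llm_models("RandomClassification.v1")
def random_textcat(labels: str):
    label_list = labels.split(",")

    def _classify(prompts: Iterable[str]) -> Iterable[str]:
        # Ignore the prompts and return a random label per input.
        for _ in prompts:
            yield random.choice(label_list)

    return _classify
```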

-Of course, this particular backend is not very realistic
+Of course, this particular model is not very realistic
(it does not even interact with an actual LLM model!).
But it does show how you would go about writing custom
and arbitrary logic to interact with any LLM implementation.

-Note that in all built-in tasks prompts and responses are expected to be of type `str`, while all built-in backends
-support `str` (or `Any`) types. All built-in tasks and backends are therefore inter-operable. It's possible to work with
+Note that in all built-in tasks prompts and responses are expected to be of type `str`, while all built-in models
+support `str` (or `Any`) types. All built-in tasks and models are therefore inter-operable. It's possible to work with
arbitrary objects instead of `str` though - which might be useful if you want some third-party abstractions for prompts
or responses.
3 changes: 2 additions & 1 deletion usage_examples/ner_langchain_openai/ner.cfg
@@ -14,4 +14,5 @@ examples = null

[components.llm.model]
@llm_models = "langchain.OpenAI.v1"
config = {"model": "text-davinci-002"}
name = "text-davinci-002"
config = {}
