Plugin hook: register_models #53

Closed
Tracked by #49
simonw opened this issue Jun 17, 2023 · 28 comments
Labels: enhancement, plugins
Milestone: 0.5

Comments


simonw commented Jun 17, 2023

Blocks:

simonw changed the title from "register_models - return list of llm.Model" to "Plugin hook: register_models - return list of llm.Model" on Jun 17, 2023
simonw added the enhancement and plugins labels on Jun 17, 2023
simonw added this to the 0.5 milestone on Jun 17, 2023

simonw commented Jun 17, 2023

I don't think this needs to take any arguments - the signature is:

@hookspec
def register_models():
    """Return a list of Models"""

I think the list of models is a list of classes that are subclasses of a Model base class - or should it be a list of instances? I think classes.


simonw commented Jun 17, 2023

Or... maybe it takes an llm argument which is similar to the datasette argument in Datasette's plugin hooks, in that it's an object offering a documented API for various useful things, like looking up configuration settings and loading templates and suchlike.

@hookspec
def register_models(llm):
    """Return a list of Models"""


simonw commented Jun 17, 2023

I think it returns objects, not classes - because then plugins like the OpenAI one can define a single class and return multiple instances of it, each for a different model that uses the same underlying API.
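
For example (a hypothetical sketch - it assumes a Model base class and a hookimpl decorator exported by llm, neither of which is settled yet), the OpenAI plugin could look something like this:

from llm import Model, hookimpl  # assumed exports, not part of the final design


class Chat(Model):
    def __init__(self, model_id):
        self.model_id = model_id  # e.g. "gpt-4" - the underlying API is the same


@hookimpl
def register_models():
    # One class, several instances - each targets a different OpenAI model
    return [Chat("gpt-3.5-turbo"), Chat("gpt-4"), Chat("gpt-4-32k")]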


simonw commented Jun 17, 2023

I want to call the common base class llm.BaseModel, but I'm a little concerned that it clashes with Pydantic's BaseModel. I could call it llm.Model instead.


simonw commented Jun 23, 2023

I played around with https://github.com/nomic-ai/gpt4all - it runs a localhost web server which emulates the OpenAI API, so all you need to do is set an OPENAI_API_BASE environment variable and then use the existing openai library - though it has to be used in Completion and not ChatCompletion mode.

Experimental code for that here: a196db1
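
In outline the approach looks something like this (a rough sketch using the openai 0.x Python library - the real code is in that commit):

import openai

# Point the client at the local GPT4All server instead of api.openai.com
# (equivalent to setting the OPENAI_API_BASE environment variable below)
openai.api_base = "http://localhost:4891/v1"
openai.api_key = "not-needed"  # dummy value - the client library requires one to be set

response = openai.Completion.create(
    model="mpt-7b-chat",
    # The server only emulates the Completion endpoint, so the chat
    # structure has to be baked into the prompt string itself
    prompt="### Human:\nWhat is the capital city of Australia?\n### Assistant:",
    max_tokens=200,
)
print(response["choices"][0]["text"])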

Had to turn on this option:

[screenshot: the GPT4All setting that needed to be turned on]

Then set this environment variable:

export OPENAI_API_BASE=http://localhost:4891/v1

Then I created this template called mpt:

prompt: "### Human:\n$input\n### Assistant:"
model: mpt-7b-chat

And now I can do this:

llm -t mpt --no-stream 'What is the capital city of Australia?'
Canberra.

In the logs:

llm logs -n 1
[
  {
    "id": 538,
    "model": "mpt-7b-chat",
    "timestamp": "2023-06-23 14:58:30.208931",
    "prompt": "### Human:\nWhat is the capital city of Australia?\n### Assistant:",
    "system": null,
    "response": "Canberra.",
    "chat_id": null,
    "debug": "{\"model\": \"mpt-7b-chat\", \"usage\": {\"completion_tokens\": 4, \"prompt_tokens\": 26, \"total_tokens\": 30}}",
    "duration_ms": 1756,
    "prompt_json": null
  }
]


simonw commented Jun 23, 2023

That gpt4all experiment demonstrated that I do need a way to define models that are "classes" of models that the user can then customize.

In this case the gpt4all plugin model has a URL which should be configurable, including multiple times - what if you have a localhost:4891 instance but your team also has a gpt4all.mycompany.corp:4891 instance that you want to be able to query?

Likewise, I found https://github.com/ortegaalfredo which offers models via Tor where a plugin (encapsulating this library https://github.com/ortegaalfredo/neuroengine/blob/main/neuroengine.py - see https://news.ycombinator.com/item?id=36423590) could be configured to point at different hosted instances.
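
One way to support that (a hypothetical sketch, nothing here is decided - the class name and constructor are made up) is for the plugin's model class to take the server URL as part of its configuration, so the same class can be registered once per host:

from llm import Model, hookimpl  # assumed exports


class Gpt4AllChat(Model):  # hypothetical plugin model class
    def __init__(self, model_id, api_base):
        self.model_id = model_id
        self.api_base = api_base  # which server this instance talks to


@hookimpl
def register_models():
    return [
        Gpt4AllChat("mpt-7b-chat", "http://localhost:4891/v1"),
        # A second instance of the same class, pointing at the team's shared server
        Gpt4AllChat("mpt-7b-chat", "http://gpt4all.mycompany.corp:4891/v1"),
    ]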


simonw commented Jun 23, 2023

Design challenge: how to best handle ongoing conversations.

Models based on openai.ChatCompletion need a messages= list of specifically formatted JSON message objects.

Other models might instead implement chat by building their own strings that look something like this:

### Human:
What is the capital city of Australia?
### Assistant:
Canberra
### Human:
What is its population?
### Assistant:

Which bit of the code should be responsible for these things - both retrieving previous conversation messages and figuring out how to format them to be sent to the model?

The conversation prompt there has some overlap with the existing prompt template mechanism, although that doesn't currently support loops.
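
If the model class owns that formatting, a chat-style model might rebuild its prompt from the prior exchanges along these lines (a hypothetical sketch, not a decided design):

def build_prompt(history, new_prompt):
    # history is a list of (human, assistant) pairs from earlier turns
    parts = []
    for human, assistant in history:
        parts.append("### Human:\n{}\n### Assistant:\n{}".format(human, assistant))
    parts.append("### Human:\n{}\n### Assistant:".format(new_prompt))
    return "\n".join(parts)

An OpenAI-backed model would instead turn the same history into the messages= list of role/content dictionaries that ChatCompletion expects.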


simonw commented Jun 23, 2023

My intuition is that the Model class should do as much of this as possible, since that way future unanticipated model features can still be built in plugins without needing changes to the design of core.


simonw commented Jun 23, 2023

At which point, what is the responsibility of LLM itself? I think it's these things:

  • Provide a consistent CLI interface for running prompts, including easily running the same prompt against different models
  • Provide a web UI for the same purpose
  • Maintain a store of prompts and responses in a SQLite database (plus maybe a plugin hook for sending those stored prompts and responses to other systems)


simonw commented Jun 23, 2023

Inspired by:

Each model instance should be able to validate its options, so that when the CLI is called like this:

llm -m my-model -o temprature 0.3

A validation error can be shown if the option names are invalid (or the data types passed are wrong).
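
At the CLI layer that might be as simple as converting a validation failure into a click error - a hypothetical sketch, reusing the validate_options() / OptionsError idea prototyped further down this thread:

import click

from llm import OptionsError  # hypothetical import


def gather_options(model, pairs):
    # pairs: the (name, value) tuples collected from repeated -o flags
    try:
        return model.validate_options(pairs)
    except OptionsError as ex:
        # Surface unknown option names or badly typed values as a friendly CLI error
        raise click.ClickException(str(ex))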


simonw commented Jun 23, 2023

Having key/value options like this should be flexible to handle pretty much anything - especially since option values could be strings, which means that super-complex concepts like logit_bias could even be fed a JSON string:

llm -m gpt-4 "Once upon a" -o logit_bias '{2435:-100, 640:-100}'

Example from https://help.openai.com/en/articles/5247780-using-logit-bias-to-define-token-probability


simonw commented Jun 23, 2023

Got that prototype working here:

Mainly learned that models are going to need to do type conversions on their options - the OpenAI client library needs a floating point value for temperature, and logit_bias needs to be an {int: int} dictionary, which can't be passed directly as JSON because JSON keys must be strings.
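
For example, a conversion step like this (a hypothetical sketch, not the prototype code) could turn the strings arriving from -o into what the OpenAI client expects:

import json


def clean_openai_options(options):
    # Everything arrives from -o as strings; the OpenAI client wants
    # temperature as a float and logit_bias as an {int: int} dictionary
    cleaned = dict(options)
    if "temperature" in cleaned:
        cleaned["temperature"] = float(cleaned["temperature"])
    if "logit_bias" in cleaned:
        bias = json.loads(cleaned["logit_bias"])  # e.g. '{"2435": -100, "640": -100}'
        cleaned["logit_bias"] = {int(token_id): bias_value for token_id, bias_value in bias.items()}
    return cleaned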


simonw commented Jun 23, 2023

I should support the same option being passed multiple times, so I can have a gpt-4 option for suppressing tokens without looking up their IDs that looks like this:

llm -m gpt-4 "Once upon a" -o suppress "time" -o suppress " time" -o suppress "Time"

Or maybe:

llm -m gpt-4 "Once upon a" -o suppress '["time", " time", "Time", "_time"]'

If I'm going to use Pydantic for option validation then supporting multiple -o with the same name is a bit tricky, so I think I'll go with the second option.


simonw commented Jun 23, 2023

This also means models need a mechanism by which they can return help.

Maybe llm models show gpt-4 should be able to output help for the various options.
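
If options end up being described with Pydantic (explored a few comments further down), that help could plausibly be generated from the schema - a hypothetical sketch:

def options_help(options_class):
    # Turn a Pydantic Options class into lines like "temperature: number"
    properties = options_class.schema().get("properties", {})
    return "\n".join(
        "{}: {}".format(name, details.get("type", "any"))
        for name, details in properties.items()
    )

Something like llm models show gpt-4 could then print the result of options_help() for that model's Options class, alongside the model's description.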


simonw commented Jun 23, 2023

A model instance can execute a prompt, and can validate its options.

model.prompt(prompt, system=system, options=options)
errors = model.validate_options(options)

Where options is a list of (str, Any) tuples.


simonw commented Jun 23, 2023

First prototype of validating options:

from typing import Any, Dict, Iterable, Tuple


class Model:
    options = {}

    def validate_options(
        self, options: Iterable[Tuple[str, Any]]
    ) -> Iterable[Tuple[str, Any]]:
        invalid_options = {}
        valid_options = []
        for key, value in options:
            if key not in self.options:
                invalid_options[(key, value)] = "Invalid option: {}".format(key)
            else:
                expected_type = self.options[key]
                try:
                    cleaned_value = expected_type(value)
                except ValueError:
                    invalid_options[
                        (key, value)
                    ] = "Option {}: value {} should be {}".format(
                        key, value, expected_type
                    )
                else:
                    # Only keep the option if the type conversion succeeded
                    valid_options.append((key, cleaned_value))
        if invalid_options:
            raise OptionsError(invalid_options)
        return valid_options


class Foo(Model):
    options = {
        "id": str,
        "score": float,
    }


class OptionsError(Exception):
    def __init__(self, invalid_options: Dict[Tuple[str, Any], str]):
        super().__init__(f"Invalid options found: {invalid_options}")
        self.invalid_options: Dict[Tuple[str, Any], str] = invalid_options

Then I decided to try Pydantic instead, since I already use that for validating YAML for the templates.


simonw commented Jun 23, 2023

This seems to work:

import json
from typing import Any, Iterable, List, Tuple

from pydantic import BaseModel, ValidationError, validator

class Model:
    class Options(BaseModel):
        class Config:
            extra = "forbid"

    def validate_options(
        self, options: Iterable[Tuple[str, Any]]
    ) -> Iterable[Tuple[str, Any]]:
        try:
            return self.Options(**options)
        except ValidationError:
            raise


class Foo(Model):
    class Options(Model.Options):
        id: str
        score: float
        keywords: List[str]
        @validator('keywords', pre=True)
        def keywords_must_be_list(cls, v):
            if isinstance(v, str):
                try:
                    return json.loads(v)
                except json.JSONDecodeError:
                    raise ValueError('keywords must be a list or a JSON-encoded list')
            return v

Then:

>>> from llm import Foo
>>> foo = Foo()
>>> foo.validate_options({'id': 1, 'score': '3.4', 'keywords': '["one", "two"]'})
Options(id='1', score=3.4, keywords=['one', 'two'])
>>> foo.validate_options({'id': 1, 'score': '3.4', 'keywords': ["one", "two"]})
Options(id='1', score=3.4, keywords=['one', 'two'])
>>> foo.validate_options({'id': 1, 'score': '3.4', 'keywords': ["one", "two", 3]})
Options(id='1', score=3.4, keywords=['one', 'two', '3'])
>>> foo.validate_options({'id': 1, 'score': '3.4', 'keywords': ["one", "two", {"one"}]})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/simon/Dropbox/Development/llm/llm/__init__.py", line 18, in validate_options
    return self.Options(**options)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for Options
keywords -> 2
  str type expected (type=type_error.str)
>>> 


simonw commented Jun 23, 2023

That validate_options method may not be worth it - I can do this instead:

>>> Foo.Options(**{'id': 1, 'score': '3.4', 'keywords': ["one", "two", 3]})
Options(id='1', score=3.4, keywords=['one', 'two', '3'])


simonw commented Jun 23, 2023

Should models perhaps have a prompt() method and a separate conversation() method, where the second method handles ongoing conversations that might need to include previous messages? (-c/--continue mode)
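
Sketched as an interface (hypothetical signatures, just to make the question concrete):

class Model:
    def prompt(self, prompt, system=None, options=None):
        "Execute a single standalone prompt"
        raise NotImplementedError

    def conversation(self, messages, prompt, system=None, options=None):
        "Execute a follow-up prompt, given the previous (prompt, response) exchanges"
        raise NotImplementedError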


simonw commented Jun 23, 2023

The model .prompt() method should return a Response which can be iterated to get a stream of tokens, but also offers a .text accessor for the completed string and a .debug property with debug information.

Accessing the properties causes the iterator to be exhausted first.

Maybe also a .json property with raw response JSON. And perhaps a .prompt that refers back to a representation of the original prompt.

That way the returned object from .prompt() has everything needed to both display the result and to store it in the database.

Maybe the .log() method deals with instances of Response.
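
A hypothetical sketch of that Response wrapper (details invented, just to illustrate the shape):

class Response:
    def __init__(self, token_iterator, prompt, debug=None):
        self._iterator = token_iterator
        self._chunks = []
        self.prompt = prompt  # representation of the original prompt
        self.debug = debug    # debug information from the model

    def __iter__(self):
        # Streaming mode: yield tokens as they arrive, remembering them for .text
        for chunk in self._iterator:
            self._chunks.append(chunk)
            yield chunk

    @property
    def text(self):
        # Accessing .text exhausts the stream first, then returns the whole string
        for _ in self:
            pass
        return "".join(self._chunks)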


simonw commented Jun 23, 2023

Question: if a prompt is part of a conversation is it necessary to serialize the full sent prompt in the DB, even though doing so will duplicate messages in storage many times over?

Reason not to is efficiency of storage, but maybe that's a premature optimization.


simonw commented Jun 26, 2023

New concept for the plugin hook:

@hookspec
def register_models(register):
    """Register additional models by returning one or more Model subclasses"""

The hook implementations then call that register(...) function for each model they want to register.

I can then add further arguments to the hook in the future, while shipping a minimally viable version quickly. I don't need to design the full llm interface.


simonw commented Jun 26, 2023

... except, maybe the version that returns a list of models is better? I'm going to try that one first - so register_models() taking no arguments at all.


simonw commented Jun 26, 2023

This seems a little confusing though:

@hookspec
def register_commands(cli):
    """Register additional CLI commands, e.g. 'llm mycommand ...'"""


@hookspec
def register_models():
    "Return a list of model instances representing LLM models that can be called"

register_commands(cli) doesn't return anything, it requires adding things to cli.

register_models() returns a list.

So perhaps register_models() should have a different name, if it works differently. Perhaps load_models().


simonw commented Jun 26, 2023

Aha! The reason I wanted to do register_models(register) was to handle aliases:

@hookimpl
def register_models(register):
    register(Chat("gpt-3.5-turbo"), aliases=("3.5", "chatgpt"))
    register(Chat("gpt-3.5-turbo-16k"), aliases=("chatgpt-16k", "3.5-16k"))
    register(Chat("gpt-4"), aliases=("4", "gpt4"))
    register(Chat("gpt-4-32k"), aliases=("4-32k",))


simonw commented Jun 26, 2023

Moving this to a PR.

simonw changed the title from "Plugin hook: register_models - return list of llm.Model" to "Plugin hook: register_models" on Jun 26, 2023

simonw commented Jul 7, 2023

Here's that detailed tutorial as a preview: https://llm--65.org.readthedocs.build/en/65/tutorial-model-plugin.html

simonw added a commit to simonw/llm-mpt30b that referenced this issue Jul 8, 2023
simonw referenced this issue in simonw/llm-gpt4all Jul 9, 2023

simonw commented Jul 10, 2023

Now that I've merged the PR I'm closing this as done - though it's possible the details may change a little before I release it, depending on this work:

simonw closed this as completed Jul 10, 2023
simonw added a commit that referenced this issue Jul 12, 2023