Respect natural JSON order by default #1184

lukestanley · 2024-02-13T09:36:36Z

The JSON schema to grammar string converter currently sorts properties, even when no specific order was requested. That is a big pain when you want the existing, natural order of a JSON schema generated by Pydantic, since property order can have a big impact on what the LLM outputs.
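To make the difference concrete, here is a minimal sketch using only the standard library. The `schema` dict is a hand-written stand-in for what a Pydantic model with `name` followed by `film_names` might produce (an assumption for illustration, not actual library output). It only shows how sorting property names changes their order relative to the declared one.

```python
import json

# Hand-written stand-in for a generated schema (illustrative assumption).
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "film_names": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "film_names"],
}

# Python dicts keep insertion order, and a json.dumps/json.loads round trip
# preserves it, so the declared ("natural") order survives serialization.
natural_order = list(json.loads(json.dumps(schema))["properties"])

# Sorting the property names alphabetically moves film_names ahead of name.
sorted_order = sorted(schema["properties"])

print("natural:", natural_order)
print("sorted: ", sorted_order)
```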

abetlen · 2024-02-13T17:31:24Z

@lukestanley good catch. I think we both found the issue at the same time last night; it should be fixed from d1822fe onward.

lukestanley · 2024-02-13T17:38:01Z

Hahaha! That's nuts! I was just putting together an example:

import json
from typing import List
from pprint import pprint
from pydantic import BaseModel, Field
import llama_cpp



class Actor(BaseModel):
    name: str = Field(..., description="Name of an actor")
    film_names: List[str] = Field(..., description="List of films they starred in")

    class Config:
        json_schema_extra = {
            "example": {
                "name": "Jim Carrey",
                "film_names": [
                    "Ace Ventura: Pet Detective",
                    "The Mask",
                    "Dumb and Dumber",
                ],
            }
        }


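# Build the JSON schema and drop the example so it is not fed to the grammar converter.
schema = Actor.model_json_schema()
schema.pop("example", None)
json_schema = json.dumps(schema)
# Reuse the example from the model config as a one-shot demonstration in the prompt.
example = Actor.model_config["json_schema_extra"]["example"]
# Convert the JSON schema into a grammar that constrains generation.
grammar = llama_cpp.LlamaGrammar.from_json_schema(json_schema)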


prompt = f"""Instruction:
Provide the name of actors in JSON format matching this: {json_schema}.
Please provide a comedy actor.
Output:
{json.dumps(example)}
Instruction: Now please provide an actor who started as playing a mafia figure but plays lots of different roles.
Output:
"""

llm = llama_cpp.Llama(model_path="/home/user/Downloads/phi-2.Q4_K_M.gguf")
output_text = llm(
    prompt,
    max_tokens=300,
    stop=["Instruction:", "Output:"],
    echo=False,
    temperature=0.3,
    grammar=grammar,
)["choices"][0]["text"]
model = Actor.model_validate_json(output_text)
pprint(model, sort_dicts=False, indent=2)


stream = llm(
    prompt,
    max_tokens=300,
    stop=["Instruction:", "Output:"],
    echo=False,
    temperature=0.3,
    grammar=grammar,
    stream=True
)

output_text = ""
for chunk in stream:
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
    output_text = output_text + result["text"]
model = Actor.model_validate_json(output_text)
pprint(model, sort_dicts=False, indent=2)

Would something like that be useful somewhere in the examples directory?
If so I can do a PR.
@abetlen

abetlen · 2024-02-14T04:33:33Z

Sure thing! For showing off the JSON property order behaviour, instructor has some good examples of adding a chain_of_thought property before an answer property.
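A rough sketch of why that pattern depends on property order, using only the standard library. The schema and the `skeleton` helper are hypothetical and written for illustration; they are not taken from instructor or llama-cpp-python. The point is that alphabetical sorting would place `answer` before `chain_of_thought`, so a constrained model would have to commit to an answer before writing any reasoning.

```python
import json

# Hypothetical schema: the reasoning field is declared before the answer.
schema = {
    "type": "object",
    "properties": {
        "chain_of_thought": {"type": "string"},
        "answer": {"type": "string"},
    },
    "required": ["chain_of_thought", "answer"],
}


def skeleton(names):
    """Render the shape of the object a model would be constrained to emit."""
    return json.dumps({name: "..." for name in names})


declared = list(schema["properties"])  # order as written in the schema
alphabetical = sorted(declared)        # order after sorting property names

print(skeleton(declared))      # reasoning comes first, then the answer
print(skeleton(alphabetical))  # answer comes first, before any reasoning
```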

Respect natural JSON order by default

9c3af58

abetlen closed this Feb 13, 2024
