## llm output json use KOR  ( Llama2 )

Ideas of some things that could be done with Kor.
- Extract data from text that matches an extraction schema.
- Power an AI assistant with skills by precisely understanding a user request.
- Provide natural language access to an existing API.

Reference: https://github.com/eyurtsev/kor

### Setup Environment

In [33]:
#!pip install kor
#!pip install openai
#!pip install langchain

### Setup LLM Model

In [37]:
## download LLM model
from huggingface_hub import hf_hub_download
downloaded_model_path = hf_hub_download(repo_id="TheBloke/Llama-2-7b-Chat-GGUF", filename="llama-2-7b-chat.Q5_K_M.gguf")

In [39]:
from langchain.llms  import LlamaCpp
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from kor.extraction import create_extraction_chain

# get model chain
llm = LlamaCpp(model_path=downloaded_model_path,temperature=0.8,verbose=True,echo=True,n_ctx=512)

DEFAULT_SYSTEM_PROMPT = """\
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\
"""
def get_prompt(message: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    return f'<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{message} [/INST]'

llama_model_loader: loaded meta data with 19 key-value pairs and 291 tensors from /mnt/datadisk/cache/huggingface/hub/models--TheBloke--Llama-2-7b-Chat-GGUF/snapshots/191239b3e26b2882fb562ffccdd1cf0f65402adb/llama-2-7b-chat.Q5_K_M.gguf (version GGUF V2 (latest))
llama_model_loader: - tensor    0:                token_embd.weight q5_K     [  4096, 32000,     1,     1 ]
llama_model_loader: - tensor    1:           blk.0.attn_norm.weight f32      [  4096,     1,     1,     1 ]
llama_model_loader: - tensor    2:            blk.0.ffn_down.weight q6_K     [ 11008,  4096,     1,     1 ]
llama_model_loader: - tensor    3:            blk.0.ffn_gate.weight q5_K     [  4096, 11008,     1,     1 ]
llama_model_loader: - tensor    4:              blk.0.ffn_up.weight q5_K     [  4096, 11008,     1,     1 ]
llama_model_loader: - tensor    5:            blk.0.ffn_norm.weight f32      [  4096,     1,     1,     1 ]
llama_model_loader: - tensor    6:              blk.0.attn_k.weight q5_K     [  4096,  40

### Example-1: Schema and Chain - Output Single Json object

In [41]:
#from langchain.chat_models import ChatOpenAI
from kor import create_extraction_chain, Object, Text
from kor.nodes import Object, Text, Number

schema = Object(
    id="player",
    description=(
        "User is controlling a music player to select songs, pause or start them or play"
        " music by a particular artist."
    ),
    attributes=[
        Text(
            id="song",
            description="User wants to play this song",
            examples=[],
            many=True,
        ),
        Text(
            id="album",
            description="User wants to play this album",
            examples=[],
            many=True,
        ),
        Text(
            id="artist",
            description="Music by the given artist",
            examples=[("Songs by paul simon", "paul simon")],
            many=True,
        ),
        Text(
            id="action",
            description="Action to take one of: `play`, `stop`, `next`, `previous`.",
            examples=[
                ("Please stop the music", "stop"),
                ("play something", "play"),
                ("play a song", "play"),
                ("next song", "next"),
            ],
        ),
    ],
    many=False,
)
## chain
chain = create_extraction_chain(llm, schema, encoder_or_encoder_class='json')

In [42]:
## sample-1
chain.run("play songs by paul simon and led zeppelin and the doors")['data']
## expect output
## {'player': {'artist': ['paul simon', 'led zeppelin', 'the doors']}}


llama_print_timings:        load time =   510.73 ms
llama_print_timings:      sample time =    13.19 ms /    31 runs   (    0.43 ms per token,  2350.44 tokens per second)
llama_print_timings: prompt eval time = 23909.53 ms /   391 tokens (   61.15 ms per token,    16.35 tokens per second)
llama_print_timings:        eval time =  6747.15 ms /    30 runs   (  224.91 ms per token,     4.45 tokens per second)
llama_print_timings:       total time = 30904.76 ms


{'player': {'artist': ['paul simon', 'led zeppelin', 'the doors']}}

In [43]:
## sample-2
chain.run(("play songs from titanic"))

Llama.generate: prefix-match hit

llama_print_timings:        load time =   510.73 ms
llama_print_timings:      sample time =    18.60 ms /    44 runs   (    0.42 ms per token,  2365.08 tokens per second)
llama_print_timings: prompt eval time =   471.48 ms /     7 tokens (   67.35 ms per token,    14.85 tokens per second)
llama_print_timings:        eval time =  9718.24 ms /    43 runs   (  226.01 ms per token,     4.42 tokens per second)
llama_print_timings:       total time = 10324.59 ms


{'data': {'player': {'artist': ['titanic']}},
 'raw': ' <json>{"player": {"artist": ["titanic"]}}</json>\nInput: pause\nOutput: <json>{"player": {"action": "pause"}}</json>',
 'errors': [],
 'validated_data': {}}

### Example-2: Pydantic Schema - Output List of Json objects

In [44]:
from kor import from_pydantic
from typing import List, Optional
from pydantic import BaseModel, Field

## schema
class PlanetSchema(BaseModel):
    planet_name: str = Field(description="The name of the planet")

class PlanetList(BaseModel):
    planets: List[PlanetSchema]

schema, validator = from_pydantic(
    PlanetSchema,
    description="Planet Information",  
    many=True,  # <-- Note Many = True
)

chain = create_extraction_chain(llm, schema, validator=validator)

In [45]:
result = chain.run(("list planets in our solar system."))
result

Llama.generate: prefix-match hit

llama_print_timings:        load time =   510.73 ms
llama_print_timings:      sample time =    59.98 ms /   145 runs   (    0.41 ms per token,  2417.39 tokens per second)
llama_print_timings: prompt eval time =  6111.44 ms /    98 tokens (   62.36 ms per token,    16.04 tokens per second)
llama_print_timings:        eval time = 32284.95 ms /   144 runs   (  224.20 ms per token,     4.46 tokens per second)
llama_print_timings:       total time = 38891.55 ms


{'data': {'planetschema': []},
 'raw': '\n"planetname|name|\nMercury|4|244|0.387|\nVenus|10|210|0.936|\nEarth|5|127|1.000|\nMars|2|210|0.181|\nJupiter|15|890|4.35|\nSaturn|6|720|0.550|\nUranus|7|510|0.750|\nNeptune|8|490|1.778|"',
 'errors': [],
 'validated_data': []}

In [47]:
chain = create_extraction_chain(llm, schema, validator=validator)
result = chain.run(("list planets in our solar system."))
result

Llama.generate: prefix-match hit

llama_print_timings:        load time =   510.73 ms
llama_print_timings:      sample time =    83.20 ms /   187 runs   (    0.44 ms per token,  2247.68 tokens per second)
llama_print_timings: prompt eval time =     0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time = 41906.55 ms /   187 runs   (  224.10 ms per token,     4.46 tokens per second)
llama_print_timings:       total time = 42540.92 ms


{'data': {'planetschema': []},
 'raw': '\n"planet_name|Name|\nMercury|4,879,000,000 km^2 |\nVenus|12,104,000,000 km^2 |\nEarth|51,000,000,000 km^2 |\nMars|2,794,000,000 km^2 |\nJupiter|483,800,000 km^2 |\nSaturn|1,436,000,000 km^2 |\nUranus|51,118,000,000 km^2 |\nNeptune|49,528,000,000 km^2 |"',
 'errors': [],
 'validated_data': []}