# LLaMA Guidance

This notebook shows how to control the <a href="https://ai.facebook.com/blog/large-language-model-llama-meta-ai/">LLaMA</a> model with the guidance library. It uses a <a href="https://huggingface.co/huggyllama/llama-7b">Transformers version of the model</a>, so check the special license terms on the Hugging Face model page before downloading.

In [1]:
import guidance

# replace your_path with the directory that holds your copy of the LLaMA model
guidance.llm = guidance.llms.transformers.LLaMA(
    "your_path/llama-7b", device="cpu"
)  # use cuda for GPU if you have >27GB of VRAM
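
The VRAM figure in the comment above is roughly what it takes to hold the weights in 32-bit floats. The sketch below does that arithmetic. It assumes a parameter count of about 6.7 billion for the 7B checkpoint and counts weights only (no activations or cache), so treat the numbers as a ballpark rather than a requirement. Whether the loader actually keeps the weights in float32 is also an assumption here.

```python
# Rough weight-memory estimate (weights only, no activations or KV cache).
# ASSUMPTION: ~6.7 billion parameters for the 7B LLaMA checkpoint (approximate).
APPROX_PARAMS = 6.7e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the approximate weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(APPROX_PARAMS, dtype):.1f} GB")
```

Under these assumptions, half precision would need about half the memory of full precision.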



## A basic example

The anachronism detection example here uses more detailed guidance than the GPT-3.5 version of the example. The 7B LLaMA model is much smaller, so it needs more structure in the prompt to answer the way we want. Because we are running the Transformers version of the model, <a href="../guidance_acceleration.ipynb">guidance acceleration</a> can speed up inference even with this extra guidance. The OpenAI API, by contrast, would slow down with detailed guidance because it does not yet support guidance acceleration.

In [3]:
# define the few-shot examples
examples = [
    {
        "input": "I wrote about Shakespeare",
        "entities": [
            {"entity": "I", "time": "present"},
            {"entity": "Shakespeare", "time": "16th century"},
        ],
        "reasoning": "I can write about Shakespeare because he lived in the past with respect to me.",
        "answer": "No",
    },
    {
        "input": "Shakespeare wrote about me",
        "entities": [
            {"entity": "Shakespeare", "time": "16th century"},
            {"entity": "I", "time": "present"},
        ],
        "reasoning": "Shakespeare cannot have written about me, because he died before I was born",
        "answer": "Yes",
    },
    {
        "input": "A Roman emperor patted me on the back",
        "entities": [
            {"entity": "Roman emperor", "time": "1st-5th century"},
            {"entity": "I", "time": "present"},
        ],
        "reasoning": "A Roman emperor cannot have patted me on the back, because he died before I was born",
        "answer": "Yes",
    },
]

# define the guidance program
structure_prompt = guidance(
    """How to solve anachronism problems:
Below we demonstrate how to test for an anachronism (i.e. whether it could have happened or not based on the time periods associated with the entities).
----

{{~! display the few-shot examples ~}}
{{~#each examples}}
Sentence: {{this.input}}
Entities and dates:{{#each this.entities}}
{{this.entity}}: {{this.time}}{{/each}}
Reasoning: {{this.reasoning}}
Anachronism: {{this.answer}}
---
{{~/each}}

{{~! place the real question at the end }}
Sentence: {{input}}
Entities and dates:{{#geneach 'entities' stop="\\nReasoning:"}}
{{gen 'this.entity' stop=":"}}: {{gen 'this.time' stop="\\n"}}{{/geneach}}
Reasoning:{{gen "reasoning" stop="\\n"}}
Anachronism: {{#select "answer"}}Yes{{or}}No{{/select}}"""
)

out = structure_prompt(examples=examples, input="The T-rex bit my dog")
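
To see roughly what text the `{{#each examples}}` block contributes to the prompt, it can help to render one example by hand. The plain-Python sketch below approximates that expansion without calling guidance. The helper name `render_example` is hypothetical, and the exact whitespace guidance produces (for instance around the `~` markers) may differ.

```python
def render_example(example: dict) -> str:
    """Render one few-shot example in the layout used by the prompt template."""
    # Each entity goes on its own line as "entity: time", mirroring the inner each-loop.
    entity_lines = "".join(
        f"\n{e['entity']}: {e['time']}" for e in example["entities"]
    )
    return (
        f"Sentence: {example['input']}\n"
        f"Entities and dates:{entity_lines}\n"
        f"Reasoning: {example['reasoning']}\n"
        f"Anachronism: {example['answer']}\n"
        "---"
    )


# One entry copied from the few-shot examples defined above.
example = {
    "input": "Shakespeare wrote about me",
    "entities": [
        {"entity": "Shakespeare", "time": "16th century"},
        {"entity": "I", "time": "present"},
    ],
    "reasoning": "Shakespeare cannot have written about me, because he died before I was born",
    "answer": "Yes",
}

rendered = render_example(example)
print(rendered)
```

The final question in the template follows the same layout, which is what lets the model continue the pattern field by field.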

In [4]:
# the entities generated are in the output
out["entities"]

[{'entity': 'T-rex', 'time': '65-68 million years ago'},
 {'entity': 'Dog', 'time': '10,000 years ago'}]
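
The `stop` arguments in the template act as delimiters: `":"` ends an entity name, a newline ends a time, and `"\nReasoning:"` ends the whole list. The sketch below applies the same delimiting logic to a made-up raw completion string in plain Python. It is only an illustration of the idea, not how guidance processes text internally, and both `split_entities` and `raw` are hypothetical.

```python
def split_entities(raw: str, section_stop: str = "\nReasoning:") -> list:
    """Parse 'entity: time' lines that appear before the section stop string."""
    # Keep only the text generated before the list-level stop string.
    section = raw.split(section_stop, 1)[0]
    entities = []
    for line in section.strip().splitlines():
        # The first ":" separates the entity name from its time period.
        entity, _, time = line.partition(":")
        entities.append({"entity": entity.strip(), "time": time.strip()})
    return entities


# Hypothetical raw text, shaped like the completion the template asks for.
raw = "\nT-rex: 65-68 million years ago\nDog: 10,000 years ago\nReasoning: ..."
parsed = split_entities(raw)
print(parsed)
```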

In [5]:
# ...as is the reasoning
out["reasoning"]

' The T-rex cannot have bit my dog, because it died before my dog was born'

In [6]:
# ...and the answer
out["answer"]

'Yes'
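
Downstream code usually wants these values in a tidier form. Note that the reasoning string above starts with a space, because the template has no space after `Reasoning:`. The sketch below uses a plain dict named `result` as a stand-in for `out`, with values copied from the outputs above. The `summarize` helper is hypothetical.

```python
# Stand-in for `out`, using the values shown in the outputs above.
result = {
    "reasoning": " The T-rex cannot have bit my dog, because it died before my dog was born",
    "answer": "Yes",
}


def summarize(result: dict) -> dict:
    """Convert the raw generated fields into cleaner Python values."""
    return {
        # The select block only allows "Yes" or "No", so a direct comparison is enough.
        "is_anachronism": result["answer"] == "Yes",
        # Drop the leading space left over from the template layout.
        "reasoning": result["reasoning"].strip(),
    }


summary = summarize(result)
print(summary)
```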
