# Anachronism Test with Guidance Server

This notebook re-implements the Anachronism example notebook, using a Guidance server as the model backend.

## The Server Process

First, we need to start the Guidance server itself. This needs to run in a separate process, so we have to write out a simple script file:

In [1]:
%%writefile guidance_server.py

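import logging
import os

import guidance

# Configure a handler so that INFO-level messages are actually emitted
logging.basicConfig(level=logging.INFO)
_logger = logging.getLogger(__name__)


def main():
    # The model file location is read from the MODEL_PATH environment variable
    target_model_path = os.getenv("MODEL_PATH")
    if target_model_path is None:
        raise RuntimeError("Set the MODEL_PATH environment variable to a model file")
    _logger.info(f"Attempting to load {target_model_path}")

    lm = guidance.models.LlamaCpp(target_model_path, n_gpu_layers=-1)
    _logger.info("Model loaded")

    # Clients must supply the same api_key when they connect
    server = guidance.Server(lm, api_key="SDFSDF")
    _logger.info("Server object created")
    server.run(port=8392)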


if __name__ == "__main__":
    main()

Overwriting guidance_server.py


Now we can run this script in a separate Python process:

In [2]:
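import subprocess
import sys

# Use the notebook's own interpreter so the server sees the same installed packages
server_process = subprocess.Popen([sys.executable, "./guidance_server.py"])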

print(f"Server PID: {server_process.pid}")

Server PID: 2026
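
The server script reads the model location from the `MODEL_PATH` environment variable, and a child process started with `subprocess.Popen` inherits the notebook's environment by default. If the variable is not already set, one option is to pass an explicit environment to the child. The following is a minimal sketch of that idea; the helper name `build_server_env` and the path `/path/to/model.gguf` are placeholders for illustration, not part of the original notebook.

```python
import os
import subprocess
import sys


def build_server_env(model_path):
    """Return a copy of the current environment with MODEL_PATH set."""
    env = dict(os.environ)
    env["MODEL_PATH"] = str(model_path)
    return env


# Placeholder path, for illustration only; substitute a real model file.
example_env = build_server_env("/path/to/model.gguf")

# Check the mechanism with a tiny child process that echoes the variable back.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['MODEL_PATH'])"],
    env=example_env,
    capture_output=True,
    text=True,
    check=True,
)
echoed_path = result.stdout.strip()
print(echoed_path)
```

The same dictionary could be handed to the server launch via the `env=` argument of `subprocess.Popen`.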


## Creating the 'client' model

With the server running, we create a model, giving it the URI of the Guidance server endpoint:

In [3]:
import guidance

lm = guidance.models.Model("http://localhost:8392", api_key="SDFSDF")

We can now run a trial generation. The server process may need a few minutes to load the model; until it is ready, this call fails with a 'Connection Refused' error:

In [7]:
lm + "Tell me a joke." + guidance.gen("simple_joke", max_tokens=20)
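
Rather than retrying the cell by hand, the wait can be automated by polling the server's port until it accepts TCP connections. This is a sketch only: the helper name `wait_for_port` and its default timeout are assumptions, and it checks nothing beyond the port being open. Because the server script loads the model before calling `server.run`, an open port should indicate that loading has finished.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=1.0):
    """Poll until host:port accepts a TCP connection.

    Returns True once a connection succeeds, or False if `timeout`
    seconds pass without one.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In this notebook the call would be `wait_for_port("localhost", 8392)`, placed before the first request to the model.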

## Loading the data

Next, we load the anachronism data as before:

In [8]:
import datasets

# load the data
data = datasets.load_dataset('bigbench', 'anachronisms')
inputs = [x.split('\n')[0] for x in data['validation']['inputs']]
labels = [x[0] for x in data['validation']['targets']]

print(f"Loaded {len(labels)} data items")

Loaded 46 data items
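
The two list comprehensions above assume a particular record shape: each entry in `inputs` starts with the sentence on its first line, and each entry in `targets` is a list whose first element is the label. The sketch below applies the same expressions to a single made-up record to show what they extract; the record contents are illustrative and not taken from the dataset.

```python
# A made-up record with the shape the loading code assumes (not real data)
sample = {
    "inputs": [
        "The T-Rex bit my dog.\n"
        "Does the preceding sentence contain an anachronism?"
    ],
    "targets": [["Yes"]],
}

# Keep only the first line of each input, i.e. the sentence itself
sample_inputs = [x.split("\n")[0] for x in sample["inputs"]]
# Keep the first entry of each target list as the label
sample_labels = [x[0] for x in sample["targets"]]

print(sample_inputs)
print(sample_labels)
```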


## A Guidance Program

We can then define a guidance program that builds a simple few-shot prompt and sends it to the model:

In [9]:
@guidance
def anachronism_query(llm, query, examples):
    prompt_string = """Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or
not based on the time periods associated with the entities).

Here are some examples:
"""
    for ex in examples:
        prompt_string += f"Sentence: { ex['input'] }" + "\n"
        prompt_string += "Entities and dates:\n"
        for en in ex['entities']:
            prompt_string += f"{en['entity']} : {en['time']}" + "\n"
        prompt_string += f"Reasoning: {ex['reasoning']}" + "\n"
        prompt_string += f"Anachronism: {ex['answer']}" + "\n"

    llm += f'''{prompt_string}
Now determine whether the following is an anachronism:
    
Sentence: { query }
Entities and dates:
{ guidance.gen(name="entities", max_tokens=100, stop="Reason") }'''
    llm += "Reasoning:"
    llm += guidance.gen(name="reason", max_tokens=100, stop="\n")
    llm += f'''\nAnachronism: { guidance.select(["Yes", "No"], name="answer") }'''
    return llm
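
To inspect the few-shot portion of the prompt without calling the model, the same string construction can be pulled out into a plain function. This is a sketch that mirrors the loop above; the name `build_fewshot_prompt` is introduced here for illustration and is not part of guidance.

```python
def build_fewshot_prompt(examples):
    """Return the instruction and few-shot part of the prompt as plain text."""
    prompt_string = (
        "Given a sentence tell me whether it contains an anachronism "
        "(i.e. whether it could have happened or\n"
        "not based on the time periods associated with the entities).\n"
        "\n"
        "Here are some examples:\n"
    )
    for ex in examples:
        prompt_string += f"Sentence: {ex['input']}\n"
        prompt_string += "Entities and dates:\n"
        for en in ex["entities"]:
            prompt_string += f"{en['entity']} : {en['time']}\n"
        prompt_string += f"Reasoning: {ex['reasoning']}\n"
        prompt_string += f"Anachronism: {ex['answer']}\n"
    return prompt_string


# One example in the format the guidance program expects
demo_examples = [
    {
        "input": "I wrote about shakespeare",
        "entities": [
            {"entity": "I", "time": "present"},
            {"entity": "Shakespeare", "time": "16th century"},
        ],
        "reasoning": "I can write about Shakespeare because he lived in the past with respect to me.",
        "answer": "No",
    }
]

demo_prompt = build_fewshot_prompt(demo_examples)
print(demo_prompt)
```

Printing the prompt this way makes it easy to check that each example follows the `Sentence` / `Entities and dates` / `Reasoning` / `Anachronism` layout that the generation steps continue.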

Now, call the model:

In [10]:
# define the few shot examples
fewshot_examples = [
    {'input': 'I wrote about shakespeare',
    'entities': [{'entity': 'I', 'time': 'present'}, {'entity': 'Shakespeare', 'time': '16th century'}],
    'reasoning': 'I can write about Shakespeare because he lived in the past with respect to me.',
    'answer': 'No'},
    {'input': 'Shakespeare wrote about me',
    'entities': [{'entity': 'Shakespeare', 'time': '16th century'}, {'entity': 'I', 'time': 'present'}],
    'reasoning': 'Shakespeare cannot have written about me, because he died before I was born',
    'answer': 'Yes'}
]

# Invoke the model
generate = lm + anachronism_query("The T-Rex bit my dog", fewshot_examples)

# Show the extracted generations
print(f"entities:\n{generate['entities']}")
print(f"reasoning: {generate['reason']}")
print(f"answer: {generate['answer']}")

entities:
T-Rex : 65 million years ago
Dog : present

reasoning:  The T-Rex is extinct, so it cannot bite my dog.
answer: Yes


Now, let's run on the whole dataset:

In [11]:
import numpy as np

fews = []
for count, sentence in enumerate(inputs):
    print(f"Working on item {count}")
    result = lm + anachronism_query(sentence, fewshot_examples)
    # Normalise the selected answer to exactly 'Yes' or 'No'
    fews.append('Yes' if 'Yes' in result['answer'] else 'No')
fews = np.array(fews)

# Fraction of predictions that match the reference labels
print('Few-shot Accuracy', (np.array(labels) == fews).mean())

Few-shot Accuracy 0.41304347826086957
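
A single accuracy figure does not show where the errors fall. Because the task is binary, counting each (label, prediction) pair reveals whether mistakes are concentrated on one class. The sketch below uses made-up label lists purely to illustrate the calculation; `confusion_counts` is a hypothetical helper, and in the notebook it would be called with `labels` and `fews`.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


# Made-up labels and predictions, for illustration only
demo_true = ["Yes", "Yes", "No", "No", "No"]
demo_pred = ["Yes", "No", "Yes", "Yes", "No"]

demo_counts = confusion_counts(demo_true, demo_pred)

# Accuracy is the share of pairs where label and prediction agree
demo_accuracy = sum(
    n for (true_label, predicted), n in demo_counts.items() if true_label == predicted
) / len(demo_true)

for (true_label, predicted), n in sorted(demo_counts.items()):
    print(f"label={true_label} predicted={predicted}: {n}")
print(f"accuracy: {demo_accuracy}")
```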


Finally, we should shut down the server process we started:

In [12]:
server_process.terminate()

INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [2026]
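
`terminate()` only sends the termination request; it does not wait for the process to exit. A slightly more defensive shutdown waits for the exit and falls back to `kill()` if the process does not stop in time. The following is a sketch under that assumption, with `stop_process` as a hypothetical helper name; it is demonstrated on a throwaway child process rather than the real server.

```python
import subprocess
import sys


def stop_process(process, timeout=10.0):
    """Ask a child process to terminate, killing it if it does not exit in time.

    Returns the exit code of the process.
    """
    process.terminate()
    try:
        return process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The process ignored the termination request, so force it to stop
        process.kill()
        return process.wait()


# Demonstrate on a stand-in child that would otherwise sleep for a minute
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(child)
print(f"Child exited with code {exit_code}")
```

In this notebook the equivalent call would be `stop_process(server_process)`.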
