# Prompting for Information Extraction

In this notebook we'll prompt a language model to extract entities from text.

First, let's make sure our libraries are up to date:

In [None]:
!pip install git+https://github.com/guidance-ai/guidance

Collecting git+https://github.com/guidance-ai/guidance
  Cloning https://github.com/guidance-ai/guidance to /tmp/pip-req-build-cl0tm6kd
  Running command git clone --filter=blob:none --quiet https://github.com/guidance-ai/guidance /tmp/pip-req-build-cl0tm6kd
  Resolved https://github.com/guidance-ai/guidance to commit d36601b62096311988fbba1ba15ae4126fb695df
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting gptcache (from guidance==0.1.11)
  Downloading gptcache-0.1.43-py3-none-any.whl (131 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 131.5/131.5 kB 1.8 MB/s eta 0:00:00
Collecting openai>=1.0 (from guidance==0.1.11)
  Downloading openai-1.9.0-py3-none-any.whl (223 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 223.4/223.4 kB 9.9 MB/s eta 0:00:00
Collecting tiktoke

In [None]:
!pip install -U transformers

Collecting transformers
  Downloading transformers-4.36.2-py3-none-any.whl (8.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.2/8.2 MB 18.9 MB/s eta 0:00:00
Installing collected packages: transformers
  Attempting uninstall: transformers
    Found existing installation: transformers 4.35.2
    Uninstalling transformers-4.35.2:
      Successfully uninstalled transformers-4.35.2
Successfully installed transformers-4.36.2


We can now load a chat model. In this demo we use TinyLlama because it loads quickly and runs within Colab. For better performance, you can experiment with larger models.

In [None]:
from transformers import pipeline

# Load a chat ("conversational") pipeline; the weights are downloaded on first use.
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
pipe = pipeline("conversational", model=model_name)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



We now write a prompt which describes our problem and how we want the model to answer:

In [None]:
MAIN_PROMPT = """\
Entity Definition:
1. PROBLEM: Any disease, syndrome, or symptom.
2. TREATMENT: Any medical care given to treat a problem.
3. TEST: Any diagnostic test used to investigate a problem.

Output Format:
{{'PROBLEM': [list of entities present], 'TREATMENT': [list of entities present], 'TEST': [list of entities present]}}
If no entities are present in a category, output 'None' for that category."""

Finally we create a sequence of messages.

- The first message is a system prompt that describes how we want the model to behave.
- We then include the prompt we defined above.
- We include a few example input and output pairs.
- Finally, we provide the input we wish the model to solve.

In [None]:
input_sentence = "Archie is a 10-year-old diabetic cat. He currently receives 3 units of ProZinc insulin."

messages = [
     {"role": "system", "content": "You are a smart and intelligent Named Entity Recognition (NER) system."},
     {"role": "user", "content": MAIN_PROMPT},
     {"role": "user", "content": "My dog developed lumps on her skin. This has been diagnosed as keratoacanthomas and treated with anti-itch medication"},
     {"role": "assistant", "content": "{{'PROBLEM': ['lumps on her skin', 'keratoacanthomas'], 'TREATMENT': ['anti-itch medication'], 'TEST': ['None']}}"},
     {"role": "user", "content": "Jess has been sneezing for 2 months or more. Today we took a nasal scope and CT. Placed on a week of Clavamox."},
     {"role": "assistant", "content": "{{'PROBLEM': ['sneezing'], 'TREATMENT': ['Clavamox'], 'TEST': ['nasal scope', 'CT']}}"},
     {"role": "user", "content": input_sentence}
]

# The pipeline returns a Conversation object; its last message is the model's reply.
output = pipe(messages)
output.messages[-1]["content"]

"{{'PROBLEM': ['diabetic cat'], 'TREATMENT': ['ProZinc insulin'], 'TEST': ['3 units of ProZinc insulin']}}"
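The message list above follows a fixed pattern: a system message, the task prompt, alternating example inputs and answers, and finally the new input. A minimal sketch of a helper that assembles it is shown below; `build_messages` and its parameter names are hypothetical and are not part of the transformers library.

```python
def build_messages(system_prompt, task_prompt, examples, input_sentence):
    """Assemble a few-shot chat: system message, task prompt, examples, then the input."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task_prompt},
    ]
    # Each example is an (input text, expected answer) pair.
    for text, answer in examples:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": answer})
    # The sentence we actually want the model to label always comes last.
    messages.append({"role": "user", "content": input_sentence})
    return messages


demo_messages = build_messages(
    "You are a smart and intelligent Named Entity Recognition (NER) system.",
    "Entity Definition: ...",  # placeholder standing in for the full task prompt
    [("Placed on a week of Clavamox.",
      "{{'PROBLEM': ['None'], 'TREATMENT': ['Clavamox'], 'TEST': ['None']}}")],
    "He currently receives 3 units of ProZinc insulin.",
)
print([m["role"] for m in demo_messages])
```

Keeping the examples as a separate list of pairs makes it easy to swap them in and out when experimenting with the prompt.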

# Under the hood

The above example used the [chat templating feature](https://huggingface.co/docs/transformers/main/en/chat_templating) of the transformers library.

Behind the scenes, the list of messages is turned into a single long input for the model, which includes special tokens indicating who is "speaking" in the chat dialogue.

For example, the sequence of messages:

In [None]:
messages = [
   {"role": "system", "content": "You are a helpful chatbot."},
   {"role": "user", "content": "Hello, how are you?"},
   {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
   {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

will be turned into the following prompt under the hood:

In [None]:
print(pipe.tokenizer.apply_chat_template(messages, tokenize=False))

<|system|>
You are a helpful chatbot.</s>
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
I'd like to show off how chat templating works!</s>



Notice how the special tokens `<|system|>`, `<|user|>`, `<|assistant|>`, and `</s>` are added between each round of dialogue.

Each LLM (Large Language Model) is trained with a different format, so these special tokens are model-specific. The chat templating feature hides this detail, so we don't have to remember which tokens to use.

However, chat templates aren't yet supported for all LLMs in the transformers library, so you may sometimes need to construct the above prompt by hand.
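As a rough illustration of building such a prompt by hand, the sketch below reproduces the layout printed above for TinyLlama. `build_prompt` is a hypothetical helper, the tag and end-of-sequence strings are taken from the printed output, and the exact whitespace produced by the tokenizer's own template may differ.

```python
def build_prompt(messages, eos_token="</s>", add_generation_prompt=False):
    """Render chat messages in the <|role|> ... </s> layout shown above."""
    parts = []
    for message in messages:
        # Each turn is a role tag on its own line, the text, then the end-of-sequence token.
        parts.append(f"<|{message['role']}|>\n{message['content']}{eos_token}\n")
    if add_generation_prompt:
        # Finish with an open assistant tag so the model writes the next turn.
        parts.append("<|assistant|>\n")
    return "".join(parts)


chat = [
    {"role": "system", "content": "You are a helpful chatbot."},
    {"role": "user", "content": "Hello, how are you?"},
]
print(build_prompt(chat, add_generation_prompt=True))
```

Other models use different tags and separators, so a hand-written function like this has to be adapted for each model.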

# Guiding the model

In the previous example, we told the model which format to output the NER result in and gave it some examples.

Most of the time this is enough, but if you're asking the LLM to solve a task it's not seen before, it may struggle with the output format. This is a problem if we want to parse the model's output and we're expecting it to be in a specific format.
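To make the parsing problem concrete, here is a minimal sketch of turning the NER output into a Python dictionary. `parse_entities` is a hypothetical helper, and it assumes the doubled-brace, single-quoted format used in the examples earlier in this notebook.

```python
import ast


def parse_entities(raw_output):
    """Parse the dict-style NER output; return None if it is malformed."""
    # The few-shot examples wrap the dict in doubled braces, so collapse them first.
    text = raw_output.strip().replace("{{", "{").replace("}}", "}")
    try:
        parsed = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None
    return parsed if isinstance(parsed, dict) else None


well_formed = "{{'PROBLEM': ['diabetic cat'], 'TREATMENT': ['ProZinc insulin'], 'TEST': ['None']}}"
malformed = "PROBLEM: diabetic cat, TREATMENT: ProZinc insulin"

print(parse_entities(well_formed))
print(parse_entities(malformed))
```

The second call returns `None` because the text is not a valid Python literal, which is the kind of failure a downstream parser has to handle when the model drifts from the requested format.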

We can ensure that the model outputs in the correct format by using the guidance library.

First, we load the guidance library and wrap a model with it. We reuse the LLM defined in the previous section, but guidance also supports other backends, including the OpenAI API.

In [None]:
from guidance import models, select

lm = models.Transformers(pipe.model, pipe.tokenizer)

Let's try applying this to classifying the sex of an animal in the input.

We can now define the rules for the output. In this example we want the model to output exactly one of three options:

*   `male`
*   `female`
*   `unknown`

In [None]:
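def classify_sex(input_sentence):
  # Wrap the instruction and the document in a single user message.
  messages = [
   {"role": "user", "content": "Output the sex of the animal (not the sex of the owner) in the following document:\n\n" + input_sentence },
  ]
  # Render the chat template, ending with the assistant tag so the model answers next.
  prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
  # Strip the end-of-sequence markers from the rendered prompt.
  prompt = prompt.replace("</s>", "")

  # Constrain the continuation to exactly one of the three allowed labels.
  return lm + prompt + select(['male', 'female', 'unknown'])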

In [None]:
classify_sex("The Owner brought his dog into the surgery yesterday. He mentioned a history with diabetes.")

In [None]:
classify_sex("I have been working with a 10-year-old diabetic cat. He is treated with 3 units of ProZinc insulin.")
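For comparison, without constrained generation a free-text answer would have to be mapped onto the allowed labels after the fact. The sketch below shows one way to do that. `normalise_sex` is a hypothetical helper and is not part of guidance, and the keyword matching is a deliberately simple assumption.

```python
import re


def normalise_sex(free_text):
    """Map a free-text answer onto 'male', 'female', or 'unknown'."""
    # Split into whole words so that "female" is not mistaken for "male".
    words = re.findall(r"[a-z]+", free_text.lower())
    found = {word for word in words if word in ("male", "female")}
    # Accept the answer only if exactly one label is mentioned.
    return found.pop() if len(found) == 1 else "unknown"


print(normalise_sex("The animal is Male."))
print(normalise_sex("It could be male or female."))
print(normalise_sex("The document does not say."))
```

Post-processing like this is fragile, which is why restricting the model to the allowed labels at generation time with `select` is attractive.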

# Exercises

1. Experiment with different prompts. The structure of the prompt makes a big difference to the performance of the model.
2. Explore some of the other things the [guidance library](https://github.com/guidance-ai/guidance/tree/main) can do.