The purpose of this notebook is to explore open-source models such as "facebook/opt-125m" and "neuralmagic/Llama-2-7b-chat-quantized.w8a8". These models are relatively small and can run on my g4dn.2xlarge instance.
In addition, I compare the output of the open-source model with that of OpenAI.
The open-source model is served with vLLM.

In [1]:
!pip install transformers torch -q
!pip install langchain -q
!pip install -U langchain-community -q
!pip install python-dotenv openai -q
!pip3 install pysqlite3-binary -q

In [3]:
import boto3
import os
def download_file_from_s3(bucket_name, s3_file_key):
    # download files to local environment
    # Create an S3 client
    s3 = boto3.client('s3')
    local_file_path = s3_file_key.split('/')[-1]
    # Download the file from S3
    s3.download_file(bucket_name, s3_file_key, local_file_path)
    print(f"File {s3_file_key} downloaded from {bucket_name} to {local_file_path}")

def delete_file(file_path):
    os.remove(file_path)

In [4]:
import re
import string

_RE_COMBINE_WHITESPACE = re.compile(r"[ ]+", re.ASCII)
_RE_SHORT_LINES = re.compile("^.{1,3}\n", re.MULTILINE)
_RE_MULTILINE_BREAKS = re.compile("\n+", re.MULTILINE)
_RE_PAGE_CHAR = "\x0c"
_RE_LATIN_WHITESPACE_CHAR = re.compile("\xa0", re.ASCII)


# @markdown  - **clean_text** - clean text spaces, non-printable characters and line breaks
def clean_text(text):
    """Clean text of repeated white-space and line breaks"""
    # replace non-breaking spaces (\xa0) with regular spaces
    text = _RE_LATIN_WHITESPACE_CHAR.sub(" ", text)
    # collapse repeated spaces into one
    text = _RE_COMBINE_WHITESPACE.sub(" ", text).strip()
    # remove very short lines (1-3 characters)
    text = _RE_SHORT_LINES.sub("\n", text)
    # collapse repeated line breaks into one
    text = _RE_MULTILINE_BREAKS.sub("\n", text)
    # remove non-ASCII and non-printable characters
    text = "".join([x for x in text if x in string.printable])

    return text.strip()
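A quick way to see what the cleaning steps do is to run them on a small made-up string. The sketch below restates the same steps under a different name (`clean_text_demo`, a hypothetical helper) so that it runs on its own; the sample string is invented for illustration and is not taken from the dataset.

```python
import re
import string

# Same patterns as in the cell above, restated so this sketch is self-contained.
_SPACES = re.compile(r"[ ]+", re.ASCII)
_SHORT_LINES = re.compile("^.{1,3}\n", re.MULTILINE)
_BREAKS = re.compile("\n+", re.MULTILINE)
_NBSP = re.compile("\xa0", re.ASCII)


def clean_text_demo(text: str) -> str:
    """Hypothetical stand-in that mirrors the steps of clean_text."""
    text = _NBSP.sub(" ", text)              # non-breaking space -> regular space
    text = _SPACES.sub(" ", text).strip()    # collapse runs of spaces
    text = _SHORT_LINES.sub("\n", text)      # drop lines of 1-3 characters
    text = _BREAKS.sub("\n", text)           # collapse runs of line breaks
    text = "".join(ch for ch in text if ch in string.printable)  # ASCII printable only
    return text.strip()


# Invented sample: a non-breaking space, repeated spaces, a short line, and a bullet.
sample = "Full\xa0Legal   Name\n12\n\n\nCity  Oceanside \u2022"
cleaned = clean_text_demo(sample)
print(repr(cleaned))
```

Note that the short line "12" disappears entirely, which is worth keeping in mind for forms where short values (for example "Yes" or "No") sit on their own line.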

## Load the dataset

In [5]:
# For the test data I use some (parsed) files from s3://contract-intelligence-data/client-data/AAA/NY State Insurance/06-FRM-AR1/
# These files are of good quality

download_file_from_s3("contract-intelligence-data", "client-data/AAA/NY State Insurance/06-FRM-AR1/FRM-AR117-21-1230-2624_2024_163320/FRM-AR117-21-1230-2624_2024_163320.json")
download_file_from_s3("contract-intelligence-data", "client-data/AAA/NY State Insurance/06-FRM-AR1/FRM-AR117-21-1230-2638_2024_162334/FRM-AR117-21-1230-2638_2024_162334.json")
download_file_from_s3("contract-intelligence-data", "client-data/AAA/NY State Insurance/06-FRM-AR1/FRM-AR117-22-1252-6330_2024_16400/FRM-AR117-22-1252-6330_2024_16400.json")

download_file_from_s3("contract-intelligence-data","client-data/AAA/NY State Insurance/04-RPT-INIT/17-22-1250-8464/17-22-1250-8464.json")

File client-data/AAA/NY State Insurance/06-FRM-AR1/FRM-AR117-21-1230-2624_2024_163320/FRM-AR117-21-1230-2624_2024_163320.json downloaded from contract-intelligence-data to FRM-AR117-21-1230-2624_2024_163320.json
File client-data/AAA/NY State Insurance/06-FRM-AR1/FRM-AR117-21-1230-2638_2024_162334/FRM-AR117-21-1230-2638_2024_162334.json downloaded from contract-intelligence-data to FRM-AR117-21-1230-2638_2024_162334.json
File client-data/AAA/NY State Insurance/06-FRM-AR1/FRM-AR117-22-1252-6330_2024_16400/FRM-AR117-22-1252-6330_2024_16400.json downloaded from contract-intelligence-data to FRM-AR117-22-1252-6330_2024_16400.json
File client-data/AAA/NY State Insurance/04-RPT-INIT/17-22-1250-8464/17-22-1250-8464.json downloaded from contract-intelligence-data to 17-22-1250-8464.json


In [6]:
import json
import glob
from tqdm import tqdm

def read_files(docs_dir: str):
    # recursive=True has no effect without a "**" pattern, so it is omitted
    files = glob.glob(os.path.join(docs_dir, "*.json"))
    print(f"Total number of docs: {len(files)}")
    return files

def compose_dataset(docs_dir: str):
    files = read_files(docs_dir)
    print(files)
    # Read & Load the Dataset
    dataset = []
    for file in tqdm(files):
        # data in json format after ocr
        with open(file) as f:
            pdoc = json.load(f)
        dataset.append(pdoc)

    return dataset

In [7]:
dataset = compose_dataset(".")  

Total number of docs: 4
['./FRM-AR117-21-1230-2624_2024_163320.json', './17-22-1250-8464.json', './FRM-AR117-21-1230-2638_2024_162334.json', './FRM-AR117-22-1252-6330_2024_16400.json']


100%|██████████| 4/4 [00:00<00:00, 3413.47it/s]


In [8]:
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain_community.embeddings import CohereEmbeddings, OpenAIEmbeddings
from langchain.chains.retrieval_qa.base import RetrievalQA
from langchain.schema import Document
from langchain.chat_models import ChatOpenAI

import pandas as pd

In [9]:
# The RAG part, based on the one in the LINT API

DEFAULT_CHUNK_SIZE = 3500  # had to reduce to 1400 to fit into the facebook/opt-125m model
DEFAULT_CHUNK_OVERLAP = 500
EMBEDDING_MODEL = "text-embedding-ada-002"  # I still use OpenAI for embeddings
# A next step could be to replace the embeddings with open-source ones
LLM_MODEL_OPENAI = "gpt-3.5-turbo"
vector_db_path = './chroma_db'
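To make the chunk size and overlap settings concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification: RecursiveCharacterTextSplitter, used later in this notebook, prefers to split on separators such as line breaks, so its chunks will generally not have exactly this shape. `sliding_chunks` is a hypothetical helper and not part of LangChain.

```python
def sliding_chunks(text: str, chunk_size: int, overlap: int) -> list:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 8000-character text, only to show the window sizes.
demo_text = "".join(str(i % 10) for i in range(8000))
demo_chunks = sliding_chunks(demo_text, chunk_size=3500, overlap=500)
print([len(c) for c in demo_chunks])
```

With a chunk size of 3500 and k=4 retrieved chunks, the stuffed prompt can reach roughly 14,000 characters, which is why a small-context model needs a smaller chunk size.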

In [11]:
from dotenv import load_dotenv, find_dotenv, dotenv_values
import openai
path_to_keys = 'keys.env'
temp = dotenv_values(path_to_keys)
openai_api_key = temp["OPENAI_API_KEY"]

### Let's put the data into Chroma DB

In [12]:
__import__('pysqlite3')
import sys
sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

In [13]:
# !pip install hnswlib==0.7.0 -q
# !pip install chroma-hnswlib==0.7.3 -q
# !pip uninstall hnswlib chroma-hnswlib -y

In [14]:
%pip install chromadb==0.5 tiktoken -q

Note: you may need to restart the kernel to use updated packages.


In [15]:
from langchain.vectorstores import Chroma

def put_in_Chroma(doc_pages, doc_name):
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=DEFAULT_CHUNK_SIZE, chunk_overlap=DEFAULT_CHUNK_OVERLAP)
    doc = [
                Document(page_content=clean_text(page), metadata={"page": i, "doc_name": doc_name})
                for i, page in enumerate(doc_pages)
            ]
    chunks = text_splitter.split_documents(doc)
    print('chunks: ', len(chunks))
    # Build the embedding function (OpenAI embeddings)
    embeddings = OpenAIEmbeddings(model=EMBEDDING_MODEL, openai_api_key=openai_api_key)

    # Index the vector database by embedding then inserting document chunks
    db = Chroma.from_documents(chunks,
                            embedding=embeddings,
                            ids=[str(i) for i in range(len(chunks))],
                            persist_directory=vector_db_path)

    # With persist_directory set, chromadb 0.4+ writes to disk automatically,
    # so an explicit db.persist() call is not needed
    return db

In [16]:
%%time
# file_name = 'FRM-AR117-21-1230-2624_2024_163320'
# file_name = 'FRM-AR117-21-1230-2638_2024_162334'
file_name = 'FRM-AR117-22-1252-6330_2024_16400'
# file_name = '17-22-1250-8464'
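for doc in dataset:
    if doc['name'] == file_name:
        doc_pages = doc['text']
        break
else:
    # fail loudly instead of silently reusing doc_pages from an earlier run
    raise ValueError(f"{file_name} not found in dataset")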
print('pages: ', len(doc_pages))
db = put_in_Chroma(doc_pages, doc_name=file_name)

pages:  10
chunks:  9




CPU times: user 1.31 s, sys: 112 ms, total: 1.42 s
Wall time: 2.53 s


In [17]:
def get_gpt_llm():
    chat_params = {
        "model": "gpt-3.5-turbo", # Bigger context window
        "openai_api_key": openai_api_key,
        "temperature": 0.000001, 
    }
    llm = ChatOpenAI(**chat_params)
    return llm

def qa_retriever_openai(query, vector_db_path, file_id, k=4):
    embeddings = OpenAIEmbeddings(model=EMBEDDING_MODEL, openai_api_key=openai_api_key)
    vectordb = Chroma(persist_directory=vector_db_path, embedding_function=embeddings)

    retriever = vectordb.as_retriever(search_kwargs={"k": k, "filter": {"doc_name": file_id}})

    qa = RetrievalQA.from_chain_type(llm=get_gpt_llm(), chain_type="stuff", 
                                    retriever=retriever, return_source_documents=True)
    res = qa({"query": query, "k": k})
    return res, retriever
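Conceptually, the retriever configured with `search_kwargs={"k": k, "filter": {"doc_name": file_id}}` keeps only chunks whose metadata matches the filter and returns the k chunks closest to the query embedding. The sketch below illustrates that idea in plain Python. It uses cosine similarity and toy 2-dimensional vectors purely for illustration; Chroma's actual index and distance metric may differ, and `top_k` and the records are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, records, k, doc_name):
    """Filter records by doc_name, then return the k most similar ones."""
    candidates = [r for r in records if r["metadata"]["doc_name"] == doc_name]
    ranked = sorted(candidates, key=lambda r: cosine(query_vec, r["vector"]), reverse=True)
    return ranked[:k]


# Toy records standing in for embedded chunks (invented values).
records = [
    {"id": "0", "vector": [1.0, 0.0], "metadata": {"doc_name": "doc_a", "page": 0}},
    {"id": "1", "vector": [0.6, 0.8], "metadata": {"doc_name": "doc_a", "page": 1}},
    {"id": "2", "vector": [0.0, 1.0], "metadata": {"doc_name": "doc_a", "page": 2}},
    {"id": "3", "vector": [1.0, 0.0], "metadata": {"doc_name": "doc_b", "page": 0}},
]
hits = top_k([1.0, 0.1], records, k=2, doc_name="doc_a")
print([h["id"] for h in hits])
```

The record from `doc_b` is never returned even though its vector is a close match, which is the behaviour the `filter` argument is meant to give when several documents share one vector store.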

In [18]:
question = "Who are the parties?"

# file_name = 'FRM-AR117-21-1230-2624_2024_163320'
# file_name = 'FRM-AR117-21-1230-2638_2024_162334'
# file_name = '17-22-1250-8464'
file_name = 'FRM-AR117-22-1252-6330_2024_16400'
answer, retriever = qa_retriever_openai(question, vector_db_path="/home/ubuntu/yulia/vllm-exploratory/llm/xplore/chroma_db", file_id=file_name, k=4)



In [19]:
answer

{'query': 'Who are the parties?',
 'k': 4,
 'result': 'The parties involved in this case are:\n\n1. Applicant Attorney: Korsunskiy Legal Group P.C.\n2. Medical Provider: Mazal Pharmacy Inc d/b/a Mirage Pharmacy\n3. Injured Party: Annmarie Morris',
 'source_documents': [Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 9}, page_content='Your name:\nSigned on:'),
  Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 0}, page_content='New York Motor Vehicle No-Fault Insurance Law Arbitration Request Form\nDetails of the parties\nApplicant Attorney Details\nFull Legal Name Korsunskiy Legal Group P.C.\nAddress 3237 Long Beach Road\nAddress Suite 110\nCity Oceanside\nState NY\nZip 11572\nEmail dkorsunskiy @korsunskiy-law.com\nPhone 718-758-4755\nApplicant File Number DK22-243552'),
  Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 4}, page_content='Details of the Accident\nDid the accident occur in New York\nState?

In [20]:
print('Openai answer: ', answer['result'])

Openai answer:  The parties involved in this case are:

1. Applicant Attorney: Korsunskiy Legal Group P.C.
2. Medical Provider: Mazal Pharmacy Inc d/b/a Mirage Pharmacy
3. Injured Party: Annmarie Morris


The vLLM server was started in a terminal with: `vllm serve neuralmagic/Llama-2-7b-chat-quantized.w8a8 --chat-template templates/template_chatml.jinja`

In [21]:
inference_server_url = "http://localhost:8000/v1"

# MODEL = "facebook/opt-125m"
MODEL = "neuralmagic/Llama-2-7b-chat-quantized.w8a8"
    
llm = ChatOpenAI(
    model=MODEL,
    openai_api_key="EMPTY",
    openai_api_base=inference_server_url,
    max_tokens=100,
    temperature=0,
)
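ChatOpenAI works here because vLLM exposes an OpenAI-compatible HTTP API. As a rough sketch of what goes over the wire, the snippet below builds the equivalent chat-completions request with the standard library only. `build_chat_request` and `ask` are hypothetical helpers; `ask` needs the vLLM server to be running, so it is defined but not called here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # same base URL as inference_server_url in this notebook
MODEL_NAME = "neuralmagic/Llama-2-7b-chat-quantized.w8a8"


def build_chat_request(question, max_tokens=100, temperature=0.0):
    """Build (but do not send) a POST request in the OpenAI chat-completions format."""
    payload = {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question):
    """Send the request; requires the vLLM server to be up on localhost:8000."""
    with urllib.request.urlopen(build_chat_request(question), timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


request = build_chat_request("Who are the parties?")
print(request.full_url)
```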

In [22]:
def qa_retriever_llama(query, vector_db_path, file_id, k=4):
    embeddings = OpenAIEmbeddings(model=EMBEDDING_MODEL, openai_api_key=openai_api_key)
    vectordb = Chroma(persist_directory=vector_db_path, embedding_function=embeddings)

    retriever = vectordb.as_retriever(search_kwargs={"k": k, "filter": {"doc_name": file_id}})

    qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", 
                                    retriever=retriever, return_source_documents=True)
    res = qa({"query": query})
    return res, retriever

In [23]:
question = "Who are the parties?"
# file_name = 'FRM-AR117-21-1230-2624_2024_163320'
# file_name = 'FRM-AR117-21-1230-2638_2024_162334'
# file_name = '17-22-1250-8464'

file_name = 'FRM-AR117-22-1252-6330_2024_16400'
answer, retriever = qa_retriever_llama(question, vector_db_path="/home/ubuntu/yulia/vllm-exploratory/llm/xplore/chroma_db", file_id=file_name, k=4)

In [24]:
answer

{'query': 'Who are the parties?',
 'result': 'The parties involved in this arbitration request are:\n\n* Applicant: Mazal Pharmacy Inc d/b/a Mirage Pharmacy, the medical provider seeking payment of benefits for Annmarie Morris, the injured party.\n* Insurer: The insurance company that issued the policy to Annmarie Morris, the injured party.\n* Applicant Attorney: Korsunskiy Legal Group P.C., the law firm representing Maz',
 'source_documents': [Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 9}, page_content='Your name:\nSigned on:'),
  Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 0}, page_content='New York Motor Vehicle No-Fault Insurance Law Arbitration Request Form\nDetails of the parties\nApplicant Attorney Details\nFull Legal Name Korsunskiy Legal Group P.C.\nAddress 3237 Long Beach Road\nAddress Suite 110\nCity Oceanside\nState NY\nZip 11572\nEmail dkorsunskiy @korsunskiy-law.com\nPhone 718-758-4755\nApplicant File Number

In [26]:
print(answer['result'])

The parties involved in this arbitration request are:

* Applicant: Mazal Pharmacy Inc d/b/a Mirage Pharmacy, the medical provider seeking payment of benefits for Annmarie Morris, the injured party.
* Insurer: The insurance company that issued the policy to Annmarie Morris, the injured party.
* Applicant Attorney: Korsunskiy Legal Group P.C., the law firm representing Maz
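The answer above stops mid-word ("representing Maz"), which is what one would expect when generation hits the max_tokens=100 limit. In the OpenAI chat-completions response format, a choice that stopped for this reason carries `finish_reason == "length"`. The sketch below checks for that on raw response dictionaries; `was_truncated` is a hypothetical helper and the two sample responses are hand-written stand-ins, not output captured from this notebook.

```python
def was_truncated(response: dict) -> bool:
    """Return True when the first choice stopped because it ran out of tokens."""
    choice = response["choices"][0]
    return choice.get("finish_reason") == "length"


# Hand-written examples in the chat-completions response shape (NOT real server output).
cut_off = {
    "choices": [
        {"message": {"content": "the law firm representing Maz"}, "finish_reason": "length"}
    ]
}
complete = {
    "choices": [
        {"message": {"content": "The accident occurred in New York State."}, "finish_reason": "stop"}
    ]
}

print(was_truncated(cut_off), was_truncated(complete))
```

When an answer is flagged this way, raising max_tokens (as done later in the notebook with max_tokens=200) is the simplest remedy.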


In [27]:
text = '\n--\n'.join([i.page_content for i in answer['source_documents']])
print(text)

Your name:
Signed on:
--
New York Motor Vehicle No-Fault Insurance Law Arbitration Request Form
Details of the parties
Applicant Attorney Details
Full Legal Name Korsunskiy Legal Group P.C.
Address 3237 Long Beach Road
Address Suite 110
City Oceanside
State NY
Zip 11572
Email dkorsunskiy @korsunskiy-law.com
Phone 718-758-4755
Applicant File Number DK22-243552
--
Details of the Accident
Did the accident occur in New York
State? Yes
If no, Is the injured person or a
member of their household a New No
York State Automobile Policy
Holder?
Date Of Accident 11/20/2021
Description Of Accident
Position Of Injured Party
Date of last contact
Person contacted
Reason given by insurer for
nonpayment of claim(s)
Reason you believe the denied or
overdue benefits should be paid
--
Applicant for benefits - Medical Provider
Full Legal Name
Mazal Pharmacy Inc d/b/a Mirage Pharmacy
Address 1 1674 East 22 street
Address 2 2nd Floor
City Brooklyn
State NY
Zip 11229
Email donotemail5014701 @adr.org
Phone 000

In [28]:
from langchain_core.prompts.prompt import PromptTemplate

In [29]:
llm = ChatOpenAI(
    model=MODEL,
    openai_api_key="EMPTY",
    openai_api_base=inference_server_url,
    max_tokens=200,
    temperature=0,
)

In [None]:
prompt = """You are an AI assistant, use the following text to provide answer if you don't know, say you don't know
        Context: {context}
        Question: {question}
"""

# context = text
# question = "Who are the parties?"
question = "Where did the accident occur?"
# question = "What is the date of the accident?"
# question = "Was the denial of claim based on late notice to the carrier?"
# question = "Who is the insurer?"
# question = "What type of form is that?"

file_name = 'FRM-AR117-22-1252-6330_2024_16400'
vector_db_path = "/home/ubuntu/yulia/vllm-exploratory/llm/xplore/chroma_db"
my_prompt = PromptTemplate(template=prompt, input_variables=["context", "question"])
embeddings = OpenAIEmbeddings(model=EMBEDDING_MODEL, openai_api_key=openai_api_key)
vectordb = Chroma(persist_directory=vector_db_path, embedding_function=embeddings)
retriever = vectordb.as_retriever(search_kwargs={"k": 4, "filter": {"doc_name": file_name}})

In [30]:
qa = RetrievalQA.from_chain_type(llm=llm, 
                                chain_type="stuff", 
                                retriever=retriever, 
                                return_source_documents=True,
                                chain_type_kwargs={"prompt": my_prompt})

In [33]:
%%time
llama_answer = qa.invoke(question)

CPU times: user 21.7 ms, sys: 471 μs, total: 22.2 ms
Wall time: 8.61 s


In [32]:
print(llama_answer)
print(llama_answer['result'])

{'query': 'Where did the accident occur?',
 'result': "I'm just an AI, I don't have access to specific information about the accident or the parties involved. However, I can provide general information on New York Motor Vehicle No-Fault Insurance Law and the arbitration process.\n\nThe New York Motor Vehicle No-Fault Insurance Law requires that all motor vehicles registered in New York State be covered by a no-fault insurance policy. This policy provides coverage for medical expenses and lost wages to the insured party and passengers in the event of an accident, regardless of fault.\n\nIf an insurer denies or overpays a claim, the insured party or their attorney can file an arbitration request with the New York State Department of Financial Services (DFS). The arbitration process is special expedited, which means that the case will be heard and decided by an arbitrator within 30 days of the request.\n\nTo",
 'source_documents': [Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_20

In [36]:
my_prompt = PromptTemplate(template=prompt, input_variables=["context", "question"])
embeddings = OpenAIEmbeddings(model=EMBEDDING_MODEL, openai_api_key=openai_api_key)
vectordb = Chroma(persist_directory=vector_db_path, embedding_function=embeddings)
retriever = vectordb.as_retriever(search_kwargs={"k": 4, "filter": {"doc_name": file_name}})
qa = RetrievalQA.from_chain_type(llm=get_gpt_llm(), 
                                chain_type="stuff", 
                                retriever=retriever, 
                                return_source_documents=True,
                                chain_type_kwargs={"prompt": my_prompt})

In [37]:
%%time
openai_answer = qa.invoke(question)

CPU times: user 22.9 ms, sys: 308 μs, total: 23.2 ms
Wall time: 719 ms
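For a rough comparison of the two wall times reported above (8.61 s for the vLLM-served Llama-2 model, 719 ms for gpt-3.5-turbo), the arithmetic can be written out as below. Both timings include retrieval and the embedding call for the query, so this is only an end-to-end comparison on a single question. The token count is an assumption: the Llama answer appears to stop mid-sentence, so I assume it used the full max_tokens=200 budget.

```python
llama_wall_s = 8.61    # wall time of the vLLM-served Llama-2 call
openai_wall_s = 0.719  # wall time of the gpt-3.5-turbo call

slowdown = llama_wall_s / openai_wall_s
print(f"Local model was about {slowdown:.1f}x slower on this question")

# ASSUMPTION: the Llama answer consumed all 200 allowed output tokens.
assumed_output_tokens = 200
tokens_per_s = assumed_output_tokens / llama_wall_s
print(f"Roughly {tokens_per_s:.1f} output tokens/s end to end (under that assumption)")
```

The answers are also very different in length (one sentence from OpenAI versus a 200-token answer from Llama), so the gap reflects output length as much as raw model speed.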


In [38]:
print(openai_answer)
print(openai_answer['result'])

{'query': 'Where did the accident occur?', 'result': 'The accident occurred in New York State.', 'source_documents': [Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 4}, page_content='Details of the Accident\nDid the accident occur in New York\nState? Yes\nIf no, Is the injured person or a\nmember of their household a New No\nYork State Automobile Policy\nHolder?\nDate Of Accident 11/20/2021\nDescription Of Accident\nPosition Of Injured Party\nDate of last contact\nPerson contacted\nReason given by insurer for\nnonpayment of claim(s)\nReason you believe the denied or\noverdue benefits should be paid'), Document(metadata={'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 0}, page_content='New York Motor Vehicle No-Fault Insurance Law Arbitration Request Form\nDetails of the parties\nApplicant Attorney Details\nFull Legal Name Korsunskiy Legal Group P.C.\nAddress 3237 Long Beach Road\nAddress Suite 110\nCity Oceanside\nState NY\nZip 11572\nEmail dkorsun

In [39]:
db._collection.get(include=["metadatas","documents"])

{'ids': ['0', '1', '2', '3', '4', '5', '6', '7', '8'],
 'embeddings': None,
 'metadatas': [{'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 0},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 1},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 2},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 4},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 5},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 6},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 7},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 8},
  {'doc_name': 'FRM-AR117-22-1252-6330_2024_16400', 'page': 9}],
 'documents': ['New York Motor Vehicle No-Fault Insurance Law Arbitration Request Form\nDetails of the parties\nApplicant Attorney Details\nFull Legal Name Korsunskiy Legal Group P.C.\nAddress 3237 Long Beach Road\nAddress Suite 110\nCity Oceanside\nState NY\nZip 11572\nEmail dkorsunskiy @korsunskiy-law.com\nPhone 718-758-4755\nAppl

In [24]:
import torch
torch.cuda.empty_cache()

--------

In [27]:
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

model_id = "neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8"
number_gpus = 1
max_model_len = 8192

sampling_params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=256)

tokenizer = AutoTokenizer.from_pretrained(model_id)

llm = LLM(model=model_id, tensor_parallel_size=number_gpus, max_model_len=max_model_len, dtype=torch.float16)

INFO 10-03 11:02:21 llm_engine.py:223] Initializing an LLM engine (v0.6.1.post2) with config: model='neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8', speculative_config=None, tokenizer='neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=8192, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=compressed-tensors, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=neuralmagic/Meta-Llama-3.1-8B-Instruct-quantize

Loading safetensors checkpoint shards:   0% Completed | 0/2 [00:00<?, ?it/s]
Loading safetensors checkpoint shards:  50% Completed | 1/2 [00:02<00:02,  2.12s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:03<00:00,  1.85s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:03<00:00,  1.89s/it]



INFO 10-03 11:02:29 model_runner.py:1008] Loading model weights took 8.4939 GB
INFO 10-03 11:02:33 gpu_executor.py:122] # GPU blocks: 1108, # CPU blocks: 2048
INFO 10-03 11:02:35 model_runner.py:1311] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 10-03 11:02:35 model_runner.py:1315] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
INFO 10-03 11:03:03 model_runner.py:1430] Graph capturing finished in 28 secs.
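The "# GPU blocks: 1108" line gives a rough idea of how much KV cache is left after loading the weights. The sketch below converts it to tokens. The block size of 16 tokens is an assumption (it is vLLM's usual default, but the config log line above is cut off before it could confirm the value).

```python
gpu_blocks = 1108     # from the gpu_executor log line
max_model_len = 8192  # as passed to LLM(...)
block_size = 16       # ASSUMPTION: vLLM default block size, not confirmed by the log

kv_cache_tokens = gpu_blocks * block_size
full_length_sequences = kv_cache_tokens / max_model_len

print(f"KV cache capacity: {kv_cache_tokens} tokens")
print(f"About {full_length_sequences:.2f} sequences of {max_model_len} tokens fit at once")
```

Under that assumption, only about two full-length sequences fit in the cache at the same time on this GPU, so lowering max_model_len is one way to allow more concurrent requests.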


Processed prompts: 100%|██████████| 1/1 [00:05<00:00,  5.59s/it, est. speed input: 9.48 toks/s, output: 21.10 toks/s]

Yer lookin' fer a swashbucklin' introduction, eh? Alright then, matey! Me name be Captain Chatbeard, the scurvy dog o' the seven seas... er, the seven screens! Me and me trusty crew o' code scallywags have been sailin' the digital waters, gatherin' knowledge and tellin' tales o' adventure and danger. So hoist the sails and set course fer a treasure trove o' information, me hearty! What be bringin' ye to these fair waters today?





In [28]:
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "write a poem about waterlilies"},
]

prompts = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
outputs = llm.generate(prompts, sampling_params)

generated_text = outputs[0].outputs[0].text
print(generated_text)

Processed prompts: 100%|██████████| 1/1 [00:11<00:00, 11.97s/it, est. speed input: 4.68 toks/s, output: 21.38 toks/s]

Yer lookin' fer a poem, eh? Alright then, matey, settle yerself down with a pint o' grog and listen close to me tale o' the waterlilies.

Oh, waterlilies, ye floatin' beauties o' the deep,
Like treasure chests o' the pond, yer secrets ye keep.
Yer leaves like green sails, they gently sway,
Dancin' to the breeze, on a sunny day.

Yer flowers, like golden doubloons, shine so bright,
A treasure trove o' beauty, in the mornin' light.
The dragonflies and bees, they come to visit ye,
To drink from yer sweet nectar, and bask in yer glee.

In the still o' the night, ye rise up high,
Like a ghost ship, sailin' across the sky.
Yer fragrance, like a chest overflowin' with gold,
Fills the air, and makes the heart feel bold.

So here's to ye, oh waterlilies, a treasure rare,
A beauty o' the water, beyond compare.
May yer beauty continue, to shine so bright,
And bring joy to all, who sail through the night.

Now, go forth,



