## Summarizing a text using an LLM

Because an LLM "understands" language, it is well suited to tasks such as translation and summarization.

In this notebook, we are going to use our LLM to summarize some texts, specifically example claims.

### Requirements and Imports

If you launched the workbench image specified in the Lab's instructions, all the required libraries should already be installed. If not, uncomment the first line in the next cell to install the required packages.

In [None]:
from json import load
from os import listdir
from os.path import isfile, join

from langchain.chains import LLMChain
from langchain_community.llms import VLLMOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate

from certs import prepare_certs

### LangChain pipeline

Again, we are going to use LangChain to define our summarization pipeline.

In [None]:
inference_server_url = "REPLACE_ME"

prepare_certs(inference_server_url)
llm = VLLMOpenAI(
    openai_api_key="EMPTY",
    openai_api_base=f"{inference_server_url}/v1",
    model_name="/mnt/models/",
    max_tokens=512,
    top_p=0.95,
    temperature=0.01,
    presence_penalty=1.03,
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()]
)

The **template** we will use is now formatted for this specific summarization task.

In [None]:
template = """<s>[INST]
You are a helpful, respectful and honest assistant.
Always assist with care, respect, and truth. Respond with utmost utility yet securely.
Avoid harmful, unethical, prejudiced, or negative content. Ensure replies promote fairness and positivity.
I will give you a text that you must summarize as best as you can.

### TEXT:
{input}

### SUMMARY:
[/INST]
"""
PROMPT = PromptTemplate(input_variables=["input"], template=template)
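To see what the model actually receives, the template can be filled in by hand. The sketch below uses plain `str.format` as a stand-in for the prompt template's own formatting, so that it runs without LangChain. The shortened `demo_template` and the sample claim text are made up for illustration.

```python
# Shortened copy of the template above; only the placeholder logic matters here.
demo_template = """<s>[INST]
I will give you a text that you must summarize as best as you can.

### TEXT:
{input}

### SUMMARY:
[/INST]
"""

# Hypothetical claim text, not taken from the claims folder.
sample_text = "Subject: Car accident\nContent:\nMy car was hit while parked."

# str.format substitutes the {input} placeholder with the claim text.
filled_prompt = demo_template.format(input=sample_text)
print(filled_prompt)
```

The printed prompt shows the claim sitting between the `### TEXT:` and `### SUMMARY:` markers, which is where the model is expected to pick up and write its summary.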

And we can now create the **conversation** object that we will use to query the model.

In [None]:
conversation = LLMChain(llm=llm, prompt=PROMPT, verbose=False)

We are now ready to query the model!

In the `claims` folder we have JSON files with examples of claims that could be received. We are going to read those files, then display each claim followed by the summary produced by the LLM.

In [None]:
# Read the claims and populate a dictionary
claims_path = 'claims'
onlyfiles = [f for f in listdir(claims_path) if isfile(join(claims_path, f))]

claims = {}

for filename in onlyfiles:
    with open(join(claims_path, filename), 'r') as file:
        data = load(file)
    claims[filename] = data
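The cell above assumes that every file in the folder is valid JSON with a `subject` and a `content` field, since those are the keys read later on. As a minimal sketch of a more defensive variant, the hypothetical `load_claims` helper below keeps only `.json` files and skips claims that lack a required key. The demo data is made up and written to a temporary folder so the example is self-contained.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = {"subject", "content"}  # keys the display loop reads


def load_claims(folder):
    """Return {filename: claim dict} for every well-formed .json file in folder."""
    loaded = {}
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, "r", encoding="utf-8") as file:
            data = json.load(file)
        # Skip anything that is not an object with the expected keys.
        if not isinstance(data, dict):
            print(f"Skipping {path.name}: not a JSON object")
            continue
        missing = REQUIRED_KEYS - set(data)
        if missing:
            print(f"Skipping {path.name}: missing {sorted(missing)}")
            continue
        loaded[path.name] = data
    return loaded


# Demo on a throwaway folder with one valid and one incomplete claim (made-up data).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "claim1.json").write_text(
        json.dumps({"subject": "Car accident", "content": "My car was hit."}),
        encoding="utf-8",
    )
    Path(tmp, "claim2.json").write_text(
        json.dumps({"subject": "No body"}), encoding="utf-8"
    )
    demo_claims = load_claims(tmp)

print(list(demo_claims))
```

Sorting the paths also makes the processing order predictable, which `listdir` alone does not guarantee.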

In [None]:
for filename in onlyfiles:
    print("***************************")
    print(f"* Claim: {filename}")
    print("***************************")
    print("Original content:")
    print("-----------------")
    print(f"Subject: {claims[filename]['subject']}\nContent:\n{claims[filename]['content']}\n\n")
    print('Summary:')
    print("--------")
    summary_input = f"Subject: {claims[filename]['subject']}\nContent:\n{claims[filename]['content']}"
    conversation.predict(input=summary_input)
    print("\n\n                          ----====----\n")
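The loop above builds the same "Subject / Content" string twice and only streams the summaries to the screen. One possible refactoring, sketched below, moves the formatting into a helper and collects the summaries in a dictionary. The names `format_claim`, `summarize_all` and `fake_summarize` are hypothetical, and `fake_summarize` is only a stand-in for the model so that the sketch runs offline.

```python
def format_claim(claim):
    """Build the text sent to the model from a claim's subject and content."""
    return f"Subject: {claim['subject']}\nContent:\n{claim['content']}"


def summarize_all(claims, summarize):
    """Apply `summarize` (any callable taking a string) to every claim."""
    return {name: summarize(format_claim(claim)) for name, claim in claims.items()}


def fake_summarize(text):
    """Stand-in for the LLM call: returns only the first line of the text."""
    return text.splitlines()[0]


# Made-up claim, not taken from the claims folder.
demo_claims = {
    "claim1.json": {"subject": "Car accident", "content": "My car was hit."}
}
summaries = summarize_all(demo_claims, fake_summarize)
print(summaries)
```

In the notebook, `summarize` could be something like `lambda text: conversation.predict(input=text)`, assuming `predict` returns the generated text as a string in addition to streaming it.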

We will come back to this notebook in section 3.5 for some exercises, so you can leave it open for now.