In [1]:
import re
import time

import lingua



# Getting Started

There is some documentation on how to interact with the large models [here](https://lingua-sdk.readthedocs.io/en/latest/getting_started.html). The SDK is on GitHub [here](https://github.com/VectorInstitute/lingua-sdk), and the underlying code is [here](https://github.com/VectorInstitute/lingua).

First, we connect to the service through which we'll interact with the LLMs and see which models are available to us.

In [2]:
# Establish a client connection to the Lingua service
client = lingua.Client(gateway_host="llm.cluster.local", gateway_port=3001)

Show all supported models

In [3]:
client.models

['OPT-175B', 'OPT-6.7B']

Show all model instances that are currently active

In [4]:
client.model_instances

[{'id': 'c402a90b-5867-476b-950d-9921585335ec',
  'name': 'OPT-6.7B',
  'state': 'ACTIVE'},
 {'id': 'af334811-a4fc-483d-91be-a65a3a98d34e',
  'name': 'OPT-175B',
  'state': 'ACTIVE'}]
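
The instance listing above is a plain list of dictionaries, so it can be filtered with ordinary Python. A minimal sketch, assuming the `id`/`name`/`state` keys shown in the output; `active_model_names` is a hypothetical helper rather than part of the SDK, and the third sample entry (including its `LAUNCHING` state) is made up for illustration.

```python
def active_model_names(instances):
    """Return the sorted, de-duplicated names of instances whose state is ACTIVE."""
    return sorted({inst["name"] for inst in instances if inst.get("state") == "ACTIVE"})


# The first two entries are copied from the output above.
# The third entry is hypothetical and only shows a non-active instance being skipped.
sample_instances = [
    {"id": "c402a90b-5867-476b-950d-9921585335ec", "name": "OPT-6.7B", "state": "ACTIVE"},
    {"id": "af334811-a4fc-483d-91be-a65a3a98d34e", "name": "OPT-175B", "state": "ACTIVE"},
    {"id": "hypothetical-id", "name": "OPT-30B", "state": "LAUNCHING"},
]
print(active_model_names(sample_instances))
```

With a live client, the same helper could presumably be called as `active_model_names(client.model_instances)`.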

Let's start by querying the OPT-6.7B model; we'll try other models below. First, we get a handle to the model.

In [5]:
model = client.load_model("OPT-6.7B")
# If this model is not actively running, it will get launched in the background.
# In this case, wait until it moves into an "ACTIVE" state before proceeding.
while model.state != "ACTIVE":
    time.sleep(1)
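
The polling loop above waits indefinitely if the model never becomes active. A minimal sketch of the same loop with a timeout, assuming only that the model object exposes a `state` attribute as used above; `wait_until_active` and its default values are hypothetical and not part of the SDK.

```python
import time


def wait_until_active(model, timeout_s=600.0, poll_s=1.0):
    """Poll `model.state` until it is "ACTIVE"; raise TimeoutError after `timeout_s` seconds."""
    deadline = time.monotonic() + timeout_s
    while model.state != "ACTIVE":
        if time.monotonic() >= deadline:
            raise TimeoutError(f"model still in state {model.state!r} after {timeout_s} s")
        time.sleep(poll_s)
    return model
```

It could replace the loop above as `model = wait_until_active(client.load_model("OPT-6.7B"))`.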

We need to configure the model to generate the way we want, so we set a few important parameters.

* `max_tokens`: The number of tokens the model generates before halting generation.
* `top_k`: Range: 0 to vocabulary size. At each generation step, this is the number of highest-probability tokens to sample from, weighted by their relative likelihoods. Setting this to 1 is "greedy decoding." If `top_k` is set to zero, then we exclusively use nucleus sampling (i.e., `top_p` below).
* `top_p`: Range: 0.0-1.0, nucleus sampling. At each generation step, the tokens with the largest probabilities, adding up to `top_p`, are sampled from relative to their likelihoods.
* `rep_penalty`: Range: >= 1.0. This attempts to decrease the likelihood of tokens in a generation process if they have been generated before. A value of 1.0 means no penalty, and larger values increasingly penalize repeated tokens. 1.2 has been reported as a good default value.
* `temperature`: Range: >= 0.0. This value "sharpens" or flattens the softmax calculation done to produce probabilities over the vocabulary. As temperature goes to zero, only the largest probabilities remain non-zero (approaching greedy decoding). As it approaches infinity, the distribution spreads out evenly over the vocabulary.
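
To make the parameters above concrete, here is a minimal pure-Python sketch of how temperature, `top_k`, `top_p`, and a repetition penalty could act on a single vector of logits. It is an illustration under stated assumptions, not the service's actual decoding code: `filter_and_sample` and `apply_repetition_penalty` are hypothetical names, the order in which the filters are applied is an assumption, and the penalty shown is one common formulation (divide positive logits, multiply negative ones).

```python
import math
import random


def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Lower the logits of already generated token ids (one common formulation, assumed here)."""
    penalised = list(logits)
    for i in set(generated_ids):
        # Dividing a positive logit or multiplying a negative one both reduce its probability.
        penalised[i] = penalised[i] / penalty if penalised[i] > 0 else penalised[i] * penalty
    return penalised


def filter_and_sample(logits, top_k=0, top_p=1.0, temperature=1.0, rng=random):
    """Sample one token index after temperature scaling, top-k and top-p filtering."""
    # Temperature rescales logits before the softmax; values near zero approach greedy decoding.
    scaled = [x / max(temperature, 1e-8) for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top-k: keep only the k most probable tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the surviving tokens in proportion to their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]
print(filter_and_sample(logits, top_k=1))  # greedy decoding: always index 0
print(apply_repetition_penalty(logits, generated_ids=[0, 3], penalty=2.0))
```

In this sketch, a `top_p` at or above 1.0 never truncates the candidate list, so only `top_k` limits the choice in that case.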

In [6]:
long_generation_config = {"max_tokens": 128, "top_k": 4, "top_p": 3, "rep_penalty": 1.2, "temperature": 0.5}
short_generation_config = {"max_tokens": 10, "top_k": 4, "top_p": 3, "rep_penalty": 1.2, "temperature": 0.5}
single_word_generation_config = {"max_tokens": 1, "top_k": 4, "top_p": 3, "rep_penalty": 1.2, "temperature": 0.5}
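
The configurations above can be checked against the ranges listed for each parameter. A minimal sketch: `check_generation_config` is a hypothetical helper and not part of the Lingua SDK, the `max_tokens >= 1` bound is an assumption, and whether the service itself rejects or clamps out-of-range values is not stated here. Note that it flags the `top_p` of 3 used above, which lies outside the 0.0-1.0 range.

```python
def check_generation_config(config):
    """Return a list of messages for values outside the ranges described for each parameter."""
    problems = []
    if config["max_tokens"] < 1:  # assumed lower bound
        problems.append("max_tokens should be at least 1")
    if config["top_k"] < 0:
        problems.append("top_k should be >= 0")
    if not 0.0 <= config["top_p"] <= 1.0:
        problems.append("top_p should lie in [0.0, 1.0]")
    if config["rep_penalty"] < 1.0:
        problems.append("rep_penalty should be >= 1.0")
    if config["temperature"] < 0.0:
        problems.append("temperature should be >= 0.0")
    return problems


# Same values as long_generation_config above.
example_config = {"max_tokens": 128, "top_k": 4, "top_p": 3, "rep_penalty": 1.2, "temperature": 0.5}
print(check_generation_config(example_config))
```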

Let's try a basic prompt for factual information.

__Note__ that if you run the cell multiple times, you'll get different responses due to sampling.

In [7]:
generation = model.generate("What is the capital of Canada?", long_generation_config)
# Extract the text from the returned generation
generation.generation["text"]

['\nOttawa.   Source: Canadian\nWhat is the capital of Australia?\nSydney.  Source: Australian']

In [8]:
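def post_process_generations(generation_text: str) -> str:
    """Keep at most the first three sentences of a generation, joined by single spaces."""
    # Lazily match up to each ".", "!" or "?". Since "." does not cross newlines,
    # text after the last punctuation mark on a line is dropped.
    split_text = re.findall(r".*?[.!\?]", generation_text)[0:3]
    split_text = [text.strip() for text in split_text]
    return " ".join(split_text)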
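
To see what this post-processing does, the sketch below applies it to the raw text returned for the capital-of-Canada prompt earlier. The function is copied from the cell above so that the snippet runs on its own; `sample` and `cleaned` are illustrative names.

```python
import re


def post_process_generations(generation_text: str) -> str:
    # Copied from the cell above: keep up to three punctuation-terminated sentences.
    split_text = re.findall(r".*?[.!\?]", generation_text)[0:3]
    split_text = [text.strip() for text in split_text]
    return " ".join(split_text)


sample = "\nOttawa.   Source: Canadian\nWhat is the capital of Australia?\nSydney.  Source: Australian"
cleaned = post_process_generations(sample)
print(cleaned)  # Ottawa. What is the capital of Australia? Sydney.
```

The fragments "Source: Canadian" and "Source: Australian" disappear because they never reach a `.`, `!`, or `?` before the line ends.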

Now let's create a basic prompt template that we can reuse for multiple text inputs. This will be an instruction prompt with an unconstrained answer space. We'll try several different templates and examine performance for each.

In [9]:
prompt_template_summary_1 = "Summarize the preceding text."
prompt_template_summary_2 = "Short Summary:"
prompt_template_summary_3 = "TLDR;"

In [10]:
with open("resources/news_summary_datasets/examples_news.txt", "r") as file:
    news_stories = file.readlines()

In [11]:
prompts_with_template_1 = [f"{news_story} {prompt_template_summary_1}" for news_story in news_stories]
prompts_with_template_2 = [f"{news_story} {prompt_template_summary_2}" for news_story in news_stories]
prompts_with_template_3 = [f"{news_story} {prompt_template_summary_3}" for news_story in news_stories]
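
The three comprehensions above follow one pattern, which could be expressed once and applied to any number of templates. A minimal sketch, where `build_prompts` is a hypothetical helper; unlike the cell above, it strips each story first, since `readlines()` keeps the trailing newline and that newline would otherwise sit between the story and the template.

```python
def build_prompts(stories, template):
    """Append `template` to each whitespace-stripped story, separated by one space."""
    return [f"{story.strip()} {template}" for story in stories]


# Illustrative stand-ins for the lines read from the news file.
example_stories = ["Story A text.\n", "Story B text.\n"]
templates = ["Summarize the preceding text.", "Short Summary:", "TLDR;"]

prompts_by_template = {template: build_prompts(example_stories, template) for template in templates}
print(prompts_by_template["TLDR;"])
```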

In [12]:
generation_1 = model.generate(prompts_with_template_1, long_generation_config)
print(f"Prompt: {prompt_template_summary_1}")
for summary, original_story in zip(generation_1.generation["text"], news_stories):
    # Let's just take the first 3 sentences, split by periods
    summary = post_process_generations(summary)
    print(f"Original Length: {len(original_story)}, Summary Length: {len(summary)}")
    print(summary)
    print("====================================================================================")
    print("")

Prompt: Summarize the preceding text.
Original Length: 1262, Summary Length: 581
Russia has been capturing some of the US and NATO-provided weapons and equipment left on the battlefield in Ukraine and sending them to Iran, where the US believes Tehran will try to reverse-engineer the systems, four sources familiar with the matter told CNN. Over the last year, US, NATO and other Western officials have seen several instances of Russian forces seizing smaller, shoulder-fired weapons equipment including Javelin anti-tank and Stinger anti-aircraft systems that Ukrainian forces have at times been forced to leave behind on the battlefield, the sources told CNN.

Original Length: 1181, Summary Length: 358
The National Weather Service (NWS) has issued a flash flood watch for the Bay Area and the Central Coast, with a chance of heavy rainfall and thunderstorms. The watch is in effect from Friday morning through Saturday evening. The watch includes the entire Bay Area, from Monterey southward, as

In [13]:
generation_2 = model.generate(prompts_with_template_2, long_generation_config)
for summary, original_story in zip(generation_2.generation["text"], news_stories):
    print(f"Prompt: {prompt_template_summary_2}")
    # Let's just take the first 3 sentences, split by periods
    summary = post_process_generations(summary)
    print(f"Original Length: {len(original_story)}, Summary Length: {len(summary)}")
    print(summary)
    print("====================================================================================")
    print("")

Prompt: Short Summary:
Original Length: 1262, Summary Length: 581
Russia has been capturing some of the US and NATO-provided weapons and equipment left on the battlefield in Ukraine and sending them to Iran, where the US believes Tehran will try to reverse-engineer the systems, four sources familiar with the matter told CNN. Over the last year, US, NATO and other Western officials have seen several instances of Russian forces seizing smaller, shoulder-fired weapons equipment including Javelin anti-tank and Stinger anti-aircraft systems that Ukrainian forces have at times been forced to leave behind on the battlefield, the sources told CNN.

Prompt: Short Summary:
Original Length: 1181, Summary Length: 552

Prompt: Short Summary:
Original Length: 1260, Summary Length: 463
The West Virginia Attorney General’s Office filed an emergency request with the US Supreme Court on Thursday to allow the state to enforce a law that prohibits transgender women and girls from participating in public s

In [14]:
generation_3 = model.generate(prompts_with_template_3, long_generation_config)
for summary, original_story in zip(generation_3.generation["text"], news_stories):
    print(f"Prompt: {prompt_template_summary_3}")
    # Let's just take the first 3 sentences, split by periods
    summary = post_process_generations(summary)
    print(f"Original Length: {len(original_story)}, Summary Length: {len(summary)}")
    print(summary)
    print("====================================================================================")
    print("")

Prompt: TLDR;
Original Length: 1262, Summary Length: 249
Russia is stealing US and NATO-supplied weapons and equipment from Ukraine and sending them to Iran. The US has been fighting a proxy war against Iran since the 1979 revolution in Iran. The US seeks to destabilize Iran's government and overthrow it.

Prompt: TLDR;
Original Length: 1181, Summary Length: 357
California is getting a 2nd storm. California, which is still recovering from the devastating January storms, is bracing for another round of severe weather and flooding. The National Weather Service issued a flash flood watch for the entire state and warned that “a significant rainfall event is expected to begin tonight and continue into early next week.

Prompt: TLDR;
Original Length: 1260, Summary Length: 81
WV AG Morrisey asks SCOTUS to allow state to enforce trans sports ban. More here.
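
The three cells above repeat the same generate-and-collect loop, so it may be convenient to wrap it in a function. A minimal sketch, assuming only the `model.generate(prompts, config)` call and the `generation.generation["text"]` field already used above; `summarize_with_template` is a hypothetical helper, not part of the SDK.

```python
def summarize_with_template(model, stories, template, config, post_process=str.strip):
    """Generate one summary per story for `template` and return (story, summary) pairs."""
    prompts = [f"{story} {template}" for story in stories]
    generation = model.generate(prompts, config)
    summaries = [post_process(text) for text in generation.generation["text"]]
    return list(zip(stories, summaries))
```

In this notebook it could be called as `summarize_with_template(model, news_stories, prompt_template_summary_3, long_generation_config, post_process_generations)`, with the printing done in a loop over the returned pairs.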



Story 1 (the second story, at index 1) is about the possibility of severe flooding in California and an evacuation order being issued. Let's see whether the three summaries capture that, and which template worked best.

In [15]:
print(f"{prompt_template_summary_1}|| {generation_1.generation['text'][1]}")
print("====================================================================================")
print(f"{prompt_template_summary_2}|| {generation_2.generation['text'][1]}")
print("====================================================================================")
print(f"{prompt_template_summary_3}|| {generation_3.generation['text'][1]}")

Summarize the preceding text.|| 
The National Weather Service (NWS) has issued a flash flood watch for the Bay Area and the Central Coast, with a chance of heavy rainfall and thunderstorms. The watch is in effect from Friday morning through Saturday evening.

The watch includes the entire Bay Area, from Monterey southward, as well as the Central Coast from San Luis Obispo to Santa Barbara.

The NWS says the threat of heavy rainfall and thunderstorms will continue through Saturday evening.

The watch also includes the following counties:

Alameda, Contra Costa, Marin, Napa, San Mateo, Santa Cruz and
Short Summary:|| 

TLDR;||  California is getting a 2nd storm.

California, which is still recovering from the devastating January storms, is bracing for another round of severe weather and flooding.

The National Weather Service issued a flash flood watch for the entire state and warned that “a significant rainfall event is expected to begin tonight and continue into early next week.”

The 
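
Length is not a measure of summary quality, but it is a quick way to spot empty generations, such as the apparently blank "Short Summary:" output above. A minimal sketch, where `compare_summaries` is a hypothetical helper and the sample strings are placeholders rather than real model output.

```python
def compare_summaries(original, summaries):
    """Map each template to (summary_length, length_ratio); blank outputs get (0, 0.0)."""
    results = {}
    for template, summary in summaries.items():
        cleaned = summary.strip()
        # Guard against an empty original to avoid dividing by zero.
        ratio = round(len(cleaned) / max(len(original), 1), 3)
        results[template] = (len(cleaned), ratio)
    return results


# Placeholder data: a 100-character "story" and two made-up generations.
example_original = "x" * 100
example_summaries = {"TLDR;": " " + "y" * 25, "Short Summary:": "\n\n"}
print(compare_summaries(example_original, example_summaries))
```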

Can we improve the results by providing additional instructions?

In [16]:
prompt_template_summary_4 = "Summarize the text in as few words as possible:"
prompts_with_template_4 = [f"{news_story} {prompt_template_summary_4}" for news_story in news_stories]
generation_4 = model.generate(prompts_with_template_4, long_generation_config)
for summary, original_story in zip(generation_4.generation["text"], news_stories):
    print(f"Prompt: {prompt_template_summary_4}")
    # Let's just take the first 3 sentences, split by periods
    summary = post_process_generations(summary)
    print(f"Original Length: {len(original_story)}, Summary Length: {len(summary)}")
    print(summary)
    print("====================================================================================")
    print("")

Prompt: Summarize the text in as few words as possible:
Original Length: 1262, Summary Length: 393
US officials believe that Russia is sending captured US and NATO-provided weapons and equipment to Iran, where the US believes Tehran will try to reverse-engineer the systems. The US is worried that the equipment could end up in the hands of terrorists or other bad actors. The US has no evidence that Iran is reverse-engineering the equipment yet, but “that’s what they’re going to try to do.

Prompt: Summarize the text in as few words as possible:
Original Length: 1181, Summary Length: 552

Prompt: Summarize the text in as few words as possible:
Original Length: 1260, Summary Length: 383
The state of West Virginia is asking the US Supreme Court to allow it to enforce a state law that prohibits transgender women and girls from participating in public school sports. What is the background? The state of West Virginia is asking the US Supreme Court to allow it to enforce a state law that prohi

OPT and generative models in general have been reported to perform better when not prompted with "declarative" instructions. Let's ask it as a question!

In [17]:
prompt_template_summary_5 = "How would you briefly summarize the text?"
prompts_with_template_5 = [f"{news_story} {prompt_template_summary_5}" for news_story in news_stories]
generation_5 = model.generate(prompts_with_template_5, long_generation_config)
for summary, original_story in zip(generation_5.generation["text"], news_stories):
    print(f"Prompt: {prompt_template_summary_5}")
    # Let's just take the first 3 sentences, split by periods
    summary = post_process_generations(summary)
    print(f"Original Length: {len(original_story)}, Summary Length: {len(summary)}")
    print(summary)
    print("====================================================================================")
    print("")

Prompt: How would you briefly summarize the text?
Original Length: 1262, Summary Length: 249
The text is about the US-Russian relations in the context of the Cold War. It describes the main problems of the relations between the two countries. The author stresses that the US and Russia have been enemies since the end of the Second World War.

Prompt: How would you briefly summarize the text?
Original Length: 1181, Summary Length: 571

Prompt: How would you briefly summarize the text?
Original Length: 1260, Summary Length: 228
The West Virginia law prohibits trans women and girls from participating in public school sports consistent with their gender identity. The law applies to all sports, including girls’ and women’s sports. What is the legal issue?



Next, let's ask directly what the story is about.

In [18]:
prompt_template_summary_6 = "Briefly, what is this story about?"
prompts_with_template_6 = [f"{news_story} {prompt_template_summary_6}" for news_story in news_stories]
generation_6 = model.generate(prompts_with_template_6, long_generation_config)
for summary, original_story in zip(generation_6.generation["text"], news_stories):
    print(f"Prompt: {prompt_template_summary_6}")
    # Let's just take the first 3 sentences, split by periods
    summary = post_process_generations(summary)
    print(f"Original Length: {len(original_story)}, Summary Length: {len(summary)}")
    print(summary)
    print("====================================================================================")
    print("")

Prompt: Briefly, what is this story about?
Original Length: 1262, Summary Length: 136
That's the question. It's not clear that the story is about anything in particular. I'm not sure that it's about anything in particular.

Prompt: Briefly, what is this story about?
Original Length: 1181, Summary Length: 202
The story is about the severe storms that are predicted to hit California today. What is the forecast for California? The forecast for California is that the storms will hit the area today and tomorrow.

Prompt: Briefly, what is this story about?
Original Length: 1260, Summary Length: 211
The state of West Virginia is asking the US Supreme Court to allow it to enforce a state law that prohibits transgender women and girls from participating in public school sports. What’s the background? GOP Gov.



In [19]:
prompt_template_summary_7 = "In short,"
prompts_with_template_7 = [f"{news_story} {prompt_template_summary_7}" for news_story in news_stories]
generation_7 = model.generate(prompts_with_template_7, long_generation_config)
for summary, original_story in zip(generation_7.generation["text"], news_stories):
    print(f"Prompt: {prompt_template_summary_7}")
    # Let's just take the first 3 sentences, split by periods
    summary = post_process_generations(summary)
    print(f"Original Length: {len(original_story)}, Summary Length: {len(summary)}")
    print(summary)
    print("====================================================================================")
    print("")

Prompt: In short,
Original Length: 1262, Summary Length: 280
the Russians are making sure their clients have the best equipment, and in some cases, they're not even keeping the equipment they're taking. This is a good thing. It means that the Russians aren't stealing it, but are simply taking it from the battlefield and sending it to Iran.

Prompt: In short,
Original Length: 1181, Summary Length: 345
it's a mess. The storm is expected to dump as much as 10 inches of rain in some areas, and the National Weather Service warned that the threat of mudslides will be "high" and that "there is a potential for widespread flooding. In addition to the threat of heavy rain, there is a risk of damaging winds, according to the National Weather Service.

Prompt: In short,
Original Length: 1260, Summary Length: 321
the case is about whether the West Virginia law violates Title IX. That’s the federal law that prohibits sex discrimination in education. The state’s law says that “no individual may parti