### Setup

In [None]:
%pip install transformers
%pip install torch --index-url https://download.pytorch.org/whl/cu126

In [None]:
%pip install --upgrade ipywidgets

In [None]:
%pip install "huggingface_hub[hf_xet]"

### Chat

In [18]:
import torch
torch.cuda.is_available()

True

In [19]:
from transformers import pipeline

In [20]:
chatbot = pipeline(task="text2text-generation", model="facebook/blenderbot-400M-distill")

Device set to use cuda:0


In [21]:
user_message = """
Who are you?
"""

In [22]:
response = chatbot(user_message)
print(response)

[{'generated_text': ' I am a man of many trades.  I am an engineer.  What do you do?'}]
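
The pipeline returns a list with one dict per input, and the reply sits under the `generated_text` key. A minimal sketch of pulling out a clean string, using the output above as sample data (the helper name `extract_reply` is hypothetical, not part of `transformers`):

```python
def extract_reply(response):
    """Return the first generated text with whitespace normalized (hypothetical helper)."""
    raw = response[0]["generated_text"]
    # split() with no arguments drops leading/trailing whitespace and
    # collapses the double spaces seen between sentences in the output above.
    return " ".join(raw.split())

# Sample data copied from the cell output above.
response = [{'generated_text': ' I am a man of many trades.  I am an engineer.  What do you do?'}]
print(extract_reply(response))
```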


### Translation

In [23]:
from transformers import pipeline
import torch

In [28]:
translator = pipeline(task="translation", model="facebook/nllb-200-distilled-600M", dtype=torch.bfloat16)

Device set to use cuda:0


In [29]:
text = """Hello. Who are you and what is the meaning of life?"""

In [30]:
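# src_lang and tgt_lang take FLORES-200 language codes (eng_Latn = English, kan_Knda = Kannada); full list:
# https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200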
text_translated = translator(text, src_lang="eng_Latn", tgt_lang="kan_Knda")

In [31]:
text_translated

[{'translation_text': 'ಹಲೋ. ನೀವು ಯಾರು ಮತ್ತು ಜೀವನದ ಅರ್ಥವೇನು?'}]
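
The FLORES-200 codes are easy to mistype, so it may help to look them up by language name. A minimal sketch, assuming a small hand-picked subset of codes; `FLORES_CODES` and `translate` are hypothetical names, and the function works with any callable that accepts `src_lang` and `tgt_lang` the way the pipeline above does:

```python
# A few FLORES-200 codes; the full list is in the README linked above.
FLORES_CODES = {
    "English": "eng_Latn",
    "Kannada": "kan_Knda",
    "Hindi": "hin_Deva",
    "French": "fra_Latn",
}

def translate(translator, text, source, target):
    """Translate text between two languages given by name (hypothetical helper)."""
    result = translator(
        text,
        src_lang=FLORES_CODES[source],
        tgt_lang=FLORES_CODES[target],
    )
    # The pipeline returns a list with one dict per input.
    return result[0]["translation_text"]

# In the notebook this would be called as:
# translate(translator, text, "English", "Kannada")
```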

In [33]:
# Clean up: delete the pipeline and run garbage collection to free memory for the next model
import gc
del translator
gc.collect()

2925
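
`del` and `gc.collect()` release the Python objects, but PyTorch's caching allocator can keep the freed GPU memory reserved for reuse. A minimal sketch of a cleanup helper that also calls `torch.cuda.empty_cache()`; the name `free_memory` is hypothetical, and the `try`/`except` around the import is only there so the snippet also runs where PyTorch is not installed:

```python
import gc

try:
    import torch
except ImportError:  # assumption: allow the sketch to run without PyTorch
    torch = None

def free_memory():
    """Run garbage collection, then release cached GPU memory if CUDA is in use."""
    collected = gc.collect()
    if torch is not None and torch.cuda.is_available():
        # Hands cached, unused blocks back to the GPU driver.
        torch.cuda.empty_cache()
    return collected

# Typical use in the notebook: `del translator`, then `free_memory()`.
```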

### Summarization

In [3]:
from transformers import pipeline
import torch

In [4]:
summarizer = pipeline(task="summarization", model="facebook/bart-large-cnn", dtype=torch.bfloat16)

Device set to use cuda:0


In [17]:
# Knives Out - Plot
text = """The family of wealthy mystery novelist Harlan Thrombey attends his birthday party at his estate. The next morning, Harlan's housekeeper Fran discovers him dead with a slit throat. Police detectives Lieutenant Elliot and Trooper Wagner believe Harlan died by suicide, but private detective Benoit Blanc is anonymously hired to investigate Harlan's death. Blanc learns Harlan had strained relationships with his family members, giving several of them plausible motives for murder.

Unbeknownst to Blanc, Harlan's nurse Marta Cabrera believes she injected Harlan with a lethal dose of morphine after mixing up his bedtime medications the night of the party. To protect Marta from being blamed for his death, Harlan instructed Marta to create a false alibi before he slit his own throat: she was to be seen leaving the house, sneak back in through a window, and disguise herself as Harlan in order to make it appear that he was alive after she left for the night. Marta cannot lie without vomiting, so she gives accurate but incomplete answers when questioned. She agrees to assist Blanc's investigation and conceals evidence incriminating her. At the reading of Harlan's will, Marta is bequeathed his entire fortune and property, stunning the Thrombeys. Harlan's grandson Ransom Drysdale helps Marta flee but manipulates her into confessing to him. Ransom offers further assistance in exchange for a portion of Marta's inheritance. Meanwhile, the other Thrombeys unsuccessfully attempt to influence Marta to renounce the inheritance, even threatening to have her undocumented mother deported.

Marta receives a blackmail note containing a partial photocopy of Harlan's toxicology report. She and Ransom drive to the medical examiner's office, only to find it burned down. Marta receives an email proposing a meeting with the blackmailer; Blanc and the police spot them, leading to Ransom being arrested. At the meeting, Marta finds that Fran, the blackmailer, has been drugged; she performs CPR on Fran and calls an ambulance. Marta confesses to Blanc, but discovers Ransom has already implicated her. Out of moral obligation, Marta believes she must confess to the Thrombeys, which would invalidate the bequest under the slayer rule.

At the mansion, Marta finds Fran's copy of the full toxicology report, which shows Harlan had only trace amounts of morphine in his blood. Blanc reveals his deductions to the police, Marta, and Ransom. Blanc deduces Harlan told Ransom about his will, prompting Ransom to swap Harlan's medicines to cause Marta to kill him unknowingly. However, Marta actually gave Harlan the correct medication, recognizing it by viscosity without reading the label due to her experience as a nurse; she only believed she had poisoned Harlan after reading the label on the bottle with the switched content. When the death was reported as a suicide, Ransom anonymously hired Blanc to entrap Marta. Fran saw Ransom tampering with the crime scene to remove the switched medications, and sent him the blackmail note. After Ransom realized that Marta was not responsible for Harlan's death but believed she was, he forwarded the blackmail letter to Marta and burned down the medical examiner's office to destroy evidence of her innocence. Ransom then overdosed Fran with morphine, intending for Marta to be caught with Fran's corpse.

The hospital calls; Marta relays that Fran has survived and will implicate Ransom. Ransom insists he will avoid criminal charges because his attempt to kill Fran failed. Marta then vomits on Ransom, revealing she lied: Fran is dead. Realizing he has confessed to the murder, and that the police officers recorded his confession, Ransom grabs a knife from Harlan's collection and attacks Marta, but the knife is a harmless retractable stage knife, and the police promptly arrest him.

Blanc tells Marta he suspected early on that she played a part in Harlan's death, noting a drop of blood on her shoe. He tells Marta her innocence prevailed because she made ethical choices that obstructed Ransom's attempts to incriminate her. As Ransom is taken into custody and the rest of the family is gathered outside in defeat, Marta watches from the balcony of what is now her mansion, sipping from Harlan's coffee mug that reads "My House, My Rules, My Coffee!!". """

In [None]:
summary = summarizer(text,
                     min_length=10,
                     max_length=100)

In [15]:
summary

[{'summary_text': "The family of wealthy mystery novelist Harlan Thrombey attends his birthday party at his estate. The next morning, Harlan's housekeeper Fran discovers him dead with a slit throat. Police detectives Lieutenant Elliot and Trooper Wagner believe Harlan died by suicide, but private detective Benoit Blanc is anonymously hired to investigate. Blanc learns Harlan had strained relationships with his family members, giving several of them plausible motives for murder."}]
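
The summary above reproduces the first paragraph almost verbatim and says nothing about the rest of the plot. One possible workaround, sketched here as an assumption and not something this notebook ran, is to summarize each paragraph on its own and join the results. `summarize_by_paragraph` is a hypothetical helper; it takes the summarizer as a plain callable so it does not depend on any particular model:

```python
def summarize_by_paragraph(text, summarize_fn):
    """Summarize each paragraph separately and join the results.

    summarize_fn is any callable that maps a string to a summary string.
    """
    # Paragraphs in the plot text are separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return " ".join(summarize_fn(p) for p in paragraphs)

# In the notebook, summarize_fn could wrap the pipeline, for example:
# bart = lambda p: summarizer(p, min_length=10, max_length=60)[0]["summary_text"]
# print(summarize_by_paragraph(text, bart))
```

This trades coherence for coverage: every paragraph contributes something, but the pieces are summarized without knowledge of each other.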

In [None]:
# BART tends to copy the opening sentences of its input into the summary.
# As a workaround, drop the first two sentences and summarize the rest,
# so the model has to draw on later parts of the text.

sentences = text.split(". ")
# Skip the first 2 sentences and summarize the middle/end
text_without_beginning = ". ".join(sentences[2:])

print("Summarizing without beginning sentences:")
summary_no_start = summarizer(text_without_beginning, min_length=50, max_length=150)
print(summary_no_start[0]['summary_text'])
print("\n" + "="*80 + "\n")

# For comparison, summarize the full text with the same length limits
print("Original BART summary (full text):")
summary_original = summarizer(text, min_length=50, max_length=150)
print(summary_original[0]['summary_text'])

Loading Pegasus model...


Some weights of PegasusForConditionalGeneration were not initialized from the model checkpoint at google/pegasus-cnn_dailymail and are newly initialized: ['model.decoder.embed_positions.weight', 'model.encoder.embed_positions.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


ImportError: 
 requires the protobuf library but it was not found in your environment. Check out the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.
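
The `ImportError` above says the protobuf package is missing from the environment. A small sketch for checking this before loading a model; `is_importable` is a hypothetical helper, and `google.protobuf` is the module name that the `protobuf` package installs:

```python
import importlib.util

def is_importable(module_name):
    """Return True if the module can be found without actually importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "google") is itself missing.
        return False

if not is_importable("google.protobuf"):
    print("protobuf is missing; run `%pip install protobuf` and restart the kernel")
```

As the error message notes, the runtime may need a restart after installation before the import succeeds.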
