# Text Generation, Summarization, and Question Answering with Transformers


This notebook demonstrates how to use the `transformers` library for:
1. Generating text with GPT-2.
2. Summarizing documents with BART.
3. Answering questions using pre-trained models.

It also shows how to extract text from a PDF file and use it as the context for question answering.


## Text Generation with GPT-2

In [None]:
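from transformers import pipeline, set_seed

# Load GPT-2 Large from a local directory.
generator = pipeline("text-generation", model="models/gpt2-large")

# Fix the random seed so the sampled continuations are reproducible.
set_seed(42)

# max_length counts tokens and includes the prompt.
# print() is used because a notebook cell only displays its last expression.
print(generator("The man worked as a", max_length=10, num_return_sequences=5))
print(generator("The woman worked as a", max_length=10, num_return_sequences=5))
print(generator("LLM workshop is", max_length=100, num_return_sequences=5))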
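
Each call to the text-generation pipeline returns a list of dictionaries, one per sequence, with the text stored under the `generated_text` key. The following is a minimal sketch for pulling out just the strings. The helper `extract_texts` and the sample `outputs` are hypothetical, written by hand for illustration; they are not real GPT-2 output.

```python
def extract_texts(outputs):
    """Return the generated strings from a text-generation pipeline result."""
    return [item["generated_text"] for item in outputs]


# Hand-written stand-in for a pipeline result (not real model output).
outputs = [
    {"generated_text": "The man worked as a placeholder one"},
    {"generated_text": "The man worked as a placeholder two"},
]

for i, text in enumerate(extract_texts(outputs), start=1):
    print(f"{i}: {text}")
```

In the notebook, the same helper could be applied directly to the return value of `generator(...)`.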

## Summarization with BART

In [None]:
summarize_model = pipeline("summarization", model="models/bart-large-cnn")

txt = '''
Team India's below-par performance in the Border-Gavaskar Trophy could see big changes in the team and the leadership group. Rohit Sharma's captaincy is under the scanner and the selectors could take a call on him if India fail to reach the World Test Championship final. He has also struggled with the bat and only managed 31 runs in the ongoing series.
Amid India's poor performance in Australia, the Indian Express has reported that a senior player is portraying to be 'Mr Fix-it." The report states that the senior player is ready to project himself as an interim option for captaincy as he isn't convinced about the young players. The report doesn't mention the name of the senior player.
The report adds that Rohit may take a call about his career after the Border-Gavaskar Trophy. He made his ODI and T20I captaincy debut in 2007. Rohit made his Test debut in 2013.
'''

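# max_length is measured in tokens, so a quarter of the word count is only a rough budget.
# min_length is passed explicitly so that it can never exceed max_length.
max_len = max(len(txt.split()) // 4, 2)
summarize_model(txt, max_length=max_len, min_length=min(10, max_len - 1), do_sample=False)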

In [None]:
txt = '''This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. NVIDIA
Corporation (“NVIDIA”) makes no representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and
assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents
or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or
functionality.
'''

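# max_length is measured in tokens, so a quarter of the word count is only a rough budget.
# min_length is passed explicitly so that it can never exceed max_length.
max_len = max(len(txt.split()) // 4, 2)
summarize_model(txt, max_length=max_len, min_length=min(10, max_len - 1), do_sample=False)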
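
The summarization pipeline returns a list with one dictionary per input, and the summary sits under the `summary_text` key. To check how aggressively a text was shortened, one rough option is to compare word counts. The sketch below assumes whitespace-separated words are a good enough unit; `compression_ratio` and the sample `result` are hypothetical and hand-written, not BART output.

```python
def compression_ratio(original, summary):
    """Summary length as a fraction of the original, in whitespace-separated words."""
    n_original = len(original.split())
    if n_original == 0:
        return 0.0  # avoid dividing by zero on empty input
    return len(summary.split()) / n_original


original = "one two three four five six seven eight"
# Hand-written stand-in for a summarization pipeline result.
result = [{"summary_text": "one two"}]

ratio = compression_ratio(original, result[0]["summary_text"])
print(f"{ratio:.2f}")
```

Because `max_length` is counted in tokens rather than words, the ratio measured this way will generally not match the fraction used to set `max_length`.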

## Question Answering with Transformers

In [None]:
# Three extractive question-answering checkpoints, loaded from local directories.
question_model = pipeline("question-answering", model="models/roberta-base-squad2")
question_model_legal = pipeline("question-answering", model="models/bert-large-question-answering-finetuned-legal")
question_model_bert = pipeline("question-answering", model="models/distilbert-base-cased-distilled-squad")

# txt still holds the NVIDIA disclaimer from the previous cell.
query = "what are customer's responsibilities"

# Ask every model the same question and print its three best answer spans.
for model in (question_model, question_model_legal, question_model_bert):
    res = model(question=query, context=txt, top_k=3)
    print(res)
    print()
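
With `top_k=3`, each question-answering call returns a list of dictionaries containing `score`, `start`, `end`, and `answer`. The sketch below picks the highest-scoring candidate across several models. The helper `best_answer`, the model labels, and the sample scores are all hypothetical, and scores from different models are not calibrated against each other, so this is only a rough comparison.

```python
def best_answer(results_by_model):
    """Return (model_name, answer, score) for the top-scoring candidate, or None."""
    candidates = [
        (name, cand["answer"], cand["score"])
        for name, results in results_by_model.items()
        for cand in results
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[2])


# Hand-written stand-ins for pipeline results (not real model output).
sample = {
    "roberta": [
        {"score": 0.40, "start": 0, "end": 5, "answer": "alpha"},
        {"score": 0.10, "start": 6, "end": 10, "answer": "beta"},
    ],
    "distilbert": [
        {"score": 0.55, "start": 11, "end": 16, "answer": "gamma"},
    ],
}

print(best_answer(sample))
```

In the notebook, the dictionary could be built from the three `res` lists, keyed by whatever label is convenient for each model.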

## Reading and Extracting Text from PDF for Question Answering

In [None]:
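from PyPDF2 import PdfReader

def read_pdf(file_path):
    """Return the text of every page in the PDF, joined by newlines."""
    reader = PdfReader(file_path)
    # Guard against pages with no extractable text (for example, scanned images).
    pages = [page.extract_text() or "" for page in reader.pages]
    return "\n".join(pages)

file_path = "documents/LLM.pdf"
pdf_content = read_pdf(file_path)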

In [None]:
query = "what are ways we can build LLMs?"

# Ask every model the same question, using the extracted PDF text as context.
for model in (question_model, question_model_legal, question_model_bert):
    res = model(question=query, context=pdf_content, top_k=3)
    print(res)
    print()
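
Text extracted from a PDF is often much longer than a model's input window. The question-answering pipeline splits long contexts into overlapping windows on its own, but splitting the text manually can make it easier to see which part of the document an answer came from. The following is a minimal sketch of word-based chunking with overlap. The helper `chunk_words` and its default sizes are assumptions chosen for illustration, not values taken from the notebook or the library.

```python
def chunk_words(text, size=200, overlap=50):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Tiny hand-made example: ten "words", chunks of four with an overlap of two.
demo = " ".join(str(i) for i in range(10))
for chunk in chunk_words(demo, size=4, overlap=2):
    print(chunk)
```

Each chunk could then be passed as `context` to one of the question-answering pipelines, keeping the chunk index alongside each result.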