In [4]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load pre-trained model and tokenizer
model_name = "gpt2"
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

# Example PDF text
pdf_text = """
Introduction
The history of natural language processing (NLP) generally started in the 1950s, although work can be found from earlier periods. In 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence.

NLP research has evolved along with the development of computers and artificial intelligence (AI). The initial goal was to enable computers to understand and generate human language. Early systems were based on complex sets of hand-written rules.

In recent years, deep learning techniques, especially transformer models like GPT, have revolutionized NLP. These models learn to understand and generate text by training on large amounts of data.

Question Answering
Question answering (QA) is a common task in NLP where a system is asked a question and it provides an answer. QA systems can range from simple rule-based systems to advanced deep learning models.

QA systems can be trained on various types of data, including Wikipedia articles, textbooks, or user-generated content. The goal is to provide accurate and relevant answers to a wide range of questions.

Implementation
We can implement a simple QA system using a pre-trained language model like GPT-2. We'll fine-tune the model on a dataset of QA pairs, and then use it to generate answers to user questions.

Let's get started with the implementation!
"""

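# Tokenize PDF text.
# Note: these ids are never passed to generate_response() below, so the
# chatbot's answers are not conditioned on the document.
input_ids = tokenizer.encode(pdf_text, return_tensors="pt")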

# Generate a response by greedy decoding from the prompt alone
def generate_response(prompt, max_length=100):
    # max_length counts the prompt tokens plus the generated tokens
    prompt_ids = tokenizer.encode(prompt, return_tensors="pt")
    output = model.generate(prompt_ids, max_length=max_length, pad_token_id=tokenizer.eos_token_id)
    # The decoded text includes the prompt, followed by the continuation
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    return response

# Chat interface
print("Welcome to PDF Chatbot! Type 'exit' to end the conversation.")
while True:
    user_input = input("You: ")
    if user_input.lower() == 'exit':
        print("PDF Chatbot: Goodbye!")
        break
    response = generate_response(user_input)
    print("PDF Chatbot:", response)


Welcome to PDF Chatbot! Type 'exit' to end the conversation.
You: What is the history of natural language processing
PDF Chatbot: What is the history of natural language processing?

The history of natural language processing is a long one. The first major breakthrough was the discovery of the word "noun" in 1859. The word "noun" was first used in the English language in 1859, and it was used in the English language for a long time. The word "noun" was used in the English language for a long time. The word "noun" was used in the English language for
You: exit
PDF Chatbot: Goodbye!
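
The transcript above shows two effects of the cell as written: the answer ignores `pdf_text` (only the user's question reaches the model), and greedy decoding drifts into repetition. One possible way to address both is sketched below. The helper names `build_prompt` and `strip_prompt`, the `Context:/Question:/Answer:` template, the 2000-character cap, and the values in `GENERATION_KWARGS` are all assumptions made for illustration; only `max_new_tokens` and `no_repeat_ngram_size` are existing `generate` arguments. Base GPT-2 is not trained for question answering, so this may improve grounding but does not guarantee correct answers.

```python
# A minimal sketch, not the notebook's original method: pure-Python helpers
# that put the document text into the prompt and clean up the decoded output.

def build_prompt(context: str, question: str, max_context_chars: int = 2000) -> str:
    """Prepend document text to the question so the model can condition on it."""
    context = " ".join(context.split())      # collapse newlines and repeated spaces
    context = context[:max_context_chars]    # crude character cap (assumed value)
    return f"Context: {context}\n\nQuestion: {question.strip()}\nAnswer:"


def strip_prompt(generated: str, prompt: str) -> str:
    """Drop the echoed prompt from the decoded text when it appears verbatim."""
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()


# Assumed decoding settings: bound only the newly generated tokens, and forbid
# any 3-gram from repeating to reduce loops like the one in the transcript.
GENERATION_KWARGS = {"max_new_tokens": 60, "no_repeat_ngram_size": 3}
```

To try this with the cell above, one could build `prompt = build_prompt(pdf_text, user_input)`, encode it as before, and call `model.generate(prompt_ids, pad_token_id=tokenizer.eos_token_id, **GENERATION_KWARGS)` in place of the `max_length` argument. Because decoding can normalize spacing, slicing the output ids with `output[0][prompt_ids.shape[1]:]` before decoding is a more reliable way to drop the prompt than the string-based `strip_prompt`.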
