# Simple Bible Query

### Installing libraries

In [None]:
pip install PyPDF2 nltk

In [5]:
import PyPDF2
import nltk
from nltk.tokenize import sent_tokenize

In [None]:
pip install torch transformers

In [15]:
def read_pdf(file_path):
    # Open the PDF file in binary mode
    with open(file_path, 'rb') as file:
        # Create a PdfReader object from the PDF file
        pdf_reader = PyPDF2.PdfReader(file)
        
        # Extract the text of each page; a page with no extractable text yields ''
        content = [page.extract_text() or '' for page in pdf_reader.pages]
    
    # Combine the text content from all pages into a single string
    return '\n'.join(content)
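
Text extracted from a PDF often keeps the line breaks of the page layout, which can split sentences and hyphenated words. The sketch below shows one way to tidy the string returned by `read_pdf` before passing it on. `normalize_extracted_text` is a hypothetical helper, not part of PyPDF2, and the sample string is made up for illustration. Note that rejoining hyphenated words will also merge genuine hyphenated compounds that happen to break at a line end.

```python
import re

def normalize_extracted_text(raw_text):
    """Tidy text extracted from a PDF (hypothetical helper, not part of PyPDF2)."""
    # Rejoin words hyphenated across a line break: "Testa-\nment" -> "Testament"
    text = re.sub(r'(\w)-\n(\w)', r'\1\2', raw_text)
    # Turn single line breaks into spaces, keeping blank lines as paragraph breaks
    text = re.sub(r'(?<!\n)\n(?!\n)', ' ', text)
    # Collapse runs of spaces and tabs into one space
    text = re.sub(r'[ \t]+', ' ', text)
    return text.strip()

# Made-up sample, not taken from the PDF used in this notebook
sample = "The Old Testa-\nment has\n39 books.\n\nGenesis  is first."
print(normalize_extracted_text(sample))
```

With this in place, `document_content = normalize_extracted_text(read_pdf(file_path))` would be one way to use it.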


In [16]:
# Download the 'punkt' sentence tokenizer models from NLTK (Natural Language Toolkit).
# Note: recent NLTK releases may ask for the 'punkt_tab' resource instead.
nltk.download('punkt')

# Import the sentence tokenizer from NLTK
from nltk.tokenize import sent_tokenize

def tokenize_text(text):
    # Tokenize the input text into sentences using NLTK's sentence tokenizer
    sentences = sent_tokenize(text)
    return sentences

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\vannor\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
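
`tokenize_text` is defined above but not used in the cells that follow. One possible use, sketched here under assumptions, is to narrow the context to the sentences that share the most words with the question before handing it to the model. `select_relevant_sentences` and its `top_k` parameter are hypothetical names, the word-overlap scoring is a deliberately simple stand-in for a real retrieval step, and the sample sentences are made up rather than taken from the PDF.

```python
import re

def select_relevant_sentences(sentences, query, top_k=3):
    """Return up to top_k sentences sharing the most words with the query, in document order."""
    query_words = set(re.findall(r'\w+', query.lower()))
    scored = []
    for index, sentence in enumerate(sentences):
        sentence_words = set(re.findall(r'\w+', sentence.lower()))
        overlap = len(query_words & sentence_words)
        if overlap > 0:
            scored.append((overlap, index, sentence))
    # Highest overlap first; ties keep document order
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:top_k]
    # Restore document order for the selected sentences
    return [sentence for _, _, sentence in sorted(best, key=lambda item: item[1])]

# Made-up sentences standing in for the output of tokenize_text(...)
sentences = [
    "Genesis is the first book of the Bible.",
    "The Old Testament has 39 books.",
    "Jesus wept.",
]
print(select_relevant_sentences(sentences, "what is the first book of the bible", top_k=1))
```

The selected sentences could then be joined with `' '.join(...)` and passed as the context instead of the whole document.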


### Using BERT Model

BERT (Bidirectional Encoder Representations from Transformers) is a natural language processing model that reads each word in the context of the words on both its left and its right, which helps it capture meaning that depends on context. The checkpoint used here, `bert-large-uncased-whole-word-masking-finetuned-squad`, is fine-tuned on the SQuAD question-answering dataset: given a question and a passage, it returns the span of the passage most likely to contain the answer, together with a confidence score.
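
The checkpoint used in this notebook, `bert-large-uncased-whole-word-masking-finetuned-squad`, is fine-tuned for extractive question answering: models of this kind score each token as a possible start and a possible end of the answer, and the answer is the best-scoring span. The toy sketch below illustrates only that span selection step. The function name `best_answer_span`, the `max_answer_len` parameter, the tokens, and the scores are all made up for illustration; the `transformers` pipeline does this internally, along with further processing not shown here.

```python
def best_answer_span(start_scores, end_scores, max_answer_len=15):
    """Return (start, end) token indices maximizing start_scores[start] + end_scores[end]."""
    best_span = (0, 0)
    best_score = float('-inf')
    for start, start_score in enumerate(start_scores):
        # Only consider spans that end at or after the start and are not too long
        last_end = min(start + max_answer_len, len(end_scores))
        for end in range(start, last_end):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)
    return best_span

# Made-up tokens and scores, not real model output
tokens = ['the', 'first', 'book', 'is', 'genesis', '.']
start_scores = [0.1, 0.2, 0.1, 0.0, 4.0, 0.1]
end_scores = [0.0, 0.1, 0.3, 0.1, 3.5, 0.2]

start, end = best_answer_span(start_scores, end_scores)
answer = ' '.join(tokens[start:end + 1])
print(answer)
```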


In [17]:
from transformers import pipeline

# Load the question answering pipeline
qa_pipeline = pipeline('question-answering', model='bert-large-uncased-whole-word-masking-finetuned-squad')

def process_query_bert(text, query):
    # Use BERT for question answering
    result = qa_pipeline(context=text, question=query)
    
    # Check if the answer confidence is above a certain threshold
    if result['score'] > 0.5:
        return f"Answer: {result['answer']}"
    
    return "No relevant information found for the query."

# Usage example:
file_path = 'bible_facts2.pdf'
document_content = read_pdf(file_path)

# Prompt the user to input a question
user_query = input("Ask a question: ")

# Process the user's query using BERT and print the result
result = process_query_bert(document_content, user_query)
print(result)


Some weights of the model checkpoint at bert-large-uncased-whole-word-masking-finetuned-squad were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Ask a question: what is the first book of the bible
Answer: Genesis
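
The fixed 0.5 cutoff in `process_query_bert` hides how confident the model was when it declines to answer. A minimal sketch of an alternative, assuming the result dict has the `score` and `answer` keys used above: make the threshold a parameter and report the score either way. `format_answer` is a hypothetical helper, and the result dicts below are made up rather than real model output.

```python
def format_answer(result, threshold=0.5):
    """Format a question-answering result dict, reporting the confidence score."""
    score = result['score']
    if score > threshold:
        return f"Answer: {result['answer']} (score: {score:.2f})"
    return f"No confident answer (best guess: {result['answer']!r}, score: {score:.2f})"

# Made-up result dicts in the shape returned by the question-answering pipeline
confident = {'score': 0.93, 'start': 10, 'end': 17, 'answer': 'Genesis'}
uncertain = {'score': 0.21, 'start': 40, 'end': 42, 'answer': '39'}

print(format_answer(confident))
print(format_answer(uncertain))
print(format_answer(uncertain, threshold=0.2))
```

Seeing the score for rejected answers makes it easier to judge whether 0.5 is a sensible cutoff for a given document.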


### Adding a loop so users can keep asking questions without rerunning the script

In [None]:
from transformers import pipeline

# Load the question answering pipeline
qa_pipeline = pipeline('question-answering', model='bert-large-uncased-whole-word-masking-finetuned-squad')

def process_query_bert(text, query):
    """
    Process a user query using BERT-based question answering.

    Parameters:
    - text (str): The document or context for BERT to analyze.
    - query (str): The user's question to be answered.

    Returns:
    - str: The answer to the user's question or a message indicating no relevant information.
    """
    # Use BERT for question answering
    result = qa_pipeline(context=text, question=query)
    
    # Check if the answer confidence is above a certain threshold
    if result['score'] > 0.5:
        return f"Answer: {result['answer']}"
    
    return "No relevant information found for the query."

# Usage example:
file_path = 'bible_facts2.pdf'
document_content = read_pdf(file_path)

while True:
    user_query = input("Ask a question (type 'exit' to end): ")
    
    # Ignore surrounding whitespace and letter case when checking for the exit command
    if user_query.strip().lower() == 'exit':
        print("Exiting the question-answering loop.")
        break
    
    result = process_query_bert(document_content, user_query)
    print(result)




Ask a question (type 'exit' to end): what is the first book of the bible
Answer: Genesis
Ask a question (type 'exit' to end): what is the first last of the bible
No relevant information found for the query.
Ask a question (type 'exit' to end): what is the shortest verseof the bible
No relevant information found for the query.
Ask a question (type 'exit' to end): what is the shortest verse in the bible
No relevant information found for the query.
Ask a question (type 'exit' to end): what does bible mean
No relevant information found for the query.
Ask a question (type 'exit' to end): what is the first book of the bible
Answer: Genesis
Ask a question (type 'exit' to end): first book of the biblw
No relevant information found for the query.
Ask a question (type 'exit' to end): first book of the bible
No relevant information found for the query.
Ask a question (type 'exit' to end): number of books in the old testament
Answer: 39
Ask a question (type 'exit' to end): how many books are there