# **Hugging Face QA**

### **Step 1 : Creating Vector DB And Returning The Most Relevant Chunk**
#### **Models :**
##### **1. all-mpnet-base-v2**
##### **2. distilbert-base-nli-mean-tokens**

In [12]:
from sentence_transformers import SentenceTransformer
import faiss
from transformers import pipeline, BartForConditionalGeneration, BartTokenizer, GPT2LMHeadModel, GPT2Tokenizer
import numpy as np
import nltk
from nltk.tokenize import sent_tokenize
import warnings
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=RuntimeWarning)
warnings.filterwarnings('ignore', category=UserWarning)


In [30]:
def chunker(max_chunk_length=50):
    with open("Extract_Content.txt","r",encoding="utf-8") as fh: 
        content = fh.read()

    sentences = sent_tokenize(content)
    chunks = []
    current_chunk = []
    current_length = 0

    for sentence in sentences:
        sentence_length = len(sentence.split())
        if current_length + sentence_length <= max_chunk_length:
            current_chunk.append(sentence)
            current_length += sentence_length
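        else:
            # Flush the current chunk. The guard avoids appending an empty
            # string when the very first sentence exceeds max_chunk_length.
            if current_chunk:
                chunks.append(' '.join(current_chunk))
            current_chunk = [sentence]
            current_length = sentence_length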

    if current_chunk:
        chunks.append(' '.join(current_chunk))

    return chunks
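
The greedy packing that `chunker` performs can be checked on a small in-memory list, without reading `Extract_Content.txt` or calling `sent_tokenize`. The sketch below is an illustration only: `pack_sentences` and the sample sentences are hypothetical and not part of the notebook.

```python
def pack_sentences(sentences, max_chunk_length=50):
    """Greedily group sentences into chunks of at most max_chunk_length words.

    A single sentence longer than the limit is kept whole in its own chunk.
    """
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk only if the current one is non-empty and would overflow.
        if current and current_len + n_words > max_chunk_length:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


# Made-up sentences, used only to exercise the packing logic.
sample = ["One two three.", "Four five.", "Six seven eight nine."]
packed = pack_sentences(sample, max_chunk_length=5)
print(packed)
```

With a limit of 5 words, the first two sentences fit together and the third starts a new chunk.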

In [32]:
text_chunks = chunker()
for i in text_chunks:
    print(i,"\n")

In 1667, a Danish scientist finally concluded that certain mysterious stones prized for their supposed medicinal powers, hadn't fallen from the sky during lunar eclipses and weren't serpent tongues.

In fact, they were fossilized teeth, many belonging to a prehistoric species that would come to be called megalodon, the biggest shark to ever live. So what was it like when megalodon ruled the seas? And what brought this formidable predator to extinction?

Because their skeletons were cartilaginous, what remains of megalodons are mostly scattered clues, like some isolated vertebrae and lots of their enamel-protected teeth. Like many sharks, megalodons could shed and replace thousands of teeth over the course of their lives. 

Interestingly, some fossil sites harbor especially high numbers of small megalodon teeth. Experts believe these were nurseries that supported countless generations of budding megalodons. They grew up in sheltered  and food-packed shallow waters before becoming unri

In [33]:
def get_vector_chunk_single(model_name, query):
    model = SentenceTransformer(model_name)

    embeddings = model.encode(text_chunks)
    embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

    index = faiss.IndexFlatIP(embeddings.shape[1])
    index.add(embeddings)

    query_embedding = model.encode([query])[0]
    query_embedding = query_embedding / np.linalg.norm(query_embedding)

    D, I = index.search(query_embedding.reshape(1, -1), k=2)

    most_relevant_chunk = text_chunks[I[0][0]]
    return(f"{most_relevant_chunk}")


def Question_Answer(query,chunks):
    def answer_chunk(query=query,chunks=chunks):
        text_chunks = chunks

        #Function To Get Answer Chunk (Single Model)
        def get_vector_chunk_single(model_name, query, text_chunks=text_chunks):
            model = SentenceTransformer(model_name)

            embeddings = model.encode(text_chunks)
            embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

            index = faiss.IndexFlatIP(embeddings.shape[1])
            index.add(embeddings)

            query_embedding = model.encode([query])[0]
            query_embedding = query_embedding / np.linalg.norm(query_embedding)

            D, I = index.search(query_embedding.reshape(1, -1), k=2)

            most_relevant_chunk = text_chunks[I[0][0]]
            return(f"{most_relevant_chunk}")


        models = ["all-mpnet-base-v2","distilbert-base-nli-mean-tokens"]
        output_chunks = list()
        for model in models:
            output_chunks.append(get_vector_chunk_single(model, query=query))
        if output_chunks[0]==output_chunks[1]:
            return(output_chunks[0])
        else:
            return(output_chunks[0]+"\n"+output_chunks[1])
    
    context = answer_chunk()

    return context


question = "How do scientists confirm they were apex predators?"
print(Question_Answer(question,text_chunks))

Generally, as carnivores consume protein-rich meat, certain nitrogen isotopes accumulate in their tissues, including the enamel of their teeth. Analyzing megalodon teeth, scientists confirmed they were apex predators that not only ate large prey species, but also other predators, perhaps even each other.
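
`get_vector_chunk_single` L2-normalizes the embeddings before adding them to `faiss.IndexFlatIP`, so the inner product that the index ranks by is the same as cosine similarity. A minimal standard-library sketch of that identity follows. The 3-dimensional vectors are made up and stand in for real embeddings, and the helper names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Made-up vectors standing in for chunk and query embeddings.
chunk_vectors = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
query_vector = [2.0, 4.0, 0.5]

# Inner product of normalized vectors, which is what IndexFlatIP scores by.
query_unit = l2_normalize(query_vector)
scores = [dot(l2_normalize(v), query_unit) for v in chunk_vectors]

# Index of the highest-scoring chunk, analogous to I[0][0] in the notebook.
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores)
```

Skipping the normalization step would make longer vectors score higher regardless of direction, which is why the notebook divides by the norm first.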


### **Testing Model For Most Accurate Response**

In [34]:
models = ["all-mpnet-base-v2","distilbert-base-nli-mean-tokens"]
questions = ["What evidence do scientists have that suggests megalodons were apex predators?"]
for model in models:
    for question in questions:
        print(f"\n\n{model}")
        print(f"{question}")
        print(get_vector_chunk_single(model, question))



all-mpnet-base-v2
What evidence do scientists have that suggests megalodons were apex predators?
Generally, as carnivores consume protein-rich meat, certain nitrogen isotopes accumulate in their tissues, including the enamel of their teeth. Analyzing megalodon teeth, scientists confirmed they were apex predators that not only ate large prey species, but also other predators, perhaps even each other.


distilbert-base-nli-mean-tokens
What evidence do scientists have that suggests megalodons were apex predators?
Generally, as carnivores consume protein-rich meat, certain nitrogen isotopes accumulate in their tissues, including the enamel of their teeth. Analyzing megalodon teeth, scientists confirmed they were apex predators that not only ate large prey species, but also other predators, perhaps even each other.


### **Step 2 : Getting Answer From The Most Relevant Chunk**
#### **Models :**
##### **1. deepset/roberta-large-squad2**
##### **2. bert-large-uncased-whole-word-masking-finetuned-squad**

In [21]:
from transformers import pipeline

# Load the question-answering pipeline with adjusted parameters
qa_pipeline = pipeline("question-answering", model="deepset/roberta-large-squad2", tokenizer="deepset/roberta-large-squad2", model_kwargs={"max_length": 200})

# Define the context and question
context = """
Generally, as carnivores consume protein-rich meat, certain nitrogen isotopes accumulate in their tissues,
including the enamel of their teeth. Analyzing megalodon teeth, scientists confirmed they were apex predators
that not only ate large prey species, but also other predators, perhaps even each other.
"""
question = "How do scientists confirm they were apex predators?"

# Get the answer with adjusted length
result = qa_pipeline(question=question, context=context)

# Print the answer
print(result['answer'])

Analyzing megalodon teeth
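
The question-answering pipeline returns a dictionary with `answer`, `score`, `start`, and `end` keys, and only `answer` is printed above. A low `score` can signal that the retrieved chunk does not really contain the answer, so it may be worth checking before the span is passed on as a keyword. A minimal sketch, in which `accept_answer`, the `min_score` threshold, and both sample dictionaries are illustrative assumptions rather than values produced by this notebook:

```python
def accept_answer(result, min_score=0.3):
    """Return the extracted span, or None if it is empty or low-confidence.

    `min_score` is an arbitrary illustrative threshold, not a tuned value.
    """
    answer = result.get("answer", "").strip()
    if not answer or result.get("score", 0.0) < min_score:
        return None
    return answer


# Hand-written dictionaries in the shape the pipeline returns; the numbers are made up.
confident = {"answer": "Analyzing megalodon teeth", "score": 0.8, "start": 0, "end": 25}
uncertain = {"answer": "teeth", "score": 0.05, "start": 20, "end": 25}

print(accept_answer(confident))
print(accept_answer(uncertain))
```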


In [4]:
import google.generativeai as genai
from g4f.client import Client

## **Main Q/A Function**

In [47]:
import os

def gemini_answer(question, answer):
    try:
        # Read the API key from the environment instead of hardcoding it.
        # GEMINI_API_KEY is an assumed variable name; set it before running this cell.
        gemini_api = os.environ["GEMINI_API_KEY"]
        genai.configure(api_key=gemini_api)
        model = genai.GenerativeModel('gemini-1.5-flash')
        response = model.generate_content(f"question: {question}, keyword: {answer}\nwrite me a detailed answer to the question including the keyword")
        return response.text

    except Exception as e:
        print(f"Error: {e}")
        return None

#Example usage:
question = "How do scientists confirm they were apex predators?"
answer = "Analyzing megalodon teeth"
gemini_response = gemini_answer(question, answer)
print(gemini_response)


##  Confirming Apex Predators: A Bite into Megalodon's Past

While we can't directly observe megalodon hunting, scientists rely on a treasure trove of fossilized evidence to confirm its apex predator status. The most significant piece of this puzzle is **analyzing megalodon teeth**.

**Here's how scientists use these prehistoric chompers to piece together the megalodon's role in the ancient ocean:**

* **Tooth Size and Shape:**  Megalodon teeth are truly monstrous, some reaching over 7 inches long! This sheer size indicates a powerful bite capable of crushing bone and tearing through flesh. Their serrated edges suggest a predatory lifestyle, designed to kill large prey.

* **Tooth Wear Patterns:** By examining wear patterns on the teeth, scientists can infer the types of food megalodon consumed.  Evidence of deep gouges and scrapes, coupled with the absence of crushing surfaces, indicates a preference for large, flesh-filled prey.

* **Isotope Analysis:**  The chemical composition of f

In [49]:
def gpt_answer(question, answer):
    try:
        client = Client()

        sample_prompt = "Hi"
        sample_response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": sample_prompt}]
        )
        
        prompt = f"Generate a detailed answer to the question: {question}, including the keyword {answer}"
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content
    
    except Exception as e:
        print(f"Error: {e}")
        return None



#Example usage:
question = "How do scientists confirm they were apex predators?"
answer = "Analyzing megalodon teeth"
gpt_response = gpt_answer(question, answer)
print(gpt_response)

Scientists confirm that an animal was an apex predator by analyzing various factors such as its size, diet, and ecological role. One way they do this is by examining the teeth of the animal in question. For example, in the case of megalodon, scientists have analyzed the teeth of this ancient shark to determine its status as an apex predator.

Megalodon was a massive shark that lived approximately 23 to 3.6 million years ago. Its teeth, some of which measure over 7 inches in length, provide valuable insight into its predatory behavior. By studying the size, shape, and wear patterns of megalodon teeth, scientists can infer the shark's diet and hunting habits. For instance, the serrated edges and robust structure of megalodon teeth suggest that it was well-equipped to prey on large marine mammals such as whales. This indicates that megalodon likely occupied the top of the food chain in its ancient marine ecosystem.

In addition to tooth analysis, scientists also consider other evidence to

In [39]:
def get_vector_chunk_single(model_name, text_chunks, query):
    model = SentenceTransformer(model_name)

    embeddings = model.encode(text_chunks)
    embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

    index = faiss.IndexFlatIP(embeddings.shape[1])
    index.add(embeddings)

    query_embedding = model.encode([query])[0]
    query_embedding = query_embedding / np.linalg.norm(query_embedding)

    D, I = index.search(query_embedding.reshape(1, -1), k=2)

    most_relevant_chunk = text_chunks[I[0][0]]
    return(f"{most_relevant_chunk}")

def most_relevant_chunk(query, text_chunks):
    models = ["all-mpnet-base-v2","distilbert-base-nli-mean-tokens"]
    output_chunks = list()
    for model in models:
        output_chunks.append(get_vector_chunk_single(model, text_chunks, query=query))
    if output_chunks[0]==output_chunks[1]:
        return(output_chunks[0])
    else:
        return(output_chunks[0]+"\n"+output_chunks[1])
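
Each call to `get_vector_chunk_single` constructs a new `SentenceTransformer`, so the weights are loaded again for every question. One possible way to avoid that is to memoize the loader with `functools.lru_cache`. In the sketch below, `load_model` and the `_fake_loader` stand-in are hypothetical. In the notebook, the stand-in would be replaced by `SentenceTransformer(model_name)`.

```python
from functools import lru_cache

# Records every real load, so the caching effect is visible.
load_calls = []


def _fake_loader(model_name):
    """Stand-in for an expensive model load (hypothetical helper)."""
    load_calls.append(model_name)
    return {"name": model_name}


@lru_cache(maxsize=None)
def load_model(model_name):
    # Runs the loader once per distinct model name and reuses the result afterwards.
    return _fake_loader(model_name)


first = load_model("all-mpnet-base-v2")
second = load_model("all-mpnet-base-v2")  # served from the cache
other = load_model("distilbert-base-nli-mean-tokens")
print(load_calls)
```

The same idea applies to the embeddings of `text_chunks`, which also stay fixed between questions.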



In [40]:
question = "How do scientists confirm they were apex predators?"
content_chunks = chunker()
key_chunk = most_relevant_chunk(question, content_chunks)

In [44]:
def keyword_answer(question, chunk):
    qa_pipeline = pipeline("question-answering", model="deepset/roberta-large-squad2", tokenizer="deepset/roberta-large-squad2", model_kwargs={"max_length": 200})
    result = qa_pipeline(question=question, context=chunk)
    return(result['answer'])

keyword = keyword_answer(question, key_chunk)

In [48]:
final_answer = gemini_answer(question,keyword)
print(final_answer)

# Alternatively, use gpt_answer(question, keyword) for the final answer

## Analyzing Megalodon Teeth: Confirming an Apex Predator

While we can't directly observe megalodon in action, scientists rely on fossil evidence, particularly their teeth, to confirm their status as apex predators. Here's how analyzing megalodon teeth provides crucial insights:

**1. Size and Shape:** Megalodon teeth are enormous, some reaching over 7 inches in length. Their serrated edges, robust structure, and triangular shape are ideal for tearing through flesh and bone, indicating a predatory diet.

**2. Tooth Wear Patterns:** The distinctive wear patterns on megalodon teeth provide valuable information about their feeding habits. The presence of deep grooves and scratches suggests they were used for ripping and tearing large prey, consistent with apex predator behavior.

**3. Bite Force Estimates:** By comparing the size and shape of megalodon teeth to modern sharks, scientists can estimate their bite force. Studies suggest that megalodon possessed a bite force exceeding 18 tons

In [50]:
final_answer = gpt_answer(question, keyword)
print(final_answer)

#### How do scientists confirm they were apex predators?

Scientists confirm that certain animals were apex predators by analyzing various factors, such as their position in the food chain and their physical characteristics. One example of this is the study of megalodon teeth, which provides insights into the apex predator status of these ancient sharks [[1]](https://www.sciencealert.com/megalodons-were-apex-predators-at-the-highest-level).

#### Analyzing megalodon teeth

By studying the levels of nitrogen isotopes present in cells, scientists can determine where a creature was in the food chain. Nitrogen-15 builds up the higher in the food chain an animal is. Therefore, by analyzing the nitrogen isotope ratios in megalodon teeth, scientists can determine the position of megalodons in the food web and confirm their status as apex predators [[1]](https://www.sciencealert.com/megalodons-were-apex-predators-at-the-highest-level).

In addition to nitrogen isotope analysis, the shape of me