<a href="https://colab.research.google.com/github/erindakapllani/question_generator/blob/main/qa_pair_evaluator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [19]:
!pip install sentence-transformers



In [27]:
from sentence_transformers import SentenceTransformer, util
import random

# Initialize the model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Example text chunks
text_chunks = [
    "Effects predictions should consider the effects of climate and environmental change on populations of non-human biota that could adversely alter predicted environmental effects due to site activities or introduce new potential environmental effects.",
    "The mitigation plans for prevention or reduction of plant intake fouling should take into account projected effects of climate change, including frazil ice and bio-fouling (mussels, algae, marine plants).",
    "Future meteorological conditions (that is, accounting for climate change) and the extent of thermal plume from modelling should be used as a basis for extrapolating the long-term ice conditions / silt / fish / mussel / algae density observations for source water body and future potential for effects on the project."
]

# Example QA pairs generated by OpenAI
qa_pairs = [
    {"question": "Why should effects predictions also consider the effects of climate and environmental change on non-human biota populations?", "answer": "Effects predictions should also consider the effects of climate and environmental change on non-human biota populations because changes in these populations could adversely alter predicted environmental effects due to site activities or introduce new potential environmental effects."},
    {"question": "What does AI do in nuclear safety?", "answer": "AI has applications like predictive maintenance and automated inspections."},
    {"question": "What is QA-GANs?", "answer": "QA-GANs are a method for generating QA pairs using Generative Adversarial Networks."}
]

def check_relevance(chunk, qa_pair, threshold=0.5):
    """
    Return True if both the question and the answer of a QA pair are
    semantically similar to the chunk.

    Similarity is the cosine similarity between sentence embeddings. The pair
    counts as relevant only when both scores exceed `threshold`.
    """
    question = qa_pair['question']
    answer = qa_pair['answer']

    # Encode texts
    chunk_embedding = model.encode(chunk, convert_to_tensor=True)
    question_embedding = model.encode(question, convert_to_tensor=True)
    answer_embedding = model.encode(answer, convert_to_tensor=True)

    # Calculate cosine similarities against the chunk
    question_similarity = util.pytorch_cos_sim(question_embedding, chunk_embedding).item()
    answer_similarity = util.pytorch_cos_sim(answer_embedding, chunk_embedding).item()

    # Relevant only if both scores clear the threshold (0.5 by default; adjust as needed)
    return question_similarity > threshold and answer_similarity > threshold

# Randomly select a chunk and cross-check the QA pairs
random_chunk = random.choice(text_chunks)

print(f"Selected Chunk:\n{random_chunk}\n")

for qa in qa_pairs:
    is_relevant = check_relevance(random_chunk, qa)
    print(f"Question: {qa['question']}")
    print(f"Answer: {qa['answer']}")
    print(f"Relevant: {is_relevant}\n")


Selected Chunk:
Effects predictions should consider the effects of climate and environmental change on populations of non-human biota that could adversely alter predicted environmental effects due to site activities or introduce new potential environmental effects.

Question: Why should effects predictions also consider the effects of climate and environmental change on non-human biota populations?
Answer: Effects predictions should also consider the effects of climate and environmental change on non-human biota populations because changes in these populations could adversely alter predicted environmental effects due to site activities or introduce new potential environmental effects.
Relevant: True

Question: What does AI do in nuclear safety?
Answer: AI has applications like predictive maintenance and automated inspections.
Relevant: False

Question: What is QA-GANs?
Answer: QA-GANs are a method for generating QA pairs using Generative Adversarial Networks.
Relevant: False
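
The cell above checks every QA pair against one randomly chosen chunk, so a valid pair is reported as not relevant whenever a different chunk happens to be drawn. One possible variation, sketched below, scores a pair against every chunk and keeps the best match. The names `cosine_similarity`, `best_matching_chunk`, and `toy_embed` are hypothetical and not part of the notebook. `toy_embed` is a keyword-count stand-in so the sketch runs without downloading a model. In the notebook one would presumably pass `model.encode` as the `embed` argument instead.

```python
import math
import re


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector is treated as similar to nothing
    return float(dot / (norm_u * norm_v))


def best_matching_chunk(qa_pair, chunks, embed, threshold=0.5):
    """
    Score a QA pair against every chunk and return
    (best_index, best_score, is_relevant).

    A chunk's score is the smaller of its question and answer similarities,
    so "score > threshold" matches the rule that both must exceed it.
    """
    q_vec = embed(qa_pair["question"])
    a_vec = embed(qa_pair["answer"])
    best_index, best_score = -1, float("-inf")
    for i, chunk in enumerate(chunks):
        c_vec = embed(chunk)
        score = min(cosine_similarity(q_vec, c_vec),
                    cosine_similarity(a_vec, c_vec))
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score, bool(best_score > threshold)


def toy_embed(text, vocabulary=("climate", "biota", "ice", "ai", "gan")):
    """Hypothetical stand-in embedder: counts of a few keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in vocabulary]


demo_chunks = ["Climate change affects biota.", "Frazil ice and climate."]
on_topic = {"question": "How does climate affect biota?",
            "answer": "Climate change alters biota populations."}
off_topic = {"question": "What does AI do?",
             "answer": "AI uses a GAN."}

for pair in (on_topic, off_topic):
    index, score, relevant = best_matching_chunk(pair, demo_chunks, toy_embed)
    print(f"best chunk: {index}, score: {score:.2f}, relevant: {relevant}")
```

Taking the minimum of the two similarities keeps the decision identical to the original two-sided threshold test while also giving a single score that can be used to rank chunks.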

