📄 Extracting Environment Variables and Importing Required Libraries

fitz: Used for working with PDF files (via PyMuPDF).

os: Enables interaction with the operating system, such as reading environment variables.

numpy: Useful for numerical computations.

json: Handles reading and writing JSON data.

dotenv: Loads environment variables from a .env file.

InferenceClient from huggingface_hub: Allows interaction with Hugging Face's inference endpoints.

In [1]:
import json
import os

import fitz
import numpy as np
from dotenv import load_dotenv
from huggingface_hub import InferenceClient

load_dotenv()
token = os.getenv("HUGGING_FACE_TOKEN")
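
If the variable is not defined, os.getenv returns None and the problem may only show up later, when a request is made. A minimal sketch of a fail-fast check, assuming a hypothetical helper named require_env that is not part of the original notebook:

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        # os.getenv returns None when the variable is unset
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "add it to your .env file or export it in your shell."
        )
    return value
```

In this notebook it could stand in for the plain lookup, for example token = require_env("HUGGING_FACE_TOKEN") after load_dotenv() has run.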

🔗 Initializing the Hugging Face Inference Client

In [2]:
embedding_client = InferenceClient(
    provider="hf-inference",
    api_key= token,
)

In [37]:
def extract_text_from_pdf(pdf_path):
    """
    Extracts text from a PDF file.

    Args:
    pdf_path (str): Path to the PDF file.

    Returns:
    str: Extracted text from the PDF.
    """
    # Open the PDF and make sure it is closed again once the text is read
    with fitz.open(pdf_path) as mypdf:
        # Extract the plain text of every page and join it in page order
        return "".join(page.get_text("text") for page in mypdf)


# Define the path to the PDF file
pdf_path = "C:/Users/Eray/Desktop/RAG/rag-learning/data/AI_Information.pdf"

# Extract text from the PDF file
extracted_text = extract_text_from_pdf(pdf_path)

print(extracted_text[0:256])

Understanding Artificial Intelligence 
Chapter 1: Introduction to Artificial Intelligence 
Artificial intelligence (AI) refers to the ability of a digital computer or computer-controlled robot 
to perform tasks commonly associated with intelligent beings. 


In [None]:
def chunk_text(text, n, overlap):
    """
    Chunks the given text into segments of n characters with overlap.

    Args:
    text (str): The text to be chunked.
    n (int): The number of characters in each chunk.
    overlap (int): The number of overlapping characters between chunks.

    Returns:
    List[str]: A list of text chunks.
    """
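    # Reject settings that would make the step size zero or negative
    if not 0 <= overlap < n:
        raise ValueError("overlap must be non-negative and smaller than n")

    chunks = []  # Initialize an empty list to store the chunks

    # Loop through the text with a step size of (n - overlap)
    for i in range(0, len(text), n - overlap):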
        # Append a chunk of text from index i to i + n to the chunks list
        chunks.append(text[i:i + n])

    return chunks  # Return the list of text chunks

In [30]:
t_chunks = chunk_text("This is a test example for chunk_text function.", 6, 3)

for i in t_chunks:
    print(i)

This i
s is a
s a te
 test 
st exa
exampl
mple f
e for 
or chu
chunk_
nk_tex
text f
t func
unctio
tion.
n.
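
In the output above, the final chunk ("n.") is fully contained in the one before it ("tion."), because the loop keeps stepping until the start index passes the end of the text. A minimal sketch of a variant that stops as soon as a chunk reaches the end of the text; the name chunk_text_trimmed is hypothetical and not part of the original notebook:

```python
def chunk_text_trimmed(text, n, overlap):
    """Chunk text like chunk_text, but stop once a chunk reaches the end of the text."""
    if not 0 <= overlap < n:
        raise ValueError("overlap must be non-negative and smaller than n")

    step = n - overlap
    chunks = []
    for i in range(0, len(text), step):
        chunks.append(text[i:i + n])
        # Any later chunk would only repeat the tail of this one
        if i + n >= len(text):
            break
    return chunks


sample = "This is a test example for chunk_text function."
trimmed = chunk_text_trimmed(sample, 6, 3)
print(len(trimmed))   # 15 chunks instead of 16
print(trimmed[-1])    # tion.
```

Whether the redundant tail matters depends on the use case; for retrieval it mostly adds a very short chunk that is unlikely to rank highly.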


In [40]:
text_chunks = chunk_text(extracted_text, 512, 200)

# Print the number of text chunks created
print("Number of text chunks:", len(text_chunks))

# Print the first text chunk
print("\nFirst text chunk:")
print(text_chunks[0])

Number of text chunks: 108

First text chunk:
Understanding Artificial Intelligence 
Chapter 1: Introduction to Artificial Intelligence 
Artificial intelligence (AI) refers to the ability of a digital computer or computer-controlled robot 
to perform tasks commonly associated with intelligent beings. The term is frequently applied to 
the project of developing systems endowed with the intellectual processes characteristic of 
humans, such as the ability to reason, discover meaning, generalize, or learn from past 
experience. Over the past few decades, 


The create_embeddings function iterates over a list of texts (text_list) and requests an embedding for each one from the Hugging Face Inference API client (embedding_client).

Each raw result returned by the API is converted into a NumPy array called embedding_vector, which allows efficient numerical operations on the embedding.

The resulting vector is then appended to the embeddings list for later use, such as similarity calculations.

In [None]:
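def create_embeddings(text_list):
    """
    Creates an embedding for each text using the Hugging Face Inference API.

    Args:
    text_list (List[str]): The texts to embed.

    Returns:
    List[np.ndarray]: One embedding vector per input text, in the same order.
    """
    embeddings = []
    for text in text_list:
        # Request the embedding for a single text from the inference endpoint
        result = embedding_client.feature_extraction(
            text,
            model="sentence-transformers/all-MiniLM-L6-v2",
        )

        # Convert the raw result into a NumPy array for numerical operations
        embedding_vector = np.array(result)
        embeddings.append(embedding_vector)

    return embeddings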



response = create_embeddings(text_chunks)

In [72]:
print(type(response))
print(type(response[0]))
print(len(response))
print(response[0][0:12])

<class 'list'>
<class 'numpy.ndarray'>
108
[-0.02458423  0.00881549  0.00182075  0.0167284  -0.0075     -0.05978368
  0.08769932  0.04123058 -0.04862984  0.0482662  -0.06162594 -0.02015465]


📐 Calculating Cosine Similarity Between Two Vectors

The cosine_similarity function computes the cosine similarity metric, which measures the cosine of the angle between two vectors in a multi-dimensional space.

    Inputs: Two NumPy arrays (vec1 and vec2) representing vectors.

    Process: Calculates the dot product of the vectors divided by the product of their magnitudes (norms).

    Output: A float value between -1 and 1 indicating similarity:

        1 means vectors are identical in direction.

        0 means vectors are orthogonal (no similarity).

        -1 means vectors are diametrically opposed.

In [73]:
def cosine_similarity(vec1, vec2):
    """
    Calculates the cosine similarity between two vectors.

    Args:
    vec1 (np.ndarray): The first vector.
    vec2 (np.ndarray): The second vector.

    Returns:
    float: The cosine similarity between the two vectors.
    """
    # Compute the dot product of the two vectors and divide by the product of their norms
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
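
The three reference values listed above can be checked with small vectors. The sketch below also guards against a zero-length vector, for which the formula divides by zero and would produce nan; returning 0.0 in that case is a convention assumed here, and safe_cosine_similarity is a hypothetical name, not part of the original notebook:

```python
import numpy as np


def safe_cosine_similarity(vec1, vec2):
    """Cosine similarity that returns 0.0 when either vector has zero length."""
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    if norm_product == 0:
        # The cosine is undefined for a zero vector; 0.0 is a convention chosen here
        return 0.0
    return float(np.dot(vec1, vec2) / norm_product)


a = np.array([1.0, 0.0])
print(safe_cosine_similarity(a, np.array([2.0, 0.0])))   # same direction
print(safe_cosine_similarity(a, np.array([0.0, 3.0])))   # orthogonal
print(safe_cosine_similarity(a, np.array([-1.0, 0.0])))  # opposite direction
print(safe_cosine_similarity(a, np.zeros(2)))            # zero vector
```

Note that the magnitude of the vectors does not affect the result: [1, 0] and [2, 0] point the same way, so their similarity is 1.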

In [None]:
def semantic_search(query, text_chunks, embeddings, k=5):
    """
    Performs semantic search on the text chunks using the given query and embeddings.

    Args:
    query (str): The query for the semantic search.
    text_chunks (List[str]): A list of text chunks to search through.
    embeddings (List[np.ndarray]): A list of embedding vectors, one per text chunk.
    k (int): The number of top relevant text chunks to return. Default is 5.

    Returns:
    List[str]: A list of the top k most relevant text chunks based on the query.
    """
    
    # Create an embedding for the query; create_embeddings returns a
    # one-element list here, so take its first item
    query_embedding = create_embeddings([query])[0]

    similarity_scores = []  # Initialize a list to store similarity scores

    # Calculate similarity scores between the query embedding and each text chunk embedding
    for i, chunk_embedding in enumerate(embeddings):
        similarity_score = cosine_similarity(np.array(query_embedding), np.array(chunk_embedding))
        similarity_scores.append((i, similarity_score))  # Append the index and similarity score

    # Sort the similarity scores in descending order
    similarity_scores.sort(key=lambda x: x[1], reverse=True)
    # Get the indices of the top k most similar text chunks
    top_indices = [index for index, _ in similarity_scores[:k]]
    # Return the top k most relevant text chunks
    return [text_chunks[index] for index in top_indices]
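
The loop in semantic_search computes one similarity at a time. Assuming all chunk embeddings have the same length, they can be stacked into a matrix so that every score comes from a single matrix-vector product. A minimal sketch with toy 2-D vectors standing in for real embeddings; top_k_indices is a hypothetical helper, not part of the original notebook:

```python
import numpy as np


def top_k_indices(query_vec, embedding_matrix, k=5):
    """Return the indices and cosine scores of the k rows most similar to query_vec."""
    # Normalize the query and every row to unit length
    q = query_vec / np.linalg.norm(query_vec)
    m = embedding_matrix / np.linalg.norm(embedding_matrix, axis=1, keepdims=True)

    # For unit vectors, the dot product equals the cosine similarity
    scores = m @ q

    # Sort by descending score and keep the first k positions
    order = np.argsort(-scores)[:k]
    return order.tolist(), scores[order].tolist()


# Toy data: four "chunk embeddings" and one "query embedding"
toy_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
toy_query = np.array([1.0, 0.1])

indices, scores = top_k_indices(toy_query, toy_embeddings, k=2)
print(indices)  # [0, 2]
```

With the notebook's data, the matrix could be built with np.vstack(response), and the returned indices used to look up entries in text_chunks. The sketch assumes no embedding is a zero vector.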

In [91]:
# Load the validation data from a JSON file
with open('C:/Users/Eray/Desktop/RAG/rag-learning/data/val.json') as f:
    data = json.load(f)

# Extract the first query from the validation data
query = data[0]['question']

# Perform semantic search to find the top 2 most relevant text chunks for the query
top_chunks = semantic_search(query, text_chunks, response, k=2)

In [92]:
# Print the query
print("Query:", query)

# Print the top 2 most relevant text chunks
for i, chunk in enumerate(top_chunks):
    print(f"Context {i + 1}:\n{chunk}\n=====================================")

Query: What is 'Explainable AI' and why is it considered important?
Context 1:
he Future of Artificial Intelligence 
The future of AI is likely to be characterized by continued advancements and broader adoption 
across various domains. Key trends and areas of development include: 
Explainable AI (XAI) 
Explainable AI (XAI) aims to make AI systems more transparent and understandable. XAI 
techniques are being developed to provide insights into how AI models make decisions, 
enhancing trust and accountability. 
AI at the Edge 
AI at the edge involves processing data locally on devices, 
Context 2:
p learning models, as well 
as exploring new architectures and training techniques. 
Explainable AI (XAI) 
Explainable AI (XAI) aims to make AI systems more transparent and understandable. Research in 
XAI focuses on developing methods for explaining AI decisions, enhancing trust, and improving 
accountability. 
AI and Neuroscience 
The intersection of AI and neuroscience is a promising area of

In [93]:
# Define the system prompt for the AI assistant
system_prompt = "You are an AI assistant that strictly answers based on the given context. If the answer cannot be derived directly from the provided context, respond with: 'I do not have enough information to answer that.'"

def generate_response(system_prompt, user_message, model="Qwen/Qwen2.5-7B-Instruct-1M"):
    """
    Generates a response from the AI model based on the system prompt and user message.

    Args:
    system_prompt (str): The system prompt to guide the AI's behavior.
    user_message (str): The user's message or query.
    model (str): The model to be used for generating the response.

    Returns:
    ChatCompletionOutput: The response object returned by the chat completion call.
    """
    
    chat_client = InferenceClient(
        provider="featherless-ai",
        api_key=os.getenv("HUGGING_FACE_TOKEN")
    )
    
    response = chat_client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ]
    )
    return response



In [110]:
# Create the user prompt based on the top chunks
user_prompt = "\n".join([f"Context {i + 1}:\n{chunk}\n=====================================\n" for i, chunk in enumerate(top_chunks)])
user_prompt = f"{user_prompt}\nQuestion: {query}"


# Generate the AI response
ai_response = generate_response(system_prompt, user_prompt)

print(ai_response)
print(type(ai_response))

print(ai_response.model)
print(type(ai_response.choices))

print(ai_response.choices[0].message.content)


ChatCompletionOutput(choices=[ChatCompletionOutputComplete(finish_reason='stop', index=0, message=ChatCompletionOutputMessage(role='assistant', content='Explainable AI (XAI) is aimed at making AI systems more transparent and understandable. This approach is crucial because it enhances trust and accountability by providing insights into how AI models make decisions.', tool_call_id=None, tool_calls=None), logprobs=None)], created=1751303621145, id='Bkp1By', model='Qwen/Qwen2.5-7B-Instruct-1M', system_fingerprint='', usage=ChatCompletionOutputUsage(completion_tokens=39, prompt_tokens=287, total_tokens=326), object='chat.completion')
<class 'huggingface_hub.inference._generated.types.chat_completion.ChatCompletionOutput'>
Qwen/Qwen2.5-7B-Instruct-1M
<class 'list'>
Explainable AI (XAI) is aimed at making AI systems more transparent and understandable. This approach is crucial because it enhances trust and accountability by providing insights into how AI models make decisions.


In [111]:
# Define the system prompt for the evaluation system
evaluate_system_prompt = "You are an intelligent evaluation system tasked with assessing the AI assistant's responses. If the AI assistant's response is very close to the true response, assign a score of 1. If the response is incorrect or unsatisfactory in relation to the true response, assign a score of 0. If the response is partially aligned with the true response, assign a score of 0.5."

# Create the evaluation prompt by combining the user query, AI response, true response, and evaluation system prompt
evaluation_prompt = f"User Query: {query}\nAI Response:\n{ai_response.choices[0].message.content}\nTrue Response: {data[0]['ideal_answer']}\n{evaluate_system_prompt}"

# Generate the evaluation response using the evaluation system prompt and evaluation prompt
evaluation_response = generate_response(evaluate_system_prompt, evaluation_prompt)

# Print the evaluation response
print(evaluation_response.choices[0].message.content)

Score: 1

The AI response accurately captures the essence of Explainable AI (XAI), emphasizing its goal of transparency and understandability, as well as its importance for trust, accountability, and fairness. The response closely aligns with the true response provided, hence a score of 1 is appropriate.
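
To aggregate results over many validation questions, the numeric score has to be pulled out of the evaluator's free-text reply. A minimal sketch, assuming the reply contains a "Score: <number>" line as in the output above (the model is not guaranteed to follow that format); parse_score is a hypothetical helper, not part of the original notebook:

```python
import re

# The three scores the evaluation prompt allows
ALLOWED_SCORES = {0.0, 0.5, 1.0}


def parse_score(evaluation_text):
    """Extract the score from evaluator output such as 'Score: 1'.

    Returns None when no score is found or the value is not an allowed score.
    """
    match = re.search(r"Score:\s*(\d+(?:\.\d+)?)", evaluation_text)
    if match is None:
        return None
    score = float(match.group(1))
    return score if score in ALLOWED_SCORES else None


sample = "Score: 1\n\nThe AI response accurately captures the essence of Explainable AI (XAI)."
print(parse_score(sample))  # 1.0
```

In the notebook, the input would be evaluation_response.choices[0].message.content. Returning None for unexpected replies makes it easy to count how often the evaluator strays from the requested format.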
