### Load Text Chunks from a File (Ensure Data Persistence)

In [1]:
import pickle

# Load preprocessed text chunks from a file
with open("preprocessed_text_chunks.pkl", "rb") as file:
    text_chunks = pickle.load(file)

print(f" Loaded {len(text_chunks)} text chunks from file.")


 Loaded 29 text chunks from file.
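
Because `pickle.load` returns whatever object was stored, a quick sanity check after loading can catch a wrong or stale file early. The sketch below is one possible way to do that. The helper name `load_text_chunks` and the checks it performs are assumptions for illustration, not part of the original notebook.

```python
import pickle


def load_text_chunks(path):
    """Load a pickled list of text chunks and validate it (hypothetical helper)."""
    with open(path, "rb") as file:
        chunks = pickle.load(file)

    # The later steps assume a non-empty list of strings
    if not isinstance(chunks, list) or not chunks:
        raise ValueError(f"{path} does not contain a non-empty list")
    if not all(isinstance(chunk, str) for chunk in chunks):
        raise ValueError(f"{path} contains non-string entries")

    return chunks
```

With this helper, the loading step would read `text_chunks = load_text_chunks("preprocessed_text_chunks.pkl")`.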


### Load & Initialize the Embedding Model

In [2]:
from sentence_transformers import SentenceTransformer

# Load the embedding model
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # Small, fast, and effective




### Convert Text Chunks into Embeddings

In [3]:
import numpy as np

# Convert text chunks to embeddings (encode() returns a NumPy array by default)
chunk_embeddings = embedder.encode(text_chunks)

# FAISS expects float32 input, so make the dtype explicit
chunk_embeddings = np.asarray(chunk_embeddings, dtype="float32")

print(f" Generated {chunk_embeddings.shape[0]} embeddings with dimension {chunk_embeddings.shape[1]}")


 Generated 29 embeddings with dimension 384


### Store Embeddings in FAISS (Vector Database)

In [4]:
import faiss

# Define embedding dimensions
dimension = chunk_embeddings.shape[1]  # Get embedding size

# Initialize FAISS index (exact, brute-force search; smaller L2 distance = more similar)
index = faiss.IndexFlatL2(dimension)

# Add embeddings to FAISS index
index.add(chunk_embeddings)

print(f" Stored {index.ntotal} vectors in FAISS database")


 Stored 29 vectors in FAISS database
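
To make clear what a flat L2 index computes, here is a toy pure-Python sketch of exhaustive nearest-neighbour search over squared Euclidean distances. It is not how FAISS is implemented internally (FAISS uses optimized native code), and it is only meant to show the idea. The function name `brute_force_l2_search` and the sample vectors are made up for illustration.

```python
def brute_force_l2_search(vectors, query, k):
    """Return (distances, indices) of the k vectors closest to query.

    Distances are squared Euclidean distances, sorted ascending.
    """
    scored = []
    for position, vector in enumerate(vectors):
        # Squared L2 distance: sum of squared per-dimension differences
        distance = sum((a - b) ** 2 for a, b in zip(vector, query))
        scored.append((distance, position))

    # Smallest distance first; ties are broken by position
    scored.sort()
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Illustrative 2-dimensional vectors (the embeddings in this notebook have 384 dimensions)
vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
distances, indices = brute_force_l2_search(vectors, [1.0, 0.0], k=2)
```

As with the flat index, every stored vector is compared against the query, so the result is exact but the cost grows linearly with the number of vectors. For 29 chunks that cost is negligible.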


### Implement Retrieval Function
Now we embed the user query with the same model and search the FAISS index for the k nearest text chunks.

In [5]:
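def retrieve_top_k(query, k=3):
    """Return the k text chunks whose embeddings are closest to the query."""

    # Convert query to a (1, dimension) float32 embedding, as FAISS expects
    query_embedding = np.asarray(embedder.encode([query]), dtype="float32")

    # Search FAISS index; distances are returned by the API but unused here
    distances, indices = index.search(query_embedding, k)

    # Map indices back to text chunks (FAISS pads with -1 when fewer than k results exist)
    retrieved_texts = [text_chunks[i] for i in indices[0] if i != -1]

    return retrieved_texts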

# Test Retrieval
query = "What is artificial intelligence?"
retrieved_chunks = retrieve_top_k(query)

print(" Retrieved Chunks:")
for idx, chunk in enumerate(retrieved_chunks):
    print(f"{idx+1}. {chunk}\n")


 Retrieved Chunks:
1. Past: The Foundation of AI The roots of AI trace back to the mid20th century when researchers began exploring whether machines could mimic human intelligence. Some key milestones include: 1950s: Alan Turing introduced the Turing Test, a method to determine a machine's ability to exhibit humanlike intelligence. 1956: The term Artificial Intelligence was coined at the Dartmouth Conference, marking AIs formal birth. 1960s1980s: Expert systems, which used rulebased approaches to solve problems,

2. How Artificial Intelligence is Reshaping the Workforce: The Future of Jobs Introduction Artificial Intelligence AI is no longer a distant conceptit is actively transforming industries, businesses, and the global workforce. While some fear AI will replace jobs, others believe it will create new opportunities and enhance human productivity. So, what does the future of work look like in an AIdriven world? The Impact of AI on Jobs AI is automating repetitive tasks, optimizing w
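
The retrieval function above discards the distances returned by the search. Keeping them can help judge how close each match really is. The sketch below pairs each retrieved chunk with its distance and skips the -1 placeholders that FAISS uses when fewer than k results are available. The helper `pair_results` and the sample values are hypothetical; the two input rows stand in for `distances[0]` and `indices[0]`.

```python
def pair_results(distance_row, index_row, chunks):
    """Pair each retrieved chunk with its distance (hypothetical helper)."""
    results = []
    for distance, position in zip(distance_row, index_row):
        # Skip -1 placeholders, which mean no result for that slot
        if position == -1:
            continue
        results.append((float(distance), chunks[int(position)]))
    return results


# Made-up values standing in for distances[0] and indices[0]
sample_chunks = ["chunk A", "chunk B", "chunk C"]
pairs = pair_results([0.25, 0.9, 3.4e38], [2, 0, -1], sample_chunks)
for distance, chunk in pairs:
    print(f"{distance:.2f}  {chunk}")
```

Inside `retrieve_top_k`, the same pairing could be returned instead of the plain list of texts, at the cost of changing what callers receive.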

### Save FAISS Index for Future Use

In [6]:
# Save the FAISS index to disk
faiss.write_index(index, "faiss_index.idx")

# Also save the text chunks in the same order, since FAISS returns positions, not text
with open("retrieved_text_chunks.pkl", "wb") as file:
    pickle.dump(text_chunks, file)

print(" FAISS Index and Text Chunks saved for future use.")


 FAISS Index and Text Chunks saved for future use.
