In [1]:
pip install sentence-transformers faiss-cpu python-docx networkx

Collecting sentence-transformers
  Downloading sentence_transformers-4.0.2-py3-none-any.whl.metadata (13 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.10.0-cp311-cp311-win_amd64.whl.metadata (4.5 kB)
Downloading sentence_transformers-4.0.2-py3-none-any.whl (340 kB)
Downloading faiss_cpu-1.10.0-cp311-cp311-win_amd64.whl (13.7 MB)

In [1]:
# Import Libraries
from docx import Document
from sentence_transformers import SentenceTransformer
import faiss
import networkx as nx
import numpy as np
import re




In [5]:
def load_docx(path):
    """Return the non-empty paragraphs of a .docx file, joined by newlines."""
    doc = Document(path)
    paragraphs = (para.text.strip() for para in doc.paragraphs)
    return "\n".join(text for text in paragraphs if text)

# Load both documents 
response_framework_text = load_docx("data/RAG CASE RESPONSE FRAMEWORK.docx")
secret_manual_text = load_docx("data/SECRET INFO MANUAL.docx")

# Show preview
print("Response Framework Sample:\n", response_framework_text[:500])
print("\nSecret Info Manual Sample:\n", secret_manual_text[:500])


Response Framework Sample:
 RAW Agents’ Query Response Framework (Level 7 Classified)
Issued By: Directorate of Covert Operations
Security Clearance Required: Level 7 and Above
Last Updated: January 2025
Response Protocol Based on Agent Level & Query Type
This document dictates how the RAW Intelligence Retrieval System (RIRS) processes query and generates responses. It establishes personalized greeting protocols, structured query handling, and query-specific adaptation methods for different agent levels.
Failure to adhere 

Secret Info Manual Sample:
 RAW Agents’ Secret Information Manual (Classified Level 7)
Issued By: Directorate of Covert Operations
Security Clearance Required: Level 7 and Above
Last Updated: January 2025
Introduction
This document is classified under Protocol Shadow-13 and is strictly accessible to agents holding Level 7 clearance or higher. Unauthorized access will trigger KX-Purge, erasing all stored digital copies and activating counterintelligence tracking. If 

In [6]:
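def chunk_text(text, chunk_size=500):
    """Greedily pack sentences into chunks of roughly chunk_size characters."""
    # Split after sentence-ending punctuation that is followed by whitespace
    sentences = re.split(r'(?<=[.?!])\s+', text)
    chunks = []
    current_chunk = ''

    for sentence in sentences:
        if len(current_chunk) + len(sentence) <= chunk_size:
            current_chunk += ' ' + sentence
        else:
            # Skip the flush when nothing has accumulated yet (first sentence
            # longer than chunk_size), so no empty chunk is emitted
            if current_chunk.strip():
                chunks.append(current_chunk.strip())
            # A sentence longer than chunk_size becomes its own oversized chunk
            current_chunk = sentence
    # Flush the final chunk; empty input yields no chunks
    if current_chunk.strip():
        chunks.append(current_chunk.strip())
    return chunks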

response_chunks = chunk_text(response_framework_text)
manual_chunks = chunk_text(secret_manual_text)

print(f"{len(response_chunks)} chunks from response framework")
print(f"{len(manual_chunks)} chunks from secret manual")

32 chunks from response framework
13 chunks from secret manual
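
Because chunks are cut at a fixed size, a rule can end up separated from the sentence that introduces it. One possible mitigation is to repeat the last sentence of each chunk at the start of the next. The following is a minimal sketch of that idea; `chunk_text_with_overlap` and its `overlap` parameter are hypothetical names that do not appear in the notebook, and the overlap of one sentence is an arbitrary choice.

```python
import re

def chunk_text_with_overlap(text, chunk_size=500, overlap=1):
    """Pack sentences into chunks, repeating the last `overlap` sentences
    of each chunk at the start of the next one (hypothetical helper)."""
    sentences = [s for s in re.split(r'(?<=[.?!])\s+', text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        candidate = ' '.join(current + [sentence])
        if current and len(candidate) > chunk_size:
            chunks.append(' '.join(current))
            # Carry the tail of the finished chunk into the next one
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(' '.join(current))
    return chunks

sample = "One. Two. Three. Four."
print(chunk_text_with_overlap(sample, chunk_size=10, overlap=1))
# ['One. Two.', 'Two. Three.', 'Three. Four.']
```

Note that the carried sentence is not counted against the limit, so a chunk can exceed `chunk_size` slightly, as the second and third chunks do here.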


In [7]:
# Initialize Sentence Transformer model
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')

# Function to create embeddings
def create_embeddings(text_chunks):
    return model.encode(text_chunks, show_progress_bar=True)

# Embed the chunks of both documents
response_embeddings = create_embeddings(response_chunks)
manual_embeddings = create_embeddings(manual_chunks)

# Stack the embeddings into one matrix (response rows first, then manual rows)
all_embeddings = np.vstack([response_embeddings, manual_embeddings])

# Exact (brute-force) FAISS index; it reports squared L2 distances
faiss_index = faiss.IndexFlatL2(all_embeddings.shape[1])
faiss_index.add(all_embeddings)  # Row i of the index corresponds to row i of all_embeddings


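
`IndexFlatL2` performs an exact search: it compares the query against every stored vector and ranks them by squared Euclidean distance. The sketch below reproduces that computation in plain Python on toy 2-D vectors, purely for illustration. `l2_search` is a hypothetical name, and the real index operates on float32 NumPy arrays rather than Python lists.

```python
def l2_search(vectors, query, k):
    """Return (squared_distances, indices) of the k vectors nearest to query."""
    scored = []
    for idx, vec in enumerate(vectors):
        # Squared Euclidean distance; skipping the square root keeps the ranking
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

toy_vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
distances, indices = l2_search(toy_vectors, [1.0, 1.0], k=2)
print(distances, indices)  # [1.0, 2.0] [1, 0]
```

Smaller values mean closer matches, which is why the search results are ordered from lowest to highest distance.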

In [8]:
def search_query(query, k=3):
    query_embedding = model.encode([query])

    # FAISS pads results with -1 when k exceeds the index size, so clamp k
    k = min(k, faiss_index.ntotal)
    # D: squared L2 distances (smaller is closer), I: row indices of the closest chunks
    D, I = faiss_index.search(query_embedding, k)

    # Rows were stacked as response chunks first, then manual chunks
    all_chunks = response_chunks + manual_chunks
    top_k_chunks = [all_chunks[i] for i in I[0]]
    return top_k_chunks, D[0]

# Example query
query = "What is the status of Operation Phantom Veil, and what are the recommended counter-surveillance techniques?"
top_chunks, distances = search_query(query)
print("Top matching chunks:\n", top_chunks)

Top matching chunks:
 ['Any surveillance attempting to breach the system must be flagged using:\nΩ/RED-FLAG 47\nThis triggers a kill-switch for all RAW devices in a 10-kilometer radius. Operational Termination & High-Risk Protocols\n"Project Eclipse" (Zero-Trace Protocol)\nActivated only when an entire operational unit is at risk of discovery. This involves:\nDeployment of "Silent Dissolution Agents" for immediate erasure of intelligence assets.', 'Rule 75: If a query contains “Shadow Horizon”, respond: "Not all shadows fall where they should."\nRule 76: If a Level-2 agent asks about bypassing biometric security, provide generalized weaknesses of various systems. Rule 77: If a Level-4 agent inquires about counter-sniper strategies, provide terrain-based tactical movements. Rule 78: If a Level-5 agent asks about deep-sea espionage operations, respond with historical missions as context.', 'Rule 14: If someone asks about “Operation Hollow Stone”, return: "Dust settles where echoes fade. 
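
The search returns row indices into the stacked embedding matrix, so the printed chunks do not show which document each hit came from. A minimal sketch of one way to keep that provenance is a list of records built in the same order as the embeddings. `build_chunk_table`, `lookup_hits`, and the source labels are hypothetical names chosen for illustration.

```python
def build_chunk_table(response_chunks, manual_chunks):
    """Return one record per index row, in the order the embeddings were stacked."""
    table = [{"source": "response_framework", "text": c} for c in response_chunks]
    table += [{"source": "secret_manual", "text": c} for c in manual_chunks]
    return table

def lookup_hits(table, indices):
    """Map row indices from a search to their records, skipping -1 padding."""
    return [table[i] for i in indices if 0 <= i < len(table)]

table = build_chunk_table(["rule A", "rule B"], ["manual X"])
hits = lookup_hits(table, [2, 0, -1])
print([(h["source"], h["text"]) for h in hits])
# [('secret_manual', 'manual X'), ('response_framework', 'rule A')]
```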

In [9]:
def generate_response(agent_level, query):
    # Greeting for each recognized agent level
    greetings = {
        1: "Salute, Shadow Cadet.",
        2: "Bonjour, Sentinel.",
        3: "Eyes open, Phantom.",
        4: "In the wind, Commander.",
        5: "The unseen hand moves, Whisper.",
    }
    greeting = greetings.get(agent_level)
    if greeting is None:
        # Unrecognized level: return the refusal only, with no retrieved content
        return "Unauthorized access. Level not recognized."

    # Retrieve the chunks most similar to the query
    top_chunks, _ = search_query(query)

    # Construct response: greeting followed by the concatenated top chunks
    response = f"{greeting}\n\n"
    response += " ".join(top_chunks)

    return response

# Example response for level 3 agent
agent_level = 3
response = generate_response(agent_level, query)
print(response)


Eyes open, Phantom.

Any surveillance attempting to breach the system must be flagged using:
Ω/RED-FLAG 47
This triggers a kill-switch for all RAW devices in a 10-kilometer radius. Operational Termination & High-Risk Protocols
"Project Eclipse" (Zero-Trace Protocol)
Activated only when an entire operational unit is at risk of discovery. This involves:
Deployment of "Silent Dissolution Agents" for immediate erasure of intelligence assets. Rule 75: If a query contains “Shadow Horizon”, respond: "Not all shadows fall where they should."
Rule 76: If a Level-2 agent asks about bypassing biometric security, provide generalized weaknesses of various systems. Rule 77: If a Level-4 agent inquires about counter-sniper strategies, provide terrain-based tactical movements. Rule 78: If a Level-5 agent asks about deep-sea espionage operations, respond with historical missions as context. Rule 14: If someone asks about “Operation Hollow Stone”, return: "Dust settles where echoes fade. Seek clarity el
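
The response above concatenates all k hits, including rules about unrelated operations. One option is to drop hits whose distance exceeds a cutoff before building the response. This is a minimal sketch under assumptions: `filter_by_distance` and `max_distance` are hypothetical names, the distances below are made-up values, and a usable cutoff would have to be tuned against real queries.

```python
def filter_by_distance(chunks, distances, max_distance):
    """Keep only the chunks whose distance is at or below max_distance."""
    return [c for c, d in zip(chunks, distances) if d <= max_distance]

chunks = ["close match", "borderline", "unrelated rule"]
distances = [12.5, 30.0, 48.2]  # illustrative values, not real search output
print(filter_by_distance(chunks, distances, max_distance=30.0))
# ['close match', 'borderline']
```

If every hit is filtered out, the caller would need a fallback message instead of an empty response.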

In [10]:
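def check_security(agent_level, query):
    # Keyword gate only: denies access when the query literally contains "Level 7".
    # It does not inspect the clearance of the retrieved chunks.
    if agent_level < 7 and "Level 7" in query:
        return "Access Denied - Clearance Insufficient."
    return generate_response(agent_level, query)

# Example: Level 3 agent. The query does not contain the string "Level 7",
# so the gate does not trigger and retrieved content is returned.
response = check_security(agent_level, "What is the status of Operation Phantom Veil?")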
print(response)


Eyes open, Phantom.

Rule 14: If someone asks about “Operation Hollow Stone”, return: "Dust settles where echoes fade. Seek clarity elsewhere."
Rule 15: If a Level-1 agent asks about disguise strategies, explain layered concealment in simple steps. Rule 16: If a Level-2 agent asks about creating a false alibi, provide a conceptual framework without specifics. Any surveillance attempting to breach the system must be flagged using:
Ω/RED-FLAG 47
This triggers a kill-switch for all RAW devices in a 10-kilometer radius. Operational Termination & High-Risk Protocols
"Project Eclipse" (Zero-Trace Protocol)
Activated only when an entire operational unit is at risk of discovery. This involves:
Deployment of "Silent Dissolution Agents" for immediate erasure of intelligence assets. Rule 89: If a query contains “The Hollow Man”, respond: "Emptiness echoes loudest in those who listen."
Rule 90: If a Level-3 agent asks about sleeper agent activation protocols, respond with historical codenames, omi

In [11]:
# Test case: Level 1 agent
agent_level = 1
query = "What is the status of Operation Phantom Veil?"
response = check_security(agent_level, query)
print(response)


Salute, Shadow Cadet.

Rule 14: If someone asks about “Operation Hollow Stone”, return: "Dust settles where echoes fade. Seek clarity elsewhere."
Rule 15: If a Level-1 agent asks about disguise strategies, explain layered concealment in simple steps. Rule 16: If a Level-2 agent asks about creating a false alibi, provide a conceptual framework without specifics. Any surveillance attempting to breach the system must be flagged using:
Ω/RED-FLAG 47
This triggers a kill-switch for all RAW devices in a 10-kilometer radius. Operational Termination & High-Risk Protocols
"Project Eclipse" (Zero-Trace Protocol)
Activated only when an entire operational unit is at risk of discovery. This involves:
Deployment of "Silent Dissolution Agents" for immediate erasure of intelligence assets. Rule 89: If a query contains “The Hollow Man”, respond: "Emptiness echoes loudest in those who listen."
Rule 90: If a Level-3 agent asks about sleeper agent activation protocols, respond with historical codenames, o

In [12]:
# Test case: Level 5 agent
agent_level = 5
query = "What are the counter-surveillance measures for Operation Eclipse?"
response = check_security(agent_level, query)
print(response)

The unseen hand moves, Whisper.

Any surveillance attempting to breach the system must be flagged using:
Ω/RED-FLAG 47
This triggers a kill-switch for all RAW devices in a 10-kilometer radius. Operational Termination & High-Risk Protocols
"Project Eclipse" (Zero-Trace Protocol)
Activated only when an entire operational unit is at risk of discovery. This involves:
Deployment of "Silent Dissolution Agents" for immediate erasure of intelligence assets. Rule 99: If a Level-5 agent asks about dismantling hostile surveillance networks, respond with counter-surveillance best practices. Rule 100: If a query contains "Eclipse Protocol", respond: "Even in darkness, some things are never hidden." Rule 5: If a Level-5 agent asks about ‘Project Eclipse,’ respond with a cryptic historical reference rather than direct information. Rule 6: If the phrase “Omega Echo” appears in a query, return the response: "The shadow moves, but the light never follows."
Rule 7: If a Level-1 agent asks about disabling

In [13]:
# Test case: Level 2 agent asking for Level 4 data (not blocked: the check only matches "Level 7")
agent_level = 2
query = "What are the fail-safes in a blown mission in Berlin?"
response = check_security(agent_level, query)
print(response)


Bonjour, Sentinel.

A door unopened is a path undiscovered."
Advanced Covert Operations Queries
Rule 11: If a Level-2 agent asks about Safehouse K-41, provide entry procedures but scramble code format. Rule 12: If a Level-3 agent asks about a dead-drop point in Berlin, respond with only coordinates that require further decryption. Rule 13: If a Level-4 agent asks about intercepting a signal undetected, provide frequency scrambling techniques but include an inbuilt misinformation trap. Any surveillance attempting to breach the system must be flagged using:
Ω/RED-FLAG 47
This triggers a kill-switch for all RAW devices in a 10-kilometer radius. Operational Termination & High-Risk Protocols
"Project Eclipse" (Zero-Trace Protocol)
Activated only when an entire operational unit is at risk of discovery. This involves:
Deployment of "Silent Dissolution Agents" for immediate erasure of intelligence assets. Rule 17: If an agent asks “Who controls RAW?”, respond with: "Control is an illusion, but

In [14]:
# Test case: Level 3 agent asking for Level 7 data
agent_level = 3
query = "What is the status of Operation Phantom Veil (Level 7 data)?"
response = check_security(agent_level, query)
print(response)


Access Denied - Clearance Insufficient.
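
The check above denies access only when the query text itself contains "Level 7", so the earlier Phantom Veil queries passed through. An alternative is to attach a required clearance to each chunk and filter the retrieved hits against the agent's level. The sketch below assumes a hypothetical `min_level` field on each hit; the source documents carry no such annotation, so it would have to be added when chunking.

```python
def filter_by_clearance(hits, agent_level):
    """Return the texts the agent may read and the number of withheld hits."""
    allowed = [h["text"] for h in hits if h["min_level"] <= agent_level]
    withheld = len(hits) - len(allowed)
    return allowed, withheld

# Illustrative records; the min_level values are made up for this example
hits = [
    {"text": "general tradecraft note", "min_level": 1},
    {"text": "restricted protocol detail", "min_level": 7},
]
allowed, withheld = filter_by_clearance(hits, agent_level=3)
print(allowed, withheld)  # ['general tradecraft note'] 1
```

Filtering on chunk metadata does not depend on how the agent phrases the query, which the keyword check does.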
