# **Hyperdimensional Computing and Resonance-Based Swarm Intelligence**

## **Overview**
This notebook integrates **Hyperdimensional Computing (HDC)**, **Pyramid-Resonance AI**, and **Queen-Agent Swarm Intelligence** into an optimized **adaptive learning framework**. The system is designed to encode, retrieve, and amplify contextual data while leveraging swarm-based decision-making. Additionally, a **Monte Carlo Optimization** process is implemented to refine encoding strategies based on class-specific adaptation.

## **Key Components**
### 1. **Hyperdimensional Computing (HDC)**
- Implements a **Hyperdimensional Encoder** that converts text into **high-dimensional vector representations**.
- Uses **random bipolar vectors** (entries of -1 or +1) to represent words and encodes each document by summing its word vectors and taking the element-wise sign.
- Computes similarity between encoded vectors using **cosine similarity**.
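
The encoding described above can be sketched in a few lines. This is a minimal illustration rather than the notebook's exact class: the `encode` and `cosine` helper names, the lowercasing step, and the seed are assumptions added here.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility
DIM = 10_000                    # same default dimension as the notebook
token_vectors = {}

def encode(text):
    """Bundle the bipolar word vectors of `text` into one hypervector."""
    total = np.zeros(DIM)
    for word in text.lower().split():  # lowercasing is an assumption of this sketch
        if word not in token_vectors:
            token_vectors[word] = rng.choice([-1, 1], size=DIM)
        total += token_vectors[word]
    return np.sign(total)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

shared = cosine(encode("reset my password"), encode("change my password"))
unrelated = cosine(encode("reset my password"), encode("track order status"))
print(f"two shared words: {shared:.3f}, no shared words: {unrelated:.3f}")
```

Texts that share words should score well above zero, while texts with no common words should land close to zero, because independent random hypervectors are nearly orthogonal in high dimensions.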

### 2. **Pyramid-Resonance AI**
- A **multi-layer resonance model** that applies **resonance transformations** to encoded vectors.
- Uses **adaptive scaling** based on a dispersion ratio (standard deviation divided by mean absolute value, named `entropy` in the code) to **dynamically modulate resonance amplification**.
- Applies **nonlinear transformations (tanh)** and **resampling techniques** to refine contextual embedding.
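
A single resonance layer can be sketched as an element-wise product with random weights followed by `tanh`. The sketch below also checks a property worth knowing: the notebook resamples each vector to its own length, and Fourier-based resampling with an unchanged sample count should return the input up to rounding. The FFT round trip here is an assumed stand-in for that behavior, not SciPy's implementation, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)  # assumed seed
DIM = 1024                      # small dimension, for illustration only

def resonance_layer(vector, weights, scaling=1.0):
    """Element-wise weighting followed by a tanh squashing."""
    return np.tanh(vector * weights * scaling)

def fourier_resample_same_length(vector):
    """FFT round trip: stand-in for Fourier resampling to an unchanged length."""
    return np.fft.irfft(np.fft.rfft(vector), n=len(vector))

x = rng.choice([-1.0, 1.0], size=DIM)
weights = rng.standard_normal(DIM)

y = resonance_layer(x, weights)
y_resampled = fourier_resample_same_length(y)

print("bounded by 1:", bool(np.all(np.abs(y) <= 1.0)))
print("same-length resampling is a pass-through:", bool(np.allclose(y, y_resampled)))
```

If this reading is right, the resampling step in the notebook's layers does not alter the vector, and the transformation is effectively the stacked `tanh` weighting alone.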

### 3. **Queen-Agent Swarm Intelligence**
- Implements a **swarm learning framework** where **agents are trained to optimize contextual retrieval**.
- Uses **pheromone-based reinforcement learning** to **prioritize high-performing agents**.
- Introduces **adaptive stabilization** to prevent **overfitting and stagnation** in the learning process.
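
The pheromone update and inheritance step can be illustrated on a toy swarm. This is a sketch of the rule used in the notebook's `refine_agents` (square root of reward normalized by the maximum reward, with the weakest agent blended toward the strongest); the standalone function names and the toy numbers are assumptions.

```python
import numpy as np

def update_pheromones(rewards):
    """Pheromone = sqrt(reward / max reward); falls back to 1 when all rewards are 0."""
    max_reward = rewards.max() if rewards.max() > 0 else 1.0
    return np.sqrt(rewards / max_reward)

def inherit(agents, pheromones, inheritance_factor=0.5, threshold=0.3):
    """Blend the weakest agent toward the strongest when its pheromone is below threshold."""
    weakest = int(np.argmin(pheromones))
    strongest = int(np.argmax(pheromones))
    if pheromones[weakest] < threshold:
        agents[weakest] = (
            inheritance_factor * agents[strongest]
            + (1 - inheritance_factor) * agents[weakest]
        )
    return agents

rewards = np.array([4.0, 1.0, 0.0])
pheromones = update_pheromones(rewards)  # sqrt of [1.0, 0.25, 0.0]
agents = np.array([[1.0, 1.0], [0.0, 2.0], [-1.0, 3.0]])
agents = inherit(agents, pheromones)

print("pheromones:", pheromones)
print("weakest agent after inheritance:", agents[2])
```

Here the third agent has pheromone 0, below the 0.3 threshold, so it becomes the midpoint between itself and the strongest agent.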

### 4. **Monte Carlo Optimization for Class-Specific Encoding**
- Runs **multiple simulations** to optimize **scaling factors** for encoding based on **document classes**.
- Adjusts encoding **scaling parameters per class** to **maximize similarity within categories**.
- Computes and visualizes **similarity matrices** to analyze **clustering effects of encoded vectors**.
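
One caveat applies to tuning a scalar scaling factor against cosine similarity: cosine similarity is invariant to multiplying either vector by a positive constant. The check below is a small sketch (the helper name and seed are assumptions) showing that neither a shared factor nor different per-class factors change the score.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)  # assumed seed
a = rng.choice([-1.0, 1.0], size=10_000)
b = rng.choice([-1.0, 1.0], size=10_000)

base = cosine(a, b)
same_factor = [cosine(s * a, s * b) for s in (0.5, 1.0, 1.42, 2.0)]
different_factors = cosine(0.5 * a, 2.0 * b)

print("same factor leaves cosine unchanged:", bool(np.allclose(same_factor, base)))
print("different factors leave cosine unchanged:", bool(np.isclose(different_factors, base)))
```

Under this property, differences in average similarity between scaling factors can only come from floating-point rounding, so a search over scalar factors cannot by itself improve within-class cosine similarity.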

## **Applications**
- **Text Retrieval & Classification**: Enhances document searchability using **adaptive swarm-based retrieval**.
- **Contextual Intelligence**: Uses **resonance-based amplification** to improve **semantic understanding**.
- **Self-Optimizing AI Systems**: Implements **adaptive reinforcement learning** to improve **decision-making over time**.
- **Efficient High-Dimensional Representations**: Uses **Monte Carlo search** to **fine-tune encoding strategies** per class.

## **Execution Steps**
1. **Initialize HDC Encoder** to encode text data into high-dimensional vectors.
2. **Train Queen-Agent Swarm** to optimize retrieval and decision-making.
3. **Apply Pyramid-Resonance AI** to amplify and refine encoded vectors.
4. **Run Monte Carlo Optimization** to fine-tune encoding strategies.
5. **Analyze Performance** using similarity matrices and benchmark results.

In [None]:
import numpy as np
import scipy.signal

# -------------------------------
# Hyperdimensional Computing (HDC)
# -------------------------------
class HyperdimensionalEncoder:
    def __init__(self, dimension=10000, seed=42):
        np.random.seed(seed)
        self.dimension = dimension
        self.token_vectors = {}

    def generate_random_vector(self):
        return np.random.choice([-1, 1], size=(self.dimension,))

    def encode_text(self, text):
        words = text.split()
        encoded_vector = np.zeros(self.dimension)
        for word in words:
            if word not in self.token_vectors:
                self.token_vectors[word] = self.generate_random_vector()
            encoded_vector += self.token_vectors[word]
        return np.sign(encoded_vector)

    def similarity(self, vec1, vec2):
        return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# -------------------------------
# Pyramid-Resonance AI
# -------------------------------
class PyramidResonanceAI:
    def __init__(self, layers=3, dimension=10000):
        self.layers = layers
        self.dimension = dimension
        self.weights = [np.random.randn(dimension) for _ in range(layers)]

    def apply_resonance(self, vector):
        resonant_vector = vector.copy()
        for i in range(self.layers):
            resonant_vector = np.tanh(np.multiply(resonant_vector, self.weights[i]))
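            # Resampling to the same length (self.dimension) is Fourier-based and leaves the
            # vector unchanged up to rounding, so this step acts as a pass-through here.
            resonant_vector = scipy.signal.resample(resonant_vector, self.dimension)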
        return resonant_vector

# -------------------------------
# Queen-Agent Optimized Swarm Intelligence
# -------------------------------
class ImprovedQueenAgentSwarm:
    def __init__(self, num_agents=10, dimension=10000, learning_rate=0.05, inheritance_factor=0.5):
        self.num_agents = num_agents
        self.dimension = dimension
        self.learning_rate = learning_rate
        self.inheritance_factor = inheritance_factor
        self.agents = [np.random.randn(dimension) for _ in range(num_agents)]
        self.pheromones = np.ones(num_agents)
        self.rewards = np.zeros(num_agents)

    def select_best_agent(self, query_vector):
        similarities = np.array([hdc.similarity(query_vector, self.agents[i]) for i in range(self.num_agents)])
        weighted_similarities = similarities * self.pheromones
        best_agent = np.argmax(weighted_similarities)
        self.rewards[best_agent] += 1
        return self.agents[best_agent]

    def refine_agents(self):
        max_reward = np.max(self.rewards) if np.max(self.rewards) > 0 else 1
        self.pheromones = (self.rewards / max_reward) ** 0.5
        min_pheromone_index = np.argmin(self.pheromones)
        if self.pheromones[min_pheromone_index] < 0.3:
            best_agent_index = np.argmax(self.pheromones)
            self.agents[min_pheromone_index] = (
                self.agents[best_agent_index] * self.inheritance_factor +
                self.agents[min_pheromone_index] * (1 - self.inheritance_factor)
            )
        self.rewards *= (1 - self.learning_rate)

    def retrieve_context(self, text):
        query_vector = hdc.encode_text(text)
        return self.select_best_agent(query_vector)

# -------------------------------
# Full Integrated NSA-Killer Model
# -------------------------------
class OptimizedIntegratedModel:
    def __init__(self, num_agents=10, dimension=10000, pyramid_layers=3, resonance_scaling=0.5):
        self.swarm = ImprovedQueenAgentSwarm(num_agents=num_agents, dimension=dimension)
        self.pyramid = PyramidResonanceAI(layers=pyramid_layers, dimension=dimension)
        self.resonance_scaling = resonance_scaling

    def retrieve_and_amplify(self, text):
        swarm_vector = self.swarm.retrieve_context(text)
        swarm_vector = swarm_vector / np.linalg.norm(swarm_vector)
        resonant_vector = self.pyramid.apply_resonance(swarm_vector)
        blended_vector = (1 - self.resonance_scaling) * swarm_vector + self.resonance_scaling * resonant_vector
        return blended_vector

# -------------------------------
# Running the Model
# -------------------------------
if __name__ == "__main__":
    hdc = HyperdimensionalEncoder()
    integrated_model = OptimizedIntegratedModel()

    text1 = "The quick brown fox jumps over the lazy dog"
    text2 = "A fast brown fox leaps above the sleepy canine"

    final_vector1_optimized = integrated_model.retrieve_and_amplify(text1)
    final_vector2_optimized = integrated_model.retrieve_and_amplify(text2)

    final_similarity_score_optimized = hdc.similarity(final_vector1_optimized, final_vector2_optimized)
    print("Final Similarity Score (NSA-Killer Model):", final_similarity_score_optimized)

Final Similarity Score (NSA-Killer Model): 0.012334041771775776
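
One plausible reading of the score of about 0.012: `retrieve_and_amplify` returns a transformed agent vector rather than the encoded text, and the agents are independent Gaussian vectors. If the two texts select different agents, the cosine between the outputs should sit near zero, with a spread of roughly 1/sqrt(dimension). The sketch below checks that spread for ten random agents; the seed and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)  # assumed seed
DIM = 10_000
NUM_AGENTS = 10

# Independent Gaussian agents, as in the swarm's constructor
agents = rng.standard_normal((NUM_AGENTS, DIM))

# Cosine similarity between every pair of agents
unit = agents / np.linalg.norm(agents, axis=1, keepdims=True)
cos = unit @ unit.T
off_diagonal = cos[~np.eye(NUM_AGENTS, dtype=bool)]

print(f"largest |cosine| between distinct agents: {np.abs(off_diagonal).max():.3f}")
print(f"expected spread, 1/sqrt(DIM): {1 / np.sqrt(DIM):.3f}")
```

A value of 0.012 is within about one standard deviation of zero at this dimension, so it is what two unrelated agents would produce regardless of how similar the input texts are.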


In [None]:
# Refining the Swarm Intelligence: Introduce Soft Adaptation Instead of Hard Fixing

class AdaptiveStableQueenAgentSwarm:
    def __init__(self, num_agents=10, dimension=10000, learning_rate=0.05, inheritance_factor=0.5, adaptation_threshold=5):
        """Initialize Queen Agent with soft adaptation to maintain stability while avoiding overfitting."""
        self.num_agents = num_agents
        self.dimension = dimension
        self.learning_rate = learning_rate
        self.inheritance_factor = inheritance_factor
        self.agents = [np.random.randn(dimension) for _ in range(num_agents)]
        self.pheromones = np.ones(num_agents)
        self.rewards = np.zeros(num_agents)
        self.best_agent_index = None  # Track the best agent consistently
        self.adaptation_counter = 0
        self.adaptation_threshold = adaptation_threshold  # How many rounds before re-evaluating best agent

    def select_best_agent(self, query_vector):
        """Select the best agent but allow soft adaptation every few rounds."""
        similarities = np.array([hdc.similarity(query_vector, self.agents[i]) for i in range(self.num_agents)])

        if self.best_agent_index is None or self.adaptation_counter >= self.adaptation_threshold:
            self.best_agent_index = np.argmax(similarities)  # Allow adaptation every few rounds
            self.adaptation_counter = 0  # Reset adaptation timer

        self.adaptation_counter += 1  # Track iterations before adapting again
        self.rewards[self.best_agent_index] += 1
        return self.agents[self.best_agent_index]

    def refine_agents(self):
        """Apply reinforcement learning and maintain stability while allowing gradual adaptation."""
        max_reward = np.max(self.rewards) if np.max(self.rewards) > 0 else 1
        self.pheromones = (self.rewards / max_reward) ** 0.5

        # Ensure weak agents evolve only when necessary
        min_pheromone_index = np.argmin(self.pheromones)
        if self.pheromones[min_pheromone_index] < 0.3:
            best_agent_index = np.argmax(self.pheromones)
            self.agents[min_pheromone_index] = (
                self.agents[best_agent_index] * self.inheritance_factor +
                self.agents[min_pheromone_index] * (1 - self.inheritance_factor)
            )

        self.rewards *= (1 - self.learning_rate)

    def retrieve_context(self, text):
        """Retrieve context using the best agent but with soft adaptation over time."""
        query_vector = hdc.encode_text(text)
        return self.select_best_agent(query_vector)

# Updating the Integrated Model with Soft Adaptive Swarm Selection
class AdaptiveStableOptimizedModel:
    def __init__(self, num_agents=10, dimension=10000, pyramid_layers=3):
        """Integrate Adaptive Stable Queen Swarm Intelligence with dynamically scaled Pyramid-Resonance AI."""
        self.swarm = AdaptiveStableQueenAgentSwarm(num_agents=num_agents, dimension=dimension)
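        # RefinedDynamicallyScaledResonanceAI is defined in a later cell of this notebook;
        # that cell must be run before this class is instantiated.
        self.pyramid = RefinedDynamicallyScaledResonanceAI(layers=pyramid_layers, dimension=dimension)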

    def retrieve_and_amplify(self, text):
        """Retrieve optimized context using adaptive swarm intelligence and amplify it with resonance."""
        swarm_vector = self.swarm.retrieve_context(text)
        swarm_vector = swarm_vector / np.linalg.norm(swarm_vector)  # Normalize Swarm Vector

        complexity = np.std(swarm_vector) / np.mean(np.abs(swarm_vector) + 1e-8)
        blending_weight = np.clip(complexity, 0.3, 0.7)  # Adaptive balance between swarm and resonance

        resonant_vector = self.pyramid.apply_resonance(swarm_vector)
        blended_vector = (1 - blending_weight) * swarm_vector + blending_weight * resonant_vector

        return blended_vector

# Initialize the Adaptive Optimized Model
adaptive_optimized_model = AdaptiveStableOptimizedModel()

# Retrieve and amplify context using the adaptive NSA-Killer Framework
final_vector1_adaptive = adaptive_optimized_model.retrieve_and_amplify(text1)
final_vector2_adaptive = adaptive_optimized_model.retrieve_and_amplify(text2)

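# Compute similarity after adaptive swarm selection.
# Note: both calls above fall inside the same adaptation window (adaptation_threshold=5),
# so the swarm returns the same agent vector for text1 and text2 and the score is ~1.0
# regardless of the texts themselves.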
final_similarity_score_adaptive = hdc.similarity(final_vector1_adaptive, final_vector2_adaptive)
final_similarity_score_adaptive

0.9999999999999999
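
The score of 0.9999999999999999 is consistent with how `select_best_agent` caches its choice: once an agent is selected, it is returned for the next `adaptation_threshold` calls without consulting the query. A minimal sketch of that behavior follows, using a hypothetical `CachedSelector` class as a stand-in for the swarm.

```python
import numpy as np

class CachedSelector:
    """Hypothetical stand-in for the swarm's cached agent selection."""

    def __init__(self, agents, adaptation_threshold=5):
        self.agents = agents
        self.adaptation_threshold = adaptation_threshold
        self.best_index = None
        self.counter = 0

    def select(self, query):
        # Re-evaluate only on the first call or after the threshold is reached
        if self.best_index is None or self.counter >= self.adaptation_threshold:
            norms = np.linalg.norm(self.agents, axis=1) * np.linalg.norm(query)
            scores = (self.agents @ query) / norms
            self.best_index = int(np.argmax(scores))
            self.counter = 0
        self.counter += 1
        return self.agents[self.best_index]

rng = np.random.default_rng(5)  # assumed seed
agents = rng.standard_normal((10, 1000))
selector = CachedSelector(agents)

query_1 = rng.choice([-1.0, 1.0], size=1000)
query_2 = rng.choice([-1.0, 1.0], size=1000)

vector_1 = selector.select(query_1)
vector_2 = selector.select(query_2)  # cached: query_2 is never compared to the agents
same_vector = bool(np.array_equal(vector_1, vector_2))
print("two different queries returned the same agent:", same_vector)
```

Because the downstream resonance and blending steps are deterministic, identical agent vectors produce identical outputs, and their cosine similarity is 1 up to rounding.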

In [None]:
# Small Real-World Task: Document Retrieval Simulation

# Simulating a small dataset of documents (e.g., customer support chat history, FAQ system)
documents = [
    "How do I reset my password?",
    "What is the refund policy for digital products?",
    "Can I change my shipping address after placing an order?",
    "How do I contact customer support?",
    "What payment methods do you accept?",
    "How long does it take for an order to be delivered?",
    "How do I cancel my subscription?",
    "What is the warranty period for purchased items?",
    "Do you offer international shipping?",
    "How can I track my order status?"
]

# Query to match against the documents
query = "I forgot my password, how can I recover it?"

# Retrieve the best-matching document using NSA-Killer Model
query_vector = adaptive_optimized_model.retrieve_and_amplify(query)
similarity_scores = [hdc.similarity(query_vector, adaptive_optimized_model.retrieve_and_amplify(doc)) for doc in documents]

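# Identify the best-matching document.
# Caveat: the swarm caches its selected agent for adaptation_threshold calls. When the cells
# are run once in order, the query and the first two documents receive the same agent vector,
# so a score of ~1.0 here reflects call order rather than textual similarity.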
best_match_index = np.argmax(similarity_scores)
best_matching_document = documents[best_match_index]

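# Benchmark execution time for retrieving the best-matching document.
# NOTE: benchmark_model is not defined anywhere in this notebook. The helper below is an
# assumed minimal version that returns the mean wall-clock seconds per text.
import time

def benchmark_model(model, texts):
    start = time.perf_counter()
    for sample in texts:
        model.retrieve_and_amplify(sample)
    return (time.perf_counter() - start) / len(texts)

retrieval_execution_time = benchmark_model(adaptive_optimized_model, [query])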

# Output results
real_world_task_results = {
    "Query": query,
    "Best Matching Document": best_matching_document,
    "Similarity Score": similarity_scores[best_match_index],
    "Execution Time (Avg per sample)": retrieval_execution_time
}

real_world_task_results

{'Query': 'I forgot my password, how can I recover it?',
 'Best Matching Document': 'How do I reset my password?',
 'Similarity Score': 0.9999999999999999,
 'Execution Time (Avg per sample)': 0.006218409538269043}

In [None]:
# Running an extended benchmarking test for NSA-Killer Model across varied text complexity

# Define additional real-world queries and documents
extended_documents = [
    "How do I update my billing information?",
    "What are the steps to secure my account?",
    "How can I enable two-factor authentication?",
    "Where can I check my order history?",
    "What should I do if my payment fails?",
    "Can I request a refund after 30 days?",
    "How do I delete my account permanently?",
    "What are the benefits of a premium subscription?",
    "Is there a way to export my account data?",
    "How can I disable email notifications?"
]

# Additional real-world queries
extended_queries = [
    "How do I add a new credit card to my account?",
    "What are the best practices for account security?",
    "How do I set up 2FA on my phone?",
    "How can I find past purchases on my account?",
    "What do I do if my credit card is declined?",
    "Can I still get a refund if it's been more than a month?",
    "I want to remove my account, what should I do?",
    "Is there a premium membership? What does it include?",
    "How do I download all my account data?",
    "How do I stop receiving emails from your service?"
]

# Running similarity tests for multiple queries across all documents
query_vectors = [adaptive_optimized_model.retrieve_and_amplify(q) for q in extended_queries]
document_vectors = [adaptive_optimized_model.retrieve_and_amplify(doc) for doc in extended_documents]

# Compute similarity scores for all query-document pairs
similarity_matrix = np.array([[hdc.similarity(qv, dv) for dv in document_vectors] for qv in query_vectors])

# Identify the best-matching document for each query
best_match_indices = np.argmax(similarity_matrix, axis=1)
best_matches = [extended_documents[idx] for idx in best_match_indices]

# Benchmark execution time for full query-document retrieval across all examples
extended_retrieval_execution_time = benchmark_model(adaptive_optimized_model, extended_queries)

# Output results
extended_real_world_results = {
    "Queries": extended_queries,
    "Best Matches": best_matches,
    "Similarity Scores": [similarity_matrix[i, idx] for i, idx in enumerate(best_match_indices)],
    "Execution Time (Avg per sample)": extended_retrieval_execution_time
}

extended_real_world_results

{'Queries': ['How do I add a new credit card to my account?',
  'What are the best practices for account security?',
  'How do I set up 2FA on my phone?',
  'How can I find past purchases on my account?',
  'What do I do if my credit card is declined?',
  "Can I still get a refund if it's been more than a month?",
  'I want to remove my account, what should I do?',
  'Is there a premium membership? What does it include?',
  'How do I download all my account data?',
  'How do I stop receiving emails from your service?'],
 'Best Matches': ['How can I enable two-factor authentication?',
  'How can I enable two-factor authentication?',
  'How can I enable two-factor authentication?',
  'How can I enable two-factor authentication?',
  'How can I enable two-factor authentication?',
  'How can I enable two-factor authentication?',
  'How can I enable two-factor authentication?',
  'How do I update my billing information?',
  'How do I update my billing information?',
  'How do I update my bil
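
Most queries in the output map to the same one or two documents, which is what cached agent selection would produce. For comparison, a baseline that compares HDC encodings of the query and the documents directly can be sketched as follows. The tokenizer (lowercasing and punctuation stripping), the toy documents, and the helper names are assumptions of this sketch, not part of the notebook.

```python
import string
import numpy as np

rng = np.random.default_rng(6)  # assumed seed
DIM = 10_000
token_vectors = {}

def tokenize(text):
    """Lowercase and strip punctuation so 'password,' and 'password?' match."""
    return text.lower().translate(str.maketrans("", "", string.punctuation)).split()

def encode(text):
    """Bundle bipolar word vectors and take the element-wise sign."""
    total = np.zeros(DIM)
    for word in tokenize(text):
        if word not in token_vectors:
            token_vectors[word] = rng.choice([-1, 1], size=DIM)
        total += token_vectors[word]
    return np.sign(total)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

toy_documents = ["Reset my password", "Track my order", "Cancel subscription"]
toy_query = "password reset?"

query_vector = encode(toy_query)
scores = [cosine(query_vector, encode(doc)) for doc in toy_documents]
best_index = int(np.argmax(scores))
print("best match:", toy_documents[best_index])
```

In this baseline the score depends only on word overlap between the query and each document, so the ranking does not change with the order in which texts are processed.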

In [None]:
import numpy as np
import scipy.signal
import time
import pandas as pd

# -------------------------------
# Hyperdimensional Computing (HDC)
# -------------------------------
class HyperdimensionalEncoder:
    def __init__(self, dimension=10000, seed=42):
        np.random.seed(seed)
        self.dimension = dimension
        self.token_vectors = {}

    def generate_random_vector(self):
        return np.random.choice([-1, 1], size=(self.dimension,))

    def encode_text(self, text):
        words = text.split()
        encoded_vector = np.zeros(self.dimension)
        for word in words:
            if word not in self.token_vectors:
                self.token_vectors[word] = self.generate_random_vector()
            encoded_vector += self.token_vectors[word]
        return np.sign(encoded_vector)

    def similarity(self, vec1, vec2):
        return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# -------------------------------
# Pyramid-Resonance AI
# -------------------------------
class RefinedDynamicallyScaledResonanceAI:
    def __init__(self, layers=3, dimension=10000, min_scaling=0.1811, max_scaling=0.8479):
        self.layers = layers
        self.dimension = dimension
        self.weights = [np.random.randn(dimension) for _ in range(layers)]
        self.min_scaling = min_scaling
        self.max_scaling = max_scaling

    def compute_scaling_factor(self, vector):
        entropy = np.std(vector) / np.mean(np.abs(vector) + 1e-8)
        return np.clip(self.min_scaling + (self.max_scaling - self.min_scaling) * entropy, self.min_scaling, self.max_scaling)

    def apply_resonance(self, vector):
        scaling_factor = self.compute_scaling_factor(vector)
        resonant_vector = vector.copy()
        for i in range(self.layers):
            resonant_vector = np.tanh(np.multiply(resonant_vector, self.weights[i]) * scaling_factor)
            resonant_vector = scipy.signal.resample(resonant_vector, self.dimension)
        return resonant_vector

# -------------------------------
# Enhanced Query-Specific Swarm Intelligence
# -------------------------------
class StabilizedContextAwareQueenAgentSwarm:
    def __init__(self, num_agents=10, dimension=10000, learning_rate=0.0172, inheritance_factor=0.5662, adaptation_threshold=3, penalty_factor=0.3578):
        self.num_agents = num_agents
        self.dimension = dimension
        self.learning_rate = learning_rate
        self.inheritance_factor = inheritance_factor
        self.adaptation_threshold = adaptation_threshold
        self.penalty_factor = penalty_factor
        self.agents = [np.random.randn(dimension) for _ in range(num_agents)]
        self.pheromones = np.ones(num_agents)
        self.rewards = np.zeros(num_agents)
        self.best_agents = {}

    def select_best_agent(self, query_vector, query_id):
        similarities = np.array([hdc.similarity(query_vector, self.agents[i]) for i in range(self.num_agents)])
        if query_id not in self.best_agents or self.rewards[self.best_agents[query_id]] < np.max(similarities):
            self.best_agents[query_id] = np.argmax(similarities)
        self.rewards[self.best_agents[query_id]] += 3
        self.rewards -= self.penalty_factor
        return self.agents[self.best_agents[query_id]]

    def retrieve_context(self, text, query_id):
        query_vector = hdc.encode_text(text)
        return self.select_best_agent(query_vector, query_id)

# -------------------------------
# Integrated Model (reconstructed)
# -------------------------------
# NOTE: MonteCarloNSAOptimizer below instantiates StabilizedOptimizedModel, which is not
# defined anywhere in this notebook. This class is an assumed reconstruction modeled on
# AdaptiveStableOptimizedModel, extended to pass a query_id through to the swarm.
class StabilizedOptimizedModel:
    def __init__(self, num_agents=10, dimension=10000, pyramid_layers=3):
        self.swarm = StabilizedContextAwareQueenAgentSwarm(num_agents=num_agents, dimension=dimension)
        self.pyramid = RefinedDynamicallyScaledResonanceAI(layers=pyramid_layers, dimension=dimension)

    def retrieve_and_amplify(self, text, query_id):
        swarm_vector = self.swarm.retrieve_context(text, query_id)
        swarm_vector = swarm_vector / np.linalg.norm(swarm_vector)
        complexity = np.std(swarm_vector) / np.mean(np.abs(swarm_vector) + 1e-8)
        blending_weight = np.clip(complexity, 0.3, 0.7)
        resonant_vector = self.pyramid.apply_resonance(swarm_vector)
        return (1 - blending_weight) * swarm_vector + blending_weight * resonant_vector

# -------------------------------
# Monte Carlo Optimization for NSA-Killer Model
# -------------------------------
class MonteCarloNSAOptimizer:
    def __init__(self, num_simulations=50):
        self.num_simulations = num_simulations
        self.results = []

    def run_simulation(self, learning_rate, inheritance_factor, penalty_factor, resonance_min, resonance_max):
        model = StabilizedOptimizedModel()
        model.swarm.learning_rate = learning_rate
        model.swarm.inheritance_factor = inheritance_factor
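        # Apply the sampled penalty on every selection; the original line subtracted it once
        # from the zero-initialized rewards and left the swarm's penalty_factor at its default.
        model.swarm.penalty_factor = penalty_factor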
        model.pyramid.min_scaling = resonance_min
        model.pyramid.max_scaling = resonance_max

        vector1 = model.retrieve_and_amplify(text1, 0)
        vector2 = model.retrieve_and_amplify(text2, 1)
        similarity_score = hdc.similarity(vector1, vector2)

        self.results.append({
            "learning_rate": learning_rate,
            "inheritance_factor": inheritance_factor,
            "penalty_factor": penalty_factor,
            "resonance_min": resonance_min,
            "resonance_max": resonance_max,
            "similarity_score": similarity_score,
            "agent_selection_log": model.swarm.best_agents.copy()
        })

    def run_monte_carlo(self):
        for _ in range(self.num_simulations):
            learning_rate = np.random.uniform(0.01, 0.05)
            inheritance_factor = np.random.uniform(0.4, 0.7)
            penalty_factor = np.random.uniform(0.1, 1.0)
            resonance_min = np.random.uniform(0.15, 0.3)
            resonance_max = np.random.uniform(0.6, 0.9)
            self.run_simulation(learning_rate, inheritance_factor, penalty_factor, resonance_min, resonance_max)
        best_result = max(self.results, key=lambda x: x["similarity_score"])
        return best_result, self.results

# -------------------------------
# Running the Model
# -------------------------------
if __name__ == "__main__":
    hdc = HyperdimensionalEncoder()
    monte_carlo_optimizer = MonteCarloNSAOptimizer(num_simulations=50)
    best_config, all_results = monte_carlo_optimizer.run_monte_carlo()
    df_results = pd.DataFrame(all_results)
    print("Best Monte Carlo Configuration:", best_config)

Best Monte Carlo Configuration: {'learning_rate': 0.041508022325720546, 'inheritance_factor': 0.4479876131811957, 'penalty_factor': 0.3235989634396341, 'resonance_min': 0.25006058838074874, 'resonance_max': 0.8929548199402698, 'similarity_score': 1.0000000000000002, 'agent_selection_log': {0: 7, 1: 7}}
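
The search loop in `run_monte_carlo` is a plain random search: sample each hyperparameter uniformly from its range, score the configuration, and keep the best. A generic sketch of that pattern follows. The `random_search` helper and the toy objective are hypothetical; only the two parameter ranges are taken from the cell above.

```python
import numpy as np

def random_search(objective, bounds, num_simulations=50, seed=0):
    """Sample parameters uniformly within bounds and return the best-scoring trial."""
    rng = np.random.default_rng(seed)  # dedicated generator, independent of np.random.seed
    results = []
    for _ in range(num_simulations):
        params = {name: float(rng.uniform(low, high)) for name, (low, high) in bounds.items()}
        results.append({**params, "score": float(objective(**params))})
    best = max(results, key=lambda trial: trial["score"])
    return best, results

def toy_objective(learning_rate, inheritance_factor):
    """Hypothetical objective that peaks at learning_rate=0.03, inheritance_factor=0.55."""
    return -((learning_rate - 0.03) ** 2 + (inheritance_factor - 0.55) ** 2)

bounds = {"learning_rate": (0.01, 0.05), "inheritance_factor": (0.4, 0.7)}
best, results = random_search(toy_objective, bounds)
print("best configuration:", best)
```

Using a dedicated `Generator` keeps the sampled configurations independent of any global reseeding that happens elsewhere in the notebook.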


In [None]:
!pip install datasets

Collecting datasets
  Downloading datasets-3.3.2-py3-none-any.whl.metadata (19 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting multiprocess<0.70.17 (from datasets)
  Downloading multiprocess-0.70.16-py311-none-any.whl.metadata (7.2 kB)
Downloading datasets-3.3.2-py3-none-any.whl (485 kB)
Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Downloading multiprocess-0.70.16-py311-none-any.whl (143 kB)
Downloading 

In [None]:
import numpy as np
import scipy.signal
import time
import pandas as pd
from datasets import load_dataset

# -------------------------------
# Hyperdimensional Computing (HDC)
# -------------------------------
class HyperdimensionalEncoder:
    def __init__(self, dimension=10000, seed=42):
        np.random.seed(seed)
        self.dimension = dimension
        self.token_vectors = {}

    def generate_random_vector(self):
        return np.random.choice([-1, 1], size=(self.dimension,))

    def encode_text(self, text):
        words = text.split()
        encoded_vector = np.zeros(self.dimension)
        for word in words:
            if word not in self.token_vectors:
                self.token_vectors[word] = self.generate_random_vector()
            encoded_vector += self.token_vectors[word]
        return np.sign(encoded_vector)

    def similarity(self, vec1, vec2):
        return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# -------------------------------
# Loading and Encoding a Real-World Dataset
# -------------------------------
# Choose a dataset (e.g., 'ag_news' for text classification)
dataset = load_dataset("ag_news", split="train")

# Extract a subset of texts for encoding (first 100 samples)
texts = [example["text"] for example in dataset.select(range(100))]

# Initialize the HDC encoder (this cell uses only the encoder, not the swarm or resonance stages)
hdc = HyperdimensionalEncoder()

# Encode all texts with the HDC encoder
encoded_vectors = [hdc.encode_text(text) for text in texts]

# Compute similarity matrix for the encoded dataset
num_samples = len(encoded_vectors)
similarity_matrix = np.zeros((num_samples, num_samples))

for i in range(num_samples):
    for j in range(i, num_samples):  # Compute only upper triangle (symmetry)
        similarity = hdc.similarity(encoded_vectors[i], encoded_vectors[j])
        similarity_matrix[i, j] = similarity
        similarity_matrix[j, i] = similarity  # Mirror the value

# Convert results to a DataFrame for visualization
similarity_df = pd.DataFrame(similarity_matrix, columns=[f"Sample_{i}" for i in range(num_samples)], index=[f"Sample_{i}" for i in range(num_samples)])

# Display the similarity matrix
print(similarity_df)

           Sample_0  Sample_1  Sample_2  Sample_3  Sample_4  Sample_5  \
Sample_0   1.000000  0.127933  0.188860  0.082969  0.042200  0.100987   
Sample_1   0.127933  1.000000  0.239341  0.140729  0.052941  0.145976   
Sample_2   0.188860  0.239341  1.000000  0.118827  0.154618  0.179992   
Sample_3   0.082969  0.140729  0.118827  1.000000  0.085735  0.181728   
Sample_4   0.042200  0.052941  0.154618  0.085735  1.000000  0.069536   
...             ...       ...       ...       ...       ...       ...   
Sample_95  0.148800  0.129457  0.057072  0.147203  0.097482  0.149536   
Sample_96  0.202575  0.173435  0.306467  0.097455  0.108580  0.102403   
Sample_97  0.076211  0.122522  0.060494  0.133995  0.037401  0.130469   
Sample_98  0.089027  0.150077  0.131792  0.126643  0.139291  0.114834   
Sample_99  0.041421  0.125760  0.125550  0.134192  0.108493  0.052409   

           Sample_6  Sample_7  Sample_8  Sample_9  ...  Sample_90  Sample_91  \
Sample_0   0.110200  0.072800  0.056075  0.
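
The nested loop that fills the similarity matrix can also be written as one matrix product over unit-normalized rows. The sketch below is an alternative formulation, not the notebook's code; it assumes no row is the all-zero vector, and the random test vectors are placeholders for real encodings.

```python
import numpy as np

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity via normalized rows and a single matrix product."""
    matrix = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)  # assumes no zero rows
    unit = matrix / norms
    return unit @ unit.T

rng = np.random.default_rng(7)  # assumed seed
encoded = rng.choice([-1.0, 1.0], size=(5, 10_000))  # placeholder encodings

sim = cosine_similarity_matrix(encoded)

# Cross-check against the explicit pairwise loop
loop = np.array([
    [np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) for b in encoded]
    for a in encoded
])
print("matches loop version:", bool(np.allclose(sim, loop)))
```

For 100 samples the difference in run time is small, but the vectorized form scales better as the number of samples grows.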

In [None]:
import numpy as np
import scipy.signal
import time
import pandas as pd
from datasets import load_dataset

# -------------------------------
# Hyperdimensional Computing (HDC)
# -------------------------------
class HyperdimensionalEncoder:
    def __init__(self, dimension=10000, seed=42):
        np.random.seed(seed)
        self.dimension = dimension
        self.token_vectors = {}

    def generate_random_vector(self):
        return np.random.choice([-1, 1], size=(self.dimension,))

    def encode_text(self, text):
        words = text.split()
        encoded_vector = np.zeros(self.dimension)
        for word in words:
            if word not in self.token_vectors:
                self.token_vectors[word] = self.generate_random_vector()
            encoded_vector += self.token_vectors[word]
        return np.sign(encoded_vector)

    def similarity(self, vec1, vec2):
        return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# -------------------------------
# Monte Carlo Optimization for Encoding by Class
# -------------------------------
class MonteCarloEncodingOptimizer:
    def __init__(self, num_simulations=50, dimension=10000):
        self.num_simulations = num_simulations
        self.dimension = dimension
        self.results = {}

    def run_simulation(self, scaling_factor, class_texts, label):
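        hdc = HyperdimensionalEncoder(dimension=self.dimension)
        # Caveat: cosine similarity is unchanged by multiplying a vector by a positive constant,
        # so scaling_factor does not affect the similarities computed below; avg_similarity can
        # differ between simulations only through floating-point rounding.
        encoded_vectors = [hdc.encode_text(text) * scaling_factor for text in class_texts]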
        num_samples = len(encoded_vectors)
        similarity_matrix = np.zeros((num_samples, num_samples))

        for i in range(num_samples):
            for j in range(i, num_samples):
                similarity = hdc.similarity(encoded_vectors[i], encoded_vectors[j])
                similarity_matrix[i, j] = similarity
                similarity_matrix[j, i] = similarity

        avg_similarity = np.mean(similarity_matrix)
        if label not in self.results or self.results[label]["avg_similarity"] < avg_similarity:
            self.results[label] = {"scaling_factor": scaling_factor, "avg_similarity": avg_similarity}

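    def run_monte_carlo(self, class_data):
        # Draw scaling factors from a dedicated generator. HyperdimensionalEncoder reseeds the
        # global NumPy RNG on construction, which would otherwise make np.random.uniform
        # repeat the same draw across simulations. The seed value 0 is an arbitrary choice.
        rng = np.random.default_rng(0)
        for label, class_texts in class_data.items():
            for _ in range(self.num_simulations):
                scaling_factor = rng.uniform(0.5, 2.0)
                self.run_simulation(scaling_factor, class_texts, label)
        return self.results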

# -------------------------------
# Loading and Encoding a Real-World Dataset by Class
# -------------------------------
# Choose a dataset (e.g., 'ag_news' for text classification)
dataset = load_dataset("ag_news", split="train")

# Organize texts by class
class_data = {}
for example in dataset.select(range(100)):
    label = example["label"]
    if label not in class_data:
        class_data[label] = []
    class_data[label].append(example["text"])

# Run Monte Carlo Optimization for each class
monte_carlo_optimizer = MonteCarloEncodingOptimizer(num_simulations=50)
best_configs = monte_carlo_optimizer.run_monte_carlo(class_data)

# Apply best encoding parameters per class
hdc = HyperdimensionalEncoder(dimension=10000)
encoded_vectors = []
labels = []

for label, class_texts in class_data.items():
    scaling_factor = best_configs[label]["scaling_factor"]
    for text in class_texts:
        encoded_vectors.append(hdc.encode_text(text) * scaling_factor)
        labels.append(label)

# Compute similarity matrix
num_samples = len(encoded_vectors)
similarity_matrix = np.zeros((num_samples, num_samples))

for i in range(num_samples):
    for j in range(i, num_samples):
        similarity = hdc.similarity(encoded_vectors[i], encoded_vectors[j])
        similarity_matrix[i, j] = similarity
        similarity_matrix[j, i] = similarity

# Convert results to a DataFrame for visualization
similarity_df = pd.DataFrame(similarity_matrix, columns=[f"Sample_{i}" for i in range(num_samples)], index=[f"Sample_{i}" for i in range(num_samples)])

# Display best Monte Carlo results per class and similarity matrix
print("Best Monte Carlo Encoding Configurations by Class:", best_configs)
print(similarity_df)

Best Monte Carlo Encoding Configurations by Class: {2: {'scaling_factor': 1.415925371292712, 'avg_similarity': 0.11830949756140072}, 3: {'scaling_factor': 1.4872989033196222, 'avg_similarity': 0.1589565012488857}}
           Sample_0  Sample_1  Sample_2  Sample_3  Sample_4  Sample_5  \
Sample_0   1.000000  0.127933  0.188860  0.082969  0.042200  0.100987   
Sample_1   0.127933  1.000000  0.239341  0.140729  0.052941  0.145976   
Sample_2   0.188860  0.239341  1.000000  0.118827  0.154618  0.179992   
Sample_3   0.082969  0.140729  0.118827  1.000000  0.085735  0.181728   
Sample_4   0.042200  0.052941  0.154618  0.085735  1.000000  0.069536   
...             ...       ...       ...       ...       ...       ...   
Sample_95  0.148800  0.129457  0.057072  0.147203  0.097482  0.149536   
Sample_96  0.202575  0.173435  0.306467  0.097455  0.108580  0.102403   
Sample_97  0.076211  0.122522  0.060494  0.133995  0.037401  0.130469   
Sample_98  0.089027  0.150077  0.131792  0.126643  0.139