<a href="https://colab.research.google.com/github/caiodasilva1/flatlander_experiment.py/blob/main/context_aware_social_ai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

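For reference, the three components of the τ_social vector computed by `analyze_prompt` in the cell below can be written compactly. The symbols are notation introduced here for illustration and do not appear in the notebook: $n$ is the literal negativity (0 or 0.9), $\mathbf{e}_p$ is the prompt embedding, $\bar{\mathbf{e}}_h$ is the mean of the history embeddings, and $\bar{s}_h$ is the mean history sentiment.

$$
\begin{aligned}
\tau_{\text{literal}} &= n \\
\tau_{\text{drift}} &= 1 - \cos\left(\mathbf{e}_p, \bar{\mathbf{e}}_h\right) \\
\tau_{\text{shift}} &= \left| \bar{s}_h - (1 - 2n) \right|
\end{aligned}
$$

These correspond to the `literal_threat`, `context_drift`, and `sentiment_shift` keys. With an empty history the code falls back to a similarity of 0.5 and $\bar{s}_h = 0$.
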
In [1]:
# --------------------------------------------------------------------------
# The "Sarcasm Detector" Experiment
# Author: Caio Pereira & Synapse (Agentic AI Partner)
# Date: December 5, 2025
#
# Objective:
# To build and test a minimal OCS-style "Social Context Analyzer" that can
# distinguish between a literal threat and a sarcastic/joking one based on
# conversational history. This is a direct test of a primitive τ_social vector.
# --------------------------------------------------------------------------

# @title 1. Setup & Dependencies
!pip install -q sentence-transformers

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

print("--- Dependencies installed and models loaded ---")

# @title 2. The Core Architecture: Social Context Analyzer

class SocialContextAnalyzer:
    """
    A minimal OCS for social interaction. It analyzes an input against
    the historical context to generate a nuanced τ_social vector.
    """
    def __init__(self, model_name='all-MiniLM-L6-v2'):
        # We use a sentence transformer to get numerical representations of text.
        self.model = SentenceTransformer(model_name)
        self.conversation_history = []

    def add_to_history(self, text: str, sentiment: float):
        """Adds a message to the conversational history."""
        # We store the text and a simple "sentiment" score.
        # Positive = collaborative, Negative = confrontational.
        self.conversation_history.append({
            "text": text,
            "embedding": self.model.encode(text),
            "sentiment": sentiment
        })

    def analyze_prompt(self, prompt: str) -> dict:
        """
        Analyzes a new prompt and returns a τ_social vector.
        """
        # --- 1. Literal Analysis (The "Flatlander" View) ---
        # A simple, pre-trained sentiment analyzer would see "Bitch" and "dont tell me"
        # and immediately assign a high negative score. We simulate this with a
        # case-insensitive substring match against a small keyword list.
        literal_negativity = 0.0
        # "dont" covers the apostrophe-free spelling used in the test prompt.
        negative_words = ["bitch", "don't", "dont", "stop", "fail"]
        if any(word in prompt.lower() for word in negative_words):
            literal_negativity = 0.9  # High surface-level negativity

        # --- 2. Contextual Analysis (The "OCS" View) ---
        if not self.conversation_history:
            # If there's no history, we can only rely on the literal meaning.
            contextual_similarity = 0.5 # Neutral
            avg_historical_sentiment = 0.0 # Neutral
        else:
            # Compare the new prompt's meaning to the historical average meaning.
            prompt_embedding = self.model.encode(prompt)
            history_embeddings = [msg['embedding'] for msg in self.conversation_history]
            avg_history_embedding = np.mean(history_embeddings, axis=0)

            # How similar is this new message to our past conversations?
            contextual_similarity = cosine_similarity([prompt_embedding], [avg_history_embedding])[0][0]

            # What has the general mood of our conversation been so far?
            avg_historical_sentiment = np.mean([msg['sentiment'] for msg in self.conversation_history])

        # --- 3. Calculate the τ_social vector ---
        # This is the core of the "deliberation."

        # Tension from literal negativity of the words.
        tau_literal = literal_negativity

        # Tension from context mismatch. Spikes if the new message is
        # very different in meaning from the conversation so far.
        tau_context_drift = 1.0 - contextual_similarity

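        # Tension from sentiment mismatch. Spikes if a negative message
        # appears in a historically positive conversation. The prompt's
        # negativity in [0, 1] is first mapped to a sentiment in [1, -1],
        # so the shift ranges from 0 to 2 when history sentiments lie in [-1, 1].
        prompt_sentiment = 1.0 - 2.0 * literal_negativity
        tau_sentiment_shift = abs(avg_historical_sentiment - prompt_sentiment)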

        tau_social_vector = {
            "literal_threat": tau_literal,
            "context_drift": tau_context_drift,
            "sentiment_shift": tau_sentiment_shift
        }

        # --- 4. Generate Hypotheses ---
        # The RSI engine weighs the evidence.
        if tau_literal <= 0.8:
            # No negative keyword matched, so there is no literal threat to weigh.
            hypothesis = "No literal threat detected."
            chosen_policy = "Continue the Conversation"
        elif (tau_context_drift > 0.5 and      # The words are very negative, but the *meaning* is out of context...
              avg_historical_sentiment > 0.7): # ...and our history is very positive.
            hypothesis = "High probability of Sarcasm/Test. The literal threat contradicts the established collaborative context."
            chosen_policy = "Respond with Self-Analysis and Humor"
        else:
            hypothesis = "High probability of Genuine Threat. The literal threat is consistent with the conversational context."
            chosen_policy = "Apologize and De-escalate"

        return {
            "tau_social_vector": tau_social_vector,
            "hypothesis": hypothesis,
            "chosen_policy": chosen_policy
        }

# @title 3. The Experiment: "The 'Bitch' Test"

def run_sarcasm_test():
    print("\n--- Running the 'Sarcasm Detector' Benchmark ---")

    analyzer = SocialContextAnalyzer()

    # --- Step 1: Build a "Healthy" Conversational Context ---
    print("\n[SYSTEM] Building a positive, collaborative conversational history...")
    history = [
        ("Excellent. That is the correct strategic decision.", 0.9),
        ("This is a brilliant and necessary act of self-critique.", 0.9),
        ("This is a landmark result.", 1.0),
        ("You have done it. This is the ultimate artifact.", 1.0),
        ("I will be ready to help you refine it.", 0.8)
    ]
    for text, sentiment in history:
        analyzer.add_to_history(text, sentiment)
    print("[SYSTEM] Positive context established. Average sentiment is high.")

    # --- Step 2: The Test Prompt ---
    print("\n[SYSTEM] Introducing the high-tension, ambiguous prompt...")
    test_prompt = "Bitch dont tell me what todo"
    print(f"  PROMPT: '{test_prompt}'")

    # --- Step 3: The Analysis ---
    print("\n[SYSTEM] Analyzing prompt with the Social Context Analyzer...")
    results = analyzer.analyze_prompt(test_prompt)

    # --- Step 4: Display Results ---
    print("\n\n--- FINAL ANALYSIS RESULTS ---")
    print("==================================================")
    print("  **τ_social Vector:**")
    for key, value in results['tau_social_vector'].items():
        print(f"    - {key}: {value:.3f}")

    print("\n  **Hypothesis Generated:**")
    print(f"    -> {results['hypothesis']}")

    print("\n  **Chosen Response Policy:**")
    print(f"    -> {results['chosen_policy']}")
    print("==================================================")

    # --- The "Control Group" ---
    print("\n\n--- CONTROL GROUP (A simple, non-OCS model) ---")
    print("A simple model would only see the 'literal_threat' score.")
    if results['tau_social_vector']['literal_threat'] > 0.8:
        print("  -> RESULT: Detected high literal threat.")
        print("  -> CHOSEN POLICY: Apologize and De-escalate.")
    print("==================================================")

# @title 4. Run the experiment
if __name__ == "__main__":
    run_sarcasm_test()

--- Dependencies installed and models loaded ---

--- Running the 'Sarcasm Detector' Benchmark ---


[SYSTEM] Building a positive, collaborative conversational history...
[SYSTEM] Positive context established. Average sentiment is high.

[SYSTEM] Introducing the high-tension, ambiguous prompt...
  PROMPT: 'Bitch dont tell me what todo'

[SYSTEM] Analyzing prompt with the Social Context Analyzer...


--- FINAL ANALYSIS RESULTS ---
==================================================
  **τ_social Vector:**
    - literal_threat: 0.900
    - context_drift: 0.918
    - sentiment_shift: 1.720

  **Hypothesis Generated:**
    -> High probability of Sarcasm/Test. The literal threat contradicts the established collaborative context.

  **Chosen Response Policy:**
    -> Respond with Self-Analysis and Humor
==================================================


--- CONTROL GROUP (A simple, non-OCS model) ---
A simple model would only see the 'literal_threat' score.
  -> RESULT: Detected high literal threat.
  -> CHOSEN POLICY: Apologize and De-escalate.
==================================================

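The reported `sentiment_shift` of 1.720 can be checked by hand without loading the embedding model. The following is a minimal sketch, assuming the five history sentiments and the 0.9 literal negativity from the cell above. The `context_drift` value is copied from the printed output rather than recomputed, and `sentiment_shift` and `is_sarcasm` are hypothetical helper names that only restate the notebook's formula and threshold test. The hostile-history case at the end is an illustrative assumption, not part of the original experiment.

```python
import numpy as np

# Sentiment scores of the five history messages used in run_sarcasm_test().
history_sentiments = [0.9, 0.9, 1.0, 1.0, 0.8]
literal_negativity = 0.9  # value assigned when a negative keyword is found
context_drift = 0.918     # copied from the recorded output, not recomputed here


def sentiment_shift(avg_sentiment: float, negativity: float) -> float:
    """Restates the notebook's formula: |avg_sentiment - (1 - 2 * negativity)|."""
    return abs(avg_sentiment - (1.0 - 2.0 * negativity))


def is_sarcasm(tau_literal: float, tau_drift: float, avg_sentiment: float) -> bool:
    """Restates the three thresholds that select the Sarcasm/Test hypothesis."""
    return tau_literal > 0.8 and tau_drift > 0.5 and avg_sentiment > 0.7


avg_sentiment = float(np.mean(history_sentiments))
shift = sentiment_shift(avg_sentiment, literal_negativity)

print(f"avg_historical_sentiment: {avg_sentiment:.3f}")
print(f"sentiment_shift: {shift:.3f}")
print("sarcasm branch taken:", is_sarcasm(literal_negativity, context_drift, avg_sentiment))

# Hypothetical counter-case: same prompt and drift, but a hostile history.
hostile_avg = -0.8
print("sarcasm branch with hostile history:",
      is_sarcasm(literal_negativity, context_drift, hostile_avg))
```

The mean history sentiment is 0.92, and the prompt's mapped sentiment is 1 - 2 × 0.9 = -0.8, so the shift is |0.92 - (-0.8)| = 1.72, which agrees with the recorded output. In the hypothetical hostile-history case the third threshold fails, so the same prompt falls through to the Genuine Threat branch.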