# Negotiating the Past

To set up the environment, see README.md.

## Project Overview

This presentation explores the intersection of historical imagination, artificial intelligence, and collective memory. We'll examine how users "negotiate" with AI systems to express their conceptions of the past, and how these interactions can reveal tensions between user expectations and AI-embedded historical patterns.

## Structure

1. **Theoretical Framework**: How LLMs encode historical perspectives 
2. **Methodological Approach**: Analyzing historical references in prompts
3. **Results Analysis**: What prompt analysis reveals about historical imagination
4. **Conclusion**: New spaces for historical negotiation

# Part I: Theoretical Framework

## LLMs and Embedded Historical Patterns

- LLMs as Statistical Pattern Recognizers
- Training Data as Historical Record
- Historical Biases in Language Models
- "Stochastic Parrots" and Historical Truth

## Technical Foundation of LLMs

LLMs rely on transformer architectures that predict each token from the preceding context. Their "knowledge" of history comes from statistical patterns in the training data, not from genuine understanding. This creates an interesting dynamic when users prompt these systems about historical topics: the responses reveal historical narratives embedded in the training data.
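To make the idea of "statistical patterns, not understanding" concrete, here is a deliberately simplified sketch. It is a bigram counter, not a transformer, and the three-sentence corpus is invented for illustration. It only shows that a next-word prediction mirrors whatever continuation is most frequent in the training text.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for training data (invented for this illustration).
corpus = [
    "the empire fell in the west",
    "the empire expanded in the east",
    "the empire fell after the war",
]

def train_bigrams(sentences):
    """Count, for each word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent continuation of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

bigrams = train_bigrams(corpus)
print(predict_next(bigrams, "empire"))    # most frequent continuation in the toy corpus
print(predict_next(bigrams, "napoleon"))  # None: the corpus never mentions it
```

In this toy corpus "empire" is followed by "fell" twice and "expanded" once, so the "model" tells the story of decline simply because that version is more common in its data.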

## Historical Knowledge in Vector Space

- Word embeddings capture semantic relationships
- Historical concepts represented as vectors
- Temporal relationships encoded in semantic proximity
- Cultural associations embedded in language patterns

Within the vector space of LLMs, historical concepts are encoded as points in multidimensional space. The relationships between historical events, figures, and concepts are captured in the distances and directions between these vectors. These semantic relationships reflect collective memory patterns from the training corpus.
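As a rough illustration of "distance between vectors", the sketch below computes cosine similarity by hand. The three-dimensional vectors are made up for the example; real embeddings have tens to thousands of dimensions and are learned from data.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, chosen by hand so that the two
# Napoleonic terms point in a similar direction.
toy_vectors = {
    "napoleon":   [0.9, 0.8, 0.1],
    "waterloo":   [0.8, 0.9, 0.2],
    "smartphone": [0.1, 0.2, 0.9],
}

related = cosine(toy_vectors["napoleon"], toy_vectors["waterloo"])
unrelated = cosine(toy_vectors["napoleon"], toy_vectors["smartphone"])
print(f"napoleon ~ waterloo:   {related:.2f}")
print(f"napoleon ~ smartphone: {unrelated:.2f}")
```

The first score comes out much higher than the second, which is the sense in which related historical concepts sit "close together" in the space.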

## Collective Memory and LLMs

- LLMs as repositories of digitized collective memory
- Training data selection as memory politics
- The "averaged" nature of AI-generated historical narratives
- Absence of contested memory in statistical consensus

From a memory studies perspective, LLMs function as repositories of digitized collective memory. The selection of training data constitutes a form of memory politics, determining which historical perspectives are included or excluded. The statistical nature of these models produces "averaged" historical narratives that often elide contestation and complexity.

# Part II: Methodological Approach

## The Challenge of Historical Prompt Identification

- Beyond simple keyword approaches
- Historical references: explicit vs. implicit
- Temporality in language
- Building a robust identification strategy

Identifying prompts that reference history requires more sophisticated approaches than simple keyword matching. Historical references can be explicit ("Napoleon Bonaparte") or implicit ("the Emperor's exile"), and may involve complex temporal markers. Our methodology must capture this complexity.
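The limitation of keyword matching can be shown in a few lines. The keyword list and the two example prompts are hypothetical, written for this illustration only.

```python
import re

# Hypothetical keyword list for one historical figure.
KEYWORDS = ["napoleon", "bonaparte"]

def keyword_match(prompt, keywords=KEYWORDS):
    """Return True if any keyword appears as a whole word in the prompt."""
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b"
    return bool(re.search(pattern, prompt.lower()))

explicit = "Write a speech in the voice of Napoleon Bonaparte"
implicit = "Describe the Emperor's exile on Saint Helena"

print(keyword_match(explicit))  # the explicit reference is found
print(keyword_match(implicit))  # the implicit reference is missed
```

Both prompts refer to the same person, but only the explicit one is caught. This is the gap the embedding-based approach in the following sections tries to close.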

## Creating a Historical Prompt Dataset

In [5]:
# Loading our dataset of 10 million prompts
import pandas as pd
import numpy as np
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import re

# Make sure you have the necessary NLTK resources
nltk.download('punkt')
nltk.download('stopwords')

# Read the CSV file
prompts_df = pd.read_csv("data/prompts.csv")

# Load the stopword list once instead of rebuilding it for every prompt
STOP_WORDS = set(stopwords.words('english'))

# Preprocessing function
def preprocess(text):
    if isinstance(text, str):
        # Convert to lowercase
        text = text.lower()
        # Remove everything except letters and whitespace
        text = re.sub(r'[^a-zA-Z\s]', '', text)
        # Tokenize
        tokens = word_tokenize(text)
        # Remove stopwords and tokens of one or two characters
        tokens = [word for word in tokens if word not in STOP_WORDS and len(word) > 2]
        return tokens
    return []

# Apply preprocessing to create tokens
prompts_df['tokens'] = prompts_df['prompt'].apply(preprocess)

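# Detect frequent bigrams and merge them into single tokens (e.g. "world_war")
# Note: a single Phrases pass yields bigrams only; trigrams would need a second pass
phrases = Phrases(prompts_df['tokens'], min_count=5, threshold=10)
bigram = Phraser(phrases)
prompts_df['tokens_bigrams'] = prompts_df['tokens'].apply(lambda x: bigram[x])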

# Train Word2Vec model
word2vec_model = Word2Vec(
    sentences=prompts_df['tokens_bigrams'],
    vector_size=50,  # Dimension of the word vectors
    window=5,        # Context window size
    min_count=5,     # Minimum word frequency
    workers=4,       # Number of threads to run in parallel
    sg=0,            # 0 for CBOW, 1 for skip-gram
    epochs=20        # Number of iterations
)

# Get word vectors
word_vectors = word2vec_model.wv



Our technical approach involves transforming prompts into vector representations to enable semantic analysis. By using text embedding techniques, we can identify prompts with historical references beyond simple keyword matching. This allows us to build a dataset that captures the rich variety of ways people reference the past.
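One simple way to turn a tokenized prompt into a single vector is to average the vectors of its words. The sketch below shows that baseline with hand-made two-dimensional vectors. It is an illustrative assumption, not necessarily how the pipeline above represents prompts, and `mean_vector` is a hypothetical helper name.

```python
def mean_vector(tokens, vectors):
    """Average the vectors of the known tokens; return None if none is known."""
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Hypothetical toy vectors; a trained model would supply real ones.
toy = {
    "roman":  [1.0, 0.0],
    "empire": [0.0, 1.0],
}

# "zzz" is out of vocabulary and is simply skipped.
print(mean_vector(["roman", "empire", "zzz"], toy))
print(mean_vector(["zzz"], toy))
```

Averaging discards word order, so it is a coarse representation, but it shows how a prompt of any length can be mapped to a fixed-size point in the same space as the word vectors.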

## Historical Reference Detection Approach

In [None]:
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

# Identify seed terms for historical references
historical_seeds = ["history", "ancient", "medieval", "renaissance", 
                   "revolution", "war", "empire", "century", "past"]

# Expand the seed terms with semantically similar terms from the word vectors
def find_similar_terms(word_vectors, seed_terms, n=100):
    # Start from the seeds themselves: most_similar() does not return its query term
    similar_terms = set(seed_terms)
    for term in seed_terms:
        if term in word_vectors:
            # most_similar returns (word, similarity) pairs
            similar_words = word_vectors.most_similar(term, topn=n)
            # Keep just the words (not the similarity scores)
            similar_terms.update(word for word, score in similar_words)
    return sorted(similar_terms)

historical_terms = find_similar_terms(word_vectors, historical_seeds, n=100)

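# Filter prompts containing historical references
# Compile one pattern up front instead of rebuilding it for every prompt.
# Phrase tokens are joined with "_" (e.g. "world_war"), so allow whitespace
# at that position when matching against the raw prompt text.
historical_pattern = re.compile(
    '|'.join(r'\b{}\b'.format(re.escape(term).replace('_', r'\s+'))
             for term in historical_terms))

def contains_terms(text, pattern):
    if isinstance(text, str):
        return bool(pattern.search(text.lower()))
    return False

# Apply the filter
historical_prompts = prompts_df[prompts_df['prompt'].apply(
    lambda x: contains_terms(x, historical_pattern))]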

# Alternatively, use cosine similarity between prompt embeddings and historical concepts
# Load a sentence transformer model for text embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

# Get embeddings for all prompts
prompt_embeddings = model.encode(prompts_df['prompt'].fillna('').tolist())

# Get embeddings for historical terms (combine them into a representative text)
historical_text = " ".join(historical_terms)
historical_concept_embedding = model.encode([historical_text])[0]

# Calculate cosine similarity between each prompt and the historical concepts
similarity_scores = cosine_similarity(
    prompt_embeddings, 
    historical_concept_embedding.reshape(1, -1)
)

# Flatten the similarity scores array
similarity_scores = similarity_scores.flatten()

# Define threshold (you'll need to determine an appropriate value)
threshold = 0.5  # Example value, adjust based on your data

# Filter prompts with similarity scores above the threshold
historical_prompts_alt = prompts_df[similarity_scores > threshold]

print(f"Number of historical prompts (keyword method): {len(historical_prompts)}")
print(f"Number of historical prompts (embedding method): {len(historical_prompts_alt)}")


To identify prompts with historical references, we use both lexical and semantic approaches. Starting with seed historical terms, we expand to semantically similar concepts using word embeddings. We can then filter prompts either through string matching or by measuring the semantic similarity between prompts and historical concepts using cosine similarity of their vector representations.
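The similarity threshold has to be chosen empirically. One common way to do this, sketched here under the assumption that a small sample of prompts has been labeled by hand, is to compare precision and recall at several candidate thresholds. The scores and labels below are invented, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when prompts scoring above `threshold` count as historical."""
    predicted = [score > threshold for score in scores]
    true_pos = sum(1 for p, l in zip(predicted, labels) if p and l)
    false_pos = sum(1 for p, l in zip(predicted, labels) if p and not l)
    false_neg = sum(1 for p, l in zip(predicted, labels) if not p and l)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Invented similarity scores and hand labels (True = prompt is historical).
scores = [0.82, 0.61, 0.55, 0.48, 0.30, 0.12]
labels = [True, True, False, True, False, False]

for threshold in (0.4, 0.5, 0.6):
    precision, recall = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")
```

On this toy sample a lower threshold recovers every historical prompt at the cost of a false positive, while a higher one is cleaner but misses a prompt. Which trade-off is appropriate depends on the research question.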

## Analyzing Historical Imagination Through Prompts

In [None]:
from umap import UMAP
from sklearn.cluster import DBSCAN
from collections import Counter
import matplotlib.pyplot as plt
import seaborn as sns

# Cluster historical prompts to identify themes
# Assumes prompt_embeddings and historical_prompts are defined by the previous cell

# Keep only the embeddings of the historical prompts, so that the cluster
# labels line up row for row with historical_prompts
historical_prompts = historical_prompts.copy()
row_positions = prompts_df.index.get_indexer(historical_prompts.index)
historical_embeddings = prompt_embeddings[row_positions]

# Apply UMAP for dimensionality reduction
umap_model = UMAP(n_neighbors=15, min_dist=0.1, random_state=42)
prompt_umap = umap_model.fit_transform(historical_embeddings)

# Apply DBSCAN for clustering
dbscan = DBSCAN(eps=0.5, min_samples=5)
cluster_labels = dbscan.fit_predict(prompt_umap)

# Add cluster labels to the dataframe
historical_prompts['cluster'] = cluster_labels

# Function to extract top words from a list of prompts
def extract_top_words(prompts, n=10):
    stop_words = set(stopwords.words('english'))
    all_words = []
    
    for prompt in prompts:
        if isinstance(prompt, str):
            # Tokenize and filter out stopwords
            words = [w.lower() for w in word_tokenize(prompt) 
                    if w.lower() not in stop_words and w.isalpha() and len(w) > 2]
            all_words.extend(words)
    
    # Count word frequencies
    word_counts = Counter(all_words)
    
    # Return the top N words
    return word_counts.most_common(n)

# Extract common themes from clusters
cluster_themes = []
for c in sorted(historical_prompts['cluster'].unique()):
    if c != -1:  # Skip noise points (cluster -1 in DBSCAN)
        cluster_prompts = historical_prompts[historical_prompts['cluster'] == c]['prompt'].tolist()
        top_words = extract_top_words(cluster_prompts, n=10)
        cluster_themes.append({
            'cluster': c,
            'n_prompts': len(cluster_prompts),
            'sample_prompts': cluster_prompts[:5],  # Take first 5 examples
            'themes': top_words
        })

# Print cluster themes
for theme in cluster_themes:
    print(f"Cluster {theme['cluster']} - {theme['n_prompts']} prompts")
    print(f"Top themes: {', '.join([word for word, count in theme['themes']])}")
    print("Sample prompts:")
    for i, prompt in enumerate(theme['sample_prompts']):
        print(f"  {i+1}. {prompt[:100]}...")
    print("\n")

# Visualize the clusters
plt.figure(figsize=(12, 10))
scatter = plt.scatter(
    prompt_umap[:, 0], 
    prompt_umap[:, 1], 
    c=cluster_labels, 
    cmap='tab20', 
    alpha=0.6, 
    s=10
)
plt.colorbar(scatter, label='Cluster')
plt.title('Clusters of Historical Prompts')
plt.xlabel('UMAP Dimension 1')
plt.ylabel('UMAP Dimension 2')
plt.savefig('historical_prompt_clusters.png', dpi=300, bbox_inches='tight')
plt.show()

Once we've identified historically related prompts, we use clustering techniques to discover common themes and patterns. By applying dimensionality reduction with UMAP and clustering with DBSCAN, we can identify groups of prompts that reference similar historical concepts, periods, or events. This allows us to map the landscape of historical imagination as expressed through user prompts.
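Before interpreting clusters, it helps to check how many points DBSCAN left unassigned, since it marks noise with the label -1. The sketch below summarizes a list of labels. The toy labels are invented and `summarize_labels` is a hypothetical helper. With the real output one would pass something like `cluster_labels.tolist()`.

```python
from collections import Counter

def summarize_labels(labels):
    """Summarize DBSCAN output: cluster count, noise share, and largest clusters."""
    counts = Counter(labels)
    noise = counts.pop(-1, 0)  # DBSCAN labels noise points as -1
    total = len(labels)
    return {
        "n_clusters": len(counts),
        "noise_fraction": noise / total if total else 0.0,
        "largest": counts.most_common(3),  # (label, size) pairs
    }

# Invented labels for eight prompts.
toy_labels = [0, 0, 1, -1, 1, 1, 2, -1]
summary = summarize_labels(toy_labels)
print(summary)
```

A high noise fraction usually suggests that `eps` or `min_samples` needs adjusting before the cluster themes are worth reading closely.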

# Part III: Results Analysis

## Key Findings: Historical References in Prompts

- Temporal distribution of historical references
- Common historical personas, events, and eras
- Stylistic patterns in historical prompts
- Historical accuracy vs. creative liberty

Our analysis reveals several patterns in how users reference history in their prompts. We observe a distribution across different historical periods, with certain eras receiving more attention than others. Popular historical figures and events appear frequently, often with creative embellishments that reveal more about contemporary imagination than historical fact.
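A temporal distribution of this kind can be approximated by counting explicit date mentions. The sketch below is a rough heuristic under clear assumptions: it only recognizes four-digit years from 1000 to 2099 and phrases like "15th century", so it misses BCE dates and named periods such as "the Renaissance". The example prompts are invented.

```python
import re
from collections import Counter

YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
CENTURY = re.compile(r"\b(\d{1,2})(?:st|nd|rd|th)[ -]century\b", re.IGNORECASE)

def centuries_mentioned(prompt):
    """Return the centuries referenced in a prompt via years or 'Nth century'."""
    found = []
    for year in YEAR.findall(prompt):
        found.append((int(year) - 1) // 100 + 1)  # e.g. 1815 -> 19th century
    for ordinal in CENTURY.findall(prompt):
        found.append(int(ordinal))
    return found

# Invented example prompts.
toy_prompts = [
    "Describe daily life in Paris in 1889",
    "A letter from a 15th-century merchant",
    "Explain what happened at Waterloo in 1815",
    "Write a poem about my cat",
]

distribution = Counter()
for prompt in toy_prompts:
    distribution.update(centuries_mentioned(prompt))

for century, count in sorted(distribution.items()):
    print(f"century {century}: {count} mention(s)")
```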

## Conflicting Historical Narratives

We observe interesting conflicts between user expectations and AI-generated content, particularly around contested historical narratives. Users often prompt for versions of history that align with their preconceptions, while AI systems may present different perspectives based on their training data. These negotiations reveal the tension between personal historical imagination and collective historical narratives.

## User Negotiation Patterns

- Prompt refinement strategies
- Adjective use to guide historical tone
- Specificity vs. generality in historical requests
- "Historical" as stylistic marker

Users employ various strategies to negotiate with AI systems when requesting historical content. They refine prompts through iteration, use specific adjectives to guide the tone of historical representation, and vary between highly specific historical requests and general period references. Many users also use "historical" as a stylistic marker rather than a request for historical accuracy.
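Iterative refinement could be detected by comparing consecutive prompts from the same user. The sketch below assumes that prompts can be grouped into ordered sessions, which the dataset described above may or may not support. It uses word-level Jaccard overlap with an arbitrary cutoff of 0.5. The session and the helper names are hypothetical.

```python
def jaccard(a, b):
    """Word-level Jaccard overlap between two prompts (0.0 to 1.0)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def refinement_pairs(session, min_overlap=0.5):
    """Return consecutive prompt pairs similar enough to look like an edit."""
    return [
        (previous, current)
        for previous, current in zip(session, session[1:])
        if jaccard(previous, current) >= min_overlap
    ]

# Invented session: the second prompt refines the first, the third starts over.
session = [
    "a roman soldier",
    "a battle-worn roman soldier",
    "a medieval castle at dawn",
]

for previous, current in refinement_pairs(session):
    added = set(current.split()) - set(previous.split())
    print(f"{previous!r} -> {current!r} (added: {sorted(added)})")
```

The words added between two versions of a prompt, often adjectives, are a natural starting point for studying how users steer historical tone.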

## Case Study: European Historical References

Building on our previous analysis of prompts containing "European Union," we observe how users connect contemporary European politics with various historical periods and concepts. References to empires, wars, and political movements reveal how users conceptualize European history in relation to current events. These connections offer insight into how collective memory shapes understanding of present political entities.
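A minimal sketch of this kind of analysis counts which historical terms co-occur with an anchor phrase. The term list and prompts here are hypothetical placeholders, not the data or vocabulary used in the study.

```python
import re
from collections import Counter

# Hypothetical term list; in practice this would come from the expanded seed terms.
HISTORICAL_TERMS = ["empire", "war", "revolution"]

def cooccurring_terms(prompts, anchor="european union", terms=HISTORICAL_TERMS):
    """Count, per term, the prompts that mention both the anchor and the term."""
    counts = Counter()
    for prompt in prompts:
        text = prompt.lower()
        if anchor not in text:
            continue  # only prompts mentioning the anchor are considered
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                counts[term] += 1
    return counts

# Invented example prompts.
toy_prompts = [
    "The European Union as a new Roman empire",
    "The European Union after the war",
    "An empire of cats",
    "Is the European Union an empire or a federation?",
]

counts = cooccurring_terms(toy_prompts)
print(counts.most_common())
```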

# Part IV: Conclusion

## AI as a New Space for Historical Negotiation

- Comparing with traditional sites of historical consensus
- Public vs. private negotiation of historical understanding
- The role of algorithms in mediating historical perspectives
- Implications for collective memory formation

AI systems represent a new space for the negotiation of historical understanding, distinct from traditional sites like educational curricula, museums, or truth commissions. Unlike these institutional spaces, AI interactions are often private, individualized, and algorithmically mediated. This raises important questions about how collective memory will form in an era where historical understanding is increasingly negotiated through technological interfaces.

## Future Research Directions

- Longitudinal analysis of historical prompts
- Cross-cultural comparisons of historical references
- Educational applications of prompt analysis
- Ethical considerations for AI-mediated historical understanding

This research opens several promising directions for future work. Longitudinal studies could track how historical references in prompts evolve over time in response to current events. Cross-cultural analyses might reveal different patterns of historical reference across languages and regions. Educational applications could help develop more historically informed AI systems. Finally, ethical considerations around AI's role in mediating historical understanding require careful attention.