# Tutorial 4: The Preservation Principle
## Functors: Structure-Preserving Maps Between Categories

---

### A Problem of Translation

*To the research assistant:*

*The Archive faces a persistent challenge: scholars in different regions use different classification systems. The Capital Archive catalogs creatures as 'predators' and 'aquatic_fauna.' The Western Archive uses 'hunters' and 'water_dwellers.' The Northern territories speak of 'meat_takers' and 'river_creatures.'*

*For decades, translations between systems were haphazard. A document classified in the Capital might be misplaced when sent to the Western branch—or worse, reclassified inconsistently.*

*Tessery Vold recognized that the problem was not about individual translations, but about structure. She wrote: "We must not translate terms. We must translate the entire classification system—preserving the relationships between categories, not just the categories themselves."*

*Vold called these structure-preserving translations 'preservation maps.' Modern category theorists call them 'functors.'*

*Your task: Examine the Archive's translation records. Determine which translations are genuine preservation maps (functors) and which break categorical structure. The integrity of cross-regional scholarship depends on this analysis.*

—*Archive Review Committee, Year 934*

---

## What You Will Learn

In this tutorial, you will learn to:

1. Understand **functors** as structure-preserving maps between categories
2. Identify the two components of a functor: the object map and the morphism map
3. Verify the **functor laws**: preserving composition and identity
4. Distinguish valid functors from structure-breaking translations
5. Recognize functors in machine learning contexts

By the end, you will understand:
- How functors formalize the notion of "same structure, different representation"
- Why preserving composition is the key requirement
- Applications to neural network equivariance and data transformation

---

## What is a Functor?

A **functor** F: C → D between categories C and D consists of:

1. **Object map**: For each object A in C, an object F(A) in D
2. **Morphism map**: For each morphism f: A → B in C, a morphism F(f): F(A) → F(B) in D

Subject to two laws:

### Functor Law 1: Preserve Composition

For composable morphisms f: X → Y and g: Y → Z in C:

**F(g ∘ f) = F(g) ∘ F(f)**

The image of a composition is the composition of images.

### Functor Law 2: Preserve Identity

For each object A:

**F(id_A) = id_{F(A)}**

The image of an identity is the identity of the image.
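
To make the two laws concrete, here is a minimal sketch using the familiar list construction from programming: F sends a type A to "lists of A" and a function f to "apply f to every element." The helper names `compose`, `identity`, and `fmap` are illustrative choices, not part of the Archive dataset or any library. Checking on sample data illustrates the laws; it does not prove them.

```python
def compose(g, f):
    """Return g ∘ f: apply f first, then g."""
    return lambda x: g(f(x))

def identity(x):
    """The identity morphism: return the input unchanged."""
    return x

def fmap(f):
    """Morphism map of the list functor: lift f: A -> B to lists of A -> lists of B."""
    return lambda xs: [f(x) for x in xs]

f = len                  # f: str -> int
g = lambda n: n * 2      # g: int -> int

sample = ["stakdur", "yeller", "reed_marsh"]

# Law 1: F(g ∘ f) = F(g) ∘ F(f)
lhs = fmap(compose(g, f))(sample)
rhs = compose(fmap(g), fmap(f))(sample)
composition_preserved = lhs == rhs

# Law 2: F(id_A) = id_{F(A)}
identity_preserved = fmap(identity)(sample) == identity(sample)

print("F(g ∘ f) on sample:   ", lhs)
print("F(g) ∘ F(f) on sample:", rhs)
print("Composition preserved:", composition_preserved)
print("Identity preserved:   ", identity_preserved)
```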

---

### Vold's Interpretation

> *"A preservation map does not merely translate names. It translates structure. If stakdur passes to reed_marsh in our system, then the translated stakdur must pass to the translated reed_marsh in theirs. If a creature remains in place, the translated creature must remain in the translated place. Otherwise, the translation is mere word-replacement—not understanding."*

---

## Part 1: Loading the Translation Data

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx

# Set style
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette('deep')

print("Libraries loaded. Ready to study functors.")

In [None]:
# Load the Regional Translations data
BASE_URL = "https://raw.githubusercontent.com/buildLittleWorlds/densworld-datasets/main/data/"

translations = pd.read_csv(BASE_URL + "regional_translations.csv")

print(f"Translations loaded: {len(translations)} records")
print(f"\nColumns: {list(translations.columns)}")

In [None]:
# Preview the data
translations.head(10)

---

## Part 2: Understanding the Translation Schema

In [None]:
# What types of translations exist?
print("Translation types:")
print(translations['translation_type'].value_counts())

In [None]:
# What source/target system pairs exist?
system_pairs = translations.groupby(['source_system', 'target_system']).size().reset_index(name='count')
system_pairs_sorted = system_pairs.sort_values('count', ascending=False)

print("Translation pathways (source → target):")
print(system_pairs_sorted)

In [None]:
# Most importantly: which translations preserve structure?
print("\nStructure preservation:")
print(translations['preserves_structure'].value_counts())

The key column is `preserves_structure`. When True, the translation is (claimed to be) a valid functor. When False, it breaks categorical structure.
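
Because functoriality is a property of a whole map rather than of a single term, one reasonable reading is that a source → target pathway counts as functorial only if every term translation along it preserves structure. The sketch below applies that reading to a small invented table; the column names match the dataset, but the rows are toy data chosen for illustration.

```python
import pandas as pd

# Toy rows for illustration only; not taken from regional_translations.csv
toy = pd.DataFrame({
    "source_system": ["capital_archive", "capital_archive",
                      "capital_taxonomy", "capital_taxonomy"],
    "target_system": ["western_archive", "western_archive",
                      "western_taxonomy", "western_taxonomy"],
    "source_term": ["field_records", "creature_observations",
                    "predators", "anomalous_entities"],
    "target_term": ["expedition_documents", "beast_logs",
                    "hunters", "unclassified"],
    "preserves_structure": [True, True, True, False],
})

# A pathway passes only if all of its term translations preserve structure
pathway_ok = (
    toy.groupby(["source_system", "target_system"])["preserves_structure"]
       .all()
)
print(pathway_ok)
```

The same `groupby(...).all()` pattern can be tried on the real `translations` table, provided `preserves_structure` is stored as a boolean column.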

---

## Part 3: Examining a Valid Functor

Let's examine the functor from Capital Archive to Western Archive classification.

In [None]:
# Filter to the archive exchange functor
archive_functor = translations[
    (translations['source_system'] == 'capital_archive') &
    (translations['target_system'] == 'western_archive')
]

print("Capital Archive → Western Archive Functor:")
print("=" * 60)
print(archive_functor[['source_term', 'target_term', 'preserves_structure', 'confidence']])

In [None]:
# Build the object map: source_term → target_term
object_map = dict(zip(archive_functor['source_term'], archive_functor['target_term']))

print("Object map F: Capital → Western")
print("-" * 40)
for src, tgt in object_map.items():
    print(f"  F({src}) = {tgt}")

This shows the object part of the functor. But a functor also maps morphisms!
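
One caveat before moving on: `dict(zip(...))` silently keeps only the last target when a source term appears more than once, but an object map must assign exactly one target to each source object. A quick well-definedness check can be sketched as follows. The rows are invented toy data, and `targets_per_source` and `ambiguous` are illustrative names.

```python
import pandas as pd

# Toy rows for illustration only: 'predators' is deliberately given two targets
toy = pd.DataFrame({
    "source_term": ["predators", "aquatic_fauna", "predators"],
    "target_term": ["hunters", "water_dwellers", "meat_takers"],
})

# Count distinct targets per source term; more than one means the map is not a function
targets_per_source = toy.groupby("source_term")["target_term"].nunique()
ambiguous = targets_per_source[targets_per_source > 1].index.tolist()

print("Ambiguous source terms:", ambiguous)
```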

### Mapping Morphisms

In the classification pre-order category:
- A morphism A → B means "A is contained in B"
- If F is a functor, then A → B in Capital must map to F(A) → F(B) in Western

Let's check: if `creature_observations` is a subcategory of `field_records` in Capital, then `beast_logs` (= F(creature_observations)) must be a subcategory of `expedition_documents` (= F(field_records)) in Western.
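
In a pre-order there is at most one morphism between any two objects, so once every arrow A → B has an image arrow F(A) → F(B), the composition and identity laws hold automatically. The sketch below checks exactly that condition on toy data. The first two names in each system come from the example above; `all_records` and `all_documents` are hypothetical top-level categories added to show a composite arrow, and the helper names are illustrative.

```python
from itertools import product

def preorder_closure(objects, arrows):
    """Reflexive-transitive closure of a set of (child, parent) pairs."""
    leq = set(arrows) | {(x, x) for x in objects}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so new pairs can be added safely
        for (a, b), (c, d) in product(list(leq), repeat=2):
            if b == c and (a, d) not in leq:
                leq.add((a, d))
                changed = True
    return leq

def preserves_arrows(object_map, source_leq, target_leq):
    """True if every source arrow A -> B has an arrow F(A) -> F(B) in the target."""
    return all((object_map[a], object_map[b]) in target_leq
               for a, b in source_leq)

# Toy data; all_records and all_documents are hypothetical
capital_objects = ["creature_observations", "field_records", "all_records"]
capital_arrows = {("creature_observations", "field_records"),
                  ("field_records", "all_records")}
western_objects = ["beast_logs", "expedition_documents", "all_documents"]
western_arrows = {("beast_logs", "expedition_documents"),
                  ("expedition_documents", "all_documents")}
toy_map = {"creature_observations": "beast_logs",
           "field_records": "expedition_documents",
           "all_records": "all_documents"}

capital_leq = preorder_closure(capital_objects, capital_arrows)
western_leq = preorder_closure(western_objects, western_arrows)
functorial = preserves_arrows(toy_map, capital_leq, western_leq)

print("Arrows in Capital (with identities and composites):", len(capital_leq))
print("Every arrow preserved:", functorial)
```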

In [None]:
# Load classification data to check morphism preservation
classifications = pd.read_csv(BASE_URL + "archive_classifications.csv")

# Find classification relationships in Capital
capital_classes = classifications[classifications['region_system'] == 'capital_archive']

# Build the Capital classification category
capital_relations = capital_classes[capital_classes['child_category'] != capital_classes['parent_category']]

print("Capital Archive containment relations (morphisms):")
for _, row in capital_relations.head(10).iterrows():
    print(f"  {row['child_category']} → {row['parent_category']}")

In [None]:
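# Check whether the functor preserves these relations
print("Verifying functor preserves morphisms:")
print("=" * 60)

# Western containment relations, assuming the classification file also
# holds rows with region_system == 'western_archive'
western_classes = classifications[classifications['region_system'] == 'western_archive']
western_relations = set(zip(western_classes['child_category'],
                            western_classes['parent_category']))

# Example: creature_observations → field_records
# should map to: beast_logs → expedition_documents
test_relations = [
    ('creature_observations', 'field_records'),
    ('mathematical_treatises', 'scholarly_works'),
    ('philosophical_treatises', 'scholarly_works'),
]

for child, parent in test_relations:
    if child in object_map and parent in object_map:
        f_child = object_map[child]
        f_parent = object_map[parent]
        # Direct relations only; a full check would use the transitive closure
        preserved = (f_child, f_parent) in western_relations
        print(f"\n{child} → {parent} (Capital)")
        print("  maps to")
        print(f"  {f_child} → {f_parent} (Western)")
        print("  ✓ Morphism preserved" if preserved else "  ✗ Not found among Western relations")
    else:
        print(f"\n{child} → {parent}: terms not in functor object map")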

---

## Part 4: A Non-Functor: Structure-Breaking Translation

Not all translations are functors. Let's examine one that breaks structure.

In [None]:
# Find translations that do NOT preserve structure
non_functors = translations[translations['preserves_structure'] == False]

print(f"Structure-breaking translations: {len(non_functors)}")
print("\nExamples:")
print(non_functors[['source_system', 'target_system', 'source_term', 'target_term', 'notes']].head(10))

In [None]:
# Examine a specific case: anomalous_entities translated into western_taxonomy
anomaly_translation = translations[
    (translations['source_term'] == 'anomalous_entities') &
    (translations['target_system'] == 'western_taxonomy')
]

print("The Anomaly Problem:")
print("=" * 50)
if anomaly_translation.empty:
    # Guard against an IndexError from .iloc[0] when nothing matches
    print("No matching translation found.")
else:
    print(anomaly_translation[['source_term', 'target_term', 'preserves_structure', 'notes']].iloc[0])

### Why This Breaks Structure

In the Capital taxonomy, `anomalous_entities` is a specific subcategory of `creatures` with particular relationships to other categories.

But the Western system has no equivalent—it maps to `unclassified`, which doesn't have the same relationships.

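If `yeller` → `anomalous_entities` → `creatures` holds in Capital, a functor must supply a matching chain in Western. But `unclassified` does not sit inside the Western creature hierarchy, so the arrow `anomalous_entities` → `creatures` has no image there, and the morphism isn't preserved.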

This is NOT a functor—it's just a word replacement that loses structural information.
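
The failure can be made explicit by listing which source arrows have no image. The sketch below is a toy reconstruction of the anomaly problem: `howler` and `beasts` are hypothetical Western names invented for illustration (only `unclassified` comes from the example above), and `broken_arrows` is an illustrative helper.

```python
def with_identities(objects, arrows):
    """Add an identity arrow (x, x) for every object."""
    return set(arrows) | {(x, x) for x in objects}

def broken_arrows(object_map, source_leq, target_leq):
    """Source arrows A -> B for which F(A) -> F(B) is missing in the target."""
    return sorted((a, b) for a, b in source_leq
                  if (object_map[a], object_map[b]) not in target_leq)

# Capital side, already closed under composition
capital_leq = with_identities(
    ["yeller", "anomalous_entities", "creatures"],
    {("yeller", "anomalous_entities"),
     ("anomalous_entities", "creatures"),
     ("yeller", "creatures")},
)

# Western side: 'unclassified' is related to nothing but itself
# ('howler' and 'beasts' are hypothetical names)
western_leq = with_identities(
    ["howler", "unclassified", "beasts"],
    {("howler", "beasts")},
)

toy_map = {"yeller": "howler",
           "anomalous_entities": "unclassified",
           "creatures": "beasts"}

broken = broken_arrows(toy_map, capital_leq, western_leq)
for a, b in broken:
    print(f"  {a} → {b} has no image {toy_map[a]} → {toy_map[b]}")
```

In this toy setup the composite `yeller` → `creatures` still has an image, but both arrows that pass through `anomalous_entities` are lost, which is what "word replacement without structure" looks like in practice.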

In [None]:
# Visualize structure-preserving vs. structure-breaking
fig, ax = plt.subplots(figsize=(10, 6))

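# Count by type and structure preservation
# (assumes preserves_structure is boolean; fixing the column order keeps
# the colors and legend labels below aligned with False/True)
structure_by_type = (
    translations.groupby(['translation_type', 'preserves_structure'])
    .size()
    .unstack(fill_value=0)
    .reindex(columns=[False, True], fill_value=0)
)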

structure_by_type.plot(kind='barh', ax=ax, stacked=True, 
                        color=['salmon', 'lightgreen'], alpha=0.8)
ax.set_xlabel('Number of Translations')
ax.set_ylabel('Translation Type')
ax.set_title('Structure Preservation by Translation Type')
ax.legend(['Structure broken', 'Structure preserved'], loc='lower right')
plt.tight_layout()
plt.show()

Notice that:
- **Category mappings** and **terminology translations** usually preserve structure
- **Metaphorical translations** and some **cross-framework** mappings often break structure

This makes intuitive sense: metaphors sacrifice precision for expressiveness.

---

## Part 5: Vold's Terminology Functor

Vold created a functor from mathematical notation to natural language, making category theory accessible.

In [None]:
# Find Vold's terminology translations
vold_functor = translations[
    (translations['translator'] == 'tessery_vold') &
    (translations['source_system'] == 'mathematical_notation')
]

print("Vold's Terminology Functor: Math → Natural Language")
print("=" * 60)
print(vold_functor[['source_term', 'target_term', 'preserves_structure', 'notes']])

In [None]:
# Build the Vold terminology dictionary
vold_terms = dict(zip(vold_functor['source_term'], vold_functor['target_term']))

print("Vold's Translation Dictionary:")
print("-" * 40)
for math_term, natural_term in vold_terms.items():
    print(f"  {math_term:25} → {natural_term}")

Vold's genius was recognizing that this translation could preserve structure. The relationships between mathematical concepts (morphism composes with morphism, functor maps objects to objects) map to relationships between natural language concepts (passage chains with passage, preservation map relates things to things).

---

## Part 6: Inverse Functors

Some functors have inverses—you can translate back and forth without loss.

In [None]:
# Find the inverse: Western → Capital
inverse_functor = translations[
    (translations['source_system'] == 'western_archive') &
    (translations['target_system'] == 'capital_archive')
]

print("Western Archive → Capital Archive (Inverse Functor):")
print("=" * 60)
print(inverse_functor[['source_term', 'target_term', 'preserves_structure', 'notes']])

In [None]:
# Verify: F followed by F^{-1} should return to the original term
print("Verifying F^{-1} ∘ F = identity on Capital terms:")
print("=" * 50)

# Build inverse map
inverse_map = dict(zip(inverse_functor['source_term'], inverse_functor['target_term']))

# Check round-trip: Capital → Western → Capital
for capital_term, western_term in object_map.items():
    if western_term in inverse_map:
        roundtrip = inverse_map[western_term]
        match = "✓" if roundtrip == capital_term else "✗"
        print(f"  {capital_term} → {western_term} → {roundtrip} {match}")

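When G ∘ F = identity and F ∘ G = identity, we say F and G are **inverse functors**, and the two categories are **isomorphic**. (An *equivalence* of categories is a weaker notion: the round trips need only be naturally isomorphic to the identities, not equal to them.)

If every term survives the round trip in both directions, and the containment relations do too, the Capital and Western Archive classification systems are isomorphic as categories: they have the same structure, just different names.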
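
The round-trip cell above checks one direction only. A two-sided check on object maps can be sketched as below. The dictionaries are toy data built from names used in this tutorial, and `is_inverse_pair` is an illustrative helper. Note that it tests objects only: for classification pre-orders, both maps would also need to preserve containment.

```python
def is_inverse_pair(F, G):
    """True if G∘F is the identity on F's domain and F∘G is the identity on G's domain."""
    gf_is_id = all(G.get(F[a]) == a for a in F)
    fg_is_id = all(F.get(G[b]) == b for b in G)
    return gf_is_id and fg_is_id

# Toy object maps for illustration
F = {"creature_observations": "beast_logs",
     "field_records": "expedition_documents"}
G = {"beast_logs": "creature_observations",
     "expedition_documents": "field_records"}

# A lossy translation: two Capital terms collapse onto one Western term
F_lossy = {"predators": "hunters", "apex_predators": "hunters"}
G_lossy = {"hunters": "predators"}

lossless = is_inverse_pair(F, G)
lossy = is_inverse_pair(F_lossy, G_lossy)

print("F and G are inverse on objects:            ", lossless)
print("F_lossy and G_lossy are inverse on objects:", lossy)
```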

---

## Part 7: Cross-Framework Translations

More interesting are translations between different scholarly frameworks.

In [None]:
# Translations between Vold's and Keth's terminology
vold_keth = translations[
    ((translations['source_system'] == 'vold_terminology') & 
     (translations['target_system'] == 'keth_terminology')) |
    ((translations['source_system'] == 'keth_terminology') & 
     (translations['target_system'] == 'vold_terminology'))
]

print("Vold ↔ Keth Framework Translations:")
print("=" * 60)
print(vold_keth[['source_system', 'source_term', 'target_term', 'preserves_structure', 'notes']])

Vold recognized structural equivalences between her passage framework and Keth's flow dynamics:

| Vold Term | Keth Term | Conceptual Match |
|-----------|-----------|------------------|
| passage | flow_line | Both describe directed movement |
| composition | path_concatenation | Both describe chaining |
| object_neighborhood | basin | Local structure around an object |
| identity_morphism_sink | attractor | Points of rest/stability |

This is NOT a coincidence. Vold saw that different mathematical frameworks often describe the same underlying structure.

---

## Part 8: Visualizing Functor Networks

Let's build a graph showing all the translation relationships between systems.

In [None]:
# Build a graph of systems connected by functors
G_systems = nx.DiGraph()

# Add edges for each translation (weighted by structure preservation)
for _, row in translations.iterrows():
    src = row['source_system']
    tgt = row['target_system']
    preserves = row['preserves_structure']
    
    if G_systems.has_edge(src, tgt):
        # Update existing edge
        G_systems[src][tgt]['count'] += 1
        if preserves:
            G_systems[src][tgt]['preserved'] += 1
    else:
        G_systems.add_edge(src, tgt, count=1, preserved=1 if preserves else 0)

print(f"Translation network: {G_systems.number_of_nodes()} systems, {G_systems.number_of_edges()} pathways")

In [None]:
# Visualize the translation network
fig, ax = plt.subplots(figsize=(14, 10))

pos = nx.spring_layout(G_systems, k=2, iterations=50, seed=42)

# Color edges by preservation ratio
edge_colors = []
for u, v in G_systems.edges():
    ratio = G_systems[u][v]['preserved'] / G_systems[u][v]['count']
    edge_colors.append(ratio)

# Draw
nx.draw_networkx_nodes(G_systems, pos, node_size=2000, node_color='lightblue', ax=ax)
nx.draw_networkx_labels(G_systems, pos, font_size=7, ax=ax)
edges = nx.draw_networkx_edges(G_systems, pos, edge_color=edge_colors, 
                                edge_cmap=plt.cm.RdYlGn, arrows=True,
                                arrowsize=15, connectionstyle='arc3,rad=0.1', ax=ax)

# Colorbar
sm = plt.cm.ScalarMappable(cmap=plt.cm.RdYlGn, norm=plt.Normalize(0, 1))
sm.set_array([])
plt.colorbar(sm, ax=ax, label='Proportion Structure-Preserving')

ax.set_title('Translation Network Between Classification Systems\n(Green = functorial, Red = structure-breaking)', fontsize=12)
ax.axis('off')
plt.tight_layout()
plt.show()

---

## Part 9: The ML Connection

Functors appear throughout machine learning:

### 1. Equivariant Neural Networks

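An equivariant function f satisfies: f(g · x) = g · f(x)

This is closely tied to functoriality. A group can be viewed as a category with a single object, and each of its actions (one on inputs, one on outputs) is a functor from that category into a category of spaces. The equivariance condition says that f commutes with both actions, which makes f a structure-preserving map between the two functors (a natural transformation).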
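
The equivariance condition is easy to test numerically. In the sketch below the group is the cyclic shifts of a 1-D signal, acting through `np.roll`; `circular_smooth` and `position_weighted` are toy functions invented for illustration, one equivariant and one not.

```python
import numpy as np

def shift(x, k):
    """Group action: rotate the signal cyclically by k positions."""
    return np.roll(x, k)

def circular_smooth(x):
    """Average each entry with its two cyclic neighbours (a circular convolution)."""
    return (np.roll(x, -1) + x + np.roll(x, 1)) / 3.0

def position_weighted(x):
    """Multiply each entry by its index, which ties the output to absolute position."""
    return x * np.arange(len(x))

def is_equivariant(f, x):
    """Check f(g · x) == g · f(x) for every cyclic shift g."""
    return all(np.allclose(f(shift(x, k)), shift(f(x), k))
               for k in range(len(x)))

x = np.array([1.0, 4.0, 2.0, 8.0, 5.0])

smooth_ok = is_equivariant(circular_smooth, x)
weighted_ok = is_equivariant(position_weighted, x)

print("circular_smooth equivariant:  ", smooth_ok)
print("position_weighted equivariant:", weighted_ok)
```

Convolutional layers owe their translation equivariance to the same mechanism as `circular_smooth`: the operation depends only on relative positions, never absolute ones.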

### 2. Data Augmentation

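When we rotate an image, the label should stay the same. This can be read as a functor:
- Objects: images
- Morphisms: transformations (rotations, flips)
- The label function sends each image to its label and each transformation to the identity on that label, so it is a functor from images to a category of labels that has only identity morphisms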
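
As a minimal sketch of this invariance, the code below applies a few standard NumPy transformations to a tiny "image" and checks whether the label changes. Both classifiers are toy functions invented for illustration: one looks at the whole image, the other at a single corner.

```python
import numpy as np

def mean_label(img):
    """Toy classifier: 'bright' if the mean intensity exceeds 0.5, else 'dark'."""
    return "bright" if img.mean() > 0.5 else "dark"

def corner_label(img):
    """Toy classifier that inspects only the top-left pixel."""
    return "bright" if img[0, 0] > 0.5 else "dark"

def is_invariant(label_fn, img, transforms):
    """Check that label_fn(t(img)) == label_fn(img) for every transformation t."""
    return all(label_fn(t(img)) == label_fn(img) for t in transforms)

img = np.array([[0.9, 0.1],
                [0.8, 0.7]])
transforms = [np.rot90, np.fliplr, np.flipud]

mean_invariant = is_invariant(mean_label, img, transforms)
corner_invariant = is_invariant(corner_label, img, transforms)

print("mean_label invariant:  ", mean_invariant)
print("corner_label invariant:", corner_invariant)
```

Only the invariant classifier sends every transformation to "do nothing" on the label side; the corner-based one changes its answer when the image is rotated.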

### 3. Embedding Functions

A good embedding preserves structure:
- If word A is related to word B, their embeddings should be related
- This is functoriality in spirit: relationships between words map to relationships between embeddings

In [None]:
# Demonstrate: A simple functor from categories to vectors
# This simulates an embedding that preserves structure

np.random.seed(42)

# Source category: creature taxonomy (pre-order)
creatures = ['creatures', 'predators', 'aquatic_fauna', 'apex_predators', 'deep_dwellers']

# Functor: map each category to a vector
# Structure-preserving means: if A ≤ B, then embed(A) "points toward" embed(B)

# Hierarchical embedding: deeper categories have larger first component
embeddings = {
    'creatures': np.array([0.0, 0.5]),          # Root
    'predators': np.array([0.3, 0.3]),           # Level 2
    'aquatic_fauna': np.array([0.3, 0.7]),       # Level 2
    'apex_predators': np.array([0.6, 0.2]),      # Level 3
    'deep_dwellers': np.array([0.6, 0.8]),       # Level 3
}

print("Creature Taxonomy Embeddings:")
for cat, vec in embeddings.items():
    print(f"  F({cat}) = {vec}")

In [None]:
# Visualize the embedding (functor image)
fig, ax = plt.subplots(figsize=(10, 8))

# Plot embeddings
for cat, vec in embeddings.items():
    ax.scatter(vec[0], vec[1], s=200, c='steelblue', alpha=0.7, zorder=3)
    ax.annotate(cat, (vec[0], vec[1]), xytext=(5, 5), textcoords='offset points', fontsize=10)

# Draw containment relationships as arrows
relations = [
    ('predators', 'creatures'),
    ('aquatic_fauna', 'creatures'),
    ('apex_predators', 'predators'),
    ('deep_dwellers', 'aquatic_fauna'),
]

for child, parent in relations:
    start = embeddings[child]
    end = embeddings[parent]
    ax.annotate('', xy=end, xytext=start,
                arrowprops=dict(arrowstyle='->', color='gray', lw=1.5))

ax.set_xlabel('Embedding dimension 1 (specificity)')
ax.set_ylabel('Embedding dimension 2 (aquatic vs. terrestrial)')
ax.set_title('Functor: Taxonomy Category → Vector Space\n(Arrows show "is subcategory of")')
ax.set_xlim(-0.1, 0.8)
ax.set_ylim(0.0, 1.0)
plt.tight_layout()
plt.show()

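The embedding preserves structure in its first coordinate: every subcategory sits strictly to the right of its parent, so each containment arrow points leftward, toward the root. Reading only that coordinate, the embedding is an order-reversing map from the taxonomy pre-order to the real numbers (equivalently, a functor into the reals ordered by ≥).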
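
Rather than reading this off the plot, the claim can be checked directly. The sketch below repeats the embedding values and containment pairs from the cells above and tests whether each child has a larger first coordinate than its parent; `respects_depth` and the `scrambled` variant are illustrative additions. Checking the four generating arrows is enough, since the ordering of real numbers is transitive.

```python
import numpy as np

embeddings = {
    "creatures": np.array([0.0, 0.5]),
    "predators": np.array([0.3, 0.3]),
    "aquatic_fauna": np.array([0.3, 0.7]),
    "apex_predators": np.array([0.6, 0.2]),
    "deep_dwellers": np.array([0.6, 0.8]),
}

# (child, parent) containment pairs
relations = [
    ("predators", "creatures"),
    ("aquatic_fauna", "creatures"),
    ("apex_predators", "predators"),
    ("deep_dwellers", "aquatic_fauna"),
]

def respects_depth(emb, pairs):
    """True if every child has a strictly larger first coordinate than its parent."""
    return all(emb[child][0] > emb[parent][0] for child, parent in pairs)

# A deliberately broken variant: apex_predators placed left of its parent
scrambled = dict(embeddings)
scrambled["apex_predators"] = np.array([0.1, 0.2])

original_ok = respects_depth(embeddings, relations)
scrambled_ok = respects_depth(scrambled, relations)

print("Original embedding respects containment: ", original_ok)
print("Scrambled embedding respects containment:", scrambled_ok)
```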

---

## Exercises

### Exercise 1: Measurement Functors

Find all translations of type `measurement_translation`. Which preserve structure? What makes a measurement translation structure-preserving?

In [None]:
# Your code here
# Hint: translations[translations['translation_type'] == 'measurement_translation']

### Exercise 2: Translator Analysis

Which translators have the highest proportion of structure-preserving translations? Does expertise correlate with functoriality?

In [None]:
# Your code here
# Hint: Group by translator, compute mean of preserves_structure

### Exercise 3: Confidence and Structure

Is there a correlation between translation confidence and structure preservation? Create a visualization.

In [None]:
# Your code here
# Hint: Cross-tabulate confidence and preserves_structure

### Exercise 4: Build a Functor

Using the Vold ↔ Pol translations, determine whether there's a valid functor between their frameworks. What would need to be true for this to be a functor?

In [None]:
# Your code here
# Hint: Filter to vold_terminology ↔ pol_terminology and check preserves_structure

---

## Discussion Questions

1. Vold's translations from mathematical notation to natural language are marked as structure-preserving. But natural language is inherently ambiguous. How can a translation to an ambiguous system preserve structure?

2. The metaphorical translations (creature_behavior → human_behavior) don't preserve structure. But metaphors are often the most illuminating translations. Is there a categorical concept that captures "partial" or "approximate" functors?

3. In machine learning, we often want embeddings to be "structure-preserving." What structure should be preserved: distances? Neighborhoods? Rankings? How does this relate to the functor concept?

---

## Summary

In this tutorial, you learned:

| Concept | What You Learned |
|---------|------------------|
| Functor | A structure-preserving map between categories |
| Object map | Functors map objects to objects |
| Morphism map | Functors map morphisms to morphisms |
| Composition preservation | F(g ∘ f) = F(g) ∘ F(f) |
| Identity preservation | F(id_A) = id_{F(A)} |
| Inverse functors | Functors F and G whose round trips G ∘ F and F ∘ G are both identities |
| Non-functors | Translations that break categorical structure |

| Skill | Code Pattern |
|-------|--------------|
| Build object map | `dict(zip(df['source'], df['target']))` |
| Verify functoriality | Check if morphisms map consistently |
| Visualize translation network | `nx.DiGraph()` with colored edges |
| Compare structure preservation | Group by translator/type and aggregate |

---

## Next Tutorial

In **Tutorial 5: The Probing Pattern**, you will learn about the **Hom functor**—the functor that asks "what morphisms point to this object?":

- The Hom functor Hom(-, X): probing an object by what points to it
- Contravariance: why the source varies "backwards"
- Representable functors and the Yoneda perspective
- The foundation of Vold's Probing Theorem

> *"Tell me every creature that passes through the stakdur's territory, and I will tell you what a stakdur is—without ever seeing one."*
> — Tessery Vold, "The Probing Theorem," Year 892

---

## Credits

**Source Material:** Tai-Danae Bradley, "Category Theory and Language Models" (Cartesian Cafe)

**Densworld Integration:** The Relational Foundations course applies categorical concepts through the framework of Tessery Vold.

**Learn more:** [buildLittleWorlds](https://github.com/buildLittleWorlds)

---

> *"A preservation map does not merely translate names. It translates structure. If stakdur passes to reed_marsh in our system, then the translated stakdur must pass to the translated reed_marsh in theirs. Otherwise, the translation is mere word-replacement—not understanding."*
> — Tessery Vold, on the nature of functors