# Tutorial 3: Functors from Categories to Sets

**Course 3: Document Functors (Lorren Dray)**

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/buildLittleWorlds/category-theory-document-functors/blob/main/notebooks/03_functors_to_sets.ipynb)

---

## Overview

In Year 926, Dray proposed that **documents are functors** from the Archive category to **Set**, the category of sets and functions. This tutorial develops the formal definition of such functors.

### Learning Goals

1. Understand functors as structure-preserving maps between categories
2. See how functors map objects to sets and morphisms to functions
3. Verify the functor laws (identity and composition preservation)
4. Build intuition for functors to Sets specifically

---

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load datasets
BASE_URL = "https://raw.githubusercontent.com/buildLittleWorlds/densworld-datasets/main/data/"

documents = pd.read_csv(BASE_URL + "document_functor_examples.csv")
archive_structure = pd.read_csv(BASE_URL + "archive_category_structure.csv")
correspondence = pd.read_csv(BASE_URL + "dray_correspondence.csv")

## Part 1: What is a Functor?

A **functor** F: C → D is a map between categories that preserves structure:

1. **On objects**: For each object A in C, F assigns an object F(A) in D
2. **On morphisms**: For each morphism f: A → B in C, F assigns a morphism F(f): F(A) → F(B) in D
3. **Preserves identity**: F(id_A) = id_{F(A)} for every object A in C
4. **Preserves composition**: F(g ∘ f) = F(g) ∘ F(f) for all composable morphisms f: A → B and g: B → C in C

> "A functor is a translation between categories that respects their structure."
> — Lorren Dray
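
The two laws in the definition above can be checked by hand on a familiar example. The sketch below uses Python's elementwise list mapping as a stand-in functor; it is not taken from Dray's material, and the names `fmap`, `compose`, and `sample` are assumptions made for illustration. It tests the laws on one sample input only, which is evidence rather than a proof.

```python
def fmap(f):
    """Action of the list functor on a morphism: lift f to act elementwise."""
    return lambda xs: [f(x) for x in xs]

def compose(g, f):
    """Ordinary function composition: (g ∘ f)(x) = g(f(x))."""
    return lambda x: g(f(x))

identity = lambda x: x
f = len                      # morphism str -> int
g = lambda n: n % 2 == 0     # morphism int -> bool

sample = ["survey", "map", "ledger"]  # hypothetical sample input

# Law 1: the image of the identity acts as the identity on lists.
identity_ok = fmap(identity)(sample) == sample

# Law 2: lifting a composite equals composing the lifted functions.
composition_ok = fmap(compose(g, f))(sample) == compose(fmap(g), fmap(f))(sample)

print(identity_ok, composition_ok)
```

Here `fmap` plays the role of F on morphisms, while "list of values of type A" plays the role of F(A) on objects.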

In [None]:
# Find the letter where Dray explains functors
functor_letter = correspondence[correspondence['key_concepts'].str.contains('functor', case=False, na=False)].iloc[0]

print(f"From Dray's correspondence ({functor_letter['date']}):\n")
print(f"\"{functor_letter['excerpt']}\"")

## Part 2: Functors to Set

The category **Set** has:
- **Objects**: Sets (collections of elements)
- **Morphisms**: Functions between sets

A functor F: C → Set assigns:
- To each object A in C, a set F(A)
- To each morphism f: A → B in C, a function F(f): F(A) → F(B)

### In the Archive Context

A document D is a functor D: Archive → Set:
- D(subject_catalog) = {set of topic keywords}
- D(author_index) = {set of contributor names}
- D(date_registry) = {set of dates}

In [None]:
class DocumentFunctor:
    """
    A document represented as a functor from Archive to Set.
    """
    def __init__(self, doc_id, doc_df):
        self.doc_id = doc_id
        self.observations = {}
        
        # Build the functor's action on objects
        for _, row in doc_df.iterrows():
            access_method = row['access_method']
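            # Parse the observation value as a set of trimmed strings
            obs_value = row['observation_value']
            if pd.isna(obs_value):
                # Missing cell: no observations for this access method
                obs_set = set()
            else:
                # Split on commas and strip whitespace around each value
                obs_set = {v.strip() for v in str(obs_value).split(',') if v.strip()}
            self.observations[access_method] = obs_set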
    
    def __call__(self, access_method):
        """
        Apply the functor to an object (access method).
        Returns the set of observations.
        """
        return self.observations.get(access_method, set())
    
    def __repr__(self):
        return f"DocumentFunctor({self.doc_id})"

# Create a document functor for DOC-001
doc_001_data = documents[documents['document_id'] == 'DOC-001']
F = DocumentFunctor('DOC-001', doc_001_data)

print(f"Document functor: {F}\n")
print("Functor applied to objects (access methods):")
for method in ['subject_catalog', 'author_index', 'date_registry', 'location_index']:
    print(f"  F({method}) = {F(method)}")

## Part 3: Functor on Morphisms

A functor must also act on morphisms. If f: A → B is a morphism in the Archive category, then F(f): F(A) → F(B) is a function.

In the Archive context, if `topic_to_author` is a morphism from subject_catalog to author_index, then for a document D:

D(topic_to_author): D(subject_catalog) → D(author_index)

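This function sends each topic in D(subject_catalog) to the author in D(author_index) who wrote about it. Because it is a function, every topic must be assigned exactly one author.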
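
One way to make this concrete is to model D(topic_to_author) as a dictionary and check that it really is a function between the two sets. The sketch below is illustrative only: the set contents mirror the sample values used in this part's diagram rather than values read from the dataset, and `is_function` is a hypothetical helper.

```python
# Hypothetical values for D(subject_catalog) and D(author_index).
D_subject_catalog = {"boundaries", "surveys", "SW-sector"}
D_author_index = {"kell"}

# D(topic_to_author) modeled as a dict: each topic maps to exactly one author.
D_topic_to_author = {
    "boundaries": "kell",
    "surveys": "kell",
    "SW-sector": "kell",
}

def is_function(mapping, domain, codomain):
    """True if `mapping` is defined on exactly `domain` and lands in `codomain`."""
    return set(mapping) == set(domain) and set(mapping.values()) <= set(codomain)

well_defined = is_function(D_topic_to_author, D_subject_catalog, D_author_index)
print(well_defined)
```

A dictionary that omitted a topic, or named an author outside D(author_index), would fail this check and could not serve as the image of a morphism.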

In [None]:
# Illustrate functor action on morphisms
fig, ax = plt.subplots(figsize=(12, 6))

# Draw the Archive category (left side)
ax.text(0.1, 0.9, 'Archive Category', fontsize=14, fontweight='bold')
ax.scatter([0.15, 0.35], [0.6, 0.6], s=500, c='lightblue', edgecolor='navy', zorder=5)
ax.annotate('subject\ncatalog', (0.15, 0.6), ha='center', va='center', fontsize=9)
ax.annotate('author\nindex', (0.35, 0.6), ha='center', va='center', fontsize=9)
ax.annotate('', xy=(0.31, 0.6), xytext=(0.19, 0.6),
            arrowprops=dict(arrowstyle='->', color='blue', lw=2))
ax.annotate('topic_to_author', (0.25, 0.7), ha='center', fontsize=8, color='blue')

# Draw the functor arrow
ax.annotate('', xy=(0.6, 0.6), xytext=(0.45, 0.6),
            arrowprops=dict(arrowstyle='->', color='green', lw=3))
ax.annotate('F', (0.52, 0.68), ha='center', fontsize=14, fontweight='bold', color='green')

# Draw the Set category (right side)
ax.text(0.65, 0.9, 'Set Category', fontsize=14, fontweight='bold')

# F(subject_catalog) = set of topics
ax.add_patch(plt.Circle((0.7, 0.6), 0.08, fill=False, edgecolor='navy', lw=2))
ax.text(0.7, 0.6, '{boundaries,\nsurveys,\nSW-sector}', ha='center', va='center', fontsize=6)
ax.annotate('F(subject_catalog)', (0.7, 0.75), ha='center', fontsize=8)

# F(author_index) = set of authors
ax.add_patch(plt.Circle((0.9, 0.6), 0.06, fill=False, edgecolor='navy', lw=2))
ax.text(0.9, 0.6, '{kell}', ha='center', va='center', fontsize=7)
ax.annotate('F(author_index)', (0.9, 0.72), ha='center', fontsize=8)

# F(topic_to_author) = function between sets
ax.annotate('', xy=(0.82, 0.6), xytext=(0.78, 0.6),
            arrowprops=dict(arrowstyle='->', color='purple', lw=2))
ax.annotate('F(topic_to_author)', (0.8, 0.5), ha='center', fontsize=8, color='purple')

ax.set_xlim(0, 1)
ax.set_ylim(0.3, 1)
ax.axis('off')
ax.set_title('Functor F: Archive → Set\nMaps objects to sets and morphisms to functions', fontsize=12)

plt.tight_layout()
plt.show()

## Part 4: The Functor Laws

For F to be a valid functor, it must satisfy two laws:

### Law 1: Identity Preservation
F(id_A) = id_{F(A)}

The identity morphism on any access method maps to the identity function on the corresponding set.

### Law 2: Composition Preservation
F(g ∘ f) = F(g) ∘ F(f)

Applying F to a composed morphism is the same as composing the images.
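
Both laws can be tested mechanically once the images of morphisms are written down as explicit functions. The following is a minimal sketch under stated assumptions: every set and mapping is hypothetical, the morphism names `topic_to_author` and `author_to_method` follow this tutorial's examples, and dictionaries stand in for functions between finite sets.

```python
# Hypothetical action on objects: one finite set per access method.
F_obj = {
    "subject_catalog": {"boundaries", "surveys"},
    "author_index": {"kell"},
    "methodology_index": {"field-survey"},
}

# Hypothetical action on morphisms, each given as a dict.
F_f = {"boundaries": "kell", "surveys": "kell"}      # F(topic_to_author)
F_g = {"kell": "field-survey"}                       # F(author_to_method)
F_gf = {"boundaries": "field-survey",                # F(g ∘ f), specified separately
        "surveys": "field-survey"}
F_id_subject = {"boundaries": "boundaries",          # F(id_subject_catalog)
                "surveys": "surveys"}

def compose(g, f):
    """Compose dict-based functions: (g ∘ f)(x) = g[f[x]]."""
    return {x: g[y] for x, y in f.items()}

def identity_on(s):
    """Identity function on the set s, as a dict."""
    return {x: x for x in s}

# Law 1: F(id_A) equals the identity function on F(A).
identity_ok = F_id_subject == identity_on(F_obj["subject_catalog"])

# Law 2: F(g ∘ f) equals F(g) ∘ F(f).
composition_ok = F_gf == compose(F_g, F_f)

print(identity_ok, composition_ok)
```

Changing any single entry of `F_gf` makes the second check fail, which is the kind of incoherence the composition law rules out.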

In [None]:
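# Demonstrate identity preservation
# DocumentFunctor models only the action on objects, so F(id_A) is
# represented here by a plain identity function (an illustrative stand-in).
subject_set = F('subject_catalog')
F_id = lambda x: x  # stands in for F(id_subject_catalog)
image = {F_id(x) for x in subject_set}

print("Identity Preservation:")
print("="*50)
print()
print("For the subject_catalog access method:")
print(f"  F(subject_catalog) = {subject_set}")
print()
print("  F(id_subject_catalog) sends every element of this set to itself.")
print(f"  Image of F(subject_catalog) under F(id_subject_catalog) = {image}")
print()
print(f"  Identity preserved, F(id_A) = id_F(A): {image == subject_set}")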

In [None]:
# Demonstrate composition preservation (conceptually)
print("Composition Preservation:")
print("="*50)
print()
print("Consider morphisms:")
print("  f: subject_catalog → author_index (topic_to_author)")
print("  g: author_index → methodology_index (author_to_method)")
print()
print("Then:")
print("  g ∘ f: subject_catalog → methodology_index")
print()
print("Functor law says:")
print("  F(g ∘ f) = F(g) ∘ F(f)")
print()
print("This means: the function F assigns to the composite g ∘ f")
print("equals the function obtained by applying F(f) first, then F(g).")
print()
print("  ✓ Composition preserved: F(g ∘ f) = F(g) ∘ F(f)")

## Part 5: Multiple Documents as Functors

Each document is a different functor from the same Archive category. They share the same source (the Archive) but produce different sets.
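
Because all document functors share the same domain, their values can be compared object by object. The sketch below uses two hypothetical documents written as plain dictionaries (the names `doc_a`, `doc_b` and all set contents are invented for illustration, not drawn from the dataset) and computes what they have in common at each access method.

```python
# Two hypothetical documents over the same Archive objects.
doc_a = {
    "subject_catalog": {"boundaries", "surveys"},
    "author_index": {"kell"},
}
doc_b = {
    "subject_catalog": {"surveys", "tides"},
    "author_index": {"other-author"},
}

archive_objects = ["subject_catalog", "author_index"]

# Same objects in, different sets out: intersect the two values at each object.
shared = {obj: doc_a[obj] & doc_b[obj] for obj in archive_objects}

for obj in archive_objects:
    print(f"{obj}: shared = {shared[obj]}")
```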

In [None]:
# Create functors for multiple documents
unique_docs = documents['document_id'].unique()[:5]

functors = {}
for doc_id in unique_docs:
    doc_data = documents[documents['document_id'] == doc_id]
    functors[doc_id] = DocumentFunctor(doc_id, doc_data)

# Compare their values on subject_catalog
print("Multiple documents as functors:")
print("Each document F_i: Archive → Set\n")

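# Use a distinct loop variable so the DOC-001 functor F defined earlier is not overwritten
for doc_id, doc_functor in functors.items():
    doc_title = documents[documents['document_id'] == doc_id]['document_title'].iloc[0]
    print(f"{doc_id}: {doc_title}")
    print(f"  F_{doc_id}(subject_catalog) = {doc_functor('subject_catalog')}")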
    print()

In [None]:
# Visualize multiple functors
fig, ax = plt.subplots(figsize=(14, 8))

# Archive category on the left
archive_x = 0.15
archive_objects = ['subject_catalog', 'author_index', 'date_registry']
archive_y = [0.7, 0.5, 0.3]

ax.text(archive_x, 0.9, 'Archive', fontsize=14, fontweight='bold', ha='center')
for name, y in zip(archive_objects, archive_y):
    ax.scatter([archive_x], [y], s=400, c='lightblue', edgecolor='navy', zorder=5)
    ax.annotate(name.replace('_', '\n'), (archive_x, y), ha='center', va='center', fontsize=7)

# Multiple functors going to Set
set_x_positions = [0.45, 0.6, 0.75, 0.9]
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#96CEB4']

for i, (doc_id, F) in enumerate(list(functors.items())[:4]):
    x = set_x_positions[i]
    color = colors[i]
    
    ax.text(x, 0.9, doc_id, fontsize=10, fontweight='bold', ha='center', color=color)
    
    # Draw functor arrows from archive to this document's sets
    for j, (obj, y) in enumerate(zip(archive_objects, archive_y)):
        # Arrow from archive object to set
        ax.annotate('', xy=(x-0.05, y), xytext=(archive_x+0.05, y),
                    arrowprops=dict(arrowstyle='->', color=color, alpha=0.5, lw=1))
        
        # Small circle representing the set F(obj)
        obs = F(obj)
        if obs:
            radius = 0.03 + 0.01 * len(obs)
            circle = plt.Circle((x, y), radius, fill=True, facecolor=color, alpha=0.3, edgecolor=color)
            ax.add_patch(circle)
            # Show size of set
            ax.annotate(f"|{len(obs)}|", (x, y), ha='center', va='center', fontsize=7)

ax.set_xlim(0, 1)
ax.set_ylim(0.1, 1)
ax.axis('off')
ax.set_title('Multiple Documents as Functors\nEach document F maps Archive objects to different Sets', fontsize=12)

plt.tight_layout()
plt.show()

## Summary

In this tutorial, we've learned:

1. **Functors** are structure-preserving maps between categories
2. **Functors to Set** map objects to sets and morphisms to functions
3. **Documents as functors**: Each document assigns sets of observations to access methods
4. **Functor laws**: Identity and composition must be preserved

### Key Quote

> "A document F assigns to each access method A a set F(A) of observations. The same document yields different observations through different access methods, but these observations cohere through functorial preservation."
> — Lorren Dray

### Next Tutorial

In Tutorial 4, we'll see documents as presheaves — functors from the opposite category — and understand why this reversal is crucial.

---

*Part of the [Category Theory & LLMs Series](https://github.com/buildLittleWorlds)*