# FireProx Document References Guide

This notebook demonstrates how to work with document references in Firestore, including:

- **Assigning FireObject references** - Store relationships between documents
- **Automatic conversions** - FireObject ↔ DocumentReference conversion
- **Lazy loading** - Automatic data fetching on access
- **Nested references** - References in lists and dictionaries
- **Validation** - Type safety and state validation
- **Common patterns** - Real-world reference use cases

## Key Findings

✅ **Automatic Conversion**:
- An assigned FireObject is stored in Firestore as a DocumentReference
- Reading a DocumentReference automatically converts to FireObject
- Works recursively in nested structures (lists, dicts)

✅ **Lazy Loading**:
- Referenced FireObjects start in ATTACHED state
- Data is automatically fetched on first attribute access
- Works for both sync (`FireObject`) and async (`AsyncFireObject`) objects

⚠️ **Important Validations and Behavior**:
- Cannot assign DETACHED FireObjects (no path to reference)
- Cannot mix sync and async FireObjects (TypeError)
- References preserve object identity (same instance on repeated access)

## Setup

Import modules and initialize FireProx.

In [1]:
from fire_prox import AsyncFireProx, FireProx
from fire_prox.testing import async_demo_client, demo_client

---

# Part 1: Basic Document References

Learn how to create and use document references between FireObjects.

### Initialize Client

In [2]:
# Create sync client and collections
client = demo_client()
db = FireProx(client)
users = db.collection('doc_ref_users')
posts = db.collection('doc_ref_posts')

print("✅ Client initialized")

✅ Client initialized


## Feature 1: Assigning FireObject References

You can assign one FireObject to another's property to create a reference.

In [3]:
# Create a user
user = users.new()
user.name = 'Ada Lovelace'
user.occupation = 'Mathematician'
user.save(doc_id='ada')
print(f"👤 Created user: {user.name}")
print(f"   Path: {user.path}")

# Create a post with a reference to the user
post = posts.new()
post.title = 'On Analytical Engines'
post.author = user  # Assign FireObject reference
post.content = 'The Analytical Engine weaves algebraic patterns...'
print(f"\n📝 Created post: {post.title}")
print(f"   Author (before save): {type(post.author).__name__}")

# Inspect the internal data before saving: as the output below shows, the value is still
# held as a FireObject at this point; Firestore stores it as a DocumentReference
print(f"   Internal storage: {type(post._data['author']).__name__}")
print(f"   Reference path: {post._data['author'].path}")

# Save the post
post.save(doc_id='post1')
print("\n✅ Post saved with author reference!")

👤 Created user: Ada Lovelace
   Path: doc_ref_users/ada

📝 Created post: On Analytical Engines
   Author (before save): FireObject
   Internal storage: FireObject
   Reference path: doc_ref_users/ada

✅ Post saved with author reference!


## Feature 2: Reading References Back

When you read a document with references, they're automatically converted to FireObjects.

In [4]:
# Fetch the post from Firestore
retrieved_post = db.doc('doc_ref_posts/post1')
retrieved_post.fetch()

print(f"📄 Retrieved post: {retrieved_post.title}")
print(f"   Content: {retrieved_post.content[:50]}...")

# Access the author reference
author = retrieved_post.author
print("\n👤 Author reference:")
print(f"   Type: {type(author).__name__}")
print(f"   State: {author.state}")
print(f"   Path: {author.path}")

print("\n✅ Reference converted to FireObject (ATTACHED state)")

📄 Retrieved post: On Analytical Engines
   Content: The Analytical Engine weaves algebraic patterns......

👤 Author reference:
   Type: FireObject
   State: ATTACHED
   Path: doc_ref_users/ada

✅ Reference converted to FireObject (ATTACHED state)


## Feature 3: Lazy Loading

Referenced FireObjects automatically load data on first attribute access.

In [5]:
# The author is currently ATTACHED (no data loaded yet)
print("📊 Before accessing data:")
print(f"   State: {author.state}")

# Access an attribute - this triggers lazy loading
print("\n👤 Accessing author.name...")
author_name = author.name
print(f"   Name: {author_name}")
print(f"   Occupation: {author.occupation}")

# Now the author is LOADED
print("\n📊 After accessing data:")
print(f"   State: {author.state}")

# Subsequent accesses are instant (no fetch needed)
print("\n⚡ Second access (instant):")
print(f"   Name again: {author.name}")

print("\n✅ Lazy loading automatically fetched data on first access!")

📊 Before accessing data:
   State: ATTACHED

👤 Accessing author.name...
   Name: Ada Lovelace
   Occupation: Mathematician

📊 After accessing data:
   State: LOADED

⚡ Second access (instant):
   Name again: Ada Lovelace

✅ Lazy loading automatically fetched data on first access!
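
The ATTACHED → LOADED transition shown above can be modelled in a few lines of plain Python. This is a minimal sketch of the general lazy-proxy technique, not FireProx's implementation: `LazyDoc`, `FAKE_DB`, `fake_fetch`, and `fetch_log` are hypothetical names, and a dict stands in for Firestore.

```python
from enum import Enum


class State(Enum):
    ATTACHED = "ATTACHED"  # path known, no data fetched yet
    LOADED = "LOADED"      # data fetched and cached on the object


# Hypothetical stand-in for Firestore: path -> document fields.
FAKE_DB = {"doc_ref_users/ada": {"name": "Ada Lovelace", "occupation": "Mathematician"}}
fetch_log = []  # records every simulated round trip


def fake_fetch(path):
    fetch_log.append(path)
    return dict(FAKE_DB[path])


class LazyDoc:
    """Illustrative proxy only; not FireProx's FireObject."""

    def __init__(self, path):
        self.path = path
        self.state = State.ATTACHED
        self._data = {}

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails,
        # so path/state/_data never reach this method.
        if self.state is State.ATTACHED:
            self._data = fake_fetch(self.path)
            self.state = State.LOADED
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None


author = LazyDoc("doc_ref_users/ada")
state_before = author.state      # still ATTACHED: nothing fetched yet
name = author.name               # first field access triggers the fetch
occupation = author.occupation   # served from the cached data
state_after = author.state       # now LOADED
```

After both field accesses, `fetch_log` holds a single entry, which is the property that makes the second access cheap.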


## Feature 4: Validation - DETACHED Objects

You cannot assign DETACHED FireObjects as references (they have no path).

In [6]:
# Create two DETACHED objects
unsaved_user = users.new()
unsaved_user.name = 'Grace Hopper'

unsaved_post = posts.new()
unsaved_post.title = 'Compilers'

print(f"👤 Unsaved user state: {unsaved_user.state}")
print(f"📝 Unsaved post state: {unsaved_post.state}")

# Try to assign DETACHED object as reference
print("\n❌ Attempting to assign DETACHED object...")
try:
    unsaved_post.author = unsaved_user
    print("   This should not print!")
except ValueError as e:
    print(f"   Caught ValueError: {e}")

print("\n✅ DETACHED objects cannot be assigned as references")
print("   Save the object first to create a reference!")

# The correct way:
unsaved_user.save(doc_id='grace')
print(f"\n✓ User saved, now in {unsaved_user.state} state")
unsaved_post.author = unsaved_user  # Now this works!
print("✓ Reference assignment successful!")

👤 Unsaved user state: DETACHED
📝 Unsaved post state: DETACHED

❌ Attempting to assign DETACHED object...
   Caught ValueError: Cannot assign a DETACHED FireObject as a reference. The object must be saved first to have a document path.

✅ DETACHED objects cannot be assigned as references
   Save the object first to create a reference!

✓ User saved, now in LOADED state
✓ Reference assignment successful!
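
The reason for this rule is that a reference is essentially a document path, and an unsaved object does not have one yet. The sketch below illustrates that guard with a hypothetical `Doc` class and `as_reference_path` helper; it is an assumption about the shape of the check, not FireProx's code.

```python
class Doc:
    """Hypothetical minimal document object; `path` is None until saved."""

    def __init__(self, collection):
        self.collection = collection
        self.path = None  # plays the role of the DETACHED state

    def save(self, doc_id):
        # Saving is what gives the document a path that others can point to.
        self.path = f"{self.collection}/{doc_id}"


def as_reference_path(value):
    """Return the path to store for `value`, or raise if it has none yet."""
    if value.path is None:
        raise ValueError("Cannot reference an unsaved object; save it first.")
    return value.path


grace = Doc("doc_ref_users")

try:
    as_reference_path(grace)
    rejected = False
except ValueError:
    rejected = True  # no path yet, so there is nothing to store

grace.save("grace")
stored_path = as_reference_path(grace)  # accepted once the path exists
```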


## Feature 5: Validation - Sync/Async Mismatch

You cannot mix sync and async FireObjects.

In [7]:
# Create async client
async_client = async_demo_client()
async_db = AsyncFireProx(async_client)
async_users = async_db.collection('doc_ref_users_async')

# Create and save an async user
async_user = async_users.new()
async_user.name = 'Margaret Hamilton'
await async_user.save(doc_id='margaret')

print(f"👤 Async user created: {async_user.name}")
print(f"   Type: {type(async_user).__name__}")

# Create a sync post
sync_post = posts.new()
sync_post.title = 'Apollo Guidance Computer'

print(f"\n📝 Sync post created: {sync_post.title}")
print(f"   Type: {type(sync_post).__name__}")

# Try to assign async user to sync post
print("\n❌ Attempting to mix sync and async...")
try:
    sync_post.author = async_user
    print("   This should not print!")
except TypeError as e:
    print(f"   Caught TypeError: {e}")

print("\n✅ Sync and async FireObjects cannot be mixed")
print("   Use matching types: sync with sync, async with async")

👤 Async user created: Margaret Hamilton
   Type: AsyncFireObject

📝 Sync post created: Apollo Guidance Computer
   Type: FireObject

❌ Attempting to mix sync and async...
   Caught TypeError: Cannot assign async FireObject to sync FireObject. Both objects must be from the same context (sync or async).

✅ Sync and async FireObjects cannot be mixed
   Use matching types: sync with sync, async with async
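
One generic way to tell the two families apart is to check whether an object's `save` method is a coroutine function. The sketch below uses the standard library's `inspect.iscoroutinefunction` on two hypothetical classes, `SyncDoc` and `AsyncDoc`; how FireProx itself performs the check is not shown in this notebook, so treat this purely as an illustration.

```python
import inspect


class SyncDoc:
    def save(self):
        return "saved"


class AsyncDoc:
    async def save(self):
        return "saved"


def is_async_doc(obj):
    # True when obj.save was defined with `async def`.
    return inspect.iscoroutinefunction(obj.save)


def check_same_context(target, value):
    """Raise TypeError when one object is sync and the other is async."""
    if is_async_doc(target) != is_async_doc(value):
        raise TypeError("Cannot mix sync and async objects in one reference.")


check_same_context(SyncDoc(), SyncDoc())    # same family: passes silently
check_same_context(AsyncDoc(), AsyncDoc())  # same family: passes silently

try:
    check_same_context(SyncDoc(), AsyncDoc())
    mixed_rejected = False
except TypeError:
    mixed_rejected = True
```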


---

# Part 2: Nested References

References can be nested in lists and dictionaries.

## Feature 6: References in Lists

Store multiple references in a list.

In [8]:
# Create multiple reviewers
reviewer1 = users.new()
reviewer1.name = 'Alan Turing'
reviewer1.expertise = 'Computation'
reviewer1.save(doc_id='alan')

reviewer2 = users.new()
reviewer2.name = 'Donald Knuth'
reviewer2.expertise = 'Algorithms'
reviewer2.save(doc_id='donald')

reviewer3 = users.new()
reviewer3.name = 'Barbara Liskov'
reviewer3.expertise = 'Programming Languages'
reviewer3.save(doc_id='barbara')

print("👥 Created reviewers:")
for r in [reviewer1, reviewer2, reviewer3]:
    print(f"   • {r.name} ({r.expertise})")

# Create a paper with multiple reviewers
paper = posts.new()
paper.title = 'On Computable Numbers'
paper.reviewers = [reviewer1, reviewer2, reviewer3]  # List of references
paper.save(doc_id='paper1')

print(f"\n📄 Created paper: {paper.title}")
print(f"   Reviewers: {len(paper.reviewers)} assigned")
print("\n✅ Multiple references stored in list!")

👥 Created reviewers:
   • Alan Turing (Computation)
   • Donald Knuth (Algorithms)
   • Barbara Liskov (Programming Languages)

📄 Created paper: On Computable Numbers
   Reviewers: 3 assigned

✅ Multiple references stored in list!


In [9]:
# Read back and access reviewers
retrieved_paper = db.doc('doc_ref_posts/paper1')
retrieved_paper.fetch()

print(f"📄 Retrieved paper: {retrieved_paper.title}")
print("\n👥 Reviewers (lazy loading):")

for i, reviewer in enumerate(retrieved_paper.reviewers, 1):
    print(f"   {i}. {reviewer.name} - {reviewer.expertise}")
    print(f"      State: {reviewer.state}")

print("\n✅ Each reference in list supports lazy loading!")

📄 Retrieved paper: On Computable Numbers

👥 Reviewers (lazy loading):
   1. Alan Turing - Computation
      State: LOADED
   2. Donald Knuth - Algorithms
      State: LOADED
   3. Barbara Liskov - Programming Languages
      State: LOADED

✅ Each reference in list supports lazy loading!


## Feature 7: References in Dictionaries

Store references as dictionary values with semantic keys.

In [10]:
# Create contributors for different roles
author_user = users.new()
author_user.name = 'Edsger Dijkstra'
author_user.role = 'Primary Author'
author_user.save(doc_id='edsger')

editor_user = users.new()
editor_user.name = 'Niklaus Wirth'
editor_user.role = 'Technical Editor'
editor_user.save(doc_id='niklaus')

reviewer_user = users.new()
reviewer_user.name = 'Tony Hoare'
reviewer_user.role = 'Peer Reviewer'
reviewer_user.save(doc_id='tony')

print("👥 Created contributors:")
for u in [author_user, editor_user, reviewer_user]:
    print(f"   • {u.name} - {u.role}")

# Create article with contributor dict
article = posts.new()
article.title = 'Structured Programming'
article.contributors = {
    'author': author_user,
    'editor': editor_user,
    'reviewer': reviewer_user
}
article.save(doc_id='article1')

print(f"\n📰 Created article: {article.title}")
print(f"   Contributors: {len(article.contributors)} roles defined")
print("\n✅ References stored in dictionary with semantic keys!")

👥 Created contributors:
   • Edsger Dijkstra - Primary Author
   • Niklaus Wirth - Technical Editor
   • Tony Hoare - Peer Reviewer

📰 Created article: Structured Programming
   Contributors: 3 roles defined

✅ References stored in dictionary with semantic keys!


In [11]:
# Read back and access contributors
retrieved_article = db.doc('doc_ref_posts/article1')
retrieved_article.fetch()

print(f"📰 Retrieved article: {retrieved_article.title}")
print("\n👥 Contributors:")

for role, person in retrieved_article.contributors.items():
    print(f"   {role.title()}: {person.name}")
    print(f"      Role: {person.role}")
    print(f"      State: {person.state}")

print("\n✅ Dictionary references support lazy loading and semantic access!")

📰 Retrieved article: Structured Programming

👥 Contributors:
   Author: Edsger Dijkstra
      Role: Primary Author
      State: LOADED
   Editor: Niklaus Wirth
      Role: Technical Editor
      State: LOADED
   Reviewer: Tony Hoare
      Role: Peer Reviewer
      State: LOADED

✅ Dictionary references support lazy loading and semantic access!


## Feature 8: Mixed Nested Structures

References can also sit inside mixed structures, such as dicts that contain lists or other dicts.

In [12]:
# Create team members
lead = users.new()
lead.name = 'Dennis Ritchie'
lead.save(doc_id='dennis')

dev1 = users.new()
dev1.name = 'Ken Thompson'
dev1.save(doc_id='ken')

dev2 = users.new()
dev2.name = 'Brian Kernighan'
dev2.save(doc_id='brian')

# Create project with complex nested structure
project = posts.new()
project.title = 'UNIX Operating System'
project.team = {
    'lead': lead,
    'developers': [dev1, dev2],
    'structure': {
        'primary': lead,
        'secondary': [dev1, dev2]
    }
}
project.save(doc_id='project1')

print(f"🚀 Created project: {project.title}")
print("   Team structure: nested dict with lists of references")
print("\n✅ Complex nested references stored!")

🚀 Created project: UNIX Operating System
   Team structure: nested dict with lists of references

✅ Complex nested references stored!


In [13]:
# Read back and navigate nested structure
retrieved_project = db.doc('doc_ref_posts/project1')
retrieved_project.fetch()

print(f"🚀 Retrieved project: {retrieved_project.title}")
print("\n👥 Team structure:")
print(f"   Lead: {retrieved_project.team['lead'].name}")
print("\n   Developers:")
for dev in retrieved_project.team['developers']:
    print(f"      • {dev.name}")
print("\n   Nested structure:")
print(f"      Primary: {retrieved_project.team['structure']['primary'].name}")
print("      Secondary:")
for dev in retrieved_project.team['structure']['secondary']:
    print(f"         • {dev.name}")

print("\n✅ All nested references support lazy loading!")

🚀 Retrieved project: UNIX Operating System

👥 Team structure:
   Lead: Dennis Ritchie

   Developers:
      • Ken Thompson
      • Brian Kernighan

   Nested structure:
      Primary: Dennis Ritchie
      Secondary:
         • Ken Thompson
         • Brian Kernighan

✅ All nested references support lazy loading!
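
The recursive conversion behind nested references can be sketched as a tree walk that swaps objects for references on the way in and references for objects on the way out. `Doc`, `Ref`, `to_storage`, and `from_storage` are hypothetical stand-ins chosen for this illustration; FireProx's actual conversion code may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a stored DocumentReference (just a path)."""
    path: str


@dataclass
class Doc:
    """Hypothetical stand-in for a saved FireObject."""
    path: str


def to_storage(value):
    """Recursively replace every Doc with a Ref, keeping the list/dict shape."""
    if isinstance(value, Doc):
        return Ref(value.path)
    if isinstance(value, list):
        return [to_storage(item) for item in value]
    if isinstance(value, dict):
        return {key: to_storage(item) for key, item in value.items()}
    return value  # strings, numbers, None pass through unchanged


def from_storage(value):
    """Inverse walk: replace every Ref with a Doc."""
    if isinstance(value, Ref):
        return Doc(value.path)
    if isinstance(value, list):
        return [from_storage(item) for item in value]
    if isinstance(value, dict):
        return {key: from_storage(item) for key, item in value.items()}
    return value


lead = Doc("doc_ref_users/dennis")
dev1 = Doc("doc_ref_users/ken")
dev2 = Doc("doc_ref_users/brian")
team = {
    "lead": lead,
    "developers": [dev1, dev2],
    "structure": {"primary": lead, "secondary": [dev1, dev2]},
}

stored = to_storage(team)        # same shape, Refs at the leaves
restored = from_storage(stored)  # same shape, Docs at the leaves
```

Because both functions only branch on lists and dicts, any depth of nesting is handled by the same three cases.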


---

# Part 3: Common Patterns

Real-world patterns for using document references.

## Pattern 1: Author/Owner References

Track who created or owns a document.

In [14]:
# Create a user
owner = users.new()
owner.name = 'Linus Torvalds'
owner.email = 'linus@example.com'
owner.save(doc_id='linus')

# Create documents with owner references
for i in range(3):
    doc = posts.new()
    doc.title = f'Kernel Module {i+1}'
    doc.owner = owner
    doc.created_by = owner  # Same reference, different semantic meaning
    doc.save(doc_id=f'kernel_module_{i+1}')

print(f"👤 Owner: {owner.name}")
print("📝 Created 3 documents with owner references")

# Query documents by owner
owner_docs = posts.where('owner', '==', owner._doc_ref).get()
print(f"\n🔍 Documents owned by {owner.name}: {len(owner_docs)}")
for doc in owner_docs:
    print(f"   • {doc.title}")
    # Verify lazy loading
    print(f"     Owner: {doc.owner.name} <{doc.owner.email}>")

print("\n✅ Pattern: Track document ownership with references")

👤 Owner: Linus Torvalds
📝 Created 3 documents with owner references

🔍 Documents owned by Linus Torvalds: 3
   • Kernel Module 1
     Owner: Linus Torvalds <linus@example.com>
   • Kernel Module 2
     Owner: Linus Torvalds <linus@example.com>
   • Kernel Module 3
     Owner: Linus Torvalds <linus@example.com>

✅ Pattern: Track document ownership with references


## Pattern 2: Parent/Child Relationships

Model hierarchical relationships between documents.

In [15]:
# Create parent document (conversation thread)
thread = posts.new()
thread.title = 'How to learn programming?'
thread.type = 'thread'
thread.parent = None  # Top-level thread
thread.save(doc_id='thread_001')

print(f"💬 Thread: {thread.title}")

# Create child documents (replies)
reply1 = posts.new()
reply1.title = 'Start with Python'
reply1.type = 'reply'
reply1.parent = thread  # Reference to parent
reply1.save(doc_id='reply_001')

reply2 = posts.new()
reply2.title = 'Practice every day'
reply2.type = 'reply'
reply2.parent = thread
reply2.save(doc_id='reply_002')

# Create nested reply (reply to reply)
nested_reply = posts.new()
nested_reply.title = 'Which Python version?'
nested_reply.type = 'reply'
nested_reply.parent = reply1  # Reference to parent reply
nested_reply.save(doc_id='reply_003')

print(f"   ├─ {reply1.title}")
print(f"   │  └─ {nested_reply.title}")
print(f"   └─ {reply2.title}")

# Query for direct replies to thread
replies = posts.where('parent', '==', thread._doc_ref).get()
print(f"\n📊 Direct replies to thread: {len(replies)}")
for reply in replies:
    print(f"   • {reply.title}")
    print(f"     Parent: {reply.parent.title}")

print("\n✅ Pattern: Model hierarchical relationships with parent references")

💬 Thread: How to learn programming?
   ├─ Start with Python
   │  └─ Which Python version?
   └─ Practice every day

📊 Direct replies to thread: 2
   • Start with Python
     Parent: How to learn programming?
   • Practice every day
     Parent: How to learn programming?

✅ Pattern: Model hierarchical relationships with parent references
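
Note that the query above returns only direct replies; the nested reply points at `reply_001`, not at the thread. Walking from any reply up to its thread means following `parent` repeatedly. The sketch below does this over a hypothetical in-memory copy of the documents (a dict keyed by path, with `parent` held as a path string), and stops with an error if the chain ever loops.

```python
# Hypothetical in-memory copy of the documents above: path -> fields.
docs = {
    "doc_ref_posts/thread_001": {"title": "How to learn programming?", "parent": None},
    "doc_ref_posts/reply_001": {"title": "Start with Python", "parent": "doc_ref_posts/thread_001"},
    "doc_ref_posts/reply_002": {"title": "Practice every day", "parent": "doc_ref_posts/thread_001"},
    "doc_ref_posts/reply_003": {"title": "Which Python version?", "parent": "doc_ref_posts/reply_001"},
}


def ancestor_titles(path, docs):
    """Return titles from the given document up to its top-level thread."""
    chain = []
    seen = set()
    while path is not None:
        if path in seen:
            raise ValueError(f"Circular parent reference at {path}")
        seen.add(path)
        chain.append(docs[path]["title"])
        path = docs[path]["parent"]
    return chain


def direct_children(parent_path, docs):
    """Paths of documents whose parent is exactly `parent_path`."""
    return sorted(p for p, fields in docs.items() if fields["parent"] == parent_path)


chain = ancestor_titles("doc_ref_posts/reply_003", docs)
children = direct_children("doc_ref_posts/thread_001", docs)
```

With real references, each step up the chain would be one lazy load, so long chains cost one fetch per level.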


## Pattern 3: Cross-Collection References

References can point to documents in different collections.

In [16]:
# Create multiple collections
products = db.collection('doc_ref_products')
orders = db.collection('doc_ref_orders')
customers = db.collection('doc_ref_customers')

# Create a customer
customer = customers.new()
customer.name = 'Alice Smith'
customer.email = 'alice@example.com'
customer.save(doc_id='alice')

# Create products
product1 = products.new()
product1.name = 'Laptop'
product1.price = 999.99
product1.save(doc_id='laptop_001')

product2 = products.new()
product2.name = 'Mouse'
product2.price = 29.99
product2.save(doc_id='mouse_001')

# Create order with cross-collection references
order = orders.new()
order.order_id = 'ORD-2024-001'
order.customer = customer  # Reference to customers collection
order.items = [product1, product2]  # References to products collection
order.total = 1029.98
order.save(doc_id='order_001')

print(f"🛒 Order: {order.order_id}")
print(f"   Customer: {customer.name} <{customer.email}>")
print("   Items:")
for product in [product1, product2]:
    print(f"      • {product.name} - ${product.price}")
print(f"   Total: ${order.total}")

# Read back and verify cross-collection references
retrieved_order = orders.doc('order_001')
retrieved_order.fetch()

print(f"\n📦 Retrieved order: {retrieved_order.order_id}")
print(f"   Customer: {retrieved_order.customer.name}")
print(f"   Customer collection: {retrieved_order.customer.path.split('/')[0]}")
print("\n   Items:")
for item in retrieved_order.items:
    print(f"      • {item.name} - ${item.price}")
    print(f"        Collection: {item.path.split('/')[0]}")

print("\n✅ Pattern: References work across different collections!")

🛒 Order: ORD-2024-001
   Customer: Alice Smith <alice@example.com>
   Items:
      • Laptop - $999.99
      • Mouse - $29.99
   Total: $1029.98

📦 Retrieved order: ORD-2024-001
   Customer: Alice Smith
   Customer collection: doc_ref_customers

   Items:
      • Laptop - $999.99
        Collection: doc_ref_products
      • Mouse - $29.99
        Collection: doc_ref_products

✅ Pattern: References work across different collections!
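
The cell above uses `path.split('/')[0]`, which is enough here because every document sits in a top-level collection. Firestore document paths alternate collection ID and document ID, so for a document inside a subcollection the first segment would be the root collection rather than the immediate one. A small helper that always returns the immediate collection might look like this; `collection_id` and the `shipments` subcollection are hypothetical, introduced only for illustration.

```python
def collection_id(doc_path):
    """Return the ID of the collection that directly contains the document."""
    segments = doc_path.split("/")
    # A document path has an even number of non-empty segments:
    # collection/doc[/collection/doc ...]
    if len(segments) % 2 != 0 or not all(segments):
        raise ValueError(f"Not a document path: {doc_path!r}")
    return segments[-2]


top_level = collection_id("doc_ref_products/laptop_001")
nested = collection_id("doc_ref_orders/order_001/shipments/s1")  # hypothetical subcollection

try:
    collection_id("doc_ref_orders")  # a collection path, not a document path
    rejected = False
except ValueError:
    rejected = True
```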


---

# Part 4: Async Document References

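Document references work the same way with AsyncFireProx. In the cells below, only `save()` and `fetch()` are awaited; assignment and attribute access look the same as in the sync examples.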

## Async References with Lazy Loading

In [17]:
# Initialize async client
async_client = async_demo_client()
async_db = AsyncFireProx(async_client)
async_users = async_db.collection('doc_ref_async_users')
async_posts = async_db.collection('doc_ref_async_posts')

# Create user
async_user = async_users.new()
async_user.name = 'Tim Berners-Lee'
async_user.invention = 'World Wide Web'
await async_user.save(doc_id='tim')

print(f"👤 Async user: {async_user.name}")

# Create post with reference
async_post = async_posts.new()
async_post.title = 'Information Management Proposal'
async_post.author = async_user  # Async reference
await async_post.save(doc_id='post_async_1')

print(f"📝 Async post: {async_post.title}")

# Read back
retrieved = async_db.doc('doc_ref_async_posts/post_async_1')
await retrieved.fetch()

print(f"\n📄 Retrieved post: {retrieved.title}")
print(f"   Author (before lazy load): State = {retrieved.author.state}")

# Lazy loading works with async too!
print(f"   Author name: {retrieved.author.name}")
print(f"   Invention: {retrieved.author.invention}")
print(f"   Author (after lazy load): State = {retrieved.author.state}")

print("\n✅ Async references support lazy loading!")

👤 Async user: Tim Berners-Lee
📝 Async post: Information Management Proposal

📄 Retrieved post: Information Management Proposal
   Author (before lazy load): State = ATTACHED
   Author name: Tim Berners-Lee
   Invention: World Wide Web
   Author (after lazy load): State = LOADED

✅ Async references support lazy loading!


## Async Nested References

In [18]:
# Create team members
member1 = async_users.new()
member1.name = 'Vint Cerf'
await member1.save(doc_id='vint')

member2 = async_users.new()
member2.name = 'Bob Kahn'
await member2.save(doc_id='bob')

# Create project with nested references
async_project = async_posts.new()
async_project.title = 'TCP/IP Protocol'
async_project.team = {
    'lead': member1,
    'members': [member1, member2]
}
await async_project.save(doc_id='project_async_1')

print(f"🚀 Async project: {async_project.title}")

# Read back
retrieved_project = async_db.doc('doc_ref_async_posts/project_async_1')
await retrieved_project.fetch()

print("\n👥 Team:")
print(f"   Lead: {retrieved_project.team['lead'].name}")
print("   Members:")
for member in retrieved_project.team['members']:
    print(f"      • {member.name}")

print("\n✅ Async nested references work perfectly!")

🚀 Async project: TCP/IP Protocol

👥 Team:
   Lead: Vint Cerf
   Members:
      • Vint Cerf
      • Bob Kahn

✅ Async nested references work perfectly!


---

## Summary

### ✅ Key Capabilities

#### Automatic Conversion
1. **Assignment**: `post.author = user` converts FireObject → DocumentReference
2. **Retrieval**: Reading back converts DocumentReference → FireObject
3. **Nested**: Works recursively in lists and dicts
4. **Transparent**: Conversions happen automatically

#### Lazy Loading
1. **ATTACHED State**: Referenced objects start without data loaded
2. **Auto-Fetch**: First attribute access triggers data fetch
3. **LOADED State**: After fetch, subsequent accesses are instant
4. **Async Support**: Lazy loading works for both sync and async

#### Validation
1. **DETACHED Check**: Cannot assign unsaved FireObjects
2. **Type Safety**: Cannot mix sync and async FireObjects
3. **State Tracking**: Objects maintain state through lifecycle

### 🎯 Best Practices

#### ✅ DO:

**1. Save before referencing**
```python
user = users.new()
user.name = 'Ada'
user.save(doc_id='ada')  # Save first!
post.author = user       # Now can reference
```

**2. Use semantic key names in dicts**
```python
doc.contributors = {
    'author': author_user,
    'editor': editor_user,
    'reviewer': reviewer_user
}
```

**3. Query by reference**
```python
# Find all posts by an author
user_posts = posts.where('author', '==', user._doc_ref).get()
```

**4. Leverage lazy loading**
```python
# No need to manually fetch
author_name = post.author.name  # Automatically loads
```

#### ❌ DON'T:

**1. Reference DETACHED objects**
```python
# Bad - will raise ValueError
unsaved_user = users.new()
post.author = unsaved_user  # Error!
```

**2. Mix sync and async**
```python
# Bad - will raise TypeError
async_user = async_users.new()
await async_user.save()
sync_post.author = async_user  # Error!
```

**3. Create circular references**
```python
# Avoid - can cause infinite recursion
doc1.ref = doc2
doc2.ref = doc1
```

### 📊 Common Patterns

#### 1. Ownership Tracking
```python
doc.owner = user
doc.created_by = user
```

#### 2. Parent/Child Relationships
```python
reply.parent = thread
# Query children
replies = collection.where('parent', '==', thread._doc_ref).get()
```

#### 3. Many-to-Many via Lists
```python
paper.reviewers = [user1, user2, user3]
course.students = [student1, student2, ...]
```

#### 4. Cross-Collection References
```python
order.customer = customer  # Different collection
order.items = [product1, product2]  # Another collection
```

### 💡 Performance Tips

1. **Lazy Loading**: Referenced data loads on-demand (efficient)
2. **Caching**: Same reference returns same object instance
3. **Batch Queries**: Use `where()` to find related documents in one query instead of following references one at a time
4. **Avoid Deep Nesting**: Each level of references you follow triggers another fetch
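
To get a rough feel for the cost of resolving a nested structure, you can count how many references it contains and how many distinct documents they point to. The sketch below does this for the team structure from Feature 8, using a hypothetical `Ref` class and `iter_refs` helper. The distinct count is a lower bound on the reads needed to load every referenced document; whether repeated references to the same document share one fetch depends on the library's caching and is not verified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a stored reference (just a path)."""
    path: str


def iter_refs(value):
    """Yield every Ref found anywhere inside nested lists and dicts."""
    if isinstance(value, Ref):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_refs(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_refs(item)


lead = Ref("doc_ref_users/dennis")
dev1 = Ref("doc_ref_users/ken")
dev2 = Ref("doc_ref_users/brian")
team = {
    "lead": lead,
    "developers": [dev1, dev2],
    "structure": {"primary": lead, "secondary": [dev1, dev2]},
}

refs = list(iter_refs(team))
total_occurrences = len(refs)                          # every place a reference appears
distinct_documents = len({ref.path for ref in refs})   # documents that must be read at least once
```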

### 📚 Learn More

- **Firestore references**: https://firebase.google.com/docs/firestore/data-model#references
- **FireProx state machine**: See `state.py` documentation
- **Query documentation**: See pagination and queries notebooks