# Integration Patterns & Multimodal Fusion

**Topics:** Multimodal fusion, encoder composition, hierarchical encoding
**Time:** 25 minutes
**Prerequisites:** 10, 14, 17 (scalar, ngram, image encoders)

---

## Pattern 1: Multimodal Fusion

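Combine text, numeric, and other data types into a single representation. Each modality is encoded separately, bound to its own dimension (role) vector, and the bound vectors are bundled into one hypervector.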

In [None]:
from holovec import VSA
from holovec.encoders import NGramEncoder, FractionalPowerEncoder

model = VSA.create('FHRR', dim=10000, seed=42)

# Setup encoders for different modalities
text_enc = NGramEncoder(model, n=3, seed=42)
rating_enc = FractionalPowerEncoder(model, min_val=1, max_val=5, seed=43)
price_enc = FractionalPowerEncoder(model, min_val=0, max_val=1000, seed=44)

# Define one random role vector per modality
# ("dimension" here means a modality key, not the vector size dim=10000)
TEXT_DIM = model.random(seed=100)
RATING_DIM = model.random(seed=101)
PRICE_DIM = model.random(seed=102)

print("Setup complete: text, rating, price encoders")

In [None]:
# Encode product review
review_text = "excellent quality fast shipping"
review_rating = 5.0
review_price = 49.99

# Encode each modality
text_hv = text_enc.encode(review_text.split())
rating_hv = rating_enc.encode(review_rating)
price_hv = price_enc.encode(review_price)

# Bind each to its dimension
text_bound = model.bind(TEXT_DIM, text_hv)
rating_bound = model.bind(RATING_DIM, rating_hv)
price_bound = model.bind(PRICE_DIM, price_hv)

# Bundle into multimodal representation
product_hv = model.bundle([text_bound, rating_bound, price_bound])

print(f"\nMultimodal product representation: {product_hv.shape}")

### Querying Multimodal Data

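We can query by any modality. Unbinding the fused vector with a modality's dimension vector returns a noisy estimate of that modality's encoding, because the other bundled terms act as noise.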

In [None]:
# Query: Extract rating
rating_query = model.unbind(product_hv, RATING_DIM)
print(f"\nExtracted rating similarity: {float(model.similarity(rating_query, rating_hv)):.3f}")
print(f"Original rating: {review_rating}")
print(f"Decoded rating: {rating_enc.decode(rating_query):.1f}")
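
The decode step above recovers a number from a noisy, unbound vector. The following is a minimal plain-Python sketch of one common way to do that: fractional power encoding followed by a grid search over candidate values. It does not use holovec and is not necessarily how `FractionalPowerEncoder.decode` works internally. The helper names (`fpe_encode`, `fpe_decode`, `random_phases`) and the 0.1 grid step are assumptions made for illustration.

```python
import cmath
import math
import random

DIM = 2048
rng = random.Random(42)

def random_phases(dim=DIM):
    """Random phase angles in (-pi, pi), one per component."""
    return [rng.uniform(-math.pi, math.pi) for _ in range(dim)]

def phasor(phases):
    """Turn phase angles into unit-magnitude complex components (FHRR style)."""
    return [cmath.exp(1j * p) for p in phases]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def unbind(composite, key):
    return [x * y.conjugate() for x, y in zip(composite, key)]

def bundle(vectors):
    """Elementwise sum, projected back to unit magnitude."""
    sums = [sum(comps) for comps in zip(*vectors)]
    return [s / abs(s) if abs(s) > 0 else 1 + 0j for s in sums]

def similarity(a, b):
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)

def fpe_encode(base_phases, value):
    """Fractional power encoding: raise each base phasor to the power `value`."""
    return [cmath.exp(1j * p * value) for p in base_phases]

def fpe_decode(noisy, base_phases, lo, hi, step=0.1):
    """Grid search: return the candidate whose encoding is most similar."""
    n_steps = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(n_steps + 1)]
    return max(candidates,
               key=lambda v: similarity(noisy, fpe_encode(base_phases, v)))

rating_base = random_phases()
RATING_ROLE = phasor(random_phases())
# Two random vectors stand in for the bound text and price terms
other_terms = [phasor(random_phases()) for _ in range(2)]

true_rating = 5.0
product = bundle([bind(RATING_ROLE, fpe_encode(rating_base, true_rating))]
                 + other_terms)

noisy_rating = unbind(product, RATING_ROLE)
decoded = fpe_decode(noisy_rating, rating_base, lo=1.0, hi=5.0)
print(f"decoded rating: {decoded:.1f}")
```

The unbound vector is only partially similar to the clean encoding, yet the search still lands near the true value because the noise terms are nearly orthogonal to every candidate.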

## Pattern 2: Hierarchical Encoding

Build layered representations (document → sections → paragraphs).

In [None]:
# Define hierarchy levels
DOCUMENT = model.random(seed=200)  # defined for completeness; unused in this single-section example
SECTION = model.random(seed=201)
PARAGRAPH = model.random(seed=202)

# Encode document structure
para1 = "machine learning algorithms process data"
para2 = "neural networks learn patterns"

section1 = model.bundle([
    model.bind(PARAGRAPH, text_enc.encode(para1.split())),
    model.bind(PARAGRAPH, text_enc.encode(para2.split()))
])

document = model.bind(SECTION, section1)

print("Hierarchical document encoded")
print("  Paragraphs → Section → Document")

In [None]:
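# Query: which paragraph is about "neural networks"?
# Note: this query has 2 tokens while text_enc uses n=3; if the encoder
# rejects sequences shorter than n, use a longer query or a smaller n.
query = text_enc.encode("neural networks".split())
section_content = model.unbind(document, SECTION)
para_match = model.unbind(section_content, PARAGRAPH)  # noisy bundle of both paragraphs

print("\nQuery 'neural networks' in document:")
print(f"Query vs. recovered section content: {float(model.similarity(query, para_match)):.3f}")
# Compare the query with each candidate paragraph to find the best match
for name, para in [("para1", para1), ("para2", para2)]:
    score = float(model.similarity(query, text_enc.encode(para.split())))
    print(f"Similarity to {name}: {score:.3f}")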
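
Unbinding through both levels returns a superposition of every paragraph in the section, so a query still has to be compared against candidate paragraphs to pick one out. The sketch below illustrates this in plain Python with random vectors standing in for encoded paragraphs. It does not use holovec, and all helper names are assumptions made for illustration.

```python
import cmath
import math
import random

DIM = 2048
rng = random.Random(7)

def random_hv(dim=DIM):
    """Random FHRR-style hypervector: unit-magnitude complex phasors."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def unbind(composite, key):
    return [x * y.conjugate() for x, y in zip(composite, key)]

def bundle(vectors):
    """Elementwise sum, projected back to unit magnitude."""
    sums = [sum(comps) for comps in zip(*vectors)]
    return [s / abs(s) if abs(s) > 0 else 1 + 0j for s in sums]

def similarity(a, b):
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)

SECTION, PARAGRAPH = random_hv(), random_hv()
# Random stand-ins for two encoded paragraphs and one unrelated text
para_a, para_b, unrelated = random_hv(), random_hv(), random_hv()

section = bundle([bind(PARAGRAPH, para_a), bind(PARAGRAPH, para_b)])
document = bind(SECTION, section)

# Walk back down the hierarchy: document -> section -> paragraph content
recovered = unbind(unbind(document, SECTION), PARAGRAPH)
sim_a = similarity(recovered, para_a)
sim_b = similarity(recovered, para_b)
sim_unrelated = similarity(recovered, unrelated)
print(f"recovered vs para_a: {sim_a:.3f}")
print(f"recovered vs para_b: {sim_b:.3f}")
print(f"recovered vs unrelated: {sim_unrelated:.3f}")

# A noisy query resembling para_b; rank the candidates to retrieve it
query = bundle([para_b, random_hv()])
candidates = {"para_a": para_a, "para_b": para_b}
best = max(candidates, key=lambda name: similarity(query, candidates[name]))
print(f"best match for query: {best}")
```

Both stored paragraphs score about equally against the recovered vector, which is why the final ranking step uses the query itself.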

## Pattern 3: Context Binding

Create context-dependent representations (word sense disambiguation).

In [None]:
# Encode "bank" in different contexts
WORD = model.random(seed=300)
CONTEXT = model.random(seed=301)

bank = model.random(seed=400)
context_river = text_enc.encode("river water shore".split())
context_money = text_enc.encode("money deposit account".split())

# Create context-dependent representations
bank_river = model.bundle([
    model.bind(WORD, bank),
    model.bind(CONTEXT, context_river)
])

bank_money = model.bundle([
    model.bind(WORD, bank),
    model.bind(CONTEXT, context_money)
])

print("Created context-dependent word representations")

In [None]:
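# Test: which sense matches "deposit money"?
# Note: n-gram encodings typically depend on token order, and text_enc uses
# n=3, so this 2-token phrase may need a longer context or a smaller n.
test_context = text_enc.encode("deposit money".split())
test_bank = model.bundle([
    model.bind(WORD, bank),
    model.bind(CONTEXT, test_context)
])

# Both scores include the shared bind(WORD, bank) term, so compare them
# relative to each other rather than against zero
sim_river = float(model.similarity(test_bank, bank_river))
sim_money = float(model.similarity(test_bank, bank_money))
print("\nTest context: 'deposit money'")
print(f"Similarity to riverbank: {sim_river:.3f}")
print(f"Similarity to financial: {sim_money:.3f}")
print(f"Best match: {'financial' if sim_money > sim_river else 'riverbank'}")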
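
Because every sense vector contains the same bound word term, two senses stay clearly similar even when their contexts are unrelated. The plain-Python sketch below shows that baseline, and one way to remove it by unbinding the context role before comparing. Random vectors stand in for encoded contexts; it does not use holovec, and the helper names are assumptions made for illustration.

```python
import cmath
import math
import random

DIM = 2048
rng = random.Random(11)

def random_hv(dim=DIM):
    """Random FHRR-style hypervector: unit-magnitude complex phasors."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def unbind(composite, key):
    return [x * y.conjugate() for x, y in zip(composite, key)]

def bundle(vectors):
    """Elementwise sum, projected back to unit magnitude."""
    sums = [sum(comps) for comps in zip(*vectors)]
    return [s / abs(s) if abs(s) > 0 else 1 + 0j for s in sums]

def similarity(a, b):
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)

WORD, CONTEXT, bank = random_hv(), random_hv(), random_hv()
ctx_river, ctx_money = random_hv(), random_hv()  # stand-ins for encoded contexts

def sense(context):
    """Word plus context, as in the cell above."""
    return bundle([bind(WORD, bank), bind(CONTEXT, context)])

bank_river, bank_money = sense(ctx_river), sense(ctx_money)

# Test context: a noisy version of the money context
test_ctx = bundle([ctx_money, random_hv()])
test_bank = sense(test_ctx)

# Full-vector comparison: both scores include the shared WORD term
full_river = similarity(test_bank, bank_river)
full_money = similarity(test_bank, bank_money)

# Context-only comparison: unbind the CONTEXT role from each stored sense
only_river = similarity(unbind(bank_river, CONTEXT), test_ctx)
only_money = similarity(unbind(bank_money, CONTEXT), test_ctx)

print(f"full vectors:  river={full_river:.3f}  money={full_money:.3f}")
print(f"context only:  river={only_river:.3f}  money={only_money:.3f}")
```

In the full-vector comparison the non-matching sense still scores well above zero, so only the gap between the two scores is meaningful. The context-only comparison pushes the non-matching sense toward zero.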

## Summary: Integration Patterns

✓ **Multimodal Fusion**: Combine text and numeric data (the same recipe extends to other modalities)  
✓ **Hierarchical Encoding**: Build layered structures  
✓ **Context Binding**: Create context-dependent meanings  
✓ **Modular Design**: Mix and match encoders

### Recipe for Multimodal Fusion:
1. Create an encoder for each modality
2. Define a unique dimension (role) vector per modality
3. Encode each modality separately
4. Bind each encoding to its dimension vector: bind(DIM, encoded_value)
5. Bundle all bound vectors: bundle([text_bound, rating_bound, ...])
6. Query by unbinding with a specific dimension vector
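
Steps 4 to 6 of the recipe can be wrapped in two small helpers. The sketch below is plain Python rather than holovec code. The names `fuse` and `query`, the dictionary layout, and the use of a categorical codebook for cleanup are all assumptions made for illustration.

```python
import cmath
import math
import random

DIM = 2048
rng = random.Random(3)

def random_hv(dim=DIM):
    """Random FHRR-style hypervector: unit-magnitude complex phasors."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def unbind(composite, key):
    return [x * y.conjugate() for x, y in zip(composite, key)]

def bundle(vectors):
    """Elementwise sum, projected back to unit magnitude."""
    sums = [sum(comps) for comps in zip(*vectors)]
    return [s / abs(s) if abs(s) > 0 else 1 + 0j for s in sums]

def similarity(a, b):
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)

def fuse(record, roles):
    """Steps 4-5: bind each encoded value to its role vector, then bundle."""
    return bundle([bind(roles[name], hv) for name, hv in record.items()])

def query(fused, roles, name, codebook):
    """Step 6 plus cleanup: unbind one role, return the closest known label."""
    noisy = unbind(fused, roles[name])
    return max(codebook, key=lambda label: similarity(noisy, codebook[label]))

# Step 2: one role vector per modality
roles = {name: random_hv() for name in ("text", "rating", "price")}

# Hypothetical categorical codebook: one random vector per star rating
rating_codebook = {stars: random_hv() for stars in range(1, 6)}

# Step 3 stand-ins: random vectors in place of real text and price encodings
record = {
    "text": random_hv(),
    "rating": rating_codebook[5],
    "price": random_hv(),
}

product = fuse(record, roles)
decoded_stars = query(product, roles, "rating", rating_codebook)
print(f"decoded rating: {decoded_stars} stars")
```

Keeping the roles in a dictionary makes it easy to add a modality: add one role vector and one entry per record, and the fused vector keeps the same size.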

### Applications:
- Multimodal search (text + image + metadata)
- Document understanding (hierarchical structure)
- Recommendation systems (user + item features)
- Semantic search (context-aware)

### Next Steps
- Apply patterns to your domain
- Combine with cleanup strategies
- Build domain-specific applications