# SAP Material Embeddings - Introduction

## 🎯 What This Notebook Covers

This notebook series demonstrates how to generate **multimodal embeddings** for SAP material master data.

### Topics:

1. **Introduction** (this notebook)
2. **Text Embeddings** - Semantic similarity from descriptions
3. **Multimodal Embeddings** - Combining multiple features
4. **Duplicate Detection** - Finding duplicate candidates: 253 pairs with multimodal embeddings vs. 16 with text only (+1481%)
5. **Advanced Analysis** - Deep dive into components

---

## 🔑 Key Concepts

### Traditional Approach (String Matching)

```python
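# Example: normalized Levenshtein similarity (similarity() is an illustrative helper)
# Edit distance is 2 over 14 characters: 1 - 2/14 ≈ 0.86
similarity('Steel Bolt M8', 'Steel Bolt M10')  # ≈ 0.86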
```

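**Problem:** String matching only compares characters. It rates these two different bolt sizes as near-identical, and it misses semantic similarity and context, such as the same part described in different words.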
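
A character-level score like the one above can be reproduced with a normalized Levenshtein similarity. The sketch below is a minimal pure-Python version; the function names and the normalization by the longer string's length are assumptions made for illustration, not necessarily what this project uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))           # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]                             # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                  # delete ca
                cur[j - 1] + 1,               # insert cb
                prev[j - 1] + (ca != cb),     # substitute (free if characters match)
            ))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Two different bolt sizes: only two character edits apart
print(round(similarity("Steel Bolt M8", "Steel Bolt M10"), 2))  # 0.86

# The same part, worded differently
print(round(similarity("Steel Bolt M8", "Bolt, steel, M8"), 2))
```

The reworded description of the same part scores lower than the pair of different sizes, which is the gap an embedding-based approach aims to close.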

### Our Approach (Multimodal Embeddings)

```python
material = {
    'MAKTX': 'Steel Bolt M8x50 DIN 933',       # Text
    'MATKL': 'BOLTS',                           # Category
    'characteristics': {                        # Technical specs
        'DIAMETER': 'M8',
        'LENGTH': '50mm',
        'MATERIAL': 'STEEL'
    },
    'plants': ['Plant_1001', 'Plant_1002'],    # Context
    'suppliers': ['SUPP_100', 'SUPP_200']
}

embedding = encode_multimodal(material)  # → 768-d vector
```

**Advantage:** Captures semantic meaning, technical specs, and usage context.
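
To make the idea of combining modalities concrete, here is a toy sketch that encodes each part of the record separately, weights it, and concatenates the results into one normalized vector. Everything in it is an assumption for illustration: `encode_multimodal_sketch`, the hashing encoder, the 16-dimension blocks, and the weights are hypothetical and are not the project's `encode_multimodal`. In a real pipeline the text block would come from a sentence-embedding model rather than feature hashing.

```python
import hashlib
import math
from typing import Optional


def hashed_vector(tokens: list, dim: int = 16) -> list:
    """Toy feature-hashing encoder: each token increments one bucket."""
    vec = [0.0] * dim
    for token in tokens:
        digest = hashlib.md5(token.lower().encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vec


def l2_normalize(vec: list) -> list:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def encode_multimodal_sketch(material: dict, weights: Optional[dict] = None) -> list:
    """Concatenate weighted, normalized per-modality vectors (illustrative only)."""
    # Hypothetical weights; a real system would tune these.
    weights = weights or {"text": 0.5, "category": 0.2, "specs": 0.2, "context": 0.1}
    parts = {
        "text": material["MAKTX"].split(),
        "category": [material["MATKL"]],
        "specs": [f"{k}={v}" for k, v in material["characteristics"].items()],
        "context": material["plants"] + material["suppliers"],
    }
    combined = []
    for name, tokens in parts.items():
        block = l2_normalize(hashed_vector(tokens))
        combined.extend(weights[name] * x for x in block)
    return l2_normalize(combined)


material = {
    "MAKTX": "Steel Bolt M8x50 DIN 933",
    "MATKL": "BOLTS",
    "characteristics": {"DIAMETER": "M8", "LENGTH": "50mm", "MATERIAL": "STEEL"},
    "plants": ["Plant_1001", "Plant_1002"],
    "suppliers": ["SUPP_100", "SUPP_200"],
}

embedding = encode_multimodal_sketch(material)
print(len(embedding))  # 64 (4 modalities x 16 dimensions)
```

Normalizing each block before weighting keeps a modality with many tokens, such as a long supplier list, from dominating the combined vector.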

---

## 📊 Results Preview

**Duplicate Detection:**
- Text-only: 16 pairs found
- Multimodal: 253 pairs found
- **Improvement: +1481%** 🚀
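
The percentage follows directly from the two pair counts above, as this short calculation shows (the variable names are illustrative):

```python
text_only_pairs = 16
multimodal_pairs = 253

# Relative increase in the number of detected pairs
improvement_pct = (multimodal_pairs - text_only_pairs) / text_only_pairs * 100
print(f"+{improvement_pct:.0f}%")  # +1481%
```

Put differently, the multimodal run flags about 15.8 times as many pairs as the text-only run.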

---

## 🚀 Quick Start


```python
# Setup
import sys
from pathlib import Path

# Add project root to path
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root))

print(f"✓ Project root: {project_root}")
```

```
✓ Project root: /Users/antonio/Documents/Herramientas SAP/ML/RPT-1/materials-sap-embeddings
```
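
`Path.cwd().parent` works when the notebook runs from a direct subfolder of the project. If the working directory may vary, one option is to search upward for the `src` package that the imports in this notebook rely on. The helper below is a hypothetical sketch, not part of the repository; it falls back to the parent directory when no `src` folder is found.

```python
import sys
from pathlib import Path


def find_project_root(start: Path, marker: str = "src") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    # Fallback: same assumption as Path.cwd().parent
    return start.parent


project_root = find_project_root(Path.cwd())
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

print(f"Project root: {project_root}")
```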


```python
# Import key modules
from src.embeddings.text_embeddings import MaterialEmbeddings
from src.embeddings.multimodal_embeddings import MultimodalMaterialEmbeddings
from src.sap_connector import create_sample_materials

print("✓ Imports successful")
```

```
✓ Imports successful
```


```python
# Generate sample data
materials = create_sample_materials(n_materials=5)

print(f"Generated {len(materials)} materials\n")
print("Example material:")
print(f"  MATNR: {materials[0]['MATNR']}")
print(f"  Description: {materials[0]['MAKTX']}")
print(f"  Group: {materials[0]['MATKL']}")
print(f"  Characteristics: {materials[0]['characteristics']}")
print(f"  Plants: {materials[0]['plants']}")
print(f"  Suppliers: {materials[0]['suppliers']}")
```

```
Generated 5 materials

Example material:
  MATNR: MAT000001
  Description: Steel Rivet 5x15
  Group: RIVETS
  Characteristics: {'DIAMETER': '5mm', 'LENGTH': '15mm', 'MATERIAL': 'STEEL'}
  Plants: ['Plant_1005', 'Plant_1004']
  Suppliers: ['SUPP_100139', 'SUPP_100022', 'SUPP_100108']
```
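
The printed record suggests the shape each material dictionary takes. As a reading aid, that shape can be written down as a `TypedDict` with a small completeness check. The `Material` class and `missing_fields` helper are assumptions inferred from the output above, not definitions taken from `src.sap_connector`.

```python
from typing import TypedDict


class Material(TypedDict):
    MATNR: str                        # material number
    MAKTX: str                        # short description
    MATKL: str                        # material group
    characteristics: dict[str, str]   # technical specs, e.g. {"DIAMETER": "5mm"}
    plants: list[str]                 # plants where the material is used
    suppliers: list[str]              # suppliers of the material


def missing_fields(record: dict) -> list:
    """Return the expected keys that are absent from `record`."""
    return [key for key in Material.__annotations__ if key not in record]


sample = {
    "MATNR": "MAT000001",
    "MAKTX": "Steel Rivet 5x15",
    "MATKL": "RIVETS",
    "characteristics": {"DIAMETER": "5mm", "LENGTH": "15mm", "MATERIAL": "STEEL"},
    "plants": ["Plant_1005", "Plant_1004"],
    "suppliers": ["SUPP_100139", "SUPP_100022", "SUPP_100108"],
}

print(missing_fields(sample))                  # []
print(missing_fields({"MATNR": "MAT000002"}))  # every field except MATNR
```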


---

## ✅ Next Steps

Continue to:
- **Notebook 02**: Text Embeddings
- **Notebook 03**: Multimodal Embeddings
- **Notebook 04**: Duplicate Detection

---

## 📚 Resources

- [GitHub Repository](https://github.com/AntonioLeites/materials-sap-embeddings)
- [Sentence Transformers](https://www.sbert.net/)
- [SAP Documentation](https://help.sap.com/)
