# Semantic Operators Tutorial

**Semantic Operators** provide a declarative API for performing common data transformation tasks using natural language.
This tutorial shows examples of `sem_map`, `sem_filter`, and `sem_agg`, which are implemented in `Agentics`.

### Semantic Map

Transform each record in your dataset according to natural language instructions, mapping source data to a target schema.

In [None]:
from agentics import AG
from agentics.core.semantic_operators import sem_map, sem_filter, sem_agg
from typing import Optional
from pprint import pprint

import pandas as pd
from pydantic import BaseModel, Field

my_llm = AG.get_llm_provider("litellm_proxy")

In [None]:
# Sample data
df = pd.DataFrame({
    'review': [
        'This product is amazing! Best purchase ever.',
        'Terrible quality, broke after one day.',
        'It works okay, nothing special.'
    ]
})

# Define target schema
class Sentiment(BaseModel):
    sentiment: Optional[str] = Field(None, description="The sentiment of the review (e.g., positive, negative, neutral)")
    confidence: Optional[float] = Field(None, description="Confidence score of the sentiment analysis between 0 and 1")

result = await sem_map(
    source=df,
    target_type=Sentiment,
    instructions="Analyze the sentiment of the review and provide a confidence score between 0 and 1.",
    llm=my_llm)
print(result)

# target_type can also be passed as a string instead of a Pydantic model
result = await sem_map(
    source=df,
    target_type="category",
    instructions="Classify the review into one of: positive, negative, neutral",
    llm=my_llm)
print(result)
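
The contract of a semantic map (one output record per input record, in the same order, shaped like the target schema) can be illustrated without an LLM. The following is a minimal sketch in plain Python, not the Agentics implementation: `SentimentRecord`, `toy_sentiment_model`, and `map_records` are hypothetical names, and the keyword rule is only a stand-in for the model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SentimentRecord:
    """Plain-Python stand-in for the Sentiment target schema."""
    sentiment: Optional[str] = None
    confidence: Optional[float] = None


def toy_sentiment_model(review: str) -> SentimentRecord:
    """Hypothetical stand-in for an LLM call: a crude keyword rule."""
    text = review.lower()
    if any(word in text for word in ("amazing", "best", "great")):
        return SentimentRecord("positive", 0.9)
    if any(word in text for word in ("terrible", "broke", "worst")):
        return SentimentRecord("negative", 0.9)
    return SentimentRecord("neutral", 0.5)


def map_records(
    rows: list[dict], model: Callable[[str], SentimentRecord]
) -> list[SentimentRecord]:
    """Apply the model to every row independently, preserving order."""
    return [model(row["review"]) for row in rows]


rows = [
    {"review": "This product is amazing! Best purchase ever."},
    {"review": "Terrible quality, broke after one day."},
    {"review": "It works okay, nothing special."},
]
mapped_rows = map_records(rows, toy_sentiment_model)
for row, record in zip(rows, mapped_rows):
    print(row["review"], "->", record)
```

Because each row is handled independently, the output always has the same length as the input, which is the property the `sem_map` examples above rely on.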

### Semantic Filter

Filter records based on a natural language predicate, keeping only those that satisfy the condition.

In [None]:
df = pd.DataFrame({
    'product': ['Laptop', 'Phone', 'Tablet', 'Monitor'],
    'description': [
        'High-performance gaming laptop with RGB keyboard',
        'Budget smartphone with basic features',
        'Premium tablet with stylus support',
        '4K monitor for professional work'
    ]
})

# Filter for premium/high-end products
result = await sem_filter(
    source=df,
    predicate_template="The product is premium or high-end",
    llm=my_llm,
    verbose_agent=False,
    verbose_transduction=False
)

print(result)

# Use field placeholders in the predicate
result = await sem_filter(
    source=df,
    predicate_template="The {product} described as '{description}' is suitable for gaming",
    llm=my_llm)

print(result)
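
To see what a predicate with field placeholders looks like once it is filled in for a single row, the sketch below uses Python's built-in `str.format`. This is an illustrative assumption: the tutorial does not show how Agentics resolves placeholders internally, and `render_predicate` is a hypothetical helper.

```python
def render_predicate(template: str, row: dict) -> str:
    """Fill {field} placeholders in the template with values from one row."""
    return template.format(**row)


template = "The {product} described as '{description}' is suitable for gaming"
row = {
    "product": "Laptop",
    "description": "High-performance gaming laptop with RGB keyboard",
}

rendered = render_predicate(template, row)
print(rendered)

# A template without placeholders is returned unchanged.
plain = render_predicate("The product is premium or high-end", row)
print(plain)
```

A template that names a field missing from the row raises `KeyError` under `str.format`, so it is worth checking that every placeholder matches a column name in the source data.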

### Semantic Aggregator

Aggregate data across all records to produce a summary or consolidated output.

In [None]:
df = pd.DataFrame({
    'review': [
        'Great product, very satisfied!',
        'Good quality but expensive',
        'Not worth the price',
        'Excellent, would buy again',
        'Decent but has some issues'
    ]
})

class ReviewSummary(BaseModel):
    overall_sentiment: str
    key_themes: list[str]
    recommendation: str

# Aggregate all reviews into a summary
result = await sem_agg(
    source=df,
    target_type=ReviewSummary,
    instructions="Summarize all reviews, identify key themes, and provide an overall recommendation",
    llm=my_llm)

pprint(result)

In [None]:
class Statistics(BaseModel):
    total_count: int
    positive_count: int
    negative_count: int
    average_sentiment: str

result = await sem_agg(
    source=df,
    target_type=Statistics,
    instructions="Count total reviews, positive reviews, negative reviews, and determine average sentiment",
    llm=my_llm)
pprint(result)
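
Counts produced by a language model are not guaranteed to be exact. One possible cross-check, sketched here as an assumption rather than an Agentics feature, is to obtain a label per review (for instance with `sem_map`) and do the counting in plain Python. The `labels` list below is hand-written for illustration and is not the output of any model.

```python
from collections import Counter

# Hypothetical per-review labels, hand-written for illustration only.
labels = ["positive", "neutral", "negative", "positive", "neutral"]

counts = Counter(labels)
stats = {
    "total_count": len(labels),
    "positive_count": counts["positive"],
    "negative_count": counts["negative"],
}
print(stats)
```

Comparing these deterministic counts with the `Statistics` object returned by `sem_agg` gives a quick sanity check on the aggregated numbers.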

### Integration with Agentics Workflows

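Semantic operators can be chained: the output of one operator is passed as the `source` of the next. The example below filters for premium products, classifies the price of each remaining product, and then summarizes the result.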

In [None]:
df = pd.DataFrame({
    'product': ['Laptop', 'Phone', 'Tablet', 'Monitor'],
    'description': [
        'High-performance gaming laptop with RGB keyboard',
        'Budget smartphone with basic features',
        'Premium tablet with stylus support',
        '4K monitor for professional work'
    ]
})

filtered = await sem_filter(
    source=df,
    predicate_template="The product is premium or high-end",
    llm=my_llm,
    verbose_agent=False,
    verbose_transduction=False
)
pprint(filtered)

# Classify the price of each product that passed the filter
mapped = await sem_map(
    source=filtered,
    target_type="price_category",
    instructions="Classify the price of the product into one of: high, medium, low",
    llm=my_llm)
pprint(mapped)

# Aggregate the mapped records into a single summary
result = await sem_agg(
    source=mapped,
    target_type="summary",
    instructions="Summarize all descriptions",
    llm=my_llm)
pprint(result)