# Get fast feedback on transformations

Test transformation logic on sample rows before processing your entire dataset.

**What's in this recipe:**
- Query transformations without adding columns
- Test on sample rows with `.head()`
- Speed up your iteration cycle


## Problem

You want to check transformation logic against real data, but running it over the full dataset just to see whether it works is slow and can be costly.

**The challenge:** How do you verify your logic works correctly without either:
- Writing throwaway test code that you'll delete later
- Waiting for expensive operations (API calls, model inference) to run on your full dataset

```python
# Adding the column directly runs the transformation on every row
t.add_computed_column(result=expensive_transform(t.col))
# Processes all 10,000 rows... then you realize the logic is wrong
```

You need a pattern for fast iteration that doesn't require temporary code or processing the full dataset.


## Solution

**Without Pixeltable:** Write temporary test code on a subset, verify it works, then rewrite it for the full dataset and hope you copied the logic correctly.

**With Pixeltable:** Use the same expression twice:

1. **Query**: Preview with `.select(expr).head()` 
2. **Commit**: Apply with `.add_computed_column(col=expr)`

Same expression means no transcription errors. Run expensive operations only after you've confirmed the logic works.
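
The idea can be sketched in plain Python before touching any table. This is an analogy only, not Pixeltable API: `expensive_transform`, `call_log`, and the generated rows are hypothetical stand-ins, and `itertools.islice` plays the role of `.head()`.

```python
from itertools import islice

call_log = []  # records every input the "expensive" function is called with


def expensive_transform(text: str) -> str:
    """Hypothetical stand-in for an API call or model inference."""
    call_log.append(text)
    return text.upper()


rows = [f"row {i}" for i in range(10_000)]

# Query: evaluate the transformation on the first 3 rows only
preview = [expensive_transform(r) for r in islice(rows, 3)]
preview_calls = len(call_log)

# Commit: once the preview looks right, apply the same function to every row
results = [expensive_transform(r) for r in rows]
total_calls = len(call_log)

print(preview)
print(f"calls during preview: {preview_calls}, calls overall: {total_calls}")
```

The preview costs three calls, so a mistake in the logic is caught before the remaining rows are processed.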

### Setup


In [None]:
%pip install -qU pixeltable

In [None]:
import pixeltable as pxt

In [None]:
# Create a fresh directory (drop existing if present)
pxt.drop_dir('demo_project', force=True)
pxt.create_dir('demo_project')

### Create sample data


In [None]:
t = pxt.create_table('demo_project.lyrics', {'text': pxt.String})
t.insert([
    {'text': 'Tumble out of bed and I stumble to the kitchen'},
    {'text': 'Pour myself a cup of ambition'},
    {'text': 'And yawn and stretch and try to come to life'},
    {'text': "Jump in the shower and the blood starts pumpin'"},
    {'text': "Out on the street, the traffic starts jumpin'"},
    {'text': 'With folks like me on the job from nine to five'}
])

print(f"Total rows: {t.count()}")

### Example 1: Built-in string methods

Apply the query-then-commit pattern with a built-in string method.


In [None]:
# Query: Test uppercase transformation on subset
t.select(
    t.text,
    uppercase=t.text.upper()
).head(2)

In [None]:
# Commit: Apply to all rows (same expression)
t.add_computed_column(uppercase=t.text.upper())

t.select(t.text, t.uppercase).show()

### Example 2: Custom UDF

Apply the same query-then-commit pattern with a user-defined function (UDF).


In [None]:
# Define a custom transformation
@pxt.udf
def word_count(text: str) -> int:
    return len(text.split())
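
Because the UDF body is ordinary Python, its logic can also be checked on a few literal strings before it runs against the table. The sketch below assumes a plain, undecorated copy of the function under the hypothetical name `count_words` (chosen so it does not shadow the UDF). The empty string is an added edge case, not a row from the sample data.

```python
def count_words(text: str) -> int:
    """Same logic as the UDF body: split on whitespace and count the pieces."""
    return len(text.split())


samples = [
    'Pour myself a cup of ambition',
    "Jump in the shower and the blood starts pumpin'",
    '',  # edge case: an empty string has no words
]

counts = [count_words(s) for s in samples]
print(counts)  # [6, 9, 0]
```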


In [None]:
# Query: Test UDF on subset
t.select(
    t.text,
    word_count=word_count(t.text)
).head(2)


In [None]:
# Commit: Apply to all rows (same expression)
t.add_computed_column(word_count=word_count(t.text))

t.select(t.text, t.word_count).show()


In [None]:
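# The column name is independent of the expression: this stores the same
# word count a second time, under the name `words`, and shows the full table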
t.add_computed_column(words=word_count(t.text))

t.show()

## Explanation

**Fast iteration matters:**
- Test transformation logic on 3 rows instead of 10,000
- Immediate feedback on whether logic is correct
- Iterate far faster: each attempt costs a handful of rows instead of a full-table run
- Build better transformations through rapid experimentation

**Pattern:**
```python
# Fast iteration loop
t.select(t.col, result=transform(t.col)).head(3)  # Test logic
t.select(t.col, result=transform_v2(t.col)).head(3)  # Refine logic
t.add_computed_column(result=transform_v2(t.col))  # Commit when ready
```
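
When refining from one version of a transformation to the next, it can help to see both outputs side by side on the same few rows. A minimal plain-Python sketch follows. The `compare` helper, the two transform bodies, and the sample rows are hypothetical illustrations, not part of Pixeltable.

```python
import string
from itertools import islice


def transform(text: str) -> int:
    """First attempt: count whitespace-separated tokens."""
    return len(text.split())


def transform_v2(text: str) -> int:
    """Refinement: ignore tokens that are punctuation only."""
    return sum(1 for tok in text.split() if tok.strip(string.punctuation))


def compare(rows, old, new, n=3):
    """Evaluate both versions on the first n rows and pair up the results."""
    return [(row, old(row), new(row)) for row in islice(rows, n)]


# Made-up rows for illustration; the fourth is never evaluated
rows = ['Tumble out of bed', 'nine - to - five', 'Pour myself a cup', 'not evaluated']

report = compare(rows, transform, transform_v2)
for row, old_value, new_value in report:
    flag = '' if old_value == new_value else '  <- differs'
    print(f'{row!r}: {old_value} -> {new_value}{flag}')
```

Rows where the two versions disagree are the ones worth inspecting before committing the refined logic.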

**Key insight:**
Optimizing for iteration speed means keeping feedback loops tight. Process small batches, learn fast, iterate. As Erik Bernhardsson notes in [Optimizing for iteration speed](https://erikbern.com/2017/07/06/optimizing-for-iteration-speed.html): "By keeping the feedback loop tight, you keep changing the combination of spices and learn from the feedback you get. Your recipe can evolve 10x or 100x faster."


## See also
- [Inspect large datasets](./iteration/inspect-large-datasets.ipynb) - Use `.head()` to view sample rows
- [Refine your transformations](./iteration/refine-transformations.ipynb) - Replace columns after testing
