# **Catalog AI**

This notebook implements a toy version of the **Catalog AI** system discussed in [Addressing Gen AI's Quality Control Problem](https://hbr.org/2025/09/addressing-gen-ais-quality-control-problem) (Harvard Business Review).

![description](../01-online-retail-simulator/article.png)

Generative AI systems can produce vast numbers of candidate product data improvements, but at scale the central problem is not generation—it is quality control and learning. When Amazon began automating product data improvements using large language models to enhance product `title`, `description`, and `features`, most generated outputs were initially unreliable, and only a small fraction improved business outcomes like conversion and revenue. The solution was not human review, but a self-learning system that combines automated guardrails to filter low-quality outputs, experimentation to measure impact on shopper behavior, and feedback loops that use experimental results to guide future generation. The system can test millions of product data hypotheses, discard most of them, and continuously learn from the few that work. 

This notebook introduces a simplified, toy version of that idea: how to build a <img src="../../_static/learn-decide-repeat.png" alt="LDR" style="height: 1em; vertical-align: middle;"> **Learn · Decide · Repeat** loop that turns large-scale product data analysis into action and learning over time.


![description](../01-online-retail-simulator/online-retail-simulator.png)

We use the **[Online Retail Simulator](https://github.com/eisenhauerIO/tools-catalog-generator)** to demonstrate three capabilities. First, we generate product `title`, `description`, and `features` using a local LLM via [**Ollama**](https://ollama.com/) with customizable prompts. Second, using simulated data, we apply causal impact measurement to establish whether these content improvements actually drive shopper behavior and sales. Third, we identify which positioning strategy, luxury or budget, produces the strongest business outcomes. Together, these steps illustrate how impact measurement guides the choice of content strategy.


Before running this notebook, execute the setup script `setup_catalog_ai.sh` to install and configure Ollama:

```bash
bash setup_catalog_ai.sh
```

The script installs the Online Retail Simulator package and sets up Ollama with the `llama3.2` model.
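Before generating content, it can help to confirm that the Ollama server is reachable and that `llama3.2` has been pulled. The sketch below is not part of the setup script. It assumes Ollama is serving on its default local address (`http://localhost:11434`) and queries the `/api/tags` endpoint, which lists locally available models. The helper names `model_available` and `check_ollama` are illustrative.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address


def model_available(tags_payload, model_name):
    """Return True if `model_name` appears in an /api/tags response payload."""
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    # Ollama reports names with a tag suffix (e.g. "llama3.2:latest"),
    # so match either the full name or the part before the colon.
    return any(n == model_name or n.split(":")[0] == model_name for n in names)


def check_ollama(model_name="llama3.2", base_url=OLLAMA_URL, timeout=5):
    """Return (server_reachable, model_present) for a local Ollama install."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, connection refused, or an unexpected response body
        return False, False
    return True, model_available(payload, model_name)
```

Calling `check_ollama()` returns a pair of booleans. If the first is `False`, the server is likely not running; if only the second is `False`, the model has probably not been pulled yet.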

In [None]:
# Local imports
from online_retail_simulator import simulate, load_job_results
from support import print_product_details

The **Ollama backend** uses a local LLM to generate unique, context-aware product `title`, `description`, and `features`. Custom prompts control tone and style, so the same products can be positioned in different ways. Below, we compare **luxury** and **budget** positioning for the same products.
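One way to picture how a prompt file could steer tone is to fill a template with a product's fields. In the sketch below, the template text, the placeholder names, the sample product record, and `build_prompt` are all hypothetical. The actual contents of `prompt_luxury.txt` and `prompt_budget.txt`, and the way the simulator fills them, may differ.

```python
# Hypothetical templates; the real prompt files used in this notebook may look different.
LUXURY_TEMPLATE = (
    "You are a copywriter for a premium brand. Write an elegant title, "
    "description, and feature list for a {category} product priced at ${price:.2f}."
)
BUDGET_TEMPLATE = (
    "You are a copywriter for a value retailer. Write a plain, savings-focused title, "
    "description, and feature list for a {category} product priced at ${price:.2f}."
)


def build_prompt(template, product):
    """Fill a prompt template with fields from one product record."""
    return template.format(category=product["category"], price=product["price"])


# Illustrative record only, not taken from the simulated catalog
product = {"product_identifier": "P-001", "category": "Electronics", "price": 49.5}
print(build_prompt(LUXURY_TEMPLATE, product))
print(build_prompt(BUDGET_TEMPLATE, product))
```

The product data is identical in both prompts; only the framing instructions change, which is what lets the two runs be compared.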

In [None]:
# Generate base product catalog with characteristics
job_info = simulate("config_simulation.yaml")

In [None]:
# Load the generated product data
products = load_job_results(job_info)["products"]

In [None]:
# Select 2 products for demonstration
sample_products = products[["product_identifier", "category", "price"]].head(2)
sample_products

In [None]:
# Display the luxury positioning prompt template
!cat prompt_luxury.txt

In [None]:
# Display the budget positioning prompt template
!cat prompt_budget.txt

In [None]:
# Standard library
import inspect

# Third-party packages
from IPython.display import Code

# Local imports
from online_retail_simulator.simulate.product_details_ollama import simulate_product_details_ollama

Code(inspect.getsource(simulate_product_details_ollama), language="python")

In [None]:
# Generate luxury-positioned product content using Ollama LLM
luxury_products = simulate_product_details_ollama(sample_products, prompt_path="prompt_luxury.txt")
print_product_details(luxury_products, "Luxury")

In [None]:
# Generate budget-positioned product content using Ollama LLM
budget_products = simulate_product_details_ollama(sample_products, prompt_path="prompt_budget.txt")
print_product_details(budget_products, "Budget")

The same product yields markedly different content depending on the prompt, which enables A/B testing of content strategies, audience targeting, and channel customization.
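To make the A/B testing idea concrete, here is a minimal sketch of comparing conversion rates between two content variants with a two-proportion z-test. It is not the simulator's measurement method. The function name `two_proportion_ztest` is hypothetical, and the counts are made up for illustration rather than produced by this notebook.

```python
import math


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (B minus A).

    Assumes the pooled rate is strictly between 0 and 1; otherwise the
    standard error is zero and the statistic is undefined.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_b - p_a, z, p_value


# Illustrative counts only: variant A (budget copy) vs. variant B (luxury copy)
lift, z, p = two_proportion_ztest(conv_a=120, n_a=2000, conv_b=150, n_b=2000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p:.4f}")
```

A test like this is only valid when shoppers are randomly assigned to variants; with observational data, a causal impact method that adjusts for confounding would be needed instead.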