A field-ready CLI for running synthetic Korean persona interviews on top of OpenAI, Anthropic Claude, or any OpenAI-compatible local LLM (mlx_lm.server, vLLM, llama.cpp). Pair the NVIDIA Nemotron-Personas-Korea dataset (CC BY 4.0, about 1M Korean synthetic personas) with the model of your choice to pressure-test product ideas, interview guides, and persona hypotheses before recruiting real participants.
The tool ships four CLI subcommands (healthcheck, list-personas, interview, report), a JSON output mode for machine-to-machine use, and a Model Context Protocol (MCP) entry point that runs in either MCP server mode (server-side OpenAI/Anthropic calls) or MCP orchestrator mode (the host agent's sub-agent does the LLM work).
- Multi-turn interviews with 1M+ Korean synthetic personas (NVIDIA Nemotron-Personas-Korea, CC BY 4.0)
- Three inference targets: OpenAI Chat Completions API, Anthropic Messages API, and any OpenAI-compatible local server
- Async batch runner with concurrency 1-10, tqdm progress, SIGINT partial save, and exit-code 3 partial-failure detection
- Persona drift detection with sentence-bounded first-person assertions for the gender/age/region/family-type axes (negation guards, third-person exclusion) plus an English-ratio safety net
- `--persona-id` to pin specific personas by uuid for A/B comparisons; `--resume PATH` to re-run only the failed records of a previous batch
- `--insight-model` to run interviews on a small model and the qualitative-insight call on a larger one
- OpenAI streaming (`llm.streaming: true`) and Anthropic prompt caching (`llm.anthropic_cache_control: true`, default on)
- LLM-as-judge drift refinement (`heuristics.llm_drift_review`, opt-in) for clearing false positives
- `acceptable_price_signal` (cheap/fair/expensive/null) on every structured summary, plus optional WTP recommendation from the signal distribution
- MCP entry point (`python -m src.mcp_server`) for Claude Code, Cursor, and Codex. `mcp.mode` toggles between `orchestrator` (default, no server-side key) and `server` (server-side OpenAI/Anthropic calls)
- Automatic markdown report after every run (toggle with `--no-report`) and `--json` root mode for shell scripts
- Single-turn mode (`--single-turn`) that bundles every question into one chat call to cut tokens
- Token usage (prompt / completion / cached) printed at the end of every run and embedded in the JSON and report header
- Reproducible sampling via `--seed`. Same seed plus same filter plus same dataset version returns the same personas
- Operational hardening: persona ids sha256-masked in logs, `outputs/` created with mode 0700 (result files 0600), `--product` and per-question text length-capped at 2000 chars with prompt-injection guards
- No external telemetry. Outbound calls go only to the configured LLM endpoint and (on first run) Hugging Face Hub for the dataset
- Python 3.12 (pinned in `.python-version`)
- uv package manager
- An API key for the provider you plan to use:
  - `OPENAI_API_KEY` for `provider=openai` (default). Get one at https://platform.openai.com/api-keys
  - `ANTHROPIC_API_KEY` for `provider=anthropic`. Get one at https://console.anthropic.com/
  - For local LLMs (mlx_lm.server, vLLM, llama.cpp) keep `provider=openai` and use any non-empty value
- Internet access for the LLM API call and the first dataset download (about 1M records, cached afterwards under `~/.cache/huggingface`)
- macOS, Linux, and Windows are all supported. There is no Apple Silicon, GPU, or local-runtime requirement
.python-version pins Python 3.12, so uv venv picks the right interpreter automatically. Production deploys must install from the lockfiles to keep the resolved graph identical across environments.
```bash
uv venv --python 3.12
source .venv/bin/activate
uv pip sync requirements.lock requirements-dev.lock
```

Recompile the lockfiles after editing `requirements*.txt`:

```bash
uv pip compile requirements.txt -o requirements.lock
uv pip compile requirements-dev.txt -o requirements-dev.lock
```

To run the CLI as `kpi` and the MCP server as `kpi-mcp-server` from anywhere, install the project in editable mode after the dependency sync:

```bash
uv pip install -e .
```

Plain pip works too if you cannot use uv:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```

Direct runtime dependencies live in `pyproject.toml` (`[project.dependencies]`). The official openai and anthropic SDKs are intentionally not used; calls go through httpx so the project keeps its dependency tree small and owns the retry, timeout, and logging policy. See docs/adr/2026-05-02-openai-backend-migration.md for the rationale.
Five commands take you from a fresh checkout to a finished report. The first interview run downloads the dataset (5-10 minutes); subsequent runs start in under 30 seconds.
```bash
export OPENAI_API_KEY=sk-...
python main.py healthcheck
python main.py list-personas --filter "age:25-39,region:서울특별시" --limit 20
python main.py interview --product "1인 가구용 반찬 정기배송, 월 39,900원, 주 2회 배송" --filter "age:25-39,region:서울특별시" --n 10 --questions "이 서비스 쓰실 의향 있나요?" "월 얼마면 적당한가요?" "거절한다면 왜요?"
python main.py report outputs/interview_korea-persona-interview_20260502_120000.json
```

The interview command auto-generates the markdown report (default `--report`); the standalone report step is only needed if you used `--no-report`, edited the JSON, or want to regenerate with different `--top-n` or `--include-drift` settings.
A .env file at the project root with OPENAI_API_KEY=sk-... (or ANTHROPIC_API_KEY=sk-ant-...) is picked up automatically. Existing shell environment variables take precedence over .env.
To use Claude instead, set ANTHROPIC_API_KEY and pass --provider anthropic.
```bash
export ANTHROPIC_API_KEY=sk-ant-...
python main.py interview --provider anthropic --model claude-haiku-4-5 --product "..." --questions "..." --n 10
```

To use a local OpenAI-compatible server, keep `provider=openai` and override `--base-url`. Any non-empty `OPENAI_API_KEY` works; local servers ignore the value.

```bash
export OPENAI_API_KEY=local
python main.py interview --base-url http://localhost:8080/v1 --model llama-3-8b --product "..." --questions "..." --n 10
```

A basic run with an explicit seed:

```bash
python main.py interview --product "1인 가구용 반찬 정기배송, 월 39,900원, 주 2회 배송" --filter "age:25-39,region:서울특별시" --n 10 --seed 42 --questions "이 서비스 쓰실 의향 있나요?" "월 얼마면 적당한가요?" "거절한다면 왜요?"
```

It produces a markdown report with intent share (positive/neutral/negative), willingness-to-pay median plus IQR, top rejection reasons, and 5-10 actionable insights for the next round.
Pin the same persona ids across two runs by extracting them from the first batch and replaying them on the second.
python main.py interview --product "직장인 1인 가구를 위한 건강 반찬, 월 39,900원" --filter "age:25-39,region:서울특별시" --n 10 --seed 42 --questions "쓸 의향?" "월 얼마면?" "거절 사유?" --output outputs/copy-a/
python -c "import json,sys; d=json.load(open(sys.argv[1])); print('\n'.join(r['persona_id'] for r in d['records']))" outputs/copy-a/interview_*.json > /tmp/persona_ids.txt
xargs -I {} echo --persona-id {} < /tmp/persona_ids.txt | xargs python main.py interview --product "주말에 받는 1주일치 한식 반찬 박스, 월 39,900원" --questions "쓸 의향?" "월 얼마면?" "거절 사유?" --output outputs/copy-b/Both runs interview the exact same persona ids, so the only variable is the product copy.
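If the xargs pipeline feels fragile on your shell, a short Python driver does the same thing. This is a sketch, not part of the CLI; it assumes a single copy-a result JSON under `outputs/copy-a/` and reuses the documented `--persona-id` flag, passing the questions the same way as the shell example above.

```python
# Sketch: replay the copy-a persona ids against the copy-b product text.
import glob
import json
import subprocess

result_path = sorted(glob.glob("outputs/copy-a/interview_*.json"))[-1]
with open(result_path, encoding="utf-8") as f:
    records = json.load(f)["records"]

cmd = ["python", "main.py", "interview",
       "--product", "주말에 받는 1주일치 한식 반찬 박스, 월 39,900원",
       "--questions", "쓸 의향?", "월 얼마면?", "거절 사유?",
       "--output", "outputs/copy-b/"]
for record in records:
    cmd += ["--persona-id", record["persona_id"]]

subprocess.run(cmd, check=True)
```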
python main.py interview --product "직장인 1인 가구를 위한 건강 반찬 정기배송" --filter "age:20-29" --n 15 --seed 42 --questions "쓸 의향?" "월 얼마면?" "거절 사유?" --output outputs/cohort-20s/
python main.py interview --product "직장인 1인 가구를 위한 건강 반찬 정기배송" --filter "age:30-39" --n 15 --seed 42 --questions "쓸 의향?" "월 얼마면?" "거절 사유?" --output outputs/cohort-30s/The cohort intent table inside each report further splits by region and gender, so you can see whether a 20s/30s gap holds across all regions or comes from one segment.
Single-turn mode bundles every question into one chat call, which roughly halves the prompt tokens versus multi-turn. The auto follow-up is disabled in this mode.
python main.py interview --product "1인 가구용 반찬 정기배송, 월 39,900원" --filter "age:20-49" --n 100 --seed 42 --concurrency 8 --single-turn --questions "이 서비스 쓸 의향?" "월 얼마면 적당?" "거절 사유?"A 30-person batch hit rate-limit storms and the run exited with code 3. Re-run only the failed records on top of the previous JSON.
python main.py interview --product "..." --filter "..." --n 30 --seed 42 --questions "..." --resume outputs/interview_korea-persona-interview_20260502_120000.jsonmeta_extra.previous_run_id is set to the original interview_id so the two runs can be linked.
`willingness_to_pay` is filled in only when the persona names a specific number. To maximize the explicit-number rate, ask a direct value-pricing question.
- "본인은 월 얼마면 가입하시겠어요?" (anchored to a monthly subscription)
- "월 39,900원이면 가입할 의향이 있으세요? 아니면 얼마면 적당할까요?" (counter-offer prompt)
- "비슷한 서비스에 한 달에 얼마까지 쓸 수 있어요?" (ceiling probe)
Open-ended price questions often return only a qualitative signal (`acceptable_price_signal`), which is filled for every record but does not produce a `willingness_to_pay` integer.
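To check how well your price question performs, you can measure the explicit-number rate directly from a result JSON. A minimal sketch, assuming only the documented `structured_summary` fields:

```python
# Sketch: share of records that produced an explicit willingness_to_pay number
# versus only a qualitative acceptable_price_signal.
import json
import sys

with open(sys.argv[1], encoding="utf-8") as f:
    records = json.load(f)["records"]

with_number = [r for r in records
               if r["structured_summary"].get("willingness_to_pay") is not None]
signals = [r["structured_summary"].get("acceptable_price_signal") for r in records]

print(f"explicit-number rate: {len(with_number)}/{len(records)}")
print("signal distribution:", {s: signals.count(s) for s in set(signals)})
```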
| Command | Description | Exit codes |
|---|---|---|
| `healthcheck` | Verify provider reachability and model availability | 0 ok, 1 missing key / 401 / 429 / unreachable |
| `list-personas` | Preview personas matching a filter | 0 ok, 2 no match |
| `interview` | Run a batch interview, save JSON, auto-generate report | 0 ok, 1 server error, 2 sample shortfall, 3 partial failure |
| `report` | Generate a markdown report from an interview JSON | 0 ok, 1 input error, 2 no valid records |
Exit code 130 is reserved for SIGINT (Ctrl-C). The first interrupt saves a partial JSON; the second terminates immediately.
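For unattended batches, the exit codes are enough to drive a simple retry. A sketch of a wrapper that re-runs failed records once when the interview exits with code 3 (flags and the output filename pattern follow the documentation above; adjust paths and arguments to your setup):

```python
# Sketch: run a batch and, on exit code 3 (partial failure), resume the failed records once.
import glob
import subprocess

base_cmd = ["python", "main.py", "interview",
            "--product", "...", "--filter", "age:25-39", "--n", "30", "--seed", "42",
            "--questions", "..."]

result = subprocess.run(base_cmd)
if result.returncode == 3:
    latest = sorted(glob.glob("outputs/interview_*.json"))[-1]
    subprocess.run(base_cmd + ["--resume", latest], check=True)
```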
These apply to every subcommand and must be placed before the subcommand name.
| Option | Default | Description |
|---|---|---|
| `--config PATH` | `config.yaml` in cwd | Override the config file path |
| `--no-color` | off | Disable ANSI color output (also honors the `NO_COLOR` env var) |
| `--log-level LEVEL` | `INFO` (from yaml) | Set log level: DEBUG/INFO/WARNING/ERROR |
| `--json` | off | Emit a single JSON document on stdout. Disables tqdm, color, and Korean labels. Errors land as `{"error": {...}}` with non-zero exit |
| Option | Default | Description |
|---|---|---|
| `--product TEXT` | required | One-line product description (max 2000 chars) |
| `--questions TEXT` | required, repeatable | Each question is one `--questions` flag (max 2000 chars each) |
| `--filter SPEC` | none | Filter DSL (see below) |
| `--persona-id UUID` | none, repeatable | Pin specific persona ids by uuid. Disables `--n` and `--seed` randomization. Combine with `--filter` for an intersection |
| `--n N` | `10` | Number of personas |
| `--seed N` | `42` | Sampling seed |
| `--concurrency N` | `4` | Async concurrency, range 1-10 |
| `--persona-fields LIST` | `summary` | Comma-separated toggles: summary, professional, sports, arts, travel, culinary, family |
| `--follow-up TEXT` | none, repeatable | Common follow-up question for every persona |
| `--single-turn` | off | Bundle every question into one chat call. Auto follow-up disabled |
| `--dry-run` | off | Run one persona, print to console, write neither JSON nor report |
| `--output DIR` | `outputs/` | Result JSON directory |
| `--report / --no-report` | `--report` | Auto-generate the markdown report after the interview |
| `--resume PATH` | none | Re-run only the failed records of a previous result JSON |
| `--provider {openai,anthropic}` | from `llm.provider` | LLM provider |
| `--base-url URL` | from `llm.base_url` | LLM server base URL |
| `--model MODEL_ID` | from `llm.model` | One-shot model override |
| Option | Default | Description |
|---|---|---|
| `RESULT_PATH` | required (positional) | Path to an interview JSON |
| `--top-n N` | `10` | Number of top rejection reasons |
| `--include-drift` | off | Include `status: drift` records in quantitative aggregation |
| `--output-dir DIR` | next to input JSON | Where to save the markdown report |
| `--insight-model MODEL_ID` | from `common.report.insight_model` or `--model` | Use a different model for the qualitative-insight call only |
healthcheck and list-personas accept the same provider/base-url/model triple plus filter/limit/seed. See python main.py {sub} --help for the full list.
Filters use key:value pairs separated by commas. Different keys combine with AND, repeated keys combine with OR.
- `age:25-39` (range), `age:30` (exact)
- `gender:F`, `gender:M`, `gender:여자`, `gender:남자`, `gender:여성`, `gender:남성` (all map to 여자/남자)
- `region:서울특별시`, `region:서울` (17 provinces, with full-name aliases)
- `subregion:강남구` (suffix match against the `district` column)
- `occupation_keyword:개발자` (substring match)
Examples:

```bash
--filter "age:25-39,region:서울특별시"               # 25-39 AND Seoul
--filter "age:25-39,region:서울특별시,region:경기도"  # 25-39 AND (Seoul OR Gyeonggi)
--filter "gender:F,occupation_keyword:디자이너"       # female AND occupation contains 디자이너
```
Interview results are written to outputs/interview_{slug}_{YYYYMMDD_HHMMSS}.json. The envelope contains the run metadata (interview_id, slug, product, model, seed, config_snapshot) plus a records array. Each record holds persona_meta, the multi-turn messages, per-question raw_responses, a structured_summary, and flags.
| Field | Notes |
|---|---|
| `interview_id` | uuid, one per run |
| `schema_version` | `2` since v1.1.0 (was `1` in v1.0.x). Readers can branch on this to handle the `acceptable_price_signal` field |
| `model` | Resolved model id (e.g. `gpt-4o-mini`) |
| `meta_extra.usage` | Aggregated `prompt_tokens`, `completion_tokens`, `total_tokens`, `cached_tokens` |
| `meta_extra.previous_run_id` | Set when the run came from `--resume`. Holds the source run's `interview_id` |
| `records[].status` | `completed` / `refused` / `failed` / `drift` |
| `records[].structured_summary` | `intent`, `acceptable_price_signal`, `willingness_to_pay`, `willingness_to_pay_currency`, `rejection_reasons`, `one_line` |
| `records[].flags` | `persona_drift`, `auto_follow_up_used`, `refusal_detected`, `truncated`, `parse_failed` |
See docs/prd/korea-persona-interview.md section 5.4 for the full schema. v1 JSON files load fine on v1.1.0+ (the loader fills acceptable_price_signal=null).
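A sketch of a reader that consumes the envelope described above: it branches on `schema_version` and computes a simple intent share and WTP median from `structured_summary`. Field names follow the table; everything else (filtering, aggregation detail) is up to you.

```python
# Sketch: load a result JSON, tolerate v1 files, and compute a simple intent share.
import json
import statistics
import sys

with open(sys.argv[1], encoding="utf-8") as f:
    data = json.load(f)

records = [r for r in data["records"] if r["status"] == "completed"]

if data.get("schema_version", 1) < 2:
    for r in records:
        # v1 files predate acceptable_price_signal
        r["structured_summary"].setdefault("acceptable_price_signal", None)

intents = [r["structured_summary"]["intent"] for r in records]
wtp = [r["structured_summary"]["willingness_to_pay"] for r in records
       if r["structured_summary"]["willingness_to_pay"] is not None]

print({i: intents.count(i) for i in set(intents)})
if wtp:
    print("WTP median:", statistics.median(wtp))
```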
The report subcommand emits outputs/report_{slug}_{YYYYMMDD_HHMMSS}.md next to the input JSON by default.
```
# 가상 인터뷰 리포트: {product}
| meta table | model, seed, persona counts, dataset, usage |
## 1. 정량 지표
### 1.1. 의향률          # intent share table + bar chart
### 1.2. 가격 수용가      # WTP median, IQR, histogram
### 1.3. 거절 사유 빈도   # top-N rejection reasons table
### 1.4. 코호트별 의향률  # age x region x gender, masked under min cell size
## 2. 정성 인사이트
### 2.1. 공통 반응        # up to 5 shared reactions
### 2.2. 인사이트         # 5-10 actionable insights
### 2.3. 코호트 차이      # cohort-level qualitative differences
## 3. 제외 record 요약    # excluded record counts and reasons
## 4. 한계와 출처         # synthetic-data caveat, dataset citation, model id
```
Settings policy: secrets via env, defaults via yaml, one-off overrides via CLI. Configuration precedence (later overrides earlier): built-in defaults → config.yaml → CLI options.
The only environment variables this tool reads are secrets and the output directory.
| Variable | Purpose |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key (used when `provider=openai`) |
| `ANTHROPIC_API_KEY` | Anthropic API key (used when `provider=anthropic`) |
| `KPI_OUTPUT_DIR` | Output directory override (kept for test/CI isolation) |
The full annotated yaml lives in `config.yaml`. Notable keys:

- `llm.provider` / `llm.base_url` / `llm.model` - provider and endpoint. Defaults flip with `--provider anthropic` (`claude-haiku-4-5` on `https://api.anthropic.com/v1`)
- `llm.context_budget` - 32000 token budget for multi-turn history (oldest user/assistant pairs dropped first; system prompt preserved)
- `llm.streaming` / `llm.anthropic_cache_control` / `llm.extra_chat_kwargs` - provider-specific tuning
- `batch.concurrency` (1-10, default 4) and `batch.partial_failure_threshold` (default 0.5)
- `common.dataset.field_map`, `common.dataset.gender_aliases`, `common.dataset.province_aliases` - column and value aliases for dataset schema changes
- `common.persona.fields` and `common.persona.system_prompt_path` - persona toggles and system prompt template path
- `common.report.cohort_min_cell` / `histogram_bins` / `bar_width` / `insight_model` / `estimate_wtp_from_signal`
- `common.output.output_dir` / `log_level` / `no_color`
- `heuristics.short_answer_threshold` / `english_ratio_threshold` / `ambiguous_keywords` / `refusal_keywords` / `auto_follow_up_text` / `auto_follow_up_max` / `occupation_english_whitelist` / `llm_drift_review`
- `mcp.mode` - `orchestrator` (default, no server-side key) or `server` (server-side OpenAI/Anthropic). See ADR-005 for the rationale
gpt-4o-mini is the default and gives a strong baseline for this workload. If you measure persona-drift rates above 5% on your own runs, try the alternatives below.
- `gpt-4o-mini` (OpenAI) - default. Good Korean fluency and persona adherence
- `gpt-4o` (OpenAI) - higher quality
- `claude-haiku-4-5` (Anthropic) - default for `--provider anthropic`
- `claude-sonnet-4-5` / `claude-opus-4-5` (Anthropic) - higher quality
- Local LLMs via `mlx_lm.server`, `vLLM`, or `llama.cpp` work as long as they expose the OpenAI Chat Completions API surface. Korean fluency depends on the underlying weights; validate persona drift on a small sample first
Persona-drift behavior has been validated end-to-end with gpt-4o-mini. Other models may need tuned thresholds (heuristics.english_ratio_threshold, heuristics.short_answer_threshold).
- System prompt: edit `prompts/system_prompt.txt` (must contain `{persona_json}` and `{product}` placeholders). Point `common.persona.system_prompt_path` at a different file to use your own template
- Heuristic thresholds: tune `heuristics.*` in `config.yaml` (lower `short_answer_threshold` for tighter follow-ups, raise `english_ratio_threshold` for technical domains, append domain-specific phrases to `refusal_keywords` / `ambiguous_keywords`)
- Report output: raise `common.report.cohort_min_cell` to 5 or 7 for tighter masking; lower `bar_width` for narrow terminals; tune `histogram_bins` for different price resolution
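If you are tuning `english_ratio_threshold`, it helps to know roughly what an English-ratio check looks like. The snippet below is illustrative only; the project's actual heuristic lives in the codebase and may differ in detail.

```python
# Illustrative only: flag an answer whose alphabetic characters are mostly Latin letters.
def english_ratio(text: str) -> float:
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    latin = [ch for ch in letters if ch.isascii()]
    return len(latin) / len(letters)

answer = "I think the subscription is too expensive for me."
print(english_ratio(answer) > 0.5)  # True -> would trip a 0.5 threshold
```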
There are three entry points: CLI, MCP server, and MCP orchestrator. They are not interchangeable - the choice depends on whether you want server-side LLM calls (CLI, MCP server) or whether the host agent's sub-agent does the LLM work (MCP orchestrator).
| Entry point | mode (yaml) | Server-side LLM call | Host LLM call | API key required |
|---|---|---|---|---|
| CLI (`kpi`) | n/a | yes | no | provider-dependent |
| MCP server | `mcp.mode: "server"` | yes | no | provider-dependent |
| MCP orchestrator | `mcp.mode: "orchestrator"` (default) | no | yes (host sub-agent) | none |
There is no automatic fallback between modes. The chosen path is reflected on every response as "backend": "mcp_server" or "backend": "mcp_orchestrator". ADR-005 captures the rationale (sampling mode was removed in v1.2.0 because mainstream MCP clients did not advertise the capability).
If you run python -m src.mcp_server outside an MCP host with mcp.mode: "orchestrator", the helper tools still work but interview is blocked with a hint to use build_batch_prompts + sub-agent + aggregate_results instead.
| Tool | MCP server | MCP orchestrator | Notes |
|---|---|---|---|
| `healthcheck` | yes | yes | server mode pings the provider; orchestrator mode returns ok + cwd |
| `list_personas` | yes | yes | preview personas matching a filter |
| `interview` | yes | no (blocked) | server-side batch interview |
| `report` | yes | yes | server mode runs the qualitative-insight LLM call; orchestrator mode skips it |
| `build_persona_prompt` | no | yes | system prompt + persona dict for one persona |
| `build_batch_prompts` | no | yes | system prompts for N personas (host sub-agent fan-out) |
| `aggregate_results` | no | yes | takes records from the host and emits the markdown report |
| `detect_persona_drift` / `should_auto_follow_up` / `parse_structured_summary` / `interview_record_schema` | yes | yes | helpers. CLI and MCP server auto-apply; MCP orchestrator must invoke explicitly |
Run the server manually to verify it starts.
```bash
python -m src.mcp_server
```

Register it in Claude Code by adding the snippet below to `~/.claude/mcp.json` (create the file if it does not exist). The `cwd` must point at the project root so that `config.yaml`, `prompts/system_prompt.txt`, `.env`, and `outputs/` resolve correctly.
```json
{
  "mcpServers": {
    "korea-persona-interview": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["-m", "src.mcp_server"],
      "cwd": "/absolute/path/to/korea-persona-interview"
    }
  }
}
```

For Cursor, add the snippet to `.cursor/mcp.json` at the project root. Drop-in copies live under `examples/mcp/`.
In MCP server mode, drop your OPENAI_API_KEY (or ANTHROPIC_API_KEY) into the project's .env before the first run. The stdlib .env loader uses setdefault semantics so a key already exported in the shell wins. Putting the key in the agent's mcp.json env block also works but the secret ends up in plaintext inside the agent's config and is more likely to leak through git, dotfile sync, or screenshots.
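The setdefault behavior means the loader never overwrites variables you already exported in the shell. A minimal sketch of that behavior (not the project's actual loader):

```python
# Sketch: stdlib-only .env loading with setdefault semantics --
# a key already present in os.environ wins over the .env value.
import os
from pathlib import Path

def load_dotenv(path: str = ".env") -> None:
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        os.environ.setdefault(key.strip(), value.strip())
```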
The host agent owns the LLM. The flow:
- Call `build_batch_prompts` with `product`, `questions`, `n` (and optionally `filter`, `seed`, `persona_ids`). Returns N system prompts plus persona dicts
- The host fans out N sub-agents (one per persona). Each sub-agent uses its own LLM with the returned system prompt as the system message and the questions as user turns. The host can also call `should_auto_follow_up` and `detect_persona_drift` between turns to keep behavior parity with the CLI heuristic
- After the LLM call the host calls `parse_structured_summary` on the LLM's structured-summary text to get a normalized dict, then assembles a record per `interview_record_schema`
- The host calls `aggregate_results` with the assembled `records`. The tool runs the quantitative aggregation and writes the markdown report. Qualitative insights default to a fallback message; the host can pass its own as `insights` to be embedded
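In pseudo-code, the orchestrator loop looks roughly like the sketch below. `call_tool` and `run_sub_agent` are hypothetical stand-ins for the host agent's MCP client and its sub-agent runner, and the key names on the batch items are assumptions; only the tool names and arguments follow the list above.

```python
# Sketch of the orchestrator flow. call_tool and run_sub_agent are hypothetical
# host-side helpers; a real record needs every field from interview_record_schema.
def orchestrate(call_tool, run_sub_agent, product: str, questions: list[str], n: int):
    batch = call_tool("build_batch_prompts", product=product, questions=questions, n=n)

    records = []
    for item in batch:  # assumed: one entry per persona with a system prompt and persona dict
        answers = run_sub_agent(system_prompt=item["system_prompt"], user_turns=questions)
        summary = call_tool("parse_structured_summary", text=answers[-1])
        records.append({
            "persona_meta": item["persona"],
            "raw_responses": answers,
            "structured_summary": summary,
        })

    # aggregate_results runs the quantitative aggregation and writes the markdown report.
    return call_tool("aggregate_results", records=records)
```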
Set mcp.mode: "server" in config.yaml to call OpenAI/Anthropic server-side. Ask the agent in plain Korean: "1인 가구 대상 반찬 정기배송 (월 39,900원)을 25-39세 서울 30명에게 인터뷰 돌리고 리포트까지 만들어 줘" and it will call interview then report back-to-back, returning the markdown path.
For agents that drive the CLI directly, pass `--json` at the root group. It disables tqdm, color, and Korean labels, and emits a single JSON document on stdout. Logs continue to flow to stderr and `outputs/logs/run_*.jsonl`.
```bash
python main.py --json healthcheck
# {"ok": true, "base_url": "https://api.openai.com/v1", "model": "gpt-4o-mini", "models": [...]}
python main.py --json interview --product "..." --questions "..." --n 10
# {"ok": true, "output_path": "outputs/interview_*.json", "report_path": "outputs/report_*.md", "summary": {...}, "usage": {...}, "model": "gpt-4o-mini"}
```

Errors are emitted as `{"error": {"code": "...", "message": "...", "exit_code": N}}` with a non-zero exit code.
```bash
uv venv --python 3.12
source .venv/bin/activate
uv pip sync requirements.lock requirements-dev.lock
pytest tests/ -v
```

The suite mocks the OpenAI/Anthropic APIs with pytest-httpx and the dataset with monkeypatch fixtures, so it does not require a live API key or network access. Coverage spans config, filter DSL, persona loader, LLM client/backend, interview session, persona drift, batch runner, report quant, MCP dispatch in both modes, MCP orchestrator helper tools, error messages, logging, and CLI integration.
Manual smoke tests that exercise a real LLM API call live under tests/manual/ and are excluded from the default run.
Use Conventional Commits (feat:, fix:, chore:, docs:, refactor:, test:). Do not put Co-Authored-By trailers on commits.
Synthetic personas are not a replacement for real user interviews. The dataset is generated, not sampled from real respondents, so the demographic distribution may diverge from the actual Korean population. Treat the output as a quick gut check before recruiting real participants and as a way to pressure-test interview questions and product copy before spending recruitment budget.
Every report and JSON file produced by this tool also carries the synthetic-data disclaimer in its footer.
The --product text and the persona metadata used for each interview are sent to whichever LLM endpoint you configure (OpenAI, Anthropic, a local server, or the MCP host agent's LLM). Do not put unreleased IP, trade secrets, or personally identifiable information into --product. Abstract or paraphrase sensitive parts before running the tool. The tool itself ships no external telemetry beyond the LLM call and the initial dataset download from Hugging Face.
API billing is the user's responsibility. Token usage (prompt / completion / cached) is printed at the end of each run, written into the result JSON meta_extra.usage, and surfaced in the report header so you can correlate it against your provider's invoice. The tool does not estimate USD cost. Persona-drift quality is validated against gpt-4o-mini; other models may need tuned thresholds.
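If you want a rough cost figure anyway, you can compute one yourself from `meta_extra.usage` with your provider's current per-token prices. The prices below are placeholders; substitute your own.

```python
# Sketch: rough cost estimate from meta_extra.usage; per-million-token prices are placeholders.
import json
import sys

PRICE_PER_M_PROMPT = 0.15      # placeholder USD per 1M prompt tokens
PRICE_PER_M_COMPLETION = 0.60  # placeholder USD per 1M completion tokens

with open(sys.argv[1], encoding="utf-8") as f:
    usage = json.load(f)["meta_extra"]["usage"]

cost = (usage["prompt_tokens"] * PRICE_PER_M_PROMPT
        + usage["completion_tokens"] * PRICE_PER_M_COMPLETION) / 1_000_000
print(f"~${cost:.4f} for {usage['total_tokens']} tokens")
```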
Legal and ethical review of the output is the user's responsibility. The tool does not run any compliance or PII filter beyond the input-secret policy.
A short list of v1.3.0 candidates. Full details in docs/backlog/v1.3.0.md.
- FastAPI REST API on top of the same application layer
- OpenAI Batch API path for offline runs
- Multi-model A/B routing (run the same persona sample on two different models and diff the outputs)
- Provider quality validation report (golden-dataset drift measurement for OpenAI, Anthropic, local LLM)
- macOS Keychain / Linux libsecret / Windows Credential Manager integration for API keys
- Per-record streaming write to disk so OOM/crash mid-batch loses fewer records than the SIGINT partial save
This project uses the nvidia/Nemotron-Personas-Korea dataset.
- Title: Nemotron-Personas-Korea
- Author: NVIDIA Corporation (2025)
- Source: https://huggingface.co/datasets/nvidia/Nemotron-Personas-Korea
- License: Creative Commons Attribution 4.0 International (CC BY 4.0)
- Modifications: none. The dataset is downloaded from Hugging Face Hub at runtime and sampled in-memory. No derivative dataset is redistributed by this repository
About 1M records and 7M synthetic Korean personas covering name, gender, age, marital status, education, occupation, residence (province and district), and seven persona facets (professional, sports, arts, travel, culinary, family, summary).
CC BY 4.0 permits commercial use with attribution. Credit goes to NVIDIA Corporation. Every markdown report and JSON record produced by this tool also carries the dataset citation and license in its footer so attribution travels with downstream artifacts.
This project was developed with Claude Code.
This project is licensed under the MIT License - see the LICENSE file for details.