4 changes: 4 additions & 0 deletions .gitignore
@@ -12,3 +12,7 @@ test_lhm/
table_info_db
ko_reranker_local
*.csv
bot.log
lang2sql_data.db
lang2sql-datasets/
docs/lang2sql-datasets.zip
63 changes: 63 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,63 @@
# Lang2SQL: Claude Code Working Guide

## Project Identity

Not a "better SQL generator" but **an analysis agent that can still answer against messy real-world DBs**.
It tackles four real-world problems that Vanna/Wren cannot solve (DB robustness, memory, document ingestion, per-team semantic divergence).

## Status of the Four Pillars

| ★ | Pillar | V1 status | V1.5 goal |
|---|---|---|---|
| ① | **Safety pipeline + DB robustness** | Only the whitelist/timeout layers exist. Automatic metadata enrichment is **not implemented** | Automatic enrichment is the key differentiator |
| ② | Memory (3 axes) | in-memory store, inject-all recall, manual extractor | SQLite/keyword/auto |
| ③ | Ingestion matrix | file source + LLM extractor | URL/Notion/DDL |
| ④ | Semantic federation | 3-scope merge works | diff/promote commands |

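## Architecture in One Line

```
frontends → tenancy (assembly point) → harness (agent_loop) → four pillars → core ports ← adapters
```

- `core/ports/`: Protocol definitions only, zero external dependencies. **Do not touch**
- `adapters/`: concrete implementations for external systems (DB, LLM, storage)
- `tenancy/concierge.py`: the only assembly point (importing concrete classes is allowed here)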

## Current Focus: Hardening DB Robustness (★①)

### Problem
In real-world DBs where `Column.description` is empty, the LLM does not know what a column means → it generates incorrect SQL.

### Related files
- [src/lang2sql/core/ports/explorer.py](src/lang2sql/core/ports/explorer.py): the `Column.description` field (carries a comment noting that automatic enrichment is planned for v1.5)
- [src/lang2sql/adapters/db/sqlalchemy_explorer.py](src/lang2sql/adapters/db/sqlalchemy_explorer.py): `_describe_table_sync` reads only the DB comment, via `c.get("comment") or ""`
- [src/lang2sql/tools/explore_schema.py](src/lang2sql/tools/explore_schema.py): exposes the description in the prompt only when one exists
- [src/lang2sql/harness/system_prompt.py](src/lang2sql/harness/system_prompt.py): where the schema is injected
- [src/lang2sql/safety/](src/lang2sql/safety/): pipeline.py + layers/ (new layers go here)

### Extension patterns (add without touching existing code)
- New safety layer → implement `SafetyLayerPort` in `safety/layers/<name>.py`, then slot it into the list in `pipeline.py`
- New DB adapter → implement `ExplorerPort` in `adapters/db/<name>_explorer.py`, then add a scheme branch in `factory.py`
- Metadata enrichment → add an `enrich_metadata()` method to `ExplorerPort`, or abstract it as a separate enricher port
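
The "separate enricher port" option could look roughly like the sketch below. Everything in it is hypothetical: `MetadataEnricherPort`, `NameHeuristicEnricher`, and the trimmed-down `Column` are illustration-only stand-ins, not the project's actual types, and a real enricher would presumably call an LLM or sample data rather than guess from the column name.

```python
from dataclasses import dataclass, replace
from typing import Protocol


@dataclass(frozen=True)
class Column:
    # Hypothetical stand-in for the project's Column type; only the fields
    # this sketch needs are modelled.
    name: str
    type: str
    description: str = ""


class MetadataEnricherPort(Protocol):
    # Hypothetical port: returns the columns with empty descriptions filled in.
    def enrich(self, table: str, columns: list[Column]) -> list[Column]: ...


class NameHeuristicEnricher:
    """Toy enricher that derives a description from the column name."""

    def enrich(self, table: str, columns: list[Column]) -> list[Column]:
        enriched = []
        for col in columns:
            if col.description:
                # Never overwrite a comment that came from the database.
                enriched.append(col)
            else:
                guess = col.name.replace("_", " ")
                text = f"{guess} (inferred from name, table {table})"
                enriched.append(replace(col, description=text))
        return enriched


cols = [
    Column("created_at", "timestamp"),
    Column("amt", "numeric", "order amount in KRW"),
]
result = NameHeuristicEnricher().enrich("orders", cols)
```

Keeping enrichment behind its own port would leave `ExplorerPort` unchanged, which fits the "do not touch `core/ports/`" rule only if the new port is added as a new file rather than by editing an existing one.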

## Development Environment

```bash
cd /home/sewon/project/Lang2SQL
uv sync
.venv/bin/pytest -q # 110 tests (including 12 safety regression cases)
.venv/bin/python bench/ecommerce_demo.py # federation + safety demo
```

## Git Branch Strategy

- My fork: `git@github.com:thrcle/Lang2SQL.git` (origin)
- Upstream: `https://github.com/CausalInferenceLab/Lang2SQL.git` (upstream)
- Create a working branch, push it to origin → open a PR against upstream
- Branch guide: `docs/branch_guidelines.md`; PR guide: `docs/pull_request_guidelines.md`

## Testing Principles

- The 12 safety regression cases are a **merge gate**: always add a case when adding a new layer
- `adapters/llm/fake.py` makes offline LLM tests possible
8 changes: 4 additions & 4 deletions README.md
@@ -106,11 +106,11 @@ The bot exits loudly if `DISCORD_BOT_TOKEN` is unset. Full setup and hosting:
## What V1 does / does NOT do yet (honesty section)

**Does:**
- 3-scope semantic federation (guild / channel / thread) with most-specific-wins
resolution; `define_metric` writes to the current scope.
- 3-scope semantic federation (guild / channel / member) with most-specific-wins
resolution; `term_custom` registers definitions per scope (KV-backed).
- Safety pipeline with the V1 layers (whitelist + timeout), gating every query.
- Agent loop with six tools: `run_sql`, `explore_schema`, `define_metric`,
`ingest_doc`, `remember`, `ask_user`.
- Agent loop with eight tools: `run_sql`, `explore_schema`, `enrich_schema`,
`term_custom`, `org_setup`, `ingest_doc`, `remember`, `ask_user`.
- Memory service (in-memory store + inject-all recall + manual `/remember`).
- Discord frontend (bot, commands, session router, render).
- Encrypted-at-rest secrets (Fernet) and SQLite-backed persistence.
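
To make "most-specific-wins" concrete, here is a minimal sketch of a narrow-to-wide lookup. It is not the project's implementation: `resolve_term`, `SCOPE_ORDER`, the tuple-keyed dictionary, the guild-level definition, and the member id `dana` are all assumptions made for illustration, and it assumes member is the narrowest scope and guild the widest.

```python
from typing import Optional

# Hypothetical scope order, narrowest first.
SCOPE_ORDER = ("member", "channel", "guild")


def resolve_term(
    definitions: dict[tuple[str, str, str], str],
    term: str,
    guild_id: str,
    channel_id: str,
    member_id: str,
) -> Optional[str]:
    """Return the definition from the narrowest scope that defines `term`.

    `definitions` maps (term, layer, entity_id) to the definition text.
    """
    entity_for_layer = {
        "member": member_id,
        "channel": channel_id,
        "guild": guild_id,
    }
    for layer in SCOPE_ORDER:
        hit = definitions.get((term, layer, entity_for_layer[layer]))
        if hit is not None:
            return hit  # first hit wins; wider scopes are not consulted
    return None


defs = {
    # Illustrative guild-wide fallback (not from the demo).
    ("active_user", "guild", "acme-shop"): "user with any order",
    ("active_user", "channel", "marketing"): "user with a login event in the last 30 days",
}
marketing = resolve_term(defs, "active_user", "acme-shop", "marketing", "dana")
finance = resolve_term(defs, "active_user", "acme-shop", "finance", "evan")
```

In this sketch the marketing channel sees its own definition, while the finance channel, which has none, falls back to the guild-wide one.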
103 changes: 49 additions & 54 deletions bench/ecommerce_demo.py
@@ -4,9 +4,9 @@
Run: .venv/bin/python bench/ecommerce_demo.py

This is the study-group demo. It exercises the *real* V1 code paths
(``ContextConcierge`` + scope resolver + the canned Postgres explorer + the
offline ``FakeLLM``) to show three things end-to-end without a token or a live
database:
(``ContextConcierge`` + KV-backed federation + the canned Postgres explorer +
the offline ``FakeLLM``) to show three things end-to-end without a token or a
live database:

Section 1 — define three e-commerce metrics in a channel and read them back.
Section 2 — ★④ semantic federation: the *same* term ``active_user`` carries
@@ -22,13 +22,13 @@

import asyncio

from lang2sql.adapters.storage.sqlite_store import SqliteStore
from lang2sql.core.identity import Identity
from lang2sql.core.ports.safety import SafetyContext, Verdict
from lang2sql.harness.loop import agent_loop
from lang2sql.safety.pipeline import SafetyPipeline
from lang2sql.semantic.types import Metric
from lang2sql.tenancy.concierge import ContextConcierge
from lang2sql.tenancy.scope_resolver import ScopeResolver
from lang2sql.tools.semantic_federation import FedEntry, _kv_key, _render_effective, _load_all, _resolve_term

# Stable IDs for the demo guild and its two channels.
GUILD = "acme-shop"
@@ -50,20 +50,13 @@ def _finance_identity() -> Identity:
return Identity(user_id="evan", guild_id=GUILD, channel_id=CH_FINANCE)


def _define_term(store: SqliteStore, scope: str, term: str, layer: str, entity: str, definition: str) -> None:
entry = FedEntry(term=term, layer=layer, entity=entity, definition=definition)
store.kv_set(scope, _kv_key(term, layer, entity), entry.to_json())


async def section_0_harness(concierge: ContextConcierge) -> None:
"""Drive one full agent turn through the assembled harness (offline).

This is the *wiring* proof, not an intelligence proof: ``ContextConcierge``
picks the offline FakeLLM (no OPENAI_API_KEY), starts a session, and wires
the canned Postgres explorer + six tools into a ``HarnessContext`` that
``agent_loop`` drives LLM → tool → LLM to a final answer. No network, no
real database.

The FakeLLM is a deterministic stub: it blindly calls the first tool
(``run_sql``) with placeholder args, so its turn ends up *demonstrating the
safety gate* rather than answering the question. With OPENAI_API_KEY set,
the same loop calls gpt-4.1-mini instead — zero other code changes.
"""
"""Drive one full agent turn through the assembled harness (offline)."""
_hr("SECTION 0 — assembled harness runs one turn (ContextConcierge + FakeLLM)")

ident = _marketing_identity()
@@ -80,64 +73,67 @@ async def section_0_harness(concierge: ContextConcierge) -> None:
print(" ★① behaviour Section 3 isolates.")


async def section_1_define_metrics(resolver: ScopeResolver) -> None:
async def section_1_define_metrics(store: SqliteStore) -> None:
"""Define three e-commerce metrics in #marketing and read them back."""
_hr("SECTION 1 — define three metrics (★① business-context learning)")

ident = _marketing_identity()
scope = ident.default_write_scope() # current channel by default
print(f"Writing to default scope for this channel: {scope}\n")
channel_id = ident.channel_id or ""
scope = ident.guild_id or GUILD
print(f"Writing to channel layer for #{CH_MARKETING} (channel_id={channel_id})\n")

metrics = [
Metric("total_revenue", "SUM(orders.amount) WHERE status != 'cancelled'"),
Metric("aov", "total_revenue / COUNT(DISTINCT orders.id)"),
Metric("paid_orders", "COUNT(*) FROM orders WHERE status = 'paid'"),
("total_revenue", "SUM(orders.amount) WHERE status != 'cancelled'"),
("aov", "total_revenue / COUNT(DISTINCT orders.id)"),
("paid_orders", "COUNT(*) FROM orders WHERE status = 'paid'"),
]
for m in metrics:
await resolver.define(scope, m)
print(f" defined {m.name:>14} = {m.definition}")
for name, definition in metrics:
_define_term(store, scope, name, "channel", channel_id, definition)
print(f" defined {name:>14} = {definition}")

layer = await resolver.effective_layer(ident)
print(f"\nEffective layer for #{CH_MARKETING} now holds "
f"{len(layer.entries)} definition(s):")
print(layer.render())
rendered = _render_effective(store, scope, channel_id, ident.user_id)
lines = [l for l in rendered.splitlines() if l.startswith("-")]
print(f"\nEffective layer for #{CH_MARKETING} now holds {len(lines)} definition(s):")
print(rendered)


async def section_2_federation(resolver: ScopeResolver) -> None:
async def section_2_federation(store: SqliteStore) -> None:
"""Same term, two channels, two definitions — no conflict (★④)."""
_hr("SECTION 2 — semantic federation: one term, two definitions (★④)")

# #marketing defines active_user one way ...
mkt = _marketing_identity()
await resolver.define(
mkt.default_write_scope(),
Metric("active_user", "user with a login event in the last 30 days"),
)
# ... and #finance defines the SAME name a different way.
fin = _finance_identity()
await resolver.define(
fin.default_write_scope(),
Metric("active_user", "user with an active paid subscription"),
)

_define_term(store, GUILD, "active_user", "channel", CH_MARKETING,
"user with a login event in the last 30 days")
_define_term(store, GUILD, "active_user", "channel", CH_FINANCE,
"user with an active paid subscription")

print("Defined 'active_user' independently in two channels.\n")
print("Now resolving the *effective* definition each channel sees")
print("by walking its scope chain (most specific scope wins):\n")

mkt_layer = await resolver.effective_layer(mkt)
fin_layer = await resolver.effective_layer(fin)
mkt_def = mkt_layer.lookup("active_user")
fin_def = fin_layer.lookup("active_user")
mkt_rendered = _render_effective(store, GUILD, CH_MARKETING, mkt.user_id)
fin_rendered = _render_effective(store, GUILD, CH_FINANCE, fin.user_id)

# Read definitions directly from the store — don't parse rendered display text
by_term = _load_all(store, GUILD)
entries = by_term.get("active_user", [])
mkt_raw = store.kv_get(GUILD, _kv_key("active_user", "channel", CH_MARKETING))
fin_raw = store.kv_get(GUILD, _kv_key("active_user", "channel", CH_FINANCE))
mkt_def = FedEntry.from_json(mkt_raw).definition if mkt_raw else ""
fin_def = FedEntry.from_json(fin_raw).definition if fin_raw else ""

print(f" #{CH_MARKETING:<10} active_user → {mkt_def.definition}")
print(f" #{CH_FINANCE:<10} active_user → {fin_def.definition}")
print(f" #{CH_MARKETING:<10} active_user → {mkt_def}")
print(f" #{CH_FINANCE:<10} active_user → {fin_def}")

assert mkt_def.definition != fin_def.definition
assert mkt_def and fin_def and mkt_def != fin_def, (
f"Federation failed: mkt_def={mkt_def!r}, fin_def={fin_def!r}"
)
print("\n ✅ Same term, two live definitions, zero conflict.")
print(" Each channel is its own branch in the federation tree;")
print(" neither overwrote the other. (Wren's single MDL cannot do this.)")

# Show the scope chain that produced the marketing answer.
chain = " → ".join(str(s) for s in mkt.scope_chain())
print(f"\n #{CH_MARKETING} resolution order: {chain}")
print(" Lookup stops at the first scope that defines the name (CHANNEL),")
@@ -173,14 +169,13 @@ def section_3_safety(pipeline: SafetyPipeline) -> None:
async def main() -> None:
print("Lang2SQL v4.1 — e-commerce demo (offline: FakeLLM, canned PG, in-memory)")

# One shared resolver so federation state persists across sections 1 and 2.
resolver = ScopeResolver()
store = SqliteStore()
pipeline = SafetyPipeline()
concierge = ContextConcierge()

await section_0_harness(concierge)
await section_1_define_metrics(resolver)
await section_2_federation(resolver)
await section_1_define_metrics(store)
await section_2_federation(store)
section_3_safety(pipeline)

_hr("DONE")
18 changes: 8 additions & 10 deletions docs/ARCHITECTURE.md
@@ -64,21 +64,18 @@
The system-wide *vocabulary* lives here. Zero external dependencies, zero I/O.
- [`types.py`](../src/lang2sql/core/types.py): `Message`, `ToolCall`, `ToolResult`, `Completion`, `Role`
- [`identity.py`](../src/lang2sql/core/identity.py): `Identity`, `Scope`, and the federation `scope_chain()` order (narrow→wide)
- [`ports/`](../src/lang2sql/core/ports/): 11 Protocols: `LLMPort`, `ExplorerPort`, `ToolPort`, `SafetyLayerPort`, `SafetyPipelinePort`, `StorePort`, `RecallPort`, `ExtractorPort` (memory), `SourcePort`, `DocExtractorPort`, `ScopeResolverPort`, `FrontendPort`, `SecretsPort`, `SessionStorePort`, `AuditPort`
- [`ports/`](../src/lang2sql/core/ports/): Protocols: `LLMPort`, `ExplorerPort`, `ToolPort`, `SafetyLayerPort`, `SafetyPipelinePort`, `StorePort`, `RecallPort`, `ExtractorPort` (memory), `SourcePort`, `DocExtractorPort`, `FrontendPort`, `SecretsPort`, `SessionStorePort`, `AuditPort`

### `src/lang2sql/harness/`: Engine for a single agent turn
- [`context.py`](../src/lang2sql/harness/context.py): `HarnessContext` (a bundle of llm + tools + safety + explorer + scope_resolver + session)
- [`context.py`](../src/lang2sql/harness/context.py): `HarnessContext` (a bundle of llm + tools + safety + explorer + store + session)
- [`session.py`](../src/lang2sql/harness/session.py): conversation transcript
- [`loop.py`](../src/lang2sql/harness/loop.py): `agent_loop`: system prompt → LLM → tool call → next turn
- [`tool_registry.py`](../src/lang2sql/harness/tool_registry.py): name→tool dispatch
- [`system_prompt.py`](../src/lang2sql/harness/system_prompt.py): semantic + schema injection

### `src/lang2sql/semantic/`: Semantic layer + federation (★④)
### `src/lang2sql/semantic/`: Semantic type definitions (★④)
- [`types.py`](../src/lang2sql/semantic/types.py): `SemanticEntry` (METRIC/DIMENSION/RELATIONSHIP/RULE)
- [`layer.py`](../src/lang2sql/semantic/layer.py): `SemanticLayer.render()` (goes into the system prompt)
- [`scoped_layer.py`](../src/lang2sql/semantic/scoped_layer.py): a merge in which *the most specific scope wins*
- [`store.py`](../src/lang2sql/semantic/store.py): in-memory store
- [`sql_composer.py`](../src/lang2sql/semantic/sql_composer.py): expands a metric name into its definition (V1 minimum)
- Federation logic has been consolidated into [`tools/semantic_federation.py`](../src/lang2sql/tools/semantic_federation.py) (KV-based)

### `src/lang2sql/safety/`: Read-only gate (★①)
- [`pipeline.py`](../src/lang2sql/safety/pipeline.py): runs the query through the layers in order and *blocks at the first non-PASS*
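
The "block at the first non-PASS" rule can be sketched as a short loop. This is an illustration under stated assumptions, not the contents of `pipeline.py`: the `Verdict` enum, `LayerResult`, `run_pipeline`, and both toy layers are hypothetical stand-ins, and the real `SafetyLayerPort` interface may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    # Hypothetical stand-in for the project's Verdict type.
    PASS = "pass"
    BLOCK = "block"


@dataclass(frozen=True)
class LayerResult:
    layer: str
    verdict: Verdict
    reason: str = ""


# For this sketch a layer is just a function from SQL text to a result.
Layer = Callable[[str], LayerResult]


def whitelist_layer(sql: str) -> LayerResult:
    # Toy whitelist: only statements that start with SELECT are allowed.
    if sql.lstrip().lower().startswith("select"):
        return LayerResult("whitelist", Verdict.PASS)
    return LayerResult("whitelist", Verdict.BLOCK, "only SELECT is allowed")


def length_layer(sql: str) -> LayerResult:
    # Placeholder second layer so the ordering is visible.
    if len(sql) <= 10_000:
        return LayerResult("length", Verdict.PASS)
    return LayerResult("length", Verdict.BLOCK, "statement too long")


def run_pipeline(layers: list[Layer], sql: str) -> LayerResult:
    """Run the layers in order and stop at the first non-PASS verdict."""
    for layer in layers:
        result = layer(sql)
        if result.verdict is not Verdict.PASS:
            return result  # later layers never see the query
    return LayerResult("pipeline", Verdict.PASS)


blocked = run_pipeline([whitelist_layer, length_layer], "DROP TABLE orders")
allowed = run_pipeline([whitelist_layer, length_layer], "SELECT 1")
```

Returning the blocking layer's own result keeps the reason attached to the layer that produced it, which is convenient when a regression case has to assert which layer fired.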
@@ -98,18 +95,19 @@
- [`pipeline.py`](../src/lang2sql/ingestion/pipeline.py) — Source × Extractor matrix

### `src/lang2sql/tools/`: Capabilities the agent calls
6 tools (all ctx-aware, async):
8 tools (all ctx-aware, async):
- [`run_sql.py`](../src/lang2sql/tools/run_sql.py): executes through the explorer after passing safety
- [`explore_schema.py`](../src/lang2sql/tools/explore_schema.py): table/column introspection
- [`define_metric.py`](../src/lang2sql/tools/define_metric.py): scope-aware definition writes
- [`enrich_schema.py`](../src/lang2sql/tools/enrich_schema.py): automatically enriches column metadata with an LLM
- [`semantic_federation.py`](../src/lang2sql/tools/semantic_federation.py): `term_custom`, a term dictionary layered by guild/channel/member (KV-based, narrow→wide lookup)
- [`org_setup.py`](../src/lang2sql/tools/org_setup.py): bulk registration of company-wide and team-level terms
- [`remember.py`](../src/lang2sql/tools/remember.py): stores a fact
- [`ask_user.py`](../src/lang2sql/tools/ask_user.py): asks the user when the request is ambiguous
- [`ingest_doc.py`](../src/lang2sql/tools/ingest_doc.py): document → candidate suggestions
- [`__init__.py: build_default_tools`](../src/lang2sql/tools/__init__.py): assembly

### `src/lang2sql/tenancy/`: Assembly point
- [`concierge.py`](../src/lang2sql/tenancy/concierge.py): the *only* place that imports concrete classes. Builds a `HarnessContext` for each request.
- [`scope_resolver.py`](../src/lang2sql/tenancy/scope_resolver.py): `ScopeResolverPort` implementation (on top of semantic)
- [`encrypted_secrets.py`](../src/lang2sql/tenancy/encrypted_secrets.py): real encryption with `cryptography.Fernet`

### `src/lang2sql/adapters/`: The last line before external systems
6 changes: 3 additions & 3 deletions docs/PROJECT.md
@@ -48,9 +48,9 @@ Open-source Text-to-SQL projects such as Vanna AI(~20k★), Wren AI(~12k★), SQLCoder
- **11 core ports**: every external dependency is abstracted behind a Protocol
- **harness**: agent_loop (LLM → tool → next turn), Session, HarnessContext
- Minimal implementation of the **four pillars ★①~★④**: 12 safety regressions, 3-axis memory, ingestion matrix, 3-scope federation
- **6 tools**: run_sql · explore_schema · define_metric · remember · ask_user · ingest_doc
- **Discord frontend**: 6 slash commands + `/setup` wizard (DSN-free flow for non-developers) + bot.py
- **Persistence**: SQLite semantic store + real Fernet-encrypted secrets
- **8 tools**: run_sql · explore_schema · enrich_schema · term_custom · org_setup · remember · ask_user · ingest_doc
- **Discord frontend**: slash commands + `/setup` wizard (DSN-free flow for non-developers) + bot.py
- **Persistence**: KV store (federation) + real Fernet-encrypted secrets
- **DB adapters**: a single `SqlAlchemyExplorer` covers Postgres/MySQL/Snowflake/BigQuery/DuckDB + a Cloudflare D1 HTTP adapter + automatic routing via `build_explorer(DSN)`
- **106 automated tests** (including 12 safety regressions)
- **bench demo**: live federation + safety demonstration (`bench/ecommerce_demo.py`)