v0.2.0 — ChatGPT Adapter MVP
ChatGPT export adapter (F2)
The first production adapter. You can now feed a real ChatGPT
conversations.json export into the library and get back a clean linear
thread of NormalizedMessages.
What works
- ✅ Detects ChatGPT exports by sniffing "mapping" or "conversations" in the file header
- ✅ Reconstructs the active linear thread from the non-linear mapping tree (walks current_node → parent chain → root, then yields root-first)
- ✅ Role mapping: user/assistant/system/tool. Unknown roles (e.g. custom GPT name) are dropped.
- ✅ Skips whitespace-only and empty messages
- ✅ Extracts text parts only; drops structured payloads (image_url, code interpreter, tool calls)
- ✅ FNV-1a 64-bit content hash for downstream dedup
- ✅ Slugifies conversation title into a project_hint
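Two of the steps above, thread reconstruction and content hashing, can be sketched roughly as follows. This is a minimal illustration and not the adapter's actual code: the `Node` struct, `node`, `sample_mapping`, `active_thread`, and `fnv1a_64` are hypothetical names, and the node shape is heavily simplified compared to a real export. The FNV-1a constants are the standard 64-bit offset basis and prime.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified mapping node (assumption: real export nodes
/// carry more fields, such as children, role, and structured parts).
struct Node {
    parent: Option<String>,
    text: Option<String>,
}

fn node(parent: Option<&str>, text: Option<&str>) -> Node {
    Node {
        parent: parent.map(String::from),
        text: text.map(String::from),
    }
}

/// Small sample tree: root -> n1 -> n2 -> n3, plus an abandoned branch n1b.
fn sample_mapping() -> HashMap<String, Node> {
    let mut m = HashMap::new();
    m.insert("root".to_string(), node(None, None));
    m.insert("n1".to_string(), node(Some("root"), Some("Hello")));
    m.insert("n1b".to_string(), node(Some("root"), Some("abandoned branch")));
    m.insert("n2".to_string(), node(Some("n1"), Some("   ")));
    m.insert("n3".to_string(), node(Some("n2"), Some("Hi there")));
    m
}

/// Walk from `current_node` up the parent chain to the root, collecting
/// non-blank text, then reverse so the result is root-first.
fn active_thread<'a>(mapping: &'a HashMap<String, Node>, current_node: &str) -> Vec<&'a str> {
    let mut out = Vec::new();
    let mut cursor = Some(current_node);
    // Bounding the loop by the node count guards against parent cycles
    // in malformed input.
    for _ in 0..mapping.len() {
        let Some(id) = cursor else { break };
        let Some(n) = mapping.get(id) else { break };
        if let Some(text) = n.text.as_deref() {
            if !text.trim().is_empty() {
                out.push(text);
            }
        }
        cursor = n.parent.as_deref();
    }
    out.reverse();
    out
}

const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a, 64-bit: XOR each byte into the hash, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(FNV_OFFSET_BASIS, |h, b| (h ^ u64::from(*b)).wrapping_mul(FNV_PRIME))
}

fn main() {
    let mapping = sample_mapping();
    for text in active_thread(&mapping, "n3") {
        println!("{:016x}  {}", fnv1a_64(text.as_bytes()), text);
    }
}
```

In this sketch the abandoned branch `n1b` never appears in the output because it is not on the parent chain of `n3`, and the whitespace-only node `n2` is skipped.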
Library API
use loust_llm_mempipe::adapter::chatgpt::ChatGptAdapter;
use loust_llm_mempipe::adapter::Adapter;
let adapter = ChatGptAdapter;
let reader = Box::new(std::fs::File::open("conversations.json")?);
let messages: Vec<_> = adapter.stream_messages(reader)?.collect();
Validation
- cargo fmt --check: clean
- cargo clippy --all-targets -- -D warnings: clean
- cargo test: 16/16 pass (7 new ChatGPT tests + 9 library tests)
- cargo build --release: OK (~10s clean build)
Limitations (F2 MVP)
- The full export is materialized in memory (~2× JSON size). For 50 MB
exports that's ~100 MB peak, which is fine for a developer laptop. True
streaming is tracked for F3+ if real exports need it.
- The CLI surface is still a placeholder (F4). The library is usable from Rust code.
Next
- F3: pipeline core (Rule E secret scrubber → Jaccard dedup → signal scoring
→ JSONL + Markdown writer)
- F4: clap CLI with --input, --output, --format, --stats
- F5: CI + smoke E2E