v0.1.1 — onboarding wizards + 13 fresh-user fixes
v0.1.0 shipped the new project name (Memexa) on PyPI but the email
ingest path turned out to be broken (hard-coded to maintainer-specific
account names that did not exist in the OSS package). v0.1.1 fixes
that and rewrites onboarding around three interactive wizards, then
adds 13 more fixes caught when the whole flow was re-played from a
fresh-user perspective across Win 11 + Mac Studio + USTC Linux, with
real IMAP credentials (QQ + USTC Exmail-reverse-proxy) and real
WeChatMsg-schema JSON.
TL;DR for a brand-new user
pip install memexa==0.1.1
memexa demo # Tier 0 — see it works
memexa init # scaffold ~/.memexa/
memexa init llm # 4 providers; DeepSeek / OpenAI / Qwen / custom
memexa init email # 12+ IMAP providers auto-detected
memexa backend up # docker compose pg + Hindsight
memexa ingest email # IMAP → batch → LLM extract → POST
memexa quick "<question>" # see your own messages back
New CLI (public)
memexa init # legacy scaffold (templates ship in wheel)
memexa init llm # LLM provider wizard (4 providers)
memexa init email # IMAP wizard (12+ providers auto-detected)
memexa init wechat # WeChatMsg export wizard (Windows-only)
memexa backend up # docker compose -f ~/.memexa/docker-compose.yml up -d
memexa backend status # docker ps + curl /health
memexa backend down # compose down
memexa ingest email # fetch IMAP for all configured accounts
memexa ingest wechat # read WeChatMsg export dir → builder → extract → POST
docs/quickstart.md walks through all of these end-to-end.
Highlights
Critical fix carried from v0.1.0
memexa/extraction/email_history_fetcher.py was hard-coded to two
maintainer-specific account names (qq_email, ustc_email) and tried
to import memexa.qq_email / memexa.ustc_email — modules that
do not exist in the OSS package. v0.1.0 PyPI users who followed
the docs got ModuleNotFoundError. v0.1.1 rewrites the fetcher as a
generic IMAP client (stdlib imaplib + email.parser) that reads
email.accounts.<name> from ~/.memexa/identity.yaml and supports
multiple accounts.
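A minimal sketch of what a generic stdlib IMAP fetch could look like, for readers who want to see the shape of the approach. It is not the shipped fetcher: the function names and the account keys (host, port, username, password) are assumptions made for illustration, and the real layout of email.accounts.<name> may differ.

```python
import email.parser
import email.policy
import imaplib


def parse_raw_message(raw: bytes) -> dict:
    """Parse raw RFC 822 bytes into a few plain-text fields."""
    parser = email.parser.BytesParser(policy=email.policy.default)
    msg = parser.parsebytes(raw)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "subject": str(msg["Subject"] or ""),
        "from": str(msg["From"] or ""),
        "date": str(msg["Date"] or ""),
        "text": body.get_content().strip() if body is not None else "",
    }


def fetch_recent(account: dict, limit: int = 10) -> list[dict]:
    """Fetch the newest `limit` INBOX messages for one account entry.

    `account` uses hypothetical keys: host, port, username, password.
    """
    with imaplib.IMAP4_SSL(account["host"], account.get("port", 993)) as conn:
        conn.login(account["username"], account["password"])
        conn.select("INBOX", readonly=True)  # read-only: do not mark as seen
        _, data = conn.search(None, "ALL")
        ids = data[0].split()[-limit:]
        messages = []
        for msg_id in ids:
            _, parts = conn.fetch(msg_id.decode(), "(RFC822)")
            messages.append(parse_raw_message(parts[0][1]))
        return messages
```

Looping fetch_recent over every entry under email.accounts would give the multi-account behaviour described above; parse_raw_message can be exercised offline with any saved .eml file.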
13 fresh-user blockers (re-verify pass)
Onboarding — install + init
- memexa init shipped without the example templates in the wheel.
Fresh user got 3 [warn] template missing warnings and an empty
~/.memexa/. Templates now ship under memexa/templates/.
- memexa init llm crashed on Chinese-locale Windows console with
UnicodeEncodeError: 'gbk' codec can't encode character '\xa5'
(the ¥ symbol in the DeepSeek provider note). CLI entry now
reconfigures stdio to UTF-8.
- memexa init email for USTC mail printed the wrong host hint
(imap.exmail.qq.com). That endpoint rate-limits / locks new
logins. The right host is mail.ustc.edu.cn:993, LIVE-verified.
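The stdio fix for the GBK console crash can be illustrated roughly as follows. This is a sketch under assumptions, not the actual CLI entry code: force_utf8 and main are hypothetical names, and errors="replace" is one possible choice of error handler.

```python
import sys


def force_utf8(*streams) -> None:
    """Switch text streams to UTF-8 where the stream supports reconfigure()."""
    for stream in streams:
        reconfigure = getattr(stream, "reconfigure", None)
        if callable(reconfigure):
            # errors="replace" keeps output flowing if a character still
            # cannot be represented downstream.
            reconfigure(encoding="utf-8", errors="replace")


def main() -> None:
    # Hypothetical CLI entry: reconfigure before anything is printed.
    force_utf8(sys.stdout, sys.stderr)
    print("provider note containing \xa5")
```

The getattr guard matters because stdout may be replaced by an object without reconfigure() (for example under some test runners or when redirected through a wrapper).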
Backend — memexa backend up / memexa doctor
- memexa doctor probed /healthz but Hindsight serves /health.
- memexa doctor LLM probe double-prefixed /v1, hitting
/v1/v1/chat/completions → 404. Now detects /v1 already present.
- memexa doctor read a non-existent nodes field, always
reported "0 nodes" on a populated bank.
- memexa backend up polled with a 60s timeout, too short for a
cold BGE-M3 load. Bumped to 180s.
- docker-compose.yml routed HINDSIGHT_API_LLM_MODEL to the
EXTRACT model. Reasoning-class models (deepseek-v4-flash-ascend,
qwen-reasoner) emit content in reasoning_content and leave
content empty on Hindsight's strict-JSON prompts, so
fact-extraction silently failed and total_nodes stayed at 0.
Default switched to the GATE model.
- docker-compose.yml substituted ${HF_ENDPOINT:-} into the
container env. Empty-string substitution made huggingface_hub
raise httpx.UnsupportedProtocol: Request URL is missing a protocol
on cold start. Now loaded via env_file: so absent
stays absent (huggingface_hub falls back to its built-in
default). China users opt in by adding
HF_ENDPOINT=https://hf-mirror.com to ~/.memexa/.env.
- memexa backend up no longer leaks stale HINDSIGHT_API_LLM_*
shell exports into the compose process; they were silently
shadowing ~/.memexa/.env.
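Two of the backend fixes above are easy to picture in code: avoiding the doubled /v1 prefix, and keeping stale HINDSIGHT_API_LLM_* exports out of the environment handed to docker compose. The sketch below assumes hypothetical helper names (chat_completions_url, compose_env) and is only an illustration of the idea, not the shipped implementation.

```python
import os


def chat_completions_url(base_url: str) -> str:
    """Build the chat-completions URL without doubling the /v1 prefix."""
    base = base_url.rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base + "/chat/completions"


def compose_env(environ=None) -> dict:
    """Copy the environment, dropping HINDSIGHT_API_LLM_* shell exports.

    The filtered copy could be passed as `env=` to subprocess.run so that
    values in ~/.memexa/.env are not shadowed by the calling shell.
    """
    source = os.environ if environ is None else environ
    return {
        key: value
        for key, value in source.items()
        if not key.startswith("HINDSIGHT_API_LLM_")
    }
```

With this shape, a base URL entered either as https://host or as https://host/v1/ resolves to the same endpoint.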
Ingestion — extract → POST → query
- _normalize_llm_card now enum-coerces every confidence value
(numeric 0.85, English "high" / "low", Chinese "确定" /
"模糊", bool, None) to a canonical Hindsight enum. The
four confidence fields (TimeResolution.confidence +
Entity.resolution_confidence 4-value;
IdentityAssertion.confidence + RelationAssertion.confidence
3-value) are all handled. Demo dataset went from 6/18 POST OK
→ 18/18.
- _normalize_llm_card ISO-coerces anchor_message_ts (LLM
sometimes emits a bare date "2026-05-13"). The
when_start_default capture also runs after coercion so the
fallback no longer inherits a bare date.
- _build_wechat_prompt_from_messages no longer silently drops
every non-text message in a real WeChatMsg export. Image /
sticker / video / voice / location / appmsg / sysmsg all
survive as [图片] / [表情] / [视频] / etc. placeholders,
with <title> extracted from Type=49 appmsg XML. 30-60% of
a real chat history that was being lost is now preserved.
- _build_wechat_prompt_from_messages propagates IsSender=1 →
per-msg is_self_message hint + batch-level n_self_msgs /
n_other_msgs / is_solo_self / is_self_chat. Downstream
§SELF_NOTE_MODE now anchors commitments to the user's own
utterances.
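To make the two _normalize_llm_card coercions concrete, here is a rough sketch. Everything specific in it is an assumption: the three labels ("high" / "medium" / "low"), the numeric thresholds, the fallback value, and the choice of midnight for bare dates are placeholders, since the actual Hindsight enum values are not listed in these notes.

```python
from datetime import datetime

# Hypothetical canonical labels; the real Hindsight enums may differ.
_ALIASES = {
    "high": "high",
    "medium": "medium",
    "low": "low",
    "确定": "high",  # "certain"
    "模糊": "low",   # "vague"
}


def coerce_confidence(value, default: str = "low") -> str:
    """Map loosely typed LLM output onto one of three assumed labels."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "high" if value else "low"
    if isinstance(value, (int, float)):
        if value >= 0.8:  # assumed threshold
            return "high"
        if value >= 0.5:  # assumed threshold
            return "medium"
        return "low"
    if isinstance(value, str):
        return _ALIASES.get(value.strip().lower(), default)
    return default  # None and any other type


def coerce_iso_ts(value: str) -> str:
    """Expand a bare date such as "2026-05-13" into a full ISO timestamp."""
    return datetime.fromisoformat(value).isoformat()
```

Coercing unknown input to a default, rather than raising, matches the goal described above of getting every card through POST validation; a stricter variant could log the original value before replacing it.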
Tests
- 25 new unit tests added:
tests/unit/test_confidence_sanitizer.py
(57 parametrised cases covering numeric / English / Chinese /
bool / None / canonical-4 / canonical-3 enum semantics);
tests/unit/test_wechat_msg_adapter.py (25 cases across 10
msg-type codes + IsSender semantics + field aliases).
- Full pytest: 140 passed, 2 skipped (both pre-existing
prompt-drift tests, queued for the next prompt-maintenance pass).
LIVE verification matrix
| | Win 11 (Docker Desktop) | Mac Studio (OrbStack) | USTC Ubuntu 22.04 |
|---|---|---|---|
| pip install | ✅ | ✅ | ✅ |
| memexa demo | ✅ | ✅ | ✅ |
| memexa init + wizards | ✅ | ✅ | ✅ |
| memexa backend up | ✅ | ✅ (mihomo + HF mirror) | ❌ docker.io blocked by campus firewall (env, not memexa) |
| Real QQ IMAP ingest | ✅ N=10 queryable cards | (same code path) | blocked above |
| Real USTC IMAP ingest | ✅ N=10 (deduped w/ QQ) | (same code path) | blocked above |
| WeChat demo ingest | ✅ 18/18 POST | ✅ 12/12 | blocked above |
| WeChat real-schema | ✅ 7 msgs → 3 cards | (same code path) | blocked above |
| Mixed msg types | ✅ 6/6 survive | (same code path) | blocked above |
| memexa doctor | ✅ 4 green | ✅ 4 green | ✅ graceful w/o backend |
Known gaps deferred to v0.1.2
- Other IMAP providers (Gmail / Outlook / iCloud / 163 / 126 / Yeah /
Hotmail / Sina / Live): wizards have correct host/port/auth-type
hints, but only QQ and USTC have been LIVE-fetched against. Other
providers should work but are not LIVE-attested for this release.
- Large-volume ingest (>500 batches): small dataset proven;
rate-limit / dead-letter back-pressure / memory behavior at scale not
yet stress-tested.
- WeChatMsg-from-the-real-tool: adapter is reverse-engineered from
WeChatMsg's documented schema (Type / IsSender / StrContent /
CreateTime / NickName / StrTalker) but a real WeChatMsg
release binary was not in this verification pass. Field aliases
cover the common emissions; truly novel field names would need a
v0.1.2 adapter patch.
- Hindsight async-consolidation transparency: memexa ingest shows
dead-letter: N when verify-after-POST sees total=0, but the
cards are usually in the document store and just waiting for the
background consolidator. v0.1.2 should auto-trigger
/consolidate and poll.
Install & upgrade
# Fresh install:
pip install memexa==0.1.1
# Upgrade from v0.1.0:
pip install -U memexa
For the full quickstart (Tier 0 / Tier 1 / Tier 2), see
docs/quickstart.md.
— feat/v0.1.1-onboarding PR #18 · 15 commits · 140 pytest pass · 18/18 CI green