Release v0.1.1 — onboarding wizards + 13 fresh-user fixes · labazhou2024/memexa

v0.1.1 — onboarding wizards + 13 fresh-user fixes

v0.1.0 shipped the new project name (Memexa) on PyPI, but the email
ingest path turned out to be broken: it was hard-coded to
maintainer-specific account names whose modules did not exist in the
OSS package. v0.1.1 fixes that, rewrites onboarding around three
interactive wizards, and adds 13 more fixes caught when the whole flow
was replayed from a fresh-user perspective across Win 11, Mac Studio,
and USTC Linux, with real IMAP credentials (QQ + USTC
Exmail-reverse-proxy) and real WeChatMsg-schema JSON.

TL;DR for a brand-new user

pip install memexa==0.1.1
memexa demo                                 # Tier 0 — see it works
memexa init                                 # scaffold ~/.memexa/
memexa init llm                             # 4 providers; DeepSeek / OpenAI / Qwen / custom
memexa init email                           # 12+ IMAP providers auto-detected
memexa backend up                           # docker compose pg + Hindsight
memexa ingest email                         # IMAP → batch → LLM extract → POST
memexa quick "<question>"                   # see your own messages back

New CLI (public)

memexa init                  # legacy scaffold (templates ship in wheel)
memexa init llm              # LLM provider wizard (4 providers)
memexa init email            # IMAP wizard (12+ providers auto-detected)
memexa init wechat           # WeChatMsg export wizard (Windows-only)
memexa backend up            # docker compose -f ~/.memexa/docker-compose.yml up -d
memexa backend status        # docker ps + curl /health
memexa backend down          # compose down
memexa ingest email          # fetch IMAP for all configured accounts
memexa ingest wechat         # read WeChatMsg export dir → builder → extract → POST

docs/quickstart.md walks through all of these end-to-end.

Highlights

Critical fix carried from v0.1.0

memexa/extraction/email_history_fetcher.py was hard-coded to two
maintainer-specific account names (qq_email, ustc_email) and tried
to import memexa.qq_email / memexa.ustc_email — modules that
do not exist in the OSS package. v0.1.0 PyPI users who followed
the docs got ModuleNotFoundError. v0.1.1 rewrites the fetcher as a
generic IMAP client (stdlib imaplib + email.parser) that reads
email.accounts.<name> from ~/.memexa/identity.yaml and supports
multiple accounts.
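
As a rough illustration of what a generic stdlib fetcher of this kind can look like, here is a minimal sketch. It is not the code in email_history_fetcher.py: the function names and the account keys (host, port, user, password) are assumptions, and the real layout under email.accounts.<name> may differ.

```python
import imaplib
from email.message import EmailMessage
from email.parser import BytesParser
from email.policy import default as default_policy


def parse_message(raw: bytes) -> EmailMessage:
    """Parse raw RFC 822 bytes into a message object."""
    return BytesParser(policy=default_policy).parsebytes(raw)


def fetch_recent(account: dict, limit: int = 10) -> list[EmailMessage]:
    """Fetch the newest `limit` INBOX messages over IMAP with TLS.

    `account` is a hypothetical dict with host/port/user/password keys.
    """
    with imaplib.IMAP4_SSL(account["host"], account.get("port", 993)) as conn:
        conn.login(account["user"], account["password"])
        conn.select("INBOX", readonly=True)   # read-only: do not mark as seen
        _, data = conn.search(None, "ALL")
        ids = data[0].split()[-limit:]        # message ids arrive oldest first
        messages = []
        for msg_id in ids:
            _, parts = conn.fetch(msg_id, "(RFC822)")
            messages.append(parse_message(parts[0][1]))
        return messages


def fetch_all(accounts: dict, limit: int = 10) -> dict[str, list[EmailMessage]]:
    """Run the same fetch for every configured account, keyed by name."""
    return {name: fetch_recent(cfg, limit) for name, cfg in accounts.items()}
```

Because nothing in the fetch path depends on the account name, adding a second mailbox is only a config change.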

13 fresh-user blockers (re-verify pass)

Onboarding — install + init

- memexa init shipped without the example templates in the wheel.
  A fresh user got three [warn] template missing warnings and an empty
  ~/.memexa/. Templates now ship under memexa/templates/.
- memexa init llm crashed on a Chinese-locale Windows console with
  UnicodeEncodeError: 'gbk' codec can't encode character '\xa5'
  (the ¥ symbol in the DeepSeek provider note). The CLI entry point now
  reconfigures stdio to UTF-8.
- memexa init email printed the wrong host hint for USTC mail
  (imap.exmail.qq.com). That endpoint rate-limits or locks out new
  logins. The right host is mail.ustc.edu.cn:993, LIVE-verified.
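
The GBK console crash can be reproduced on any OS with an in-memory stream, which also shows why reconfiguring stdio fixes it. The sketch below uses only the stdlib; force_utf8_stdio is a hypothetical helper name, not necessarily what the CLI entry point calls.

```python
import io
import sys


def force_utf8_stdio() -> None:
    """Switch stdout/stderr to UTF-8 when the stream supports reconfigure()."""
    for stream in (sys.stdout, sys.stderr):
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:  # absent on some redirected streams
            reconfigure(encoding="utf-8", errors="replace")


# Simulate a Chinese-locale Windows console: a text stream encoded as GBK.
buf = io.BytesIO()
console = io.TextIOWrapper(buf, encoding="gbk")
try:
    console.write("\xa5 per 1M tokens")   # U+00A5 has no GBK encoding
except UnicodeEncodeError as exc:
    print("gbk console failed:", exc.reason)

# After switching the same stream to UTF-8 the write succeeds.
console.reconfigure(encoding="utf-8")
console.write("\xa5 per 1M tokens")
console.flush()
```

Calling a helper like force_utf8_stdio() once, before any wizard prints, keeps the fix in one place instead of scattering encoding guards around each print.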

Backend — memexa backend up / memexa doctor

- memexa doctor probed /healthz, but Hindsight serves /health.
- The memexa doctor LLM probe double-prefixed /v1, hitting
  /v1/v1/chat/completions → 404. It now detects a /v1 that is
  already present.
- memexa doctor read a non-existent nodes field, so it always
  reported "0 nodes" on a populated bank.
- memexa backend up polled with a 60s timeout, too short for a
  cold BGE-M3 load. Bumped to 180s.
- docker-compose.yml routed HINDSIGHT_API_LLM_MODEL to the
  EXTRACT model. Reasoning-class models (deepseek-v4-flash-ascend,
  qwen-reasoner) emit their output in reasoning_content and leave
  content empty on Hindsight's strict-JSON prompts, so
  fact-extraction silently failed and total_nodes stayed at 0.
  Default switched to the GATE model.
- docker-compose.yml substituted ${HF_ENDPOINT:-} into the
  container env. Empty-string substitution made huggingface_hub
  raise httpx.UnsupportedProtocol: Request URL is missing a
  protocol on cold start. The variable is now loaded via env_file:
  so absent stays absent (huggingface_hub falls back to its built-in
  default). China users opt in by adding
  HF_ENDPOINT=https://hf-mirror.com to ~/.memexa/.env.
- memexa backend up no longer leaks stale HINDSIGHT_API_LLM_*
  shell exports into the compose process; they were silently
  shadowing ~/.memexa/.env.
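
Three of the backend fixes above follow small, general patterns: add /v1 only when it is missing, poll against a deadline, and drop shadowing variables before spawning a child process. The sketch below illustrates those patterns with hypothetical function names; it is not the memexa implementation, and only the 180s default and the HINDSIGHT_API_LLM_ prefix come from these notes.

```python
import time
from typing import Callable, Mapping


def chat_completions_url(base_url: str) -> str:
    """Append /v1 only when the configured base URL does not already end in it."""
    base = base_url.rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base + "/chat/completions"


def wait_until_healthy(probe: Callable[[], bool],
                       timeout_s: float = 180.0,
                       interval_s: float = 2.0) -> bool:
    """Call `probe` until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_s)
    return False


def scrubbed_env(environ: Mapping[str, str],
                 prefix: str = "HINDSIGHT_API_LLM_") -> dict[str, str]:
    """Copy the environment without variables that would shadow the .env file."""
    return {k: v for k, v in environ.items() if not k.startswith(prefix)}
```

A caller could pass scrubbed_env(os.environ) as the env argument of subprocess.run when launching docker compose, and hand wait_until_healthy a probe that requests /health.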

Ingestion — extract → POST → query

- _normalize_llm_card now enum-coerces every confidence value
  (numeric 0.85, English "high" / "low", Chinese "确定" /
  "模糊", bool, None) to a canonical Hindsight enum. All four
  confidence fields are handled: TimeResolution.confidence and
  Entity.resolution_confidence (4-value), plus
  IdentityAssertion.confidence and RelationAssertion.confidence
  (3-value). Demo dataset went from 6/18 POST OK
  → 18/18.
- _normalize_llm_card ISO-coerces anchor_message_ts (the LLM
  sometimes emits a bare date such as "2026-05-13"). The
  when_start_default capture also runs after coercion, so the
  fallback no longer inherits a bare date.
- _build_wechat_prompt_from_messages no longer silently drops
  every non-text message in a real WeChatMsg export. Image /
  sticker / video / voice / location / appmsg / sysmsg all
  survive as [图片] / [表情] / [视频] / etc. placeholders,
  with <title> extracted from Type=49 appmsg XML. The 30-60% of
  a real chat history that was being lost is now preserved.
- _build_wechat_prompt_from_messages propagates IsSender=1 →
  per-msg is_self_message hint + batch-level n_self_msgs /
  n_other_msgs / is_solo_self / is_self_chat. Downstream
  §SELF_NOTE_MODE now anchors commitments to the user's own
  utterances.
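
To make the two WeChat items above concrete, here is a minimal sketch of placeholder substitution and self-message counting. It is an illustration under assumptions, not the code in _build_wechat_prompt_from_messages: only Type=49 and the first three placeholder strings come from these notes, while the other type codes, the remaining placeholders, and the exact meaning of is_solo_self are guesses.

```python
import re

# Assumed WeChatMsg type codes; only 49 (appmsg) is named in the release notes.
PLACEHOLDERS = {
    3: "[图片]",          # image
    47: "[表情]",         # sticker
    43: "[视频]",         # video
    34: "[语音]",         # voice (placeholder string is an assumption)
    48: "[位置]",         # location (placeholder string is an assumption)
    10000: "[系统消息]",  # system message (placeholder string is an assumption)
}

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)


def render_message(msg: dict) -> dict:
    """Turn one exported message into prompt text plus a self-message hint."""
    msg_type = int(msg.get("Type", 1))
    content = msg.get("StrContent") or ""
    if msg_type == 1:                      # plain text passes through
        text = content
    elif msg_type == 49:                   # appmsg: keep the XML <title>
        match = TITLE_RE.search(content)
        title = match.group(1).strip() if match else ""
        text = f"[链接] {title}".strip()
    else:                                  # everything else gets a placeholder
        text = PLACEHOLDERS.get(msg_type, "[未知消息]")
    return {"text": text, "is_self_message": int(msg.get("IsSender", 0)) == 1}


def batch_hints(messages: list[dict]) -> dict:
    """Count self vs. other messages for a batch-level hint."""
    rendered = [render_message(m) for m in messages]
    n_self = sum(r["is_self_message"] for r in rendered)
    return {
        "n_self_msgs": n_self,
        "n_other_msgs": len(rendered) - n_self,
        "is_solo_self": bool(rendered) and n_self == len(rendered),
    }
```

The regex does not unwrap CDATA sections, so a real adapter would likely need a proper XML parse for appmsg payloads.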

Tests

25 new unit tests added: tests/unit/test_confidence_sanitizer.py
(57 parametrised cases covering numeric / English / Chinese /
bool / None / canonical-4 / canonical-3 enum semantics);
tests/unit/test_wechat_msg_adapter.py (25 cases across 10
msg-type codes + IsSender semantics + field aliases).
Full pytest: 140 passed, 2 skipped (both pre-existing
prompt-drift tests, queued for the next prompt-maintenance pass).
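
The input classes those parametrised cases cover (numeric, English, Chinese, bool, None) suggest a coercion along the following lines. This is a sketch under assumptions, not the sanitizer itself: the three labels, the numeric thresholds, and the default are invented for illustration, and the real canonical enums (4-value and 3-value) may use different names.

```python
# Hypothetical word table; the real sanitizer may accept more spellings.
WORDS = {
    "high": "high", "确定": "high",
    "medium": "medium",
    "low": "low", "模糊": "low",
}


def coerce_confidence(value, default: str = "medium") -> str:
    """Map a loosely typed LLM confidence value onto a small fixed enum."""
    if value is None:
        return default
    if isinstance(value, bool):            # check bool first: bool is an int subclass
        return "high" if value else "low"
    if isinstance(value, (int, float)):    # assumed thresholds
        if value >= 0.8:
            return "high"
        if value >= 0.5:
            return "medium"
        return "low"
    return WORDS.get(str(value).strip().lower(), default)
```

Testing bool before the numeric branch matters because True would otherwise be compared as 1.0 and False as 0.0, which hides the intent of the check.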

LIVE verification matrix

| Check | Win 11 (Docker Desktop) | Mac Studio (OrbStack) | USTC Ubuntu 22.04 |
| --- | --- | --- | --- |
| pip install | ✅ | ✅ | ✅ |
| `memexa demo` | ✅ | ✅ | ✅ |
| `memexa init` + wizards | ✅ | ✅ | ✅ |
| `memexa backend up` | ✅ | ✅ (mihomo + HF mirror) | ❌ docker.io blocked by campus firewall (env, not memexa) |
| Real QQ IMAP ingest | ✅ N=10 queryable cards | (same code path) | blocked above |
| Real USTC IMAP ingest | ✅ N=10 (deduped w/ QQ) | (same code path) | blocked above |
| WeChat demo ingest | ✅ 18/18 POST | ✅ 12/12 | blocked above |
| WeChat real-schema | ✅ 7 msgs → 3 cards | (same code path) | blocked above |
| Mixed msg types | ✅ 6/6 survive | (same code path) | blocked above |
| `memexa doctor` | ✅ 4 green | ✅ 4 green | ✅ graceful w/o backend |

Known gaps deferred to v0.1.2

- Other IMAP providers (Gmail / Outlook / iCloud / 163 / 126 / Yeah /
  Hotmail / Sina / Live): the wizards have correct host/port/auth-type
  hints, but only QQ and USTC have been fetched LIVE. The other
  providers should work but are not LIVE-attested for this release.
- Large-volume ingest (>500 batches): proven on a small dataset only;
  rate-limit / dead-letter back-pressure / memory behavior at scale
  are not yet stress-tested.
- WeChatMsg-from-the-real-tool: the adapter is reverse-engineered from
  WeChatMsg's documented schema (Type / IsSender / StrContent /
  CreateTime / NickName / StrTalker), but a real WeChatMsg
  release binary was not part of this verification pass. Field aliases
  cover the common emissions; truly novel field names would need a
  v0.1.2 adapter patch.
- Hindsight async-consolidation transparency: memexa ingest shows
  dead-letter: N when verify-after-POST sees total=0, but the
  cards are usually in the document store and just waiting for the
  background consolidator. v0.1.2 should auto-trigger
  /consolidate and poll.

Install & upgrade

# Fresh install:
pip install memexa==0.1.1

# Upgrade from v0.1.0:
pip install -U memexa

For full quickstart (Tier 0 / Tier 1 / Tier 2), see
docs/quickstart.md.

— feat/v0.1.1-onboarding PR #18 · 15 commits · 140 pytest pass · 18/18 CI green
