
feat(v12): migrate backend from MongoDB to PostgreSQL + Snowflake IDs#2659

Merged
Innei merged 62 commits into master from codex/postgresql-snowflake-migration-spec on May 4, 2026
Conversation

@Innei (Member) commented May 2, 2026

Summary

This PR cuts over the entire MX Space backend from MongoDB + Mongoose to PostgreSQL + Drizzle ORM, introduces Snowflake IDs as the primary key strategy, and ships a production-ready MongoDB → PostgreSQL data migration CLI.

Scope

| Area | Change |
| --- | --- |
| Database runtime | MongoDB + Mongoose → PostgreSQL 16+ + Drizzle ORM |
| Primary keys | MongoDB ObjectId → Snowflake bigint (serialized as text EntityId) |
| Auth keys | Better-Auth-compatible text primary keys (readers, accounts, sessions, …) |
| Schema definition | Typegoose model classes → Drizzle pgTable schema files under src/database/schema/ |
| Data access | InjectModel + Mongoose queries → Repository classes extending BaseRepository |
| Migrations | Removed all historical MongoDB-era versioned migrations; added Drizzle SQL migrations (0000_initial.sql, 0001, 0002) |
| DI wiring | New postgresProviders (PG_POOL_TOKEN, PG_DB_TOKEN) + POSTGRES_REPOSITORY_TOKENS registry |
| Test infra | mongodb-memory-server → @testcontainers/postgresql + pg-testcontainer.ts helper |
| API contract | Explicit contract tests for every public endpoint shape (admin, Yohaku, aggregate) |

Modules migrated

Every backend module was rewired from Mongoose models to PostgreSQL repositories:

  • Content: post, note, page, category, topic, recently, draft, comment, say
  • Auth & User: reader, owner, account, session, apiKey, passkey, verification
  • AI: ai-summary, ai-insights, ai-translation, translation-entry, ai-agent-conversation
  • Ops: activity, analyze, backup, search, serverless, slug-tracker, snippet, subscribe, link, project, poll, webhook, meta-preset, options, file-reference

Files changed (high-level)

  • Added ~147 files: database schema, Drizzle migrations, per-module repositories + types, Snowflake ID utilities, PG test helpers, migration CLI, contract tests, docs
  • Deleted ~92 files: all Mongoose .model.ts files, old migration version scripts, mongodb-memory-server wiring
  • Modified ~264 files: all services, controllers, tests, docker-compose.yml, CI workflow, api-client models, config, interceptors

Key architectural decisions

Snowflake IDs (EntityId)

  • 64-bit timestamp-ordered IDs (epoch 1746144000000, workerId from SNOWFLAKE_WORKER_ID + PM2 offset)
  • Stored as text in PostgreSQL; validated by parseEntityId / serializeEntityId
  • Enables sort-by-ID as a proxy for sort-by-time without extra indexes
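
The 41/10/12 bit layout behind these properties can be sketched as a pair of pure functions. This is an illustrative sketch, not the actual utilities shipped in this PR; only the epoch value and the timestamp/worker/sequence split come from the description above.

```typescript
// Custom epoch from the PR description (1746144000000 ms).
const EPOCH = 1746144000000n

// Pack a 41-bit timestamp offset, 10-bit worker id, and 12-bit sequence
// into one 64-bit Snowflake. Later timestamps always compare greater,
// which is what makes sort-by-ID a proxy for sort-by-time.
export function composeSnowflake(
  timestampMs: bigint,
  workerId: bigint,
  sequence: bigint,
): bigint {
  return ((timestampMs - EPOCH) << 22n) | (workerId << 12n) | sequence
}

// Reverse the packing: recover timestamp, worker, and sequence fields.
export function decodeSnowflake(id: bigint) {
  return {
    timestampMs: (id >> 22n) + EPOCH,
    workerId: (id >> 12n) & 0x3ffn, // low 10 bits above the sequence
    sequence: id & 0xfffn,          // low 12 bits
  }
}
```

Because the timestamp occupies the high bits, `ORDER BY id` on the bigint column is equivalent to ordering by creation time, with no extra index.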

Repository pattern

  • BaseRepository provides db (Drizzle handle), pagination helpers, and ID conversion boundary methods (toDbId, toEntityId)
  • Each module exposes a *Repository class registered via a Symbol token in POSTGRES_REPOSITORY_TOKENS
  • Services inject repositories instead of Mongoose models
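
The ID conversion boundary can be sketched as below; the method names `toDbId` / `toEntityId` come from the PR text, while the class body and validation details are assumptions for illustration.

```typescript
// Sketch of the conversion boundary every repository inherits:
// EntityId strings at the service/API edge, bigint inside Drizzle/PG.
export abstract class BaseRepositorySketch {
  // EntityId (decimal string) -> bigint primary key for queries
  protected toDbId(id: string): bigint {
    if (!/^\d+$/.test(id)) throw new Error(`invalid EntityId: ${id}`)
    return BigInt(id)
  }

  // bigint row key -> EntityId string handed back to services
  protected toEntityId(id: bigint): string {
    return id.toString()
  }
}
```

Keeping both conversions in the base class means malformed IDs fail at the repository boundary instead of surfacing as PostgreSQL type errors deeper in a query.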

CRUD factory transformer

  • New crud-factor.pg.transformer.ts generates REST controllers from repository classes (mirrors legacy Mongoose CRUD factory)
  • Events are emitted with EventScope derived from the repository name

Comment thread & follow-ups (detailed)

The comment subsystem was completely rewired from Mongoose sub-documents/populate to explicit PostgreSQL self-references with service-layer hydration.

Database schema

  • comments table uses self-referencing foreign keys: parent_comment_id → comments.id (onDelete: cascade) and root_comment_id → comments.id (onDelete: cascade)
  • Denormalized counters on root comments: reply_count (integer) + latest_reply_at (timestamp) — updated atomically inside a transaction when a reply is created
  • ref_id + ref_type replace the polymorphic Mongoose ref ObjectId; referential integrity is enforced at the application layer

Reply creation

  • CommentRepository.createReply() runs inside a db.transaction():
    1. Locks parent row via SELECT … WHERE id = parentBig
    2. Computes rootCommentId = parent.rootCommentId ?? parent.id
    3. Inserts reply with parentCommentId + rootCommentId
    4. Atomically increments reply_count and sets latest_reply_at on the root comment
  • Eliminates the old two-step Mongoose pattern (create() then updateOne({ $inc: { replyCount: 1 } }))
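
The four steps above can be sketched with the database hidden behind a tiny interface so the thread/counter logic stands out. All names here are illustrative; the real code runs inside Drizzle's `db.transaction()` with a row lock, which this stub does not reproduce.

```typescript
export interface CommentRow {
  id: bigint
  rootCommentId: bigint | null
  replyCount: number
  latestReplyAt: Date | null
}

// Minimal transaction surface assumed for the sketch.
export interface CommentTx {
  lockById(id: bigint): Promise<CommentRow | null>
  insertReply(row: { parentCommentId: bigint; rootCommentId: bigint }): Promise<CommentRow>
  bumpRootCounters(rootId: bigint, at: Date): Promise<void>
}

export async function createReplySketch(tx: CommentTx, parentId: bigint): Promise<CommentRow> {
  const parent = await tx.lockById(parentId) // 1. lock parent row
  if (!parent) throw new Error('parent comment not found')
  // 2. a reply to a root starts that thread; a reply to a reply joins it
  const rootCommentId = parent.rootCommentId ?? parent.id
  // 3. insert reply pointing at both parent and root
  const reply = await tx.insertReply({ parentCommentId: parent.id, rootCommentId })
  // 4. reply_count++ and latest_reply_at on the root, in the same transaction
  await tx.bumpRootCounters(rootCommentId, new Date())
  return reply
}
```

Because steps 1–4 share one transaction, the counter can never drift from the actual reply rows, which was possible in the old two-step Mongoose flow.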

Thread queries

| Query | Legacy (Mongoose) | New (Drizzle) |
| --- | --- | --- |
| Root list by ref | paginate() with $and filters + sort: { pin: -1, created: -1 } | findRootThreadsByRef() — indexed WHERE parent_comment_id IS NULL with sort variants (pinned / newest / oldest) |
| Replies for roots | find({ rootCommentId: { $in: … } }) | findVisibleRepliesForRoots() — single IN query batched for all root IDs on the page |
| Single root replies | find({ rootCommentId }) | findVisibleRepliesForRoot() |
| Around-a-comment | findPageContainingComment() with $lt/$gt date filters | findPageContainingRootComment() using SQL row-number arithmetic |

Reply window & pagination

  • Initial load (GET /comments/ref/:id): root comments carry an inline replies array limited by COMMENT_REPLY_THRESHOLD = 20. If exceeded, only the first 3 (head) + last 3 (tail) replies are returned; a replyWindow metadata object describes hidden count
  • Follow-up load (GET /comments/thread/:rootCommentId): cursor-based pagination over the "middle" replies between head and tail. Returns { replies, nextCursor, remaining, done }
  • This replaces the legacy Mongoose paginate on the entire reply set
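
The head/tail windowing rule can be sketched as a pure function; the constants match the description above, but the shape of the returned metadata object is an assumption.

```typescript
// Over this many replies, collapse the middle and return window metadata.
const COMMENT_REPLY_THRESHOLD = 20
const HEAD = 3
const TAIL = 3

export function windowReplies<T>(replies: T[]) {
  // At or under the threshold: inline everything, no window metadata.
  if (replies.length <= COMMENT_REPLY_THRESHOLD) {
    return { replies, replyWindow: null }
  }
  // Otherwise: first HEAD + last TAIL, with the hidden count reported
  // so the client knows to page the middle via /comments/thread/:id.
  return {
    replies: [...replies.slice(0, HEAD), ...replies.slice(-TAIL)],
    replyWindow: { hidden: replies.length - HEAD - TAIL, head: HEAD, tail: TAIL },
  }
}
```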

Hydration layers (replacing Mongoose populate)

  1. attachRef() — resolves polymorphic (refType, refId) to a slim CommentRefSummary (title/slug/nid/category) via databaseService.findGlobalByIds(). Orphan refs degrade to null instead of crashing
  2. attachParentPreview() — resolves parentCommentId to a privacy-scoped CommentParentPreview ({ id, author, text, isDeleted }). Public endpoints intentionally omit ip/mail/agent to prevent PII leakage
  3. fillAndReplaceAvatarUrl() — replaces avatar placeholders with Gravatar/generated URLs (unchanged logic, adapted to CommentRow)
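
The privacy scoping in step 2 amounts to an explicit field pick; this sketch uses illustrative row/field types rather than the actual `CommentRow` definition.

```typescript
// Assumed slice of a comment row, including the private columns.
interface CommentRowLike {
  id: string
  author: string
  text: string
  isDeleted: boolean
  ip?: string
  mail?: string
  agent?: string
}

export interface CommentParentPreview {
  id: string
  author: string
  text: string
  isDeleted: boolean
}

// Explicit pick: ip/mail/agent can never reach public consumers because
// they are simply never copied into the preview object.
export function toParentPreview(row: CommentRowLike): CommentParentPreview {
  const { id, author, text, isDeleted } = row
  return { id, author, text, isDeleted }
}
```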

Lifecycle simplification

  • Removed CommentAnchorService and CommentReaderFillService; anchor resolution and reader identity filling are now inlined into CommentService / CommentRepository
  • cascadeFilesForCommentsIfSpam() — batch hard-deletes reader-uploaded images when comments are marked Junk

API shape — comment-specific changes

| Legacy | New |
| --- | --- |
| _id | id |
| ref (ObjectId) | ref_id (string) + ref_type (lowercase) |
| parentCommentId / rootCommentId | Same keys, but string Snowflake IDs |
| created | created_at |
| pin (Date \| null) | pin (boolean) |
| replyCount | reply_count |
| isDeleted | is_deleted |
| isWhispers | is_whispers |

Contract tests

  • test/src/contracts/comment.contract.spec.ts — admin list, detail, public thread, thread replies
  • test/src/contracts/yohaku/comment-thread.contract.spec.ts — Yohaku consumer contract enforcing field presence and legacy-key absence
  • test/src/modules/comment/comment-thread.spec.ts — unit tests for thread assembly and reply-window logic

API response shape changes (general)

| Legacy (Mongoose) | New (PostgreSQL) |
| --- | --- |
| _id | id |
| created | created_at |
| modified | modified_at |
| count.read / count.like | read_count / like_count |
| pin | pin_at |
| isPublished (admin) | is_published (snake_case via interceptor) |

The JSONTransformInterceptor continues to convert camelCase → snake_case at the boundary, so external consumers see snake_case keys.
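
The key transform at that boundary can be sketched as a recursive rename; the real interceptor is Nest-specific, and this shows only the camelCase → snake_case rule applied to plain objects and arrays (class instances like Date pass through untouched).

```typescript
// createdAt -> created_at, readCount -> read_count, etc.
const toSnake = (k: string) => k.replace(/([A-Z])/g, '_$1').toLowerCase()

export function snakecaseKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(snakecaseKeys)
  // Only plain objects are rewritten; Dates, Buffers, etc. are left alone.
  if (value !== null && typeof value === 'object' && value.constructor === Object) {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [
        toSnake(k),
        snakecaseKeys(v),
      ]),
    )
  }
  return value
}
```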

MongoDB → PostgreSQL migration CLI

```bash
npx tsx scripts/migrate-mongo-to-postgres.ts --mode dry-run   # verify
npx tsx scripts/migrate-mongo-to-postgres.ts --mode apply     # migrate
```
  • Reads from MONGO_URI (defaults to local MongoDB), writes to PG_URL
  • Allocates Snowflake IDs via id-map.ts preserving referential integrity across collections
  • Upserts in 200-row chunks; warnings are collected per batch without halting
  • Verified against a production-scale dataset: 167 posts, 189 notes, 558 readers, 2426 comments, 38 says, 18 drafts, 22 options, 106 links, 14 projects, etc.
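
The chunked, non-halting upsert loop can be sketched generically; the 200-row batch size comes from the description above, while the function name and the warning-returning batch callback are illustrative, not the CLI's real API.

```typescript
// Upsert rows in fixed-size batches, collecting warnings per batch
// instead of aborting the whole run on the first bad row.
export async function upsertInChunks<T>(
  rows: T[],
  upsertBatch: (batch: T[]) => Promise<string[]>, // returns warnings for the batch
  size = 200,
): Promise<string[]> {
  const warnings: string[] = []
  for (let i = 0; i < rows.length; i += size) {
    warnings.push(...(await upsertBatch(rows.slice(i, i + size))))
  }
  return warnings
}
```

Collecting warnings instead of throwing is what lets a dry run report every orphan reference at once, so operators can fix the source data in one pass.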

Verification

  • tsc --noEmit silent across the monorepo
  • vitest run passes for unit, integration, and contract tests against ephemeral PostgreSQL 17 containers
  • E2E smoke tests pass for /api/v2/posts, /notes, /pages, /says/all, /aggregate/top, /categories
  • Contract tests enforce that no legacy Mongoose keys (_id, created, modified, count) leak into public API responses
  • CI updated: GitHub Actions services now spin up postgres:16-alpine instead of mongo

Breaking changes

For frontend / API consumers

  • Response objects now use id (Snowflake string) instead of _id (ObjectId string)
  • Timestamp keys are created_at / modified_at instead of created / modified
  • api-client models have been updated to reflect the new shape; downstream projects (admin-vue3, Shiroi / Yohaku) should upgrade @mx-space/api-client to the version published with this release

For operators / self-hosters

  • Required: PostgreSQL 16+ (replaces MongoDB)
  • Required: SNOWFLAKE_WORKER_ID environment variable (integer, 0–1023)
  • Required: PG_URL or PG_HOST / PG_PORT / PG_USER / PG_PASSWORD / PG_DATABASE
  • MongoDB data must be migrated with the provided CLI before first boot
  • See docs/migrations/v12.md for the full upgrade guide (Docker & PM2)
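
Validation of the worker-ID variable listed above can be sketched as below; the real wiring goes through the commander-based app.config, so the function name and error message here are assumptions.

```typescript
// Fail fast at boot if SNOWFLAKE_WORKER_ID is missing or outside the
// 10-bit range the Snowflake layout allows (0-1023).
export function readSnowflakeWorkerId(env: Record<string, string | undefined>): number {
  const raw = env.SNOWFLAKE_WORKER_ID
  const id = Number(raw)
  if (!raw || !Number.isInteger(id) || id < 0 || id > 1023) {
    throw new Error('SNOWFLAKE_WORKER_ID must be an integer in 0-1023')
  }
  return id
}
```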

Test plan

  • CI green: lint + typecheck + unit/integration tests against PG container
  • Local migration dry-run → apply → smoke test on /api/v2/* endpoints
  • Frontend compatibility: verify admin-vue3 and Yohaku consume new id / created_at shape
  • Production migration rehearsal against a copy of production MongoDB data

Innei added 30 commits May 2, 2026 03:13
Resolve spec §18 open questions and §15 Phase 0 checklist:
- Snowflake worker ID via mandatory SNOWFLAKE_WORKER_ID config
- read_count/like_count as integer columns (not JSONB)
- drafts.history as JSONB inside drafts
- legacy Mongo migrations retained as source-only references
- search_documents keeps denormalized shape (no tsvector this cutover)
- PostgreSQL 16 + @better-auth/drizzle-adapter + testcontainers-pg adopted
Phase 1 of postgres migration. Establishes the canonical identity
contract used by repositories, controllers, and the migration tool.

- EntityId branded string + parse/serialize/zod helpers (~/shared/id)
- SnowflakeGenerator (pure) and SnowflakeService (Nest provider) with
  41/10/12 bit layout, custom epoch 2026-05-02T00:00:00.000Z
- SNOWFLAKE_WORKER_ID app.config option (required in production,
  defaults to 0 in dev, defaults to 1 in tests)
- IntIdOrEntityIdDto / EntityIdDto alongside deprecated MongoId variants
- Vitest suites covering range, monotonicity, sequence rollover,
  clock-rollback, decoding, and bigint serialization
…bles

Phase 2 of postgres migration. Establishes the full target schema and a
runtime PG provider that coexists with the existing Mongo connection so
repository cutover can proceed module-by-module.

- 43 tables across content/auth/ai/ops/migration domains, all keyed by
  Snowflake bigint with referential constraints per spec §8/§5
- drizzle-kit config + initial SQL migration (0000_initial.sql)
- Nest providers PG_POOL_TOKEN and PG_DB_TOKEN (drizzle node-postgres)
  registered in DatabaseModule alongside Mongo provider
- POSTGRES_REPOSITORY_TOKENS symbol map for module-by-module wiring
- POSTGRES app.config block with pg_* commander options + env fallback
- docker-compose adds postgres:16-alpine service for both client and
  server compose files
- Smoke test inserts via SnowflakeGenerator and verifies bigint round-
  trip + FK violation against an ephemeral PG container
Phase 3 pilot. Establishes the EntityId/bigint translation contract that
every PostgreSQL-backed repository must follow, then ports three of the
simpler content modules as canonical templates for the remaining work.

- BaseRepository with toEntityId/toBigInt helpers and pagination shape
- CategoryRepository with FK-aware delete, joined post-count aggregate,
  and tag-distribution aggregation (replaces the legacy Mongo $unwind
  pipeline)
- TopicRepository with slug normalization parity
- PageRepository with order-aware listing
- Integration spec covering create/findAll/update/delete/sumPostTags/
  FK-restrict/malformed-id paths against an ephemeral PG container

Service-layer cutover (replacing InjectModel/Mongoose calls) is left to
follow-up commits; these repositories run cleanly against the schema
applied in PR 2.
Continuation of PR 3. All four are pure read/write repositories that
hand off EntityId strings at every boundary; service-layer cutover is
deferred to follow-up commits.

- PostRepository: pagination, list-by-tag/category/published, related-
  post edge management via post_related_posts, atomic increment of
  read_count/like_count, joined category projection.
- NoteRepository: nid sequence allocator, slug/nid lookups, visibility
  predicate parity with the Mongo service, joined topic projection.
- CommentRepository: thread + replies pagination, atomic createReply
  that updates root counters under transaction, soft-delete semantics.
- ReaderRepository: profile patch helpers and owner profile upsert,
  preserving the legacy email/username/role lookup surface.
Captures everything that's committed on this branch (snowflake, schema,
8 repositories, smoke tests), what's still pending, the canonical pattern
to follow for the remaining ~18 repositories, and the exact next-step
plan for service cutover, auth migration, the data migration CLI, and
runtime cleanup.

The next operator should start with this file before reading the design
spec; it points at every file already in place and has copy-paste smoke
commands that work against postgres:16-alpine.
…ry tables

Continuation of PR 3. Brings total repositories to 25 of ~26, covering
every first-class table in the schema except meta-presets (which has
nontrivial nested schema and is service-only). All EntityId/bigint
conversions go through BaseRepository helpers.

- Operational: link, project, say, recently, draft, options, activity,
  analyze, file-reference, subscribe, snippet, slug-tracker, webhook,
  poll-vote, serverless storage + log
- AI: ai-summary, ai-insights, ai-translation + translation-entries
- Each repository preserves the legacy contract surface (lookup keys,
  unique constraints, cascade semantics) so the service-layer cutover
  can swap @InjectModel for the repository with minimal API drift
Closes the repository tier of PR 3. SearchRepository preserves the
denormalized term-frequency cache; the SearchService BM25 ranking can
keep its current algorithm and just hand off persistence to PG.

AuthRepository covers Better Auth's drizzle adapter contract surface:
accounts (provider lookup, password upgrade), sessions (token lookup
+ expiry sweep), API keys (lookup, usage counter, listing), passkeys
(credentialId lookup, counter increment), and verifications (email
flow tokens).

AiAgentConversationRepository preserves the rich JSONB messages array
plus reviewState/diffState contracts and exposes appendMessage as an
atomic transaction so concurrent agent steps do not lose updates.
PR 6 of the postgres migration spec. Builds a dependency-ordered
two-phase migration: first allocate Snowflake IDs for every Mongo
document and persist mongo_id_map, then load all collections in
reference order using the resolved ids.

- src/migration/postgres-data-migration/types.ts          step contract
- src/migration/postgres-data-migration/id-map.ts         allocate / resolve
- src/migration/postgres-data-migration/steps.ts          all collection steps
- src/migration/postgres-data-migration/runner.ts         phase orchestration
- scripts/migrate-mongo-to-postgres.ts                    --mode dry-run|apply

Dry-run mode reads everything but writes nothing; the report shows
allocated row counts plus any missing reference targets so operators
can fix data before apply mode runs. Apply mode persists mongo_id_map
first (safe to re-run), then loads tables in dependency order, then
records a row in data_migration_runs.
After the latest commits the repository tier of PR 3 is complete (28 of
28 first-class tables). PR 6 (Mongo→PG migration CLI) is done. PR 4 has
AuthRepository ready but the betterAuth() adapter swap is still pending.
Service-layer cutover and PR 7 cleanup remain the largest open chunks.
…ctory

The CLI was tripping commander.parse() in app.config because tsx forwards
--mode through process.argv; strip migration-only flags before importing
the runner. The refType handling was rejecting all 2434 comments because
the source data uses lowercase plural ("posts", "notes"), not the
PascalCase singular the previous map expected — normalize across
recentlies, comments, drafts, search, ai-translations, ai-agent, and
file-references so the canonical CollectionRefTypes value lands in PG.
Also lands BasePgCrudFactory as the PG sibling of BaseCrudFactory and a
session-3 doc that locks the six delegated decisions and lays out the
remaining wave-based service cutover plan.

Verified end-to-end against real Mongo at 127.0.0.1:27017/mx-space:
dry-run + apply both succeed (167 posts, 189 notes, 2426 comments,
93373 activities, 101625 mongo_id_map rows). 16 missing-ref warnings
remain — all orphan refIds in source data pointing to deleted parents.
…actory

These two modules have no cross-module `service.model` consumers (verified
by grep across modules/), so they can flip to the PG path independently
without breaking the wider Mongoose mesh. Topic drops its passthrough
TopicService entirely — the repository is registered directly. Both
repositories grow a `list(page, size)` method to satisfy the
BasePgCrudFactory contract. TranslateFields rules switch from `_id` to
`id` to match the EntityId surface returned by the repositories.

Mongo `subscribes`, `posts.categoryId`, etc. remain untouched — the
runtime is still 95% Mongoose; this commit is the first per-wave proof
that the pattern works end-to-end in the build. Compile is green and
the existing 32 PG/foundation tests still pass.
SubscribeService now reads/writes via SubscribeRepository (no more
@InjectModel SubscribeModel). The controller's `service.model.paginate`
is replaced with `service.list(page, size)` returning the repository's
PaginationResult shape. The cancelToken value is coerced through
String() because hashString returns a number whereas the PG schema
column is text — Mongo silently coerced it through Mongoose's loose
typing.

SubscribeRepository grows: list(page, size), updateByEmail,
deleteByEmail, deleteByEmails, deleteAll. All call sites for
mongoose-only operations (find().lean(), updateOne, deleteOne,
deleteMany) are removed from the service.

No external module consumes SubscribeService.model, so this commit does
not affect any other module's compile.
SayService stops holding a Mongoose model and instead delegates to
SayRepository. New named methods `findRecent(size)` and `count()` are
added so aggregate.service can stop touching `sayService.model`. The
controller switches to BasePgCrudFactory and the /random endpoint reads
through the repository.

aggregate.service.ts is updated in this same commit at the two call
sites that referenced sayService.model — the migration approach is
"flip a leaf module + every consumer in the same commit", because
mongoose and PG cannot disagree on what `says` contains once writes
cross over.
LinkService and LinkAvatarService both stop holding the Mongoose
LinkModel — they consume LinkRepository. The CRUD controller switches
to BasePgCrudFactory; the gets/getAll overrides project away the
`email` field for non-admin requests in the controller layer (instead
of the previous mongoose `select: '-email'` on the query side).

LinkRepository grows: list(page, size, {state}), findAvailable
(excludes Audit/Reject), countByState, countByType,
countByTypeAndState, findByUrlOrName, updateState, updateAvatar.

aggregate.service.ts swaps its two `linkService.model.countDocuments`
calls to `linkService.countByState` and updates the LinkState import
source from link.model to link.repository.

approveLink now returns the LinkRow as `link` (no more toObject()).
sendAuditResultByEmail upserts the new state via repository instead of
fetching + mutating + saving.
Replace SnippetService Mongoose backend with SnippetRepository.
Extend repository with secret column, public-by-name lookup,
function-by-custom-path (with method whitelist), prefix matching,
group-by-reference aggregation, and update-by-name for built-in
function reset.

Rewrite SnippetService to encrypt secret on write and surface
SnippetRow shape; controller swaps mongoose paginate for
repository.list and removes raw POST /snippets/aggregate behind
HTTP 400 with a deprecation message.

Cut consumers in the same commit: ServerlessService now consumes
SnippetRepository for compiledCode backfill, built-in pour, and
reset; ServerlessController/CommentLifecycleService/DebugController
swap snippetService.model lookups to repository methods. Module
provider chains use forwardRef(SnippetModule).

ServerlessLogModel and DatabaseService.db (used by mockDb /
mockGetOwner) stay on Mongoose pending wave 3 cleanup.

Verified:
- tsc --noEmit silent
- 32 PG/foundation tests pass under PG_VERIFY_URL
Lock six remaining open questions per user delegation: backend-only
release, regen 0000_initial.sql for auth, separate auth_id_map table,
hard-deprecate POST /snippets/aggregate with 400, single PR, drop
old backup format and mongo: compose service in wave 5.

Document why draft slipped to wave 2 (refId bigint vs Mongo hex
strings from post/note/page consumers) and lay out the wave-2 entry
audit, cross-module hot spots, and subagent pass strategy.
Add count(), findRecent(size, opts?), findManyByIds(ids),
findAdjacent(direction, pivot), and a few specialised batch
operations to PostRepository, NoteRepository, PageRepository,
CommentRepository, and RecentlyRepository.

These are the primitives wave-2 service rewrites will call when
aggregate / search / activity / cron / feed / sitemap / ai consumers
swap their cross-module mongoose lookups for typed repository
methods. No behavioural change today — wave-2 cut overs in a
follow-up commit hook into them.

Verified:
- tsc --noEmit silent
- 32 PG/foundation tests pass under PG_VERIFY_URL
File-by-file mapping of every .Service.model call site to the
target repository method. Lists DI swap matrix, repo gaps to
close (Pass A.5), consumer rewrites, verification gates, and
out-of-scope modules deferred to wave 3+.
post.repository.ts: countByCategoryId, aggregateAllTagCounts,
  aggregateTagCountsByCategory, findByTag, listByCategory,
  findByCategoryId, findOldest, topTagsByCount
draft.repository.ts: count, linkToPublished; list extended with
  page/size/filter params
comment.repository.ts: findByIdWithRelations, countByRef, paginatedFind
note.repository.ts: findArchiveBuckets
recently.repository.ts: findArchiveBuckets

Skipped (already present): CategoryRepository.findBySlug,
  PostRepository.attachCategory, DraftRepository.findByRef
Skipped (no tags column in notes schema): NoteRepository.topTagsByCount
…t to PostgreSQL

Wave 2 Pass B atomic cut. Every content producer service drops
@InjectModel(XModel) and goes through XRepository; .model getters
removed; cross-module consumers now call named service methods.
DatabaseService.findGlobalById / findGlobalByIds / getModelByRefType
fan out across PG repositories. databaseService.db / mongooseConnection
remain as legacy getters for wave-3 ops surfaces.

Producers:
- post: service, controller, module
- note: service, controller, repository, module
- page: service, controller, repository, module
- comment: service, controller, lifecycle, repository, module
- category: service, controller, module
- recently: service, controller, repository, module
- draft: service, controller, module

Consumers:
- aggregate: service, controller
- search: service
- activity: service
- helper.controller, helper.event-payload (processors/helper)
- database.service, database.module (processors/database)
…/poll/search-doc/serverless-log/slug-tracker/webhook to PostgreSQL

Wave 3. Every remaining @InjectModel site is now repository-backed,
except auth/owner/reader (wave 4) and the generic BaseCrudFactory
(wave 5). All ai persistence (summary, insights, translation,
translation-entries, agent-conversation), all ops persistence
(activity, analyze, options, file_references, meta_presets, poll
votes, search_documents, serverless logs, slug_trackers, webhooks),
and the configs/option singleton plus the analyze interceptor and
app.controller all consume Drizzle repositories. New
meta-preset.repository.ts patterned on slug-tracker.

Modules:
- activity: service, controller, module, repository
- ai/ai-agent: ai-agent-conversation.service
- ai/ai-insights: ai-insights, ai-insights-translation, repository
- ai/ai-summary: service, repository
- ai/ai-translation: ai-translation, ai-translation-event-handler, repository
- ai: ai.module
- analyze: service, module, repository
- comment: service (AITranslation leftover)
- configs: service, module, options.repository
- cron-task: cron-business, cron-task.module
- file: file-reference.service, file.controller, file.module, file-reference.repository
- meta-preset: service, module, NEW repository
- poll: service, module, poll-vote.repository
- search: service, module, repository
- serverless: service, module, repository (logs)
- slug-tracker: service, module, repository
- webhook: service, module, repository
- common/interceptors/analyze.interceptor
- app.controller
…izzle adapter

- apps/core/src/database/schema/auth.ts: changed id column from pkBigInt() to text('id').primaryKey().notNull() for readers, ownerProfiles, accounts, sessions, apiKeys, passkeys, verifications; FK columns referencing readers.id converted from refBigInt to text cascades
- apps/core/src/database/schema/migration.ts: added auth_id_map table with collection/mongo_id/pg_id/created_at and two unique indexes
- apps/core/src/database/migrations/0000_initial.sql: regenerated from updated schema; auth tables now declare id text PRIMARY KEY; content tables retain bigint
- apps/core/src/database/migrations/meta/: journal and snapshot normalized to single 0000_initial entry
- apps/core/src/modules/owner/owner.repository.ts: new OwnerRepository wrapping ownerProfiles schema table (text id/readerId)
- apps/core/src/modules/reader/reader.repository.ts: updated ReaderRepository for text id
- apps/core/src/modules/owner/owner.service.ts: dropped @InjectModel(OwnerProfileModel); injected OwnerRepository
- apps/core/src/modules/owner/owner.module.ts: updated providers/exports
- apps/core/src/modules/reader/reader.service.ts: dropped @InjectModel(ReaderModel); injected ReaderRepository
- apps/core/src/modules/reader/reader.module.ts: updated providers/exports
- apps/core/src/modules/auth/auth.repository.ts: id/userId/referenceId fields changed from EntityId to string; dropped toEntityId/parseEntityId conversions; removed snowflake id generation for inserts
- apps/core/src/modules/auth/auth.implement.ts: swapped mongodbAdapter to drizzleAdapter; removed MongoClient/ObjectId imports; wired db from postgres.provider
- apps/core/src/migration/postgres-data-migration/steps.ts: auth steps (readers, accounts, sessions, api_keys, passkeys, verifications, owner_profiles) now use Mongo ObjectId hex string as PG id and populate auth_id_map; content table steps unchanged
- apps/core/src/processors/database/postgres.provider.ts: minor adjustments for adapter wiring
Wave 5. Final cleanup of the PostgreSQL migration. Every Mongoose
trace at the runtime level is gone. Drizzle + Better Auth drizzle
adapter is the sole persistence layer. The migration CLI under
postgres-data-migration/ is the only place that still touches Mongo
(reads source data; writes PG); kept the mongodb dep for it.

Deletions:
- apps/core/src/transformers/crud-factor.transformer.ts
- apps/core/src/transformers/model.transformer.ts (@InjectModel)
- apps/core/src/processors/database/database.models.ts
- apps/core/src/processors/database/database.provider.ts
- apps/core/src/migration/{migrate,history}.ts and version/*
- All apps/core/src/modules/**/*.model.ts (Mongoose decorations)

Type/enum extractions to *.types.ts / *.enum.ts:
- comment, recently, post, note, page, draft, category, owner,
  reader, webhook, file-reference, ai-translation, ai-insights,
  ai-summary, ai-agent, etc. — every prior model file's enums and
  shape interfaces moved to sibling files.

Repository / service updates:
- DatabaseService drops mongoose Connection injection, db /
  mongooseConnection / client getters.
- DatabaseModule drops Mongoose providers.
- ServerlessService mockDb / mockGetOwner now use repositories.
- AI service producers assert non-null id (resultId: doc.id!) to
  match AiStreamEvent contract.
- TranslationEntryKeyPath restored to include note.mood,
  note.weather, topic.introduce.
- Lexical translation strategy gets explicit
  LexicalTranslationInput interface.

Dependency cleanup:
- Removed: mongoose, @typegoose/typegoose, @nestjs/mongoose,
  mongoose-aggregate-paginate-v2, mongoose-autopopulate,
  mongoose-lean-getters, mongoose-paginate-v2, mongoose-leancache,
  mongo-memory-server, mongodb-memory-server.
- Kept: mongodb (devDep, for postgres-data-migration CLI only).

Test infra:
- mongo-memory-server replaced by testcontainers Postgres helper.
- Repository-level mocks replace getModelToken patterns.

docker-compose:
- mongo: service removed from docker-compose.yml and
  docker-compose.server.yml.

Backup tool:
- mongodump/mongorestore replaced by pg_dump --format=custom /
  pg_restore.
Innei added 16 commits May 3, 2026 22:35
- Implemented `recently-list.contract.spec.ts` to test the recently list API, ensuring it exposes required fields and handles references correctly.
- Created `say-list.contract.spec.ts` for the say list API, validating the response structure and pagination.
- Added `topic-detail.contract.spec.ts` to verify topic list and detail retrieval, ensuring all necessary fields are present.

fix: update vitest configuration for increased timeout

- Adjusted `vitest.config.mts` to extend the hook timeout to 60 seconds.

refactor: update API client models and controllers

- Modified various models in the API client to replace `created` and `modified` fields with `createdAt` and `modifiedAt` for consistency.
- Updated controllers to reflect changes in model structures, ensuring proper typing and response handling.
- Removed unnecessary base model inheritance in several models for clarity and simplicity.

chore: enhance recently model structure

- Refined the `RecentlyModel` to include additional fields and ensure proper typing for references and metadata.
- Updated related models and interfaces to maintain consistency across the API client.

style: clean up code formatting and comments

- Improved code readability by adjusting formatting and removing redundant comments across various files.

Signed-off-by: Innei <tukon479@gmail.com>
- Remove mongoose compatibility layer (entryModel: any adapter) from
  translation-entry.service.ts, use direct repository calls
- Convert .then() chains to async/await across ~25 files
- Eliminate nested ternaries in comment.repository.ts, cos.util.ts, time.util.ts
- Flatten conditional logic in recently.service.ts, comment.service.ts
- Remove dead code: redundant nullish coalescing, no-op Promise.all([])
- Fix fire-and-forget patterns: remove unnecessary try/catch, use direct
  repository calls for Redis cleanup
- Type event handlers in ai-translation-event-handler.service.ts (replace
  event: any with proper interfaces)
- Relax TranslationEntryKeyPath from literal union to string for runtime
  dynamic keys
- Disable unicorn/number-literal-case to align with Prettier formatting
- Update test files to match PG repository patterns

All tests pass: 120 files, 701 tests
…initions into dedicated files

- Created new type definition files for PollVote, Post, Project, Reader, Recently, Serverless, Snippet, Subscribe, Topic, and Webhook modules.
- Updated repository and service files to import types from the new type definition files, improving code organization and readability.
- Removed redundant type definitions from repository files, ensuring a cleaner codebase.
- Enhanced the DatabaseService to utilize the new type definitions for better type safety and clarity.

Signed-off-by: Innei <tukon479@gmail.com>
- Migration: comments.reader_id now resolves through resolver.ref('readers',
  ...) instead of mongoHexOf, so it stores the snowflake id rather than a
  legacy Mongo ObjectId hex. Without this, fillAndReplaceAvatarUrl could
  never resolve author/avatar after the PG cutover.
- Service: getComments and findByIdWithRelations now attach a slim
  CommentParentPreview ({id, author, text, isDeleted}) so the admin detail
  panel can render `回复 @{parent.author}` for replies. parent/children are
  removed from CommentRow and lifted into CommentRowWithRelations.
- Tests: expand the admin contract to assert parent.* keys for replies and
  parent: null on root rows; add unit coverage for batched parent lookup,
  no-parent shortcut, orphan parent fallback, and reader-name resolution.
… user id

For better-auth's `credential` provider, `accounts.account_id` is the user's
own id, not an external provider id. The legacy Mongo doc stored the user's
ObjectId hex there; the original migration accepted it verbatim, leaving
post-PG rows with a stray Mongo objectid in `accounts.account_id`. Force
`accountId = userId` (snowflake) for the credential branch — OAuth providers
still keep their external `accountId` (e.g. GitHub numeric id).
Orphan handling — every relational column now either resolves cleanly or the
row is dropped. No more rows left behind with `ref_id = NULL`:

- drafts: refId is required; orphan parent (or missing source refId) drops
  the row instead of persisting an unattachable draft.
- recentlies: standalone rows (no refType) still kept; rows declaring a
  refType whose target is missing are dropped.
- file_references: standalone files (no refType) still kept; rows declaring
  a refType whose target is missing are dropped.

Schema FKs — close the runtime hole where post-PG inserts could write a
reader_id pointing at a non-existent reader:

- comments.reader_id  → readers.id ON DELETE SET NULL
- file_references.reader_id → readers.id ON DELETE SET NULL

Verified after a full schema drop + re-run: zero mongo-hex residue, zero
polymorphic ref orphans, zero FK violations across comments, drafts,
recentlies, file_references, ai_*, accounts, sessions, posts, notes.
…ontent delete

Auditing the post/note/page deletion paths surfaced three polymorphic
ref tables that were not subscribed to delete events, leaving runtime
orphans the migration cannot reach:

- ai_summaries: handler covered POST/NOTE_DELETE but missed PAGE_DELETE.
- ai_insights:  handler covered POST/NOTE_DELETE but missed PAGE_DELETE.
- ai_agent_conversations: no event handler at all; rows persisted forever
  after their parent post/note/page was removed.

Wire all three to listen on POST/NOTE/PAGE_DELETE. Add `deleteForRef` to
the agent-conversation repo+service so the new handler has a single
batched delete by refId. The agent handler swallows + logs failures
because conversation cleanup is best-effort and must not block the
parent delete event chain.

Tested via existing post/note/page/comment/ai suites: 30 files, 178/178
pass.
…tend

- CLAUDE.md: Replace all MongoDB/Mongoose/TypeGoose references with
  PostgreSQL/Drizzle/Better Auth/Snowflake ID. Update testing section
  to reflect testcontainers. Update migration section to Drizzle Kit.

- apps/core/readme.md: Rewrite to reflect current stack. Update Docker
  deployment, project structure, module list, and add tech stack table.
  Replace Shiroi with Yohaku as recommended frontend.

- README.md: Replace Shiroi with Yohaku in Related Projects.
  (Includes prior uncommitted MongoDB→PostgreSQL updates: env vars,
  tech stack table, v12 upgrade section, docker-compose service names.)
PG cutover broke compat with mongo-era field names. SDK now matches the
PG response shape:

- Add CommentParentPreview type and CommentModel.parent for list/detail
- Type comment.getById return as CommentParentPreview (slim, no PII)
- Refresh select fixtures to use createdAt instead of legacy created
- Bump to 4.0.0-next.0 for the next dist-tag
…migration documentation

Signed-off-by: Innei <tukon479@gmail.com>
Server's attachRef injects a slim summary onto recently list/detail rows
when refType/refId resolves; without the type, Yohaku can't read item.ref
through the SDK type.

Bumps to 4.0.0-next.1.
…nullability

Yohaku migration surfaced three SDK type gaps:

- `CommentModel.anchor` was `Record<string, unknown> | null`; consumers
  expect a typed `CommentAnchorModel` (block | range mode + offsets).
- `CommentModel.ref` is hydrated server-side via `attachRef` but the SDK
  did not declare it, forcing every admin caller to widen locally.
- `Post/Page.images` were typed `Image[] | undefined` but the PG `jsonb`
  column legitimately returns null, mirroring `Note.images`.

Bumps to 4.0.0-next.2.
…ract

Verification surfaced two issues:

1. The recently fixture in `recently.contract.spec.ts` used
   `upVotes`/`downVotes`, but the PG schema column is `up`/`down`. Fixed
   the fixture so the SDK shape test exercises the real shape.

2. `CommentFilterEmailInterceptor` strips `mail` from unauthenticated
   responses; the SDK previously typed it as `string | null` (always
   present). Now `mail?: string | null` so consumers don't crash on
   public detail endpoints.

Adds `EXPECTED_*_MODEL_KEYS` to post/comment/recently contract specs
that mirror the SDK type and call `assertHasKeys` against real endpoint
output. Future SDK additions must update these arrays, and any server
field omission triggers a contract-test failure.

Bumps to 4.0.0-next.3.
…eModel.password

NoteModel.password was declared optional but the server's note repository
never returns it (only `hasPassword` is computed). Drop it from the SDK
type so callers don't read undefined.

Adds EXPECTED_NOTE/PAGE/CATEGORY_MODEL_KEYS arrays + assertHasKeys checks
to existing contract specs, mirroring the post/comment/recently pattern
landed in the previous commit. AggregateRoot already had positive-key
checks, so no change there.

Bumps to 4.0.0-next.4.
@Innei Innei changed the title feat: cut over backend storage from MongoDB to PostgreSQL + Snowflake IDs feat(v12): migrate backend from MongoDB to PostgreSQL + Snowflake IDs May 4, 2026
@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: Comment Thread & Follow-ups Refactor Details

The comment subsystem was fully migrated from the Mongoose sub-document/populate pattern to explicit PostgreSQL self-references plus service-layer hydration.

1. Schema design

CREATE TABLE comments (
  id              text PRIMARY KEY,
  ref_type        text NOT NULL,
  ref_id          text NOT NULL,
  parent_comment_id text REFERENCES comments(id) ON DELETE CASCADE,
  root_comment_id   text REFERENCES comments(id) ON DELETE CASCADE,
  reply_count     int NOT NULL DEFAULT 0,
  latest_reply_at timestamptz,
  author          text,
  text            text NOT NULL,
  state           int NOT NULL DEFAULT 0,
  pin             boolean NOT NULL DEFAULT false,
  is_deleted      boolean NOT NULL DEFAULT false,
  is_whispers     boolean NOT NULL DEFAULT false,
  reader_id       text REFERENCES readers(id) ON DELETE SET NULL
);
  • parent_comment_id / root_comment_id: self-referencing foreign keys with ON DELETE CASCADE
  • reply_count / latest_reply_at: denormalized counters, updated atomically inside the transaction
  • ref_id + ref_type replace the polymorphic Mongoose ref ObjectId

2. Reply creation: atomic transaction

Old pattern (two steps, non-atomic):

const reply = await commentModel.create({ ... })
await commentModel.updateOne(
  { _id: rootCommentId },
  { $inc: { replyCount: 1 }, $set: { latestReplyAt: new Date() } }
)

New pattern (single transaction):

return this.db.transaction(async (tx) => {
  const [parent] = await tx.select().from(comments)
    .where(eq(comments.id, parentBig)).limit(1)
  const rootBig = parent.rootCommentId ?? parent.id
  const [reply] = await tx.insert(comments).values({ ... }).returning()
  await tx.update(comments)
    .set({ replyCount: sql`reply_count + 1`, latestReplyAt: now })
    .where(eq(comments.id, rootBig))
  return reply
})

3. Thread query comparison

| Scenario | Old (Mongoose) | New (Drizzle) |
| --- | --- | --- |
| Root comment list | paginate() + $and filters | findRootThreadsByRef() — WHERE parent_comment_id IS NULL |
| Batch-fetch replies | find({ rootCommentId: { $in: rootIds } }) | findVisibleRepliesForRoots() — a single IN query |
| Replies for one root | find({ rootCommentId }) | findVisibleRepliesForRoot() |
| Locate the page containing a comment | $lt/$gt date filters | findPageContainingRootComment() — SQL row-number computation |

4. Reply window & pagination

  • Initial load (GET /comments/ref/:id): root comments inline a replies array, capped at COMMENT_REPLY_THRESHOLD = 20. Above the threshold, only the head 3 + tail 3 are returned, along with replyWindow metadata.
  • Follow-up load (GET /comments/thread/:rootCommentId): cursor pagination walks the middle replies, returning { replies, nextCursor, remaining, done }.
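The head/tail windowing above can be sketched as a pure function. This is an illustrative model only: names such as windowReplies, REPLY_HEAD, and the ReplyWindow shape are assumptions, not the actual service code.

```typescript
// Illustrative sketch of the reply-window behavior described above.
// windowReplies, REPLY_HEAD, REPLY_TAIL, and ReplyWindow are assumed names.
const COMMENT_REPLY_THRESHOLD = 20
const REPLY_HEAD = 3
const REPLY_TAIL = 3

interface ReplyWindow {
  total: number
  headCount: number
  tailCount: number
  hiddenCount: number
}

function windowReplies<T>(
  replies: T[],
): { replies: T[]; replyWindow: ReplyWindow | null } {
  // Below the threshold, inline everything and omit the window metadata.
  if (replies.length <= COMMENT_REPLY_THRESHOLD) {
    return { replies, replyWindow: null }
  }
  // Above it, keep only the first and last few; the hidden middle is loaded
  // later through the cursor-paginated /comments/thread/:rootCommentId call.
  const head = replies.slice(0, REPLY_HEAD)
  const tail = replies.slice(-REPLY_TAIL)
  return {
    replies: [...head, ...tail],
    replyWindow: {
      total: replies.length,
      headCount: REPLY_HEAD,
      tailCount: REPLY_TAIL,
      hiddenCount: replies.length - REPLY_HEAD - REPLY_TAIL,
    },
  }
}
```

The design keeps the initial payload bounded regardless of thread size, while the thread endpoint fills in the middle on demand.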

5. Hydration layer (replacing Mongoose populate)

  1. attachRef() — resolves (refType, refId) → CommentRefSummary; an orphan ref degrades to null
  2. attachParentPreview() — resolves parentCommentId → a privacy-trimmed { id, author, text, isDeleted }; public endpoints strip ip/mail/agent
  3. fillAndReplaceAvatarUrl() — replaces avatar placeholders

6. Lifecycle simplification

  • Removed CommentAnchorService and CommentReaderFillService
  • Added cascadeFilesForCommentsIfSpam() — when a comment is marked as Junk, its reader-uploaded images are hard-deleted in batch

7. API shape changes (comment-specific)

| Old field | New field |
| --- | --- |
| _id | id |
| ref (ObjectId) | ref_id + ref_type |
| created | created_at |
| pin (Date \| null) | pin (boolean) |
| replyCount | reply_count |
| isDeleted | is_deleted |
| isWhispers | is_whispers |

8. Contract Tests

  • test/src/contracts/comment.contract.spec.ts
  • test/src/contracts/yohaku/comment-thread.contract.spec.ts
  • test/src/modules/comment/comment-thread.spec.ts

@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: Auth & Identity Model Changes

Better Auth's PostgreSQL schema differs significantly from MX Space's original MongoDB structure; the adaptation details follow.

Schema design

Auth tables use text primary keys (not Snowflake) to stay natively compatible with Better Auth:

| Table | Primary key | Notes |
| --- | --- | --- |
| readers | text | unified reader/user table, role = 'reader' \| 'owner' |
| owner_profiles | text | FK → readers.id, carries owner-specific fields |
| accounts | text | third-party login accounts, composite unique index (provider_id, provider_account_id) |
| sessions | text | sessions, unique index on token |
| api_keys | text | API keys, unique index on key |
| passkeys | text | WebAuthn, unique index on credential_id |
| verifications | text | verification codes, index on identifier |

Difference from content tables: content tables (post/note/comment, etc.) use Snowflake bigints serialized as text, while auth tables use uuid/random text IDs; the two are never mixed.
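A short sketch of why Snowflake IDs travel as text (the EntityId alias) rather than JSON numbers: a 64-bit Snowflake exceeds Number.MAX_SAFE_INTEGER, so number serialization would silently lose precision. serializeSnowflake is illustrative; parseEntityId appears in the PR's repository code, but its exact implementation is assumed here.

```typescript
// Snowflake IDs are 64-bit integers; JavaScript numbers only hold 53 bits
// of integer precision, so the SDK and API exchange them as strings.
type EntityId = string

function serializeSnowflake(id: bigint): EntityId {
  return id.toString()
}

// Hypothetical counterpart to the parseEntityId() referenced in
// post.repository.ts; the real helper's semantics are assumed.
function parseEntityId(id: EntityId): bigint {
  if (!/^\d+$/.test(id)) throw new Error(`invalid EntityId: ${id}`)
  return BigInt(id)
}
```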

Owner / Reader global module adjustments

  • OwnerService and ReaderService are now provided by a Global() module, breaking the circular dependency (CommentService → ReaderService → AuthService)
  • OwnerModel no longer extends ReaderModel; it is linked through a reader_id FK instead

Repository pattern

  • AuthRepository wraps Better Auth's low-level operations
  • ReaderRepository / OwnerRepository extend BaseRepository and query directly through Drizzle
  • All auth queries moved from Mongoose lean() to Drizzle select() + eq()

@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: MongoDB → PostgreSQL Migration CLI Implementation Details

scripts/migrate-mongo-to-postgres.ts plus src/migration/postgres-data-migration/ is the key production tooling in this PR.

Architecture

runner.ts          -- CLI entry point; parses --mode {dry-run|apply}
steps.ts           -- defines migration steps for 30+ tables in collection order
id-map.ts          -- MongoDB ObjectId → Snowflake ID allocator;
                   -- guarantees cross-collection reference consistency
types.ts           -- MigrationContext / MigrationStep / report types

ID mapping strategy

  1. For each MongoDB collection, iterate over all documents first
  2. Pre-allocate a Snowflake ID for every _id (or keep the auth text ID)
  3. Record the mappings in mongo_id_map (content tables) and auth_id_map (auth tables)
  4. References inside subsequent documents (e.g. comment.refId, post.categoryId) are translated through the mapping tables
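The allocator idea behind id-map.ts can be modeled roughly like this; the function names and shape are assumptions based on the description above, not the real file.

```typescript
// Assumed sketch of the id-map allocator: every ObjectId hex maps to exactly
// one snowflake, and repeated lookups return the same snowflake, keeping
// cross-collection references consistent.
function createIdMap(allocate: () => string) {
  const map = new Map<string, string>()
  return {
    // Resolve an ObjectId hex to its snowflake, allocating on first sight.
    resolve(objectIdHex: string): string {
      let id = map.get(objectIdHex)
      if (!id) {
        id = allocate()
        map.set(objectIdHex, id)
      }
      return id
    },
    // Like resolve, but never allocates. Used when translating references;
    // an unknown target surfaces as null, which feeds the orphan handling.
    ref(objectIdHex: string): string | null {
      return map.get(objectIdHex) ?? null
    },
  }
}
```

This also explains the two-pass design above: pass one calls resolve() for every document, pass two translates references with ref() so orphans can be detected and dropped.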

Execution order

Topologically sorted by dependency:

readers → accounts/sessions/api_keys/passkeys/verifications
  → categories → topics
    → posts → post_related_posts
      → notes → pages → recentlies → comments
        → drafts → draft_histories
          → links → projects → says → snippets → subscribes
            → search_documents → activities → analyzes
              → file_references → serverless_storages/logs
                → slug_trackers → webhooks → webhook_events
                  → poll_votes → options → meta_presets

Fault tolerance

  • Chunked upserts, 200 rows per batch, with ON CONFLICT DO NOTHING
  • A failed batch logs a warning instead of aborting the whole migration
  • Dry-run mode only counts and writes nothing, for a pre-flight assessment
  • Resumable: re-running apply does not re-import already-migrated data
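The batching step might look like the following sketch. The chunk helper name is illustrative; per the description above, each batch would then be inserted with Drizzle's .onConflictDoNothing() so that re-running apply skips rows that already migrated.

```typescript
// Illustrative 200-rows-per-batch chunking for the migration upserts.
function chunk<T>(rows: T[], size = 200): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size))
  }
  return batches
}
```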

Known limitations

  • The passkey batch logs one known warning, passkeys batch: value.map is not a function (the affected passkey rows are skipped; other data is unaffected)
  • The 16 orphan refs are historical legacy data; skipping them is expected behavior

@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: Test Infrastructure Overhaul

Tests moved wholesale from the in-memory MongoDB server to PostgreSQL Testcontainers — the core pillar guaranteeing migration quality.

Infrastructure replacement

| Before | After |
| --- | --- |
| mongodb-memory-server | @testcontainers/postgresql + testcontainers |
| one dedicated MongoDB instance per test file | one dedicated PostgreSQL 17 container per test file |
| Mongoose model mocks | Drizzle AppDatabase mock + repository provider mocks |

Core helpers

test/helper/pg-testcontainer.ts

  • Starts a throwaway PG 17 via PostgreSqlContainer
  • Automatically runs drizzle-migrate to load 0000_initial.sql
  • Returns { pool, db, stop }; the container is destroyed when the test finishes

test/helper/pg-repository-mock.ts

  • Provides mock factories for each repository
  • Supports createMockRepository(moduleRef, { post: [...], note: [...] })

test/helper/create-e2e-app.ts

  • Adapted: E2E tests now inject postgresProviders instead of DatabaseModule
  • __setTestPostgresInstance() swaps the container db into the global cache

Contract tests (~25 new files)

Every public endpoint has a dedicated shape contract:

| Domain | Files |
| --- | --- |
| Content | post.contract.spec.ts, note.contract.spec.ts, page.contract.spec.ts, draft.contract.spec.ts, say.contract.spec.ts |
| Taxonomy | category.contract.spec.ts, topic.contract.spec.ts |
| Comment | comment.contract.spec.ts, yohaku/comment-thread.contract.spec.ts |
| Admin | admin/*-admin.contract.spec.ts (posts, notes, pages, drafts, comments, topics, aggregate) |
| Yohaku | yohaku/post-list.contract.spec.ts, yohaku/note-detail.contract.spec.ts, yohaku/aggregate-*.contract.spec.ts, … |

Every contract test calls assertNoLegacyKeys() to enforce that Mongoose-era legacy fields (_id, created, modified, count, etc.) never appear in a response.
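A plausible implementation of such a guard, shown purely as a sketch (the real assertNoLegacyKeys() helper may differ in detail):

```typescript
// Recursively scan a response payload and fail on any Mongoose-era key.
// The key list and path reporting are assumptions for illustration.
const LEGACY_KEYS = new Set(['_id', 'created', 'modified', 'count'])

function findLegacyKeys(value: unknown, path = '$'): string[] {
  if (Array.isArray(value)) {
    return value.flatMap((item, i) => findLegacyKeys(item, `${path}[${i}]`))
  }
  if (value && typeof value === 'object') {
    return Object.entries(value).flatMap(([key, child]) => {
      const here = LEGACY_KEYS.has(key) ? [`${path}.${key}`] : []
      return [...here, ...findLegacyKeys(child, `${path}.${key}`)]
    })
  }
  return []
}

function assertNoLegacyKeys(response: unknown): void {
  const hits = findLegacyKeys(response)
  if (hits.length > 0) {
    throw new Error(`legacy Mongoose keys in response: ${hits.join(', ')}`)
  }
}
```

A negative check like this complements the EXPECTED_*_MODEL_KEYS positive assertions: one catches leaked legacy fields, the other catches missing new ones.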

How to run

# Single file (automatically starts a PG container)
pnpm test -- test/src/modules/comment/comment-thread.spec.ts

# Full run (CI mode)
CI=true vitest run

Environment variables:

SNOWFLAKE_WORKER_ID=1
PG_HOST=127.0.0.1
PG_PORT=5432
PG_USER=mx
PG_PASSWORD=mx
PG_DATABASE=mx_core

@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: AI Module PostgreSQL Adaptation

The AI subsystem (ai-summary / ai-translation / ai-insights / ai-agent-conversation) moved from Mongoose to Drizzle, touching many JSONB columns and complex relations.

Schema design

-- AI translations
CREATE TABLE ai_translations (
  id          text PRIMARY KEY,
  type        text NOT NULL,          -- 'post' | 'note' | 'page' | 'recently'
  ref_id      text NOT NULL,
  lang        text NOT NULL,
  content     text NOT NULL,
  content_format text NOT NULL DEFAULT 'markdown',
  status      int NOT NULL DEFAULT 0, -- 0=pending 1=success 2=failed
  created_at  timestamptz NOT NULL DEFAULT now()
);

-- AI summaries
CREATE TABLE ai_summaries (
  id          text PRIMARY KEY,
  ref_id      text NOT NULL,
  ref_type    text NOT NULL,
  summary     text NOT NULL,
  lang        text NOT NULL,
  created_at  timestamptz NOT NULL DEFAULT now()
);

-- AI Insights
CREATE TABLE ai_insights (
  id          text PRIMARY KEY,
  ref_id      text NOT NULL,
  ref_type    text NOT NULL,
  insights    jsonb NOT NULL DEFAULT '{}',
  created_at  timestamptz NOT NULL DEFAULT now()
);

-- Translation entries (Lexical-specific)
CREATE TABLE translation_entries (
  id          text PRIMARY KEY,
  hash        text NOT NULL,
  original    text NOT NULL,
  translation text,
  lang        text NOT NULL,
  updated_at  timestamptz NOT NULL DEFAULT now(),
  UNIQUE(hash, lang)
);

-- AI agent conversations
CREATE TABLE ai_agent_conversations (
  id          text PRIMARY KEY,
  session_id  text NOT NULL,
  role        text NOT NULL,          -- 'user' | 'assistant' | 'system'
  content     text NOT NULL,
  metadata    jsonb,
  created_at  timestamptz NOT NULL DEFAULT now()
);

Key changes

  1. ai-translation
  • Source and translation are stored separately: ai_translations holds full translation results; translation_entries caches Lexical block-level translations (deduplicated via the hash index)
  • New content_format column distinguishes markdown vs lexical content
  • sourceModified renamed to sourceModifiedAt (consistent with global timestamp naming)
  2. ai-summary
  • Summaries link directly to ref_id + ref_type, no longer through a Mongoose ref ObjectId
  • Summaries in multiple languages are stored independently
  3. ai-insights
  • insights stores structured analysis results as jsonb
  • The repository exposes findByRef() for batch queries by content ID
  4. Lexical translation strategy
  • helper.lexical.service.ts keeps createHeadlessEditor() + $toMarkdown() unchanged
  • Translation pipeline: Lexical JSON → headless editor → Markdown → translate → backfill

Repository query examples

// ai-translation.repository.ts
async findByRef(refId: string, type: string): Promise<AiTranslationRow[]> {
  return this.db.select().from(aiTranslations)
    .where(and(
      eq(aiTranslations.refId, refId),
      eq(aiTranslations.type, type)
    ))
}

// translation-entry.repository.ts
async upsertEntry(hash: string, lang: string, original: string, translation: string) {
  return this.db.insert(translationEntries)
    .values({ hash, lang, original, translation })
    .onConflictDoUpdate({
      target: [translationEntries.hash, translationEntries.lang],
      set: { translation, updatedAt: new Date() }
    })
}

Test coverage

  • test/src/modules/ai/ai-translation.service.spec.ts — core translation flow
  • test/src/modules/ai/translation-consistency.service.spec.ts — 14 tests, sourceModifiedAt rename
  • test/src/modules/ai/lexical-translation-e2e.spec.ts — Lexical JSON end-to-end translation
  • test/src/modules/ai/ai-insights-translation.service.spec.ts — multi-language insights
  • test/src/modules/ai/ai-summary.service.spec.ts — summary generation and storage

@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: Post / Note / Page / Recently Content Model Changes

The core content modules moved from Mongoose's WriteBaseModel inheritance hierarchy to explicit Drizzle schemas, dropping virtual fields and getters/setters.

Schema comparison

| Mongoose (old) | Drizzle (new) | Notes |
| --- | --- | --- |
| _id: ObjectId | id: text PRIMARY KEY | Snowflake ID |
| title: string | title: text NOT NULL | direct constraint |
| slug: string | slug: text NOT NULL | unique index |
| text: string | text: text | nullable |
| content: string | content: text | nullable, mutually exclusive with text |
| contentFormat: 'markdown' \| 'lexical' | content_format text NOT NULL | mandatory marker |
| created: Date | created_at timestamptz NOT NULL DEFAULT now() | |
| modified: Date | modified_at timestamptz | nullable |
| count.read | read_count int NOT NULL DEFAULT 0 | denormalized |
| count.like | like_count int NOT NULL DEFAULT 0 | denormalized |
| images: Image[] | images jsonb | array stored as JSONB |
| meta: Record<string, any> | meta jsonb | |
| tags: string[] | tags text[] NOT NULL DEFAULT '{}' | native PostgreSQL array |

Polymorphic content formats

Markdown mode: text holds the raw markdown; content is null
Lexical mode: content holds the Lexical JSON string; text is null or holds preview text

// post.repository.ts
async findBySlug(slug: string): Promise<PostRow | null> {
  const [row] = await this.db.select().from(posts)
    .where(eq(posts.slug, slug)).limit(1)
  return row ? this.mapBase(row) : null
}

async incrementReadCount(id: string) {
  return this.db.update(posts)
    .set({ readCount: sql`read_count + 1` })
    .where(eq(posts.id, parseEntityId(id)))
}

Relation design

Post → Category (posts.category_id → categories.id)

  • ON DELETE RESTRICT: a category's posts must be detached before the category can be deleted
  • The repository provides findByCategoryId() for batch queries

Post → Related Posts (post_related_posts join table)

  • A dedicated table stores the bidirectional links plus position ordering
  • ON DELETE CASCADE: links are cleaned up automatically when a post is deleted

Note → Topic (notes.topic_id → topics.id)

  • ON DELETE SET NULL: deleting a topic leaves its notes untagged
  • nid keeps its auto-incrementing numeric ID (separate unique index), used for short URLs

Recently → Polymorphic Ref (recentlies.ref_type + recentlies.ref_id)

  • The application layer validates ref_type ∈ {post, note, page, recently}
  • ref_id may be null (a plain-text status update)
Aggregate queries

-- Home page aggregate: recent posts + popular + activity feed
SELECT * FROM posts WHERE is_published = true ORDER BY pin_at DESC NULLS LAST, created_at DESC LIMIT 10;
SELECT * FROM notes WHERE is_published = true ORDER BY nid DESC LIMIT 10;
SELECT * FROM recentlies ORDER BY created_at DESC LIMIT 20;

Repository pattern comparison

| Old (Mongoose) | New (Drizzle) |
| --- | --- |
| PostModel.find().populate('category').lean() | postRepo.list() + a batched categoryRepo.findByIds() |
| postModel.paginate({ categoryId }, { sort: { created: -1 } }) | postRepo.paginatedFind({ categoryId }, page, size) + SQL ORDER BY created_at DESC |
| postModel.updateOne({ _id }, { $inc: { 'count.read': 1 } }) | postRepo.incrementReadCount(id) with sql`read_count + 1` |
| postModel.aggregate([{ $lookup: ... }]) | Drizzle select() + leftJoin(), or application-level batch queries |
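The batched join replacing populate('category') can be modeled as a pure in-memory merge; the types here are trimmed for illustration and the function name is assumed.

```typescript
// Two queries (posts, then categoryRepo.findByIds for the distinct category
// ids), followed by an in-memory join keyed on category id.
interface PostRow { id: string; categoryId: string }
interface CategoryRow { id: string; name: string }

function attachCategories(
  posts: PostRow[],
  categories: CategoryRow[],
): Array<PostRow & { category: CategoryRow | null }> {
  const byId = new Map(categories.map((c) => [c.id, c]))
  return posts.map((post) => ({
    ...post,
    // A missing category degrades to null instead of throwing, matching
    // the orphan-tolerant hydration used elsewhere in the PR.
    category: byId.get(post.categoryId) ?? null,
  }))
}
```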

Page-specific handling

  • order drives custom page ordering (index pages_order_idx)
  • subtitle supports a secondary title
  • No category/tags relations — structurally the simplest of the content models

@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: Docker / CI / Deployment Configuration Changes

All MongoDB-related deployment configuration was replaced with PostgreSQL equivalents, covering Docker Compose, GitHub Actions CI, and the Dockerfile.

Docker Compose

# docker-compose.yml
services:
  app:
    depends_on:
      - postgres   # replaces mongo
      - redis
    environment:
      - PG_HOST=postgres
      - PG_PORT=5432
      - PG_USER=mx
      - PG_PASSWORD=mx
      - PG_DATABASE=mx_core
      - SNOWFLAKE_WORKER_ID=1
      - REDIS_HOST=redis

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: mx
      POSTGRES_PASSWORD: mx
      POSTGRES_DB: mx_core
    volumes:
      - ./data/postgres:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U mx -d mx_core']
      interval: 30s
      timeout: 5s
      retries: 5
      start_period: 10s

  # mongo service removed

Dockerfile

  • Removed the MongoDB-related environment variables and startup scripts
  • Added PG_* environment variable declarations
  • The build stage no longer depends on the mongod binary

GitHub Actions CI

# .github/workflows/ci.yml
services:
  postgres:
    image: postgres:16-alpine
    env:
      POSTGRES_USER: mx
      POSTGRES_PASSWORD: mx
      POSTGRES_DB: mx_core
    ports:
      - 5432:5432
    options: >-
      --health-cmd "pg_isready -U mx -d mx_core"
      --health-interval 10s
      --health-timeout 5s
      --health-retries 5
  # mongodb service removed

env:
  SNOWFLAKE_WORKER_ID: 1
  PG_HOST: 127.0.0.1
  PG_PORT: 5432
  PG_USER: mx
  PG_PASSWORD: mx
  PG_DATABASE: mx_core
  MIGRATIONS_DIR: ${{ github.workspace }}/apps/core/src/database/migrations

Environment variable changes

| Old variable | New variable | Required |
| --- | --- | --- |
| DB_HOST | PG_HOST | yes |
| MONGO_CONNECTION | PG_URL / PG_CONNECTION_STRING | yes |
| (none) | PG_PORT | no (default 5432) |
| (none) | PG_USER | no (default mx) |
| (none) | PG_PASSWORD | no (default mx) |
| (none) | PG_DATABASE | no (default mx_core) |
| (none) | SNOWFLAKE_WORKER_ID | yes |
| (none) | SNOWFLAKE_WORKER_OFFSET | no (for PM2 multi-instance) |

Production deployment notes

  1. The data migration must run before first startup:

    docker compose run --rm app npx tsx scripts/migrate-mongo-to-postgres.ts --mode apply
  2. Data volume path change

    • Old: ./data/db:/data/db (MongoDB)
    • New: ./data/postgres:/var/lib/postgresql/data
  3. Backup command change

    • Old: mongodump --archive > backup.archive
    • New: pg_dump -U mx -d mx_core > backup.sql
  4. PM2 deployments require PostgreSQL 16+ installed:

    # Ubuntu/Debian
    sudo apt install postgresql-16
    # Create the database and user
    sudo -u postgres psql -c "CREATE USER mx WITH PASSWORD 'mx';"
    sudo -u postgres psql -c "CREATE DATABASE mx_core OWNER mx;"

@Innei
Member Author

Innei commented May 4, 2026

🔍 Follow-up: @mx-space/api-client Model Rewrite

Every model in packages/api-client was migrated from the Mongoose shape to the PostgreSQL shape, which makes this package the main source of breaking changes for frontend consumers.

Base class removal

The old version had shared base classes such as TextBaseModel / BaseModel carrying _id, created, modified, count, and similar fields. They are all replaced with standalone interfaces that declare every field explicitly:

// old (base.ts)
export interface TextBaseModel extends BaseModel {
  title: string
  text: string
  summary?: string
  count: Count
}

export interface BaseModel {
  _id: string
  created: string
  modified: string | null
  id: string
}

// new (post.ts): fully standalone
export interface PostModelMarkdown {
  id: string
  createdAt: string
  modifiedAt: string | null
  title: string
  text: string
  contentFormat?: 'markdown'
  content?: undefined
  summary?: string | null
  copyright: boolean
  tags: string[]
  slug: string
  categoryId: string
  category: CategoryModel
  images?: Image[] | null
  isPublished: boolean
  readCount: number
  likeCount: number
  pinAt?: string | null
  pinOrder?: number | null
  related?: PostRelatedSummary[]
}

Per-model change summary

| Model | Main changes |
| --- | --- |
| PostModel | split into PostModelMarkdown \| PostModelLexical; count → readCount/likeCount; pin → pinAt; created → createdAt |
| NoteModel | nid kept; count → standalone fields; topicId added; created → createdAt |
| PageModel | order added; subtitle added; created → createdAt |
| CommentModel | ref → refId + refType; parentCommentId/rootCommentId declared explicitly; replyCount → reply_count (API output) |
| CategoryModel | slug + name; created → createdAt; _id → id |
| RecentlyModel | refId + refType; commentsIndex added; created → createdAt |
| DraftModel | refId + refType; contentFormat; version/publishedVersion; created → createdAt |
| LinkModel | name/url/avatar/description; created → createdAt |
| SayModel | text/source/author; created → createdAt |
| ProjectModel | name/url/avatar/description; created → createdAt |
| ActivityModel | type/payload; created → createdAt |
| ReaderModel | handle/displayUsername/role; created → createdAt |
| AggregateModel | top-level aggregate fields all in snake_case; created → createdAt |

Controller layer adaptation

// packages/api-client/controllers/post.ts
// old
async getPostById(id: string): Promise<PostModel> {
  return this.proxy.get(`/posts/${id}`)
}

// new: the type name is unchanged, but the returned structure is updated
async getPostById(id: string): Promise<PostModel> {
  return this.proxy.get(`/posts/${id}`)
}

Controller method signatures are unchanged, but the field names in the returned data are not. Frontends must update their field access after upgrading @mx-space/api-client.

Frontend upgrade guide

For the admin-vue3 / Shiroi / Yohaku projects:

  1. Upgrade @mx-space/api-client to the version matching core v12
  2. Globally replace field names:
    • _id → id
    • created → createdAt
    • modified → modifiedAt
    • count.read → readCount
    • count.like → likeCount
    • pin (Date) → pinAt
    • comment.ref → comment.refId / comment.refType
    • comment.count → comment.replyCount
  3. Handle the PostModel union type: check contentFormat === 'lexical' at runtime to decide how to render
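Step 3 amounts to ordinary discriminated-union narrowing. The interfaces below are trimmed versions of the SDK types, and renderPost is a hypothetical consumer, not part of the SDK.

```typescript
// Discriminate the PostModel union on contentFormat at runtime.
// Shapes are reduced to the fields needed for the example.
interface PostModelMarkdown { contentFormat?: 'markdown'; text: string }
interface PostModelLexical { contentFormat: 'lexical'; content: string }
type PostModel = PostModelMarkdown | PostModelLexical

function renderPost(post: PostModel): string {
  if (post.contentFormat === 'lexical') {
    // post.content is the serialized Lexical JSON string.
    return `lexical:${post.content.length} bytes`
  }
  // Markdown (the marker is optional on this branch): render post.text.
  return `markdown:${post.text}`
}
```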

Versioning strategy

@mx-space/api-client should publish a major version (e.g. v5 → v6) to signal the breaking change; core's package.json updates its workspace reference in lockstep.

Innei added 3 commits May 4, 2026 23:35
…tering

- Added `LexicalService` for extracting summaries from lexical content.
- Introduced `year` filter in note listing methods.
- Updated `NoteModel` to include an optional `summary` field.
- Implemented `truncateAtBoundary` utility for locale-aware text truncation.
- Enhanced tests to cover new functionality and ensure proper behavior.
- Updated package dependencies to include `rebuild`.

Signed-off-by: Innei <tukon479@gmail.com>
- note: attachTopics now selects description/introduce/icon/createdAt so
  Yohaku NoteTopicDetail can render avatar + intro + description (mongo's
  autopopulate returned the full topic doc; PG was projecting to id/name/slug)
- recently: attach live commentsIndex via batched countManyByRef on every
  read path (findById/findRecent/getOffset/getAll); the stored counter
  drifts since it is not incremented on comment create
- comment: enrich author/avatar via reader fill before broadcasting
  COMMENT_CREATE so admin in-app + browser notifications no longer render
  "null: <text>" for logged-in readers; extend fallback chain to
  displayUsername/username/handle/email-prefix for OAuth users without
  profile name
Mongo stored `remaining` as Double(NaN) on legacy api_keys docs; the prior
`d.remaining ?? null` mapping let NaN through to PostgreSQL, which rejected
the entire batch with `invalid input syntax for type integer: "NaN"`.
Add intOrNull/intOr helpers and apply them to all api_keys integer fields.
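A plausible shape for those helpers, sketched from the commit description (the real implementation may differ):

```typescript
// Coerce legacy Mongo numerics (Double, NaN, numeric strings) into values a
// PostgreSQL integer column will accept; NaN/Infinity map to null/fallback.
function intOrNull(value: unknown): number | null {
  const n = typeof value === 'string' ? Number(value) : value
  if (typeof n !== 'number' || !Number.isFinite(n)) return null
  return Math.trunc(n)
}

function intOr(value: unknown, fallback: number): number {
  return intOrNull(value) ?? fallback
}
```

The key property is that Double(NaN) no longer reaches the insert, which is what previously failed with `invalid input syntax for type integer: "NaN"`.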
@Innei Innei merged commit 3dd35b0 into master May 4, 2026
7 of 9 checks passed
Innei added a commit to mx-space/mx-admin that referenced this pull request May 10, 2026
Backend PG migration (mx-space/core#2659) renamed date fields
to the *At form. Updated AISummary/AIInsights/AITranslation,
ServerlessLogEntry, MetaPresetField, and WriteBaseType, plus all
callsites in summary/insights/translation detail panels, write
views, fn-log drawer, and text-base drawer.

Removed the dead webhook created/updated cards (backend never
emitted those fields).


Development

Successfully merging this pull request may close these issues.

2 participants