feat(server): parse ydoc to markdown #12812

fengmk2 · 2025-06-13T03:52:45Z

close AI-190

Summary by CodeRabbit

New Features
- Introduced an endpoint to retrieve a document's markdown content and title.
- Added backend support for parsing document snapshots directly into markdown format.
Tests
- Added comprehensive tests and snapshot files for markdown retrieval, including success and error scenarios.
- Improved test coverage for content type validation and markdown parsing utilities.
Documentation
- Enhanced internal documentation through detailed test cases and snapshot references for new markdown features.

PR Dependency Tree

PR feat(server): parse ydoc to markdown #12812 👈
- PR feat(server): parse ydoc to ai editable markdown format #12846
  - PR feat(server): add read doc tool #12811

This tree was auto-generated by Charcoal

coderabbitai · 2025-06-13T03:52:51Z

Walkthrough

A new feature was added to support retrieving a document's complete markdown content via a backend API. This includes a new endpoint, utility functions for parsing document snapshots to markdown, test cases (unit and end-to-end), and snapshot files to verify output correctness. No existing exported entities were changed.

Changes

File(s)	Change Summary
.../src/core/utils/blocksuite.ts	Added `parseDocToMarkdownFromDocSnapshot` function for parsing Yjs doc snapshots to markdown.
.../src/core/doc/reader.ts	Added `DocMarkdown` interface and `getDocMarkdown` method to `DocReader` and its subclasses.
.../src/core/doc-service/controller.ts	Added `/workspaces/:workspaceId/docs/:docId/markdown` endpoint to return markdown content.
.../src/core/utils/tests/blocksute.spec.ts .../blocksute.spec.ts.md	Added a test and snapshot for parsing doc snapshot to markdown.
.../src/core/doc/tests/reader-from-database.spec.ts .../reader-from-database.spec.ts.md	Added tests and snapshots for `getDocMarkdown` in the database doc reader.
.../src/core/doc/tests/reader-from-rpc.spec.ts .../reader-from-rpc.spec.ts.md	Added tests and snapshots for `getDocMarkdown` in the RPC doc reader.
.../src/core/doc-service/tests/controller.spec.ts	Added/updated tests for content type and markdown retrieval in doc service controller.
.../src/tests/e2e/doc-service/controller.spec.ts .../controller.spec.ts.md	Added end-to-end tests and snapshots for the markdown retrieval API endpoint.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Controller
    participant DocReader
    participant Database
    participant RPCService

    Client->>Controller: GET /workspaces/:wsId/docs/:docId/markdown
    Controller->>DocReader: getDocMarkdown(wsId, docId)
    alt RPC DocReader
        DocReader->>RPCService: fetch markdown (wsId, docId)
        alt Success
            RPCService-->>DocReader: {title, markdown}
            DocReader-->>Controller: {title, markdown}
        else Failure (network, etc.)
            DocReader->>Database: getDocMarkdown(wsId, docId)
            Database-->>DocReader: {title, markdown} or null
            DocReader-->>Controller: {title, markdown} or null
        end
    else Database DocReader
        DocReader->>Database: getDocMarkdown(wsId, docId)
        Database-->>DocReader: {title, markdown} or null
        DocReader-->>Controller: {title, markdown} or null
    end
    alt Found
        Controller-->>Client: 200 OK, JSON {title, markdown}
    else Not found
        Controller-->>Client: 404 Not Found, error JSON
    end

Assessment against linked issues

Objective	Addressed	Explanation
Implement doc reader to retrieve complete markdown (AI-190)	✅
Markdown output must conform to specified format/rules (AI-190)	✅

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes found.

Possibly related PRs

toeverything/AFFiNE#12840: Introduces or moves the markdown parsing logic (parsePageDoc) used in this PR's new utilities and endpoints.
toeverything/AFFiNE#12625: Adjusts document snapshot reading utilities, which are also extended in this PR for markdown parsing.

Suggested reviewers

forehalo

Poem

In the warren where code bunnies dwell,
Markdown magic, we now compel!
Snapshots abound, tests hop in line,
New endpoints serve content just fine.
With every doc, a story to share—
🐇 Happy coding, everywhere!

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4d26eb and 541a2cc.

⛔ Files ignored due to path filters (4)

packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.snap is excluded by !**/*.snap
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.snap is excluded by !**/*.snap
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.snap is excluded by !**/*.snap
packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.snap is excluded by !**/*.snap

📒 Files selected for processing (12)

packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.md (1 hunks)
packages/backend/server/src/__tests__/e2e/doc-service/controller.spec.ts (1 hunks)
packages/backend/server/src/core/doc-service/__tests__/controller.spec.ts (4 hunks)
packages/backend/server/src/core/doc-service/controller.ts (1 hunks)
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.md (1 hunks)
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.md (1 hunks)
packages/backend/server/src/core/doc/__tests__/reader-from-database.spec.ts (1 hunks)
packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts (3 hunks)
packages/backend/server/src/core/doc/reader.ts (5 hunks)
packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.md (1 hunks)
packages/backend/server/src/core/utils/__tests__/blocksute.spec.ts (2 hunks)
packages/backend/server/src/core/utils/blocksuite.ts (2 hunks)

✅ Files skipped from review due to trivial changes (3)

packages/backend/server/src/tests/e2e/doc-service/snapshots/controller.spec.ts.md
packages/backend/server/src/core/doc/tests/snapshots/reader-from-database.spec.ts.md
packages/backend/server/src/tests/e2e/doc-service/controller.spec.ts

🚧 Files skipped from review as they are similar to previous changes (9)

packages/backend/server/src/core/utils/blocksuite.ts
packages/backend/server/src/core/utils/tests/blocksute.spec.ts
packages/backend/server/src/core/doc/tests/snapshots/reader-from-rpc.spec.ts.md
packages/backend/server/src/core/utils/tests/snapshots/blocksute.spec.ts.md
packages/backend/server/src/core/doc/tests/reader-from-rpc.spec.ts
packages/backend/server/src/core/doc-service/controller.ts
packages/backend/server/src/core/doc/tests/reader-from-database.spec.ts
packages/backend/server/src/core/doc-service/tests/controller.spec.ts
packages/backend/server/src/core/doc/reader.ts

⏰ Context from checks skipped due to timeout of 90000ms (56)

GitHub Check: Unit Test (5)
GitHub Check: Unit Test (2)
GitHub Check: Unit Test (4)
GitHub Check: Unit Test (3)
GitHub Check: Unit Test (1)
GitHub Check: Native Unit Test
GitHub Check: test-build-mobile-app / build-android-web
GitHub Check: test-build-mobile-app / build-ios-web
GitHub Check: loom thread test
GitHub Check: y-octo binding test on aarch64-pc-windows-msvc
GitHub Check: fuzzing
GitHub Check: E2E BlockSuite Test (9)
GitHub Check: y-octo binding test on x86_64-pc-windows-msvc
GitHub Check: E2E Test (6)
GitHub Check: y-octo binding test on aarch64-apple-darwin
GitHub Check: y-octo binding test on x86_64-apple-darwin
GitHub Check: E2E BlockSuite Test (7)
GitHub Check: E2E BlockSuite Test (6)
GitHub Check: Run native tests
GitHub Check: E2E BlockSuite Test (2)
GitHub Check: E2E BlockSuite Test (10)
GitHub Check: E2E BlockSuite Test (8)
GitHub Check: E2E BlockSuite Test (5)
GitHub Check: E2E BlockSuite Test (3)
GitHub Check: E2E Test (7)
GitHub Check: E2E BlockSuite Test (4)
GitHub Check: E2E BlockSuite Test (1)
GitHub Check: E2E Test (10)
GitHub Check: E2E Test (9)
GitHub Check: E2E Test (4)
GitHub Check: E2E Test (2)
GitHub Check: E2E Test (8)
GitHub Check: E2E Test (5)
GitHub Check: E2E Test (3)
GitHub Check: E2E Mobile Test (3)
GitHub Check: E2E Mobile Test (2)
GitHub Check: E2E Test (1)
GitHub Check: E2E Mobile Test (4)
GitHub Check: E2E Mobile Test (5)
GitHub Check: E2E Mobile Test (1)
GitHub Check: Typecheck
GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
GitHub Check: Build @affine/electron renderer
GitHub Check: E2E BlockSuite Cross Browser Test (2, firefox)
GitHub Check: E2E BlockSuite Cross Browser Test (1, webkit)
GitHub Check: E2E BlockSuite Cross Browser Test (2, chromium)
GitHub Check: E2E BlockSuite Cross Browser Test (1, chromium)
GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
GitHub Check: E2E BlockSuite Cross Browser Test (1, firefox)
GitHub Check: Build AFFiNE native (x86_64-pc-windows-msvc)
GitHub Check: Analyze (typescript, blocksuite)
GitHub Check: Analyze (typescript, affine)
GitHub Check: Build Server native
GitHub Check: Analyze (javascript, affine)
GitHub Check: Analyze (javascript, blocksuite)
GitHub Check: Lint

✨ Finishing Touches

📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

codecov · 2025-06-13T04:31:19Z

Codecov Report

Attention: Patch coverage is 84.61538% with 14 lines in your changes missing coverage. Please review.

Project coverage is 55.62%. Comparing base (899ffd1) to head (541a2cc).

Files with missing lines	Patch %	Lines
packages/backend/server/src/core/doc/reader.ts	73.46%	13 Missing ⚠️
...ckages/backend/server/src/core/utils/blocksuite.ts	96.42%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           canary   #12812      +/-   ##
==========================================
+ Coverage   55.59%   55.62%   +0.02%     
==========================================
  Files        2658     2658              
  Lines      125982   126073      +91     
  Branches    19961    19968       +7     
==========================================
+ Hits        70042    70124      +82     
- Misses      53601    53610       +9     
  Partials     2339     2339

Flag	Coverage Δ
server-test	`78.88% <84.61%> (+0.01%)`	⬆️
unittest	`31.67% <ø> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (10)

packages/backend/server/src/core/utils/blocksuite.ts (2)
11-14: Avoid ambiguous aliasing – keep original helper name to aid readability

parsePageDoc is re-exported under the local alias parseDocToMarkdown.
Because the file later introduces another public function named parseDocToMarkdownFromDocSnapshot, this aliasing makes it harder to reason about which symbol is doing the heavy lifting. Keeping the original import name (or choosing a more explicit alias such as parsePageDocToMarkdown) markedly reduces mental overhead.
-import {
-  parsePageDoc as parseDocToMarkdown,
+import {
+  parsePageDoc as parsePageDocToMarkdown,
and adjust references accordingly.

201-226: URL helper uses singular “workspace” – check for broken links

buildDocUrl returns /workspace/${workspaceId}/${docId} (singular workspace).
Elsewhere in the backend the canonical route is /workspaces/:workspaceId/... (plural). Rendering markdown with wrong links will yield 404s in the UI.
-    buildDocUrl: (docId: string) => {
-      return `/workspace/${workspaceId}/${docId}`;
+    buildDocUrl: (docId: string) => {
+      return `/workspaces/${workspaceId}/${docId}`;
     },
Please confirm the expected URL schema throughout the client before applying.
packages/backend/server/src/core/utils/__tests__/blocksute.spec.ts (1)
93-101: Test does not assert non-null result – potential false positives

The snapshot will happily record null, so the test would still pass even if markdown parsing failed. Add an explicit truthy check to guard against silent regressions.
-  const result = parseDocToMarkdownFromDocSnapshot(
+  const result = parseDocToMarkdownFromDocSnapshot(
     workspace.id,
     docSnapshot.id,
     docSnapshot.blob
   );
+  t.truthy(result);
   t.snapshot(result);
packages/backend/server/src/core/doc/__tests__/reader-from-database.spec.ts (1)
267-274: Strengthen assertion before snapshot

Same issue as in the util‐test: if getDocMarkdown returns null, the snapshot will still pass. Add a sanity check.
-  const result = await docReader.getDocMarkdown(workspace.id, docSnapshot.id);
+  const result = await docReader.getDocMarkdown(workspace.id, docSnapshot.id);
+  t.truthy(result);
   t.snapshot(result);
packages/backend/server/src/core/doc-service/controller.ts (1)

45-57: Consider exposing content type or headers for markdown endpoint

getDocMarkdown returns a JSON object containing title & markdown, yet consumers might reasonably expect raw text/markdown. Clarify the contract (e.g. /markdown → JSON vs /markdown/raw → plain MD) or at least document it. Returning the markdown directly with res.type('text/markdown') could avoid an extra unwrap on the client side.

packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts (1)

379-407: Large snapshot duplication inflates repo & review noise.

The exact same ~120-line markdown blob is now stored in three snapshot files. Consider:

Extracting it to a fixture (fixtures/affine-doc.md) and snapshotting only a hash or fs.readFileSync result.

Using an AVA macro to share a single snapshot across suites.

That keeps repo size down and avoids triple updates whenever the markdown changes.

packages/backend/server/src/__tests__/e2e/doc-service/controller.spec.ts (2)

19-25: Token binding could be tightened to the workspace for better security.

crypto.sign(docSnapshot.id) signs only the document id, so any user who learns a valid token for one doc can try it against other workspaces.
Consider signing both identifiers (e.g. workspaceId:docId) or embedding a nonce.

This is an optional hardening step but cheap to adopt now.

28-42: Minor: you’re asserting the full Content-Type string; matching only the MIME (/application\/json/) avoids failures if charset changes.
packages/backend/server/src/core/doc/reader.ts (2)
37-41: Interface added, but consider future-proof naming

DocMarkdown currently contains only title and markdown.
If we expect front-matter (tags, createdAt, etc.) later, a more generic DocMarkdownContent (or similar) would leave room for extension without a breaking rename.

329-354: RPC fallback path duplicates logging logic

The catch-block here is an almost verbatim copy of the one in getDoc / getDocDiff.
Refactor to a private helper to reduce duplication and guarantee consistent logging semantics.
-    } catch (e) {
-      if (e instanceof UserFriendlyError) {
-        throw e;
-      }
-      const err = e as Error;
-      ...
-      return await super.getDocMarkdown(workspaceId, docId);
-    }
+    } catch (e) {
+      return await this.handleRpcError(
+        e,
+        `doc markdown ${url}`,
+        () => super.getDocMarkdown(workspaceId, docId)
+      );
+    }
(Helper would encapsulate the existing logic.)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c844786 and e4d26eb.

⛔ Files ignored due to path filters (4)

packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.snap is excluded by !**/*.snap
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.snap is excluded by !**/*.snap
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.snap is excluded by !**/*.snap
packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.snap is excluded by !**/*.snap

📒 Files selected for processing (12)

packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.md (1 hunks)
packages/backend/server/src/__tests__/e2e/doc-service/controller.spec.ts (1 hunks)
packages/backend/server/src/core/doc-service/__tests__/controller.spec.ts (4 hunks)
packages/backend/server/src/core/doc-service/controller.ts (1 hunks)
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.md (1 hunks)
packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.md (1 hunks)
packages/backend/server/src/core/doc/__tests__/reader-from-database.spec.ts (1 hunks)
packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts (3 hunks)
packages/backend/server/src/core/doc/reader.ts (5 hunks)
packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.md (1 hunks)
packages/backend/server/src/core/utils/__tests__/blocksute.spec.ts (2 hunks)
packages/backend/server/src/core/utils/blocksuite.ts (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (55)

GitHub Check: Unit Test (3)
GitHub Check: Unit Test (4)
GitHub Check: Unit Test (1)
GitHub Check: Unit Test (2)
GitHub Check: Unit Test (5)
GitHub Check: Native Unit Test
GitHub Check: test-build-mobile-app / build-android-web
GitHub Check: test-build-mobile-app / build-ios-web
GitHub Check: fuzzing
GitHub Check: E2E Test (1)
GitHub Check: y-octo binding test on x86_64-pc-windows-msvc
GitHub Check: y-octo binding test on x86_64-apple-darwin
GitHub Check: Run native tests
GitHub Check: y-octo binding test on aarch64-pc-windows-msvc
GitHub Check: E2E BlockSuite Test (6)
GitHub Check: E2E BlockSuite Test (7)
GitHub Check: E2E BlockSuite Test (2)
GitHub Check: E2E BlockSuite Test (8)
GitHub Check: E2E BlockSuite Test (9)
GitHub Check: E2E BlockSuite Test (10)
GitHub Check: E2E BlockSuite Test (5)
GitHub Check: E2E BlockSuite Test (4)
GitHub Check: E2E BlockSuite Test (1)
GitHub Check: E2E BlockSuite Test (3)
GitHub Check: E2E Test (10)
GitHub Check: Build Server native
GitHub Check: E2E Test (6)
GitHub Check: E2E Test (4)
GitHub Check: E2E Test (8)
GitHub Check: E2E Test (5)
GitHub Check: E2E Test (7)
GitHub Check: E2E Test (2)
GitHub Check: E2E Test (3)
GitHub Check: E2E Test (9)
GitHub Check: E2E BlockSuite Cross Browser Test (1, chromium)
GitHub Check: E2E Mobile Test (5)
GitHub Check: E2E Mobile Test (4)
GitHub Check: loom thread test
GitHub Check: E2E Mobile Test (3)
GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
GitHub Check: E2E Mobile Test (2)
GitHub Check: E2E Mobile Test (1)
GitHub Check: Build AFFiNE native (x86_64-pc-windows-msvc)
GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
GitHub Check: E2E BlockSuite Cross Browser Test (1, webkit)
GitHub Check: Build @affine/electron renderer
GitHub Check: E2E BlockSuite Cross Browser Test (2, firefox)
GitHub Check: E2E BlockSuite Cross Browser Test (1, firefox)
GitHub Check: Analyze (typescript, blocksuite)
GitHub Check: E2E BlockSuite Cross Browser Test (2, chromium)
GitHub Check: Analyze (javascript, affine)
GitHub Check: Analyze (typescript, affine)
GitHub Check: Analyze (javascript, blocksuite)
GitHub Check: Typecheck
GitHub Check: Lint

🔇 Additional comments (9)

packages/backend/server/src/core/utils/blocksuite.ts (1)

211-220: Future-proof for possible async return value

If @affine/reader/dist.parsePageDocToMarkdown (currently synchronous) ever becomes async, this wrapper will silently return a pending Promise.
Safest is to await and mark this function async now – zero runtime cost today, avoids a class of bugs later.
-export function parseDocToMarkdownFromDocSnapshot(
+export async function parseDocToMarkdownFromDocSnapshot(
 ...
-) {
 ...
-  const parsed = parseDocToMarkdown({
+  const parsed = await parsePageDocToMarkdown({
packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.md (1)

1-119: Snapshot file – nothing actionable.

packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.md (1)

1-107: Snapshot file – nothing actionable.

packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.md (1)

1-107: Snapshot file – nothing actionable.

packages/backend/server/src/core/doc-service/__tests__/controller.spec.ts (4)

178-193: Good addition: explicitly asserting JSON Content-Type.

The extra Content-Type expectations tighten the contract of the endpoint and protect against accidental regressions (e.g. serving plain text).
Looks correct and stable.

210-215: Consistent header assertion retained

Same positive remark as above – thanks for keeping the tests symmetric between full=true and default branches.

258-278: 👍 New happy-path test for /markdown endpoint

The new test covers:

endpoint routing

DB reader wiring through the mock

correct headers + body

Nothing missing here.

280-297: 404 branch well-covered

Negative path mirrors other endpoints; test logic & assertions LGTM.

packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.md (1)

1370-1470: Approve new snapshot for markdown parsing
The freshly added snapshot for can parse doc to markdown from doc snapshot accurately captures the expected markdown string and title output. The headings, blank lines, lists, links, and table formatting conform to the feature’s requirements and ensure reliable regression testing.

packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts

coderabbitai · 2025-06-18T05:46:54Z

packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts

+const module = await createModule({
+  imports: [DocStorageModule],
+});
+


🛠️ Refactor suggestion

Avoid shadowing Node’s built-in module identifier & heavy top-level await.

Declaring a variable literally named module hides Node’s built-in module (always defined in CJS, sometimes poly-filled by tooling). That invites “cannot read property of undefined”–style head-scratchers when helpers assume the global still exists.

Kicking off createModule with top-level await means expensive I/O runs as soon as the file is required, even when AVA filters tests. Moving it into test.before() keeps the test runner fast and predictable.

-const module = await createModule({ - imports: [DocStorageModule], -}); +// Initialise lazily inside a hook, and use a less confusing name. +let testModule: Awaited<ReturnType<typeof createModule>>; + +test.before(async () => { + testModule = await createModule({ imports: [DocStorageModule] }); +});

(You’ll need to update subsequent references from module → testModule.)

🤖 Prompt for AI Agents

In packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts around lines 22 to 25, rename the variable `module` to `testModule` to avoid shadowing Node's built-in `module` identifier. Also, move the `await createModule({ imports: [DocStorageModule] })` call from the top-level to inside a `test.before()` hook to prevent expensive I/O from running immediately on file load. Update all subsequent references from `module` to `testModule` accordingly.

coderabbitai · 2025-06-18T05:46:55Z

packages/backend/server/src/core/doc/reader.ts

+  abstract getDocMarkdown(
+    workspaceId: string,
+    docId: string
+  ): Promise<DocMarkdown | null>;
+


🛠️ Refactor suggestion

Abstract method lacks caching strategy

getDocMarkdown is introduced without a cache analogue to getDocContent.
Rendering markdown from a YDoc snapshot is CPU-heavy and repeatedly hitting the DB on every request is avoidable.

+ // TODO: consider `getDocMarkdownWithoutCache` + 7-day cache, symmetric with doc content

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

abstract getDocMarkdown(

workspaceId: string,

docId: string

): Promise<DocMarkdown | null>;

// TODO: consider `getDocMarkdownWithoutCache` + 7-day cache, symmetric with doc content

abstract getDocMarkdown(

workspaceId: string,

docId: string

): Promise<DocMarkdown | null>;

🤖 Prompt for AI Agents

In packages/backend/server/src/core/doc/reader.ts around lines 68 to 72, the abstract method getDocMarkdown lacks a caching mechanism similar to getDocContent. To fix this, introduce a caching strategy for getDocMarkdown to store and retrieve rendered markdown results, reducing CPU load and database hits. Implement a cache layer that checks for existing markdown before rendering from the YDoc snapshot and fetching from the database.

coderabbitai · 2025-06-18T05:46:55Z

packages/backend/server/src/core/doc/reader.ts

+  async getDocMarkdown(
+    workspaceId: string,
+    docId: string
+  ): Promise<DocMarkdown | null> {
+    const doc = await this.workspace.getDoc(workspaceId, docId);
+    if (!doc) {
+      return null;
+    }
+    return parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);
+  }


⚠️ Potential issue

Database implementation: one missing null-guard

parseDocToMarkdownFromDocSnapshot can theoretically return null/undefined (e.g. corrupted snapshot).
If that happens we’ll violate the declared return type and propagate an undefined value.

- return parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin); + const markdown = parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin); + return markdown ?? null;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async getDocMarkdown(

workspaceId: string,

docId: string

): Promise<DocMarkdown | null> {

const doc = await this.workspace.getDoc(workspaceId, docId);

if (!doc) {

return null;

}

return parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);

}

async getDocMarkdown(

workspaceId: string,

docId: string

): Promise<DocMarkdown | null> {

const doc = await this.workspace.getDoc(workspaceId, docId);

if (!doc) {

return null;

}

const markdown = parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);

return markdown ?? null;

}

🤖 Prompt for AI Agents

In packages/backend/server/src/core/doc/reader.ts around lines 185 to 194, the function getDocMarkdown calls parseDocToMarkdownFromDocSnapshot which may return null or undefined, violating the declared return type Promise<DocMarkdown | null>. To fix this, add a null check after calling parseDocToMarkdownFromDocSnapshot and explicitly return null if the result is null or undefined, ensuring the function always returns a value matching its declared type.

github-project-automation bot added this to AFFiNE Project Jun 13, 2025

fengmk2 mentioned this pull request Jun 13, 2025

feat(server): add read doc tool #12811

Open

github-actions bot added app:server test Related to test cases labels Jun 13, 2025

fengmk2 force-pushed the read-doc-markdown branch from 03e15d8 to dd21429 Compare June 13, 2025 04:25

fengmk2 force-pushed the read-doc-markdown branch 5 times, most recently from 016adc0 to a9e4dd0 Compare June 18, 2025 02:03

fengmk2 mentioned this pull request Jun 18, 2025

feat(server): parse ydoc to ai editable markdown format #12846

Open

fengmk2 force-pushed the read-doc-markdown branch from a9e4dd0 to e4d26eb Compare June 18, 2025 05:41

fengmk2 marked this pull request as ready for review June 18, 2025 05:41

fengmk2 requested review from darkskygit and pengx17 June 18, 2025 05:41

coderabbitai bot reviewed Jun 18, 2025

View reviewed changes

feat(server): parse ydoc to markdown

541a2cc

fengmk2 force-pushed the read-doc-markdown branch from e4d26eb to 541a2cc Compare June 18, 2025 06:03

forehalo approved these changes Jun 18, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat(server): parse ydoc to markdown #12812

feat(server): parse ydoc to markdown #12812

fengmk2 commented Jun 13, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Jun 13, 2025 •

edited

Loading

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

Uh oh!

codecov bot commented Jun 13, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot Jun 18, 2025

Uh oh!

coderabbitai bot Jun 18, 2025

Uh oh!

coderabbitai bot Jun 18, 2025

Uh oh!

Uh oh!

Uh oh!

feat(server): parse ydoc to markdown #12812

Are you sure you want to change the base?

feat(server): parse ydoc to markdown #12812

Conversation

fengmk2 commented Jun 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

PR Dependency Tree

Uh oh!

coderabbitai bot commented Jun 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Assessment against linked issues

Assessment against linked issues: Out-of-scope changes

Possibly related PRs

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

codecov bot commented Jun 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Jun 18, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jun 18, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jun 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

fengmk2 commented Jun 13, 2025 •

edited

Loading

coderabbitai bot commented Jun 13, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

codecov bot commented Jun 13, 2025 •

edited

Loading