Skip to content

feat(server): parse ydoc to markdown #12812

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: canary
Choose a base branch
from
Open

Conversation

fengmk2
Copy link
Member

@fengmk2 fengmk2 commented Jun 13, 2025

close AI-190

Summary by CodeRabbit

  • New Features

    • Introduced an endpoint to retrieve a document's markdown content and title.
    • Added backend support for parsing document snapshots directly into markdown format.
  • Tests

    • Added comprehensive tests and snapshot files for markdown retrieval, including success and error scenarios.
    • Improved test coverage for content type validation and markdown parsing utilities.
  • Documentation

    • Enhanced internal documentation through detailed test cases and snapshot references for new markdown features.

PR Dependency Tree

This tree was auto-generated by Charcoal

Copy link

coderabbitai bot commented Jun 13, 2025

Walkthrough

A new feature was added to support retrieving a document's complete markdown content via a backend API. This includes a new endpoint, utility functions for parsing document snapshots to markdown, test cases (unit and end-to-end), and snapshot files to verify output correctness. No existing exported entities were changed.

Changes

File(s) Change Summary
.../src/core/utils/blocksuite.ts Added parseDocToMarkdownFromDocSnapshot function for parsing Yjs doc snapshots to markdown.
.../src/core/doc/reader.ts Added DocMarkdown interface and getDocMarkdown method to DocReader and its subclasses.
.../src/core/doc-service/controller.ts Added /workspaces/:workspaceId/docs/:docId/markdown endpoint to return markdown content.
.../src/core/utils/tests/blocksute.spec.ts
.../blocksute.spec.ts.md
Added a test and snapshot for parsing doc snapshot to markdown.
.../src/core/doc/tests/reader-from-database.spec.ts
.../reader-from-database.spec.ts.md
Added tests and snapshots for getDocMarkdown in the database doc reader.
.../src/core/doc/tests/reader-from-rpc.spec.ts
.../reader-from-rpc.spec.ts.md
Added tests and snapshots for getDocMarkdown in the RPC doc reader.
.../src/core/doc-service/tests/controller.spec.ts Added/updated tests for content type and markdown retrieval in doc service controller.
.../src/tests/e2e/doc-service/controller.spec.ts
.../controller.spec.ts.md
Added end-to-end tests and snapshots for the markdown retrieval API endpoint.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Controller
    participant DocReader
    participant Database
    participant RPCService

    Client->>Controller: GET /workspaces/:wsId/docs/:docId/markdown
    Controller->>DocReader: getDocMarkdown(wsId, docId)
    alt RPC DocReader
        DocReader->>RPCService: fetch markdown (wsId, docId)
        alt Success
            RPCService-->>DocReader: {title, markdown}
            DocReader-->>Controller: {title, markdown}
        else Failure (network, etc.)
            DocReader->>Database: getDocMarkdown(wsId, docId)
            Database-->>DocReader: {title, markdown} or null
            DocReader-->>Controller: {title, markdown} or null
        end
    else Database DocReader
        DocReader->>Database: getDocMarkdown(wsId, docId)
        Database-->>DocReader: {title, markdown} or null
        DocReader-->>Controller: {title, markdown} or null
    end
    alt Found
        Controller-->>Client: 200 OK, JSON {title, markdown}
    else Not found
        Controller-->>Client: 404 Not Found, error JSON
    end
Loading

Assessment against linked issues

Objective Addressed Explanation
Implement doc reader to retrieve complete markdown (AI-190)
Markdown output must conform to specified format/rules (AI-190)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes found.

Possibly related PRs

  • toeverything/AFFiNE#12840: Introduces or moves the markdown parsing logic (parsePageDoc) used in this PR's new utilities and endpoints.
  • toeverything/AFFiNE#12625: Adjusts document snapshot reading utilities, which are also extended in this PR for markdown parsing.

Suggested reviewers

  • forehalo

Poem

In the warren where code bunnies dwell,
Markdown magic, we now compel!
Snapshots abound, tests hop in line,
New endpoints serve content just fine.
With every doc, a story to share—
🐇 Happy coding, everywhere!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4d26eb and 541a2cc.

⛔ Files ignored due to path filters (4)
  • packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.snap is excluded by !**/*.snap
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.snap is excluded by !**/*.snap
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.snap is excluded by !**/*.snap
  • packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.snap is excluded by !**/*.snap
📒 Files selected for processing (12)
  • packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.md (1 hunks)
  • packages/backend/server/src/__tests__/e2e/doc-service/controller.spec.ts (1 hunks)
  • packages/backend/server/src/core/doc-service/__tests__/controller.spec.ts (4 hunks)
  • packages/backend/server/src/core/doc-service/controller.ts (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.md (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.md (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/reader-from-database.spec.ts (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts (3 hunks)
  • packages/backend/server/src/core/doc/reader.ts (5 hunks)
  • packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.md (1 hunks)
  • packages/backend/server/src/core/utils/__tests__/blocksute.spec.ts (2 hunks)
  • packages/backend/server/src/core/utils/blocksuite.ts (2 hunks)
✅ Files skipped from review due to trivial changes (3)
  • packages/backend/server/src/tests/e2e/doc-service/snapshots/controller.spec.ts.md
  • packages/backend/server/src/core/doc/tests/snapshots/reader-from-database.spec.ts.md
  • packages/backend/server/src/tests/e2e/doc-service/controller.spec.ts
🚧 Files skipped from review as they are similar to previous changes (9)
  • packages/backend/server/src/core/utils/blocksuite.ts
  • packages/backend/server/src/core/utils/tests/blocksute.spec.ts
  • packages/backend/server/src/core/doc/tests/snapshots/reader-from-rpc.spec.ts.md
  • packages/backend/server/src/core/utils/tests/snapshots/blocksute.spec.ts.md
  • packages/backend/server/src/core/doc/tests/reader-from-rpc.spec.ts
  • packages/backend/server/src/core/doc-service/controller.ts
  • packages/backend/server/src/core/doc/tests/reader-from-database.spec.ts
  • packages/backend/server/src/core/doc-service/tests/controller.spec.ts
  • packages/backend/server/src/core/doc/reader.ts
⏰ Context from checks skipped due to timeout of 90000ms (56)
  • GitHub Check: Unit Test (5)
  • GitHub Check: Unit Test (2)
  • GitHub Check: Unit Test (4)
  • GitHub Check: Unit Test (3)
  • GitHub Check: Unit Test (1)
  • GitHub Check: Native Unit Test
  • GitHub Check: test-build-mobile-app / build-android-web
  • GitHub Check: test-build-mobile-app / build-ios-web
  • GitHub Check: loom thread test
  • GitHub Check: y-octo binding test on aarch64-pc-windows-msvc
  • GitHub Check: fuzzing
  • GitHub Check: E2E BlockSuite Test (9)
  • GitHub Check: y-octo binding test on x86_64-pc-windows-msvc
  • GitHub Check: E2E Test (6)
  • GitHub Check: y-octo binding test on aarch64-apple-darwin
  • GitHub Check: y-octo binding test on x86_64-apple-darwin
  • GitHub Check: E2E BlockSuite Test (7)
  • GitHub Check: E2E BlockSuite Test (6)
  • GitHub Check: Run native tests
  • GitHub Check: E2E BlockSuite Test (2)
  • GitHub Check: E2E BlockSuite Test (10)
  • GitHub Check: E2E BlockSuite Test (8)
  • GitHub Check: E2E BlockSuite Test (5)
  • GitHub Check: E2E BlockSuite Test (3)
  • GitHub Check: E2E Test (7)
  • GitHub Check: E2E BlockSuite Test (4)
  • GitHub Check: E2E BlockSuite Test (1)
  • GitHub Check: E2E Test (10)
  • GitHub Check: E2E Test (9)
  • GitHub Check: E2E Test (4)
  • GitHub Check: E2E Test (2)
  • GitHub Check: E2E Test (8)
  • GitHub Check: E2E Test (5)
  • GitHub Check: E2E Test (3)
  • GitHub Check: E2E Mobile Test (3)
  • GitHub Check: E2E Mobile Test (2)
  • GitHub Check: E2E Test (1)
  • GitHub Check: E2E Mobile Test (4)
  • GitHub Check: E2E Mobile Test (5)
  • GitHub Check: E2E Mobile Test (1)
  • GitHub Check: Typecheck
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
  • GitHub Check: Build @affine/electron renderer
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, firefox)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, chromium)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, chromium)
  • GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, firefox)
  • GitHub Check: Build AFFiNE native (x86_64-pc-windows-msvc)
  • GitHub Check: Analyze (typescript, blocksuite)
  • GitHub Check: Analyze (typescript, affine)
  • GitHub Check: Build Server native
  • GitHub Check: Analyze (javascript, affine)
  • GitHub Check: Analyze (javascript, blocksuite)
  • GitHub Check: Lint
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added app:server test Related to test cases labels Jun 13, 2025
@fengmk2 fengmk2 force-pushed the read-doc-markdown branch from 03e15d8 to dd21429 Compare June 13, 2025 04:25
Copy link

codecov bot commented Jun 13, 2025

Codecov Report

Attention: Patch coverage is 84.61538% with 14 lines in your changes missing coverage. Please review.

Project coverage is 55.62%. Comparing base (899ffd1) to head (541a2cc).

Files with missing lines Patch % Lines
packages/backend/server/src/core/doc/reader.ts 73.46% 13 Missing ⚠️
...ckages/backend/server/src/core/utils/blocksuite.ts 96.42% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           canary   #12812      +/-   ##
==========================================
+ Coverage   55.59%   55.62%   +0.02%     
==========================================
  Files        2658     2658              
  Lines      125982   126073      +91     
  Branches    19961    19968       +7     
==========================================
+ Hits        70042    70124      +82     
- Misses      53601    53610       +9     
  Partials     2339     2339              
Flag Coverage Δ
server-test 78.88% <84.61%> (+0.01%) ⬆️
unittest 31.67% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@fengmk2 fengmk2 force-pushed the read-doc-markdown branch 5 times, most recently from 016adc0 to a9e4dd0 Compare June 18, 2025 02:03
@fengmk2 fengmk2 force-pushed the read-doc-markdown branch from a9e4dd0 to e4d26eb Compare June 18, 2025 05:41
@fengmk2 fengmk2 marked this pull request as ready for review June 18, 2025 05:41
@fengmk2 fengmk2 requested review from darkskygit and pengx17 June 18, 2025 05:41
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (10)
packages/backend/server/src/core/utils/blocksuite.ts (2)

11-14: Avoid ambiguous aliasing – keep original helper name to aid readability

parsePageDoc is re-exported under the local alias parseDocToMarkdown.
Because the file later introduces another public function named parseDocToMarkdownFromDocSnapshot, this aliasing makes it harder to reason about which symbol is doing the heavy lifting. Keeping the original import name (or choosing a more explicit alias such as parsePageDocToMarkdown) markedly reduces mental overhead.

-import {
-  parsePageDoc as parseDocToMarkdown,
+import {
+  parsePageDoc as parsePageDocToMarkdown,

and adjust references accordingly.


201-226: URL helper uses singular “workspace” – check for broken links

buildDocUrl returns /workspace/${workspaceId}/${docId} (singular workspace).
Elsewhere in the backend the canonical route is /workspaces/:workspaceId/... (plural). Rendering markdown with wrong links will yield 404s in the UI.

-    buildDocUrl: (docId: string) => {
-      return `/workspace/${workspaceId}/${docId}`;
+    buildDocUrl: (docId: string) => {
+      return `/workspaces/${workspaceId}/${docId}`;
     },

Please confirm the expected URL schema throughout the client before applying.

packages/backend/server/src/core/utils/__tests__/blocksute.spec.ts (1)

93-101: Test does not assert non-null result – potential false positives

The snapshot will happily record null, so the test would still pass even if markdown parsing failed. Add an explicit truthy check to guard against silent regressions.

-  const result = parseDocToMarkdownFromDocSnapshot(
+  const result = parseDocToMarkdownFromDocSnapshot(
     workspace.id,
     docSnapshot.id,
     docSnapshot.blob
   );
+  t.truthy(result);
   t.snapshot(result);
packages/backend/server/src/core/doc/__tests__/reader-from-database.spec.ts (1)

267-274: Strengthen assertion before snapshot

Same issue as in the util‐test: if getDocMarkdown returns null, the snapshot will still pass. Add a sanity check.

-  const result = await docReader.getDocMarkdown(workspace.id, docSnapshot.id);
+  const result = await docReader.getDocMarkdown(workspace.id, docSnapshot.id);
+  t.truthy(result);
   t.snapshot(result);
packages/backend/server/src/core/doc-service/controller.ts (1)

45-57: Consider exposing content type or headers for markdown endpoint

getDocMarkdown returns a JSON object containing title & markdown, yet consumers might reasonably expect raw text/markdown. Clarify the contract (e.g. /markdown → JSON vs /markdown/raw → plain MD) or at least document it. Returning the markdown directly with res.type('text/markdown') could avoid an extra unwrap on the client side.

packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts (1)

379-407: Large snapshot duplication inflates repo & review noise.

The exact same ~120-line markdown blob is now stored in three snapshot files. Consider:

  • Extracting it to a fixture (fixtures/affine-doc.md) and snapshotting only a hash or fs.readFileSync result.
  • Using an AVA macro to share a single snapshot across suites.

That keeps repo size down and avoids triple updates whenever the markdown changes.

packages/backend/server/src/__tests__/e2e/doc-service/controller.spec.ts (2)

19-25: Token binding could be tightened to the workspace for better security.

crypto.sign(docSnapshot.id) signs only the document id, so any user who learns a valid token for one doc can try it against other workspaces.
Consider signing both identifiers (e.g. workspaceId:docId) or embedding a nonce.

This is an optional hardening step but cheap to adopt now.


28-42: Minor: you’re asserting the full Content-Type string; matching only the MIME (/application\/json/) avoids failures if charset changes.

packages/backend/server/src/core/doc/reader.ts (2)

37-41: Interface added, but consider future-proof naming

DocMarkdown currently contains only title and markdown.
If we expect front-matter (tags, createdAt, etc.) later, a more generic DocMarkdownContent (or similar) would leave room for extension without a breaking rename.


329-354: RPC fallback path duplicates logging logic

The catch-block here is an almost verbatim copy of the one in getDoc / getDocDiff.
Refactor to a private helper to reduce duplication and guarantee consistent logging semantics.

-    } catch (e) {
-      if (e instanceof UserFriendlyError) {
-        throw e;
-      }
-      const err = e as Error;
-      ...
-      return await super.getDocMarkdown(workspaceId, docId);
-    }
+    } catch (e) {
+      return await this.handleRpcError(
+        e,
+        `doc markdown ${url}`,
+        () => super.getDocMarkdown(workspaceId, docId)
+      );
+    }

(Helper would encapsulate the existing logic.)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c844786 and e4d26eb.

⛔ Files ignored due to path filters (4)
  • packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.snap is excluded by !**/*.snap
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.snap is excluded by !**/*.snap
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.snap is excluded by !**/*.snap
  • packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.snap is excluded by !**/*.snap
📒 Files selected for processing (12)
  • packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.md (1 hunks)
  • packages/backend/server/src/__tests__/e2e/doc-service/controller.spec.ts (1 hunks)
  • packages/backend/server/src/core/doc-service/__tests__/controller.spec.ts (4 hunks)
  • packages/backend/server/src/core/doc-service/controller.ts (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.md (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.md (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/reader-from-database.spec.ts (1 hunks)
  • packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts (3 hunks)
  • packages/backend/server/src/core/doc/reader.ts (5 hunks)
  • packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.md (1 hunks)
  • packages/backend/server/src/core/utils/__tests__/blocksute.spec.ts (2 hunks)
  • packages/backend/server/src/core/utils/blocksuite.ts (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (55)
  • GitHub Check: Unit Test (3)
  • GitHub Check: Unit Test (4)
  • GitHub Check: Unit Test (1)
  • GitHub Check: Unit Test (2)
  • GitHub Check: Unit Test (5)
  • GitHub Check: Native Unit Test
  • GitHub Check: test-build-mobile-app / build-android-web
  • GitHub Check: test-build-mobile-app / build-ios-web
  • GitHub Check: fuzzing
  • GitHub Check: E2E Test (1)
  • GitHub Check: y-octo binding test on x86_64-pc-windows-msvc
  • GitHub Check: y-octo binding test on x86_64-apple-darwin
  • GitHub Check: Run native tests
  • GitHub Check: y-octo binding test on aarch64-pc-windows-msvc
  • GitHub Check: E2E BlockSuite Test (6)
  • GitHub Check: E2E BlockSuite Test (7)
  • GitHub Check: E2E BlockSuite Test (2)
  • GitHub Check: E2E BlockSuite Test (8)
  • GitHub Check: E2E BlockSuite Test (9)
  • GitHub Check: E2E BlockSuite Test (10)
  • GitHub Check: E2E BlockSuite Test (5)
  • GitHub Check: E2E BlockSuite Test (4)
  • GitHub Check: E2E BlockSuite Test (1)
  • GitHub Check: E2E BlockSuite Test (3)
  • GitHub Check: E2E Test (10)
  • GitHub Check: Build Server native
  • GitHub Check: E2E Test (6)
  • GitHub Check: E2E Test (4)
  • GitHub Check: E2E Test (8)
  • GitHub Check: E2E Test (5)
  • GitHub Check: E2E Test (7)
  • GitHub Check: E2E Test (2)
  • GitHub Check: E2E Test (3)
  • GitHub Check: E2E Test (9)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, chromium)
  • GitHub Check: E2E Mobile Test (5)
  • GitHub Check: E2E Mobile Test (4)
  • GitHub Check: loom thread test
  • GitHub Check: E2E Mobile Test (3)
  • GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
  • GitHub Check: E2E Mobile Test (2)
  • GitHub Check: E2E Mobile Test (1)
  • GitHub Check: Build AFFiNE native (x86_64-pc-windows-msvc)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, webkit)
  • GitHub Check: Build @affine/electron renderer
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, firefox)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, firefox)
  • GitHub Check: Analyze (typescript, blocksuite)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, chromium)
  • GitHub Check: Analyze (javascript, affine)
  • GitHub Check: Analyze (typescript, affine)
  • GitHub Check: Analyze (javascript, blocksuite)
  • GitHub Check: Typecheck
  • GitHub Check: Lint
🔇 Additional comments (9)
packages/backend/server/src/core/utils/blocksuite.ts (1)

211-220: Future-proof for possible async return value

If @affine/reader/dist.parsePageDocToMarkdown (currently synchronous) ever becomes async, this wrapper will silently return a pending Promise.
Safest is to await and mark this function async now – zero runtime cost today, avoids a class of bugs later.

-export function parseDocToMarkdownFromDocSnapshot(
+export async function parseDocToMarkdownFromDocSnapshot(
 ...
-) {
 ...
-  const parsed = parseDocToMarkdown({
+  const parsed = await parsePageDocToMarkdown({
packages/backend/server/src/__tests__/e2e/doc-service/__snapshots__/controller.spec.ts.md (1)

1-119: Snapshot file – nothing actionable.

packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-database.spec.ts.md (1)

1-107: Snapshot file – nothing actionable.

packages/backend/server/src/core/doc/__tests__/__snapshots__/reader-from-rpc.spec.ts.md (1)

1-107: Snapshot file – nothing actionable.

packages/backend/server/src/core/doc-service/__tests__/controller.spec.ts (4)

178-193: Good addition: explicitly asserting JSON Content-Type.

The extra Content-Type expectations tighten the contract of the endpoint and protect against accidental regressions (e.g. serving plain text).
Looks correct and stable.


210-215: Consistent header assertion retained

Same positive remark as above – thanks for keeping the tests symmetric between full=true and default branches.


258-278: 👍 New happy-path test for /markdown endpoint

The new test covers:

  • endpoint routing
  • DB reader wiring through the mock
  • correct headers + body

Nothing missing here.


280-297: 404 branch well-covered

Negative path mirrors other endpoints; test logic & assertions LGTM.

packages/backend/server/src/core/utils/__tests__/__snapshots__/blocksute.spec.ts.md (1)

1370-1470: Approve new snapshot for markdown parsing
The freshly added snapshot for can parse doc to markdown from doc snapshot accurately captures the expected markdown string and title output. The headings, blank lines, lists, links, and table formatting conform to the feature’s requirements and ensure reliable regression testing.

Comment on lines +22 to +25
const module = await createModule({
imports: [DocStorageModule],
});

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid shadowing Node’s built-in module identifier & heavy top-level await.

  1. Declaring a variable literally named module hides Node’s built-in module (always defined in CJS, sometimes poly-filled by tooling). That invites “cannot read property of undefined”–style head-scratchers when helpers assume the global still exists.
  2. Kicking off createModule with top-level await means expensive I/O runs as soon as the file is required, even when AVA filters tests. Moving it into test.before() keeps the test runner fast and predictable.
-const module = await createModule({
-  imports: [DocStorageModule],
-});
+// Initialise lazily inside a hook, and use a less confusing name.
+let testModule: Awaited<ReturnType<typeof createModule>>;
+
+test.before(async () => {
+  testModule = await createModule({ imports: [DocStorageModule] });
+});

(You’ll need to update subsequent references from moduletestModule.)

🤖 Prompt for AI Agents
In packages/backend/server/src/core/doc/__tests__/reader-from-rpc.spec.ts around
lines 22 to 25, rename the variable `module` to `testModule` to avoid shadowing
Node's built-in `module` identifier. Also, move the `await createModule({
imports: [DocStorageModule] })` call from the top-level to inside a
`test.before()` hook to prevent expensive I/O from running immediately on file
load. Update all subsequent references from `module` to `testModule`
accordingly.

Comment on lines +68 to +72
abstract getDocMarkdown(
workspaceId: string,
docId: string
): Promise<DocMarkdown | null>;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Abstract method lacks caching strategy

getDocMarkdown is introduced without a cache analogue to getDocContent.
Rendering markdown from a YDoc snapshot is CPU-heavy and repeatedly hitting the DB on every request is avoidable.

+  // TODO: consider `getDocMarkdownWithoutCache` + 7-day cache, symmetric with doc content
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
abstract getDocMarkdown(
workspaceId: string,
docId: string
): Promise<DocMarkdown | null>;
// TODO: consider `getDocMarkdownWithoutCache` + 7-day cache, symmetric with doc content
abstract getDocMarkdown(
workspaceId: string,
docId: string
): Promise<DocMarkdown | null>;
🤖 Prompt for AI Agents
In packages/backend/server/src/core/doc/reader.ts around lines 68 to 72, the
abstract method getDocMarkdown lacks a caching mechanism similar to
getDocContent. To fix this, introduce a caching strategy for getDocMarkdown to
store and retrieve rendered markdown results, reducing CPU load and database
hits. Implement a cache layer that checks for existing markdown before rendering
from the YDoc snapshot and fetching from the database.

Comment on lines +185 to +194
async getDocMarkdown(
workspaceId: string,
docId: string
): Promise<DocMarkdown | null> {
const doc = await this.workspace.getDoc(workspaceId, docId);
if (!doc) {
return null;
}
return parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Database implementation: one missing null-guard

parseDocToMarkdownFromDocSnapshot can theoretically return null/undefined (e.g. corrupted snapshot).
If that happens we’ll violate the declared return type and propagate an undefined value.

-    return parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);
+    const markdown = parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);
+    return markdown ?? null;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async getDocMarkdown(
workspaceId: string,
docId: string
): Promise<DocMarkdown | null> {
const doc = await this.workspace.getDoc(workspaceId, docId);
if (!doc) {
return null;
}
return parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);
}
async getDocMarkdown(
workspaceId: string,
docId: string
): Promise<DocMarkdown | null> {
const doc = await this.workspace.getDoc(workspaceId, docId);
if (!doc) {
return null;
}
const markdown = parseDocToMarkdownFromDocSnapshot(workspaceId, docId, doc.bin);
return markdown ?? null;
}
🤖 Prompt for AI Agents
In packages/backend/server/src/core/doc/reader.ts around lines 185 to 194, the
function getDocMarkdown calls parseDocToMarkdownFromDocSnapshot which may return
null or undefined, violating the declared return type Promise<DocMarkdown |
null>. To fix this, add a null check after calling
parseDocToMarkdownFromDocSnapshot and explicitly return null if the result is
null or undefined, ensuring the function always returns a value matching its
declared type.

@fengmk2 fengmk2 force-pushed the read-doc-markdown branch from e4d26eb to 541a2cc Compare June 18, 2025 06:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app:server test Related to test cases
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

2 participants