
Conversation

Contributor

@Artmann Artmann commented Dec 4, 2025

Fixes #238

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of large image data in Deepnote notebooks using chunked base64 processing to avoid stack overflow.
    • Fixed notebook serialization to safely handle circular references in output metadata.
  • Tests

    • Added tests verifying large image outputs and circular-reference outputs serialize correctly without errors.


Contributor

coderabbitai bot commented Dec 4, 2025

📝 Walkthrough

Updates the Deepnote notebook serialization pipeline to handle edge cases in binary data and object references. Adds a chunked base64 converter to prevent stack overflow when encoding large image outputs. Introduces a deep clone helper that removes true circular references while preserving shared structure. Refactors serialization to validate project/notebook context, convert cells to blocks, clone blocks to remove circular refs, update project metadata.modifiedAt, and emit YAML from the mutated original project. Expands unit tests to cover large binary outputs and circular-reference-containing outputs.

Pre-merge checks

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | Title clearly summarizes the main change: chunked base64 conversion prevents stack overflow when saving notebooks with large outputs. |
| Linked Issues check | ✅ Passed | PR addresses stack overflow during save [#238] by replacing inline base64 conversion with a chunked approach, adding large-data tests, removing circular refs in serialization, and improving logging. |
| Out of Scope Changes check | ✅ Passed | All changes directly target the stack overflow issue: chunked base64 conversion, circular reference handling, debug logging, and large-output test coverage. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9cc2764 and e4fbf59.

📒 Files selected for processing (2)
  • src/notebooks/deepnote/deepnoteDataConverter.ts (2 hunks)
  • src/notebooks/deepnote/deepnoteSerializer.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use l10n.t() for all user-facing strings in TypeScript files
Use typed error classes from src/platform/errors/ instead of generic errors
Use ILogger service instead of console.log for logging
Preserve error details in error messages while scrubbing personally identifiable information (PII)
Prefer async/await over promise chains
Handle cancellation with CancellationToken

Order method, fields and properties first by accessibility (public/private/protected) and then by alphabetical order

Files:

  • src/notebooks/deepnote/deepnoteDataConverter.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
**/*.ts

📄 CodeRabbit inference engine (.github/instructions/typescript.instructions.md)

**/*.ts: ALWAYS check the 'Core - Build' task output for TypeScript compilation errors before running any script or declaring work complete
ALWAYS run npm run format-fix before committing changes to ensure proper code formatting
FIX all TypeScript compilation errors before moving forward with development
Use npm run format-fix to auto-fix TypeScript formatting issues before committing
Use npm run lint to check for linter issues in TypeScript files and attempt to fix before committing

Files:

  • src/notebooks/deepnote/deepnoteDataConverter.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.{ts,tsx}: Don't add the Microsoft copyright header to new files
Use Uri.joinPath() for constructing file paths instead of string concatenation with / to ensure platform-correct path separators
Follow established patterns when importing new packages, using helper imports rather than direct imports (e.g., use import { generateUuid } from '../platform/common/uuid' instead of importing uuid directly)
Add blank lines after const groups and before return statements for readability
Separate third-party and local file imports

Files:

  • src/notebooks/deepnote/deepnoteDataConverter.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
src/notebooks/deepnote/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Deepnote integration located in src/notebooks/deepnote/ with refactored architecture: deepnoteTypes.ts (type definitions), deepnoteNotebookManager.ts (state management), deepnoteNotebookSelector.ts (UI selection logic), deepnoteDataConverter.ts (data transformations), deepnoteSerializer.ts (main serializer/orchestration), deepnoteActivationService.ts (VSCode activation)

Files:

  • src/notebooks/deepnote/deepnoteDataConverter.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
🧬 Code graph analysis (1)
src/notebooks/deepnote/deepnoteSerializer.ts (1)
src/platform/deepnote/deepnoteTypes.ts (2)
  • DeepnoteFile (6-6)
  • DeepnoteBlock (6-6)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build & Test
🔇 Additional comments (5)
src/notebooks/deepnote/deepnoteDataConverter.ts (2)

246-248: LGTM!

Direct use of item.data without redundant new Uint8Array() wrapper. Clean delegation to the chunked helper.


527-542: Solid fix for the stack overflow.

Chunked processing avoids spreading huge arrays. The 8192 chunk size is well within safe limits.

Minor: string concatenation in a loop is O(n²) worst-case for very large outputs. If performance becomes an issue, accumulating chunks in an array and joining once would be faster. Fine for now.
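
For illustration only, here is a minimal sketch of the chunked approach combined with the array-and-join variant suggested above. It is not the code from this PR: the name `uint8ArrayToBase64` and the 8192 chunk size come from this review, while the `chunkSize` parameter, the `parts` accumulator, and the use of `apply` are assumptions made for the example.

```typescript
// Hypothetical sketch; the helper merged in this PR may differ in detail.
const DEFAULT_CHUNK_SIZE = 8192; // chunk size mentioned in this review

function uint8ArrayToBase64(bytes: Uint8Array, chunkSize: number = DEFAULT_CHUNK_SIZE): string {
    // Collect one binary-string fragment per chunk and join once at the end,
    // instead of growing a single string inside the loop.
    const parts: string[] = [];

    for (let offset = 0; offset < bytes.length; offset += chunkSize) {
        // subarray() returns a view over the same buffer, so no bytes are copied here.
        const chunk = bytes.subarray(offset, offset + chunkSize);

        // At most chunkSize arguments are passed per call, which keeps the
        // argument list far below engine limits even for very large inputs.
        parts.push(String.fromCharCode.apply(null, Array.from(chunk)));
    }

    // Every character is in the 0-255 range, which is what btoa() requires.
    return btoa(parts.join(''));
}
```

Calling `uint8ArrayToBase64(new Uint8Array([104, 101, 108, 108, 111]))` encodes the bytes of "hello". The same call on a 1MB array stays within the argument limit because no single `fromCharCode` call receives more than `chunkSize` values.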

src/notebooks/deepnote/deepnoteSerializer.ts (3)

185-189: Direct mutation of originalProject is intentional but worth noting.

Assigning to notebook.blocks and updating metadata.modifiedAt directly mutates the stored project. This works for incremental saves but means the in-memory state drifts from disk if serialization fails after mutation.

Consider cloning before mutation if rollback semantics are needed. Current approach is simpler and likely acceptable.
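
If rollback semantics do become necessary, one possible shape is to build a draft, serialize it, and commit to the stored project only after serialization succeeds. The sketch below is an assumption-heavy illustration: the `Project`, `Notebook`, and `Block` interfaces, the `saveWithRollback` function, and the injected `serialize` callback are all hypothetical simplifications and do not mirror the real Deepnote types or serializer.

```typescript
// Hypothetical, simplified shapes; the real Deepnote types are richer.
interface Block { id: string; }
interface Notebook { id: string; blocks: Block[]; }
interface Project { metadata: { modifiedAt?: string }; notebooks: Notebook[]; }

function saveWithRollback(
    original: Project,
    notebookId: string,
    blocks: Block[],
    serialize: (project: Project) => string // stand-in for the YAML dump step
): string {
    // Build a draft that shares untouched notebooks with the original
    // but carries the new blocks and timestamp.
    const draft: Project = {
        ...original,
        metadata: { ...original.metadata, modifiedAt: new Date().toISOString() },
        notebooks: original.notebooks.map((notebook) =>
            notebook.id === notebookId ? { ...notebook, blocks } : notebook
        )
    };

    // If this throws, the stored project has not been touched.
    const output = serialize(draft);

    // Commit only after serialization succeeded.
    original.metadata = draft.metadata;
    original.notebooks = draft.notebooks;

    return output;
}
```

The trade-off is a shallow copy on every save in exchange for in-memory state that cannot drift from disk on a failed serialization.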


144-200: Debug logging is appropriate for diagnosing save issues.

Uses logger.debug, no PII, tracks the serialization pipeline. Aligns with PR objectives for improved observability.


12-43: Recursion stack pattern correctly implemented.

The try/finally with seen.delete() properly preserves shared references while only dropping true cycles. The function is used to clean up VS Code-introduced circular references before YAML serialization (which also has noRefs: true as a safeguard).

The concern about Date, Map, Set, RegExp being converted to plain objects is theoretically valid but not a practical issue here: convertCellsToBlocks produces plain objects with primitive properties, and Deepnote's JSON-based format doesn't contain these special types. No action needed.
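
As a rough illustration of the recursion-stack pattern described above, the sketch below tracks only the objects on the current traversal path, so a repeated (shared) reference is cloned again while a reference back to an ancestor is dropped. The name `cloneWithoutCircularRefs` comes from this PR, but the body is an assumption written for this example and may differ from the merged code. In particular, this sketch also omits object properties whose value is `undefined`, and it treats every non-array object as a plain object.

```typescript
// Hypothetical sketch of the pattern; not the implementation from this PR.
function cloneWithoutCircularRefs<T>(value: T, seen: Set<object> = new Set<object>()): T {
    if (value === null || typeof value !== 'object') {
        return value; // primitives and functions are returned unchanged
    }

    const current = value as unknown as object;

    if (seen.has(current)) {
        // Already on the current recursion path: this is a true cycle, so drop it.
        return undefined as unknown as T;
    }

    seen.add(current);

    try {
        if (Array.isArray(value)) {
            // A cyclic element becomes undefined so that indices stay stable.
            return value.map((item) => cloneWithoutCircularRefs(item, seen)) as unknown as T;
        }

        const source = value as unknown as Record<string, unknown>;
        const result: Record<string, unknown> = {};

        for (const key of Object.keys(source)) {
            const cloned = cloneWithoutCircularRefs(source[key], seen);

            if (cloned !== undefined) {
                result[key] = cloned;
            }
        }

        return result as unknown as T;
    } finally {
        // Leaving this node: remove it so siblings that share it are not mistaken for cycles.
        seen.delete(current);
    }
}
```

With `const shared = { n: 1 }` referenced from two keys of a root object that also points to itself, the clone keeps both copies of `shared` and omits only the self-reference.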




codecov bot commented Dec 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0%. Comparing base (b8684a9) to head (e4fbf59).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@     Coverage Diff     @@
##   main   #239   +/-   ##
===========================
===========================

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between b8684a9 and 9cc2764.

📒 Files selected for processing (4)
  • src/notebooks/deepnote/deepnoteDataConverter.ts (2 hunks)
  • src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts (2 hunks)
  • src/notebooks/deepnote/deepnoteSerializer.ts (2 hunks)
  • src/notebooks/deepnote/deepnoteSerializer.unit.test.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.unit.test.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Create unit tests in *.unit.test.ts files

**/*.unit.test.ts: Unit tests use Mocha/Chai framework with .unit.test.ts extension
Test files should be placed alongside the source files they test
Use assert.deepStrictEqual() for object comparisons instead of checking individual properties

Files:

  • src/notebooks/deepnote/deepnoteSerializer.unit.test.ts
  • src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts
**/*.test.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Create integration tests in *.test.ts files (not *.unit.test.ts)

**/*.test.ts: ALWAYS check the 'Unittest - Build' task output for TypeScript compilation errors before running any script or declaring work complete
When a mock is returned from a promise in unit tests, ensure the mocked instance has an undefined then property to avoid hanging tests
Use npm run test:unittests for TypeScript unit tests before committing changes

Files:

  • src/notebooks/deepnote/deepnoteSerializer.unit.test.ts
  • src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use l10n.t() for all user-facing strings in TypeScript files
Use typed error classes from src/platform/errors/ instead of generic errors
Use ILogger service instead of console.log for logging
Preserve error details in error messages while scrubbing personally identifiable information (PII)
Prefer async/await over promise chains
Handle cancellation with CancellationToken

Order method, fields and properties first by accessibility (public/private/protected) and then by alphabetical order

Files:

  • src/notebooks/deepnote/deepnoteSerializer.unit.test.ts
  • src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
  • src/notebooks/deepnote/deepnoteDataConverter.ts
**/*.ts

📄 CodeRabbit inference engine (.github/instructions/typescript.instructions.md)

**/*.ts: ALWAYS check the 'Core - Build' task output for TypeScript compilation errors before running any script or declaring work complete
ALWAYS run npm run format-fix before committing changes to ensure proper code formatting
FIX all TypeScript compilation errors before moving forward with development
Use npm run format-fix to auto-fix TypeScript formatting issues before committing
Use npm run lint to check for linter issues in TypeScript files and attempt to fix before committing

Files:

  • src/notebooks/deepnote/deepnoteSerializer.unit.test.ts
  • src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
  • src/notebooks/deepnote/deepnoteDataConverter.ts
src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.{ts,tsx}: Don't add the Microsoft copyright header to new files
Use Uri.joinPath() for constructing file paths instead of string concatenation with / to ensure platform-correct path separators
Follow established patterns when importing new packages, using helper imports rather than direct imports (e.g., use import { generateUuid } from '../platform/common/uuid' instead of importing uuid directly)
Add blank lines after const groups and before return statements for readability
Separate third-party and local file imports

Files:

  • src/notebooks/deepnote/deepnoteSerializer.unit.test.ts
  • src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
  • src/notebooks/deepnote/deepnoteDataConverter.ts
src/notebooks/deepnote/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Deepnote integration located in src/notebooks/deepnote/ with refactored architecture: deepnoteTypes.ts (type definitions), deepnoteNotebookManager.ts (state management), deepnoteNotebookSelector.ts (UI selection logic), deepnoteDataConverter.ts (data transformations), deepnoteSerializer.ts (main serializer/orchestration), deepnoteActivationService.ts (VSCode activation)

Files:

  • src/notebooks/deepnote/deepnoteSerializer.unit.test.ts
  • src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts
  • src/notebooks/deepnote/deepnoteSerializer.ts
  • src/notebooks/deepnote/deepnoteDataConverter.ts
🧬 Code graph analysis (2)
src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts (1)
src/test/mocks/vsc/extHostedTypes.ts (3)
  • NotebookCellData (2546-2572)
  • NotebookCellOutput (71-103)
  • NotebookCellOutputItem (16-69)
src/notebooks/deepnote/deepnoteSerializer.ts (2)
src/platform/logging/index.ts (1)
  • logger (35-48)
src/platform/deepnote/deepnoteTypes.ts (2)
  • DeepnoteFile (6-6)
  • DeepnoteBlock (6-6)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build & Test
🔇 Additional comments (5)
src/notebooks/deepnote/deepnoteSerializer.unit.test.ts (1)

328-397: Good targeted regression test for circular output metadata

This suite cleanly reproduces the circular metadata.self scenario and verifies that serializeNotebook no longer blows up while still emitting valid YAML containing the project ID. Coverage looks appropriate for the circular‑reference fix.

src/notebooks/deepnote/deepnoteDataConverter.unit.test.ts (2)

2-2: Importing notebook output types is consistent with the converter’s expectations

Bringing in NotebookCellOutput and NotebookCellOutputItem from vscode lets the tests construct outputs with the same shape the converter handles, which keeps these tests close to real usage.


801-846: Large image output test covers the stack‑overflow regression well

The 1MB Uint8Array setup and subsequent assertions that a single code block with a base64‑encoded 'image/png' payload is produced give solid coverage for large binary outputs without over‑asserting internals.

src/notebooks/deepnote/deepnoteDataConverter.ts (1)

527-542: Chunked uint8ArrayToBase64 implementation is appropriate for large outputs

The chunked loop with chunkSize = 8192 avoids the previous argument explosion while preserving the straightforward btoa(binaryString) behavior. Given the 1MB test coverage, this looks sufficient for typical Deepnote outputs without introducing unnecessary complexity.

src/notebooks/deepnote/deepnoteSerializer.ts (1)

139-195: Updated serialize path and logging align with circular‑ref and overflow fixes

The added debug logging around project/notebook resolution, block conversion, and YAML dumping is useful and stays within the logger abstraction. Converting cells -> blocks, then assigning notebook.blocks = cloneWithoutCircularRefs<DeepnoteBlock[]>(blocks) before updating metadata.modifiedAt and calling yaml.dump is a sensible way to both drop problematic cycles and keep the original project structure intact. This ties in well with the new circular‑reference test and the large‑output handling in the converter.

@Artmann Artmann marked this pull request as ready for review December 4, 2025 11:50
@Artmann Artmann requested a review from a team as a code owner December 4, 2025 11:50
@dinohamzic dinohamzic self-requested a review December 4, 2025 14:12
@Artmann Artmann merged commit 2590146 into main Dec 4, 2025
13 checks passed
@Artmann Artmann deleted the chris/fix-save-maximum-stack-size-errors branch December 4, 2025 14:13

Development

Successfully merging this pull request may close these issues.

Failed to save projected

3 participants