fix: harden conversion path for ipynbs#8795
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Pull request overview
This PR hardens the Jupyter (.ipynb) → marimo IR conversion pipeline so that unsupported/edge-case notebook cells (notably import *) don’t cause the entire conversion to fail, improving robustness for Molab’s notebook importer (issue #8783).
Changes:
- Catch
ScopedVisitor/ImportStarError(asSyntaxError) during duplicate-definition renaming so star-import cells are preserved instead of crashing the transform. - Replace assert-based “sanity checks” in the source transform pipeline with a guarded runner that skips failed transforms and logs warnings.
- Add unit + end-to-end tests ensuring notebooks with star imports convert successfully.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
marimo/_convert/ipynb/to_ir.py |
Adds per-cell fallback for visitor errors and introduces _run_transform to skip failing/invalid transforms instead of asserting. |
tests/_convert/ipynb/test_ipynb_to_ir.py |
Adds regression tests for star-import notebooks and the duplicate-definition transform behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| except Exception: | ||
| logger.warning( | ||
| "Notebook conversion transform '%s' failed; " | ||
| "skipping this optimization. " | ||
| "Please report this at %s", | ||
| name, | ||
| _REPORT_URL, | ||
| exc_info=True, |
There was a problem hiding this comment.
In _run_transform, exc_info=True will log full tracebacks for exceptions raised while processing user-supplied notebook code. For exceptions like SyntaxError, the exception repr/traceback can include the offending source line, which may leak notebook contents into server logs (Molab/import flows). Consider logging the exception type/message without exc_info, or redacting/sampling so user code isn’t emitted to logs by default.
| except Exception: | |
| logger.warning( | |
| "Notebook conversion transform '%s' failed; " | |
| "skipping this optimization. " | |
| "Please report this at %s", | |
| name, | |
| _REPORT_URL, | |
| exc_info=True, | |
| except Exception as exc: | |
| logger.warning( | |
| "Notebook conversion transform '%s' failed with %s: %s; " | |
| "skipping this optimization. " | |
| "Please report this at %s", | |
| name, | |
| type(exc).__name__, | |
| exc, | |
| _REPORT_URL, |
| # Run comment-preserving transforms | ||
| for base_transform in comment_preserving_transforms: | ||
| transform = comment_preserver(base_transform) | ||
| new_sources = transform(sources) | ||
| assert len(new_sources) == len(sources), ( | ||
| f"{base_transform.__name__} changed cell count" | ||
| ) | ||
| sources = new_sources | ||
| sources = _run_transform(base_transform.__name__, transform, sources) | ||
|
|
There was a problem hiding this comment.
_run_transform validates that a transform didn’t change cell count, but when used with CommentPreserver the wrapped transform has side effects (it updates the preserver’s internal comment index based on transformed_sources). If a transform returns a different-length list, _run_transform discards it and returns the original sources, but CommentPreserver has already been updated to the wrong length, which can corrupt comment preservation for subsequent transforms. Suggest moving the length-check before updating CommentPreserver state (e.g., add a guard in CommentPreserver.wrapper to skip _update_comments_for_transformed_sources when lengths differ, or perform the check outside the wrapper and only update state when accepted).
|
🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.21.2-dev22 |
📝 Summary
Catches Visitor errors on extracting a jupyter cell and just passes the value unchanged as a fallback. Previously conversion would loudly fail (see #8783), cause issues upstream on molab
Moreover, we had some loose asserts for sanity checking- but these should prompt the user to act opposed to loudly fail.