Summary
This PR combines three related improvements to notebook-agent behavior and tool response clarity.
Refines the prompt instructions for notebook editing and execution workflows.
Changes add-code-cell and add-markdown-cell to return the inserted cellIndex when available, improving traceability in iterative notebook editing.
Adds UTF-8-safe output truncation to read_file so large file reads stay within an output budget, with regression tests covering truncation, multibyte content, and non-text files.
Motivation
In a data analysis and modeling task, the agent followed the prompt by first stating a high-level goal and then proceeding to generate the entire notebook implementation in one pass. For exploratory notebook work, this is not a reasonable default behavior. The notebook instructions should instead encourage incremental analysis, intermediate validation, and task-adaptive planning rather than upfront full-pipeline code generation.
add-cell tool actions previously returned generic success responses, which made it harder to reference newly inserted cells in subsequent steps. Returning the inserted cellIndex improves traceability for iterative notebook workflows.
read_file could emit very large outputs, which risks flooding the model's context and makes the tool harder to use reliably.
What changed
Updated NOTEBOOK_EDIT_INSTRUCTIONS to clarify when to prefer exploratory workflows vs. construction workflows.
Updated NOTEBOOK_EXECUTE_INSTRUCTIONS to encourage smaller, iterative execution cycles for exploratory/scientific tasks.
Updated notebook add_code_cell and add_markdown_cell responses to return { cellIndex } when the UI command provides it.
Updated Python tool wrappers and Claude tool handlers to surface the inserted cell index in tool responses.
Added max_output_tokens to read_file, with UTF-8-safe truncation and an [output truncated] marker.
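As a rough illustration of the UTF-8-safe truncation described above, here is a minimal sketch. It is not the PR's implementation: the function name truncate_utf8 and the byte-based max_bytes parameter are assumptions made for the example (the PR expresses its budget as max_output_tokens), and only the [output truncated] marker text comes from the description.

```python
def truncate_utf8(text: str, max_bytes: int, marker: str = "\n[output truncated]") -> str:
    """Trim text to at most max_bytes of UTF-8 without splitting a character.

    Hypothetical helper: the name and the byte-based budget are assumptions,
    not the PR's actual API.
    """
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text  # already within budget, nothing to trim

    marker_bytes = marker.encode("utf-8")
    # Reserve room for the marker so the final result stays within the budget.
    budget = max(max_bytes - len(marker_bytes), 0)

    # Slicing bytes may cut a multibyte character in half. Because the input
    # was valid UTF-8, the only invalid bytes are the trailing partial
    # sequence, which errors="ignore" drops.
    kept = data[:budget].decode("utf-8", errors="ignore")
    return kept + marker


if __name__ == "__main__":
    print(truncate_utf8("é" * 20, 30))
```

Note that when max_bytes is smaller than the marker itself, this sketch returns only the marker and exceeds the budget, which is the kind of edge case a dedicated test would need to pin down.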
moyiliyi changed the title from "refactor(prompts): Refine notebook editing and execution prompts" to "refactor(agent): improve notebook prompts and add-cell response feedback" on May 28, 2026
moyiliyi changed the title from "refactor(agent): improve notebook prompts and add-cell response feedback" to "refactor(agent): improve notebook prompts and tool feedback" on May 30, 2026
moyiliyi changed the title from "refactor(agent): improve notebook prompts and tool feedback" to "refactor(agent): improve notebook prompts and tool response clarity" on May 30, 2026
Really nice work on this, and thank you for taking the time to include tests. It's a pleasure to review.
A few notes from reading through it:
Returning the inserted cell index is a great idea, and the code looks spot on. I followed it all the way through: newCellIndex is set correctly in both spots in src/index.ts, it matches where the cell actually lands, it comes back to Python as an int so the isinstance(cell_index, int) checks pass, and those checks also fall back gracefully to the old message if the response isn't what's expected. I especially like that the index is immediately useful, since the other notebook tools (run_cell, get_cell_output, get_cell_type_and_source, set_cell_type_and_source, delete_cell, insert_cell) all take a cell index, so the agent can act on the new cell right away instead of having to ask for the cell count first. Small change, real quality-of-life win.
The read_file size limit is done with real care. Trimming on UTF-8 boundaries avoids producing broken characters, and the result stays within the byte budget. The multibyte test that checks every visible character is still whole is a lovely touch.
Two things on read_file I'd gently float for your consideration:
After it trims, the header still claims the full file. By the time the output is cut down, end_line has already been set to the last line of the file, so the header still says (lines X-Y) for the whole range even though only part of it is actually there, and [output truncated] doesn't say where it stopped. Since read_file normally lets you read a specific line range, the agent doesn't have an easy way to pick up where it left off. It might help to show the range that was really included (say, the last line that made it in) so the model can do a follow-up read for the rest.
max_output_tokens ends up being something the model can set. Because it's an argument on the tool, it shows up in what the model sees, so the model can change it on any call. A large value quietly cancels out the limit you're enforcing, and a very small value hits the path that trims the header itself (and for the tiniest values returns just a piece of the [output truncated] text with no file content). If the limit is mostly there to keep file reads from flooding the context, it might be simpler to keep it as a fixed default on the server side rather than something the model passes in, or at least set a minimum so the header and marker always survive. And if you'd rather keep it adjustable, a quick test for the very-small case would cover that branch nicely, since the current tests only exercise the normal trimming path.
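One way to read the "fixed default plus a minimum" suggestion is sketched below. Everything here is hypothetical: the name effective_budget, the DEFAULT_MAX_OUTPUT_BYTES value, and the byte-based units are assumptions for illustration, not code from the PR.

```python
from typing import Optional

# Hypothetical server-side ceiling; the real value is not stated in the PR.
DEFAULT_MAX_OUTPUT_BYTES = 32_000


def effective_budget(requested: Optional[int], header: str, marker: str) -> int:
    """Clamp a caller-supplied budget so it can neither disable the limit
    nor shrink below the space needed for the header and the marker."""
    # The header and the truncation marker must always survive.
    floor = len(header.encode("utf-8")) + len(marker.encode("utf-8"))
    if requested is None:
        return max(DEFAULT_MAX_OUTPUT_BYTES, floor)
    # An oversized request is capped at the default; a tiny one is raised to the floor.
    return max(floor, min(requested, DEFAULT_MAX_OUTPUT_BYTES))
```

With a clamp like this, a very large request cannot cancel the limit and a very small one cannot trim the header, so the two edge cases raised above collapse into the normal trimming path.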
The notebook instruction updates toward smaller, step-by-step exploration read really well too, and splitting the guidance into exploratory versus more defined tasks is a thoughtful distinction.
Overall this is clean, careful, well-tested work. The cell-index part looks ready to go as is, and the two read_file notes above are just the things I'd look at before merging. Thanks again for the contribution.
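For readers following the cell-index discussion in this review, the isinstance check with a graceful fallback might look roughly like the sketch below. The function name format_add_cell_result and both message strings are assumptions made for the example; only the cellIndex key and the isinstance(cell_index, int) check come from the thread.

```python
def format_add_cell_result(response, kind: str = "code") -> str:
    """Build a tool response, including the cell index only when it is usable.

    Hypothetical wrapper: the real handler names and message wording may differ.
    """
    cell_index = response.get("cellIndex") if isinstance(response, dict) else None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(cell_index, int) and not isinstance(cell_index, bool):
        return f"Added {kind} cell at index {cell_index}."
    # Fall back to the generic message when the UI did not report an index.
    return f"Added {kind} cell."


if __name__ == "__main__":
    print(format_add_cell_result({"cellIndex": 3}))
    print(format_add_cell_result({}, kind="markdown"))
```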
I’ve just pushed a follow-up fix based on your feedback.
The truncation marker now shows where the output stopped, in the form [output truncated within line {line} column {column}], and max_output_tokens is no longer exposed in the public tool schema.
I also updated the tests accordingly.
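A marker of that shape could be derived from the text that survived truncation, as in the sketch below. This is an assumption about how such a marker might be computed, not the pushed fix: the function name truncation_marker, the start_line parameter, and the 1-based, character-counted column convention are all hypothetical.

```python
def truncation_marker(kept: str, start_line: int = 1) -> str:
    """Describe where output stopped, given the text that was kept.

    Hypothetical helper: the column is 1-based, counted in characters, and
    points at the first omitted character.
    """
    # Each newline in the kept text advances the line counter by one.
    line = start_line + kept.count("\n")
    # rfind returns -1 when there is no newline, which makes the
    # arithmetic below yield len(kept) + 1 for single-line text.
    last_newline = kept.rfind("\n")
    column = len(kept) - last_newline
    return f"[output truncated within line {line} column {column}]"


if __name__ == "__main__":
    print(truncation_marker("ab\ncd"))
```

Reporting the line lets a follow-up read_file call resume from that line rather than re-reading the whole range.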