Align Chronicle session store schema with actual stored data #313797

Open
jeffreybulanadi wants to merge 1 commit into microsoft:main from jeffreybulanadi:dev/jeffreyb/chronicle-schema-alignment
@jeffreybulanadi
Fixes #313640.

Two divergences between the documented schema and stored data:

1. Truncation marker

The schema description stated that truncated content ends with the Unicode ellipsis (U+2026, …), but truncateForStore appended three ASCII periods (...). As a result, queries like assistant_response LIKE '%…' returned 0 rows while LIKE '%...' returned 36, the opposite of what the documentation implied.

Change the ellipsis constant from '...' to the single Unicode character '…' (U+2026) so the stored marker matches the documented value. A secondary benefit: the suffix shrinks from 3 code units to 1, freeing 2 more characters of content per truncated value without exceeding maxLength.
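
The length arithmetic above can be illustrated with a minimal sketch. The helper `truncateWithMarker` is hypothetical and not part of this PR; it mirrors the slicing logic of truncateForStore as quoted in the review further down, with the marker passed in as a parameter so the two suffixes can be compared side by side.

```typescript
// Hypothetical helper for illustration only: same slicing logic as
// truncateForStore, but the truncation marker is a parameter.
function truncateWithMarker(value: string, maxLength: number, marker: string): string {
	if (value.length <= maxLength) {
		return value;
	}
	if (maxLength <= marker.length) {
		return marker.slice(0, maxLength);
	}
	// Reserve room for the marker so the result never exceeds maxLength.
	return value.slice(0, maxLength - marker.length).trimEnd() + marker;
}

const input = 'abcdefghij';
// Three ASCII periods leave room for 3 content characters out of 6.
const asciiResult = truncateWithMarker(input, 6, '...');
// U+2026 is a single UTF-16 code unit, leaving room for 5 content characters.
const unicodeResult = truncateWithMarker(input, 6, '\u2026');

console.log(asciiResult, asciiResult.length);
console.log(unicodeResult, unicodeResult.length);
```

Both results have length 6, but the Unicode variant keeps two more characters of the original value.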

2. session_files tool_name scope

read_file, list_dir, and view_image were included in FILE_TRACKING_TOOLS, so every file the agent read was inserted into session_files. In practice read_file alone accounted for 90% of rows (165 out of ~183). The intent of session_files is to record files the agent wrote, not every file it inspected.

Remove those three read-only tools from FILE_TRACKING_TOOLS. Only write operations (replace_string_in_file, insert_edit_into_file, create_file, edit_notebook_file, apply_patch, str_replace_editor, create, create_directory, multi_replace_string_in_file) now populate session_files.

Update the schema description in chronicleIntent.ts and the JSDoc on FILE_TRACKING_TOOLS to match.
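
A rough sketch of the narrowed scope, assuming a simplified lookup. The tool names come from the list above; the function `extractTrackedPath` and the `filePath` argument key are hypothetical stand-ins, since the real extractFilePath signature and argument handling are not shown on this page.

```typescript
// Write tools named in the PR description; read_file, list_dir and
// view_image are intentionally absent.
const FILE_TRACKING_TOOLS = new Set<string>([
	'replace_string_in_file',
	'multi_replace_string_in_file',
	'insert_edit_into_file',
	'create_file',
	'edit_notebook_file',
	'apply_patch',
	'str_replace_editor',
	'create',
	'create_directory',
]);

// Hypothetical simplification of extractFilePath: returns a path only for
// tracked (write) tools. The argument keys checked here are assumptions.
function extractTrackedPath(toolName: string, args: Record<string, unknown>): string | undefined {
	if (!FILE_TRACKING_TOOLS.has(toolName)) {
		return undefined;
	}
	const candidate = args['filePath'] ?? args['path'];
	return typeof candidate === 'string' ? candidate : undefined;
}

// A write tool yields a path; a read-only tool yields undefined.
console.log(extractTrackedPath('create_file', { filePath: '/src/a.ts' }));
console.log(extractTrackedPath('read_file', { filePath: '/src/a.ts' }));
```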

Tests

  • read_file and list_dir cases updated to assert undefined (not tracked).
  • view_image added to the read-only/unknown assertion group.
  • Length-based truncation tests in sessionReindexer.spec.ts are unaffected because they only check .length <= max, not the specific suffix character.

Two divergences between the documented schema and actual stored data:

1. Truncation marker: the schema description stated truncated content
   ends with the unicode ellipsis (U+2026) but truncateForStore used
   three ASCII periods. Change the ellipsis constant to the single
   unicode character so newly truncated values match LIKE '%…' as
   documented, rather than matching only LIKE '%...'. The stored
   content also gains two characters for every truncated value
   since the suffix is now one code unit instead of three.

2. session_files tool_name scope: read_file, list_dir and view_image
   were included in FILE_TRACKING_TOOLS, causing read-only operations
   to populate session_files even though the table is intended to
   record files the agent edited or created. Remove those three tools
   so session_files only contains rows for write operations. Update
   the schema description in chronicleIntent.ts and the JSDoc comment
   on FILE_TRACKING_TOOLS to reflect the narrower, mutating-only scope.

Tests updated accordingly: the two cases that previously asserted
extractFilePath returns a path for read_file and list_dir now assert
undefined, and view_image is added to the read-only/unknown group.
Copilot AI review requested due to automatic review settings May 1, 2026 20:57
Contributor

Copilot AI left a comment

Pull request overview

Aligns Chronicle’s documented session-store schema with what is actually persisted, focusing on truncation markers and the intended semantics of session_files (write-tracking vs read-tracking).

Changes:

  • Switch truncation marker used by truncateForStore from ASCII ... to Unicode ellipsis (…).
  • Limit session_files population to write tools only by removing read-only tools (read_file, list_dir, view_image) from FILE_TRACKING_TOOLS.
  • Update the Chronicle schema description and adjust extractFilePath unit tests to reflect the new write-only tracking behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| extensions/copilot/src/extension/intents/node/chronicleIntent.ts | Updates the schema description strings to match the new truncation marker and write-only session_files semantics. |
| extensions/copilot/src/extension/chronicle/common/test/sessionStoreTracking.spec.ts | Updates tests to assert read-only tools are not tracked in session_files, and adds view_image to the non-tracked group. |
| extensions/copilot/src/extension/chronicle/common/sessionStoreTracking.ts | Implements the Unicode ellipsis truncation marker and restricts file tracking to write tools only. |
Comments suppressed due to low confidence (3)

extensions/copilot/src/extension/chronicle/common/sessionStoreTracking.ts:70

  • After dropping list_dir from FILE_TRACKING_TOOLS, the inline comment in extractFilePath still says “list_dir uses 'path'”. Please update that comment so it no longer references tools that are intentionally not tracked, to avoid future confusion while maintaining/expanding tool tracking.
/** Tools whose arguments contain a file path being written (created or modified). */
const FILE_TRACKING_TOOLS = new Set([
	// VS Code model-facing tool names (from ToolName enum)
	'replace_string_in_file',
	'multi_replace_string_in_file',

extensions/copilot/src/extension/chronicle/common/sessionStoreTracking.ts:41

  • truncateForStore now uses the Unicode ellipsis (…) as the truncation marker, but there’s no unit test asserting the exact suffix (only length-based assertions elsewhere). Adding a small test here would prevent regressions back to '...' and ensure the documented schema marker stays aligned with stored data.
/**
 * Truncate a string to at most `maxLength` stored characters, appending '…' if truncated.
 * The returned value, including the truncation suffix, never exceeds `maxLength`.
 * Returns `undefined` for falsy input.
 */
export function truncateForStore(value: string | undefined, maxLength: number): string | undefined {
	if (!value) {
		return undefined;
	}
	if (value.length <= maxLength) {
		return value;
	}
	const ellipsis = '…';
	if (maxLength <= ellipsis.length) {
		return ellipsis.slice(0, maxLength);
	}
	return value.slice(0, maxLength - ellipsis.length).trimEnd() + ellipsis;
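
One possible shape for the suffix check suggested in this comment, sketched with plain assertions rather than the repository's test framework (which is not shown on this page). The function body is copied from the excerpt above with the closing brace added; the `check` helper and the sample inputs are assumptions made for illustration.

```typescript
// Copy of truncateForStore from the excerpt above, closed off so the
// sketch is self-contained.
function truncateForStore(value: string | undefined, maxLength: number): string | undefined {
	if (!value) {
		return undefined;
	}
	if (value.length <= maxLength) {
		return value;
	}
	const ellipsis = '\u2026';
	if (maxLength <= ellipsis.length) {
		return ellipsis.slice(0, maxLength);
	}
	return value.slice(0, maxLength - ellipsis.length).trimEnd() + ellipsis;
}

// Hypothetical minimal assertion helper standing in for a test runner.
function check(condition: boolean, message: string): void {
	if (!condition) {
		throw new Error(message);
	}
}

const truncated = truncateForStore('hello world', 5);
// The exact suffix is pinned, not just the length.
check(truncated === 'hell\u2026', 'expected the value to end with U+2026');
check(truncated !== undefined && !truncated.endsWith('...'), 'must not end with three ASCII periods');
// Values within the limit are returned unchanged; falsy input yields undefined.
check(truncateForStore('short', 10) === 'short', 'short values are untouched');
check(truncateForStore('', 5) === undefined, 'empty input yields undefined');
// When maxLength only fits the marker, the marker alone is returned.
check(truncateForStore('ab', 1) === '\u2026', 'marker-only result for maxLength 1');
```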

extensions/copilot/src/extension/chronicle/common/sessionStoreTracking.ts:70

  • Removing read_file/list_dir/view_image from FILE_TRACKING_TOOLS will change extractFilePath() behavior and currently breaks existing tests that still expect read-only tools to populate session_files (e.g. extensions/copilot/src/extension/chronicle/node/test/sessionReindexer.spec.ts and .../common/test/standupPrompt.spec.ts). Update those tests/fixtures to only assert tracking for write tools (e.g. create_file, apply_patch, etc.).
/** Tools whose arguments contain a file path being written (created or modified). */
const FILE_TRACKING_TOOLS = new Set([
	// VS Code model-facing tool names (from ToolName enum)
	'replace_string_in_file',
	'multi_replace_string_in_file',


Development

Successfully merging this pull request may close these issues.

Chronicle: schema description divergences — truncation marker and tool_name values