fix: eliminate writeCount double-counting #23078

Closed
Copilot wants to merge 1 commit into main from
copilot/refactor-write-count-calculation

Contributor

Copilot AI commented Mar 26, 2026

writeCount in both buildBehaviorFingerprint and buildAgenticAssessments was computed as len(createdItems) + SafeItemsCount. Both values derive from the same extractCreatedItemsFromManifest call, so the same signal was summed twice. The inflated writeCount misclassified ActuationStyle/ResourceProfile and bloated the write_actions evidence in resource_heavy_for_domain assessments.

Changes

  • buildBehaviorFingerprint and buildAgenticAssessments: derive writeCount from a single source — SafeItemsCount as primary (canonical persisted field), falling back to len(createdItems) only when SafeItemsCount is zero
// Before — sums the same manifest data twice
writeCount := len(createdItems) + processedRun.Run.SafeItemsCount

// After — single source with fallback
writeCount := processedRun.Run.SafeItemsCount
if writeCount == 0 {
    writeCount = len(createdItems)
}
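
To make the inflation concrete, here is a minimal, self-contained sketch of the two derivations. It assumes a reduced stand-in for CreatedItemReport, and the function names legacyWriteCount and fallbackWriteCount are hypothetical labels for the "before" and "after" snippets above, not names from the repository.

```go
package main

import "fmt"

// CreatedItemReport is a hypothetical stand-in for the real report type;
// only the number of items matters for this illustration.
type CreatedItemReport struct {
	Type string
}

// legacyWriteCount reproduces the "before" behavior: it adds two counts
// that describe the same manifest entries.
func legacyWriteCount(safeItemsCount int, createdItems []CreatedItemReport) int {
	return len(createdItems) + safeItemsCount
}

// fallbackWriteCount reproduces the "after" behavior: safeItemsCount is the
// primary source, and len(createdItems) is used only when it is zero.
func fallbackWriteCount(safeItemsCount int, createdItems []CreatedItemReport) int {
	if safeItemsCount == 0 {
		return len(createdItems)
	}
	return safeItemsCount
}

func main() {
	items := []CreatedItemReport{{Type: "issue"}, {Type: "comment"}, {Type: "pull_request"}}
	// Assumption for the sketch: both values come from the same manifest,
	// so they agree.
	safeItemsCount := len(items)

	fmt.Println("legacy:", legacyWriteCount(safeItemsCount, items))  // legacy: 6
	fmt.Println("fixed:", fallbackWriteCount(safeItemsCount, items)) // fixed: 3
	fmt.Println("fallback:", fallbackWriteCount(0, items))           // fallback: 3
}
```

With three manifest entries the legacy sum reports six writes, while the single-source derivation reports three whether or not SafeItemsCount was persisted.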

…in agentic analysis

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e0315538-7b99-4c78-9bf3-2dd647e93fdf

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
@mnkiefer mnkiefer changed the title from "fix: eliminate writeCount double-counting in agentic analysis" to "fix: eliminate writeCount double-counting" Mar 26, 2026
@mnkiefer mnkiefer marked this pull request as ready for review March 26, 2026 10:06
Copilot AI review requested due to automatic review settings March 26, 2026 10:06
Contributor

Copilot AI left a comment

Pull request overview

Fixes inflated writeCount values used in agentic analysis by removing double-counting between createdItems and SafeItemsCount, improving downstream ActuationStyle / ResourceProfile classification and assessment evidence.

Changes:

  • Update buildBehaviorFingerprint to derive writeCount from SafeItemsCount, falling back to len(createdItems) only when SafeItemsCount is zero.
  • Update buildAgenticAssessments to use the same writeCount derivation, preventing over-reporting of write_actions evidence.
Comments suppressed due to low confidence (1)

pkg/cli/audit_agentic_analysis.go:232

  • Same writeCount fallback logic is repeated here (also present in buildBehaviorFingerprint). To reduce drift risk, consider using a shared helper so both the fingerprint thresholds and assessment evidence always use the exact same write action count heuristic.
	writeCount := processedRun.Run.SafeItemsCount
	if writeCount == 0 {
		writeCount = len(createdItems)
	}

Comment on lines +167 to +170
writeCount := processedRun.Run.SafeItemsCount
if writeCount == 0 {
	writeCount = len(createdItems)
}
Copilot AI Mar 26, 2026

The writeCount “SafeItemsCount with fallback to len(createdItems)” logic is now duplicated here and again in buildAgenticAssessments. Consider extracting a small helper (e.g., deriveWriteCount(processedRun.Run.SafeItemsCount, createdItems)) so the classification logic stays consistent if this heuristic changes (and so other call sites like computeAgenticFraction can reuse the same definition of write actions).

This issue also appears on line 229 of the same file.

See below for a potential fix:

func deriveWriteCount(safeItemsCount int, createdItems []CreatedItemReport) int {
	if safeItemsCount > 0 {
		return safeItemsCount
	}
	return len(createdItems)
}

func buildBehaviorFingerprint(processedRun ProcessedRun, metrics MetricsData, toolUsage []ToolUsageInfo, createdItems []CreatedItemReport, awContext *AwContext) *BehaviorFingerprint {
	toolTypes := len(toolUsage)
	writeCount := deriveWriteCount(processedRun.Run.SafeItemsCount, createdItems)

Comment on lines +167 to +170
writeCount := processedRun.Run.SafeItemsCount
if writeCount == 0 {
	writeCount = len(createdItems)
}
Copilot AI Mar 26, 2026

There’s no unit test asserting the intended behavior change (avoid double-counting when both createdItems and SafeItemsCount are populated). Adding a focused test that covers cases like (SafeItemsCount>0, createdItems non-empty) and (SafeItemsCount==0, createdItems non-empty) would help prevent regressions in ActuationStyle/ResourceProfile classification and assessment evidence.
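
The cases named above could be covered by a small table-driven check along these lines. This is only a sketch: it reuses the deriveWriteCount helper proposed in the earlier review comment, replaces CreatedItemReport with an empty stand-in type, and runs from main to stay self-contained. In the repository it would more naturally live in a _test.go file using the testing package. The writeCountCase type and the helper functions around it are hypothetical.

```go
package main

import "fmt"

// CreatedItemReport is a hypothetical stand-in; only slice length matters here.
type CreatedItemReport struct{}

// deriveWriteCount mirrors the helper suggested in the review comment.
func deriveWriteCount(safeItemsCount int, createdItems []CreatedItemReport) int {
	if safeItemsCount > 0 {
		return safeItemsCount
	}
	return len(createdItems)
}

// writeCountCase describes one input combination and its expected result.
type writeCountCase struct {
	name           string
	safeItemsCount int
	createdItems   int // number of created items to generate
	want           int
}

// writeCountCases lists the combinations called out in the review.
func writeCountCases() []writeCountCase {
	return []writeCountCase{
		{"both populated, no double count", 3, 3, 3},
		{"SafeItemsCount zero, falls back to created items", 0, 4, 4},
		{"only SafeItemsCount populated", 2, 0, 2},
		{"neither populated", 0, 0, 0},
		{"values disagree, SafeItemsCount wins", 5, 2, 5},
	}
}

// failedCases returns a description of every case whose result differs from want.
func failedCases(cases []writeCountCase) []string {
	var failed []string
	for _, c := range cases {
		got := deriveWriteCount(c.safeItemsCount, make([]CreatedItemReport, c.createdItems))
		if got != c.want {
			failed = append(failed, fmt.Sprintf("%s: got %d, want %d", c.name, got, c.want))
		}
	}
	return failed
}

func main() {
	failed := failedCases(writeCountCases())
	for _, f := range failed {
		fmt.Println("FAIL", f)
	}
	if len(failed) == 0 {
		fmt.Println("all cases pass")
	}
}
```

The "both populated" row is the regression guard: under the old sum it would yield 6 rather than 3.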

@mnkiefer
Contributor

  • Needs further checks

@mnkiefer mnkiefer closed this Mar 26, 2026
@github-actions github-actions bot deleted the copilot/refactor-write-count-calculation branch April 2, 2026 13:11