Conversation

Copilot AI commented Nov 1, 2025

Print estimated token count for each file added to context and total tokens used.

Implementation

  • Token estimation: ~4 characters per token heuristic, counting UTF-8 runes with utf8.RuneCountInString()
  • Per-file output: Display tokens alongside existing file inclusion messages
  • Total output: Print aggregate count after all files processed
  • Template expansion: Count tokens after variable substitution
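Putting those pieces together, the merged behavior can be sketched as below. The heuristic (runes ÷ 4) and the output format come from this PR; the file path and contents here are illustrative placeholders, not the program's real inputs:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// estimateTokens approximates an LLM token count with the ~4-characters-per-token
// heuristic, counting UTF-8 runes rather than bytes so multibyte characters
// are not over-counted.
func estimateTokens(text string) int {
	return utf8.RuneCountInString(text) / 4
}

func main() {
	totalTokens := 0
	// Placeholder stand-in for file contents after template expansion;
	// the real program reads these from the .prompts directory.
	files := map[string]string{
		".prompts/personas/expert.md": "You are an expert Go developer with deep knowledge of idiomatic style.",
	}
	for path, expanded := range files {
		tokens := estimateTokens(expanded)
		totalTokens += tokens
		fmt.Printf("Using persona file: %s (~%d tokens)\n", path, tokens)
	}
	fmt.Printf("Total estimated tokens: %d\n", totalTokens)
}
```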

Output Format

Using persona file: .prompts/personas/expert.md (~35 tokens)
Including memory file: .prompts/memories/setup.md (~30 tokens)
Using task file: .prompts/tasks/implement.md (~46 tokens)
Total estimated tokens: 111

Files

  • token_counter.go: Estimation function
  • token_counter_test.go: Unit tests covering edge cases
  • main.go: Integration at persona/memory/task processing points
  • integration_test.go: End-to-end validation
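The exact cases in token_counter_test.go are not shown in this thread; the sketch below illustrates the kind of edge cases such a test might exercise (empty input, truncating division, multibyte runes), written as a runnable table-driven check rather than a testing.T test:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// estimateTokens mirrors the merged heuristic: UTF-8 runes divided by 4.
func estimateTokens(text string) int {
	return utf8.RuneCountInString(text) / 4
}

func main() {
	// Hypothetical edge cases; the actual test file's cases are not shown.
	cases := []struct {
		in   string
		want int
	}{
		{"", 0},             // empty input
		{"abcdefgh", 2},     // 8 runes / 4
		{"abcdefghij", 2},   // integer division rounds down
		{"héllo wörld!", 3}, // 12 runes (not 14 bytes) / 4
	}
	for _, c := range cases {
		if got := estimateTokens(c.in); got != c.want {
			panic(fmt.Sprintf("estimateTokens(%q) = %d, want %d", c.in, got, c.want))
		}
	}
	fmt.Println("all cases pass")
}
```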
Original prompt

Each time a file is selected and added, print an estimate of the number of LLM tokens used. Also print the total number of tokens used.



Copilot AI self-assigned this Nov 1, 2025
Co-authored-by: alexec <1142830+alexec@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add token estimation for selected files" to "Add LLM token estimation for file selection output" on Nov 1, 2025
Copilot AI requested a review from alexec November 1, 2025 15:02

@alexec alexec left a comment


@copilot Please change

main.go Outdated
// Estimate tokens for this file
tokens := estimateTokens(expanded)
totalTokens += tokens
fmt.Fprintf(os.Stdout, "Using prompt file: %s (~%d tokens)\n", path, tokens)

Should say "task" not "prompt"


Changed to "Using task file:" in commit ad6de93.

token_counter.go Outdated
charCount := utf8.RuneCountInString(text)
// Approximate: 1 token ≈ 4 characters
tokens := charCount / 4
if tokens == 0 && charCount > 0 {

Remove the zero check. This will never happen.


Removed the zero check in commit ad6de93.

… unnecessary zero check

Co-authored-by: alexec <1142830+alexec@users.noreply.github.com>
Copilot AI requested a review from alexec November 1, 2025 15:09
@alexec alexec marked this pull request as ready for review November 1, 2025 15:24
alexec commented Nov 1, 2025

@copilot Make the requested changes

Copilot AI commented Nov 1, 2025


I've already addressed the two review comments in commit ad6de93:

  1. Changed "Using prompt file:" to "Using task file:"
  2. Removed the unnecessary zero check from estimateTokens()

Could you clarify what additional changes you'd like me to make?

@alexec alexec requested a review from Copilot November 1, 2025 15:37
@alexec alexec merged commit f664140 into main Nov 1, 2025
7 checks passed
@alexec alexec deleted the copilot/add-token-estimation-logic branch November 1, 2025 15:37
Copilot AI left a comment


Pull Request Overview

This PR adds token counting functionality to estimate the number of LLM tokens in processed files. The implementation uses a simple heuristic of approximately 4 characters per token.

  • Adds estimateTokens() function with UTF-8 character counting
  • Updates output messages to display token counts for persona, memory, and task files
  • Adds total token count tracking and display

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

  • token_counter.go: Implements the token estimation function using a 4-char-per-token heuristic
  • token_counter_test.go: Adds unit tests for the token estimation function
  • main.go: Integrates token counting into the file-processing workflow with per-file and total counts
  • integration_test.go: Adds an end-to-end test verifying the token-counting output


Comment on lines +1361 to +1363
if !strings.Contains(totalLine, "Total estimated tokens:") {
t.Errorf("Expected 'Total estimated tokens:' in output, got: %s", totalLine)
}
Copilot AI Nov 1, 2025

Lines 1356-1358 already verify that totalLine contains 'Total estimated tokens:' (since it's found by searching for that string in lines 1352-1354). The check on lines 1361-1363 is redundant and can be removed.

Suggested change
if !strings.Contains(totalLine, "Total estimated tokens:") {
t.Errorf("Expected 'Total estimated tokens:' in output, got: %s", totalLine)
}

