Optimized tokenizer for diff engine #286

Merged
svarlamov merged 2 commits into main from feat/optimize-checkpoint-size
Dec 12, 2025

@svarlamov
Member

Instead of tokenizing by character, we now use a custom tokenizer optimized for source code. Because it represents the same changes with far fewer tokens (100x+ fewer for typical code files), worst-case diffing performance should be much faster now.
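
To illustrate why token granularity matters, here is a minimal sketch in Python. It is not the tokenizer from this PR: the `TOKEN_RE` pattern and the `tokenize` helper are hypothetical, chosen only to show how grouping identifiers and whitespace runs shrinks the sequence a diff algorithm has to compare. The worst-case cost of common diff algorithms grows faster than linearly with sequence length, so a shorter sequence helps disproportionately.

```python
import re

# Hypothetical pattern (an assumption, not the PR's actual rules):
# a run of word characters, a run of whitespace, or any single other character.
TOKEN_RE = re.compile(r"\w+|\s+|[^\w\s]")


def tokenize(source: str) -> list[str]:
    """Split source text into coarse tokens without losing any characters."""
    return TOKEN_RE.findall(source)


line = "let total = compute_sum(values);"
tokens = tokenize(line)

# Every character lands in exactly one token, so joining the tokens
# reproduces the input and a token-level diff maps back to exact text.
assert "".join(tokens) == line

print(f"{len(line)} characters -> {len(tokens)} tokens")
```

Even on this short line the sequence is roughly a third as long; the reduction on real files depends on the tokenization rules, which this sketch does not attempt to reproduce.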

@git-ai-cloud-dev

Stats powered by Git AI

🧠 you    ████████████████████  100%
🤖 ai     ░░░░░░░░░░░░░░░░░░░░  0%
More stats
  • 0.0 lines generated for every 1 accepted
  • 0 seconds waiting for AI

AI code tracked with git-ai

@svarlamov merged commit 83229a0 into main on Dec 12, 2025
6 checks passed
@svarlamov deleted the feat/optimize-checkpoint-size branch on December 12, 2025 05:20