
feat(lsp): diagnostic debouncing and semantic token provider (#272) #417

Merged
ajitpratap0 merged 3 commits into main from feat/lsp-debounce-semantic-tokens
Mar 19, 2026

Conversation

@ajitpratap0
Owner

Summary

Implements the remaining items from #272. Incremental text sync was already done; this PR adds debounced diagnostics and the textDocument/semanticTokens/full capability.

Changes

pkg/lsp/handler.go

Diagnostic debouncing

  • Added debounceMu sync.Mutex and debounceTimers map[string]*time.Timer to Handler struct
  • Initialized debounceTimers in NewHandler
  • handleDidChange now schedules a 300ms time.AfterFunc instead of calling validateDocument synchronously — existing timers are stopped before rescheduling; content snapshot is captured at change time (not at fire time) to avoid races
  • handleDidClose cancels and removes the timer for the closed URI before Documents().Close()

Semantic tokens

  • Added case "textDocument/semanticTokens/full" to HandleRequest
  • handleInitialize now advertises SemanticTokensProvider in ServerCapabilities
  • handleSemanticTokensFull: tokenizes document content via tokenizer.GetTokenizer/PutTokenizer, then encodes to LSP 5-integer delta format
  • classifyToken: maps models.TokenType* constants to 6-type legend (keyword, identifier, number, string, operator, comment)
  • encodeSemanticTokens: 0-based LSP delta encoding from 1-based models.Location

pkg/lsp/protocol.go

  • Added SemanticTokensProvider *SemanticTokensOptions to ServerCapabilities
  • New types: SemanticTokensLegend, SemanticTokensOptions, SemanticTokensParams, SemanticTokens

Test plan

  • go test ./pkg/lsp/... -race — no data races on debounce timer map
  • Open a large SQL file in a Neovim/VSCode LSP client — diagnostics should lag 300ms behind rapid typing, not fire per keystroke
  • LSP client that supports semantic tokens should receive keyword highlighting

Closes #272

🤖 Generated with Claude Code

- Debounce validateDocument calls with 300ms timer per document URI
  to avoid N parse calls per rapid-typing burst
- Add textDocument/semanticTokens/full handler with 6-type legend
  (keyword, identifier, number, string, operator, comment)
- Clean up debounce timers on textDocument/didClose

Closes #272

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the lsp label Mar 19, 2026

@github-actions

github-actions bot commented Mar 19, 2026

Claude finished @ajitpratap0's task


Code Review for LSP Diagnostic Debouncing and Semantic Tokens

Todo List:

  • Read repository guidelines and understand codebase structure
  • Analyze pkg/lsp/handler.go changes for diagnostic debouncing
  • Analyze pkg/lsp/protocol.go changes for semantic tokens
  • Review code quality and best practices
  • Check for potential bugs and race conditions
  • Evaluate performance considerations
  • Assess security concerns
  • Review test coverage
  • Provide final feedback and recommendations

🔍 Review Summary

This PR successfully implements diagnostic debouncing and semantic token provider for the LSP server. The implementation is well-structured and follows the project's architectural patterns. Here's my detailed analysis:

✅ Positive Aspects

Strong Implementation Design

  • Proper Object Pooling: Correctly uses tokenizer.GetTokenizer() / PutTokenizer() pattern as mandated by CLAUDE.md
  • Thread Safety: Proper mutex usage for debouncing timers with sync.Mutex and cleanup patterns
  • LSP Compliance: Follows LSP 3.17 specification correctly for semantic tokens
  • Memory Efficiency: Uses delta encoding for semantic tokens, reducing payload size
  • Repository Guidelines: Adheres to coding standards and architectural patterns

Clean Code Quality

  • Well-documented functions with comprehensive comments
  • Consistent error handling patterns
  • Proper resource cleanup with defer statements
  • Clear separation of concerns

⚠️ Issues & Recommendations

Critical: Potential Race Condition

Location: pkg/lsp/handler.go:336-342

h.debounceTimers[uri] = time.AfterFunc(300*time.Millisecond, func() {
    h.debounceMu.Lock()
    delete(h.debounceTimers, uri)
    h.debounceMu.Unlock()
    h.validateDocument(uri, content, version)
})

Issue: The content is captured at change time, but if the document is modified again before the timer fires, validateDocument might run on stale content.

Fix: Capture content inside the timer callback:

h.debounceTimers[uri] = time.AfterFunc(300*time.Millisecond, func() {
    h.debounceMu.Lock()
    delete(h.debounceTimers, uri)
    h.debounceMu.Unlock()
    // Get fresh content
    if freshContent, ok := h.server.Documents().GetContent(uri); ok {
        h.validateDocument(uri, freshContent, version)
    }
})

Performance Optimizations

  1. Token Classification Efficiency (pkg/lsp/handler.go:1771-1817)

    • The switch statement is extensive. Consider using a map for O(1) lookups:
    var tokenTypeMap = map[models.TokenType]int{
        models.TokenTypeKeyword: semTokKeyword,
        models.TokenTypeSelect: semTokKeyword,
        // ... etc
    }
  2. Memory Pre-allocation (pkg/lsp/handler.go:1725)

    data := make([]uint32, 0, len(tokens)*5)

    • This call already pre-allocates the full capacity up front, so the appends never reallocate; a variant like make([]uint32, len(tokens)*5)[:0] would be equivalent, not an improvement. No change needed.

Minor Improvements

  1. Magic Number: The 300ms debounce timeout should be configurable:

    const defaultDebounceTimeout = 300 * time.Millisecond
  2. Token Length Calculation (pkg/lsp/handler.go:1739-1742)

    • Consider validating that tok.End.Column >= tok.Start.Column before subtraction
  3. Error Handling: Consider logging tokenization errors instead of silently returning empty tokens

🔒 Security Assessment

  • No Security Issues: The implementation doesn't expose sensitive data
  • Input Validation: Proper bounds checking in position calculations
  • Resource Management: Correct cleanup of timers prevents memory leaks

🧪 Test Coverage Recommendations

The PR lacks tests. Consider adding:

  1. Race Condition Tests:

    go test -race ./pkg/lsp/... -run TestDebounce
  2. Semantic Token Tests:

    • Verify correct token classification
    • Test delta encoding accuracy
    • Validate LSP protocol compliance
  3. Integration Tests:

    • Test debouncing with rapid document changes
    • Verify timer cleanup on document close

📊 Performance Impact

  • Positive: Reduces diagnostic spam during rapid typing (300ms debounce)
  • Positive: Efficient semantic token encoding reduces network overhead
  • Minimal: Object pooling usage aligns with existing patterns
  • Memory: Debounce timer map grows with open documents (acceptable)

🏆 Final Verdict

Status: ✅ Approve with minor fixes

This is a solid implementation that adds valuable LSP functionality. The diagnostic debouncing will significantly improve user experience during active editing. The semantic token provider follows LSP standards correctly.

Required before merge:

  1. Fix the race condition in debounce timer callback
  2. Add basic test coverage for the debouncing mechanism

Optional improvements:

  • Make debounce timeout configurable
  • Optimize token classification with lookup map
  • Add comprehensive test suite

The code demonstrates good understanding of the GoSQLX architecture and LSP protocols. Well done! 🎉



@github-actions

github-actions bot commented Mar 19, 2026

Claude finished @ajitpratap0's task


Code Review Complete ✅

I've completed a comprehensive review of the PR implementing diagnostic debouncing and semantic token provider. Here's my analysis:

  • Code quality and best practices
  • Potential bugs or issues
  • Performance considerations
  • Security concerns
  • Test coverage

Overall Assessment

This is a well-implemented PR that adds two important LSP features: diagnostic debouncing and semantic tokens support. The implementation follows good practices and maintains thread safety.

Strengths

Diagnostic Debouncing (pkg/lsp/handler.go:329-342)

  • Thread-safe design: Uses sync.Mutex to protect the debounceTimers map from concurrent access
  • Smart content capture: Captures content snapshot at change time (line 340), not at fire time, preventing race conditions
  • Proper cleanup: Cancels existing timers before scheduling new ones, preventing timer leaks
  • Document lifecycle integration: Properly cancels timers on document close (lines 356-362)

Semantic Tokens (pkg/lsp/handler.go:1702-1818)

  • LSP-compliant encoding: Correctly implements the 5-integer delta format
  • Memory efficient: Uses object pooling with tokenizer.GetTokenizer()/PutTokenizer()
  • Comprehensive token classification: Maps 30+ token types to 6 semantic categories
  • Proper coordinate conversion: Converts from 1-based parser locations to 0-based LSP positions

Protocol Extensions (pkg/lsp/protocol.go:237-258)

  • Complete type definitions: All necessary LSP semantic token types properly defined
  • Standards compliance: Follows LSP specification exactly

Issues Found

Critical Issues

None found - the implementation is solid.

Minor Issues

  1. Hardcoded debounce delay (pkg/lsp/handler.go:336):

    h.debounceTimers[uri] = time.AfterFunc(300*time.Millisecond, func() {

    Consider making the 300ms configurable via server options for different use cases.

  2. Error handling in semantic tokens (pkg/lsp/handler.go:1712-1715):

    tokens, err := tkz.Tokenize([]byte(doc.Content))
    if err != nil {
        return &SemanticTokens{Data: []uint32{}}, nil
    }

    Silent failure might make debugging difficult. Consider logging tokenization errors.

  3. Token length fallback (pkg/lsp/handler.go:1739-1742):
    The fallback to len(tok.Token.Value) for token length could be inaccurate for Unicode strings. The current approach is acceptable but could be improved.

Performance Considerations

Excellent performance characteristics:

  • Object pooling: Proper use of tokenizer pool prevents allocation overhead
  • Debouncing efficiency: 300ms delay reduces diagnostic overhead during rapid typing
  • Memory management: No memory leaks in timer management
  • Delta encoding: Efficient LSP semantic token encoding reduces payload size

Security Assessment

No security concerns identified:

  • Input validation through existing LSP framework
  • No user input directly used in system calls
  • Thread-safe operations prevent race conditions
  • Proper resource cleanup prevents resource exhaustion

Test Coverage Gaps

The PR mentions a test plan but no actual tests are included:

## Test plan
- [ ] `go test ./pkg/lsp/... -race` — no data races on debounce timer map
- [ ] Open a large SQL file in a Neovim/VSCode LSP client — diagnostics should lag 300ms behind rapid typing, not fire per keystroke
- [ ] LSP client that supports semantic tokens should receive keyword highlighting

Recommended additions:

  1. Unit tests for debouncing behavior with concurrent access
  2. Tests verifying semantic token encoding matches LSP specification
  3. Integration tests with mock LSP clients

Code Quality

Excellent overall quality:

  • Clear, self-documenting code
  • Consistent error handling patterns
  • Good separation of concerns
  • Comprehensive documentation

Recommendations

  1. Make debounce delay configurable:

    type Handler struct {
        // ... existing fields
        debounceDelay time.Duration
    }
  2. Add debug logging for tokenization errors:

    if err != nil {
        h.server.Logger().Printf("Semantic tokenization failed for %s: %v", params.TextDocument.URI, err)
        return &SemanticTokens{Data: []uint32{}}, nil
    }
  3. Add comprehensive tests covering the scenarios mentioned in the test plan

  4. Consider exposing semantic token statistics for debugging/monitoring

Conclusion

This is a high-quality implementation that significantly enhances the LSP server's capabilities. The code demonstrates excellent understanding of both LSP protocol requirements and Go concurrency patterns. The diagnostic debouncing will greatly improve user experience during rapid editing, and semantic tokens will enhance IDE syntax highlighting.

Recommendation: ✅ Approve - This PR is ready for merge with the minor suggestions above addressed in a follow-up if desired.



@ajitpratap0 ajitpratap0 merged commit e664751 into main Mar 19, 2026
20 checks passed


Successfully merging this pull request may close these issues.

P2-6: LSP incremental text sync + semantic tokens
