@alvinunreal
Owner

Implements a comprehensive knowledge base system that allows users to:

  • Create and manage markdown knowledge base files in ~/.config/tmuxai/kb/
  • Load/unload KBs dynamically with /kb commands
  • Load KBs via CLI flag: --kb docker,git
  • Auto-load KBs on startup via config
  • View loaded KB status with token counts
  • Inject loaded KBs into conversation context

Resolves #87

Changes:

  • Added KnowledgeBaseConfig to config with auto_load and custom path support
  • Created internal/knowledge_base.go with load/unload/list functions
  • Integrated LoadedKBs map into Manager struct
  • Added /kb commands: list, load, unload (including --all)
  • Updated /info to show loaded KB statistics
  • Modified message assembly to inject loaded KBs after system prompt
  • Added --kb CLI flag for loading KBs at startup
  • Updated README with comprehensive KB documentation
  • Added comprehensive test suite with 10 test cases

Co-authored-by: Alvin <alvinunreal@users.noreply.github.com>
@claude

claude bot commented Oct 18, 2025

Pull Request Review: Knowledge Base Feature

Overview

This PR implements a comprehensive knowledge base system for TmuxAI. The implementation is well-structured with good separation of concerns and includes extensive test coverage. Overall, this is a solid feature addition.

✅ Strengths

Code Quality

  • Clean architecture: Knowledge base logic is properly isolated in internal/knowledge_base.go
  • Good error handling: Comprehensive error messages with proper wrapping using %w
  • Consistent coding style: Follows Go conventions and existing codebase patterns
  • Well-documented: Extensive README documentation with clear examples

Test Coverage

  • Comprehensive tests: 10 test cases covering core functionality
  • Good edge case coverage: Tests for non-existent files, empty directories, and error conditions
  • Proper test isolation: Uses t.TempDir() for clean test environments

User Experience

  • Multiple loading methods: CLI flag, auto-load config, and runtime commands
  • Good feedback: Token counts and clear status messages
  • Intuitive commands: /kb, /kb load, /kb unload with --all flag

🐛 Potential Issues

1. Race Condition in GetKBDir() (config/config.go:189-205)

Severity: Medium

The GetKBDir() function calls Load() on every invocation, which could cause issues:

  • Performance overhead: Reloads entire config file on every KB operation
  • Potential inconsistency: Different calls might see different config values during concurrent operations
  • Used in hot paths like loadKB(), listKBs()

Recommendation: Pass the config directory path as a parameter or cache it in the Manager struct to avoid repeated file I/O.
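A minimal sketch of the caching idea, not the project's actual code: the `kbDirCache` type and its `resolve` callback are hypothetical stand-ins for whatever reads the config file. The path is resolved once and reused on later calls.

```go
package main

import (
	"fmt"
	"sync"
)

// kbDirCache resolves the KB directory once and reuses the result.
// resolve is a hypothetical stand-in for the code that reads the config file.
type kbDirCache struct {
	once    sync.Once
	resolve func() (string, error)
	dir     string
	err     error
}

// Dir returns the cached directory, calling resolve only on first use.
func (c *kbDirCache) Dir() (string, error) {
	c.once.Do(func() { c.dir, c.err = c.resolve() })
	return c.dir, c.err
}

// countResolves calls Dir n times and reports how often resolve actually ran.
func countResolves(n int) int {
	calls := 0
	c := &kbDirCache{resolve: func() (string, error) {
		calls++ // simulates one config file read
		return "/home/user/.config/tmuxai/kb", nil
	}}
	for i := 0; i < n; i++ {
		_, _ = c.Dir()
	}
	return calls
}

func main() {
	fmt.Println("config reads for 5 lookups:", countResolves(5))
}
```

One trade-off of this design: a cached path will not pick up config changes made while the program is running, which may or may not matter here.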

2. Duplicate Loading Not Prevented (cli/cli.go:62-65)

Severity: Low

When the --kb flag is used together with the auto-load config, KBs could be loaded twice. The CLI loads KBs, then NewManager() calls autoLoadKBs(), which may load the same KBs again.

Recommendation: Check for already-loaded KBs in loadKB() before loading (idempotent operation).
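One way the idempotency check could look, as a sketch only. The `Manager` shape, the `loadKB` signature, and the `read` callback are assumptions made for illustration and are not taken from the PR.

```go
package main

import "fmt"

// Manager is a hypothetical stand-in that keeps loaded KB contents by name.
type Manager struct {
	LoadedKBs map[string]string
}

// loadKB loads a KB only if it is not already present.
// It reports whether a load actually happened.
// read is a placeholder for the real file read.
func (m *Manager) loadKB(name string, read func(string) (string, error)) (bool, error) {
	if _, ok := m.LoadedKBs[name]; ok {
		return false, nil // already loaded: nothing to do
	}
	content, err := read(name)
	if err != nil {
		return false, fmt.Errorf("failed to load KB %q: %w", name, err)
	}
	m.LoadedKBs[name] = content
	return true, nil
}

// loadTwice loads the same KB twice and reports how many reads occurred
// and how many KBs ended up loaded.
func loadTwice(name string) (reads int, loaded int) {
	m := &Manager{LoadedKBs: map[string]string{}}
	read := func(string) (string, error) {
		reads++
		return "# notes", nil
	}
	for i := 0; i < 2; i++ {
		if _, err := m.loadKB(name, read); err != nil {
			panic(err)
		}
	}
	return reads, len(m.LoadedKBs)
}

func main() {
	reads, loaded := loadTwice("docker")
	fmt.Printf("reads=%d loaded=%d\n", reads, loaded)
}
```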

3. Missing Validation in /kb load Command (internal/chat_command.go:244-257)

Severity: Low

The command doesn't check whether the KB exists before attempting to load it. Pre-checking that the KB file exists would allow more specific error messages (e.g., "KB not found" rather than a generic read error).

⚡ Performance Considerations

1. Token Counting Overhead (internal/chat_command.go:224)

The /kb list command recalculates token counts every time. Consider caching token counts when loading KBs to avoid repeated calculations.
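A sketch of caching the count at load time, under stated assumptions: `loadedKB` and `estimateTokens` are hypothetical names, and the 4-bytes-per-token estimate is only a placeholder for the project's real token counter.

```go
package main

import "fmt"

// loadedKB stores a KB's content together with its token count,
// computed once when the KB is loaded.
type loadedKB struct {
	Content string
	Tokens  int
}

// estimateTokens is a hypothetical stand-in for the real token counter
// (here: roughly 4 bytes per token, rounded up).
func estimateTokens(s string) int {
	return (len(s) + 3) / 4
}

// newLoadedKB counts tokens once, at load time.
func newLoadedKB(content string) loadedKB {
	return loadedKB{Content: content, Tokens: estimateTokens(content)}
}

// totalTokens sums the cached counts without re-tokenizing any content.
func totalTokens(kbs map[string]loadedKB) int {
	total := 0
	for _, kb := range kbs {
		total += kb.Tokens
	}
	return total
}

func main() {
	kbs := map[string]loadedKB{
		"docker": newLoadedKB("Use multi-stage builds."),
		"git":    newLoadedKB("Prefer rebase for local cleanup."),
	}
	fmt.Println("total KB tokens:", totalTokens(kbs))
}
```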

2. KB Injection on Every Message (internal/process_message.go:53-60)

Every message processing injects all loaded KBs. This is expected behavior, but large KBs will significantly increase token usage. Consider documenting the token cost implications more prominently and adding a warning when loaded KBs exceed a certain token threshold.

🔒 Security Concerns

1. Path Traversal Vulnerability (internal/knowledge_base.go:15-17)

Severity: Medium - MUST FIX

The loadKB() function is vulnerable to directory traversal. A malicious user could potentially load files outside the KB directory, for example with /kb load ../../../etc/passwd.

Recommendation: Add input validation to prevent path traversal:

  • Validate name doesn't contain path separators
  • Ensure resolved path is within kbDir using filepath.Abs and string prefix check
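The two checks above could be sketched as follows. `resolveKBPath` and `accepts` are hypothetical names, and the `.md` suffix is an assumption based on the KB files being markdown. The sketch uses `filepath.Rel` for the containment check because a raw string prefix comparison can be fooled by sibling directories that share a prefix (for example `kb` and `kb-old`).

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveKBPath maps a KB name to a file inside kbDir and rejects
// names that could escape that directory.
func resolveKBPath(kbDir, name string) (string, error) {
	// Check 1: the name must be a plain file name, with no separators.
	if name == "" || strings.ContainsAny(name, `/\`) {
		return "", fmt.Errorf("invalid KB name %q: must not be empty or contain path separators", name)
	}
	base, err := filepath.Abs(kbDir)
	if err != nil {
		return "", fmt.Errorf("failed to resolve KB directory: %w", err)
	}
	full := filepath.Join(base, name+".md")
	// Check 2 (defense in depth): the resolved path must stay under base.
	rel, err := filepath.Rel(base, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("KB %q resolves outside the KB directory", name)
	}
	return full, nil
}

// accepts reports whether a name passes validation.
func accepts(name string) bool {
	_, err := resolveKBPath("kb", name)
	return err == nil
}

func main() {
	for _, name := range []string{"docker", "../../../etc/passwd", `..\secret`} {
		fmt.Printf("%q accepted: %v\n", name, accepts(name))
	}
}
```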

2. No File Size Limits (internal/knowledge_base.go:19)

Severity: Low

Reading KB files without size limits could cause memory issues. Add file size validation with a reasonable limit like 10MB.
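A sketch of a bounded read, assuming a hypothetical `readKBFile` helper and the 10MB figure suggested above. Reading through `io.LimitReader` with one extra byte detects oversized files without ever buffering more than the limit plus one byte.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// maxKBSize is an assumed limit, following the 10MB suggestion.
const maxKBSize = 10 << 20

// readKBFile reads at most limit bytes and fails if the file is larger.
func readKBFile(path string, limit int64) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, fmt.Errorf("failed to open KB file: %w", err)
	}
	defer f.Close()

	// Read one byte past the limit so an oversized file is detectable.
	data, err := io.ReadAll(io.LimitReader(f, limit+1))
	if err != nil {
		return nil, fmt.Errorf("failed to read KB file: %w", err)
	}
	if int64(len(data)) > limit {
		return nil, fmt.Errorf("KB file exceeds the %d-byte limit", limit)
	}
	return data, nil
}

// tryRead writes a temporary file of the given size and reads it back
// under the given limit, returning any error from the read.
func tryRead(size int, limit int64) error {
	dir, err := os.MkdirTemp("", "kb-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "notes.md")
	if err := os.WriteFile(path, make([]byte, size), 0o644); err != nil {
		return err
	}
	_, err = readKBFile(path, limit)
	return err
}

func main() {
	fmt.Println("small file error:", tryRead(10, maxKBSize))
	fmt.Println("oversized file error:", tryRead(100, 50))
}
```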

📋 Test Coverage Analysis

Excellent Coverage: Load/unload operations, non-existent files, empty directories, CLI loading, auto-loading, token counting

Missing Tests:

  • Concurrent KB loading/unloading
  • Loading the same KB twice (idempotency)
  • KB names with special characters or path separators (security test)
  • Very large KB files
  • Custom KB paths in config

📝 Minor Suggestions

  1. Error Message Consistency: Avoid exposing internal paths in error messages
  2. Magic String for KB Format: Make the KB header format a constant
  3. Sorted KB Listing: Sort KB list alphabetically for consistency

🎯 Recommendations Priority

Must Fix (Before Merge):

  1. Path traversal vulnerability - Add input validation to loadKB()

Should Fix (Before Merge):

  1. File size limits - Prevent memory exhaustion from large files
  2. GetKBDir() performance - Cache config path in Manager

Nice to Have (Can be follow-up):

  1. Token count caching for /kb list performance
  2. Duplicate loading prevention
  3. Additional security tests
  4. Sort KB listings alphabetically

✅ Final Verdict

This is a well-implemented feature with good code quality and test coverage. The main concern is the path traversal vulnerability which should be addressed before merging. Once the security issue is fixed, this PR will be ready to merge.

Great work on this feature! The knowledge base system is a valuable addition to TmuxAI. The code is clean, well-tested, and user-friendly.

@alvinunreal alvinunreal force-pushed the claude/issue-87-20251018-1735 branch from 2bb67c7 to 91f303d Compare October 18, 2025 18:49
@claude

claude bot commented Oct 18, 2025

Pull Request Review: Knowledge Base Feature

Overview

This PR implements a comprehensive knowledge base system with CLI flags, auto-loading, and dynamic management. The implementation is well-structured and follows good Go practices.

Code Quality & Best Practices

Strengths:

  1. Clean separation of concerns - KB logic properly isolated in internal/knowledge_base.go
  2. Consistent error handling with proper error wrapping
  3. Good logging throughout
  4. Well-structured configuration design
  5. Excellent UX with checkboxes, token counts, and clear status messages
  6. Comprehensive README documentation

Areas for Improvement:

1. config.GetKBDir() performance issue (config.go:189-205)
This function calls Load() every time it's invoked, re-reading and parsing the YAML config file. This is called from loadKB(), listKBs(), and potentially multiple times per user action. Should pass config as parameter or cache it.

2. Path traversal vulnerability (knowledge_base.go:15-22)
The name parameter is user-controlled. A user could provide: tmuxai --kb ../../etc/passwd
Fix: Add validation to ensure name doesn't contain path separators before filepath.Join

3. Non-deterministic KB ordering (process_message.go:54-60)
Go maps have non-deterministic iteration order. KBs may be injected in different orders across runs, potentially affecting AI behavior. Consider sorting KB names before iteration.
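A small sketch of deterministic iteration, assuming the loaded KBs live in a `map[string]string` keyed by name (the helper name `sortedKBNames` is hypothetical):

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKBNames returns the map keys in alphabetical order so that
// callers can iterate over loaded KBs deterministically.
func sortedKBNames(kbs map[string]string) []string {
	names := make([]string, 0, len(kbs))
	for name := range kbs {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	loaded := map[string]string{
		"git":    "git notes",
		"docker": "docker notes",
		"k8s":    "k8s notes",
	}
	// Inject KBs in a stable order instead of ranging over the map directly.
	for _, name := range sortedKBNames(loaded) {
		fmt.Printf("injecting %s (%d bytes)\n", name, len(loaded[name]))
	}
}
```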

Security Concerns

HIGH: Path traversal vulnerability
Add validation in loadKB():
if strings.ContainsAny(name, "/\\") {
    return fmt.Errorf("invalid KB name: cannot contain path separators")
}

MEDIUM: Unrestricted file size
os.ReadFile loads entire file into memory without limits. Add max file size check to prevent OOM.

LOW: Silent directory creation failure
config.go:202 ignores MkdirAll errors, leading to confusing failures later.
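A sketch of surfacing the error instead of ignoring it. `ensureKBDir` and `blockedByFile` are hypothetical helpers. The demo forces a failure by placing a regular file where a directory is expected.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureKBDir creates the KB directory and reports any failure
// instead of silently discarding it.
func ensureKBDir(dir string) error {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return fmt.Errorf("failed to create KB directory %s: %w", dir, err)
	}
	return nil
}

// blockedByFile tries to create a directory beneath a regular file,
// which MkdirAll cannot do, and returns the resulting error.
func blockedByFile() error {
	tmp, err := os.MkdirTemp("", "kb-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(tmp)

	blocker := filepath.Join(tmp, "kb")
	if err := os.WriteFile(blocker, []byte("x"), 0o644); err != nil {
		panic(err)
	}
	return ensureKBDir(filepath.Join(blocker, "nested"))
}

func main() {
	if err := blockedByFile(); err != nil {
		fmt.Println("reported early:", err)
	}
}
```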

Test Coverage

Current: 4 tests (load non-existent, unload, unload non-loaded, token counting)

Missing critical tests:

  • TestLoadKBSuccess - Actually loading a valid .md file
  • TestListKBs - Listing available KBs
  • TestAutoLoadKBs - Auto-loading from config
  • Path traversal attack prevention
  • Large file handling
  • Unicode/special characters in names

Test quality issue: Line 16 uses 0755 instead of 0o755 (inconsistent with codebase)

Performance Issues

  1. config.GetKBDir() re-reads config file on every call (HIGH impact)
  2. Every KB injected into chat history on each message - could be significant token cost
  3. No warning when KB token usage is high

Summary

Priority Fixes:

  1. HIGH: Path traversal vulnerability
  2. HIGH: config.GetKBDir() re-reads the config file on every call
  3. MEDIUM: File size limits
  4. MEDIUM: Deterministic KB ordering
  5. MEDIUM: Expand test coverage

Overall: Solid implementation with excellent documentation and UX. Main concerns are security (path traversal) and performance (config reloading). With suggested fixes, ready to merge.

Estimated fix effort: 2-4 hours

Great work on the feature! Documentation is particularly well done.

@claude

claude bot commented Oct 18, 2025

Code Review: Knowledge Base Feature Implementation

Overall Assessment

This is a well-implemented feature that adds knowledge base support to TmuxAI. The code is clean, follows Go best practices, and includes comprehensive documentation.

High Priority Issues

1. Token count not included in context calculation

  • Location: internal/squash.go:14-23 and internal/process_message.go:53-60
  • Issue: needSquash() only counts message tokens, but KB tokens are injected into context
  • Impact: Could cause unexpected context overflows
  • Fix: Update needSquash() to include m.getTotalLoadedKBTokens()
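To illustrate the gap, here is a sketch with an assumed signature. The real needSquash() is not shown in this conversation, so the parameters, the threshold ratio, and the numbers below are all hypothetical.

```go
package main

import "fmt"

// needSquash reports whether the context should be squashed.
// Hypothetical signature: the point is only that KB tokens are added
// to message tokens before comparing against the budget.
func needSquash(messageTokens, kbTokens, maxContext int, threshold float64) bool {
	used := messageTokens + kbTokens
	return float64(used) > float64(maxContext)*threshold
}

func main() {
	const maxContext = 10000
	const threshold = 0.8

	// Counting messages only: 7000 is under the 8000-token budget.
	fmt.Println("messages only:", needSquash(7000, 0, maxContext, threshold))
	// Including 2000 KB tokens: 9000 exceeds the budget.
	fmt.Println("with KB tokens:", needSquash(7000, 2000, maxContext, threshold))
}
```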

2. Path traversal vulnerability

  • Location: internal/knowledge_base.go:16
  • Issue: Malicious filename like ../../../etc/passwd.md could escape KB directory
  • Fix: Add validation to reject KB names with path separators and verify resolved path is within kbDir

Medium Priority

3. File size limits

  • No upper bound on KB file size could lead to DoS via memory exhaustion
  • Suggestion: Add reasonable file size limit (e.g., 1MB)

4. Test coverage gaps

  • Missing tests for: listKBs(), autoLoadKBs(), LoadKBsFromCLI()
  • No integration test for KB injection into message flow
  • No test for loading multiple KBs or unload --all

5. Performance optimizations

  • Token counting is repeated multiple times for same KB content
  • Consider caching token counts alongside KB content
  • KB injection creates array allocations on every message

Low Priority

6. Race condition possibility

  • LoadedKBs map accessed without synchronization
  • Add mutex or document non-thread-safety
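If the mutex route is chosen, a sketch might look like this. The `kbStore` type is hypothetical: it wraps the map so that every access goes through a `sync.RWMutex`.

```go
package main

import (
	"fmt"
	"sync"
)

// kbStore guards the loaded-KB map with a read/write mutex.
type kbStore struct {
	mu  sync.RWMutex
	kbs map[string]string
}

func newKBStore() *kbStore {
	return &kbStore{kbs: make(map[string]string)}
}

// Load stores a KB under the write lock.
func (s *kbStore) Load(name, content string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.kbs[name] = content
}

// Unload removes a KB under the write lock.
func (s *kbStore) Unload(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.kbs, name)
}

// Len returns the number of loaded KBs under the read lock.
func (s *kbStore) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.kbs)
}

// concurrentLoads loads n distinct KBs from n goroutines and
// returns how many ended up in the store.
func concurrentLoads(n int) int {
	s := newKBStore()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Load(fmt.Sprintf("kb-%d", i), "content")
		}(i)
	}
	wg.Wait()
	return s.Len()
}

func main() {
	fmt.Println("KBs loaded concurrently:", concurrentLoads(50))
}
```

Without the lock, concurrent writes to a plain Go map are a data race and can crash the program at runtime, so the alternative of documenting non-thread-safety only works if all access stays on one goroutine.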

7. Sorting and UX

  • Consider sorting KB list alphabetically for consistent output
  • GetKBDir() loads config on every call - could be optimized

Positive Notes

✅ Clean separation of concerns
✅ Excellent README documentation with examples
✅ Good error handling and user-friendly messages
✅ Nice autocomplete support for KB commands
✅ Helpful checkmark indicators and token counts
✅ Backward compatible - purely additive feature

Summary

Overall this is a solid implementation! Please address the two high-priority items (token counting in needSquash and path traversal validation) before merge. The rest are nice-to-haves that would improve robustness. Great work!

@alvinunreal alvinunreal merged commit bc39af6 into main Oct 18, 2025
2 checks passed