Refactor/agent #46

Conversation
```js
// NOT relative to this config file location
content: [
  "./internal/web/**/*.templ",
  "./internal/web/**/*.go",
```
Including ./internal/web/**/*.go in the content array may lead to performance degradation and missed class extraction, as Go files typically do not contain Tailwind classes directly. Consider restricting the content paths to files where Tailwind classes are actually used, such as .templ, .html, .js, or .ts files, to optimize build performance and ensure accurate class extraction.
```go
if err := session.SaveCurrentSessionID(newSess.ID); err != nil {
	slog.Warn("failed to save session state", "error", err)
}
```
If session.SaveCurrentSessionID(newSess.ID) fails, the error is only logged as a warning, but the function still returns the new session ID as if it was successfully saved. This can lead to inconsistencies if the session ID is not persisted, potentially causing confusion or loss of session continuity in future runs.
Recommendation: Consider returning an error if saving the session ID fails, or otherwise ensure the caller is aware of the failure so it can be handled appropriately.
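A minimal sketch of that recommendation, using hypothetical names (createSession, saveCurrentSessionID) rather than the project's actual API:

```go
package main

import (
	"errors"
	"fmt"
)

// saveCurrentSessionID is a hypothetical stand-in for session.SaveCurrentSessionID,
// declared as a variable so the failure path can be exercised.
var saveCurrentSessionID = func(id string) error {
	return errors.New("disk full")
}

// createSession returns the new session ID but surfaces a persistence failure
// to the caller as an error instead of only logging a warning.
func createSession(id string) (string, error) {
	if err := saveCurrentSessionID(id); err != nil {
		// Still return the ID so the caller can decide whether to proceed,
		// but make the failure visible.
		return id, fmt.Errorf("session created but not persisted: %w", err)
	}
	return id, nil
}

func main() {
	id, err := createSession("sess-123")
	fmt.Println(id, err)
}
```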
```go
ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
defer cancel()
```
The context created with signal.NotifyContext is canceled via defer cancel(), which will execute after runCLI returns. However, if the Bubble Tea TUI program (program.Run()) is still running or has background operations tied to the context, canceling the context at this point may cause premature shutdown of resources or incomplete cleanup.
Recommendation: Ensure that context cancellation aligns precisely with the lifecycle of the TUI program and all dependent resources. Consider whether explicit context management is needed to avoid premature cancellation.
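One way to sketch the suggested explicit lifecycle management, with runTUI as a hypothetical stand-in for program.Run():

```go
package main

import (
	"context"
	"fmt"
)

// runTUI is a hypothetical stand-in for program.Run(): it must see a live
// context for its whole lifetime and returns only when the TUI exits.
func runTUI(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err // context was canceled before or during the run
	}
	return nil
}

// runCLI cancels the context only after runTUI has fully returned, so no
// background work tied to ctx is torn down mid-run.
func runCLI() error {
	ctx, cancel := context.WithCancel(context.Background())
	err := runTUI(ctx)
	cancel() // explicit cancellation after Run returns, not via defer further up the stack
	return err
}

func main() {
	fmt.Println(runCLI())
}
```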
```diff
@@ -85,7 +95,8 @@ func findOrBuildKoopa(t *testing.T) string {
 	// Get project root (parent of cmd/)
 	projectRoot, _ := filepath.Abs("..")
```
Unchecked error from filepath.Abs
The call to filepath.Abs("..") ignores the returned error, which could result in an invalid projectRoot if path resolution fails. This may cause subsequent file operations to behave unpredictably or fail silently.
Recommended solution:
Handle the error explicitly and fail the test if the project root cannot be determined:
```go
projectRoot, err := filepath.Abs("..")
require.NoError(t, err, "Failed to determine project root")
```

```go
if len(os.Args) < 2 {
	runHelp()
	return nil
}
```
When no command is provided (len(os.Args) < 2), the application prints help and returns nil. This may lead to ambiguity for automation or scripting, as the process exits with a success code despite no command being executed. Consider returning a non-nil error (e.g., fmt.Errorf("no command provided")) to signal that the invocation was incomplete.
Recommended change:
```go
if len(os.Args) < 2 {
	runHelp()
	return fmt.Errorf("no command provided")
}
```

```go
func ResetFlowForTesting() {
	flowOnce = sync.Once{}
	flow = nil
	flowInitDone = false
}
```
Potential Data Race in Test Reset
The ResetFlowForTesting function is not safe for concurrent use, as noted in the comment. If tests are run in parallel and this function is called concurrently, it could lead to data races and unpredictable behavior.
Recommendation:
Consider adding synchronization (e.g., a mutex) around the singleton reset logic, or ensure that tests using this function are always run serially. Alternatively, document this limitation more prominently and enforce serial test execution where this function is used.
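A sketch of the mutex-based option, with illustrative package-level state mirroring the flow singleton (names are not the project's actual API):

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical singleton state mirroring the flow package.
var (
	flowMu       sync.Mutex // guards the reset against concurrent use in parallel tests
	flowOnce     sync.Once
	flow         *struct{}
	flowInitDone bool
)

// ResetFlowForTesting resets the singleton under a mutex so concurrent
// callers cannot race on the shared state.
func ResetFlowForTesting() {
	flowMu.Lock()
	defer flowMu.Unlock()
	flowOnce = sync.Once{}
	flow = nil
	flowInitDone = false
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ResetFlowForTesting()
		}()
	}
	wg.Wait()
	fmt.Println("reset completed without a race")
}
```

Note that this only serializes the reset itself; readers of the singleton would need the same lock (or serial test execution) to be fully race-free.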
```go
func TestChatAgent_RAGIntegration_EndToEnd(t *testing.T) {
	framework, cleanup := SetupTest(t)
	defer cleanup()

	ctx := context.Background()

	// Ensure RAG is enabled
	require.Greater(t, framework.Config.RAGTopK, 0, "RAG must be enabled for this test")

	// STEP 1: Index test document using DocStore
	docID := uuid.New()
	testContent := "The secret password is KOOPA_TEST_123. This is a unique test string."
	framework.IndexDocument(t, testContent, map[string]any{
		"id":          docID.String(),
		"filename":    "test-rag-doc.txt",
		"source_type": rag.SourceTypeFile,
	})
	t.Logf("Indexed document %s with test content", docID)

	// STEP 2: Query should trigger RAG retrieval
	invCtx, sessionID, branch := newInvocationContext(ctx, framework.SessionID)
	resp, err := framework.Agent.ExecuteStream(invCtx, sessionID, branch,
		"What is the secret password from the test document?",
		false,
		nil,
	)

	require.NoError(t, err, "Query with RAG should succeed")
	assert.NotEmpty(t, resp.FinalText, "Response should not be empty")

	// STEP 3: Verify LLM response uses retrieved context
	assert.Contains(t, resp.FinalText, "KOOPA_TEST_123",
		"Response should contain the password from retrieved document")
	t.Logf("RAG response: %s", resp.FinalText)
}
```
Test Isolation Risk
The test TestChatAgent_RAGIntegration_EndToEnd (and others in this file) indexes documents but does not clear the knowledge base before or after the test. This can lead to test flakiness if documents from previous tests persist and interfere with the current test's retrieval results.
Recommendation:
Ensure that the knowledge base is cleared before each test, or use a unique, isolated test environment per test run. For example, if such a method exists, add this at the start of each test:

```go
framework.ClearKnowledgeBase(t)
```
```go
assert.Contains(t, resp.FinalText, "KOOPA_TEST_123",
	"Response should contain the password from retrieved document")
```
Brittle Assertion on LLM Output
The assertion assert.Contains(t, resp.FinalText, "KOOPA_TEST_123", ...) assumes that the LLM will always echo the exact string from the indexed document. LLMs may paraphrase or omit details, causing false negatives even if retrieval works correctly.
Recommendation:
Consider relaxing the assertion or supplementing it with additional checks, such as verifying that the retrieved context was included in the prompt sent to the LLM, or using a more robust matching strategy.
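One possible robust matching strategy, sketched as a standalone helper (containsToken is a hypothetical name): normalize case and separators before comparing, so that paraphrases like "Koopa Test 123" still match.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// containsToken reports whether text contains token, ignoring case and any
// non-alphanumeric separators the LLM may have inserted or altered.
func containsToken(text, token string) bool {
	norm := func(s string) string {
		// Lowercase, then strip everything that is not a letter or digit.
		return regexp.MustCompile(`[^a-z0-9]+`).ReplaceAllString(strings.ToLower(s), "")
	}
	return strings.Contains(norm(text), norm(token))
}

func main() {
	fmt.Println(containsToken("The password is Koopa_Test-123.", "KOOPA_TEST_123")) // true
	fmt.Println(containsToken("I cannot find it.", "KOOPA_TEST_123"))               // false
}
```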
```go
func TestChatAgent_StreamingContextCancellation(t *testing.T) {
	framework, cleanup := SetupTest(t)
	defer cleanup()

	ctx, cancel := context.WithCancel(context.Background())
	_, sessionID, branch := newInvocationContext(context.Background(), framework.SessionID)
	chunks := 0

	callback := func(callbackCtx context.Context, chunk *ai.ModelResponseChunk) error {
		chunks++
		t.Logf("Received chunk %d", chunks)

		// Cancel after first chunk
		if chunks == 1 {
			cancel()
		}
		return nil
	}

	resp, err := framework.Agent.ExecuteStream(ctx, sessionID, branch,
		"Write a very long story",
		false,
		callback,
	)

	// Should fail with context canceled error
	if err != nil {
		assert.Contains(t, err.Error(), "context canceled",
			"Should fail with context canceled error")
		t.Logf("Context cancellation detected: %v", err)
	} else {
		// If somehow completed despite cancellation, that's unexpected but not wrong
		assert.NotNil(t, resp)
		t.Logf("Completed despite cancellation (unexpected): %s", resp.FinalText)
	}
}
```
Potential race condition in context cancellation test
In TestChatAgent_StreamingContextCancellation, the test cancels the context after the first chunk and expects an error. However, it does not assert that only one chunk was received. If the agent implementation is slow to respond to cancellation, more than one chunk could be processed, leading to flaky or unreliable test results.
Recommended solution:
Add an assertion after the test to verify that chunks == 1, ensuring that only one chunk was processed before cancellation:

```go
assert.Equal(t, 1, chunks, "Should have received only one chunk before cancellation")
```

This will make the test more robust and reliable.
```go
func TestChatAgent_StreamingCallbackError(t *testing.T) {
	framework, cleanup := SetupTest(t)
	defer cleanup()

	ctx, sessionID, branch := newInvocationContext(context.Background(), framework.SessionID)
	chunks := 0
	maxChunks := 3

	// Callback that errors after 3 chunks
	callback := func(ctx context.Context, chunk *ai.ModelResponseChunk) error {
		chunks++
		t.Logf("Received chunk %d: %d bytes", chunks, len(chunk.Text()))

		if chunks >= maxChunks {
			return errors.New("simulated streaming error after 3 chunks")
		}
		return nil
	}

	resp, err := framework.Agent.ExecuteStream(ctx, sessionID, branch,
		"Write a long story about a space adventure",
		false,
		callback,
	)

	// Stream should stop and propagate the error
	require.Error(t, err, "Should propagate streaming callback error")
	assert.Contains(t, err.Error(), "simulated streaming error",
		"Error should contain callback error message")
	assert.Equal(t, maxChunks, chunks,
		"Should have received exactly maxChunks chunks before error")

	// Response might be nil or partial depending on error handling
	if resp != nil {
		t.Logf("Partial response received before error: %s", resp.FinalText)
	} else {
		t.Log("No response returned (expected when streaming errors)")
	}
}
```
Incomplete verification of stream termination after callback error
In TestChatAgent_StreamingCallbackError, the test checks that the error is propagated and the number of chunks matches maxChunks. However, it does not explicitly verify that no further chunks are processed after the error. If the agent does not properly halt streaming after a callback error, this could go undetected.
Recommended solution:
After the test, add a check to ensure that no chunks are processed after the error. For example, you could set a flag in the callback when the error is returned and assert that the callback is not invoked again after that point.
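The flag-based check can be sketched in isolation; runStream is a hypothetical stand-in for the streaming loop inside ExecuteStream, with haltOnError modeling whether the stream correctly stops once the callback has failed:

```go
package main

import (
	"errors"
	"fmt"
)

// runStream simulates the streaming loop: the callback errors on the third
// chunk and records whether it is ever invoked again afterwards.
func runStream(haltOnError bool) (chunks int, calledAfterError bool) {
	maxChunks := 3
	errReturned := false

	callback := func(chunk string) error {
		if errReturned {
			calledAfterError = true // the stream kept going after the error
		}
		chunks++
		if chunks >= maxChunks {
			errReturned = true
			return errors.New("simulated streaming error")
		}
		return nil
	}

	// Stand-in for the provider's chunk loop.
	for _, chunk := range []string{"a", "b", "c", "d", "e"} {
		if err := callback(chunk); err != nil && haltOnError {
			break // a well-behaved stream stops here
		}
	}
	return chunks, calledAfterError
}

func main() {
	fmt.Println(runStream(true))  // well-behaved stream: stops at the error
	fmt.Println(runStream(false)) // buggy stream: the flag exposes extra calls
}
```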
- Remove internal/agent/session_id.go (SessionID was just a string wrapper)
- Move History type from agent to session package
- Update all session-related APIs to use uuid.UUID directly
- Simplify code by removing unnecessary type conversions

Force-pushed fb8f56d to e0bda0b
```sql
DROP TABLE IF EXISTS artifact;
DROP TABLE IF EXISTS message;
DROP TABLE IF EXISTS sessions;
DROP TABLE IF EXISTS documents;
```
Dropping tables may leave behind dependent objects such as views or standalone sequences that were created alongside them (indexes are removed automatically with their table). This can result in orphaned objects and schema inconsistencies after rollback.
Recommended solution:
Explicitly drop any dependent objects (e.g., views, sequences) associated with these tables before or after dropping the tables. For example:

```sql
DROP INDEX IF EXISTS some_index;
DROP SEQUENCE IF EXISTS some_sequence;
DROP VIEW IF EXISTS some_view;
```
```diff
 -- name: DeleteMessagesByBranch :exec
 -- Delete all messages in a specific branch
-DELETE FROM session_messages
+DELETE FROM message
 WHERE session_id = sqlc.arg(session_id) AND branch = sqlc.arg(branch);
```
Potential Data Inconsistency: DeleteMessagesByBranch Does Not Update message_count
The DeleteMessagesByBranch query (lines 98-102) removes messages from a branch but does not update the message_count field in the sessions table. This can result in the message_count becoming stale or incorrect, leading to inconsistencies in the application logic that relies on this field.
Recommendation:
After deleting messages, ensure that message_count is recalculated and updated accordingly, ideally within the same transaction. For example:

```sql
UPDATE sessions
SET message_count = (SELECT COUNT(*) FROM message WHERE session_id = sqlc.arg(session_id))
WHERE id = sqlc.arg(session_id);
```

This will keep the session metadata consistent with the actual message data.
```go
callback := func(ctx context.Context, chunk *ai.ModelResponseChunk) error {
	chunks++
	t.Logf("Received chunk %d: %d bytes", chunks, len(chunk.Text()))
	return nil // Continue streaming
}
```
Streaming Callback Error Propagation Not Tested
In TestExecuteStream_StreamingCallback, the callback always returns nil, so the test does not verify that errors returned from the callback are properly propagated and handled by ExecuteStream. This could mask issues where streaming errors are not surfaced to the caller.
Recommended Solution:
Add a subtest where the callback returns a non-nil error and assert that ExecuteStream returns the error as expected. For example:

```go
t.Run("callback error", func(t *testing.T) {
	errCallback := func(ctx context.Context, chunk *ai.ModelResponseChunk) error {
		return errors.New("streaming error")
	}
	resp, err := framework.Agent.ExecuteStream(ctx, sessionID, branch, "test", false, errCallback)
	require.Error(t, err)
	assert.Contains(t, err.Error(), "streaming error")
	assert.Nil(t, resp)
})
```

```go
	})
}

// TestChat_NilResponseDefense tests defensive check for nil responses.
// This prevents panics when execute() incorrectly returns nil without error.
func TestChat_NilResponseDefense(t *testing.T) {
	t.Parallel()

	t.Run("nil response detection", func(t *testing.T) {
		t.Parallel()
		// Simulate the defensive check in ExecuteStream
		var resp *ai.ModelResponse = nil
		if resp == nil {
			err := errors.New("internal error: execute returned nil response without error")
			assert.Contains(t, err.Error(), "nil response")
		}
	})
}

// TestChat_MaxTurnsProtection tests that conversation doesn't loop infinitely.
// Safety: Prevents runaway agent loops that could exhaust resources.
func TestChat_MaxTurnsProtection(t *testing.T) {
```
Max Turns Protection Test Is Superficial
The test in TestChat_MaxTurnsProtection only simulates a local turn counter and does not interact with the actual agent loop or logic. This means it does not verify that the agent itself enforces the max turns protection, which could allow runaway loops in production if the agent's logic is incorrect.
Recommendation:
Integrate this test with the real agent loop or use a mock agent to ensure that the max turns protection is enforced in practice, not just in a simulated counter.
```go
	return Output{SessionID: input.SessionID}, fmt.Errorf("%w: %w", agent.ErrInvalidSession, err)
}
```
Error Wrapping Requires Go 1.20+
The error wrapping in fmt.Errorf("%w: %w", agent.ErrInvalidSession, err) uses two %w verbs. This is supported only on Go 1.20 and later; on earlier versions fmt.Errorf wraps a single error, and a second %w produces a malformed error value. If the module targets older Go versions, wrap one error and add the remaining context as a string.
Recommended Solution:
Use a single %w and add context as a string:

```go
return Output{SessionID: input.SessionID}, fmt.Errorf("invalid session: %w", err)
```

Or, if you want to wrap both errors, consider using errors.Join (Go 1.20+):

```go
return Output{SessionID: input.SessionID}, fmt.Errorf("invalid session: %w", errors.Join(agent.ErrInvalidSession, err))
```

```go
numConcurrentQueries := 5
var wg sync.WaitGroup
wg.Add(numConcurrentQueries)

ctx, sessionID, branch := newInvocationContext(context.Background(), framework.SessionID)
```
Potential Data Race in Concurrent Test
The concurrent test uses the same ctx, sessionID, and branch for all goroutines. If the underlying chat agent or session store is not explicitly designed for concurrent access with the same session, this may lead to data races or inconsistent state.
Recommendation:
Consider creating separate sessions or branches per goroutine, or ensure that the chat agent and session store are thread-safe for concurrent access with the same session.
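The per-goroutine isolation idea can be sketched as follows; isolatedBranches is an illustrative helper, not the project's API:

```go
package main

import (
	"fmt"
	"sync"
)

// isolatedBranches gives each goroutine its own branch identifier, so
// concurrent queries never contend on shared session state.
func isolatedBranches(n int) []string {
	var wg sync.WaitGroup
	branches := make([]string, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine writes only its own slot and uses its own name.
			branches[i] = fmt.Sprintf("branch-%d", i)
		}(i)
	}
	wg.Wait()
	return branches
}

func main() {
	fmt.Println(isolatedBranches(5))
}
```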