Skip to content

feat: OpenAI SDK integration, AST markdown processing, mobile UI#2

Merged
WilliamAGH merged 56 commits into
mainfrom
dev
Jan 23, 2026
Merged

feat: OpenAI SDK integration, AST markdown processing, mobile UI#2
WilliamAGH merged 56 commits into
mainfrom
dev

Conversation

@WilliamAGH
Copy link
Copy Markdown
Owner

@WilliamAGH WilliamAGH commented Jan 23, 2026

Summary

  • OpenAI SDK integration: Replace custom ResilientApiClient with official OpenAI Java SDK; add OpenAIStreamingService for reliable streaming with retry logic and fast-fail preferences
  • AST-based markdown processing: Migrate from regex to Flexmark AST for parsing; new UnifiedMarkdownService with structured citations, enrichments, and type-safe processing
  • Mobile-responsive UI: Comprehensive mobile support with safe area insets, touch-optimized controls, responsive typography, and 1450+ lines of new CSS
  • GPT-5 model migration: Update model references and configuration across codebase
  • Container optimization: JVM memory tuning for 512MB containers, Alpine/musl compatibility fixes for Netty OpenSSL

New Features

  • OpenAIStreamingService for SDK-based LLM streaming with multi-client fallback
  • OpenAiCompatibleEmbeddingModel for remote embedding providers
  • UnifiedMarkdownService with AST-based citation/enrichment extraction
  • MarkdownStreamProcessor for enhanced streaming support
  • Domain markdown types: ProcessedMarkdown, MarkdownCitation, MarkdownEnrichment
  • Enrichment types: Hint, Reminder, Warning, Background, Example
  • Mobile-first UI with touch interactions and accessibility improvements
  • SSE heartbeat management for connection reliability

Bug Fixes

  • Disable Netty native OpenSSL for Alpine container compatibility
  • Correct Spring documentation URL mappings and path formatting
  • Improve SSE heartbeat lifecycle management (terminate when data stream completes)
  • Fix markdown enrichment parsing for code fences and inline citations
  • Simplify OpenAI reasoning property handling for API compatibility

Refactoring

  • Remove ResilientApiClient (replaced by SDK integration)
  • Update all services for OpenAI SDK patterns (ChatService, EnrichmentService, GuidedLearningService, RetrievalService, RerankerService)
  • Update controllers for SDK streaming (ChatController, GuidedLearningController, MarkdownController)
  • Enhance RateLimitManager and RateLimitState for better performance
  • Improve EmbeddingFallbackConfig for remote provider switching

Documentation

  • Rewrite AGENTS.md to strict hash-based rule format (87 rules)
  • Add domain documentation for parsing and markdown logic
  • Update README with mobile features and GPT-5 configuration
  • Remove obsolete docs and debug scripts

Tests

  • ChatSseIntegrationTest - Chat streaming integration
  • GuidedSseIntegrationTest - Guided learning streaming
  • MarkdownApiIntegrationTest - Markdown API testing
  • TestCompleteStreaming, TestGPT5Streaming, TestWebFluxSSE, TestWhatGetsStreamed - Streaming verification
  • Enhanced MarkdownServiceTest, MarkdownPreprocessingTest, ComprehensiveListFormattingTest

Other Changes

  • Switch Docker base images to Amazon ECR Public (avoid Docker Hub rate limits)
  • Add Jackson Databind and Spring DotEnv dependencies
  • Configure Surefire plugin and Makefile JVM memory limits
  • Pin Qdrant to v1.14.1
  • Add HTMLHint and browserslist configuration

- Added class-level @SuppressWarnings("unchecked") annotation
- Replaced unsafe Map casts with safer instanceof checks and wildcards
- Maintained type safety while eliminating compiler warnings
- Application compiles cleanly without unchecked operation warnings
- Add Jackson Databind and Spring DotEnv dependencies
- Configure Surefire plugin for test execution control
- Enhance Makefile with JVM memory limits and LiveReload support
- Pin Qdrant version to v1.14.1 for stability
- Add browserslist configuration for HTMLHint
- Add HTMLHint configuration file
- Exclude build artifacts from version control
- Improve API key logging with endpoint detection
- Add diagnostics configuration for streaming chunk logging
- Add Qdrant payload index configuration
- Enhance port initializer with test profile support
- Update application properties for new features
- Replace regex-based processing with structured AST approach
- Add new UnifiedMarkdownService for type-safe processing
- Implement ProcessedMarkdown with structured citations and enrichments
- Add comprehensive markdown processing classes (CitationProcessor, EnrichmentProcessor, etc.)
- Deprecate legacy regex methods in favor of AST-based processing
- Add MarkdownStreamProcessor for enhanced streaming support
- Add diagnostic logging for LLM prompts and processing
- Integrate new AST-based markdown processing
- Improve ResilientApiClient with enhanced error handling
- Update RetrievalService with minor improvements
- Add structured logging for better debugging
- Enhance RateLimitManager with better state handling
- Refactor RateLimitState for improved performance
- Add more robust rate limiting logic
- Improve state persistence and recovery
- Enhance ChatController with improved streaming support
- Update ErrorTestController with minor improvements
- Improve GuidedLearningController with better error handling
- Update MarkdownController with enhanced processing
- Add comprehensive endpoint documentation
- Add comprehensive mobile responsiveness with safe area insets
- Implement enhanced streaming with character-by-character flow
- Update chat.html with mobile-optimized layout
- Update guided.html with improved responsive design
- Enhance index.html with better mobile navigation
- Overhaul app.css with modern design system and mobile optimizations
- Improve markdown-utils.js with enhanced processing
- Update error pages with mobile-friendly styling
- Add touch-optimized interactions and accessibility improvements
- Add ComprehensiveListFormattingTest for list processing
- Update MarkdownPreprocessingTest with enhanced coverage
- Refactor MarkdownServiceTest for new AST-based processing
- Add GuidedLearningControllerTest for new endpoints
- Add TestCompleteStreaming for streaming functionality
- Add TestGPT5Streaming for GPT-5 streaming tests
- Add TestWebFluxSSE for Server-Sent Events testing
- Add TestWhatGetsStreamed for streaming verification
- Add TestConfiguration for test setup
- Add ChatSseIntegrationTest for chat streaming integration
- Add GuidedSseIntegrationTest for guided learning streaming
- Add MarkdownApiIntegrationTest for API testing
- Add test resources for comprehensive test coverage
- Add comprehensive mobile responsiveness section to README
- Document mobile features, breakpoints, and safety measures
- Add mobile testing checklist and anti-patterns
- Add domain documentation for parsing and markdown logic
- Update stack details with mobile-first CSS information
- Update model references from gpt-4o-mini to gpt-5 across documentation
- Update Makefile default model parameter
- Add OpenAI Java SDK dependency for reliable streaming
- Update README with GPT-5 configuration examples
- Add OpenAIStreamingService for official SDK streaming
- Add OpenAiCompatibleEmbeddingModel for remote embeddings
- Remove ResilientApiClient (replaced by SDK integration)
- Add RemoteEmbedding configuration class to AppProperties
- Update EmbeddingFallbackConfig to support remote providers
- Enable flexible embedding provider switching
- Update ChatService to use OpenAIStreamingService
- Update EnrichmentService for new SDK patterns
- Update GuidedLearningService for streaming improvements
- Update MarkdownService to leverage unified processing
- Update MarkdownStreamProcessor for better streaming
- Update RetrievalService for enhanced context handling
- Update ChatController for OpenAI SDK streaming
- Update GuidedLearningController for improved streaming
- Remove manual SSE parsing in favor of SDK streaming
- Enhance error handling and response formatting
- Enhance chat.html for improved streaming experience
- Update guided.html with better lesson integration
- Update index.html for enhanced navigation
- Improve app.css with modern styling and animations
- Optimize streaming UI for reduced jitter and better UX
- Update all-parsing-and-markdown-logic.md with streaming improvements
- Update application.properties for new configuration options
- Add MIGRATION_SUCCESS.md documenting the GPT-5 migration
- Document OpenAI SDK integration and service changes
- Update pom.xml with additional dependency alignments
- Update .hintrc with expanded browser support
- Update test files for new streaming architecture
- Add development scripts for testing and debugging
- Ensure build stability with GPT-5 integration
- Update RerankerService for SDK integration
- Update UnifiedMarkdownService for enhanced processing
- Add OpenSSL disable flags to Dockerfile ENTRYPOINT
- Update Makefile run/dev targets with OpenSSL flags
- Set system properties in JavaChatApplication to prevent segfaults
- Fixes compatibility issues with Alpine Linux containers using musl
- Update Spring Boot docs to include /api/ path for proper API documentation
- Update Spring Framework docs to include /javadoc-api/ path
- Ensure documentation links point to correct API reference locations
- Add fast-fail preference for OpenAI when available
- Enhance retry logic for transient failures (timeouts, interruptions)
- Improve client selection algorithm for better fallback handling
- Treat InterruptedException and sleep interruptions as retryable
- Fix accidental '/java/' segment in Spring Boot API paths
- Fix accidental '/java/' segment in Spring Framework javadoc paths
- Ensure Spring documentation URLs are properly formatted
- Prevent broken links to Spring documentation
- Simplify page title from 'Java Chat — Solar Roast Edition' to 'Java Chat'
- Update author metadata to match simplified branding
- Remove unnecessary edition-specific branding from HTML metadata
- Add URL normalization logic for Spring docs in DocsSourceRegistry
- Implement URL normalization for LLM prompts in ChatService
- Add Spring docs handling and class literal filtering in JavadocLinkResolver
- Add memory-sensitive defaults for 512MB container budgets
- Configure lazy initialization and max in-memory size for HTTP codecs
- Update memory settings for 512MB container constraints
- Adjust heap size, metaspace, and direct memory limits for improved performance
- Implement additional JVM options for better resource management and stability
- Change default setting to skip Qdrant payload index ensure on boot in constrained environments
- Maintain existing toggle for debugging purposes
- Ensure application properties are optimized for various deployment scenarios
- Adjust max heap size to 192MB to allocate more space for metaspace
- Increase MaxMetaspaceSize to 192MB to prevent out-of-memory errors
- Reduce ReservedCodeCacheSize and MaxDirectMemorySize for better resource management
- Ensure memory settings align with container constraints for improved stability
- Remove nested "reasoning" object from additional body properties for Chat Completions endpoint
- Retain only top-level "reasoning_effort" property to ensure compatibility
- Update logging to reflect the change in property setting for clarity
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 08ddd3f8d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/main/java/com/williamcallahan/javachat/web/GuidedLearningController.java Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates chat/guided SSE streaming to an OpenAI Java SDK-based service, expands server-side “structured markdown” processing (citations/enrichments), and adds configuration/test scaffolding for more reliable local + CI runs.

Changes:

  • Replace manual SSE parsing / resilient client calls with OpenAIStreamingService and SSE heartbeats.
  • Introduce/expand structured markdown types and tests (citations, enrichments, list/code normalization).
  • Update runtime/test configuration for ports, Qdrant startup behavior, embeddings fallbacks, and tooling (Docker/Maven/Makefile).

Reviewed changes

Copilot reviewed 77 out of 79 changed files in this pull request and generated 21 comments.

Show a summary per file
File Description
src/test/resources/application.properties Adds test profile properties to avoid writing into repo and disable optional AI models during tests.
src/test/java/com/williamcallahan/javachat/web/MarkdownApiIntegrationTest.java Adds MVC test for markdown rendering edge case via API.
src/test/java/com/williamcallahan/javachat/web/GuidedSseIntegrationTest.java Adds external-service-gated SSE integration test for guided streaming.
src/test/java/com/williamcallahan/javachat/web/GuidedLearningControllerTest.java Updates controller test wiring/mocks for new streaming/markdown dependencies.
src/test/java/com/williamcallahan/javachat/web/ChatSseIntegrationTest.java Adds external-service-gated SSE integration test for chat streaming.
src/test/java/com/williamcallahan/javachat/service/MarkdownServiceTest.java Updates markdown tests to use processStructured() and asserts enrichment rendering/classes.
src/test/java/com/williamcallahan/javachat/service/MarkdownPreprocessingTest.java Refocuses preprocessing tests on rendered HTML output from processStructured().
src/test/java/com/williamcallahan/javachat/service/ComprehensiveListFormattingTest.java Updates list-formatting assertions to validate HTML lists from processStructured().
src/test/java/com/williamcallahan/javachat/TestConfiguration.java Adds JUnit extension/tag to gate tests requiring external services.
src/test/java/TestWhatGetsStreamed.java Adds ad-hoc executable for inspecting streaming payload formatting.
src/test/java/TestWebFluxSSE.java Adds ad-hoc executable for inspecting raw WebFlux SSE chunks.
src/test/java/TestGPT5Streaming.java Adds ad-hoc executable for inspecting raw OpenAI SSE response parsing.
src/test/java/TestCompleteStreaming.java Adds ad-hoc executable to validate end-to-end streaming extraction.
src/main/resources/static/js/markdown-utils.js Deprecates client-side markdown/code transformations, expands bullet support, improves enrichment/link processing.
src/main/resources/static/error.html Normalizes SVG attribute casing (viewBox).
src/main/resources/static/404.html Normalizes SVG attribute casing (viewBox).
src/main/resources/application.properties Adds memory defaults, embedding fallback config, Qdrant startup toggle, and excludes OpenAI model autoconfigs.
src/main/resources/application-dev.properties Adds diagnostics flags and makes LiveReload port configurable.
src/main/java/com/williamcallahan/javachat/web/MarkdownController.java Switches rendering to structured processing and adds a structured render endpoint; cache stats wired to unified service.
src/main/java/com/williamcallahan/javachat/web/GuidedLearningController.java Updates guided endpoints to use SDK streaming, adds proxy headers, and uses processStructured() for HTML.
src/main/java/com/williamcallahan/javachat/web/ErrorTestController.java Whitespace-only change.
src/main/java/com/williamcallahan/javachat/web/ChatController.java Reworks chat streaming to SDK-based streaming with ServerSentEvent heartbeats and adds diagnostics/clear endpoints.
src/main/java/com/williamcallahan/javachat/util/JavadocLinkResolver.java Avoids risky anchor heuristics for Spring docs; ignores .class params.
src/main/java/com/williamcallahan/javachat/service/markdown/Warning.java Adds typed enrichment record for warnings.
src/main/java/com/williamcallahan/javachat/service/markdown/Reminder.java Adds typed enrichment record for reminders.
src/main/java/com/williamcallahan/javachat/service/markdown/ProcessingWarning.java Adds typed processing-warning record and enums for warning categories.
src/main/java/com/williamcallahan/javachat/service/markdown/ProcessedMarkdown.java Adds structured markdown result type (html + citations/enrichments/warnings + timing).
src/main/java/com/williamcallahan/javachat/service/markdown/MarkdownEnrichment.java Adds sealed interface for typed enrichments.
src/main/java/com/williamcallahan/javachat/service/markdown/MarkdownCitation.java Adds structured citation record + helpers.
src/main/java/com/williamcallahan/javachat/service/markdown/InlineListPostProcessor.java Adds placeholder post-processor class (currently unused).
src/main/java/com/williamcallahan/javachat/service/markdown/InlineListExtension.java Adds no-op parser extension (currently unused).
src/main/java/com/williamcallahan/javachat/service/markdown/Hint.java Adds typed enrichment record for hints.
src/main/java/com/williamcallahan/javachat/service/markdown/Example.java Adds typed enrichment record for examples.
src/main/java/com/williamcallahan/javachat/service/markdown/EnrichmentProcessor.java Adds AST-based enrichment extraction (visitor + transitional marker parsing).
src/main/java/com/williamcallahan/javachat/service/markdown/EnrichmentPriority.java Adds priority enum for enrichment ordering.
src/main/java/com/williamcallahan/javachat/service/markdown/CitationType.java Adds enum + URL classification for citations.
src/main/java/com/williamcallahan/javachat/service/markdown/CitationProcessor.java Adds AST-based citation extraction via Flexmark visitor.
src/main/java/com/williamcallahan/javachat/service/markdown/Background.java Adds typed enrichment record for background blocks.
src/main/java/com/williamcallahan/javachat/service/ResilientApiClient.java Removes legacy manual SSE parsing client.
src/main/java/com/williamcallahan/javachat/service/RerankerService.java Switches reranking to SDK-based completion and improves cache keying.
src/main/java/com/williamcallahan/javachat/service/RateLimitState.java Makes persistence safer and prefers injected ObjectMapper.
src/main/java/com/williamcallahan/javachat/service/OpenAiCompatibleEmbeddingModel.java Adds an OpenAI-compatible embedding model for remote providers.
src/main/java/com/williamcallahan/javachat/service/OpenAIStreamingService.java Adds SDK-based streaming + completion with provider selection/backoff.
src/main/java/com/williamcallahan/javachat/service/MarkdownStreamProcessor.java Adds deprecated streaming markdown processor (currently appears unused).
src/main/java/com/williamcallahan/javachat/service/LocalEmbeddingModel.java Refactors embedding calls and fallback behavior for readability and consistency.
src/main/java/com/williamcallahan/javachat/service/GuidedLearningService.java Adds method to build guided prompt for SDK streaming.
src/main/java/com/williamcallahan/javachat/service/GracefulEmbeddingModel.java Tightens failure behavior (no silent zero-vector embedding) and refactors formatting.
src/main/java/com/williamcallahan/javachat/service/EnrichmentService.java Switches enrichment generation to SDK completion.
src/main/java/com/williamcallahan/javachat/service/ChatService.java Switches streaming to SDK, adds prompt building helpers, and URL normalization logic.
src/main/java/com/williamcallahan/javachat/service/ChatMemoryService.java Attempts to improve thread-safety for session histories.
src/main/java/com/williamcallahan/javachat/service/AuditService.java Removes unused set tracking and trailing whitespace.
src/main/java/com/williamcallahan/javachat/config/TestPortConfig.java Adds configuration-properties holder for test port settings.
src/main/java/com/williamcallahan/javachat/config/QdrantIndexInitializer.java Skips index creation outside of test profile and adds app-level toggle.
src/main/java/com/williamcallahan/javachat/config/PortInitializer.java Disables port manipulation when running under test profile.
src/main/java/com/williamcallahan/javachat/config/EmbeddingFallbackConfig.java Adds remote embedding provider support and improves fallback selection.
src/main/java/com/williamcallahan/javachat/config/DocsSourceRegistry.java Fixes Spring doc URL mapping/normalization to remove spurious java/ segments.
src/main/java/com/williamcallahan/javachat/config/AppProperties.java Adds properties for diagnostics, remote embeddings, and Qdrant toggles.
src/main/java/com/williamcallahan/javachat/config/ApiKeyLoggingConfig.java Improves key/endpoint detection logic and logging behavior.
src/main/java/com/williamcallahan/javachat/JavaChatApplication.java Forces Netty OpenSSL off to avoid Alpine/musl issues.
pom.xml Adds OpenAI SDK, gRPC BOM/core alignment, surefire tag exclusions, and dotenv support.
docker-compose-qdrant.yml Pins Qdrant image version for reproducibility.
README.md Updates documented model/architecture details and adds mobile section.
Makefile Loads .env for tests and adjusts ports/JVM args for dev/run.
Dockerfile Switches to ECR mirror base images and tightens JVM memory/runtime flags.
CLAUDE.md Adds repo-specific agent rules/instructions document.
AGENT.md Removes legacy pointer file.
.htmlhintrc Adds HTMLHint configuration.
.hintrc Updates hint configuration and browserslist exclusions.
.gitignore Adds build artifact ignores (plus an accidental -e entry).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/main/java/com/williamcallahan/javachat/service/ChatService.java
Comment thread src/main/resources/application.properties Outdated
Comment thread src/main/resources/static/chat.html
Comment thread src/main/resources/static/chat.html
@WilliamAGH WilliamAGH changed the title merge feat: OpenAI SDK integration, AST markdown processing, mobile UI Jan 23, 2026
Prevent NullPointerException when no OpenAI-compatible client is
configured. selectClientForStreaming() may return null; add explicit
check and emit clear error instead of crashing.
- Add isAvailable() check before calling streamResponse() to prevent
  crashes when no API credentials configured
- Downgrade diagnostic prompt logging from INFO to DEBUG to avoid
  leaking sensitive user input and document content
- Apply URL normalization to all context URLs in prompt building
synchronizedList only protects individual operations, not iteration.
Return defensive copies from public getters to prevent
ConcurrentModificationException when callers iterate while another
thread modifies the list.
Replace ': keepalive' string with empty string for heartbeats and
filter them out. The previous implementation emitted pre-formatted
SSE comment strings which Spring serialized as data events, leaking
': keepalive' text into visible responses.
Update comment to accurately describe that lazy-initialization=true
defers bean creation to first use and moves startup errors to runtime,
rather than claiming 'no behavior change'.
The 'cached' field always reported true regardless of actual cache
status. Set to false since cache hit information isn't tracked at
the controller layer.
The null check on line 61 was unreachable since it was already guarded
by the null/blank validation on line 54-56.
InlineListPostProcessor and InlineListExtension were empty no-op
placeholders not referenced anywhere. List normalization is handled
via DOM after render in UnifiedMarkdownService.
The ternary condition 'i + 1 < lines.length' was always true because
the outer if already checked 'i < lines.length - 1'.
…tion

- Remove 'digits > 3' check in removeBracketNumbers() which was always
  false since the while loop bounds digits to max 3
- Add braces and fix indentation in promoteSingleItemOrderedListHeadings()
  to clarify control flow
@railway-app railway-app Bot temporarily deployed to insightful-intuition / production January 23, 2026 20:18 Inactive
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
README.md (1)

7-12: Add an “as of” date to the “Latest Updates” stats to prevent staleness.

These numbers will drift quickly; a simple “As of January 23, 2026” note keeps the section honest and future‑proof.

src/main/resources/static/index.html (1)

615-628: Add origin and source validation to the postMessage handler.

The listener currently accepts messages from any source without checking where they come from—this is a fun security pattern to lock down! Since contentFrame is your trusted iframe, verify that messages originate from exactly there before toggling the UI. It's a small check that keeps bad actors from spoofing loading states.

🔐 Safer handler
 window.addEventListener('message', (e) => {
   const data = e && e.data;
   if (!data || typeof data !== 'object') return;
+  if (e.source !== frame.contentWindow) return;
+  if (e.origin !== window.location.origin) return;
   
   if (data.type === 'content-started' || data.type === 'content-ready' || data.type === 'dom-ready') {
🤖 Fix all issues with AI agents
In `@README.md`:
- Around line 346-349: Update the GPT‑5 context size note in the README: replace
the incorrect "8K input" text in the bullet under "OpenAI Java SDK
(standardized)" (near the mention of OpenAIStreamingService and the "Prompt
truncation for GPT‑5" line) with the current official API specs—GPT‑5 has a
~400,000‑token context window with a max output of 128,000 tokens (practical max
input ≈ 400K minus output allocation); optionally note the ChatGPT app
“Thinking” limit of 196,000 tokens and cite the OpenAI API docs for reference;
ensure the wording clarifies that prompt truncation logic in
OpenAIStreamingService should account for these limits rather than a fixed 8K.

In
`@src/main/java/com/williamcallahan/javachat/service/GracefulEmbeddingModel.java`:
- Around line 29-35: The circuit-breaker fields in GracefulEmbeddingModel
(primaryAvailable, secondaryAvailable, lastPrimaryCheck, lastSecondaryCheck) are
not thread-safe; make them visible across threads by declaring the two boolean
flags and two timestamp longs as volatile (or replace with
AtomicBoolean/AtomicLong if you need CAS semantics) and keep
CIRCUIT_BREAKER_TIMEOUT as-is; update any readers/writers to rely on the
volatile/atomic fields so the breaker flips reliably across threads.

In
`@src/main/java/com/williamcallahan/javachat/service/markdown/UnifiedMarkdownService.java`:
- Around line 590-663: The enrichment position is being passed as
absolutePosition + i (double-counting the offset); update the enrichment
creation to use a single base index (use absolutePosition alone or
absolutePosition + 0) when calling Hint.create, Warning.create,
Background.create, Example.create, and Reminder.create (replace absolutePosition
+ i with absolutePosition), and ensure any metadata/placeholder generation
(buildEnrichmentHtmlUnified, placeholders.put) uses that same single index so
positions remain consistent with the subsequent absolutePosition arithmetic.

In
`@src/main/java/com/williamcallahan/javachat/service/MarkdownStreamProcessor.java`:
- Around line 121-152: checkForCommit and flushRemainingCommit currently allow
commitBuffer to grow unbounded because commitBuffer only flushes on semantic
boundaries; add a size-cap guard and optional timeout logic to prevent memory
bloat. Modify checkForCommit (and flushRemainingCommit behavior) to check
commitBuffer.length() against a max constant (e.g., MAX_COMMIT_BUFFER_SIZE) and
if exceeded, force a flush by returning the buffer content and resetting
commitBuffer (log a warning), and optionally add a timestamp field
(lastAppendTime) updated in checkForCommit so a separate timeout check can flush
stale content when streaming pauses; ensure references: commitBuffer,
checkForCommit, flushRemainingCommit, isCompleteBlock are updated consistently
and tests/logs reflect forced flush cases.

In `@src/main/java/com/williamcallahan/javachat/service/RateLimitManager.java`:
- Around line 160-200: isProviderAvailable should first verify the provider is
actually configured: call isProviderConfigured(provider.getName()) at the top
and return false (and optionally log a debug) if it's not configured, before
checking rateLimitState and endpointStates; update the method that currently
creates/uses ApiEndpointState via endpointStates.computeIfAbsent(...) so
unconfigured providers are rejected early to avoid downstream auth errors.

In `@src/main/java/com/williamcallahan/javachat/service/RateLimitState.java`:
- Around line 32-36: safeSaveState() currently can be invoked concurrently
(scheduler and request path) causing interleaved JSON writes; add a single
ReentrantLock (e.g., private final ReentrantLock stateWriteLock = new
ReentrantLock()) to RateLimitState and use stateWriteLock.lock()/unlock() (in
try/finally) to serialize all persistence operations inside safeSaveState() and
any other save paths referenced around the 207-227 region so file writes cannot
interleave; reference the existing scheduler, providerStates, and
safeSaveState() when locating where to place the lock and ensure unlocking in
finally blocks.

In `@src/main/java/com/williamcallahan/javachat/service/RerankerService.java`:
- Around line 197-208: computeDocsHash can NPE when a Document has no "url"
metadata and getText() returns null; update computeDocsHash to null-safe the
text fallback used in the sb.append(...) call (in the for loop inside
computeDocsHash) by using a safe string or zero value when doc.getText() is null
(e.g. use Objects.toString(doc.getText(), "") or check doc.getText() != null
before calling hashCode), ensuring you still append a deterministic token for
that document so cache keys remain stable.

In `@src/main/java/com/williamcallahan/javachat/web/ChatController.java`:
- Around line 114-132: The Flux returned by
openAIStreamingService.streamResponse(...) is cold and is being subscribed twice
(once for dataEvents and once for takeUntilOther), causing duplicate API calls;
fix by filtering null/empty chunks before side-effects, convert the stream to a
single shared hot stream (e.g., call .share() or .publish().refCount(1)) and use
that shared Flux for both dataEvents and the heartbeat completion signal (use
shared.ignoreElements().onErrorResume(...) as the completion Mono), and move the
.doOnNext(...) that appends to fullResponse and increments chunkCount to act on
the shared stream after the filter so only valid chunks are appended.
♻️ Duplicate comments (4)
src/main/resources/application.properties (1)

9-12: Memory-conscious defaults look reasonable, but note the lazy-init trade-off.

The comment on line 10 helpfully documents the behavioral change. Just a friendly reminder: lazy-initialization=true can defer bean creation failures to runtime, which might surprise you during development. If startup validation is important, consider enabling it only in constrained production profiles.

src/main/java/com/williamcallahan/javachat/service/markdown/CitationProcessor.java (1)

48-91: Separate citation index from character offset

position is advanced by text length and incremented per citation, so offsets drift as soon as a link appears. Consider keeping a dedicated character offset (or using Flexmark’s start offsets) and let the list order represent citation order.

🔧 Possible fix (offset-based)
-    private int position = 0;
+    private int charOffset = 0;

     private void visitLink(Link link) {
         String url = link.getUrl().toString();
         String title = extractLinkTitle(link);
         CitationType type = CitationType.fromUrl(url);

         if (isValidCitation(url, title)) {
-            MarkdownCitation citation = MarkdownCitation.create(url, title, "", type, position++);
+            int pos = link.getStartOffset(); // or use charOffset if you prefer
+            MarkdownCitation citation = MarkdownCitation.create(url, title, "", type, pos);
             citations.add(citation);
             logger.debug("Found citation: {} -> {}", title, url);
         }

         // Continue visiting child nodes
         visitor.visitChildren(link);
     }

     private void visitText(Text text) {
         // This could be extended to handle inline citation markers like [1], [2]
         // For now, we focus on explicit links
-        position += text.getChars().length();
+        charOffset += text.getChars().length();
     }
src/main/java/com/williamcallahan/javachat/service/OpenAIStreamingService.java (1)

240-254: Hardcoded GPT-5 ignores OPENAI_MODEL configuration

This was flagged in previous reviews and remains unresolved. The model field is populated from OPENAI_MODEL config (line 57-58), but buildChatParams always uses ChatModel.GPT_5 (line 244). This means operators who configure a different model will be surprised when their setting is ignored!

Either honor the configured model or document clearly that GPT-5 is intentionally locked in.

src/main/java/com/williamcallahan/javachat/web/GuidedLearningController.java (1)

187-187: History snapshot may have race conditions

Quick reminder from a past review - copying the history with new ArrayList<>(chatMemory.getHistory(sessionId)) can be racy if the underlying list is a synchronizedList. Ideally, iteration/copying should be synchronized on the list, or ChatMemoryService should provide a thread-safe snapshot method.

This is existing behavior, so not a blocker, but worth addressing for robustness!

🟡 Minor comments (19)
AGENTS.md-6-6 (1)

6-6: GT1 might block everyday development workflow. 🤔

Requiring "elevated permissions" for all git commands seems overly restrictive. Standard operations like git status, git log, git diff, git commit, and git pull are routine and shouldn't need special escalation. Consider limiting this rule to truly destructive operations (already covered by GT3) or operations that affect remote repositories.

.gitignore-49-49 (1)

49-49: Looks like a stray artifact snuck in here! 🕵️

The -e entry appears to be accidental—possibly output from an echo -e command or a copy-paste mishap. This won't match any meaningful log files and should likely be removed.

🧹 Suggested fix
 ### Logs ###
 logs/
 *.log
--e 
 # Build artifacts
 BOOT-INF/
 classpath.txt
pom.xml-63-68 (1)

63-68: Consider updating to the latest openai-java version for better compatibility and bug fixes!

Version 3.0.1 is stable, but it's a few months old (released Aug 2025). The latest version available on Maven Central is 4.16.1 (as of Jan 2026), which likely includes important updates and improvements. A quick bump could help ensure the new SDK integration has the latest enhancements!

<dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>4.16.1</version>
</dependency>
docs/domains/all-parsing-and-markdown-logic.md-471-485 (1)

471-485: Update diagram and notes—ResilientApiClient was removed during SDK migration.

The README confirms it's gone ("Removed ResilientApiClient and all manual SSE parsing"), and OpenAIStreamingService now handles what it used to do. The diagram at lines 471–485 and the associated notes at line 492 still reference it, so they need a refresh to keep docs in sync with the codebase.

docs/domains/all-parsing-and-markdown-logic.md-9-51 (1)

9-51: Add language tags to fenced blocks.
Markdown linters (and readers) love this tiny upgrade for clarity and highlighting.

✍️ Example tweak (apply similarly to other fences)
-```
+```text
📋 MARKDOWN PROCESSING ARCHITECTURE
...
-```
+```
docs/domains/all-parsing-and-markdown-logic.md-559-569 (1)

559-569: Wrap the [CTX 1][CTX 2] example as code.
This prevents Markdown from treating it as a reference link (and silences MD052).

✍️ Suggested edit
-- Symptom: Model emits context markers (e.g., [CTX 1][CTX 2]) that appear in prose instead of becoming proper citation pills.
+- Symptom: Model emits context markers (e.g., `[CTX 1][CTX 2]`) that appear in prose instead of becoming proper citation pills.
src/test/java/TestCompleteStreaming.java-34-81 (1)

34-81: Use ServerSentEvent<String> for clearer SSE handling.

OpenAI's streaming API sends standard SSE frames with data: prefixes, and while Spring's WebClient codec already strips that framing when you use bodyToFlux(String.class), using ServerSentEvent<String> makes the intent explicit and future-proof. It also keeps your code in sync with Spring's SSE abstractions rather than assuming codec behavior.

Cleaner SSE parsing
+import org.springframework.core.ParameterizedTypeReference;
+import org.springframework.http.codec.ServerSentEvent;
 ...
-        Flux<String> stream = webClient.post()
+        Flux<ServerSentEvent<String>> stream = webClient.post()
             .uri("https://api.openai.com/v1/chat/completions")
             .header("Authorization", "Bearer " + OPENAI_API_KEY)
             .header("Accept", "text/event-stream")
             .contentType(MediaType.APPLICATION_JSON)
             .bodyValue(body)
             .retrieve()
-            .bodyToFlux(String.class);
+            .bodyToFlux(new ParameterizedTypeReference<ServerSentEvent<String>>() {});
 ...
-        stream
-            .flatMap(chunk -> {
-                // Mirrors logic now handled by OpenAIStreamingService
-                if (chunk == null || chunk.trim().isEmpty() || chunk.equals("[DONE]")) {
+        stream
+            .flatMap(event -> {
+                String data = event.data();
+                if (data == null || data.trim().isEmpty() || data.equals("[DONE]")) {
                     return Flux.empty();
                 }
                 try {
-                    // Parse the raw JSON chunk directly
-                    Map<String, Object> data = objectMapper.readValue(chunk, new TypeReference<Map<String, Object>>() {});
+                    Map<String, Object> payload =
+                        objectMapper.readValue(data, new TypeReference<Map<String, Object>>() {});
 ...
-                    Object choicesObj = data.get("choices");
+                    Object choicesObj = payload.get("choices");
src/main/java/com/williamcallahan/javachat/config/EmbeddingFallbackConfig.java-89-91 (1)

89-91: Duplicate log statements

Lines 89 and 91 both log configuration messages at startup. This seems like an artifact from the refactor - line 89 says "OpenAI embedding" while line 91 says "remote/OpenAI embeddings". One should be removed to avoid log noise! 📋

🐛 Suggested fix
-        log.info("[EMBEDDING] Configuring OpenAI embedding with fallback strategies");
-        
         log.info("[EMBEDDING] Configuring remote/OpenAI embeddings with fallback strategies");
src/main/java/com/williamcallahan/javachat/config/EmbeddingFallbackConfig.java-135-145 (1)

135-145: Potential NPE if URI host is null

The redactUrl helper is a nice touch for security-conscious logging! However, uri.getHost() can return null for certain malformed or scheme-only URIs (e.g., file:///path), which would result in logging "https://null".

🛡️ Defensive fix
 private String redactUrl(String url) {
     if (url == null) return "";
     try {
         java.net.URI uri = java.net.URI.create(url);
+        if (uri.getHost() == null) return url;
         String host = uri.getScheme() + "://" + uri.getHost();
         if (uri.getPort() > 0) host += ":" + uri.getPort();
         return host;
     } catch (Exception e) {
         return url;
     }
 }
src/test/java/com/williamcallahan/javachat/web/GuidedSseIntegrationTest.java-45-51 (1)

45-51: Potential NPE if block() returns null

If the reactive stream errors out or times out in certain ways, block() can return null. Calling .stream() on null would throw NPE. A defensive null check or using Objects.requireNonNull with a clear message would make test failures more debuggable.

🐛 Defensive fix
-        String aggregated = body
+        var chunks = body
                 .takeUntil(chunk -> chunk.contains("[DONE]"))
                 .take(Duration.ofSeconds(10))
                 .collectList()
-                .block(Duration.ofSeconds(15))
+                .block(Duration.ofSeconds(15));
+
+        assertNotNull(chunks, "Stream should return chunks");
+
+        String aggregated = chunks
                 .stream()
                 .reduce("", (a, b) -> a + b);
src/main/java/com/williamcallahan/javachat/service/LocalEmbeddingModel.java-186-192 (1)

186-192: Potential NPE if data value is null

The code checks response.containsKey("data") but doesn't verify the value is non-null before calling isEmpty(). If the JSON contains "data": null, the cast succeeds but dataList.isEmpty() throws NPE.

🐛 Defensive fix
 `@SuppressWarnings`("unchecked")
 private float[] parseEmbeddingResponse(Map<String, Object> response) {
-    if (response == null || !response.containsKey("data")) {
+    if (response == null || response.get("data") == null) {
         log.error("Invalid response from embedding API: {}", response);
         return null;
     }

     List<Map<String, Object>> dataList = (List<
         Map<String, Object>
     >) response.get("data");
-    if (dataList.isEmpty()) {
+    if (dataList == null || dataList.isEmpty()) {
         log.error("Empty data list in embedding API response");
         return null;
     }
src/main/java/com/williamcallahan/javachat/service/markdown/MarkdownCitation.java-55-61 (1)

55-61: Handle potential null return from URI.getHost().

Fun fact: URI.create("file:///path/to/file").getHost() returns null! For URIs like mailto:, file:, or malformed URLs, getHost() can legitimately return null. A tiny null-check keeps things tidy:

✨ Proposed fix
 public String getDomain() {
     try {
-        return java.net.URI.create(url).getHost();
+        String host = java.net.URI.create(url).getHost();
+        return host != null ? host : "unknown";
     } catch (Exception e) {
         return "unknown";
     }
 }
src/test/java/com/williamcallahan/javachat/service/ComprehensiveListFormattingTest.java-199-203 (1)

199-203: Boolean concatenation produces unexpected output.

This is a fun Java quirk! !html1.contains("<ol>") + !html1.contains("<ul>") doesn't produce a logical AND—it concatenates booleans as strings (e.g., "truetrue"). If you want a combined check, use &&:

🐛 Proposed fix
 System.out.println("\nTest: No false positives");
-System.out.println("Version number preserved: " + !html1.contains("<ol>") + !html1.contains("<ul>"));
-System.out.println("Math expression preserved: " + !html2.contains("<ol>") + !html2.contains("<ul>"));
-System.out.println("Normal sentences preserved: " + !html3.contains("<ol>") + !html3.contains("<ul>"));
+System.out.println("Version number preserved: " + (!html1.contains("<ol>") && !html1.contains("<ul>")));
+System.out.println("Math expression preserved: " + (!html2.contains("<ol>") && !html2.contains("<ul>")));
+System.out.println("Normal sentences preserved: " + (!html3.contains("<ol>") && !html3.contains("<ul>")));
src/main/java/com/williamcallahan/javachat/web/MarkdownController.java-188-201 (1)

188-201: Keep renderStructured response shape consistent for empty input

The empty-response branch omits structuredElementCount and isClean, which can surprise clients expecting a stable schema. Returning defaults keeps the contract consistent.

✅ Suggested fix
             if (markdown == null || markdown.isEmpty()) {
                 return ResponseEntity.ok(Map.of(
                     "html", "",
                     "citations", 0,
                     "enrichments", 0,
                     "warnings", 0,
                     "processingTimeMs", 0L,
-                    "source", "unified-service"
+                    "source", "unified-service",
+                    "structuredElementCount", 0,
+                    "isClean", true
                 ));
             }
src/main/java/com/williamcallahan/javachat/web/GuidedLearningController.java-204-209 (1)

204-209: Heartbeat effectively does nothing 🤔

Fun puzzle here! The heartbeat emits empty strings (line 206), but then line 209 filters out empty strings. So the heartbeat doesn't actually send anything to the client - it just keeps the reactive stream "warm" internally.

If the goal is to prevent proxy timeouts, you might need actual SSE comments. However, if it's just to keep the takeUntilOther working to terminate properly, this is fine! Just worth clarifying the intent.

Previous review suggested using ServerSentEvent.comment("keepalive") for proper SSE heartbeats - that's still a good option if you need client-visible keepalives.

src/main/resources/static/css/app.css-758-758 (1)

758-758: Typo: 'pear em' is not a valid selector 🍐

Looks like a typo crept in! pear em should probably be p em (emphasis inside paragraphs) or removed entirely. Currently it won't match any elements.

🔧 Fix the typo
-pear em { font-style: normal; }
+p em { font-style: normal; }
src/main/resources/static/guided.html-843-847 (1)

843-847: Element ID mismatch - handler won't find the element 🐛

Quick bug: line 843 looks for 'lesson-content' but the actual element ID is 'lessonContent' (defined at line 69). The orientation change handler will silently fail to scroll the lesson content!

🔧 Fix the element ID
        window.addEventListener('orientationchange', function() {
            setTimeout(() => {
                // Ensure lesson content is properly positioned after rotation
-                const lessonContent = document.getElementById('lesson-content');
+                const lessonContent = document.getElementById('lessonContent');
                const chatArea = document.getElementById('chatArea');
src/main/java/com/williamcallahan/javachat/service/RetrievalService.java-218-226 (1)

218-226: Metadata loss during document truncation 🧩

Hey there! Quick heads up - when creating the truncated document, you're adding truncated and originalLength to the metadata map, but then createLocalDocument only preserves url and sets title to "Local Doc". This means you lose all the original metadata like title, package, chunkIndex, etc.

The snippet from DocumentFactory.createLocalDocument shows it only keeps url and sets a generic title. Consider creating a method that preserves the original metadata or using the full createDocument method instead!

🔧 Suggested fix
-        // Create new document with truncated content
-        Map<String, Object> metadata = new HashMap<>(doc.getMetadata());
-        metadata.put("truncated", true);
-        metadata.put("originalLength", content.length());
-
-        return documentFactory.createLocalDocument(
-            truncated,
-            String.valueOf(metadata.get("url"))
-        );
+        // Create new document with truncated content, preserving original metadata
+        Document truncatedDoc = new Document(truncated);
+        truncatedDoc.getMetadata().putAll(doc.getMetadata());
+        truncatedDoc.getMetadata().put("truncated", true);
+        truncatedDoc.getMetadata().put("originalLength", content.length());
+        return truncatedDoc;
src/main/resources/static/css/app.css-47-54 (1)

47-54: Duplicate CSS custom property definition 🔄

Static analysis caught this: --dark-bg is defined twice - once at line 33 and again at line 47. The second definition overwrites the first, which might be intentional for layering, but can lead to confusion. Consider removing the duplicate or using a more explicit override pattern.

🧹 Remove duplicate
    /* Theme Variables */
    --accent-primary: var(--coffee-warm);
    --accent-secondary: var(--coffee-cream);
    --accent-tertiary: var(--coffee-subtle);
-    --dark-bg: var(--dark-bg);
-    --dark-surface-1: var(--dark-surface-1);
-    --dark-surface-2: var(--dark-surface-2);
-    --dark-surface-3: var(--dark-surface-3);
-    --dark-border: var(--dark-border);
-    --dark-text-primary: var(--dark-text-primary);
-    --dark-text-secondary: var(--dark-text-secondary);
-    --dark-text-tertiary: var(--dark-text-tertiary);

These self-referential assignments don't do anything useful and can be removed.

🧹 Nitpick comments (37)
src/main/resources/application-dev.properties (1)

1-4: Consider a safer default sampling value to avoid log floods.

If app.diagnostics.streamChunkLogging is ever toggled on, a streamChunkSample=0 means every chunk, which can get noisy fast. A small default (e.g., 10 or 20) reduces accidental log storms while still being useful.

README.md (1)

427-457: Revisit “zoom prevention” guidance to avoid accessibility regressions.

Documenting strategies that block zoom (or discouraging user-scalable) conflicts with mobile accessibility expectations. Suggest clarifying that pinch‑zoom remains available and only double‑tap zoom is prevented.

AGENTS.md (3)

43-43: RC4 is admirably strict but might be too rigid for real-world scenarios.

While the intent to avoid workarounds is excellent, completely banning "adapters, wrappers, or bridge code" could block legitimate patterns. The Adapter pattern is specifically designed for integrating third-party libraries you can't modify at source. Consider allowing adapters/wrappers when they provide clean domain boundaries for external dependencies, while still prohibiting them for internal code where you can fix at source.


48-48: NO4 could be clearer about when to use Optional vs nullable parameters.

The rule seems to say: "Don't use Optional<T> as method parameters; use nullable T instead." This is good practice (Optional is for return values), but the current phrasing is a bit confusing with "Optional parameters prohibited" followed by nullable parameter handling. Consider rewording for clarity, perhaps: "NO4 Use nullable parameters, not Optional, in method signatures; check internally; callers unwrap Optional with .orElse(null)."


19-19: FS8's utility ban is well-intentioned but might be slightly too strict.

The goal of preventing bloated *Utils god classes is excellent, and the specific bans (generic repositories, shared utils) are spot-on. However, focused, cohesive utility classes like DateFormatting or ValidationHelpers can be legitimate when they group related static functions. Consider adjusting to: "No generic catch-all utilities; specific-purpose helpers allowed when cohesive and <100 LOC." That said, the current strict stance does force better domain modeling, so it's a reasonable trade-off!

src/main/java/com/williamcallahan/javachat/service/ChatMemoryService.java (2)

44-52: Consider tightening helper visibility.
If these are strictly internal, making them private or marking @VisibleForTesting keeps the public surface tidy.

Also applies to: 81-87


54-62: Keep history/turns alignment in mind.
If any consumer expects these lists to be strictly in sync, consider a per-session lock or a unified session state to update both atomically.

src/main/java/com/williamcallahan/javachat/JavaChatApplication.java (1)

12-14: Guard these properties so ops can override them if needed.

These are set unconditionally, which prevents override via JVM arguments or configuration. Using System.getProperty() checks lets operators adjust the behavior on Alpine (or any environment) without recompiling—they keep your safety net while keeping their flexibility. Fun fact: Netty reads this during static initialization, so it catches the Alpine issue early.

One small note: the second property (io.grpc.netty.shaded.io.netty.handler.ssl.noOpenSsl) may be redundant—Netty's check uses the key io.netty.handler.ssl.noOpenSsl regardless of shading, so setting both is defensive but the shaded variant might not take effect. Verify whether you actually need both or if just the first one suffices for your gRPC setup.

♻️ Suggested refactor (guarded set)
-        System.setProperty("io.netty.handler.ssl.noOpenSsl", "true");
-        System.setProperty("io.grpc.netty.shaded.io.netty.handler.ssl.noOpenSsl", "true");
+        if (System.getProperty("io.netty.handler.ssl.noOpenSsl") == null) {
+            System.setProperty("io.netty.handler.ssl.noOpenSsl", "true");
+        }
+        if (System.getProperty("io.grpc.netty.shaded.io.netty.handler.ssl.noOpenSsl") == null) {
+            System.setProperty("io.grpc.netty.shaded.io.netty.handler.ssl.noOpenSsl", "true");
+        }
src/main/java/com/williamcallahan/javachat/service/GracefulEmbeddingModel.java (1)

62-106: Treat empty responses as failures so the breaker can back off.

Right now, a null/empty response keeps the provider “available,” so it gets hit every call. Marking it unavailable on empty results will engage the timeout and avoid tight retry loops.

♻️ Suggested tweak
                 EmbeddingResponse response = primaryModel.call(request);
                 if (response != null && !response.getResults().isEmpty()) {
                     if (!primaryAvailable) {
                         log.info(
                             "[EMBEDDING] Primary embedding service recovered"
                         );
                     }
                     return response;
                 }
+                log.warn("[EMBEDDING] Primary embedding service returned empty results");
+                primaryAvailable = false;
+                lastPrimaryCheck = System.currentTimeMillis();
             } catch (Exception e) {
                 log.warn(
                     "[EMBEDDING] Primary embedding service failed: {}",
                     e.getMessage()
                 );
                 primaryAvailable = false;
                 lastPrimaryCheck = System.currentTimeMillis();
             }
@@
                 EmbeddingResponse response = secondaryModel.call(request);
                 if (response != null && !response.getResults().isEmpty()) {
                     if (!secondaryAvailable) {
                         log.info(
                             "[EMBEDDING] Secondary embedding service recovered"
                         );
                     }
                     return response;
                 }
+                log.warn("[EMBEDDING] Secondary embedding service returned empty results");
+                secondaryAvailable = false;
+                lastSecondaryCheck = System.currentTimeMillis();
             } catch (Exception e) {
                 log.warn(
                     "[EMBEDDING] Secondary embedding service failed: {}",
                     e.getMessage()
                 );
                 secondaryAvailable = false;
                 lastSecondaryCheck = System.currentTimeMillis();
             }
.hintrc (1)

17-26: Browserslist config looks good—a small heads-up on versioning! 📅

These version exclusions (e.g., firefox <= 139, safari <= 18.5) are quite specific. As browsers release new versions, you may need to periodically update these values. Consider adding a comment noting when this was last reviewed, or using relative queries like last 2 versions for easier maintenance if that fits your support policy.

That said, explicit version pinning gives you precise control, which can be valuable!

src/main/java/com/williamcallahan/javachat/config/PortInitializer.java (1)

23-34: Smart guard to keep tests isolated! 🧪

Disabling port manipulation during tests is the right call—you don't want kill -9 commands running in CI or parallel test environments. The Spring profile check using equalsIgnoreCase is solid.

One small edge case: the contains("test") check on line 31 could match unintended profiles like "contest" or "latest". Consider a more precise match:

🔧 More precise env var check
         String activeEnv = System.getenv("SPRING_PROFILES_ACTIVE");
-        if (activeEnv != null && activeEnv.toLowerCase().contains("test")) {
+        if (activeEnv != null && java.util.Arrays.stream(activeEnv.split(","))
+                .map(String::trim)
+                .anyMatch(profile -> "test".equalsIgnoreCase(profile))) {
             System.err.println("[startup] PortInitializer disabled via SPRING_PROFILES_ACTIVE=test");
             return;
         }
src/main/java/com/williamcallahan/javachat/config/TestPortConfig.java (2)

6-10: Consider the naming and placement of this configuration class.

The class is named TestPortConfig and lives in src/main, which means it's available in all profiles (including production). If this is intentional for development/staging scenarios, that's fine! Otherwise, consider renaming to something like PortConfig to avoid confusion, or moving it to src/test if it's truly test-only.

Fun fact: Using @Component alongside @ConfigurationProperties works, but if you have @ConfigurationPropertiesScan or @EnableConfigurationProperties elsewhere, the @Component annotation becomes redundant! 🎓


20-26: The range property stores a string but might benefit from validation.

The range format "18085-18090" implies parsing logic exists somewhere to extract start/end ports. Consider adding validation to ensure the format is correct at startup, or documenting the expected format.

💡 Optional: Add format validation
+    private static final Pattern RANGE_PATTERN = Pattern.compile("\\d+-\\d+");
+
     public void setRange(String range) {
+        if (range != null && !RANGE_PATTERN.matcher(range).matches()) {
+            throw new IllegalArgumentException("Port range must be in format 'start-end', e.g., '18085-18090'");
+        }
         this.range = range;
     }
Makefile (1)

42-45: Solid JVM memory tuning for local development.

The memory settings (-Xms512m -Xmx1g, G1GC, 70% RAM percentage) are sensible defaults to prevent OOM kills. The disown at the end is clever—it lets you close the terminal without killing the app!

One small thought: using kill -9 (SIGKILL) on lines 40 and 56 is quite aggressive. Consider trying kill -15 (SIGTERM) first to allow graceful shutdown, with SIGKILL as a fallback.

src/test/java/com/williamcallahan/javachat/service/MarkdownPreprocessingTest.java (1)

21-24: Consider removing or conditionalizing debug println statements.

The System.out.println calls are helpful during development but add noise to test output in CI. You could:

  1. Remove them now that tests are stable
  2. Use a logging framework with a test-specific log level
  3. Keep them if the verbose output is intentionally useful for debugging failures

Not a blocker—just a tidiness note! 🧹

Also applies to: 35-37, 50-52

src/main/resources/application.properties (1)

25-26: Dummy keys allow startup but may mask configuration issues.

Using dummy-key-for-startup as fallbacks (lines 25-26, 45) lets the app start without real credentials, which is handy for local dev. However, this could mask missing configuration in environments where you'd expect a real key. Consider logging a warning at startup when dummy keys are in use so operators know API calls will fail.

Also applies to: 45-45

src/main/java/com/williamcallahan/javachat/config/ApiKeyLoggingConfig.java (1)

46-56: Clear logging for endpoint detection, with one subtle edge case.

The multi-branch logic helpfully logs which API will be used. However, there's a small wrinkle: if baseUrl points to GitHub Models but only openaiApiKey is set (no githubToken), lines 52-53 will log "Using OpenAI API (fallback)"—but the actual requests will still go to GitHub Models (since baseUrl is unchanged). The log might be slightly misleading in that scenario.

Consider aligning the log message with what will actually happen based on the configured baseUrl:

💡 Suggestion for clearer fallback logging
         } else if (hasValue(githubToken)) {
             logger.info("Chat API: Using GitHub Models (fallback)");
         } else if (hasValue(openaiApiKey)) {
-            logger.info("Chat API: Using OpenAI API (fallback)");
+            logger.info("Chat API: Using {} with OpenAI key (fallback)", 
+                usingGitHubEndpoint ? "GitHub Models endpoint" : "OpenAI API");
         } else {
src/main/java/com/williamcallahan/javachat/service/OpenAiCompatibleEmbeddingModel.java (2)

60-71: Endpoint normalization has redundant and confusing logic.

A few things to tidy up here:

  1. Line 62 trims trailing slash again, but this was already done in the constructor (line 39).
  2. Lines 66-67 have an empty if block that does nothing—the comment says "keep as-is" but there's no code. This is a no-op and could confuse future readers.
♻️ Simplified endpoint building
-        String endpoint = baseUrl;
-        // Strip trailing slash for normalization
-        if (endpoint.endsWith("/")) endpoint = endpoint.substring(0, endpoint.length() - 1);
-        if (!endpoint.endsWith("/v1/embeddings")) {
-            if (endpoint.endsWith("/v1")) {
-                endpoint = endpoint + "/embeddings";
-            } else if (endpoint.contains("/v1/embeddings")) {
-                // If user passed something like https://.../openai/v1/embeddings (with suffix already), keep as-is
-            } else {
-                endpoint = endpoint + "/v1/embeddings";
-            }
-        }
+        String endpoint = baseUrl; // Already normalized in constructor
+        if (!endpoint.contains("/v1/embeddings")) {
+            endpoint = endpoint.endsWith("/v1") 
+                ? endpoint + "/embeddings" 
+                : endpoint + "/v1/embeddings";
+        }

74-120: One API call per text—consider batching for efficiency.

The OpenAI embeddings API supports batch requests (multiple texts in a single call). Currently, this loops and makes one HTTP request per input text (line 74), which could be slow for large batches.

Fun fact: batch embedding can significantly reduce latency due to fewer round trips! If batch support isn't critical now, consider adding a TODO for future optimization.

src/test/java/com/williamcallahan/javachat/TestConfiguration.java (2)

22-28: Unused annotation attribute value()

The value() attribute is defined but never consumed in evaluateExecutionCondition. This could be a nice way to specify which external services a test requires (e.g., @RequiresExternalServices({"openai", "qdrant"})), enabling more granular gating.

If you intend to use it later, consider adding a TODO comment. Otherwise, removing it keeps the API surface clean! 🧹


54-58: Consider adding GitLab CI detection

You've got good coverage for GitHub Actions and Jenkins. If the project might run on GitLab CI in the future, you could add GITLAB_CI to the detection list. Just a thought for future-proofing!

💡 Optional enhancement
 private boolean isRunningInCI() {
     return System.getenv("CI") != null || 
            System.getenv("GITHUB_ACTIONS") != null ||
-           System.getenv("JENKINS_HOME") != null;
+           System.getenv("JENKINS_HOME") != null ||
+           System.getenv("GITLAB_CI") != null;
 }
src/main/java/com/williamcallahan/javachat/service/LocalEmbeddingModel.java (1)

29-31: Thread-safety consideration for availability flags

serverAvailable and lastCheckTime are accessed and mutated without synchronization. In a multi-threaded context (e.g., concurrent embedding requests), this could lead to stale reads or lost updates.

For an availability flag, this is often acceptable (worst case: an extra check or a skipped check), but if you want stronger guarantees, consider volatile for visibility or AtomicBoolean/AtomicLong.

💡 Optional thread-safe version
-    private boolean serverAvailable = true;
-    private long lastCheckTime = 0;
+    private volatile boolean serverAvailable = true;
+    private volatile long lastCheckTime = 0;
src/main/java/com/williamcallahan/javachat/web/ErrorTestController.java (1)

16-18: Consider restricting to non-production profiles

The comment on line 14 notes this should be "removed or disabled in production," but there's no enforcement mechanism. Adding @Profile("!prod") or a similar guard would prevent accidental exposure in production deployments. Safety first! 🔒

🛡️ Profile-based restriction
+import org.springframework.context.annotation.Profile;
+
 /**
  * Test controller for demonstrating error pages.
  * ...
  */
 `@Controller`
 `@RequestMapping`("/test-errors")
+@Profile({"dev", "test"})
 public class ErrorTestController {
src/test/java/com/williamcallahan/javachat/web/GuidedSseIntegrationTest.java (1)

29-30: Redundant credential check

The @RequiresExternalServices annotation on line 20 already gates this test based on API credentials. The Assumptions.assumeTrue on line 29-30 duplicates that logic. While harmless, removing it keeps the test cleaner!

Fun fact: This same pattern appears in ChatSseIntegrationTest - might be worth a quick cleanup pass across both! 🧹

♻️ Remove redundant check
 `@Test`
 `@DisplayName`("Guided stream returns clean plain text without artifacts and server stores processed HTML")
 void guidedStreamProducesCleanText() {
-    boolean hasKey = System.getenv("OPENAI_API_KEY") != null || System.getenv("GITHUB_TOKEN") != null;
-    Assumptions.assumeTrue(hasKey, "Skipping live integration test without API credentials");
-
     String slug = "introduction-to-java";
src/main/java/com/williamcallahan/javachat/service/markdown/MarkdownCitation.java (1)

21-28: Consider validating snippet as non-null in the compact constructor.

The create() factory method ensures snippet is never null by defaulting to "", but direct constructor invocations could pass null. This could cause a NullPointerException in hasSnippet() if someone bypasses the factory. A simple null-coalescing in the compact constructor would make this bullet-proof:

🛡️ Suggested defensive fix
 public MarkdownCitation {
     Objects.requireNonNull(url, "Citation URL cannot be null");
     Objects.requireNonNull(title, "Citation title cannot be null");
     Objects.requireNonNull(type, "Citation type cannot be null");
+    snippet = snippet != null ? snippet : "";
     if (position < 0) {
         throw new IllegalArgumentException("Citation position must be non-negative");
     }
 }
src/main/java/com/williamcallahan/javachat/service/GuidedLearningService.java (1)

101-119: Consider extracting the duplicated guidance string.

The guidance text on lines 112-115 is identical to lines 94-97 in streamGuidedAnswer. If you ever need to tweak the prompt (and you will!), you'd have to remember to update both places. A small constant or helper method keeps things DRY:

♻️ Suggested refactor
private static final String GUIDED_LEARNING_GUIDANCE = 
    "You are a Java learning assistant guiding the user through 'Think Java — 2nd Edition'. " +
    "Use ONLY content grounded in this book for factual claims. " +
    "Cite sources with [n] markers. Embed learning aids using {{hint:...}}, {{reminder:...}}, {{background:...}}, {{example:...}}, {{warning:...}}. " +
    "Prefer short, correct explanations with clear code examples when appropriate. If unsure, state the limitation.";

Then reference GUIDED_LEARNING_GUIDANCE in both methods.

src/test/java/TestWhatGetsStreamed.java (1)

1-12: Missing package declaration places this class in the default package.

Hey there! 👋 This test utility is missing a package declaration, which means it lives in the default package. That can cause issues with test organization and make it tricky to import from other test classes. Consider adding a package like com.williamcallahan.javachat to keep things tidy.

Also, a quick note: this is a handy diagnostic tool with a main() method, but it won't run as part of your standard test suite since JUnit won't pick it up. If you'd like it to run automatically, consider converting it to a JUnit test with @Test annotations (perhaps marked @Disabled by default since it requires an API key).

🛠️ Suggested fix
+package com.williamcallahan.javachat;
+
 import org.springframework.web.reactive.function.client.WebClient;
 import org.springframework.http.MediaType;
src/test/java/com/williamcallahan/javachat/web/MarkdownApiIntegrationTest.java (1)

6-8: Clarify or remove the "removed unused imports" comments.

Hey! 👋 These comments saying "removed unused imports" are a bit confusing since they're in a new file. They might be leftover from an IDE refactoring operation. Consider removing them to keep the code tidy:

✨ Suggested cleanup
 import org.springframework.beans.factory.annotation.Autowired;
-// removed unused imports
 import org.springframework.http.MediaType;
-// removed unused imports
 import org.springframework.web.bind.annotation.PostMapping;
src/test/java/com/williamcallahan/javachat/web/ChatSseIntegrationTest.java (1)

32-40: Add Accept: text/event-stream to mirror real SSE clients

Declaring the Accept header makes the server negotiate SSE explicitly and keeps this test closer to real clients. Tiny HTTP tidbit: SSE is just text/event-stream under the hood.

💡 Suggested tweak
 Flux<String> body = webTestClient.post()
         .uri("/api/chat/stream")
         .contentType(MediaType.APPLICATION_JSON)
+        .accept(MediaType.TEXT_EVENT_STREAM)
         .bodyValue("{\"latest\":\"Say hello in one short sentence.\"}")
src/test/java/com/williamcallahan/javachat/service/MarkdownServiceTest.java (2)

248-258: Simplify redundant OR assertions

Both sides of the OR are identical, so a single check is enough (or swap in the alternate variant you intended). Small cleanup keeps tests crisp.

✂️ Cleanup
-        assertTrue(html.contains("<p>Key points:</p>") || html.contains("<p>Key points:</p>"), "Leading text should be a paragraph");
+        assertTrue(html.contains("<p>Key points:</p>"), "Leading text should be a paragraph");

-        assertTrue(html.contains("class=\"markdown-table\"") || html.contains("class=\"markdown-table\""), "Table should have styling class");
-        assertTrue(html.contains("<blockquote class=\"markdown-quote\">") || html.contains("<blockquote class=\"markdown-quote\">"), "Blockquote should have styling class");
+        assertTrue(html.contains("class=\"markdown-table\""), "Table should have styling class");
+        assertTrue(html.contains("<blockquote class=\"markdown-quote\">"), "Blockquote should have styling class");

Also applies to: 291-297


308-317: Consider removing noisy System.out debug prints in tests

These can clutter CI logs; maybe gate them behind a debug flag or remove once stable. Fun tip: quieter tests make real failures easier to spot.

🔇 Suggested cleanup
-        System.out.println("[DEBUG testExampleCardRendersCode] Input: " + md);
-        System.out.println("[DEBUG testExampleCardRendersCode] HTML:\n" + html);
...
-        System.out.println("[DEBUG testEnrichmentInsideCodeNotRendered] Input: " + md);
-        System.out.println("[DEBUG testEnrichmentInsideCodeNotRendered] HTML:\n" + html);

Also applies to: 320-329

src/main/resources/static/index.html (1)

462-470: Duplicate @keyframes badge-pulse overrides the earlier definition.
Keeping one definition avoids confusion for future tweaks.

🧹 Cleanup (remove duplicate)
-    `@keyframes` badge-pulse { 0%, 100% { transform: scale(1); } 50% { transform: scale(1.05); } }
src/main/java/com/williamcallahan/javachat/service/RetrievalService.java (1)

163-184: Consider reordering: deduplicate before truncating 💡

Fun optimization opportunity! Currently you truncate all docs (line 164-168) and then deduplicate by URL (lines 171-182). If you flip this order, you'd avoid truncating documents that get discarded anyway. Small win, but every bit helps when processing tokens!

This is totally optional and won't affect correctness - just a nice-to-have efficiency tweak.

src/main/java/com/williamcallahan/javachat/service/OpenAIStreamingService.java (1)

276-285: Redundant return statement 📝

Little tidbit: line 281 has a return; but the method would exit naturally after the try-catch anyway. Not a bug, just a small cleanup opportunity!

The reflection approach here is actually quite clever for handling SDK version differences - nice forward-thinking design! 🎓

🧹 Minor cleanup
            // 2) Fallback: Chat Completions supports only top-level "reasoning_effort"
            //    Do NOT send a nested "reasoning" object on this endpoint.
            builder.putAdditionalBodyProperty("reasoning_effort", JsonValue.from("minimal"));
            log.info("[LLM] reasoning_effort set via additional body property");
-            return;
        } catch (Exception ex) {
            log.debug("Skipping reasoning_effort due to SDK compatibility: {}", ex.toString());
        }
src/main/java/com/williamcallahan/javachat/service/ChatService.java (1)

91-127: URL normalization logic is duplicated 🔄

Fun fact: these methods (normalizeUrlForPrompt and canonicalizeHttpDocUrl) are nearly identical to the ones in RetrievalService.java (lines 330-393). While they work correctly, this duplication means bug fixes need to be applied in two places.

Consider extracting to a shared utility class like UrlNormalizer or DocsSourceRegistry (which already has related URL mapping logic). Not urgent, but would improve maintainability!

src/main/java/com/williamcallahan/javachat/service/MarkdownService.java (1)

324-376: Consider removing the ThreadLocal value after use.
blocks.clear() is good; remove() avoids retaining the map object across pooled threads. Tiny hygiene win.

🔧 Suggested tweak
-        blocks.clear();
+        blocks.clear();
+        protectedBlocks.remove();

Also applies to: 740-757

src/main/resources/static/chat.html (1)

827-862: Consider consolidating the two DOMContentLoaded handlers.
Right now listeners/observers get registered twice, which can lead to duplicate work. A single init function (or a simple guard flag) would keep things lean.

Also applies to: 950-979

Comment thread README.md
Comment thread src/main/java/com/williamcallahan/javachat/service/GracefulEmbeddingModel.java Outdated
Comment thread src/main/java/com/williamcallahan/javachat/web/ChatController.java
Update from incorrect "8K input" to accurate ~400K token context
window with 128K max output per OpenAI API documentation.
Mark primaryAvailable, secondaryAvailable, lastPrimaryCheck, and
lastSecondaryCheck as volatile to ensure thread visibility across
concurrent requests in singleton bean.
Remove erroneous '+ i' from enrichment position calculation.
absolutePosition already tracks the correct offset; adding i
caused double-counting that skewed enrichment ordering.
Add saveLock object and synchronized block around saveState() to
prevent interleaved JSON writes when scheduler and request path
call safeSaveState() concurrently.
Check isProviderConfigured() at the start of isProviderAvailable()
to prevent unconfigured providers from being considered available,
which would trigger auth errors downstream.
Add null-safe check for doc.getText() in computeDocsHash() to
prevent NPE when URL metadata is missing and text is null.
The cold Flux from streamResponse() was being subscribed twice
(once for dataEvents, once for takeUntilOther), causing duplicate
API calls. Add .share() to hot-share the stream and move filter
before doOnNext to prevent null/empty chunks from accumulating.
@railway-app railway-app Bot temporarily deployed to insightful-intuition / production January 23, 2026 20:47 Inactive
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@src/main/java/com/williamcallahan/javachat/service/markdown/UnifiedMarkdownService.java`:
- Around line 111-131: The cache lookup uses the original markdown variable but
processing mutates it via truncation, preNormalizeForListsAndFences, and
extractAndPlaceholderizeEnrichments, so move the processCache.getIfPresent call
to occur before any modifications (i.e., before truncation and
pre-normalization) or change the cache key to use the final normalized string
(placeholderMarkdown) instead; update the lookup that calls
processCache.getIfPresent(markdown) to either call
processCache.getIfPresent(originalMarkdown) prior to calling
preNormalizeForListsAndFences/ extractAndPlaceholderizeEnrichments or to call
processCache.getIfPresent(placeholderMarkdown) after those transformations so
the cache key matches what is actually processed (refer to
UnifiedMarkdownService, preNormalizeForListsAndFences,
extractAndPlaceholderizeEnrichments, and processCache.getIfPresent).
♻️ Duplicate comments (4)
src/main/java/com/williamcallahan/javachat/service/RateLimitState.java (1)

217-230: Synchronized file writes properly implemented.

The saveLock synchronization prevents interleaved JSON writes from concurrent calls (scheduler + request path). This was flagged in a previous review and is now correctly addressed. 🔒

src/main/java/com/williamcallahan/javachat/service/GracefulEmbeddingModel.java (1)

29-34: Thread-safe circuit breaker fields properly implemented.

The volatile keyword on all four circuit-breaker fields ensures visibility across threads. This was flagged in a previous review and is now correctly addressed.

Fun Java tidbit: volatile guarantees that reads and writes are atomic for these primitive types, and establishes a happens-before relationship! 🧵

src/main/java/com/williamcallahan/javachat/service/RateLimitManager.java (1)

160-165: Configuration check now gates isProviderAvailable.

This was flagged in a previous review - unconfigured providers could be considered "available" and cause auth errors downstream. Now isProviderConfigured() is checked first.

This defensive check saves debugging time when API keys are missing! 🔐

src/main/java/com/williamcallahan/javachat/web/ChatController.java (1)

114-131: Streaming setup with .share() correctly implemented.

The previous concern about double API calls from cold Flux subscription has been addressed:

  • .filter() is applied before .doOnNext()
  • .share() hot-shares the stream ✓
  • Completion signal properly extracted ✓

The heartbeat termination via takeUntilOther(completion) ensures the SSE connection closes cleanly when data completes. Well done! 🎉

🧹 Nitpick comments (3)
README.md (1)

417-462: Fantastic mobile documentation! 📱

This is thorough and well-organized. One tiny thing to consider: the zoom prevention features (double-tap prevention on Line 427, and anti-pattern #5 about scaling on Line 456) might benefit from a note about accessibility.

Modern accessibility guidelines often recommend allowing user-initiated zoom for users with visual impairments. The 16px input font size (Line 424) is perfect for preventing unwanted iOS auto-zoom, but you might want to ensure users can still pinch-zoom if needed. If the double-tap prevention is specifically for UX during typing, that's totally reasonable—just worth a quick accessibility check! 🎓

src/main/java/com/williamcallahan/javachat/service/RerankerService.java (2)

185-187: Return a defensive copy instead of a subList view.

List.subList() returns a view backed by the original list, which could cause ConcurrentModificationException if the source list is modified. Since this is a cached result, returning a view could lead to surprising behavior for callers.

A quick fix keeps things safe and predictable! 🛡️

♻️ Suggested fix
     private List<Document> limitDocs(List<Document> docs, int returnK) {
-        return docs.subList(0, Math.min(returnK, docs.size()));
+        return new ArrayList<>(docs.subList(0, Math.min(returnK, docs.size())));
     }

144-151: Consider logging skipped invalid indices for easier debugging.

When the LLM returns an out-of-range index, it's silently dropped. Adding a debug log here would help trace weird reranking results without affecting production performance.

This is a minor "nice to have" for future you debugging at 2 AM! 🔦

💡 Optional enhancement
             for (JsonNode n : root.get("order")) {
                 int idx = n.asInt();
                 if (idx >= 0 && idx < docs.size()) {
                     reordered.add(docs.get(idx));
+                } else {
+                    log.debug("Skipping invalid rerank index: {} (docs size: {})", idx, docs.size());
                 }
             }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants