Feat/implement real evaluator #64

ArchILLtect · 2025-12-09T05:24:43Z

This pull request introduces the MVP implementation of a dedicated evaluator service for CodeForge, along with significant enhancements to the Practice and Drill flows, improved configuration and onboarding documentation, and better user/session tracking. The changes focus on enabling answer evaluation for coding challenges, wiring feedback into user flows, and supporting per-user drill features. The update also adds required model fields, configuration utilities, and documentation for a smoother setup and testing experience.

Evaluator Service & Challenge Evaluation

Implemented a pluggable evaluator: Added AnswerEvaluator interface, AnswerEvaluation DTO, and BasicEvaluatorService that compares challenge submissions to expected answers with normalization and feedback. [1] [2] [3]
Updated Challenge entity to include an expectedAnswer field for evaluation purposes.
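A minimal sketch of how a pluggable evaluator of this kind could look, assuming a simple "normalize, then compare" strategy. The type names (`SketchOutcome`, `SketchEvaluation`, `SketchEvaluator`, `SketchBasicEvaluator`) and the method signature are hypothetical stand-ins, not the PR's actual AnswerEvaluator, AnswerEvaluation, or BasicEvaluatorService code.

```java
import java.util.Locale;

// Hypothetical outcome values; the PR only confirms that Outcome.SKIPPED exists.
enum SketchOutcome { CORRECT, INCORRECT, SKIPPED }

// Hypothetical stand-in for the AnswerEvaluation DTO: an outcome plus feedback text.
final class SketchEvaluation {
    final SketchOutcome outcome;
    final String feedback;

    SketchEvaluation(SketchOutcome outcome, String feedback) {
        this.outcome = outcome;
        this.feedback = feedback;
    }
}

// Hypothetical stand-in for the AnswerEvaluator interface; the real signature may differ.
interface SketchEvaluator {
    SketchEvaluation evaluate(String expectedAnswer, String submission);
}

final class SketchBasicEvaluator implements SketchEvaluator {

    // Lowercase, turn punctuation into spaces, then collapse runs of whitespace.
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT)
                .replaceAll("\\p{Punct}", " ")
                .trim()
                .replaceAll("\\s+", " ");
    }

    @Override
    public SketchEvaluation evaluate(String expectedAnswer, String submission) {
        // Guards: nothing to compare against, or nothing submitted.
        if (expectedAnswer == null || expectedAnswer.isBlank()) {
            return new SketchEvaluation(SketchOutcome.SKIPPED, "No expected answer is configured.");
        }
        if (submission == null || submission.isBlank()) {
            return new SketchEvaluation(SketchOutcome.SKIPPED, "Submission is empty.");
        }
        boolean matches = normalize(expectedAnswer).equals(normalize(submission));
        return matches
                ? new SketchEvaluation(SketchOutcome.CORRECT, "Correct.")
                : new SketchEvaluation(SketchOutcome.INCORRECT, "Answer does not match the expected answer.");
    }
}
```

Stripping punctuation is lossy for code-like answers, so a real evaluator would probably want a narrower normalization; the interface makes it possible to swap one in without touching the servlets.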

Practice & Drill Flow Enhancements

Practice and Drill servlet flows now wire evaluator feedback inline and via flash messages, with public access for Practice and per-user tracking for Drill. [1] [2] [3]
DrillItem and Submission entities now include a userId field (Cognito sub) for per-user drill and submission tracking. [1] [2]
Authentication controller stores Cognito sub in session for downstream drill features.

Configuration & Documentation Updates

Added LocalConfig utility for loading local properties, and updated README.md with setup instructions for environment variables and test database configuration. [1] [2] [3]
Created projects/mvp/README.md and improved documentation for evaluator, Practice/Drill flows, troubleshooting, and quick start. [1] [2]

Testing & MVP Completion

Updated MVP TODOs and documentation to reflect completed evaluator, Practice/Drill flows, feature flags, security, and tests for both evaluator and drill persistence. [1] [2] [3]

Closes #53
Closes #54
Closes #55
Closes #56
Closes #57
Closes #58
Closes #59
Closes #60
Closes #61
Closes #62

References:
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15]

Copilot

Pull request overview

This pull request implements an MVP evaluator service for CodeForge, enabling answer evaluation for coding challenges through a pluggable architecture. The PR introduces two distinct user flows: Practice (public, no persistence) and Drill (authenticated, with spaced repetition tracking). It adds per-user drill tracking via a userId field (Cognito sub), implements a basic string-matching evaluator with normalization, and enhances configuration management through a three-tier resolution system (system properties → environment variables → local.properties).

Key changes:

Evaluator service with BasicEvaluatorService comparing submissions against Challenge.expectedAnswer using normalization and fuzzy matching
Practice and Drill servlet flows with inline/flash feedback, feature flags for enabling/disabling features
Per-user drill isolation: userId added to DrillItem and Submission entities with user-scoped DAO queries
Configuration improvements: LocalConfig utility, updated README with setup instructions, removed hardcoded credentials from test resources
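The three-tier resolution order mentioned in the overview (system properties, then environment variables, then local.properties, then a default) can be sketched roughly as follows. `ConfigResolverSketch` is a hypothetical class, and using the same key for all three sources is an assumption; the real SessionFactoryProvider and LocalConfig may map keys differently.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical resolver; sources are injected so the lookup order is easy to test.
final class ConfigResolverSketch {
    private final Properties systemProps;
    private final Map<String, String> env;
    private final Properties local;

    ConfigResolverSketch(Properties systemProps, Map<String, String> env, Properties local) {
        this.systemProps = systemProps;
        this.env = env;
        this.local = local;
    }

    // Convenience factory wired to the real JVM sources; local properties are passed in.
    static ConfigResolverSketch fromRuntime(Properties local) {
        return new ConfigResolverSketch(System.getProperties(), System.getenv(), local);
    }

    // Returns the first non-blank value in priority order, else the default.
    String resolve(String key, String defaultValue) {
        String value = systemProps.getProperty(key);
        if (value != null && !value.isBlank()) {
            return value;
        }
        value = env.get(key);
        if (value != null && !value.isBlank()) {
            return value;
        }
        value = local.getProperty(key);
        if (value != null && !value.isBlank()) {
            return value;
        }
        return defaultValue;
    }
}
```

Passing `null` as the default for a secret such as the database password lets the caller fail fast when no tier supplies it.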

Reviewed changes

Copilot reviewed 48 out of 49 changed files in this pull request and generated 22 comments.

File	Description
`src/test/resources/hibernate.properties`	Removed file to avoid committing database credentials (good security practice)
`src/test/resources/cleandb.sql`	Added `expected_answer` to challenges, `user_id` to drill_items/submissions; updated seed data
`src/test/java/.../DrillServletTest.java`	Added session mocking for userId; updated service calls to include userId parameter
`src/test/java/.../DrillServiceTest.java`	Updated all test methods to pass userId parameter to drill service methods
`src/test/java/.../DrillServicePersistenceTest.java`	New integration test verifying drill item persistence and field updates
`src/test/java/.../DrillServiceEnrollmentTest.java`	New unit test for auto-enrollment feature (creates missing drill items per user)
`src/test/java/.../ChallengeRunServiceTest.java`	Removed old heuristic tests; needs new evaluator integration tests
`src/test/java/.../persistence/*DaoTest.java`	Updated to set userId on all DrillItem and Submission instances
`src/test/java/.../BasicEvaluatorServiceTest.java`	New comprehensive tests for evaluator (exact match, punctuation/spacing, incorrect, guards)
`src/main/webapp/css/*.css`	Added alert styles for info/error feedback; practice form field styles
`src/main/webapp/WEB-INF/jsp/practice/solve.jsp`	New JSP for public practice mode with inline feedback display
`src/main/webapp/WEB-INF/jsp/drill/*.jsp`	Updated to show streak/next-due stats; flash messages; date formatting (has Instant→Date issue)
`src/main/webapp/WEB-INF/jsp/challenges/*.jsp`	Added expectedAnswer input fields; improved XSS escaping with c:out
`src/main/java/.../web/UserContext.java`	New helper to extract userId from session; centralizes auth context reading
`src/main/java/.../web/PracticeServlet.java`	Complete rewrite: handles GET/POST for practice mode, no persistence, inline feedback
`src/main/java/.../web/DrillServlet.java`	Enhanced with userId extraction, auto-enrollment, flash messages, evaluator integration
`src/main/java/.../web/ChallengesServlet.java`	Extracts expectedAnswer from form and passes to service
`src/main/java/.../web/ChallengeForm.java`	Added expectedAnswer field with validation
`src/main/java/.../web/AuthGuardFilter.java`	Exempts practice routes from authentication; unified auth check with UserContext
`src/main/java/.../service/DrillService.java`	All methods updated to require userId; added ensureEnrollmentForUser for auto-enrollment
`src/main/java/.../service/ChallengeService.java`	Sets expectedAnswer when creating/updating challenges
`src/main/java/.../service/ChallengeRunService.java`	Replaced heuristics with evaluator integration; added runWithMode and telemetry logging
`src/main/java/.../persistence/SubmissionDao.java`	Added listByChallengeIdAndUser query method
`src/main/java/.../persistence/SessionFactoryProvider.java`	Refactored to use three-tier config resolution; requires DB password from config
`src/main/java/.../persistence/DrillItemDao.java`	Added user-scoped queries: listByChallengeIdAndUser, dueQueue, soonestUpcoming, etc.
`src/main/java/.../evaluator/*.java`	New evaluator package with interface, DTO, BasicEvaluatorService, and Normalizer utility
`src/main/java/.../entity/*.java`	Added expectedAnswer to Challenge; userId to DrillItem and Submission
`src/main/java/.../controller/Auth.java`	Stores Cognito sub in session as "userSub" for drill features
`src/main/java/.../config/LocalConfig.java`	New utility to load local.properties from classpath for dev config
`src/main/resources/log4j2.properties`	Added telemetry appender with rolling file; updated root logger level
`src/main/resources/application.properties`	Added feature flags for practice and drill
`projects/mvp/README.md`	New MVP quick-start guide with config, routes, testing instructions
`docs/*.md`	Updated mvp-todo, added evaluator implementation plan and design docs
`README.md`	Enhanced with environment setup, database config instructions, local.properties template
`.gitignore`	Added local.properties to prevent committing dev credentials

Comments suppressed due to low confidence (1)

src/test/resources/hibernate.properties:1

Security: The entire hibernate.properties file containing database credentials has been deleted. This is good practice to avoid committing secrets to version control. However, ensure that the README instructions for creating this file locally are clear and that CI/CD pipelines have proper configuration to provide these values during testing.

src/main/webapp/WEB-INF/jsp/drill/queue.jsp

src/main/java/me/nickhanson/codeforge/config/LocalConfig.java

README.md

src/main/resources/log4j2.properties

src/main/webapp/WEB-INF/jsp/challenges/edit.jsp

Copilot · 2025-12-09T05:40:19Z

src/main/java/me/nickhanson/codeforge/web/AuthGuardFilter.java

+            if (path.matches("^/practice/\\d+/submit$")) {
+                needsAuth = false;
+            }


Security: The regex pattern ^/practice/\\d+/submit$ allows public POST submissions to practice challenges. While this is intentional per the design, ensure that PracticeServlet properly validates and sanitizes all inputs (language, code) to prevent injection attacks. Additionally, consider adding rate limiting to prevent abuse of the public evaluation endpoint.
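As an illustration of the rate-limiting suggestion, here is a minimal fixed-window limiter, assuming requests are keyed by some client identifier such as the remote address. `FixedWindowRateLimiterSketch` is hypothetical and not part of the PR; the clock is passed in so the behavior is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory limiter: at most maxRequests per client per window.
final class FixedWindowRateLimiterSketch {
    private final int maxRequests;
    private final long windowMillis;
    // Per client: element 0 is the window start (ms), element 1 is the request count.
    private final Map<String, long[]> windows = new HashMap<>();

    FixedWindowRateLimiterSketch(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is allowed, false if the client is over the limit.
    synchronized boolean tryAcquire(String clientKey, long nowMillis) {
        long[] window = windows.get(clientKey);
        if (window == null || nowMillis - window[0] >= windowMillis) {
            // First request from this client, or the previous window has expired.
            windows.put(clientKey, new long[] { nowMillis, 1 });
            return true;
        }
        if (window[1] < maxRequests) {
            window[1]++;
            return true;
        }
        return false;
    }
}
```

A servlet or filter could call `tryAcquire(req.getRemoteAddr(), System.currentTimeMillis())` and answer with HTTP 429 when it returns false. The map here grows without bound, so a production version would need eviction of stale entries.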

Copilot · 2025-12-09T05:40:19Z

src/main/java/me/nickhanson/codeforge/persistence/SessionFactoryProvider.java

+        String dialect = resolve(
+                "hibernate.dialect",
+                "org.hibernate.dialect.MySQL8Dialect"
+        );


Maintainability: The dialect has been changed from MySQLDialect to MySQL8Dialect on line 91. While this is likely correct for MySQL 8.x, it's a breaking change for environments still running MySQL 5.7 or earlier. Consider documenting this requirement in the README or making the dialect version-aware based on the actual MySQL version detected at runtime.
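One way the "version-aware" idea could be sketched is a small helper that maps a detected server version to a dialect class name. `DialectChooserSketch` is hypothetical; the version numbers are assumed to come from JDBC's `DatabaseMetaData.getDatabaseMajorVersion()` and `getDatabaseMinorVersion()`, and the dialect class names assume a Hibernate 5.x line, so they should be checked against the Hibernate version the project actually uses.

```java
// Hypothetical helper: picks a Hibernate dialect class name from the MySQL server version.
final class DialectChooserSketch {

    static String dialectFor(int majorVersion, int minorVersion) {
        if (majorVersion >= 8) {
            return "org.hibernate.dialect.MySQL8Dialect";
        }
        if (majorVersion == 5 && minorVersion >= 7) {
            return "org.hibernate.dialect.MySQL57Dialect";
        }
        // Fall back to the generic dialect for older servers.
        return "org.hibernate.dialect.MySQLDialect";
    }
}
```

An explicit `hibernate.dialect` setting would still take precedence; a helper like this would only supply the default.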

Copilot · 2025-12-09T05:40:20Z

src/main/java/me/nickhanson/codeforge/web/PracticeServlet.java

+            if (language == null || language.isBlank() || code == null || code.isBlank()) {
+                Challenge challenge = challengeService.getById(id).orElse(null);
+                if (challenge == null) { resp.sendError(404); return; }
+                req.setAttribute("challenge", challenge);
+                req.setAttribute("mode", "practice");
+                req.setAttribute("submittedCode", code);
+                req.setAttribute("outcome", me.nickhanson.codeforge.entity.Outcome.SKIPPED);
+                req.setAttribute("feedback", "Missing language or code. Please fill in both fields.");
+                req.getRequestDispatcher("/WEB-INF/jsp/practice/solve.jsp").forward(req, resp);
+                return;
            }


Bug: When language or code is blank, this branch sets the outcome to SKIPPED with a feedback message, then fetches the challenge and forwards to the JSP. Passing a null code to req.setAttribute("submittedCode", code) on line 73 is harmless. The real issue is that this validation duplicates what the evaluator should handle. Consider calling the evaluator in all cases so validation is handled consistently, or at minimum make sure the blank check (code == null || code.isBlank()) matches the evaluator's expectations.

Suggested change

            if (language == null || language.isBlank() || code == null || code.isBlank()) {
                Challenge challenge = challengeService.getById(id).orElse(null);
                if (challenge == null) { resp.sendError(404); return; }
                req.setAttribute("challenge", challenge);
                req.setAttribute("mode", "practice");
                req.setAttribute("submittedCode", code);
                req.setAttribute("outcome", me.nickhanson.codeforge.entity.Outcome.SKIPPED);
                req.setAttribute("feedback", "Missing language or code. Please fill in both fields.");
                req.getRequestDispatcher("/WEB-INF/jsp/practice/solve.jsp").forward(req, resp);
                return;
            }

src/main/webapp/WEB-INF/jsp/drill/solve.jsp

src/main/resources/log4j2.properties

…user experience

Documentation: The comment syntax `// e.g., localhost` is not valid in .properties files, which use # for comments. The correct form is `# e.g., localhost`; alternatively, remove the comment and keep a plain key=value line. This applies to all the example property comments on lines 187-191.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
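The effect of a `//` "comment" can be checked with `java.util.Properties` directly. In this small sketch the key `db.host` is a hypothetical example, not necessarily a key from the README; `PropertiesCommentSketch` is likewise illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper that parses .properties text held in a string.
final class PropertiesCommentSketch {

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from a StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

Parsing `db.host=localhost // e.g., localhost` yields the value `localhost // e.g., localhost`, because the format has no inline comments. A line that starts with `//` is read as a property whose key is `//`. Only lines whose first non-blank character is `#` or `!` are skipped.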

Copilot

Pull request overview

Copilot reviewed 49 out of 50 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (1)

src/test/resources/hibernate.properties:1

This file contained a hardcoded database password (ArchILLtect2005!) that was committed to version control. While this PR correctly deletes the file (it should be git-ignored), the password has been exposed in the git history and should be considered compromised. The database password should be rotated immediately. Verify that .gitignore includes src/test/resources/hibernate.properties to prevent future credential leaks.

src/main/java/me/nickhanson/codeforge/web/DrillServlet.java

src/main/java/me/nickhanson/codeforge/evaluator/Normalizer.java

src/main/java/me/nickhanson/codeforge/evaluator/AnswerEvaluation.java

Copilot · 2025-12-09T06:09:47Z

src/main/java/me/nickhanson/codeforge/service/ChallengeRunService.java

@@ -32,43 +39,32 @@ public class ChallengeRunService {
     * @return A RunResult object representing the outcome of the evaluation.
     */


The run() method now just delegates to runWithMode() with mode="unknown". Consider deprecating this method with @Deprecated annotation and updating callers to use runWithMode() directly for better telemetry tracking.

Suggested change

-     */
+     */
+    /**
+     * @deprecated Use {@link #runWithMode(String, Long, String, String)} for better telemetry tracking.
+     */
+    @Deprecated
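A rough sketch of what the deprecated delegate could look like in context. `ChallengeRunServiceSketch` and `RunResultSketch` are hypothetical stand-ins, and the parameter order (mode, challenge id, language, code) is an assumption based only on the `(String, Long, String, String)` signature quoted in the suggestion.

```java
// Hypothetical stand-in for RunResult; it records only what this sketch needs.
final class RunResultSketch {
    final String mode;
    final String message;

    RunResultSketch(String mode, String message) {
        this.mode = mode;
        this.message = message;
    }
}

class ChallengeRunServiceSketch {

    /**
     * @deprecated Use {@link #runWithMode(String, Long, String, String)} so telemetry
     *             records which flow made the call.
     */
    @Deprecated
    RunResultSketch run(Long challengeId, String language, String code) {
        // Legacy entry point: delegates with a placeholder mode.
        return runWithMode("unknown", challengeId, language, code);
    }

    RunResultSketch runWithMode(String mode, Long challengeId, String language, String code) {
        // A real service would call the evaluator and log telemetry here.
        return new RunResultSketch(mode, "evaluated challenge " + challengeId + " (" + language + ")");
    }
}
```

Keeping `run()` as a thin deprecated wrapper lets existing callers compile with a warning while new code passes an explicit mode such as "practice" or "drill".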

src/main/java/me/nickhanson/codeforge/web/PracticeServlet.java

src/main/java/me/nickhanson/codeforge/evaluator/BasicEvaluatorService.java

src/main/webapp/WEB-INF/jsp/challenges/detail.jsp

README.md

ArchILLtect added 30 commits December 5, 2025 10:53

44d0b00  chore(docs: evaluator): plan to implement dedicated evaluation service for challenge answers
d46b7c9  feat(detail): escape Markdown content to prevent HTML injection
3590647  feat(detail): escape Markdown content to prevent HTML injection
83645c8  feat(config): implement LocalConfig for loading properties and enhance SessionFactoryProvider configuration resolution
661a2f0  feat(database): add user_id to drill_items and submissions tables for tracking user-specific data
cda49ef  feat(auth): enhance user authentication by storing Cognito sub and centralizing user context logic
796a6bd  feat(challenge): add expected answer field to challenge forms for evaluator
7643859  feat(challenge): add expected answer field for evaluator in Challenge model
ecd428d  feat(challenge): add expected answer handling in ChallengeService and ChallengesServlet
e661cbf  feat(drillitem): add user_id field and user-specific queries for DrillItems
f9b05a1  feat(drill): enhance DrillService and DrillServlet to support user-specific DrillItems and enrollment
244eeda  feat(submission): add user_id field and method to list submissions by challenge and user
f4abb0d  feat(test): add user_id to DrillItem and Submission in ChallengeDaoTest
d51146e  feat(test): set user_id for DrillItems in DrillItemDaoTest
c8d68bc  feat(test): add DrillServiceEnrollmentTest to verify user-specific enrollment logic
474d44c  feat(test): update DrillServletTest to include user-specific session handling and drill item retrieval
0bf269d  feat(test): add user_id handling in SubmissionDaoTest for consistent user context
c5e9dd0  feat(auth): update AuthGuardFilter to allow public access for practice submissions
b6eb6a5  feat(evaluator): implement AnswerEvaluation class and AnswerEvaluator interface
eee1696  feat(evaluator): implement BasicEvaluatorService and Normalizer utility methods
5d73b78  feat(evaluator): integrate ChallengeDao and AnswerEvaluator into ChallengeRunService
01306b6  feat(evaluator): add feature flags for practice and drill functionalities
c559166  feat(evaluator): add practice functionality with new practice page and styles
03ca13f  feat(evaluator): add unit tests for BasicEvaluatorService and DrillService persistence
c51899d  feat(evaluator): enhance ChallengeRunService with telemetry logging and mode support
e63a01d  feat(evaluator): add telemetry logging configuration and update drill item display
ed31e8e  feat(evaluator): update DrillServletTest to use runWithMode method
5074738  feat(evaluator): update MVP documentation and polish telemetry logging
984a0cc  feat(evaluator): implement error handling for missing language or code in Drill and Practice servlets
fa0ebe2  feat(evaluator): enhance date formatting and null handling in drill item displays

ArchILLtect added this to CodeForge – MVP Release Dec 9, 2025

Copilot started reviewing on behalf of ArchILLtect December 9, 2025 05:25

Copilot AI reviewed Dec 9, 2025

ArchILLtect and others added 2 commits December 8, 2025 23:40

fce1664  feat(practice): update Practice page layout and content for improved user experience

Copilot AI review requested due to automatic review settings December 9, 2025 05:57

Copilot started reviewing on behalf of ArchILLtect December 9, 2025 05:58

c88619a  feat(auth): enhance security by redacting sensitive token information in logs

ArchILLtect merged commit 05d907f into main Dec 9, 2025
2 checks passed

ArchILLtect moved this to Done in CodeForge – MVP Release Dec 9, 2025

Copilot AI reviewed Dec 9, 2025

ArchILLtect deleted the feat/implement-real-evaluator branch December 9, 2025 06:26

ArchILLtect mentioned this pull request Dec 9, 2025

chore: implement changes from code review #65

Open

4 tasks
