adavidryu/programmy.ai

Programmy

AI-powered practice for intro programming — problems generated from your course, graded in real time, with a tutor in the sidebar.


The problem

Students in courses like CSCE 120 (Intro to Programming @ TAMU) often hit a wall: they’ve been to lecture and read the material, but they don’t get enough practice that actually matches the exam. Static problem banks feel generic or out of sync with the class; office hours and TAs don’t scale. Programmy addresses this by generating practice problems on demand from the real course content, grading answers instantly, and providing an in-context assistant that knows which problem the student is working on.


What this is

Programmy is a learning platform where:

  • Students get unlimited practice problems tailored to the current week or exam, with control over difficulty (easy, medium, hard) and format (multiple choice or free response). They can switch weeks, request a new problem, submit an answer for immediate grading, and use a sidebar chat for hints or explanations.
  • Instructors (or course maintainers) provide content via S3 and a Bedrock Knowledge Base; the system uses that material to keep generated problems aligned with the syllabus instead of generic programming quizzes.

The core differentiator is RAG over course content: lectures and topics are ingested and retrieved so each problem matches the class’s style and coverage. Answer verification returns a clear correct/incorrect result. The assistant receives the current problem, week, and difficulty so it can give targeted help without leaving the practice flow.


How it works

Entry and auth — A public landing page describes the product and offers Sign In / Sign Up / Get Started; all route to Auth0. Authenticated users are redirected to the practice view; unauthenticated users hitting the practice view are sent back to the landing page.

Practice flow — On the practice view, the user selects:

  • Week or exam — Weeks 1–5, Exam 1, Weeks 6–8, 10–11, Exam 2 (configurable per deployment).
  • Difficulty — Easy, medium, or hard.
  • Problem type — MCQ (multiple choice) or FRQ (free response / code writing or tracing).

They click Generate New Problem. The app sends a POST request to /api/problems with weekId, difficulty, and problemType. The backend uses the Knowledge Base and S3 to fetch relevant content for that week/exam and difficulty, then calls the LLM to produce a problem (title, description, hints, test cases or choices, correct answer). The problem is displayed in the main panel. A 3-second rate limit between generate requests prevents API abuse.
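As a sketch, the client side of this request might look like the following; the `Problem` interface and the cooldown helper are illustrative assumptions based on the fields described above, not the repo's actual code.

```typescript
// Hypothetical sketch of the client-side generate call. The request body
// fields (weekId, difficulty, problemType) come from the flow described
// above; the Problem shape and canGenerate helper are assumptions.
interface Problem {
  id: string;
  title: string;
  description: string;
  difficulty: "easy" | "medium" | "hard";
  type: "mcq" | "frq";
  hints: string[];
}

// Pure check for the 3-second client-side rate limit between generations.
function canGenerate(lastAt: number, now: number, cooldownMs = 3000): boolean {
  return now - lastAt >= cooldownMs;
}

let lastGenerateAt = 0;

async function generateProblem(
  weekId: string,
  difficulty: Problem["difficulty"],
  problemType: Problem["type"]
): Promise<Problem | null> {
  const now = Date.now();
  if (!canGenerate(lastGenerateAt, now)) return null; // still cooling down
  lastGenerateAt = now;

  const res = await fetch("/api/problems", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ weekId, difficulty, problemType }),
  });
  if (!res.ok) throw new Error(`Problem generation failed: ${res.status}`);
  return (await res.json()) as Problem;
}
```

Keeping the rate-limit check as a pure function makes it trivial to unit-test independently of the network call.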

Answer submission — The student enters an answer and submits. The client POSTs the problem text, answer, weekId, and difficulty to /api/verify-answer. The backend prompts the model to return only “correct” or “incorrect”; the UI shows the result (e.g. check or X and brief feedback).

Assistant — A chat panel is available alongside the problem. Each message is sent to /api/chat with the user message plus the current problem (title, description), weekId, difficulty, and any topics. The model is instructed to act as a CSCE 120 programming assistant, to give clear explanations and code examples in markdown, and to avoid spoiling the answer. Responses are streamed back and rendered in the panel. There is no persistence of chat history or attempt history in the current version; the experience is session-oriented.
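A minimal sketch of the client side of this loop, assuming the route streams plain text (the actual wire format in the repo may differ):

```typescript
// Hypothetical client sketch for the /api/chat stream. The body fields
// mirror the request described above; plain-text chunks are an assumption.
interface ChatContext {
  problem: { title: string; description: string };
  weekId: string;
  difficulty: string;
  topics?: string[];
}

// Pure helper: serialize the request body the route expects.
function buildChatBody(message: string, ctx: ChatContext): string {
  return JSON.stringify({ message, ...ctx });
}

async function streamChat(
  message: string,
  ctx: ChatContext,
  onChunk: (text: string) => void
): Promise<void> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildChatBody(message, ctx),
  });
  if (!res.body) throw new Error("Expected a streamed response");

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    onChunk(decoder.decode(value, { stream: true })); // render incrementally
  }
}
```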


Problem generation and grading

Generation — Handled by LectureService in the backend. For a given weekId (numeric week or exam identifier), difficulty, and problemType (MCQ or FRQ):

  1. A retrieval query is built (e.g. week 3 easy FRQ) and sent to the Bedrock Knowledge Base via RetrieveCommand. The query returns up to 5 relevance-scored chunks; each chunk’s text and score are collected.
  2. Week metadata can be loaded from S3 (e.g. lectures/week{N}/topics.json) for topics and difficulty bands; this is used where available to further constrain or label the problem.
  3. A prompt is built that asks the model to generate a problem of the given difficulty and type, using the retrieved context. The prompt specifies a JSON schema: id, title, description, difficulty, type, category, and either choices + correctAnswer + explanation (MCQ) or testCases (FRQ), plus hints. The model is invoked with temperature 0.7 and a 1000-token limit; the response is parsed as JSON and returned to the client.
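The JSON schema in step 3 can be sketched as TypeScript types; the exact nested shapes below (e.g. the test-case fields) are assumptions extrapolated from the field list, not the repo's definitions.

```typescript
// Illustrative types for the generated-problem JSON described above.
interface ProblemBase {
  id: string;
  title: string;
  description: string;
  difficulty: "easy" | "medium" | "hard";
  category: string;
  hints: string[];
}

interface MCQProblem extends ProblemBase {
  type: "mcq";
  choices: string[];
  correctAnswer: string;
  explanation: string;
}

interface FRQProblem extends ProblemBase {
  type: "frq";
  testCases: { input: string; expectedOutput: string }[]; // assumed shape
}

type GeneratedProblem = MCQProblem | FRQProblem;

// Parse and minimally validate the model's JSON response.
function parseProblem(raw: string): GeneratedProblem {
  const p = JSON.parse(raw) as GeneratedProblem;
  if (!p.id || !p.title || !p.type) throw new Error("Malformed problem JSON");
  return p;
}
```

Validating the parsed JSON before returning it to the client guards against the model occasionally emitting incomplete output.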

Grading — The /api/verify-answer endpoint receives the problem (at least its description), the student’s answer, weekId, and difficulty. It builds a short prompt instructing the model to act as a programming grader and to respond with only the word “correct” or “incorrect”. The model is called with low temperature (0.1) and a 10-token limit. The response is normalized to lowercase and compared to "correct"; the API returns { isCorrect: boolean }. The UI uses this to show pass/fail and optional feedback.
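The verification step can be sketched as two small helpers; the prompt wording here is illustrative and approximates the description above rather than quoting the repo's exact prompt.

```typescript
// Illustrative grading helpers for the /api/verify-answer flow.
function buildGradingPrompt(problem: string, answer: string): string {
  return [
    "You are a programming grader for an intro course.",
    `Problem: ${problem}`,
    `Student answer: ${answer}`,
    'Respond with only the word "correct" or "incorrect".',
  ].join("\n");
}

// Normalize the model's one-word reply into the API's { isCorrect } shape.
function toVerdict(modelReply: string): { isCorrect: boolean } {
  return { isCorrect: modelReply.trim().toLowerCase() === "correct" };
}
```

Pinning temperature low (0.1) and capping output at 10 tokens, as described above, makes this single-word contract easy to enforce: anything other than a bare "correct" normalizes to `isCorrect: false`.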

| Component | Role |
| --- | --- |
| S3 | Stores week/topic metadata (e.g. `topics.json`) and optional example or lecture content that the Knowledge Base ingests. |
| Knowledge Base | Vector search over ingested course material; returns the top-k chunks for the retrieval query so problems are grounded in the syllabus. |
| Bedrock (Claude) | Problem generation (with RAG context), answer verification (correct/incorrect), and chat assistant (with current problem and week context). |

AI in the loop

The same Claude model (e.g. anthropic.claude-3-5-sonnet-20241022-v2:0 via Bedrock) is used in three places:

  1. Problem generation — RAG ensures the model is conditioned on retrieved course material rather than inventing content. The prompt and JSON schema keep outputs structured for the UI.
  2. Answer verification — A single-purpose prompt and short max tokens keep responses to a single word, so the API can reliably parse correct/incorrect and the UX stays consistent.
  3. Chat assistant — The system prompt includes the current problem title and description, week/exam id, difficulty, and any topics. The model is instructed to help with concepts and formatting (e.g. code blocks) and to avoid giving away the answer. No conversation history is stored; each request is stateless with the current context.
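The stateless system prompt in (3) might be assembled along these lines; the actual wording in the repo will differ.

```typescript
// Illustrative sketch of the per-request chat system prompt; every field
// comes from the current practice context, with no stored history.
interface ChatPromptContext {
  problemTitle: string;
  problemDescription: string;
  weekId: string;
  difficulty: string;
  topics?: string[];
}

function buildChatSystemPrompt(ctx: ChatPromptContext): string {
  const topicsLine = ctx.topics?.length ? `Topics: ${ctx.topics.join(", ")}.` : "";
  return [
    "You are a CSCE 120 programming assistant.",
    `The student is working on "${ctx.problemTitle}" (week ${ctx.weekId}, ${ctx.difficulty}).`,
    `Problem description: ${ctx.problemDescription}`,
    topicsLine,
    "Explain concepts clearly, use markdown code blocks, and do not reveal the answer.",
  ].filter(Boolean).join("\n");
}
```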

There is no persistence of attempts, scores, or chat history in the current version; the app does not use a database for student data.


Tech stack

  • Frontend: Next.js 15 (App Router), React 19, TypeScript, Tailwind CSS 4, Geist (sans and mono). Single-page flows: landing (/) and practice (/content).
  • Auth: Auth0 via @auth0/nextjs-auth0. Catch-all route at pages/api/auth/[...auth0].ts for login, logout, and callback. Client-side auth context wraps UserProvider and exposes user, isAuthenticated, login, logout for gating and redirects.
  • AI and data: AWS Bedrock (Claude 3.5 Sonnet), Bedrock Agent Runtime (Knowledge Base retrieval), S3 for lecture/week metadata and ingested content. No separate backend server; Next.js API routes under app/api/ implement problems, verify-answer, and chat.
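The catch-all auth route typically needs no more than the library's standard pattern; this is the documented `@auth0/nextjs-auth0` Pages Router usage rather than a copy of the repo's file.

```typescript
// pages/api/auth/[...auth0].ts: standard @auth0/nextjs-auth0 catch-all.
// handleAuth() wires up the /api/auth/login, /logout, and /callback routes.
import { handleAuth } from "@auth0/nextjs-auth0";

export default handleAuth();
```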

Key routes and services:

| Route / layer | Purpose |
| --- | --- |
| `GET /` | Landing page; redirects authenticated users to `/content`. |
| `GET /content` | Practice UI; redirects unauthenticated users to `/`. |
| `POST /api/problems` | Generate a problem (body: `weekId`, `difficulty`, `problemType`). |
| `GET /api/problems` | Optional; e.g. fetch topics for a week (query: `week`). |
| `POST /api/verify-answer` | Grade a submission (body: `problem`, `answer`, `weekId`, `difficulty`). |
| `POST /api/chat` | Assistant reply (body: `message`, `problem`, `weekId`, `difficulty`, `topics`). |
| `LectureService` | S3 + Knowledge Base + Bedrock for generation and topic metadata. |

Programmy — closing the gap between “I went to lecture” and “I can do this on the exam” with course-grounded, AI-powered practice.
