AI Agent-Based Grading System

This is a demonstration of an AI agent-based system for automatically grading programming assignments. It uses a WebSocket-based backend with multiple specialized agents that work together to analyze, review, and grade student code submissions.

Files

- agent_backend.py - The WebSocket server that manages agent communication and execution flow
- grader_initialize.json - Initialization payload for the grading agents
- grader_execute.json - Execution payload with the problem and the student submission
- grader_payload_full.json - Complete reference payload (both init and execute)
- grader_test.py - Python script to test the grading system
- grader_demo.py - Demo script that simulates the grading workflow without API keys
- run_grader.sh - Bash script to run the grader using websocat

Setup

Make sure you have Python 3.7+ installed

Install the required packages:

pip install websockets python-dotenv

(asyncio is part of the Python standard library and does not need to be installed with pip.)

Create a .env file in the project root with your OpenAI API key:
```
OPENAI_API_KEY=your_openai_api_key_here
```
If using the shell script, install websocat:
- Linux: sudo apt install websocat
- macOS: brew install websocat
- Windows: Download from the GitHub releases page

Usage

Option 1: Using Python Script

Start the backend server:
```
python agent_backend.py
```
Run the test script:
```
python grader_test.py
```

Option 2: Using Demo Mode

If you want to see a simulated demonstration:

python grader_demo.py

Option 3: Using JSON and websocat

Start the backend server:
```
python agent_backend.py
```

Run the shell script:

bash run_grader.sh

Or manually send the WebSocket requests:

websocat ws://localhost:8765 < initialize_payload.json
websocat ws://localhost:8765 < execution_payload.json
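Each websocat command above opens its own connection. The same exchange can also be sketched in Python over a single connection. This is a minimal sketch, not the contents of grader_test.py: it assumes the websockets package from Setup, the ws://localhost:8765 address shown above, and that the server sends one message back per request, which this README does not confirm. The names load_payload and send_payloads are hypothetical.

```python
import asyncio
import json
from pathlib import Path

SERVER_URI = "ws://localhost:8765"  # address used in the websocat commands


def load_payload(path):
    """Read a JSON payload file and return it as a compact one-line string."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return json.dumps(data, separators=(",", ":"))


async def send_payloads(paths, uri=SERVER_URI):
    """Send each payload over one connection and collect the raw replies.

    Assumption: the server answers every request with exactly one message.
    """
    import websockets  # imported here so load_payload works without the package

    replies = []
    async with websockets.connect(uri) as ws:
        for path in paths:
            await ws.send(load_payload(path))
            replies.append(await ws.recv())
    return replies


# Usage, with the backend running:
# asyncio.run(send_payloads(["initialize_payload.json", "execution_payload.json"]))
```

Parsing each file before sending catches malformed JSON on the client side instead of relying on the server to reject it.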

Sample Payloads

Initialization Payload

{
  "action": "initialize",
  "nodes": [
    {
      "id": "agent1",
      "name": "data_collector",
      "description": "Collects and organizes data",
      "system_message": "You are a data collection expert. Collect and organize information from the user's query in a structured format.",
      "feedback_enabled": true
    },
    {
      "id": "agent2",
      "name": "researcher",
      "description": "Researches specific topics in depth",
      "system_message": "You are a research specialist. Focus on providing detailed, well-researched information on the user's query.",
      "feedback_enabled": true
    },
    {
      "id": "agent3",
      "name": "analyst",
      "description": "Analyzes data and provides insights",
      "system_message": "You are a data analyst. Analyze information and provide meaningful insights and patterns.",
      "feedback_enabled": false
    },
    {
      "id": "agent4",
      "name": "summarizer",
      "description": "Summarizes information concisely",
      "system_message": "You are a summarization expert. Create concise, informative summaries that capture the key points.",
      "feedback_enabled": false
    }
  ]
}

Note: The feedback_enabled flag allows an agent to request additional information from the user during execution. When set to true, the agent can ask for clarification or more details by including [FEEDBACK_REQUEST: your question here] in its response. The execution will pause, wait for user input, and then continue with the additional information.
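To illustrate the marker described in the note, here is a small sketch of how a client or server might pull the question out of an agent response. The marker format comes from the note; the function name extract_feedback_request and the regular expression are assumptions, not the backend's actual parsing code.

```python
import re

# Marker format taken from the note above: [FEEDBACK_REQUEST: question]
# The non-greedy match stops at the first "]", so a question that itself
# contains "]" would be cut short.
FEEDBACK_PATTERN = re.compile(r"\[FEEDBACK_REQUEST:\s*(.*?)\]", re.DOTALL)


def extract_feedback_request(response):
    """Return the question inside the first feedback marker, or None."""
    match = FEEDBACK_PATTERN.search(response)
    return match.group(1).strip() if match else None
```

A response without the marker yields None, which a caller could treat as "no pause needed".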

Execution Payload

{
  "action": "execute",
  "user_question": "Explain the impact of artificial intelligence on healthcare over the next decade.",
  "execution_flow": [
    "agent1",
    ["agent2", "agent3"],
    "agent4"
  ]
}
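The execution_flow above mixes plain agent ids with a nested list. This README does not say how the nested list is handled; the sketch below assumes that a nested list means "run these agents concurrently" and that each agent sees the results gathered so far. The run_agent function is a stand-in that makes no model call, and both function names are hypothetical.

```python
import asyncio


async def run_agent(agent_id, context):
    """Stand-in for a real agent call; returns a labelled string."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{agent_id} saw {len(context)} prior result(s)"


async def run_flow(execution_flow):
    """Walk the flow in order; a nested list is run concurrently (assumption)."""
    results = []
    for step in execution_flow:
        group = step if isinstance(step, list) else [step]
        # Every agent in a group receives a copy of the results gathered so far.
        outputs = await asyncio.gather(
            *(run_agent(agent_id, list(results)) for agent_id in group)
        )
        results.extend(zip(group, outputs))
    return results


flow = ["agent1", ["agent2", "agent3"], "agent4"]
for agent_id, output in asyncio.run(run_flow(flow)):
    print(agent_id, "->", output)
```

Under these assumptions agent2 and agent3 both see only agent1's output, while agent4 sees all three earlier results.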

Handling Feedback Requests

When an agent with feedback_enabled: true requests user feedback, the server will send a message with the following format:

{
  "type": "feedback_request",
  "agent": "requirements_analyst",
  "agent_id": "requirements",
  "question": "Can you provide more details about the workout types the app should track?",
  "original_response": "Based on your initial request, I've outlined the following requirements... [FEEDBACK_REQUEST: Can you provide more details about the workout types the app should track?]",
  "feedback_count": 1,
  "max_feedback": 2
}

The client can respond in one of three ways:

Providing detailed feedback:

{
  "type": "feedback_response",
  "content": "The app should track weightlifting with sets, reps, and weight; cardio with distance, time, and calories; and flexibility exercises with duration and difficulty level. I also want custom workout creation."
}

Indicating satisfaction (short positive response):

{
  "type": "feedback_response",
  "content": "Yes, that's perfect. Please continue."
}

Note: Short positive responses like "yes", "good", "continue" are automatically detected as satisfaction indicators and will allow the agent to move to the next step without further requests.

Skipping feedback entirely:

{
  "type": "feedback_skip"
}
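The three reply shapes above can be produced by one small client-side helper. This is a sketch under assumptions: the message fields come from the samples in this section, while build_feedback_reply, reply_to, and the ask callback are hypothetical names, not part of grader_test.py.

```python
import json


def build_feedback_reply(answer=None):
    """Map a user answer to a reply dict; None or blank text means skip."""
    if answer is None or not answer.strip():
        return {"type": "feedback_skip"}
    return {"type": "feedback_response", "content": answer.strip()}


def reply_to(raw_message, ask):
    """Return the JSON reply for a feedback_request, or None for other messages.

    `ask` is a callable that receives the agent's question and returns the
    user's answer (or None to skip).
    """
    message = json.loads(raw_message)
    if message.get("type") != "feedback_request":
        return None
    return json.dumps(build_feedback_reply(ask(message["question"])))
```

A detailed answer and a short "yes, continue" both use the feedback_response shape; only the content differs.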

Features and Limitations:

- Each agent is limited to a maximum of 2 feedback requests
- When the user indicates satisfaction, the agent will stop requesting feedback and move to the next step
- Feedback count is tracked per agent
- The system automatically detects satisfaction indicators in short responses
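The rules listed above could be expressed roughly as follows. The limit of 2 matches max_feedback in the sample feedback_request; the word list, the 8-word cut-off for a "short" response, and both function names are assumptions made for illustration, since the server's actual detection rules are not documented here.

```python
MAX_FEEDBACK = 2  # matches "max_feedback" in the sample feedback_request

# Hypothetical word list and length cut-off (assumptions, see lead-in).
POSITIVE_WORDS = {"yes", "good", "continue", "perfect"}
SHORT_RESPONSE_WORDS = 8


def is_satisfied(content):
    """Treat a short response containing a positive word as satisfaction."""
    words = [w.strip(".,!?'\"").lower() for w in content.split()]
    return len(words) <= SHORT_RESPONSE_WORDS and any(
        w in POSITIVE_WORDS for w in words
    )


def may_request_feedback(feedback_count, last_response=None):
    """Allow another request only below the limit and without satisfaction."""
    if feedback_count >= MAX_FEEDBACK:
        return False
    return last_response is None or not is_satisfied(last_response)
```

With these assumptions, "Yes, that's perfect. Please continue." counts as satisfaction, while the longer workout-tracking answer shown earlier does not.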

How It Works

The grading system uses a pipeline of 3 specialized AI agents:

- Problem Analyzer - Breaks down the problem requirements into evaluation criteria
- Code Reviewer - Reviews the student code against these criteria
- Grader - Assigns a final grade with justification based on the analysis and review

Communication is managed through a WebSocket server that routes messages between agents and returns the final grade.
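As a sketch of how the three-agent pipeline might be declared, the snippet below builds an initialization payload in the node format shown under Sample Payloads. The ids, names, and system messages are hypothetical; the real contents of grader_initialize.json are not reproduced in this README.

```python
import json


def make_node(node_id, name, description, system_message, feedback_enabled=False):
    """Build one node entry using the fields from the sample initialization payload."""
    return {
        "id": node_id,
        "name": name,
        "description": description,
        "system_message": system_message,
        "feedback_enabled": feedback_enabled,
    }


# Hypothetical ids and prompts for the three agents described above.
nodes = [
    make_node("analyzer", "problem_analyzer",
              "Breaks the problem into evaluation criteria",
              "You turn a problem statement into a list of grading criteria."),
    make_node("reviewer", "code_reviewer",
              "Reviews student code against the criteria",
              "You review the submitted code against each criterion."),
    make_node("grader", "grader",
              "Assigns a final grade with justification",
              "You assign a grade and justify it from the analysis and review."),
]

initialize = {"action": "initialize", "nodes": nodes}
execution_flow = [node["id"] for node in nodes]  # strictly sequential pipeline

print(json.dumps(initialize, indent=2))
```

The execution_flow list here is flat because each agent depends on the output of the one before it.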

Extending

To grade different types of programming assignments:

- Modify the problem statement in grader_execute.json
- Update the student code submission in the same file
- Run the grading process again

You can also customize the agents by modifying their system messages in grader_initialize.json.
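For a new assignment, an execution payload could be generated instead of edited by hand. The sketch below reuses the action, user_question, and execution_flow fields from the sample execution payload and assumes that the problem statement and the student code are both packed into user_question. The actual layout of grader_execute.json may differ, and build_execute_payload is a hypothetical helper.

```python
import json


def build_execute_payload(problem, student_code, execution_flow):
    """Combine a problem statement and a submission into one execute payload."""
    question = (
        "Problem statement:\n" + problem.strip()
        + "\n\nStudent submission:\n" + student_code.strip()
    )
    return {
        "action": "execute",
        "user_question": question,
        "execution_flow": list(execution_flow),
    }


payload = build_execute_payload(
    "Write a function that returns the sum of a list of integers.",
    "def total(values):\n    return sum(values)\n",
    ["analyzer", "reviewer", "grader"],  # hypothetical agent ids
)
print(json.dumps(payload, indent=2))
```

The ids in execution_flow must match the ids used in the initialization payload sent beforehand.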
