PyJudge is an automatic grading tools that takes a reference implementation and a student implementation then finds an input that causes a different output
This code used and is a modified version of PyExZ3 that is a Dynamic Symbolic Execution Engine for Python
This program is the deliverable of my final project
- Make sure you have Python 3.x installed
- Install Z3 here https://github.com/Z3Prover/z3
- For MacOS, open
setup.sh
and change the path according to your local machine then run:
. pyjudge/setup.sh
- try grading something
python grade.py test/max_3.py test/max_3_1.py
it should return something like this and saved the result to res
folder
Reference: max_3.max_3
Grading: max_3_1.max_3_1
======
RESULT
======
tested:
{(('a', 0), ('b', 0), ('c', 0)): (0, 0), (('a', 0), ('b', 0), ('c', 1)): (1, 1), (('a', 0), ('b', 2), ('c', 0)): (2, 2), (('a', -1), ('b', 0), ('c', 0)): (0, 0), (('a', 0), ('b', 0), ('c', -1)): (0, -1), (('a', 1), ('b', 2), ('c', 3)): (3, 3), (('a', 1), ('b', 0), ('c', 2)): (2, 2), (('a', 0), ('b', -1), ('c', 0)): (0, 0), (('a', 2), ('b', 0), ('c', 0)): (2, 2), (('a', 4), ('b', 5), ('c', 0)): (5, 5), (('a', 2), ('b', 0), ('c', 8)): (8, 8), (('a', 0), ('b', 1), ('c', 2)): (2, 2)}
tested from path dev or path eq:
{(('a', -1), ('b', 0), ('c', 0)): (0, 0), (('a', 0), ('b', 0), ('c', -1)): (0, -1), (('a', 0), ('b', -1), ('c', 0)): (0, 0)}
wrong:
{(('a', 0), ('b', 0), ('c', -1)): (0, -1)}
wrong from path dev or path eq:
{(('a', 0), ('b', 0), ('c', -1)): (0, -1)}
grade:
91.66666666666666%
python grade.py <reference_implementation>.py <student_implementation>.py
One of the goal of exploring this approach is to see if it can cover edge cases where random input generation can't. To see if it does that on a particular problem, try running it with random input generation and compare the result.
python random_grade.py <reference_implementation>.py <student_implementation>.py
- implementation must be in the form of function with return statement
- function have integer as input