"""
Gradelang

Thomas Howard III
Mary Wishart
Brittany Lewis

Introduction:

Grade is an autograding framework for Python that offers more flexibility than traditional Python TestCases
by allowing arbitrarily complex testing conditions. Gradelang is a domain-specific language for the Grade framework.
It allows graders to create question structures to evaluate student executables.

We hope Gradelang will make it easier for teachers and graders to evaluate student code quickly, easily,
and fairly. We also hope that the ease and flexibility of testing will create more space for creativity in assignment
design.


Implementation:


Gradelang is broken up into block structures. There are four main types of blocks:

- Setup: Statements in the setup block run before every question. Setup will most likely involve statements that
         create necessary test files or check requirements for questions.
- Teardown: Statements in the teardown block run after every question. This could include operations such as cleaning
         up created files by deleting them.
- Question: Each question block runs independently of every other question and lets the grader specify tests
            and the award values associated with them.
- Output: This block determines how the results are reported, for example as JSON or Markdown.
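
The ordering described in the list above can be sketched in plain Python. This is only an illustration and not
Gradelang's interpreter: run_questions, the callables, and the events log are hypothetical names, and treating the
award as all-or-nothing when a setup or question assertion fails is an assumption.

```python
def run_questions(setup, questions, teardown):
    # Hypothetical driver: setup, question body, then teardown, for every question.
    results = {}
    for name, (body, award) in questions.items():
        earned = 0
        try:
            setup()
            body()
            earned = award
        except AssertionError:
            earned = 0  # assumption: a failed assert forfeits the whole award
        finally:
            teardown()  # teardown runs even when the question fails
        results[name] = (earned, award)
    return results


events = []


def failing_body():
    assert 1 == 0


questions = {
    "passing": (lambda: events.append("question"), 10),
    "failing": (failing_body, 10),
}
results = run_questions(
    setup=lambda: events.append("setup"),
    questions=questions,
    teardown=lambda: events.append("teardown"),
)
for name, (earned, possible) in results.items():
    print(f"Question {name}: {earned}/{possible}.")
```

The events list records that setup and teardown wrap both questions, including the one whose assertion fails.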

Within those blocks we support common arithmetic expressions as well as statements for file creation,
variable assignment, and generation of arbitrary string, float, and int values through the Hypothesis framework. Most importantly,
programs can be run and their results checked through the run and assert commands, which are supported through the Grade pipeline.
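
For readers unfamiliar with the run and assert commands, the Python sketch below shows roughly what a question that
runs a program, checks its stdout, and checks its exit status amounts to. It uses the standard subprocess module
directly rather than Grade, whose API is not shown in this document, and it launches the Python interpreter instead
of echo so that it behaves the same on every platform.

```python
import subprocess
import sys

# Stand-in for: run "echo hello world";
completed = subprocess.run(
    [sys.executable, "-c", "print('hello world')"],
    capture_output=True,
    text=True,
)

# Stand-in for: assert "hello world" in stdout;
assert "hello world" in completed.stdout

# Stand-in for: assert exit successful;
assert completed.returncode == 0
```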


Challenges:

One challenge we faced was making each question truly independent. To ensure that a test failure in one question
cannot affect any later tests (an important requirement for autograding), we run each question in a separate subprocess.
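
The isolation idea can be sketched with the standard library. This is not the Gradelang implementation: run_isolated
is a hypothetical helper, and each question body here is simply a string of Python source executed in its own
interpreter process.

```python
import subprocess
import sys


def run_isolated(source):
    # Run one question body in a fresh interpreter; True means it exited cleanly.
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return completed.returncode == 0


# Hypothetical question bodies: a hard exit, a failed assert, and a pass.
questions = {
    "crashes": "import sys; sys.exit(3)",
    "fails": "assert 1 == 0",
    "passes": "assert 1 == 1",
}
outcomes = {name: run_isolated(source) for name, source in questions.items()}
print(outcomes)
```

Because every body runs in its own process, the crash in the first question does not prevent the later ones from
running.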

This can cause issues on Python 3.7.3 and Python 3.7.4, both of which have a bug in their subprocess handling.

If you have issues with your subprocess run, please try updating your Python installation to Python 3.8. Further
instructions and a Docker image are also included. Please contact the project team with any problems you encounter.
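
A small guard like the one below could warn users before they hit the problem. It is a sketch only: is_affected and
AFFECTED_VERSIONS are hypothetical names, and the version list simply restates the two releases mentioned above.

```python
import sys

# The two releases named above; taken from this write-up, not from a bug tracker.
AFFECTED_VERSIONS = {(3, 7, 3), (3, 7, 4)}


def is_affected(version_info=sys.version_info):
    # Compare only (major, minor, micro) against the listed releases.
    return tuple(version_info[:3]) in AFFECTED_VERSIONS


if is_affected():
    print("Consider upgrading to Python 3.8 before running Gradelang.")
```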

We also had some difficulty getting Hypothesis to work with the structure we wanted
for our language.


Examples:

In order to run these examples you will need to have:

- Python version 3.8
- ply
    - pip install ply
- Grade 
    - pip install grade
    - https://github.com/thoward27/grade
- Hypothesis
    - pip install Hypothesis
    - https://hypothesis.readthedocs.io/en/latest/
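
To confirm that the packages listed above are importable before running the examples, a check along these lines may
help. The import names ply, grade, and hypothesis are assumed to match the pip package names, and missing_modules is
a hypothetical helper.

```python
import importlib.util

# Assumed import names for the packages listed above.
REQUIRED_MODULES = ["ply", "grade", "hypothesis"]


def missing_modules(names):
    # Return the names that cannot be located on the current import path.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
```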
    
In addition to the examples below, we have a full suite of Python unit tests in the test folder of our project.

"""
from gradelang.interpreter import interpret

# Basic setup tests
empty = "setup {}"
interpret(empty)

trivial_passing = "setup { assert 1 == 1; }"
interpret(trivial_passing)

trivial_failing = "setup { assert 1 == 0; }"
interpret(trivial_failing)


# Question tests

empty = "question {}"
interpret(empty)

trivial_passing = "question { assert 1 == 1; }"
interpret(trivial_passing)

trivial_failing = "question { assert 0 == 1; }"
interpret(trivial_failing)

testing_output = """
question {
    run "echo hello world";
    assert "hello world" in stdout;
}
"""
interpret(testing_output)

testing_exit_success = """
question {
    run "echo hello world";
    assert exit successful;
}
"""
interpret(testing_exit_success)

testing_create_string = """
    question "string" {
        let x be String(minlen=10, maxlen=100);
        run "echo", x;
        assert x in stdout;
        award 10;
    }
"""
interpret(testing_create_string)

interpret(testing_exit_success)

name_string = 'question "named" {}'
interpret(name_string)

name_int = 'question 1 {}'
interpret(name_int)

# Teardown tests
empty = "teardown {}"
interpret(empty)

trivial_passing = "teardown { assert 1 == 1; }"
interpret(trivial_passing)

trivial_failing = "teardown { assert 0 == 1; }"
interpret(trivial_failing)

# Output tests
empty = "output {}"
interpret(empty)

json = 'output { json; }'
interpret(json)

markdown = 'output { markdown; }'
interpret(markdown)

# Testing programs
empty = """
setup {}
question {}
teardown {}
output {}
"""
interpret(empty)

setup_failure = """
setup { assert 1 == 0; }
question { assert 1 == 1; }
teardown {}
output {}
"""
interpret(setup_failure)

proposal = """
        setup {
            touch "temp.txt";
            run "echo";
            assert exit successful;
        }

        teardown {
            remove "temp.txt";
        }

        output {
            json;
        }

        output {
        }

        question 1 {
            # Run the program, saving output.
            run "echo", "hello world";

            # Now let's run some checks.
            assert exit successful;

            # This checks both stdout and stderr
            assert "hello" in stdout;

            award 10;
        }

        question 2  {
            run "echo", "hello world";
            assert "goodbye" not in stdout;
            award 10;
            assert "hello" in stdout;
            assert "hello" not in stderr;
            award 10;
        }

        question 3 {
            let x be Float(minvalue=1);
            run "echo", x;

            # If we want to just look at stdout.
            assert x in stdout;
            
            String y = "fish";
            run "echo", y;
            assert "fish" in stdout;
            
            let z be String();
            run "echo", z;
            assert z in stdout;
            
            let camel be Int(min_value=6);
            run "echo", camel;
            assert camel in stdout;
            
            award 50;
        }
        """
interpret(proposal)



"""
Conclusions:

We successfully created a simple, easy-to-use language for working with the Grade autograding framework. We believe
it will make grading assignments more efficient. We also think our language will feel natural and be easy to learn
for anyone who has used an autograder in the past.

"""

Grade Results
Grade Results
Grade Results
Grade Results
Question 0: 0/0.
Grade Results
Question 0: 0/0.
Grade Results
Question 0: 0/0.
Exception thrown: AssertionError()
Traceback (most recent call last):
  File "D:\gradelang\gradelang\interpreter.py", line 49, in worker
    walk(question.body)
  File "D:\gradelang\gradelang\walk.py", line 194, in walk
    return dispatch[action](ast)
  File "D:\gradelang\gradelang\walk.py", line 108, in <lambda>
    'seq': lambda ast: (walk(ast[1]), walk(ast[2])),
  File "D:\gradelang\gradelang\walk.py", line 194, in walk
    return dispatch[action](ast)
  File "D:\gradelang\gradelang\walk.py", line 114, in <lambda>
    'assert': lambda ast: _assert(ast[1]),
  File "D:\gradelang\gradelang\walk.py", line 102, in _assert
    raise AssertionError
AssertionError

Grade Results
Question 0: 0/0.
Grade Results
Question 0: 0/0.
Grade Results
Question string: 0/10.
Exception thrown: ValueError("Unknown node: ('paramlist', ('paramassign', 'minlen', ('integer', 

SyntaxError: LexToken(ID,'touch',142,29) (<string>)