Partial Grading #152

Merged: 12 commits merged into FOSSEE:master on Nov 17, 2016

Conversation

@ankitjavalkar (Contributor)

Add partial grading to evaluation

@prathamesh920 (Contributor) left a comment

I have a few comments.
Also, we should add test cases for the partial-grade marks returned from the evaluator.

@@ -27,7 +28,7 @@ def teardown(self):
         delete_files(self.files)
         super(BashCodeEvaluator, self).teardown()

-    def check_code(self, user_answer, file_paths, test_case):
+    def check_code(self, user_answer, file_paths, partial_grading, test_case, marks):
Contributor:

check_code now returns 3 values, so it would be better to update the Returns description in the method docstring accordingly.

Contributor (Author):

I will add further test cases and validations in a subsequent PR. I was hoping this could be merged so that further evaluator-based PRs can be designed accordingly.

Member:

The docstring can be updated now though.

@@ -1104,9 +1115,11 @@ class TestCase(models.Model):

 class StandardTestCase(TestCase):
     test_case = models.TextField(blank=True)
+    marks = models.FloatField(default=0.0)
Contributor:

So each test case will have marks!
Instead, I think a single constant fraction value for each test case is a better option.
Alternatively, we can sum over the test cases and use the ratio of passed to total.

Member:

Agree with @prathamesh920!

user_answer.correct = correct
user_answer.error = result.get('error')
if correct:
Contributor:

If we have a fraction (value 0 to 1) returned from the code server, then we do not need this if block; we can simply write
user_answer.marks = question.points * result['marks']
If wrong, the fraction is 0; if right, it is 1; otherwise it is the partial fraction of marks.
So there is no need to check correct or wrong for partial grading when adding marks.

Contributor (Author):

The if condition will have to stay.
When a question has no partial grading, the evaluator always returns 0.0.
So a question answered correctly but not partially graded would end up being marked 0.0 (unless success=True is checked).
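A minimal, self-contained sketch of the guard being described here; all names are illustrative stand-ins, not the project's models:

# Sketch: why the success check must stay. Without partial grading the
# evaluator reports a weightage of 0.0, so a correct answer must be
# handled on its own branch.
def assign_marks(points, correct, partial_grading, weightage, total_weight):
    if correct:
        return points                      # full marks on success
    if partial_grading and total_weight:
        return points * weightage / total_weight
    return 0.0                             # no partial grading: all or nothing

assert assign_marks(10, True, False, 0.0, 3.0) == 10   # correct, no partial grading
assert assign_marks(10, False, True, 2.0, 4.0) == 5.0  # half the weight passed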

Member:

I agree with @prathamesh920 that each test case having marks is non-optimal. I think I have a nicer way to specify this: treat the "marks" you have above as a weight for each test; the total mark is then sum(obtained weights) / sum(all weights) * marks. By default each test case has a weight of 1. This way is nicer, I think, and much easier to implement. A teacher may even change the total marks without having to worry about the individual weights.
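A runnable sketch of this weight scheme (function and parameter names are illustrative, not the project's API):

# Sketch: each test case defaults to weight 1; the total mark scales the
# question's marks by the fraction of weight that passed.
def total_mark(marks, passed_weights, all_weights):
    return marks * sum(passed_weights) / sum(all_weights)

# Example: 3 default-weight test cases, 2 pass, question worth 9 marks.
assert total_mark(9, [1, 1], [1, 1, 1]) == 6.0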

Contributor (Author):

Done

@prabhuramachandran (Member) left a comment

Apart from the comments made, I am OK with this being a two-stage PR; however, it will be good to resolve the current issues.


@ankitjavalkar (Contributor, Author)

  • Docstrings added.
  • Test cases now have point-based weightage.

@prathamesh920 (Contributor) left a comment

There are a few comments; kindly resolve and test them.
I also feel there should be test cases for partial grading, though these can be added in stage 2 while optimizing.

@@ -27,7 +28,7 @@ def teardown(self):
         delete_files(self.files)
         super(BashCodeEvaluator, self).teardown()

-    def check_code(self, user_answer, file_paths, test_case):
+    def check_code(self, user_answer, file_paths, partial_grading, test_case, weightage):
Contributor:

Docstring not modified; can be done in the next stage.

Member:

Instead of weightage, perhaps use weight.

Contributor (Author):

Fixed

@@ -517,11 +517,16 @@ def check(request, q_id, attempt_num=None, questionpaper_id=None):
             if question.type == 'code' else None
         correct, result = paper.validate_answer(user_answer, question, json_data)
         if correct:
+            new_answer.marks = (question.points * result['weightage'] /
Contributor:

In code_evaluator, line number 92 is the following:
result = {'success': success, 'error': error, 'marks': marks}
so here result['weightage'] will throw an error!
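For illustration, the mismatch reproduced in isolation:

# The dict built in code_evaluator uses the key 'marks', so reading
# 'weightage' raises KeyError.
result = {'success': True, 'error': '', 'marks': 1.0}
try:
    result['weightage']
except KeyError:
    print("KeyError: 'weightage'")  # the error the reviewer predicts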

Contributor (Author):

I had already fixed this in a minor commit, but it did not get pushed. I apologise. This has been fixed.

        new_answer.error = result.get('error')
    else:
        new_answer.error = result.get('error')
        new_answer.marks = (question.points * result['weightage'] /
                            question.get_maximum_test_case_weightage()) \
            if question.partial_grading and question.type == 'code' else 0
    new_answer.save()
    paper.update_marks('inprogress')
    paper.set_end_time(timezone.now())
Contributor:

Below, there is a check "if not result.get('success'):"; in the else branch, the current question is added to the completed questions and the next question is shown.
So if all my test cases fail except the last one, the question will still be added to the completed questions!
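A hedged sketch of the guard being asked for; the function and the completed set are illustrative stand-ins for the view's objects:

# Sketch: only mark the question completed when the evaluator reports
# overall success, so a partially passing answer can still be retried.
def advance(result, question, completed):
    if result.get('success'):
        completed.add(question)   # fully correct: move on
        return True
    return False                  # partial credit only: stay on the question

completed = set()
assert advance({'success': False, 'weightage': 2.0}, 'q1', completed) is False
assert 'q1' not in completed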

Contributor (Author):

This has been fixed now.

        weightage += test_case_weightage
        error = err
    else:
        error += err + "\n"
Contributor:

So we are appending err for all test cases. If the first test case passes and the next one fails, the error message will be "Correct Answer" plus the assertion error.

This can be covered in your next PR on error message formatting.

@prabhuramachandran (Member), Nov 13, 2016:

So maybe error should be a list?
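A minimal sketch of the list-based alternative; the check function and names are illustrative:

# Sketch: collect per-test-case errors in a list instead of concatenating
# strings, and join (or render) them only at display time.
def run_test_cases(test_cases, check):
    weight, errors = 0.0, []
    for tc in test_cases:
        success, err, tc_weight = check(tc)
        if success:
            weight += tc_weight
        else:
            errors.append(err)
    return weight, errors

results = {'t1': (True, '', 1.0), 't2': (False, 'AssertionError', 1.0)}
weight, errors = run_test_cases(['t1', 't2'], results.get)
assert weight == 1.0 and errors == ['AssertionError']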

    else:
        err = ("Error:expected"
               " {0}, got {1}").format(inst_stdout+inst_stderr,
                                       stdnt_stdout+stdnt_stderr)
-        return False, err
+        return False, err, test_case_weightage
Member:

Why is this returning the weightage?

Contributor (Author):

I initialised test_case_weightage as 0.0 earlier, hence the variable is returned as is.

However, I agree that this may be a bit unclear, so I am fixing this to return a constant 0.0.

-    result = json.dumps({'success': False, 'error': 'Unable to connect to any code servers!'})
+    result = json.dumps({'success': False,
+                         'weightage': 0.0,
+                         'error': 'Unable to connect to any code servers!'})
Member:

Please describe what the return values are in the docstring very clearly. This looks like the sum of the weights that the user code passed, correct?

Contributor (Author):

Yes, that is correct. Fixed the docstrings accordingly.
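For reference, a sketch of how the return contract might be documented; the exact wording is illustrative:

def check_code(self, user_answer, file_paths, partial_grading, test_case, weight):
    """Check the user answer against one test case.

    Returns
    -------
    success : bool
        True if the user code's output matches the reference output.
    error : str
        The error message when the check fails, empty otherwise.
    weight : float
        The test case's weight if it passed, 0.0 otherwise; the caller
        sums these to compute partial marks.
    """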

@ankitjavalkar (Contributor, Author) commented Nov 16, 2016

I have changed the error display in the same PR (because both were interdependent).

Please check the Output Display Interface:

[Screenshot: output display interface, 2016-11-16]

@ankitjavalkar (Contributor, Author)

  • Added a test case to verify the situation pointed out by @prathamesh920.

test/have dissimilar output, when compared to the instructor script.

-    Returns (False, error_msg): If mandatory arguments are not files or if
+    Returns (False, error_msg, 0.0): If mandatory arguments are not files or if
Member:

In this case, should this just raise a custom exception?

    if file_paths:
        self.files = copy_files(file_paths)
    if not isfile(clean_ref_code_path):
        msg = "No file at %s or Incorrect path" % clean_ref_code_path
-        return False, msg
+        return False, msg, 0.0
Member:

All these should perhaps raise an exception, no?
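A sketch of the exception-based alternative; the exception name is invented for illustration:

from os.path import isfile

class EvaluatorSetupError(Exception):
    """Raised when a mandatory evaluator input is missing (illustrative)."""

def require_file(path):
    # Raise instead of returning (False, msg, 0.0) tuples for setup failures.
    if not isfile(path):
        raise EvaluatorSetupError("No file at %s or incorrect path" % path)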

    if test_case_success:
        weight += test_case_weight

    error += err + "\n"
Member:

This is dirty; can't we collect a list of errors?

@@ -47,11 +68,13 @@ def check_code(self, user_answer, file_paths, test_case):
     info = traceback.extract_tb(tb)
     fname, lineno, func, text = info[-1]
     text = str(test_case).splitlines()[lineno-1]
-    err = "{0} {1} in: {2}".format(type.__name__, str(value), text)
+    err = ("-----\nExpected Test Case:\n{0}\n"
Member:

Looks ugly; can't it be some HTML, or can this be delegated to something else?

@prabhuramachandran (Member)

Merging for now.

@prabhuramachandran prabhuramachandran merged commit aa6ed71 into FOSSEE:master Nov 17, 2016
@ankitjavalkar ankitjavalkar deleted the partial branch January 16, 2017 09:41