
Randomising Values in questions and tests as part of assignment process #1153

Open
psychemedia opened this issue Jun 11, 2019 · 13 comments

@psychemedia

psychemedia commented Jun 11, 2019

Is there a way, or would it be useful to provide one, of incorporating randomised elements into a question?

For example, I might set a simple task:

Load the file X into a pandas dataframe and preview the first {{import random; N = random.choice([random.randint(6, 10), random.randint(11, 16)])}} rows of it.

and then in the next cell test on:

assert_equal(_, pd.read_csv(X).head({{N}}))

When the assignment is created, execute the {{...}} code as part of generating the assigned notebook....hmmm... once. per. student. That could get expensive, couldn't it?!

Okay, so maybe not for each student. But if you're in the habit of recycling questions one year to the next, with the occasional parameter change, then that could work?!
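The substitution step described above could be sketched roughly as follows. This is a toy renderer, not an existing nbgrader feature: `render_cell` and the shared `env` namespace are hypothetical names, and splitting snippets on `;` is a deliberate simplification.

```python
import re

def render_cell(source, env):
    """Evaluate each {{...}} snippet in a cell's source and substitute its
    value. All snippets share one namespace (env), so a later cell's {{N}}
    can reuse a name bound by an earlier cell's snippet."""

    def substitute(match):
        code = match.group(1)
        # Naive split: run all but the last ';'-separated segment as
        # statements (e.g. "import random"), then substitute the value of
        # the final segment.
        *stmts, last = code.split(";")
        for stmt in stmts:
            exec(stmt.strip(), env)
        last = last.strip()
        try:
            value = eval(last, env)
        except SyntaxError:
            # The last segment was itself a statement such as "N = ...";
            # execute it and substitute the assigned value.
            exec(last, env)
            value = env[last.split("=")[0].strip()]
        return str(value)

    return re.sub(r"\{\{(.*?)\}\}", substitute, source)

env = {}
question = ("Preview the first {{import random; "
            "N = random.choice([random.randint(6, 10), random.randint(11, 16)])}} rows.")
test = "assert_equal(_, pd.read_csv(X).head({{N}}))"
rendered_q = render_cell(question, env)
rendered_t = render_cell(test, env)
```

Running this once per assigned notebook would bake a concrete value of `N` into both the question text and the matching test cell.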

@jhamrick
Member

> When the assignment is created, execute the {{...}} code as part of generating the assigned notebook....hmmm... once. per. student. That could get expensive, couldn't it?!

Hmm, yeah, good point. It would definitely require some infrastructure changes to do this (or even just have 2 or 3 different versions that get released to different students) so it would be worthwhile to think about exactly how to do this. Maybe something like:

  • nbgrader assign generates N versions of the assignment
  • nbgrader release copies all N versions to the exchange
  • nbgrader fetch will then randomly grab one version
  • the preprocessors used by nbgrader autograde must then also be aware of the different versions when overwriting cell contents from the database.

@jhamrick jhamrick added this to the Wishlist milestone Jun 16, 2019
@psychemedia
Author

If N different versions of the assignment are created and each student is then allocated one of them, it would also be useful to record exactly which variant each student received, in case something goes wrong with one of the variants...

@nthiery
Contributor

nthiery commented Sep 18, 2019

Also, if the same student fetches several times, they should presumably always get the same variant. Otherwise, fetching repeatedly would let a student choose their variant (e.g. the same one as a friend's).

Potential solution: choose the variant based on a hash of the student id and the assignment id. Then that can be recovered at grading time.
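This hash-based selection could look roughly like the sketch below (hypothetical function name; it assumes N pre-generated variants). Note that a cryptographic hash is used deliberately: Python's built-in `hash()` is salted per process, so it would give different answers at fetch time and at grading time.

```python
import hashlib

def pick_variant(student_id: str, assignment_id: str, n_variants: int) -> int:
    """Deterministically map (student, assignment) to a variant index.

    SHA-256 is stable across processes and machines, so the same index
    can be recomputed at grading time from the same two identifiers.
    """
    digest = hashlib.sha256(f"{student_id}:{assignment_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_variants

# Repeated fetches by the same student always yield the same variant:
variant = pick_variant("alice", "ps1", 5)
```

Because the mapping is a pure function of the two ids, no extra bookkeeping table is strictly required, although recording the assigned variant (as suggested above) would still help with debugging.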

@nthiery
Contributor

nthiery commented Sep 18, 2019

For the randomization itself, I have syntax and code from another project where we do code randomization. Here is an example, in C++:

```cpp
#include <iostream>
#include <vector>
#include "randomization.h"
using namespace std;

CONST I = RANDOM_INT(1, 3);
CONST VAL = RANDOM_CHOICE("alice", "bob", "charlie");
CONST TAB = RANDOM_VECTOR(RANDOM_INT(4, 7), RANDOM_INT, 0, 10);
CONST TAB2D = RANDOM_VECTOR(4, RANDOM_VECTOR, 5, RANDOM_INT, 0, 10);

int main() {
    auto i = I;
    auto tab = TAB;
    auto tab2D = TAB2D;
    cout << i << endl;
    for (auto value : tab)
        cout << value << " ";
    cout << endl;
    return 0;
}
```

In the instructor version, the randomization is taken care of by the language itself (RANDOM_INT, ... are defined as functions in randomization.h). In the student version, the RANDOM instructions get evaluated and replaced by the analogue of nbgrader assign.

Here were the design goals:

  • the instructor notebook should remain usable as-is; re-executing it a couple of times checks several (alas, not all) variants
  • the syntax remains similar to the usual syntax of the language (to reduce the learning curve for authors), but with a visual clue about what gets evaluated at assignment time versus at execution time.
  • the syntax should be implementable (up to minor variations) across languages, so that only minor configuration is required in nbgrader itself. Of course, the implementation of the randomization needs to be redone in every language.
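To illustrate the cross-language point, the same dual behaviour could be sketched in Python. In the instructor notebook the RANDOM_* names would just be ordinary functions (the analogue of randomization.h), drawing fresh values on each execution; the assignment-time filter would then evaluate and inline them in the student copy. These definitions are a hypothetical sketch, not part of nbgrader or cpp-info111:

```python
import random

# Instructor-side definitions: the notebook runs as-is, drawing fresh
# random values on every execution.
def RANDOM_INT(lo, hi):
    return random.randint(lo, hi)

def RANDOM_CHOICE(*options):
    return random.choice(options)

def RANDOM_VECTOR(length, generator, *args):
    # Build a list by calling the given generator once per element,
    # mirroring the C++ RANDOM_VECTOR above (it nests naturally for 2D).
    return [generator(*args) for _ in range(length)]

I = RANDOM_INT(1, 3)
VAL = RANDOM_CHOICE("alice", "bob", "charlie")
TAB = RANDOM_VECTOR(RANDOM_INT(4, 7), RANDOM_INT, 0, 10)
TAB2D = RANDOM_VECTOR(4, RANDOM_VECTOR, 5, RANDOM_INT, 0, 10)
```

Re-running the instructor notebook exercises different variants, exactly as with the C++ version.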

Happy to elaborate. Maybe we can use the same code base for the two projects?

https://github.com/PremierLangage/cpp-info111/blob/028854edf3b53b9d6af53ee9cbbc269b0cd026ef/template/builder.py#L67-L196

Fun fact is that the design of our instructor version ->student version transformation was largely inspired by nbgrader :-)

@perllaghu
Contributor

To clarify... we're looking at something akin to:

Cell 1:
Write a method to calculate the circumference of a circle when the radius is <insert random value>

Cell 2:
```python
def calc_circ():
    # your code here
```

Cell 3:
```python
assert calc_circ() == <calculated value>
```

--- yes?

@nthiery
Contributor

nthiery commented Sep 18, 2019 via email

@omrt9

omrt9 commented Jul 7, 2020

Hi, we are implementing nbgrader for multiple courses at our university and were hoping to randomize values in the questions in assignments. I was wondering if any progress was made in this direction. Thanks in anticipation!

@nthiery
Contributor

nthiery commented Jul 9, 2020

Still on my todo list, but don't hold your breath. I was hoping to get help from an engineer over the summer/fall, but it seems it won't materialize, so progress will come later rather than sooner.

So you may want to proceed. It should boil down to writing a new nbgrader filter, very similar to those that handle the "BEGIN XXX" markers, but filtering the lines through the randomizing function in cpp-info111. Feel free to lift whatever chunks of code from there are relevant.

@omrt9

omrt9 commented Jul 9, 2020

@nthiery Thank you for your prompt response and for sharing the code snippets to get me started. I will post an update here when I make some progress in this direction!

@omrt9

omrt9 commented Jul 17, 2020

Having studied the code snippets and the ideas suggested by @nthiery, it seems to me that it will only work for randomization in test cases. I was hoping to randomize the values in the questions such that when students fetch the same assignment again, they receive the same version of the assignment.

For example, consider the following question:
Calculate the area of the rectangle with sides l = (some random integer) and b = (some random integer),
where the students must write the answer (a numerical value) as the solution. Furthermore, the instructor must be able to auto-grade these questions based on the generic formula (l x b). So one student might receive the values l = 7 and b = 8, whereas another might receive l = 1 and b = 10, and all should be auto-graded using l x b.

I realize that this might require major infrastructural changes in nbgrader, as currently only one version is broadcast to all the students (the "outbound" in the exchange directory has only one file per assignment for all the students), whereas randomized assignments would require nbgrader to broadcast multiple versions to the students and keep track of which version each student received.

I was wondering if something similar to this is being planned for the future releases of nbgrader? @jhamrick @minrk

Thank you,
Om.

@nthiery
Contributor

nthiery commented Jul 18, 2020

Hi @omrt9,

Oh right, I forgot to mention that. Something indeed needs to be done on the infrastructure side. For whatever they are worth, here are some preliminary ideas I had:

  1. Customize nbgrader fetch to trigger the randomization filter, using a hash of the student id (name/...) to initialize the random seed.
  2. Customize nbgrader autograde (which uses both the student and instructor version), to run the randomization filter on a copy of the instructor version and use the result to grade the student version.

I believe this should be enough to resolve the problems you mention. But I don't quite trust any idea that's not been implemented :-)
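The two steps above could be sketched as follows. The `randomize` filter and its `RANDOM_INT(lo, hi)` marker syntax are hypothetical stand-ins for the cpp-info111 machinery; the key point is that fetch and autograde each derive the same seeded RNG independently, so no state needs to be shared between them.

```python
import hashlib
import random
import re

def seeded_rng(student_id: str, assignment_id: str) -> random.Random:
    """Build a per-(student, assignment) RNG with a stable seed, so the
    fetch step and the autograde step draw identical values."""
    seed = int.from_bytes(
        hashlib.sha256(f"{student_id}:{assignment_id}".encode()).digest()[:8],
        "big",
    )
    return random.Random(seed)

def randomize(source: str, rng: random.Random) -> str:
    """Hypothetical filter: replace each RANDOM_INT(lo, hi) marker with a
    concrete value drawn from the rng, in source order."""
    return re.sub(
        r"RANDOM_INT\((\d+),\s*(\d+)\)",
        lambda m: str(rng.randint(int(m.group(1)), int(m.group(2)))),
        source,
    )

# Fetch (step 1) and autograde (step 2) reproduce the same substitutions
# independently, because each starts from the same seed:
template = "l = RANDOM_INT(1, 9); b = RANDOM_INT(1, 9)"
fetched = randomize(template, seeded_rng("alice", "ps1"))
graded = randomize(template, seeded_rng("alice", "ps1"))
assert fetched == graded
```

The rectangle question above fits this pattern directly: the instructor's hidden test computes l x b from the same substituted values the student received.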

@nthiery
Contributor

nthiery commented Jul 18, 2020

The above makes two assumptions:

  • the random generator must be identical on the student device and on the autograding device
  • all the lines that define the random substitutions should occur in the same order in both the instructor and student versions, i.e. they should not appear in a SOLUTION block. That being said, I don't see a reason why anyone would want to do that.

@kno10
Contributor

kno10 commented Jan 27, 2021

I do prefer the simpler version mentioned by @jhamrick, where there are simply N assignments and each student gets one of them.
This would largely boil down to a function that allows a student access to an assignment only if hash(userid + seed) % N == x.
So there would still be a separate instructor version for each, which allows for more complicated variants.

But it may be desirable to hide the chosen version ID from the student; otherwise, they will know they need to find someone with the same variant number. This would mean some renaming when downloading and uploading (mapping the user-visible assignment name "asg42" to the grader assignment "asg42-variantX").
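The selection-plus-renaming scheme could look roughly like this (hypothetical helper names; the `seed` would be a course-level secret so students cannot recompute their own variant index, and SHA-256 is used because Python's built-in hash() is salted per process):

```python
import hashlib

def variant_index(user_id: str, seed: str, n_variants: int) -> int:
    """hash(userid + seed) % N, with a process-stable hash."""
    digest = hashlib.sha256((user_id + seed).encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_variants

def grader_name(user_id: str, assignment: str, seed: str, n_variants: int) -> str:
    """Map the user-visible assignment name to the grader-side variant name.

    The student only ever sees "asg42"; the exchange renames it to
    "asg42-variantX" on upload so grading uses the right instructor copy.
    """
    x = variant_index(user_id, assignment + seed, n_variants)
    return f"{assignment}-variant{x}"

name = grader_name("alice", "asg42", "course-secret", 3)
```

The rename would happen inside the exchange on both fetch and submit, so neither the notebook filename nor its contents need to reveal X to the student.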
