Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider the length limits of the three primary key fields in the Past Answer table #1069

Open
taniwallach opened this issue Jan 16, 2020 · 1 comment

Comments

@taniwallach
Copy link
Member

See: http://webwork.maa.org/moodle/mod/forum/discuss.php?d=4706#p13984

The change to 4 byte UTF-8 in the tables restricts the primary key fields to about 1000 bytes, and after some overhead needed, only about 240 characters. The change to lib/WeBWorK/DB/Record/PastAnswer.pm changed the limit of 3 fields from 100 characters to 80 characters each.

From the output of
git diff b40bbff11f16bccd9471459c989dcd4602773768 lib/WeBWorK/DB/Record/PastAnswer.pm

- course_id => { type=>"VARCHAR(100) NOT NULL", key=>1},
- user_id => { type=>"VARCHAR(100) NOT NULL", key=>1},
- set_id => { type=>"VARCHAR(100) NOT NULL", key=>1},
+ course_id => { type=>"VARCHAR(80) NOT NULL", key=>1},
+ user_id => { type=>"VARCHAR(80) NOT NULL", key=>1},
+ set_id => { type=>"VARCHAR(80) NOT NULL", key=>1},

It is not clear to me that in the current SQL setup, where there is a past_answer table per course we even really need course_id as a field in this table. Removing it would free up lots of "space" for the set_id field.

user_id is somewhat more dangerous to limit to strictly, as LTI and LMS systems may be setting it using an email address or hask-key ID they use.

Even if course_id remains in the table, we could somewhat mitigate the issue if we set a short size limit on course_id (and maybe also user_id).

Another approach would be to consider replacing the use of set_id as a string (in ALL the relevent tables), and using an integer index number instead, and having a table of index numbers for set names (maybe also for user_id's). Such a change would make the database data less readable to humans, but probably more efficient in terms of speed and storage space.

@drgrice1
Copy link
Sponsor Member

I agree that there is no reason that the course_id field needs to be in the past_answer table as each course has its own past answer_table.

I kind of like having the set id as a string in the tables for human readability, but if we need to free up space and switch to an index with an additional table pairing the indices with set names then I would not be opposed. Although there may be backward compatibility issues with this kind of change to the database. We would also need to implement mechanisms for handling importing courses with the old structure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants