Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable large course enrolment #10964

Closed
moziliar opened this issue Feb 15, 2021 · 2 comments
Closed

Enable large course enrolment #10964

moziliar opened this issue Feb 15, 2021 · 2 comments
Assignees
Labels
a-Design High-level design or architecture a-UIX User Interface, User eXperience, responsiveness c.Feature User-facing feature; can be new feature or enhancement to existing feature committers only Difficult; better left for committers or more senior developers p.Medium Marginal impact; would like to do if time permits

Comments

@moziliar
Copy link
Contributor

moziliar commented Feb 15, 2021

Description of feature/enhancement

Allow course instructor to create course with > 100 students.

Justification

Currently, TEAMMATES course creation only accepts 100 students per enrolment. This causes inconvenience to course instructor with > 100 students as it will require subsequent enrolments. Inconsistency in student data might even be introduced.

Considerations

Direction

  • The current bottleneck is updateStudentCascade, which is run at per-student granularity.
  • Due to the above, the two possible directions would be
    • Batch DB save
    • Improve the cascading update logic efficiency

Correctness

  • In this change, we want to at least maintain the correctness level of the enrollment action workflow; that is, we improve the enrollment performance without compromising the correctness
    • If an erroneous state could occur, it still could; otherwise, it should not happen after the change, e.g. a student updated from team A to team B still has dangling responses in the old team
    • From the current look of the cascading logic, there's no such protection on referential integrity as the backend code maintaining it could fail, whereby a student could enter an intermediate state where some of his/her related entities are updated but not the rest; no rollback logic is present
  • Batching DB calls per cascade level would likely retain such level of correctness, as what can happen before batching could still happen; need to investigate whether the likelihood is higher
  • Need to investigate whether there are other invalid state batching DB call could incur

Update Aggregation

Dependency

  • The student update seem to have interdependency in the following update options:
    1. Student email in response (as they can clash with other students)
    2. Section update (involving moving existing responses across sections)
    3. Team update (remove all responses as they are team-based)
  • From the validateSectionsAndTeams check, we have the following guarantees should it pass:
    • Section overflow would not happen
    • Students from the same team will not end up in different section
  • If we migrate the team check to the validation logic (which should be fast in view of the heavy IO-bound operations), we can reduce the dependency level between students with respect to team validity
  • For student email change in responses, a case that has tight dependency would be that two students are swapping emails; however, this rarely occur (swapping email after the students have submitted at least one response using the email)
  • Despite the dependency, it seems that should an update of a student fails at certain stage of the cascading logic, what has been changed about the students remain. This means that batching update to DB by cascade layer (e.g. update all the students in batch, then update all the responses from these students in batch and so on) would not lower the level of correctness.

Cost

  • If we try to parallelize CPU-bound workload, it will cause resource (CPU, memory) surge and very likely new instance creation, leading to more instance hours and more cost.
  • IO bound operations can be parallelized, e.g. blocking deletion (response deletion) with .now() as Datastore usage is not charged by its internal resource used, but # of entity R/W.
  • Therefore, if we ever want to parallelize anything, we should aim at storage layer unless we can afford the extra cost.

Workflow Analysis

In updateStudentCascade:

  1. getOriginalStudent
  2. updateStudent with UpdateOptions
  3. updateFeedbackResponsesForChangingEmail
    1. getFeedbackResponsesFromGiverForCourse
    2. for updateFeedbackResponseCascade
      1. getFeedbackResponse
      2. updateFeedbackResponse
      3. for updateFeedbackResponseComment
    3. updateFeedbackResponseCommentsEmails
    4. Repeat for getFeedbackResponsesForReceiverForCourse
  4. updateFeedbackResponsesForChangingTeam
    1. getFeedbackResponsesFromGiverForCourse
    2. for getFeedbackQuestion
      1. deleteFeedbackResponseCascade
    3. Repeat for getFeedbackResponsesForReceiverForCourse
    4. getStudentsForTeam
    5. deleteFeedbackResponsesInvolvedEntityOfCourseCascade if empty course
  5. updateFeedbackResponsesForChangingSection
    1. getFeedbackResponsesFromGiverForCourse
    2. for updateFeedbackResponse
    3. for updateFeedbackResponseCommentsForResponse
      1. getFeedbackResponse
      2. for updateFeedbackResponseComment
    4. Repeat for getFeedbackResponsesForReceiverForCourse

Among all the operations, for a single students:

  • getFeedbackResponsesFromGiverForCourse x3
  • getFeedbackResponsesForReceiverForCourse x3
  • getFeedbackResponse x4
  • updateFeedbackResponse x4
  • updateFeedbackResponseComment x4

Consider reducing the unnecessary calls to the Datastore

@moziliar moziliar added a-UIX User Interface, User eXperience, responsiveness p.Medium Marginal impact; would like to do if time permits committers only Difficult; better left for committers or more senior developers c.Feature User-facing feature; can be new feature or enhancement to existing feature a-Design High-level design or architecture labels Feb 15, 2021
@syrinemahmoud
Copy link

that sounds great !

madanalogy added a commit that referenced this issue Mar 30, 2021
* Paginate enroll calls

* Adjust code style

* Reformat html file for instructor enroll page

* Increase number of students per chunk

* Add mechanism to break long request smaller if timeout

* Refactor code and update test

* Add delay of 1s between paginated calls

* Refactor code

* Remove retry mechanism

* Paginate enroll calls

* Adjust code style

* Reformat html file for instructor enroll page

* Increase number of students per chunk

* Add mechanism to break long request smaller if timeout

* Refactor code and update test

* Add delay of 1s between paginated calls

* Refactor code

* Remove retry mechanism

* Add FE checks

* Run checks on 1D array

* Paginate enroll calls

* Adjust code style

* Reformat html file for instructor enroll page

* Increase number of students per chunk

* Add mechanism to break long request smaller if timeout

* Refactor code and update test

* Add delay of 1s between paginated calls

* Refactor code

* Remove retry mechanism

* Add FE checks

* Run checks on 1D array

* Simplify error messages and add mechanism to remove error highlights

* Refactor code

* Fix indentation

* Show enroll results when error happens

* Set error message after populating result panels

* Fix lint

Co-authored-by: Rahul Rajesh <rahul.rajesh.bhat@gmail.com>
Co-authored-by: Ahmed Bahajjaj <Ahmed_Bahajjaj@u.nus.edu>
@wkurniawan07
Copy link
Member

"Allow course instructor to create course with > 100 students." is already achieved. Any other kind of performance improvement should warrant a different issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a-Design High-level design or architecture a-UIX User Interface, User eXperience, responsiveness c.Feature User-facing feature; can be new feature or enhancement to existing feature committers only Difficult; better left for committers or more senior developers p.Medium Marginal impact; would like to do if time permits
Projects
None yet
Development

No branches or pull requests

5 participants