-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#10964] Paginate enroll calls for large courses (FE) #10968
[#10964] Paginate enroll calls for large courses (FE) #10968
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the code construct, it looks alright!
Some pointers I noted down in the comments.
Suggestion: add a post-hook
in your local repo (and put in .gitignore
) to lint your code whenever you commit (just put npm run lint
and ./gradlew lint
) should be fine? Since linter will be run now or later.
Would be good to include a gif on how this enrollment process looks in both UI and network call (devtool) perspectives. Will have to measure the memory pressure but should not be a major concern in this issue. Of course, can do some testing in staging server too!
@rrtheonlyone would appreciate some feedback on #10943 as it would definitely help make this PR's UX a lot smoother. In the interim, if no big issue, perhaps you could merge that branch into this branch and start working on integrating the error display while that PR is undergoing final review.
Edit: can also merge this first before doing the error message optimisation. Just a suggestion.
.../pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.html
Outdated
Show resolved
Hide resolved
.../pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.html
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
Extra thought on the linearly scaled waiting time (4-5min for 1000 students): how can we accelerate this? Or is it a concern at all? @damithc |
Not at the moment. It's fine as long as the user sees regular progress. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem of misalignment still persists. Please look at the 300+ LoC. This will most likely lead to lint failure anyways (not sure why the CI is not running for this PR to master)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now. But pending discussion on whether to increase the batch size and split on demand.
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question for discussion: if you use chunk in many places, should you already define an interface for it to make the code look cleaner? Not a suggestion but a food for thought.
Could you snip a gif showcasing how it would work? I think a chunk size of 200 would be good demo for timeout and split.
Keep up the good work!
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
6e07cc5
to
7fed62b
Compare
7fed62b
to
57c021f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive 💯! I like the code construct in the current phase :)
Can you simulate the error occasion and see how the request is broken down into two halves? I believe this can be done by adding some lines to make some chunks larger.
Also, concerning @rrtheonlyone 's worry about quick succession and eventual consistency of the datastore we use, should we pad some interval in between enrolments to achieve a best effort consistency preservation? On the other hand, I did have the thought of using FE to perform pre-flight check on the to-enrol students. That is for another time. Given the latency in local server, I suspect a 1s backoff can allow the datastore instance to reach quorum most of the time.
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
(resp: Students) => { | ||
if (resp.students.length !== 0) { | ||
this.loadExistingStudentsData(existingStudentsHOTInstance, resp.students); | ||
} else { | ||
// Shows a message if there are no existing students. Panel would not be expanded. | ||
this.isExistingStudentsPresent = false; | ||
this.isExistingStudentsPanelCollapsed = !this.isExistingStudentsPanelCollapsed; // Collapse the panel again | ||
} | ||
}, (resp: ErrorMessageOutput) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the change of indentation here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work put in here 👍
To me this is not the way forward for this... just because we got away with it in downloading session results, doesn't mean it always works.
Reasons why I feel this solution is too naive:
- the number you break up into (200?) is a magic number and just a hopeful toss at the moment (unless you have stress tested on this)
- the correction mechanism is adding unnecessary complexity to the code base (why split in half and try API again?)
- datastore is eventually consistent (we have no control over this and we can't wait it out and hope we get correct data) => this adds more problems since we may end up with wrong data in the db
My stance on this is that we should aim to optimize the logic on the backend - that is more firm and will stand the test of time. Of course, feel free to convince me otherwise.
There must be some way to come up with a new execution logic for validation + creation. From how we interact with the database and how we do the validation in the logic layer.
First step, I feel is to separate enrolling new students from updating old students (or be more clever about how we figure out if a student is already there). If that can be done, there are def ways we can optimize the validation and creation logic to easily handle <1000 students. We can do a projection query to retrieve the needed data / we can do a batch create for instance.
My intuition tells me that this batching with retry mechanism on the frontend like this is unnecessarily complex - we are operating in the scale of 100s (its not like there is millions of datapoints). However, I may be wrong - def warrants some discussion.
Undeniably, the current logic is too complex atm. Below are some of my opinions:
Improving the workflow won't help if the bulk of the workload is in IO bound operation, likened to Amdahl's law. Paralleling the individual enrolment might help but depends on the instance resource. However, if batching student entity creation helps in DB later, it's worth investigating. To this, I'd like to check the scalability of the solution still. Although I'm sure we won't have to enrol a University worth of students into a single course. |
I am not sure what you are trying to say above. I am bouncing the idea that we can try achieve this entire operation by making 1/2 API calls with no batching (with backend optimization).
No - split and retry is only triggered upon error. What if there is no error but a succesful enrollment after validating with stale data? I am suspecting that the eventual consistency issue is going to be worsened by the repeated API calls since it is only increases your chances of getting stale data from the database. I do not know why this is "actually safe" as you point out. Even the current production version has this problem - just that the probability of it happening is almost zero since the instructor will take some time to break up and re-upload. (can you help me test and say the same for your solution here?)
This is indeed what this PR is doing. I have already addressed my concerns to this idea above. We need to be clear that this is the best way forward for this. (e.g. show that no amount of backend optimisation can help).
Yes, that's right - why not optimize on the IO operations then? Something like getting only a subset of data (projections) and making a small amount of calls to enroll (batch operations). I am not talking about parallelising here haha. |
We should look into BE optimization for sure. The current entity-by-entity enrolment logic can definitely be optimized (unsure about the benefit without concrete stress test result: say if we shave 70% of the latency and allow 250 students enrolment instead of 100, is this scalable enough? This might be my naive thought, will test it out) Just bouncing some ideas against eventual consistency issue: would enforcing strong consistency at objectify level help (like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Code looks much better.
Here is another odd behaviour:
Suppose we have for instance 200 students and there is an error in the 200th student that only the backend can catch. When we click enroll, we get back an error message about the 200th student. However, we don't see any enrollment panel about the successful enrollment of students 100-150. This is def tricky but we should be clear to the user about which students got enrolled - right now a user might think the whole batch failed and might get a surprise later on.
const noInvalidRowsBefore: number = this.invalidRowsIndex.size; | ||
|
||
Array.from(studentEnrollRequests.keys()).forEach((key: number) => { | ||
const request: StudentEnrollRequest | undefined = studentEnrollRequests.get(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when would this be undefined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would not be undefined, but enforcing non-null is not permitted and so putting studentEnrollRequests.get(key)!!
will not pass lint
|
||
if (teamSectionMap.get(request.team) !== request.section) { | ||
this.invalidRowsIndex.add(key); | ||
const firstIndex: number | undefined = teamIndexMap.get(request.team); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here - why undefined (also check for the others)
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Outdated
Show resolved
Hide resolved
...pp/pages-instructor/instructor-course-enroll-page/instructor-course-enroll-page.component.ts
Show resolved
Hide resolved
@rrtheonlyone Hi I could not reproduce the scenario you mentioned. Here I try to enroll 200 students with the last one having an incorrect email which can only be caught by backend, but the panel still shows correct enrollment results |
@daongochieu2810 Try put a duplicate team for the last student instead (duplicate team that can only be caught by the backend). I just tried with 51 students (where the 51st student has the error) and this is what I see: I verified that the first 50 have actually been uploaded. |
…hieu2810/teammates into hieu/enroll-large-course
@rrtheonlyone The result panel is populated if an error happens |
Yup better now - however the |
I solved that by setting error message after populating the result panels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
@daongochieu2810 can you resolve the conflicts and update me after |
…hieu2810/teammates into hieu/enroll-large-course
@madanalogy I have resolved the conflicts |
Could you help to fix the failing component tests as well? Think you just need to update a snapshot as in #11051 |
Part of #10964
Outline of Solution
concat
fromrxjs
to wait and combine the responses of child requestsDemo on localhost with split-and-retry mechanism: here