Cloud-Based execution of move-contacts commands #12

Closed
kennsippell opened this issue Jan 19, 2024 · 10 comments · Fixed by #177
Assignees
Labels
Type: Feature Add something new. uganda

Comments

@kennsippell
Member

Currently this tool outputs a cht console line which can be executed to move contacts. Let's take this to the next step and execute that command in the cloud.

@kennsippell kennsippell added Type: Feature Add something new. uganda labels Jan 21, 2024
@kennsippell
Member Author

kennsippell commented Feb 12, 2024

Discussion here to also move households and not just CHP areas https://github.com/moh-kenya/config-echis-2.0/issues/1662 + using forms in the CHT to trigger the move (not a UI interface like cht-user-management).

@kennsippell
Member Author

kennsippell commented Apr 16, 2024

Notes from design conversation today with @paulpascal.

What states will jobs go through:

  1. Pending
  2. (Not MVP) Flag user for maintenance
  3. (Not MVP) Wait for the user to sync
  4. (Not MVP) Check sentinel backlog and couch2pg backlog to ensure it is safe
  5. Execute the move-contacts + upload-docs
  6. (Not MVP) Remove user from maintenance mode
  7. Success
  8. Fail
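
To make the state list above concrete, here is a minimal sketch of it as a small state machine. All names (`JobState`, `MVP_PATH`, `FULL_PATH`, `canTransition`) are hypothetical and chosen only for illustration; the sketch assumes that any non-terminal state may move to Fail, which the notes do not state explicitly.

```typescript
// Hypothetical state names mirroring the list in the design notes.
type JobState =
  | 'pending'
  | 'flagged_for_maintenance' // not MVP
  | 'waiting_for_sync' // not MVP
  | 'checking_backlogs' // not MVP
  | 'executing' // move-contacts + upload-docs
  | 'removing_maintenance_flag' // not MVP
  | 'success'
  | 'fail';

// Happy path for the MVP: only the states not marked "(Not MVP)".
const MVP_PATH: JobState[] = ['pending', 'executing', 'success'];

// Happy path once the non-MVP safety steps exist.
const FULL_PATH: JobState[] = [
  'pending',
  'flagged_for_maintenance',
  'waiting_for_sync',
  'checking_backlogs',
  'executing',
  'removing_maintenance_flag',
  'success',
];

// Returns the state that follows `current` on the given happy path, if any.
function nextState(current: JobState, path: JobState[]): JobState | undefined {
  const index = path.indexOf(current);
  return index >= 0 && index < path.length - 1 ? path[index + 1] : undefined;
}

// Assumption: a job in any non-terminal state may fail; otherwise it may
// only advance one step along the happy path.
function canTransition(from: JobState, to: JobState, path: JobState[]): boolean {
  if (from === 'success' || from === 'fail') {
    return false;
  }
  return to === 'fail' || nextState(from, path) === to;
}
```

Keeping the happy path as data means the non-MVP steps could be switched on later by swapping `MVP_PATH` for `FULL_PATH` without touching the transition logic.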

What is needed from a Cloud-Based Queue

  1. Hostable in docker (rabbitMQ)? Or can it be cloud-based like SQS?
  2. Great if it had a UI to view status, click to cancel, click to view logs, etc. so we don’t have to build that
  3. We need to wait for users to sync... so we need to be able to pop from the queue many times, but decide not to process a job for some days
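
Requirement 3 is the unusual one, so here is a rough sketch of what "pop many times but postpone" could look like. This is an in-memory stand-in only, not any real queue product or its API; `PostponableQueue`, `QueuedJob`, and `notBefore` are hypothetical names, and a hosted queue would need an equivalent delay or re-queue feature.

```typescript
// Hypothetical job record: `notBefore` is an epoch timestamp in milliseconds.
interface QueuedJob {
  id: string;
  notBefore: number;
}

// In-memory illustration of a queue whose jobs can be popped and then put
// back with a delay. A real deployment would persist this state.
class PostponableQueue {
  private jobs: QueuedJob[] = [];

  add(id: string, now: number): void {
    this.jobs.push({ id, notBefore: now });
  }

  // Removes and returns the first job that is eligible at `now`,
  // or undefined when every remaining job is still postponed.
  pop(now: number): QueuedJob | undefined {
    const index = this.jobs.findIndex(job => job.notBefore <= now);
    return index === -1 ? undefined : this.jobs.splice(index, 1)[0];
  }

  // Puts a popped job back so it is not eligible again until `delayMs` has passed.
  postpone(job: QueuedJob, now: number, delayMs: number): void {
    this.jobs.push({ ...job, notBefore: now + delayMs });
  }
}
```

A worker would pop a job, check whether the user has synced, and call `postpone` if not. Passing `now` in explicitly keeps the sketch testable without waiting on a real clock.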

Design questions:

  1. What should happen if a user never syncs? Should we eventually move them? What is a reasonable limit?
  2. What queuing technology should we use for this?
  3. How can a user cancel a job? Learn about the state of the job? Investigate a failed job?
  4. What parts of this should be built into CHT-Core's sentinel?

Next Steps:

  1. Evaluation of cloud-based queues against requirements above.
  2. Pick one.
  3. It’d be great to just write this down and kinda get some opinions. Maybe post it on #development or #cht-user-management
  4. Pick how things will get pushed into the queue?
  • Options
    • User management tool push into queue
    • UI in CHT
  • The biggest factor here is WHO should move contacts and reorganize hierarchies? Should they be online users? Who is it in #moh-togo/#moh-uganda/#moh_kenya?
  5. Make worker thread(s) that pull from the queue and kick off cht-conf move-contacts code (via process, or via import, whatever)
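
The worker step can be sketched roughly as below. How cht-conf is invoked is left behind a `MoveRunner` function on purpose, since "via process, or via import" is still undecided. `MoveJob`, `JobResult`, `MoveRunner`, and `drainQueue` are hypothetical names, and a plain array stands in for the real queue.

```typescript
// Hypothetical job payload: which contacts to move and where to move them.
interface MoveJob {
  id: string;
  contactIds: string[];
  newParentId: string;
}

interface JobResult {
  id: string;
  status: 'success' | 'fail';
  error?: string;
}

// Abstraction over "via process, or via import": the runner might spawn
// cht-conf as a child process or call it as a library. Both are assumptions.
type MoveRunner = (job: MoveJob) => Promise<void>;

// Processes jobs one at a time so that only one move runs against the
// server at once. A failing job is recorded and does not stop the worker.
async function drainQueue(queue: MoveJob[], run: MoveRunner): Promise<JobResult[]> {
  const results: JobResult[] = [];
  let job: MoveJob | undefined;
  while ((job = queue.shift()) !== undefined) {
    try {
      await run(job);
      results.push({ id: job.id, status: 'success' });
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      results.push({ id: job.id, status: 'fail', error: message });
    }
  }
  return results;
}
```

Running jobs sequentially is a deliberate choice in this sketch: it lines up with the "don't take the server down" concern, at the cost of throughput.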

-- End of MVP --

  1. Expose the queue’s UI somehow?
  2. Or build a UI to show status, cancel, etc.
  3. Moving contacts safely - Don’t take the server down
  4. Moving contacts safely - Don’t lose personal data

@mrjones-plip
Collaborator

Any thoughts on doing this through a long-lived task via CHT Core API and Sentinel? While it would mean needing to upgrade Core wherever you want to use the feature, there's already a system in place for long-running tasks and queues, e.g. bulk upload. The user management tool should also be able to query the status of the job.

@kennsippell
Member Author

@mrjones-plip Who can we talk with to learn about how this is implemented in sentinel and how these long running tasks/queues work today? I'm only familiar with our homebrew couch-based queuing system used for outbound push. Is this the same system?

@mrjones-plip
Collaborator

@latin-panda, @m5r, and @njuguna-n did the original work on the CSV bulk upload, according to the PR I found!

cc @jkuester

@kennsippell
Member Author

Thanks. Quick scan and this PR doesn't appear to touch sentinel at all. Perhaps bulk uploads aren't processed by sentinel at all?

@mrjones-plip
Collaborator

mrjones-plip commented May 9, 2024

Oh! Well, that would be good to know if it wasn't in Sentinel - sorry if I've led you astray.

I dug up some test steps (private slack link) that I originally used for performance testing of bulk upload. Until some of the engineers chime in on this ticket, maybe this will better expose how it works? From what I can tell, there's a parseCsv() function which writes to the medic_log database. The medic_log entry is queried via AJAX to update the job progress on screen as the rows of the CSV are processed. When it's done, a CSV of errors is available for download.

@latin-panda

latin-panda commented May 10, 2024

Yes, as @mrjones-plip explained, that bulk upload tool is a bunch of promises that waits until all are resolved - running in the server (not a scheduler and not sentinel) - and writes the upload status to the medic_log (how many users are pending, failed, or successful).
At the moment, I don't have more experience in Sentinel than what's documented. Perhaps if this feature is expected to be heavy, it might need some sort of transition (entry point); Sentinel would then start listening for db changes and queue those transitions to apply the changes.
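
For readers unfamiliar with that pattern, a rough sketch of "a bunch of promises plus a progress record" follows. It is not the CHT Core code: `processRows`, `Progress`, and the `writeProgress` callback are hypothetical, and `writeProgress` merely stands in for whatever updates the medic_log entry.

```typescript
// Hypothetical progress record with the three counters mentioned above.
interface Progress {
  pending: number;
  successful: number;
  failed: number;
}

// Starts one promise per row, records progress as each one settles, and
// resolves once all rows are done. Everything runs in-process, with no
// scheduler and no queue.
async function processRows<T>(
  rows: T[],
  processRow: (row: T) => Promise<void>,
  writeProgress: (progress: Progress) => void,
): Promise<Progress> {
  const progress: Progress = { pending: rows.length, successful: 0, failed: 0 };
  writeProgress({ ...progress }); // initial snapshot before any row settles

  const tasks = rows.map(row =>
    processRow(row)
      .then(() => { progress.successful++; })
      .catch(() => { progress.failed++; })
      .then(() => {
        progress.pending--;
        writeProgress({ ...progress });
      }),
  );

  await Promise.all(tasks);
  return progress;
}
```

Because the whole run lives inside one server process, progress is lost if that process restarts, which is one reason this shape is a poor fit for an operation that spans days.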

@kennsippell
Member Author

kennsippell commented May 12, 2024

OK good to know! Thanks to both of you for the background info.

If we consider the long-term plan to move contacts without data loss via something like medic/cht-core#8860, then this will become a very async operation (multiple days). Therefore, I don't think the pattern used for bulk upload is right for this problem.

I think it's quite a bit more complex to build in sentinel. It would probably look a bit like how outbound push is written now, but I personally think we shouldn't be investing in bespoke queuing technologies which clutter our already strained CouchDB. If a Core Dev has time to take this on, I do think it would be a more reusable approach across projects ... but I also think this is too much to ask of @paulpascal, who has time to make progress on this now and is doing great via a public reusable queue.

Thanks for the suggestion and please do let me know if you think we're missing anything, if our plan to move contacts via multi-day async queuing isn't correct, or if anything else here isn't in the best interest of users.

@mrjones-plip
Collaborator

Thanks @kennsippell - that sounds like a fair assessment! I appreciate the consideration on what it would look like inside CHT Core vs outside.

paulpascal added a commit that referenced this issue Jun 10, 2024
paulpascal added a commit that referenced this issue Jun 19, 2024
paulpascal added a commit that referenced this issue Jun 19, 2024
paulpascal added a commit that referenced this issue Jun 26, 2024
paulpascal added a commit that referenced this issue Jun 29, 2024
* wip: move contact setup raw feature

* feat(#12): add centralized queue layer  & cleanup

* fix(#12): oups test

* fix(#12): get this test pass

* fix(#12): lint

* feat: address feedback part 1

* fix(#12): failing test

* feat(#12): fix review feedback part2

* feat(#12): fix review feedback part2

* fix(#12): job postponed

* fix(#12): update  postponed test

* feat(#12): make this toast display

* ref(#12): remove move_result.html as no more used

* fix(#12): address docker feedback

* Apply suggestions from code review

* fix(#12): address feedback part 5

* ref(#12): switch to an official version of cht-conf support session

* Apply suggestions from code review

* doc(#12): update readme

* fix(#12): add integration test + feedback

* 1.4.0