Use batched uploads of tree subsections to more efficiently generate trees on Studio #321

Open
rtibbles opened this issue Mar 23, 2021 · 0 comments

Following the logic used for efficient copying in the copy_node functionality in Studio, ricecooker should batch upload subsections of the overall node tree with full associated file and assessment item metadata.

Rather than waiting on an API call for every single node, ricecooker could send batches of ~1000 nodes at a time, with Studio then starting an asynchronous task to write each batch to the database.
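As a rough sketch of the batching step, assuming the tree is held as nested dicts with `id` and `children` keys (a hypothetical shape, not ricecooker's actual node classes), the tree could be walked breadth-first and grouped by parent, so that direct siblings always land in the same batch and parents are sent before their children. `iter_sibling_groups`, `make_batches` and the `parent`/`nodes` payload keys are made up for illustration.

```python
from collections import deque


def iter_sibling_groups(root):
    """Yield (parent_id, children) breadth-first, so parents precede their children."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        children = node.get("children", [])
        if children:
            yield node["id"], children
            queue.extend(children)


def strip_children(node):
    """Copy a node without its nested children; those are sent as their own group."""
    return {key: value for key, value in node.items() if key != "children"}


def make_batches(root, batch_size=1000):
    """Pack sibling groups into batches of roughly batch_size nodes.

    A sibling group is never split across batches, so a group larger than
    batch_size ends up alone in an oversized batch.
    """
    batches, current, count = [], [], 0
    for parent_id, children in iter_sibling_groups(root):
        if current and count + len(children) > batch_size:
            batches.append(current)
            current, count = [], 0
        current.append(
            {"parent": parent_id, "nodes": [strip_children(c) for c in children]}
        )
        count += len(children)
    if current:
        batches.append(current)
    return batches
```

Each batch would then be the body of a single POST. File and assessment item metadata would travel along as extra keys on each node dict.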

Ricecooker would poll Studio at a low frequency (about once a minute, though this could be tuned) to see when the task has completed, then follow up with subsequent updates. For parallelization, child node subsections that are independent of each other could be POSTed in parallel, relying on Studio's mptt locking mechanisms to keep the tree values consistent. Direct siblings, however, should be POSTed together so that their ordering is preserved.
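A minimal sketch of the poll-and-parallelize flow, under stated assumptions: `submit` and `get_status` are hypothetical callables standing in for whatever Studio endpoints would be added, and the `"SUCCESS"` / `"FAILURE"` status strings are placeholders rather than Studio's real task states.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_for_task(get_status, interval=60.0, max_polls=120, sleep=time.sleep):
    """Poll get_status() until the task finishes; return the number of polls made."""
    for attempt in range(max_polls):
        status = get_status()
        if status == "SUCCESS":  # placeholder status value
            return attempt + 1
        if status == "FAILURE":  # placeholder status value
            raise RuntimeError("batch task failed")
        sleep(interval)
    raise TimeoutError("task did not finish after %d polls" % max_polls)


def upload_independent(subtrees, submit, get_status, max_workers=4, **poll_kwargs):
    """Upload subtrees that do not depend on each other in parallel.

    submit(subtree) is assumed to POST one batch and return a task id;
    get_status(task_id) is assumed to return that task's current status.
    """

    def upload_one(subtree):
        task_id = submit(subtree)
        wait_for_task(lambda: get_status(task_id), **poll_kwargs)
        return task_id

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input subtrees.
        return list(pool.map(upload_one, subtrees))
```

Sibling groups would stay inside a single `subtree` payload here, so only genuinely independent subsections run concurrently. The `sleep` parameter is injectable mainly so the polling loop can be exercised without real delays.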

Previous discussion of this is here: #231 (comment)

The 'longer range' suggestion of parsing SQLite DBs from ricecooker would rely on the restorechannel command on Studio being reliable and fast. It is currently neither, and is unlikely to become so unless it uses the Kolibri import mechanisms, so it seems more sustainable to take the 'short term' approach, which is scalable and easy to implement.
