Create bulk_add_nodes internal endpoint for initiating bulk creation of nodes #3041

Open
rtibbles opened this issue Mar 24, 2021 · 8 comments

@rtibbles
Member

rtibbles commented Mar 24, 2021

Desired behavior

A new API endpoint that will take a bulk set of nodes with either channel information or a parent_id, and initiate an asynchronous task to create a new tree or insert the nodes into an existing ricecooker tree. It should return the task_id of the generated task so that ricecooker can poll the task endpoint for updates.
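
A minimal sketch of the polling loop this implies on the ricecooker side. The fetch_status callable, the status strings, and the response shape are all assumptions made for illustration; the issue does not define them.

```python
import time


def poll_task(fetch_status, task_id, interval=1.0, max_attempts=60, sleep=time.sleep):
    """Poll until the task reports a terminal state, then return the last response.

    fetch_status is a hypothetical callable that takes a task_id and returns a
    dict with a "status" key; in practice it would wrap a GET to the task endpoint.
    """
    for _ in range(max_attempts):
        response = fetch_status(task_id)
        # "SUCCESS" and "FAILURE" are assumed terminal states, not confirmed names.
        if response["status"] in ("SUCCESS", "FAILURE"):
            return response
        sleep(interval)
    raise TimeoutError("task %s did not finish after %d polls" % (task_id, max_attempts))
```

Injecting fetch_status and sleep keeps the loop independent of any HTTP client and makes it easy to exercise without a running server.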

The main data for the POST request should be a JSON array of objects, each of which represents a ContentNode.

This data should be in the same format as that currently used by the api_add_nodes_to_tree(request) view, but with an additional optional children key that will contain other ContentNodes.
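
To make the nesting concrete, here is an illustrative payload and a helper that counts the nodes in it. Only the children key comes from the description above; the other field names (title, node_id, kind) are assumptions and may not match the real schema.

```python
import json

# Illustrative payload only: apart from "children", the field names here
# ("title", "node_id", "kind") are assumptions, not the confirmed schema.
payload = [
    {
        "title": "Topic A",
        "node_id": "a",
        "kind": "topic",
        "children": [
            {"title": "Video 1", "node_id": "a1", "kind": "video"},
            {
                "title": "Subtopic",
                "node_id": "a2",
                "kind": "topic",
                "children": [
                    {"title": "Exercise 1", "node_id": "a2x", "kind": "exercise"},
                ],
            },
        ],
    },
    {"title": "Document B", "node_id": "b", "kind": "document"},
]


def count_nodes(nodes):
    """Count every node in the payload, descending into the optional "children" key."""
    return sum(1 + count_nodes(node.get("children", [])) for node in nodes)


body = json.dumps(payload)  # what would be sent as the POST body
print(count_nodes(json.loads(body)))
```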

Stretch goal:

Generate file upload URLs 'as you go' for any file information generated, which can then be fed back to ricecooker during task polling. Ricecooker can then start doing direct file uploads to GCS while the tree is being generated.
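
One possible shape for the 'as you go' idea is a generator that walks the tree and yields an upload URL for each file as soon as it is seen, so a task could report each one in its progress updates. This is only a sketch: make_upload_url is a hypothetical stand-in for whatever produces a real GCS upload URL, and the files and filename keys are assumed.

```python
def iter_upload_urls(nodes, make_upload_url):
    """Walk the tree depth-first, yielding (filename, url) once per distinct file.

    make_upload_url is a hypothetical callable; the "files" and "filename"
    keys are assumptions about the node payload.
    """
    seen = set()
    stack = list(reversed(nodes))
    while stack:
        node = stack.pop()
        for file_info in node.get("files", []):
            filename = file_info["filename"]
            if filename not in seen:  # the same file may be attached to several nodes
                seen.add(filename)
                yield filename, make_upload_url(filename)
        # Push children in reverse so they are visited in their original order.
        stack.extend(reversed(node.get("children", [])))
```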

Required for learningequality/ricecooker#321

Some suggested stages for tackling this issue:

  1. Make a clone of the existing api_add_nodes_to_tree API view that runs the node adding in an asynchronous task and returns the task_id instead. With the existing payload, it may be feasible to pass the payload solely as task arguments.
  2. Investigate size limitations for this approach and add error checking if the API is sent JSON that is too large.
  3. If the size of the JSON that can be used in this way is very small, investigate ways to store the JSON in the database from the endpoint, and then reference this saved JSON in the asynchronous task.
  4. Optimize the node creation using bulk creation methods similar to those used in the copying logic in https://github.com/learningequality/studio/blob/unstable/contentcuration/contentcuration/db/models/manager.py#L338
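
Stages 2 and 4 can be sketched without any framework code. The size limit below is a made-up placeholder, and the level-by-level grouping is a simple illustrative approach, not the technique used by the linked copying logic.

```python
import json

MAX_PAYLOAD_BYTES = 1024 * 1024  # hypothetical limit; the real value is what stage 2 would determine


def check_payload_size(nodes, limit=MAX_PAYLOAD_BYTES):
    """Stage 2: reject payloads whose serialized JSON exceeds the limit."""
    size = len(json.dumps(nodes).encode("utf-8"))
    if size > limit:
        raise ValueError("payload is %d bytes, limit is %d" % (size, limit))
    return size


def flatten_by_level(nodes):
    """Stage 4 preparation: group nodes by depth so each level could be created in one batch.

    Returns a list of levels. Each level is a list of (parent_key, fields) pairs,
    where parent_key is the (level, index) of the parent row, or None for roots.
    """
    levels = []
    current = [(None, node) for node in nodes]
    while current:
        level_index = len(levels)
        rows, next_level = [], []
        for index, (parent_key, node) in enumerate(current):
            # Strip "children" so each row holds only the node's own fields.
            fields = {k: v for k, v in node.items() if k != "children"}
            rows.append((parent_key, fields))
            for child in node.get("children", []):
                next_level.append(((level_index, index), child))
        levels.append(rows)
        current = next_level
    return levels
```

Grouping by depth means every parent row exists before its children are created, so each level could be handed to a single bulk insert. Any tree bookkeeping the real models need is out of scope for this sketch.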
@ozer550
Member

ozer550 commented Mar 11, 2022

@rtibbles I would like to work on this issue. I'm new to the Kolibri ecosystem. Can I get some pointers to the files I should look into to implement this?

@rtibbles
Member Author

The linked file is the best place to start: https://github.com/learningequality/studio/blob/hotfixes/contentcuration/contentcuration/views/internal.py#L522

In addition, to see how we have efficiently managed bulk node creation in another part of the code base, follow the implementation of the copy_node method of the ContentNode manager class: https://github.com/learningequality/studio/blob/hotfixes/contentcuration/contentcuration/db/models/manager.py#L331

@vkWeb
Member

vkWeb commented Mar 12, 2022

@rtibbles assigning this to @ozer550 and myself. We will pair program this feature.

@vkWeb
Member

vkWeb commented Nov 29, 2022

We both got busy solving and collaborating on more important issues, so this didn't get addressed. Unassigning both of us until we revisit this.

@vkWeb vkWeb unassigned vkWeb and ozer550 Nov 29, 2022
@AllanOXDi AllanOXDi self-assigned this Dec 1, 2022
@AllanOXDi
Member

Let me try to bite it.

@AllanOXDi AllanOXDi removed their assignment Dec 27, 2022
@bjester bjester added this to the Studio: next major release milestone Jan 18, 2023
@akash5100
Contributor

Hi there! I'm interested in this issue. Is anyone currently working on this issue or is it available for open-source contribution? Thank you!

@rtibbles
Member Author

Hi @akash5100, thanks for your interest. Unfortunately, since writing this issue I have become unsure that this is the right path forward, so I may close or rewrite it.

@akash5100
Contributor

akash5100 commented Feb 16, 2023

Thank you for your quick reply, @rtibbles! I appreciate your help.
I was wondering if there are any other issues that are available for open-source contributions. I am very interested in contributing to open-source projects and would love to get involved in this project (studio). Thank you for your time and assistance.

@bjester bjester added P3 - low Priority: Stretch goal and removed P2 - normal Priority: Nice to have labels Mar 14, 2023
@bjester bjester removed this from the Studio: next major release milestone Mar 14, 2023