
remove task queue #205

Merged
merged 24 commits into from
Nov 19, 2023

Conversation

pmeier
Member

@pmeier pmeier commented Nov 17, 2023

Closes #183, closes #204. TL;DR: this PR gets rid of huey / the task queue. It has been a source of confusion, limited some nice-to-have features (looking at you, #185), and overall made the implementation more complex.

Here are the highlights:

  • Remove the task queue and huey as a dependency
  • Rename Chat.prepare and Chat.answer to Chat.aprepare and Chat.aanswer to better reflect their async nature. See remove task queue #205 (comment)
  • Re-add Chat.prepare and Chat.answer as synchronous versions of their async counterparts. See remove task queue #205 (comment)
  • Add a new ragna.local_root function that acts as global configuration for local storage
  • Remove the config as a parameter for components. If they need local storage, the new ragna.local_root function should be used.
  • Allow components to implement regular as well as async functions. A regular function will now run on a separate thread, keeping Ragna "async first"
  • Create a ragna.deploy namespace and move the config, authentication, REST API, and UI implementation to it
  • Rename config.core to config.components
  • Move authentication and document settings to the "global" config scope, i.e. config.authentication and config.document
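The sync-over-async pattern from the highlights above (async-first methods, sync counterparts re-added on top, regular component functions off-loaded to a thread) can be sketched as follows. This is an illustrative sketch only, not ragna's actual implementation; the component method `_compute` is hypothetical, and `asyncio.to_thread` stands in for the `anyio` helper the PR uses:

```python
import asyncio
import inspect


class Chat:
    """Minimal sketch of the async-first design described above."""

    async def aanswer(self, prompt: str) -> str:
        # Components may implement either a regular or an async function.
        # A regular function is off-loaded to a worker thread so the event
        # loop stays responsive ("async first").
        fn = self._compute
        if inspect.iscoroutinefunction(fn):
            return await fn(prompt)
        return await asyncio.to_thread(fn, prompt)

    def answer(self, prompt: str) -> str:
        # Synchronous counterpart re-added on top of the async version.
        return asyncio.run(self.aanswer(prompt))

    def _compute(self, prompt: str) -> str:
        # Hypothetical regular (sync) component function.
        return prompt.upper()


chat = Chat()
print(chat.answer("hello"))  # runs the async path under the hood
```

Because the dispatch inspects the function at call time, a component author can freely switch `_compute` between a plain function and a coroutine function without changing any caller.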

I didn't bother to update the documentation properly as we need to have a complete overhaul before the next release anyway.

@pmeier pmeier mentioned this pull request Nov 17, 2023
@pmeier pmeier marked this pull request as ready for review November 18, 2023 22:17
ragna/core/_rag.py (review thread, outdated)
@pmeier
Member Author

pmeier commented Nov 18, 2023

@nenb Would you be able to test this PR and see if it works as expected? Just note that I didn't implement any streaming for #185 just yet. This will come in a follow-up PR.

ragna/core/_rag.py (review thread, outdated)
@pmeier
Member Author

pmeier commented Nov 19, 2023

Sleeping on this, I think we should not block the removal of the task queue on the issues with the sync endpoints. They were only suggested by me in #183 (comment) to make it clearer that there is blocking behavior. But with this PR, there is no blocking anymore: even without workers, everything is async. Thus, we can leave the sync endpoints for later. I'll remove them again from this PR and open an issue to fix them later.

@pmeier
Member Author

pmeier commented Nov 19, 2023

@nenb I'm merging this without waiting for your reply so I can move on to the other features that build on top of this. Since we are not adding any sync endpoints for now, I'm confident that I didn't break anything. Feel free to comment here or open a new issue if you find something that I missed.

@pmeier pmeier merged commit 2c1e5c4 into main Nov 19, 2023
10 checks passed
@pmeier pmeier deleted the remove-task-queue branch November 19, 2023 14:28
@nenb
Contributor

nenb commented Nov 19, 2023

@pmeier Nice work! Sorry I haven't been terribly responsive the last few days; I should be back online from tomorrow.

From looking through the PR now, the only bit that caught my eye was anyio.to_thread.run_sync. Some of the chunking/embedding can be quite CPU-intensive, and I was wondering if it might be better to run it in a separate process (to_process from anyio). I downloaded 5 books from Project Gutenberg in .txt format and tried to do a quick profile of threads vs. processes for these 5 books. Unfortunately, I got blocked on a bunch of serialisation issues (related to the chromadb ONNX format for their default model, and then related to tiktoken), so I wasn't able to perform this comparison.

I'm mentioning this here for future reference. I know you have a bunch of work planned, and this is likely premature optimisation. But perhaps worth keeping in mind ways to mitigate the impact of CPU-heavy workloads in future designs. And also some of the defaults, like the values that are used when starting the API with uvicorn. Again, likely premature optimisation on my part!
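The thread-vs-process trade-off raised here can be sketched with stdlib equivalents of anyio's helpers: `asyncio.to_thread` for thread off-loading, and a `ProcessPoolExecutor` standing in for `anyio.to_process.run_sync`. The function `chunk_documents` below is a made-up stand-in for CPU-heavy chunking/embedding work, and the serialisation issues mentioned above show up exactly at the process boundary, where arguments and results must be picklable:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


def chunk_documents(texts):
    # Stand-in for CPU-heavy chunking/embedding work.
    return [len(t.split()) for t in texts]


async def main():
    texts = ["the quick brown fox", "jumps over the lazy dog"]

    # Thread off-loading (what anyio.to_thread.run_sync does): keeps the
    # event loop free, but pure-Python work is still bound by the GIL.
    via_thread = await asyncio.to_thread(chunk_documents, texts)

    # Process off-loading (analogous to anyio.to_process.run_sync):
    # side-steps the GIL, but everything crossing the boundary must be
    # picklable -- which is where e.g. the chromadb ONNX model blocked.
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        via_process = await loop.run_in_executor(pool, chunk_documents, texts)

    assert via_thread == via_process
    return via_thread


result = asyncio.run(main())
print(result)
```

Process off-loading only pays off once the per-call work dwarfs the pickling and process start-up overhead, which is one reason treating this as premature optimisation for now is reasonable.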

@pmeier
Member Author

pmeier commented Nov 20, 2023

Unfortunately I got blocked on a bunch of serialisation issues (related to the chromadb ONNX format for their default model, and then related to tiktoken) and so I wasn't able to perform this comparison.

Do you think this is an issue with Ragna? If so, could you write up an issue so I can have a look?

I'm mentioning this here for future reference. I know you have a bunch of work planned, and this is likely premature optimisation. But perhaps worth keeping in mind ways to mitigate the impact of CPU-heavy workloads in future designs.

Yeah, I would put it under premature optimization unless we have valid complaints about this. In that case, I would maybe go a step further and not just add the subprocess option; a task queue would also fit nicely in there.

@nenb
Contributor

nenb commented Nov 20, 2023

Do you think this is an issue with Ragna? If so, could you write up an issue so I can have a look?

I don't think it's an issue with ragna, but I will write up some details (along with a bunch of other things I haven't responded to) throughout today, so you can have a look.

@nenb
Contributor

nenb commented Nov 21, 2023

@pmeier I have opened an issue here with details.

Successfully merging this pull request may close these issues:

  • queueless Ragna?
  • Change the default queue implementation to something other than memory