Support for the AI Horde #77
+1 for AI Horde - If you haven't heard of it, you can give it a quick spin on https://artificial-art.eu/. I personally rent out my GPU to the Horde for fun, and many other people do too :) Btw OP is a big contributor to different FOSS projects in the space
I've heard of it, it's a very cool project :) There is plenty of motivation to have a cloud solution without a lengthy boot phase and limited model support. As you mention, hardware is a big gatekeeper, and something like 90% of issues are about installation, so there is that too. Unfortunately I don't think it's going to be easy to integrate though: the plugin is using ComfyUI not just as an SD backend, but also as a flexible pipeline for more complex workflows, and for performing various image-related tasks. Parts could be replaced with multiple calls and by doing image operations with Qt, but it's a lot of effort and inefficient compared to running those operations via torch on GPU tensors. A (very) old version of this plugin used Auto1111's API (similar to the Horde API), and switching to Comfy has been a great boon. Other things will be plain impossible to replicate - mostly the (100% denoise) inpainting/outpainting, which is quite complex in order to achieve results that are comparable to Adobe (to some extent) without a prompt or expertise. At the same time I feel this is an important selling point. Some example requirements that I don't see how to meet:
Do you do multiple calls to comfy, or do you manually create a comfyUI pipeline json based on your current settings and send that? Just to point out that the AI Horde is also using comfyUI in the background, so anything comfyUI can do, we can do as well. To answer your questions:
Yes, I should have led with that: technically it's definitely possible to make it work. The Comfy install, as well as the user/auth infrastructure that AI Horde already has, can be reused. But I'm sceptical whether it can be done without significant extensions to the prompt API, some of which may end up rather specifically targeting the Krita plugin. As the plugin evolves, I am also frequently adjusting or extending the ComfyUI pipeline, which is harder to do if it must be supported by Horde workers. I will give a high-level overview of the inpaint pipeline. Let's assume the image is full HD, and the mask bounds 400x400px. We are replacing the masked contents entirely here, no img2img!
Sorry, very long, but this is already a condensed version. It's a kind of high-res-fix workflow, but it doesn't upscale the entire image, only a region around the mask. The pipeline is built dynamically, with variations if the resolution is too small rather than too large, or not a multiple of 8, and various other corner cases that simply happen when users without any particular knowledge about SD use an image application. All of this is one ComfyUI prompt. Some of the things can be done client-side (but more difficult and less efficient), but as far as I can see there is still a large gap to what the AI Horde API provides. Some of the things might be neat general extensions. But probably some you would also consider to be very specific.
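One of those corner cases can be made concrete. A minimal sketch (the helper name and exact centering policy are my own, not taken from the plugin) of snapping a mask region up to multiples of 8 and clamping it to the image bounds, as SD latents work on 8px blocks:

```python
def snap_region(x, y, w, h, image_w, image_h, multiple=8):
    """Expand a mask bounding box so width/height are multiples of 8,
    clamped to the image bounds. Hypothetical helper for illustration."""
    def round_up(v):
        return -(-v // multiple) * multiple  # ceiling division

    new_w = min(round_up(w), image_w)
    new_h = min(round_up(h), image_h)
    # shift the origin so the expanded box stays inside the image
    new_x = max(0, min(x - (new_w - w) // 2, image_w - new_w))
    new_y = max(0, min(y - (new_h - h) // 2, image_h - new_h))
    return new_x, new_y, new_w, new_h

print(snap_region(103, 57, 397, 403, 1920, 1080))  # → (102, 55, 400, 408)
```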
If you have a specific pipeline for your plugin, we could arrange to have it copied over to our worker and used when a specific trigger in the payload is sent. Assuming a proper setup of the comfy job template using placeholders, we can ensure that whatever arguments you send in the payload are forwarded to the right place. Technically we could allow you to simply send the whole comfy pipeline json and source images in one go, but that has risks to the workers, which is why we can't do it right now.
Yes, if the workers would accept entire prompts just like the /prompt endpoint of Comfy, it would work seamlessly out of the box - but I also assumed that would be a security nightmare. What you're suggesting with named fixed templates plus arbitrary additional inputs could work too. Currently workflows are built dynamically, to account for nodes that need to be added depending on image size or how many LoRAs/ControlNets/whatever there are. Do the job templates support things like that? Or is this handled by worker code?
Yes, we do dynamically change our comfy pipeline based on the type of request (img2img, ControlNet, LoRAs, etc.). If the types of workflows used are fairly simple, we could just create a number of versions and choose one based on the payload. If it's more complex, we could add some special triggers to the payload which would dynamically "compose" the final comfy workflow according to some logic. We would just need to agree on a format.
Can you point me to the code or workflow template which handles that for the AI Horde right now? So I can get an idea of what is there. I currently have 6 workflows. With the ability to conditionally include/exclude/repeat parts, I could represent them as-is (or condense them into fewer if it makes sense). Without it, it might quickly become an unwieldy number due to all the combinations. If a workflow template was something like a Jinja template (or equivalent) which gets the payload as an input dict, I think that would work without requiring any custom code on the worker.
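For illustration, the conditional include/repeat being discussed could also live in worker-side Python rather than a template engine. A sketch (node ids, class names, and payload fields are illustrative, not an agreed format) that chains one LoRA loader node per entry in the payload:

```python
def build_workflow(payload):
    """Compose a ComfyUI-style prompt dict, inserting one LoRA loader
    node per LoRA in the payload. Names are illustrative only."""
    wf = {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": payload["model"]}},
    }
    model_link = ["1", 0]  # (node id, output index) feeding the next node
    for i, lora in enumerate(payload.get("loras", [])):
        node_id = f"lora_{i}"
        wf[node_id] = {"class_type": "LoraLoaderModelOnly",
                       "inputs": {"lora_name": lora["name"],
                                  "strength_model": lora["strength"],
                                  "model": model_link}}
        model_link = [node_id, 0]  # chain the next loader onto this one
    wf["sampler"] = {"class_type": "KSampler",
                     "inputs": {"model": model_link,
                                "seed": payload.get("seed", 0)}}
    return wf

wf = build_workflow({"model": "sd15.safetensors",
                     "loras": [{"name": "a", "strength": 0.8},
                               {"name": "b", "strength": 0.5}]})
print(wf["sampler"]["inputs"]["model"])  # → ['lora_1', 0]
```

The same pattern extends to conditionally dropping or adding nodes based on image size or ControlNet use.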
The pipeline discovery happens in this part: https://github.com/Haidra-Org/hordelib/blob/3b001e1296ca72a3dae7bb34cc875e68c59f3bed/hordelib/horde.py#L639. We basically select one of the predefined pipeline jsons we have, based on the parameters in the payload. For example, if a controlnet is requested we load the workflow containing controlnets. It wouldn't be particularly difficult to extend this.
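A simplified sketch of that selection logic (not the actual hordelib code; the parameter and file names here are illustrative):

```python
def choose_pipeline(payload):
    """Pick one of the predefined pipeline json files based on the
    payload. Simplified sketch, not the real hordelib implementation."""
    if payload.get("control_type"):
        return "pipeline_controlnet.json"
    if payload.get("source_image"):
        return "pipeline_img2img.json"
    return "pipeline_txt2img.json"

print(choose_pipeline({"control_type": "canny"}))  # → pipeline_controlnet.json
print(choose_pipeline({"prompt": "a cat"}))        # → pipeline_txt2img.json
```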
I created a branch for the Krita plugin with a very simple/hacky Horde client to better judge how it fits in. Only basic txt2img. I think it will be quite a bit of work to make it nice, but it's all straightforward. There are some questions, but mostly about superficial details. The more interesting question is how to support custom workflows, so next I looked at hordelib. My plan was to somehow throw a small part of my code in there which takes a "payload" dict, generates the complex workflow used by the plugin, and executes it. I'm sure it can be done, but reading the code I get the feeling that it would either be very foreign in the hordelib codebase, or require so much rewriting that it would be difficult to maintain for both sides. So first I'd like to come back to accepting ComfyUI prompt JSON on the Horde API directly: it seems to me now the only solution that doesn't cause considerable friction in the long run. What I could easily do is replace certain nodes in the Comfy prompt that I want to send with Horde versions:
The prompt JSON structure is pretty simple, and node types could be checked against a whitelist. But I haven't thought about it much beyond that; you probably have a better idea of how feasible it is? It looks like it would be very easy to integrate into hordelib, but perhaps security concerns remain.
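A sketch of what such a whitelist-plus-replacement pass might look like (the allowed set and the Horde node class names are hypothetical, invented for illustration):

```python
# Node types a client may submit unchanged (hypothetical whitelist).
ALLOWED = {"KSampler", "CLIPTextEncode", "VAEDecode", "EmptyLatentImage"}

# Hypothetical mapping of stock ComfyUI nodes to Horde-managed versions.
HORDE_NODES = {"CheckpointLoaderSimple": "HordeCheckpointLoader",
               "LoadImage": "HordeImageLoader"}

def sanitize_prompt(prompt):
    """Validate a client-submitted ComfyUI prompt dict: every node must
    be whitelisted or mapped to a Horde replacement; otherwise reject."""
    out = {}
    for node_id, node in prompt.items():
        cls = node["class_type"]
        if cls in HORDE_NODES:
            node = {**node, "class_type": HORDE_NODES[cls]}
        elif cls not in ALLOWED:
            raise ValueError(f"node type not allowed: {cls}")
        out[node_id] = node
    return out

p = sanitize_prompt({"1": {"class_type": "LoadImage", "inputs": {}}})
print(p["1"]["class_type"])  # → HordeImageLoader
```

A check like this constrains *which* nodes run, but not how expensive the resulting graph is, which is the harder half of the problem discussed below.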
If it's something we could "standardize" to allow people to compose custom workflows safely somehow, I could see it being useful to onboard into hordelib. I.e. if we can make it a generic thing, and not something specific to your Krita plugin only. It would be a useful collaboration, as it would give a lot of flexibility to power users of the AI Horde.
The concerns I have are about the halting problem and the potential for someone to send something extraordinarily difficult in order to crash the workers. There's also the problem that it's impossible to determine the kudos consumption of a payload like that, as it can be of infinite complexity. The latter is not a showstopper, but someone sending an infinite loop or a crashing payload is. I would love to hear if you have any ideas on how to validate a comfy payload to avoid these. If we can figure out a way to scan the payload json for sanity, I could make it widely available. The only safe way to pass complete comfy payloads currently would be to use a trusted user role, i.e. only specific users would be able to send such payloads. We can enable this ad hoc for people. It's not optimal for allowing everyone to use the plugin without a GPU, but it's a start.
I suspect the only reliable way to deal with that would be to run jobs in an external process/container which can be monitored and terminated if it times out.
I'll experiment a bit to see how that might look.
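A minimal sketch of that process-isolation idea using Python's stdlib multiprocessing (illustrative only; a real worker would also need to cap memory and GPU use, which a plain subprocess timeout does not):

```python
import multiprocessing as mp
import time

def _entry(q, fn, args):
    # Runs inside the child process and ships the result back.
    q.put(fn(*args))

def run_job(fn, args=(), timeout=10.0):
    """Run fn in a child process and kill it if it exceeds the timeout,
    so a hostile or looping payload cannot hang the worker itself."""
    q = mp.Queue()
    p = mp.Process(target=_entry, args=(q, fn, args))
    p.start()
    p.join(timeout)
    if p.is_alive():
        p.terminate()
        p.join()
        raise TimeoutError("job exceeded its time budget")
    return q.get()

def quick(x):
    return x * 2

def slow():
    time.sleep(60)

if __name__ == "__main__":
    print(run_job(quick, (21,)))  # → 42
    try:
        run_job(slow, timeout=0.5)
    except TimeoutError as e:
        print("terminated:", e)
```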
We already do multiprocess comfy via hordelib. I've asked Tazlin (our backend dev) to chime in.
Dumping my thoughts regarding "generic" workflows to make sure we're on the same page. A Horde Workflow Template consists of
The horde would support a generic endpoint which takes a workflow ID and matching payload. Workflows are deployed to workers using a semi-automatic process (maybe via a PR). When horde workers get a request for a custom workflow they:
I hope this would allow adding and modifying custom workflows with minimal overhead. I still want to come up with a concrete example: so far I've tried to find something that meets Horde requirements but would also be a good format for me to maintain workflows in. It's difficult, but maybe not necessary. I can always add another indirection.
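To make the proposal concrete, a hypothetical request to such a generic endpoint might look like the following (none of these field names exist in the current AI Horde API; they are invented for illustration):

```python
import json

# Hypothetical request body for a generic "run custom workflow" endpoint.
# workflow_id would refer to a template previously deployed to workers;
# everything inside "params" is forwarded into the template placeholders.
request = {
    "workflow_id": "krita-inpaint-v1",
    "params": {
        "prompt": "a photo of a cat",
        "mask_region": [102, 55, 400, 408],  # x, y, w, h
        "denoise": 1.0,
    },
}
print(json.dumps(request, indent=2))
```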
The AI Horde is a crowdsourced generative AI cluster for people to share compute. It would allow everyone to use this tool, not only those with powerful GPUs. The AI Horde provides a completely open and documented REST API that this plugin could easily integrate with.
Happy to help with more questions about this.