api: Create generative AI APIs using AI subnet #2246
Pretend they are JSON for now, adding support for multipart forms next.
That required upgrading ajv itself, which was a bit of a pain but worked and also found a few issues in our schema.
Looks good. Added some minor comments. I have some other questions and comments (about billing and monitoring), but I'll raise them in Discord.
type: string
default: SG161222/RealVisXL_V4.0_Lightning
enum:
  - SG161222/RealVisXL_V4.0_Lightning
Do I understand correctly that we currently accept only 2 models, and that these 2 specific models are always available in Os (Orchestrators)?
Btw, what happens if the model is not available in a specific O?
@leszko Forgot to reply to this. Yeah, right now we accept only these 2 models (or others in other APIs), which are the ones recommended by the AI Network team in the docs, e.g. https://docs.livepeer.org/ai/pipelines/text-to-image#models
Other models (called "on-demand") are still supported, but they might have less support from Orchestrators so I opted to keep them out on this first version.
Currently, an O can keep only 1 model (from any pipeline) warmed up, as it has to keep an ai-worker container running with that model loaded in the GPU. When the model is not available (warm) in the O, it will just kill that running container and start another ai-worker configured with the requested model. The model is usually available on disk if the operator previously configured it (see this getting started), otherwise the ai-worker will download the model dynamically from Hugging Face.
This means that asking for an uncommon model can make the request really slow. Just loading the model on the GPU already takes a couple dozen seconds, and if the O has to download it as well, it could even take minutes. That's why I opted to limit the models that can be requested here (but there's still a risk that no O has it warm and the request takes longer than usual).
I believe this whole flow is being improved on by the AI network team, like with the "remote worker" architecture (similar to splitting O+T nodes). Maybe @rickstaa could correct if I said anything wrong or provide more info here 😄
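To make the flow above concrete, here is a rough sketch of the three cases (warm, on disk, not downloaded). It is only an illustration of the behavior described in this comment: the `Orchestrator` shape and the function names are hypothetical and do not come from the ai-worker or go-livepeer code.

```typescript
// Hypothetical model of an Orchestrator's state, for illustration only.
type ModelState = "warm" | "on-disk" | "missing";

interface Orchestrator {
  warmModel: string | null; // at most one model is kept warm at a time
  modelsOnDisk: Set<string>; // models the operator has already downloaded
}

// Classify how "far away" the requested model is on this Orchestrator.
function classifyModel(o: Orchestrator, modelId: string): ModelState {
  if (o.warmModel === modelId) return "warm";
  if (o.modelsOnDisk.has(modelId)) return "on-disk";
  return "missing";
}

// List the steps the Orchestrator goes through before it can answer.
function stepsToServe(o: Orchestrator, modelId: string): string[] {
  const state = classifyModel(o, modelId);
  if (state === "warm") {
    return ["run inference"];
  }
  const steps = [
    "stop the running ai-worker container",
    "start an ai-worker configured with the requested model",
  ];
  if (state === "missing") {
    // The slowest case: the model has to be fetched first.
    steps.push("download the model from Hugging Face");
  }
  steps.push("load the model onto the GPU", "run inference");
  return steps;
}

const orchestrator: Orchestrator = {
  warmModel: "SG161222/RealVisXL_V4.0_Lightning",
  modelsOnDisk: new Set(["SG161222/RealVisXL_V4.0_Lightning"]),
};
console.log(stepsToServe(orchestrator, "some/uncommon-model"));
```

Restricting the accepted models to the recommended ones raises the odds of hitting the single-step "warm" path, though it does not guarantee it.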
LGTM, a couple tests are failing
What does this pull request do? Explain your changes. (required)
This creates the initial versions of the AI Generate APIs, based on the design doc.
This initial version is essentially a proxy to an internal AI Gateway service, so it
exposes the exact same interface as the gateway.
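As a rough illustration of the proxy idea, the sketch below maps an incoming `/generate/...` path to a target on the gateway. The base URL, the assumption that the gateway path is just the pipeline name, and the `toGatewayTarget` helper are all hypothetical and not taken from this PR.

```typescript
// Hypothetical helper: every name and URL here is an assumption for illustration.
interface ProxyTarget {
  url: string;
  method: "POST";
}

function toGatewayTarget(gatewayBaseUrl: string, apiPath: string): ProxyTarget {
  const prefix = "/generate/";
  if (!apiPath.startsWith(prefix)) {
    throw new Error(`not a generate API path: ${apiPath}`);
  }
  const pipeline = apiPath.slice(prefix.length);
  // Only allow simple pipeline names such as "text-to-image".
  if (!/^[a-z0-9-]+$/.test(pipeline)) {
    throw new Error(`invalid pipeline name: ${pipeline}`);
  }
  // Assumption: the gateway exposes the pipeline at the root of its base URL.
  const base = gatewayBaseUrl.replace(/\/+$/, "");
  return { url: `${base}/${pipeline}`, method: "POST" };
}

console.log(toGatewayTarget("http://ai-gateway.internal/", "/generate/text-to-image"));
```

Because the request and response bodies are passed through unchanged, the API schema only needs to mirror the gateway's own interface.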
The main complexity of this was adding support for multipart requests, currently used by
the AI Gateway interface for these APIs. This required upgrading the major version of the
ajv library and some of its related packages, as well as adding a bit of code to handle
multipart form validation and error handling.
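For readers unfamiliar with why multipart needs extra handling: form fields always arrive as strings, so they have to be coerced and defaulted before they can be checked against a JSON-style schema. The sketch below is a hand-rolled illustration of that idea only. It does not use ajv and is not the code in this PR, and the field names (`prompt`, `model_id`, `width`) are hypothetical.

```typescript
// Minimal, hypothetical schema shape (a tiny subset of JSON Schema).
type FieldSpec =
  | { type: "string"; enum?: string[]; default?: string }
  | { type: "integer"; default?: number };

interface FormSchema {
  required: string[];
  properties: Record<string, FieldSpec>;
}

interface FormResult {
  value: Record<string, string | number>;
  errors: string[];
}

// Coerce string-only multipart fields to typed values and collect errors.
function validateForm(schema: FormSchema, fields: Record<string, string>): FormResult {
  const value: Record<string, string | number> = {};
  const errors: string[] = [];
  for (const [name, spec] of Object.entries(schema.properties)) {
    const raw: string | undefined = fields[name];
    if (raw === undefined) {
      if (spec.default !== undefined) value[name] = spec.default;
      else if (schema.required.includes(name)) errors.push(`${name} is required`);
      continue;
    }
    if (spec.type === "integer") {
      if (/^-?\d+$/.test(raw)) value[name] = Number(raw);
      else errors.push(`${name} must be an integer`);
    } else if (spec.enum && !spec.enum.includes(raw)) {
      errors.push(`${name} must be one of: ${spec.enum.join(", ")}`);
    } else {
      value[name] = raw;
    }
  }
  for (const name of Object.keys(fields)) {
    if (!(name in schema.properties)) errors.push(`unknown field: ${name}`);
  }
  return { value, errors };
}

// Hypothetical schema; only the model default/enum value comes from the diff above.
const exampleSchema: FormSchema = {
  required: ["prompt"],
  properties: {
    prompt: { type: "string" },
    model_id: {
      type: "string",
      default: "SG161222/RealVisXL_V4.0_Lightning",
      enum: ["SG161222/RealVisXL_V4.0_Lightning"],
    },
    width: { type: "integer" },
  },
};

console.log(validateForm(exampleSchema, { prompt: "a lighthouse at dusk", width: "512" }));
```

Collecting every error in one pass, rather than failing on the first, lets the API return a single response that lists all invalid fields.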
Note: I haven't added the actual API reference on purpose (the paths: section of the
API schema), given we're not yet sure how this new API is going to be advertised.
Right now it is behind an experiment, so no end users will be able to use it by themselves.
Specific updates (required)
- /generate/text-to-image API
- /generate APIs
How did you test each of these updates (required)
yarn test
Does this pull request close any open issues?
Implements ENG-2181