[RHDHPAI-535] Address Images Missing Content #34
Conversation
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Update: I had to remove the CI building and pushing of the detr-resnet-101 image because it required Git LFS. On my local fork this wasn't an issue, but for this org it is flagged as requiring additional resources (a paid Git LFS data-pack subscription on GitHub). Because of this I have moved the building and pushing of this image to a manual process, similar to vLLM.
I tried to test the updates, but unfortunately I get an error too when I'm trying to clone/checkout the branch of your fork due to the large size of models.safetensors file.
batch response: This repository is over its data quota.
Account responsible for LFS bandwidth should purchase
more data packs to restore access.
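One possible workaround for the quota error, offered as an assumption rather than something verified against this fork, is setting `GIT_LFS_SKIP_SMUDGE=1`, which makes `git clone` check out only the tiny LFS pointer files instead of downloading the large blobs:

```shell
set -e
# Assumption: a workaround, not something verified against this fork.
# With GIT_LFS_SKIP_SMUDGE=1, git disables the LFS smudge filter, so a
# clone checks out only the small pointer files and never touches the
# LFS bandwidth quota. Against GitHub the usage would be:
#   GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/Jdubrick/ai-developer-images.git
# Below, the mechanism is demonstrated with a throwaway local repo:
tmp=$(mktemp -d)
git init -q "$tmp/src"
git -C "$tmp/src" -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "init"
GIT_LFS_SKIP_SMUDGE=1 git clone -q "$tmp/src" "$tmp/clone"
echo "clone ok"
```

Individual large files can then be fetched later with `git lfs pull --include=<path>` once quota allows.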
I think if you have a template ready that you've used to do the testing for the object detection image it would be awesome (one with the image you've pushed manually)!
I see the action after you've merged the PR on your fork was successful too: https://github.com/Jdubrick/ai-developer-images/actions/runs/13978075242/job/39136688940
I'm wondering if it would be best to include the contents of detr-resnet-101 in a
I thought about it, tbh. I haven't spent a ton of time on it (i.e. I don't want to mislead you), but I'm thinking we need to be sure we use a fixed version, meaning the content's version inside the developer images is pinned. If we include the contents here, we are at least sure we build the same image every time and not a newer one. However, this could be facilitated in many different ways (maybe a hash in your download function could help?). Also maybe we could use the code from the
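The "hash in your download function" idea could be sketched like this; the file name and contents are placeholders, and in a real Makefile the expected hash would be hard-coded rather than computed:

```shell
set -e
# Sketch: pin the model download to a known SHA-256 so every build uses
# identical content. MODEL_FILE and its contents are illustrative
# stand-ins for the real models.safetensors download.
MODEL_FILE=models.safetensors
printf 'fake model weights' > "$MODEL_FILE"
# In the real Makefile EXPECTED_SHA256 would be hard-coded; here we
# compute it from the stand-in file so the sketch is self-contained.
EXPECTED_SHA256=$(sha256sum "$MODEL_FILE" | cut -d' ' -f1)
# Fail the build if the downloaded file does not match the pinned hash:
echo "$EXPECTED_SHA256  $MODEL_FILE" | sha256sum -c -
```

`sha256sum -c` exits non-zero on a mismatch, so the build stops whenever the download drifts from the pinned version.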
I know I had to enable Git LFS to be able to work with it, which I don't think is great as a requirement for others working with the repo, tbh. IMO it would be okay to add it to an ignore file; when a maintainer goes to update, they will pull the new changes in. We have fixed versions in Quay, and these are just the files used to build them, so we have historical records, if that makes sense. The Makefile used to pull the info and the script for converting into the proper format all came from
Hmm, I'm OK with adding it to ignore. My only question was how/if we can track which contents have created a specific image tag in quay.io, and whether we can fix a version of the model that is built locally. But I guess as long as this process is done manually we cannot be sure.
Yeah, it would be tough to track that because it has to be done manually. I think since it isn't updated very often we should be okay for now, but it may be worth looking into implementing some sort of tracker that you're required to update when you push, or maybe storing compressed versions of the content that are labeled? @thepetk
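The "labeled compressed versions of the content" idea could look roughly like this; the tag and file paths are illustrative, not taken from the repo:

```shell
set -e
# Sketch: archive the model contents under the same tag as the quay.io
# image and keep a checksum manifest as the historical record. The TAG
# value and paths are illustrative placeholders.
TAG=v1.0.2
mkdir -p detr-resnet-101
printf 'weights' > detr-resnet-101/models.safetensors
tar czf "detr-resnet-101-${TAG}.tar.gz" detr-resnet-101
sha256sum "detr-resnet-101-${TAG}.tar.gz" > "detr-resnet-101-${TAG}.sha256"
# The .sha256 file records exactly which contents produced a given tag:
cat "detr-resnet-101-${TAG}.sha256"
```

Committing only the small `.sha256` manifest would avoid the Git LFS requirement while still tying each image tag to a verifiable snapshot of its contents.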
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
thepetk left a comment
I think with the new changes it is good to be merged. I've successfully downloaded all contents using the Makefile command and built the detr image locally.
What does this PR do?:
This PR adds the missing content so that the images being built by CI are in proper working order. Prior to this the images were missing the actual model information, especially `detr-resnet-101`. In addition to adding a `Makefile` and the necessary build info for `detr-resnet-101` building, I also created a `CONTRIBUTING.md` file to better document how you can maintain these images. The `helm-charts/application-gitops` was previously marked as deprecated as well, so I removed the `config.env` from it to avoid CI building new versions.

Because the `detr-resnet-101` data is large, I had to move it to manual building and validation, similar to the `model-server/vllm` workflow. This is a constraint set by GH because of LFS, unfortunately.

Which issue(s) this PR fixes:
https://issues.redhat.com/browse/RHDHPAI-535
PR acceptance criteria:
Testing and documentation do not need to be complete in order for this PR to be approved. We just need to ensure tracking issues are opened and linked to this PR, if they are not in the PR scope due to various constraints.
Tested and Verified
Documentation (READMEs, Product Docs, Blogs, Education Modules, etc.)
How to test changes / Special notes to the reviewer:
You can perform the following to ensure all images are built properly:
1. Update the `config.env` file for each container image and redirect it to your Quay account; tag it however you want.
2. On your fork of `developer-images`, make sure GH Actions are enabled and you have your Quay info stored in GH secrets (`QUAY_USERNAME`, `QUAY_PASSWORD`). See the CI definition for more info.
3. Merge your changes into your `main` branch; this will trigger the CI and allow the build + push to occur to your Quay.
4. Fork `ai-lab-template` and change the image references in the env files to your newly built and hosted ones. All this info is stored here: https://github.com/redhat-ai-dev/ai-lab-template/tree/main/scripts/envs
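The `config.env` redirect step can be sketched as follows; the variable names shown are assumptions for illustration, not copied from the repo, so check the real file next to each container image:

```shell
set -e
# Hypothetical config.env; the real files live alongside each container
# image in the repo and their variable names may differ.
cat > config.env <<'EOF'
IMAGE_REGISTRY=quay.io/redhat-ai-dev
IMAGE_NAME=detr-resnet-101
IMAGE_TAG=latest
EOF
# Redirect the image to your own Quay account before the CI (or a
# manual build) pushes it; <your-user> is a placeholder:
sed -i 's|quay.io/redhat-ai-dev|quay.io/<your-user>|' config.env
cat config.env
```

After pushing these changes to your fork's `main` branch, the CI reads the updated registry value and pushes the built image to your Quay account instead of the upstream one.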