
Dockerfile for cloud conversion of LLM models #319


Merged: ssss141414 merged 6 commits into dev from shzhen/docker on Jun 26, 2025

Conversation

@ssss141414 (Contributor) commented Jun 19, 2025

This Dockerfile creates an image containing the environments for cloud conversion of LLM models:

  • QNN:
    Python 3.10 with autogptq installed.
    Python 3.12 with nightly ort-qnn installed.
  • AMD: shared with the QNN environment.
  • Intel:
    Python 3.12 with openvino installed.
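
For orientation, here is a minimal sketch of that layout. The diff below only shows the first line of the real file; the stage names, package pins, and install commands in this sketch are assumptions for illustration, not the merged contents.

```dockerfile
# Minimal sketch of the layout described above, NOT the merged file:
# stage names and install commands are illustrative assumptions.
FROM python:3.12-slim AS base
RUN python -m pip install --upgrade pip

# QNN (also reused for AMD): Python 3.12 with a pre-release onnxruntime-qnn.
# A real nightly build would come from an --extra-index-url package feed.
FROM base AS qnn
RUN python -m pip install --pre onnxruntime-qnn

# Companion Python 3.10 environment for autogptq (PyPI name: auto-gptq).
FROM python:3.10-slim AS qnn-py310
RUN python -m pip install auto-gptq

# Intel: Python 3.12 with OpenVINO.
FROM base AS intel
RUN python -m pip install openvino
```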

@@ -0,0 +1,41 @@
FROM python:3.12-slim AS base
Contributor

If we plan to use a prebuilt Docker image, make sure to release it in an official place.

Contributor Author

Yes, I think a prebuilt Docker image is necessary; otherwise the job needs to install requirements every time, which takes a long time.

Working on the official release. For this PR, this is just a place to put the Dockerfile.
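
As a sketch, the build-and-publish flow for such a prebuilt image could look like this; the registry name, tag, and target stage are placeholders, not the official location discussed above.

```bash
# Hypothetical publish flow; registry, tag, and --target are placeholders.
docker build --target qnn -t example.azurecr.io/cloud-conversion:qnn-v1 .
docker push example.azurecr.io/cloud-conversion:qnn-v1
```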

@ssss141414 changed the title from "Dockerfile for cloud conversion of QNN" to "Dockerfile for cloud conversion of LLM models" on Jun 23, 2025
@ssss141414 marked this pull request as ready for review on June 25, 2025 06:35
Contributor

@xieofxie left a comment


LGTM. Make sure we version the Dockerfile whenever requirements.txt changes.
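
One way to tie the image version to requirements.txt, sketched here as a Dockerfile fragment with placeholder values:

```dockerfile
# Illustrative versioning scheme (ARG/LABEL values are placeholders):
# bump REQUIREMENTS_VERSION whenever requirements.txt changes so the
# published image label (and its tag) changes with it.
ARG REQUIREMENTS_VERSION=2025.06.26
LABEL org.opencontainers.image.version=${REQUIREMENTS_VERSION}
COPY requirements.txt /tmp/requirements.txt
RUN python -m pip install -r /tmp/requirements.txt
```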

@ssss141414 merged commit 6710400 into dev on Jun 26, 2025
1 check passed
@ssss141414 deleted the shzhen/docker branch on June 26, 2025 01:42
swatDong added a commit that referenced this pull request Aug 4, 2025
* nit

* add amd llm phi

* update parameters like isLLM

* add evalRuntime

* use runtime

* add back isGPURequired

* update

* update

* wrong phi

* use copy

* add execute ep

* fix model list (#255)

* update phi silica

* intel npu (#257)

* update intel npu

* fix

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* add og to amd

* nit

* Fangyangci/convert to intel npu (#256)

* resnet

* bert-base-multilingual-cased

* 1

* 1

* vit

* 1

* fix inference_sample for intel

* remove

* fix lf

* fix intel npu bugs (#259)

* fix inference_sample bug

* add intelNpu

* add intelnpu runtime

* fix lf

* fix lf

* llm intel (#261)

* fix inference_sample bug

* deepseek

* add intelNpu

* update optimum

* add intelnpu runtime

* fix lf

* fix lf

* llm intel model

* fix diff

* fix check

* code style

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* clip intel (#262)

* 1

* clip

* 1

* 1

* naming

* add "library": "transformers"

* \n

* \n

* \n

* \n

* \n

* redundant file

* use with open_ex (#264)

* use open_ex

* add comment

* remove dup

* fix clip copy

* update olive to latest

* onnxruntime 1.22.0

* onnxruntime does not have 1.22.0 for windows x64, weird

* remove QNN in readme

* update more readme

* forget

* add check

* use default name pixel_values

* rename to ov_model_st_quant

* fix

* default qnn

* strange name for clip

* update olive; rollback qnn

* update ov name

* add deps to resnet

* gpu down back to 1.21.0

* fix

* change workload profile

* loginrequiredmodelids (#273)

* Change nonllm model wcr sample code. (#274)

* change wcr sample

* fix comments

* intel evaluations (#276)

* vit

* fix default size

* fix sanitize.py

* update wcr for evaluation (#277)

* add olive, genai to wcr

* update WCR

* personal update 377e233de4814b1bc92e173d6dbb503f1f94dc04

* intel use cpu version

* add npu to open vino

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* add genai and fix sample (#275)

* Hualxie/intel wcr (#278)

* update olive version

* update py

* all use new olive

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* evaluations (#279)

* vit

* fix default size

* intel bert

* fix intel bert

* fix intel bert size

* clip

* run sanitize.py

* update copy, intel bert for intel

* simplify ov sample

* fix bert

* rollback for label incorrectness

* add evaluate / scikit-learn

* google bert

* google bert

* remove unused

* 1

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* qnn still use official

* update test

* Hualxie/add qnn (#283)

* add vit qnn

* add google bert

* add intel bert

* update resnet

* rename qdq

* nit

* 512

* fix data_config

* 512

* fix data_config again

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* update clip 16

* add clip32

* add clip32, laion

* fix

* fix

* change to genai_winml for qnn llm (#285)

* change genai winml version

* Hualxie/more fixes (#288)

* fix samples

* olive-ai==0.9.1

* use AutoImageProcessor

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* use mini-imagenet to align with training data

* use mini

* intel clip accuracy (#287)

* intel clip accuracy

* fix metrics

* add requirement

* fix laion bug

* handle transpose

* 4.48 work for resnet auto processor (#290)

* 4.48 work for resnet auto processor

* add use_fast = False

* add transpose for amd

* remove clean_cache

* fix onnx

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* Fix llama sample and bert recipe. (#291)

* fix llama sample

* fix intel bert and google bert max_length

* fix

* add EP and check

* remove unused & update error logic (#294)

Co-authored-by: hualxie <hualxie@microsoft.com>

* change genai sample (#292)

* change genai sample

* remove unused statement

* test (#295)

Co-authored-by: hualxie <hualxie@microsoft.com>

* Hualxie/add deps (#296)

* update install_freeze

* update

* all use separate installation

* comment

* revert

* update

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* use final url (#297)

Co-authored-by: hualxie <hualxie@microsoft.com>

* remove transformers in system

* to lf

* lf

* install separately (#299)

Co-authored-by: hualxie <hualxie@microsoft.com>

* clean up reqs & all lf (#300)

* clean up reqs

* all lf

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* update torch version to support RTX50** (#301)

* update torch version to support RTX50**

* fix download

* fix url

* update other version

* fix version

* 1

* fix version

* fix version

* use 2.6.0 in intelNPU

* fix NvidiaGpu-AutoGptq

* 1.22.0 & 0.8.0 (#302)

Co-authored-by: hualxie <hualxie@microsoft.com>

* unify workflow name (#303)

* add

* add phi3.5

* add inference_model.json

* fix all inference_model.json

* Hualxie/passes check (#306)

* add olive pass check

* test some

* nit

* add more

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* Revert "Merge pull request #304 from microsoft/fangyangci/addInferenceModel" (#308)

This reverts commit cf1a9f6, reversing changes made to 31e5ed8.

* Hualxie/more config check against Olive (#307)

* check more

* more

* more

* more

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* Hualxie/update and comment pass check (#309)

* they use default value, clean up

* comment

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* add contribute/get-started (#311)

* add  openai/clip-vit-large-patch14

* data

* revert

* in the middle

* use debugInfo

* update

* some thoughts

* remove empty debugInfo

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* fix error

* fix sanitize.py print

* change name

* revert

* fix mistake

* remove

* add version in modelproject

* modelinfo version sync

* do not exit when error

* revert some change

* move errors to end

* fix mistake

* do not exit when error occurs

* fix naming

* revert miss merge

* Update readmes (#315)

* Update readmes

* fix

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* fix sanitize print (#313)

* fix error

* fix sanitize.py print

* change name

* revert

* fix mistake

* do not exit when error occurs

* fix naming

* revert miss merge

* fix

* revert test code

* use default version -1

* Hualxie/update contribute guide (#318)

* add  openai/clip-vit-large-patch14

* data

* update docs

* check if exist

* nit

* remove openai/clip-vit-large-patch14

* already set from config

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* revert to open load model in playground (#323)

* update to 1.22.0.post1 (#322)

Co-authored-by: hualxie <hualxie@microsoft.com>

* add name to templates

* updates

* add name

* all lf

* nit

* remove

* Dockerfile for cloud conversion of LLM models (#319)

* gpu dockerfile

* update docker files

* intel docker image

* reuse

* fix

* add readme

* Refactor the code in `sanitize.py`

* 1

* 1

* use displayName (#325)

* use displayName

* update

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* manual fix

* manual fix

* add format

* remove import in __init__.py

* add print tip

* move auto formatter to file

* try fix rename diff

* backup: rename original sanitize.py to sanitize_old.py

* 1

* try fix rename

* backup: rename original sanitize.py to sanitize_old.py

* add new sanitize.py

* add comment

* feat: add more checks (#329)

* dump checks to file

* commit

* update

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* feat: add llm eval config (#328)

* add LLM Evaluator Template

* fix

* use fallbackValue

* add description

* -

* update

* merge

* clean

* fix

* add req

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* runtime = ep+device

* add runtime in passes

* add phi 4 mini for open vino (#327)

* add phi 4 mini for open vino

* fix

* update transformers for phi4

* ?

* use features

* revert

* nit

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* use action

* rename

* remove any

* fix naming

* add display name in conversion

* fix naming

* phi4

* fix naming

* fix naming

* fix naming  check path

* remove readonly

* fix tab

* rtx recipe

* fix install_freeze

* delete copy

* fix comments

* fix comments

* fix conflicts

* fix name

* fix name

* rename intel recipe
remove reuse_cache to fix intelGpu/intelNpu cache model name mismatch

* fix recipe name

* Add biceps (#321)

* revert

* update image

* delete dockerfile

* add pyEnvRuntimeFeatures (#334)

Co-authored-by: hualxie <hualxie@microsoft.com>

* update resource (#337)

* add dml recipes; update onnxruntime-genai-winml==0.8.3 (#336)

* will it work?

* add bert dml

* ignore

* ?

* copy OrtTransformersOptimization

* correct target

* add llm ones

* update data

* add latency

* ds

* llama

* qwen

* add others

* update

* nit

* update sanitize

* DirectML

* rename

* 0.8.3

* add pyEnvRuntimeFeatures

* add eval nightly for olive

* vit

* add more

* add clips

* save_as_external_data = true

* more samples

* clean up

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* new recipe for Mistral-7B-Instruct-v0.3

* remove usecache

* remove dml

* feat: write line endings (#339)

* use []

* updated

* nit

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* feat: fix clip (#341)

* need another pr

* remove Hide models

* fix

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* add qwen2.5 7b

* feat: add line endings (#344)

* line endings

* .

* add back for 6033 error

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* fix

* add qwen other models

* fix name

* wirh

* remove status and hide

* revert

* Update cloud conversion bicep workload (#347)

* update bicep

* update wcr

* sort

* add GetSortKey

* add deepseek

* fix

* add requirements.txt

* fix

* default index

* remove genai

* fix string

* update readme for DML

* phi4

* fix README.md

* feat: finalize project update scenario (#352)

* version consideration

* align files

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

* phi3-mini

* add left intel gpu

* remove some mistake

* format inference_model.json (#355)

Co-authored-by: hualxie <hualxie@microsoft.com>

* add DisplayNameToRuntimeRPC map (#354)

* add DisplayNameToRuntimeRPC map

* rename

---------

Co-authored-by: hualxie <hualxie@microsoft.com>

---------

Co-authored-by: hualxie <hualxie@microsoft.com>
Co-authored-by: Charles Zhang <progzhangchao@163.com>
Co-authored-by: Yue Sun <yuesu@microsoft.com>
Co-authored-by: xieofxie <xieofxie@126.com>
Co-authored-by: fangyangci <133664123+fangyangci@users.noreply.github.com>
Co-authored-by: Yue Sun <2015.apro@gmail.com>
Co-authored-by: Chao Zhang <zhangchao@microsoft.com>
Co-authored-by: fangyangci <fangyangci@microsoft.com>
Co-authored-by: ssss141414 <407748083@qq.com>