
Refined QDQ recipes of BERT/CLIP/VIT for QC and AMD. #1797


Closed
tezheng wants to merge 4 commits from the zhengte/qdq_bert_clip_vit branch

Conversation

@tezheng (Contributor) commented Apr 27, 2025

Describe your changes

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

@github-advanced-security (bot) left a comment

lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Contributor

Since these are all new files, are we still keeping the old QDQ config JSONs?

Contributor

Bumping this again. Are we keeping both the old and new QDQ configs?

Contributor

We have already released parts of these in AITK, and I created PR #1893 to align the recipes. It covers the parts of this PR: CLIP text + vision, BERT, and ViT QDQ.
Please take a look @jambayk, CC @tezheng.

I updated the previous QDQ configs directly.

Contributor

For the others, we could take time to merge if needed.

@tezheng force-pushed the zhengte/qdq_bert_clip_vit branch from b38f396 to 1d827ae on April 29, 2025 06:44

subset = kwargs.get("subset")
split = kwargs.get("split", "train")
ds = load_dataset(dataset, name=subset, split=split)

Check warning

Code scanning / lintrunner

PYLINT/W0621 Warning

Redefining name 'ds' from outer scope (line 149) (redefined-outer-name)
See redefined-outer-name.
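For reference, a minimal sketch of one way to silence the W0621 warning is to rename the local variable so it no longer shadows the module-level ds; the name raw_ds below is only an illustrative choice, not something from this PR:

    subset = kwargs.get("subset")
    split = kwargs.get("split", "train")
    # A local name that does not shadow the module-level `ds` defined later in the script.
    raw_ds = load_dataset(dataset, name=subset, split=split)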

# Save the dataset to CSV
data_frame = ds.to_pandas()
print(f"Dataset size: {len(data_frame)}")
print(data_frame.head())

data_frame.to_csv(args.output_csv, index=False)
print(f"Dataset saved to {args.output_csv}")

Check warning

Code scanning / lintrunner

RUFF/T201 Warning (flagged on each of the three print calls)
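If the T201 warnings were to be addressed rather than waived, one common option is to route these messages through logging instead of print. This is only a sketch of that option, not something done in this PR:

    import logging

    logger = logging.getLogger(__name__)

    # Save the dataset to CSV
    data_frame = ds.to_pandas()
    logger.info("Dataset size: %d", len(data_frame))
    logger.info("First rows:\n%s", data_frame.head())

    data_frame.to_csv(args.output_csv, index=False)
    logger.info("Dataset saved to %s", args.output_csv)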

@tezheng force-pushed the zhengte/qdq_bert_clip_vit branch from 5860e49 to b8e068d on April 29, 2025 16:53
@tezheng enabled auto-merge (squash) on April 30, 2025 05:48
@tezheng (Contributor, Author) commented Apr 30, 2025

@jambayk I think the lint errors are false alarms, do you have any ideas/suggestions?

@jambayk (Contributor) commented Apr 30, 2025

> @jambayk I think the lint errors are false alarms, do you have any ideas/suggestions?

@tezheng There were some lint rule and requirement updates on main. Please run lintrunner init to pick up the latest. For this PR, I have pushed a commit with the lint fixes.

@Registry.register_post_process()
def bert_scl_post_process(outputs):
"""Post-processing for Sequence Classification task."""
match outputs:
Contributor

Olive still supports Python 3.9, so this syntax is invalid; lint fails because of this.
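For reference, a minimal Python 3.9-compatible sketch of the same dispatch without match/case; the branch conditions and the logits handling below are assumptions for illustration, since the original match arms are not visible in this hunk:

    # Registry here is the same registry used in the quoted snippet.
    @Registry.register_post_process()
    def bert_scl_post_process(outputs):
        """Post-processing for Sequence Classification task."""
        # match/case requires Python >= 3.10; plain isinstance/attribute checks run on 3.9.
        if isinstance(outputs, dict) and "logits" in outputs:
            logits = outputs["logits"]
        elif hasattr(outputs, "logits"):
            logits = outputs.logits
        else:
            logits = outputs  # assume raw logits were passed through
        return logits.argmax(-1)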

@tezheng force-pushed the zhengte/qdq_bert_clip_vit branch from 20a6c6f to b4d82b1 on May 2, 2025 13:39
{ "surgeon": "MatMulAddToGemm" }
]
},
"transformer_optimizer": {
Contributor

I think this might not be required? With opset 20, the model gets exported with the LayerNorm operator, and quant_preprocess in ONNX static quantization performs Gelu fusion automatically.

"type": "GraphSurgeries",
"surgeries": [
{ "surgeon": "ReplaceAttentionMaskValue", "replacement": -100.0 },
{ "surgeon": "MatMulAddToGemm" }
@jambayk (Contributor) May 6, 2025

Can you also add this new surgery, "PowReduceSumPowDiv2LpNorm", for the CLIP models?
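For illustration, the GraphSurgeries pass for the CLIP recipes would then look roughly like this (written as a Python dict, although the actual recipes are JSON files; the third surgeon is the reviewer's request, not something already in the PR):

    graph_surgeries_pass = {
        "type": "GraphSurgeries",
        "surgeries": [
            {"surgeon": "ReplaceAttentionMaskValue", "replacement": -100.0},
            {"surgeon": "MatMulAddToGemm"},
            # Requested addition for the CLIP models:
            {"surgeon": "PowReduceSumPowDiv2LpNorm"},
        ],
    }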

"type": "GraphSurgeries",
"surgeries": [
{ "surgeon": "ReplaceAttentionMaskValue", "replacement": -100.0 },
{ "surgeon": "MatMulAddToGemm" }
Contributor

This works well for QC; not sure if it's recommended for AMD.

"model_script": "google_bert_script.py"
},
"passes": {
"conversion": { "type": "OnnxConversion", "target_opset": 20, "dynamic": true, "use_dynamo_exporter": false },
Contributor

dynamic and use_dynamo_exporter options are not required
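As a sketch, and assuming the exporter defaults cover the rest, the conversion pass could then be trimmed to just the opset (shown as a Python dict; the recipe itself is JSON):

    conversion_pass = {"type": "OnnxConversion", "target_opset": 20}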

return Dataset.from_pandas(pd.DataFrame(data))


def parse_args():
Contributor

do we need this and the main part?

"model_type": "bert",
"opt_level": 1,
"optimization_options": {
"enable_gelu": false,
Contributor

this will not compile on AMD NPU

@tezheng force-pushed the zhengte/qdq_bert_clip_vit branch from e259c65 to 5e394d6 on May 14, 2025 02:43
)
)

print("Text encoding latency", model.text_model.latency)
print("Image encoding latency", model.vision_model.latency)

Check warning

Code scanning / lintrunner

RUFF/T201 Warning (flagged on both print calls)

args.tokenizer,
)

print("Evaluation result", format_output(result))

Check warning

Code scanning / lintrunner

RUFF/T201 Warning


def prepare_session(
    self,
    inference_settings: Optional[dict[str, Any]] = None,
    device: Device = Device.CPU,
    execution_providers: Optional[Union[str, list[str]]] = None,
    rank: Optional[int] = None,

Check warning

Code scanning / lintrunner

RUFF/UP007 Warning (flagged on each Optional/Union annotation)
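UP007 asks for the PEP 604 X | Y union syntax. Since Olive still supports Python 3.9, one way to satisfy the rule without breaking 3.9 is to defer annotation evaluation; the following is only a sketch of that option, with the import path for Device assumed for illustration:

    from __future__ import annotations  # lets PEP 604 unions appear in annotations on Python 3.9

    from typing import Any

    from olive.hardware import Device  # assumed import path for the Device enum used in the signature


    class PyTorchModelHandlerSketch:  # illustrative stand-in for the real handler class
        def prepare_session(
            self,
            inference_settings: dict[str, Any] | None = None,
            device: Device = Device.CPU,
            execution_providers: str | list[str] | None = None,
            rank: int | None = None,
        ):
            """Annotation style only; the real method body is omitted here."""
            ...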

**model.config.to_dict(),
**(self.model_attributes or {}),
}
except Exception as _:

Check notice

Code scanning / CodeQL

Empty except Note

'except' clause does nothing but pass and there is no explanatory comment.

Copilot Autofix

To fix the issue, we should handle the exception appropriately by logging the error. This ensures that any failure during the assignment to self.model_attributes is recorded, aiding in debugging and maintaining transparency. If the exception is non-critical, we can log it as a warning or info message. Additionally, we should add a comment explaining why the exception is being handled in this way.

Suggested changeset 1: olive/model/handler/pytorch.py

diff --git a/olive/model/handler/pytorch.py b/olive/model/handler/pytorch.py
--- a/olive/model/handler/pytorch.py
+++ b/olive/model/handler/pytorch.py
@@ -182,4 +182,5 @@
             }
-        except Exception as _:
-            pass
+        except Exception as e:
+            # Log the exception to ensure visibility while allowing the program to continue.
+            logger.warning("Failed to set model attributes from model.config: %s", str(e))
@jambayk (Contributor) commented Jun 16, 2025

Closing this as it has gone stale and the new PRs related to it have merged or are open.

@jambayk closed this Jun 16, 2025
auto-merge was automatically disabled June 16, 2025 21:19

Pull request was closed

@jambayk deleted the zhengte/qdq_bert_clip_vit branch June 20, 2025 18:19