[Snippets][CPU] Added Brgemm FP32 blocking support by dynamic K, N dimensions #25745

a-sidorova · 2024-07-26T07:43:16Z

Details:

Added support for updating the K and N dimensions of a Brgemm block in BrgemmKernelExecutor::update_config
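To illustrate the idea, here is a minimal sketch of recomputing a block config once the runtime shapes are known. It is not the actual BrgemmKernelExecutor code: `BlockConfig`, `block_size_at` and the free function `update_config` are hypothetical names, and the sketch assumes row-major, unpacked inputs and that beta is 0 for the first K block and 1 afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a runtime-updatable kernel config (not the real OpenVINO type).
struct BlockConfig {
    int64_t M = 0, N = 0, K = 0;        // sizes of the current block
    int64_t LDA = 0, LDB = 0, LDC = 0;  // leading dimensions of A, B and C
    float beta = 0.f;                   // C = beta * C + A * B
};

// Size of the block that starts at `offset` in a dimension of `full` elements.
// The last block may be smaller than `block` (tail).
inline int64_t block_size_at(int64_t full, int64_t block, int64_t offset) {
    return std::min(block, full - offset);
}

// Recompute the config for the block at (n_off, k_off) once M, N, K are known at runtime.
// Assumption: A is MxK, B is KxN, C is MxN, all row-major and not repacked.
inline void update_config(BlockConfig& cfg,
                          int64_t M, int64_t N, int64_t K,
                          int64_t n_blk, int64_t k_blk,
                          int64_t n_off, int64_t k_off) {
    cfg.M = M;
    cfg.N = block_size_at(N, n_blk, n_off);
    cfg.K = block_size_at(K, k_blk, k_off);
    cfg.LDA = K;
    cfg.LDB = N;
    cfg.LDC = N;
    // Assumption: the first K block overwrites C, later blocks accumulate into it.
    cfg.beta = (k_off == 0) ? 0.f : 1.f;
}
```

For example, with N = 100, K = 70 and blocks of 64 and 32, the block at offsets (64, 64) gets N = 36, K = 6 and beta = 1.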

Tickets:

147852

Prerequisites:

[Snippets] Added single evaluation of Brgemm in Tail Loop by dynamic M #25378

a-sidorova

Open question for discussion: should we remove the SetBrgemmBeta pass? This PR adds beta initialization support to BrgemmKernelExecutor::update_config so that the value can be updated at runtime for dynamic shapes. It also covers the static pipeline.
I have some thoughts:

I'd like to remove this pass altogether. However, it's used by the tpp brgemm emitter, so I decided to leave it for now.
There is an idea to set beta = dynamic_value for dynamic cases, because this value is stored in the Brgemm node, while I update it only in BrgemmKernelExecutorConfig and don't touch Brgemm::beta 🤔

I think we should discuss this point.

UPD:
Discussed offline and decided to move beta to tpp brgemm and to create a dummy empty pass that forces the first iteration for the CPU case (the SetBrgemmBeta pass stays for tpp).
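For context on why the first iteration is special, here is a small reference sketch of a GEMM blocked along K. It assumes the usual convention C = beta * C + A * B, with beta = 0 on the first K block (so uninitialized C is overwritten) and beta = 1 on the following blocks (so partial sums accumulate). The function names are hypothetical and this is a scalar reference, not the JIT kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reference block kernel: C = beta * C + A_blk * B_blk (row-major).
inline void gemm_block(const float* A, const float* B, float* C,
                       size_t M, size_t N, size_t K,
                       size_t lda, size_t ldb, size_t ldc, float beta) {
    for (size_t m = 0; m < M; ++m) {
        for (size_t n = 0; n < N; ++n) {
            // With beta == 0 the old value of C is never read.
            float acc = (beta == 0.f) ? 0.f : beta * C[m * ldc + n];
            for (size_t k = 0; k < K; ++k)
                acc += A[m * lda + k] * B[k * ldb + n];
            C[m * ldc + n] = acc;
        }
    }
}

// MxK times KxN matmul, blocked along K with block size k_blk (must be > 0).
inline std::vector<float> matmul_blocked_by_k(const std::vector<float>& A,
                                              const std::vector<float>& B,
                                              size_t M, size_t N, size_t K, size_t k_blk) {
    std::vector<float> C(M * N, -1.f);  // deliberately not zero-initialized
    for (size_t k0 = 0; k0 < K; k0 += k_blk) {
        const size_t k = std::min(k_blk, K - k0);  // the tail block may be smaller
        const float beta = (k0 == 0) ? 0.f : 1.f;  // first block overwrites, the rest accumulate
        gemm_block(A.data() + k0, B.data() + k0 * N, C.data(), M, N, k, K, N, N, beta);
    }
    return C;
}
```

Because only the first block uses a different beta, that iteration has to be handled separately from the rest of the K loop, whether beta is set by a pass or at runtime.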

src/plugins/intel_cpu/src/emitters/snippets/x64/kernel_executors/brgemm.cpp

src/common/snippets/include/snippets/lowered/pass/insert_specific_iterations.hpp

src/plugins/intel_cpu/tests/unit/snippets_transformations/x64/lowered/brgemm_blocking.cpp

src/plugins/intel_cpu/src/emitters/snippets/x64/kernel_executors/brgemm.cpp

src/common/snippets/include/snippets/lowered/pass/iter_handler.hpp

v-Golubev

Great work 👍

v-Golubev · 2024-08-01T22:26:10Z

src/plugins/intel_cpu/src/transformations/snippets/x64/pass/lowered/brgemm_cpu_blocking.hpp

+    class DummyPass : public snippets::lowered::pass::RangedPass {
+    public:
+        DummyPass() = default;
+        OPENVINO_RTTI("DummyPass", "RangedPass")
+        bool run(snippets::lowered::LinearIR& linear_ir,
+                 snippets::lowered::LinearIR::constExprIt begin,
+                 snippets::lowered::LinearIR::constExprIt end) override;
+        std::shared_ptr<snippets::lowered::pass::PassBase> merge(const std::shared_ptr<snippets::lowered::pass::PassBase>& other) override;
+    };
+


The main remaining concern is that we have to keep DummyPass in the CPU blocking pass (and SetBrgemmBeta in the TPP one) in the public section just because we need these passes to build the reference LIRs in tests. Ideally, these iteration handlers should be in the private section.

However, there is no obvious solution for that, so we can probably return to this discussion later. I don't think this blocks the merge.
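As a rough illustration of what an empty iteration handler of this kind could look like, here is a self-contained sketch. `PassBase` and `NoOpPass` are simplified, hypothetical stand-ins and not the snippets classes; the real RangedPass::run takes a LinearIR and an iterator range, and the merge rule shown is an assumption.

```cpp
#include <cassert>
#include <memory>

// Simplified, hypothetical pass interface (the real one operates on a LinearIR range).
struct PassBase {
    virtual ~PassBase() = default;
    // Returns true if the pass changed anything.
    virtual bool run() = 0;
    // Returns a pass equivalent to running both, or nullptr if they cannot be merged.
    virtual std::shared_ptr<PassBase> merge(const std::shared_ptr<PassBase>& other) = 0;
};

// A handler that does nothing; it exists only so that an iteration gets registered.
struct NoOpPass : PassBase {
    bool run() override { return false; }

    std::shared_ptr<PassBase> merge(const std::shared_ptr<PassBase>& other) override {
        // Assumption: mergeable with "no pass" or with another NoOpPass only.
        if (!other || std::dynamic_pointer_cast<NoOpPass>(other))
            return std::make_shared<NoOpPass>();
        return nullptr;
    }
};
```

A test that builds a reference LIR would need to construct such a handler directly, which is why the class has to be reachable from outside the blocking pass.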

[Snippets][CPU] Supported Brgemm blocking by dynamic K

[Snippets][CPU] Added validation checks

[Snippets] fixed build

<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/upload-artifact/releases">actions/upload-artifact's releases</a>.</em></p> <blockquote> <h2>v4.3.4</h2> <h2>What's Changed</h2> <ul> <li>Update <code>@actions/artifact</code> version, bump dependencies by <a href="https://github.com/robherley"><code>@robherley</code></a> in <a href="https://redirect.github.com/actions/upload-artifact/pull/584">actions/upload-artifact#584</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/upload-artifact/compare/v4.3.3...v4.3.4">https://github.com/actions/upload-artifact/compare/v4.3.3...v4.3.4</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/actions/upload-artifact/commit/0b2256b8c012f0828dc542b3febcab082c67f72b"><code>0b2256b</code></a> Merge pull request <a href="https://redirect.github.com/actions/upload-artifact/issues/584">#584</a> from actions/robherley/bump-pkgs</li> <li><a href="https://github.com/actions/upload-artifact/commit/488dcefb9bf01619ac19bad29c5c5409a1e4dd4c"><code>488dcef</code></a> licensed cache</li> <li><a href="https://github.com/actions/upload-artifact/commit/04c51f57662651dd3333286989e2db1111c0fd07"><code>04c51f5</code></a> ncc</li> <li><a href="https://github.com/actions/upload-artifact/commit/32a9e276a8f8ac18b4b2dce8213ed340ed4e5ed8"><code>32a9e27</code></a> bump <code>@actions/artifact</code> and npm audit</li> <li><a href="https://github.com/actions/upload-artifact/commit/552bf3722c16e81001aea7db72d8cedf64eb5f68"><code>552bf37</code></a> new version</li> <li><a href="https://github.com/actions/upload-artifact/commit/79616d2ded92999fceefea2ca2e4bdf6101fa919"><code>79616d2</code></a> Merge pull request <a href="https://redirect.github.com/actions/upload-artifact/issues/565">#565</a> from actions/eggyhead/use-artifact-v2.1.6</li> <li>See full diff in <a 
href="https://github.com/actions/upload-artifact/compare/v4.3.3...0b2256b8c012f0828dc542b3febcab082c67f72b">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/upload-artifact&package-manager=github_actions&previous-version=4.3.3&new-version=4.3.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit fa949478e149f17cce514ebd0d019e8766ef249d Author: Karol Blaszczak <karol.blaszczak@intel.com> Date: Thu Aug 1 09:04:13 2024 +0200 [DOCS] rn fixes and model table (#25835) commit ba681ed72d7e30b2fe94e1cfc5a950a0bcf9bb54 Author: Wilson Seok <wilson.seok@intel.com> Date: Wed Jul 31 20:53:18 2024 -0700 [GPU] Rollback whlie-loop structure for 2nd stage of optimize all crops (#25737) ### Details: - Rollback while-loop structure for 2nd stage of optimize all crops because it has regression for reshape case which has padding. ### Tickets: - 146653 commit 8cfd586e6128055b600e1abe9dcce263071dec7d Author: Eddy Kim <eddy.kim@intel.com> Date: Thu Aug 1 10:05:32 2024 +0900 [GPU] group_normalization for bfzyx (#25753) ### Details: - This PR updates the `group_normalization_bfyx` kernel to support bfzyx format. - Additionally, this PR fixes the output feature calculation logic of the group_norm_fsv16 kernel and a model caching related logic for dynamic model. 
### Tickets: - 147841 commit 13b3e4703e32053797099256849b78ebfef6d49c Author: Roman Kazantsev <roman.kazantsev@intel.com> Date: Thu Aug 1 01:44:49 2024 +0400 [TF FE] Stabilize Bitwise layer tests on all platforms and fix u16 bug (#25843) **Details:** Fix u16 bug "Tensor data with element type u16, is not representable as pointer to i32" **Ticket:** 122716 --------- Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com> commit d2ab797a0fff1f95ec9ea39e444798dbba499cf6 Author: Ilya Lavrenov <ilya.lavrenov@intel.com> Date: Wed Jul 31 23:22:43 2024 +0400 Fixed compilation with clang and libc++ (#25813) ### Details: - *item1* - *...* ### Tickets: - Closes https://github.com/openvinotoolkit/openvino/issues/25420 commit b26c533421b1ca3f3254df1de14300dbe928405b Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed Jul 31 21:01:11 2024 +0200 Update setuptools requirement from <72,>=65.6.1 to >=65.6.1,<73 in /src/bindings/python (#25792) Updates the requirements on [setuptools](https://github.com/pypa/setuptools) to permit the latest version. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pypa/setuptools/blob/main/NEWS.rst">setuptools's changelog</a>.</em></p> <blockquote> <h1>v72.1.0</h1> <h2>Features</h2> <ul> <li>Restore the tests command and deprecate access to the module. (<a href="https://redirect.github.com/pypa/setuptools/issues/4519">#4519</a>) (<a href="https://redirect.github.com/pypa/setuptools/issues/4520">#4520</a>)</li> </ul> <h1>v72.0.0</h1> <h2>Deprecations and Removals</h2> <ul> <li>The test command has been removed. Users relying on 'setup.py test' will need to migrate to another test runner or pin setuptools before this version. 
(<a href="https://redirect.github.com/pypa/setuptools/issues/931">#931</a>)</li> </ul> <h1>v71.1.0</h1> <h2>Features</h2> <ul> <li> <p>Added return types to typed public functions -- by :user:<code>Avasam</code></p> <p>Marked <code>pkg_resources</code> as <code>py.typed</code> -- by :user:<code>Avasam</code> (<a href="https://redirect.github.com/pypa/setuptools/issues/4409">#4409</a>)</p> </li> </ul> <h2>Misc</h2> <ul> <li><a href="https://redirect.github.com/pypa/setuptools/issues/4492">#4492</a></li> </ul> <h1>v71.0.4</h1> <h2>Bugfixes</h2> <ul> <li>Removed lingering unused code around Distribution._patched_dist. (<a href="https://redirect.github.com/pypa/setuptools/issues/4489">#4489</a>)</li> </ul> <h1>v71.0.3</h1> <h2>Bugfixes</h2>  </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pypa/setuptools/commit/441799f8b45a1a01c608db49333403db1b0d7100"><code>441799f</code></a> Bump version: 72.0.0 → 72.1.0</li> <li><a href="https://github.com/pypa/setuptools/commit/59aff448e79415ee3e491a8426553b373d7914e5"><code>59aff44</code></a> Merge pull request <a href="https://redirect.github.com/pypa/setuptools/issues/4522">#4522</a> from pypa/feature/graceful-drop-tests</li> <li><a href="https://github.com/pypa/setuptools/commit/c437aaa8d5b969a9fe8c8147463bfcb85b31ab26"><code>c437aaa</code></a> Restore the tests command and deprecate access to the module.</li> <li><a href="https://github.com/pypa/setuptools/commit/a6726b95f7a50dc5945e012050f00450c883fdcd"><code>a6726b9</code></a> Add celery and requests to the packages that test integration. 
Ref <a href="https://redirect.github.com/pypa/setuptools/issues/4520">#4520</a></li> <li><a href="https://github.com/pypa/setuptools/commit/5e1b3c414779317bc3e105d9bae82ce70c22dbf9"><code>5e1b3c4</code></a> Bump version: 71.1.0 → 72.0.0</li> <li><a href="https://github.com/pypa/setuptools/commit/4c0b9f3ee6ee47c597572655567f215c08c90137"><code>4c0b9f3</code></a> Merge pull request <a href="https://redirect.github.com/pypa/setuptools/issues/4458">#4458</a> from pypa/debt/remove-test-command</li> <li><a href="https://github.com/pypa/setuptools/commit/be8e3a09812f0a3717045098ac6ce7b52fc7d202"><code>be8e3a0</code></a> Merge pull request <a href="https://redirect.github.com/pypa/setuptools/issues/4507">#4507</a> from pypa/docs/4483-install-core-extra</li> <li><a href="https://github.com/pypa/setuptools/commit/99d2c722ca5d58ef1360ed86a3252cc16bd84dfd"><code>99d2c72</code></a> Add documentation clarifying how to reliably install setuptools with its depe...</li> <li><a href="https://github.com/pypa/setuptools/commit/63c89f93d6d43ff96ce5f7f5a862395f924905d0"><code>63c89f9</code></a> 👹 Feed the hobgoblins (delint).</li> <li><a href="https://github.com/pypa/setuptools/commit/c405ac1bf29b945db9af7ba9b0dd77e4d871f72a"><code>c405ac1</code></a> Merge branch 'main' into debt/remove-test-command</li> <li>Additional commits viewable in <a href="https://github.com/pypa/setuptools/compare/v65.6.1...v72.1.0">compare view</a></li> </ul> </details> <br /> Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
[//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com> Co-authored-by: Anastasia Kuporosova <anastasia.kuporosova@intel.com> commit a60140ef5c60f60304ad2a70ebff0f80f97cd51c Author: Dmitry Matveev <dmitry.matveev@intel.com> Date: Wed Jul 31 19:16:29 2024 +0100 Add NPUW to labeler (#25839) ### Details: - Mark changes under "src/plugins/intel_npu/src/plugin/npuw" with NPUW label ### Tickets: - n/a commit 
3a9464dc34900b8ee11249f6f56f7a3636a796c8 Author: Vladislav Golubev <vladislav.golubev@intel.com> Date: Wed Jul 31 20:01:30 2024 +0200 [Snippets] Support Brgemm with transposed_b via BrgemmCopyB (#24932) ### Details: - *Support FP32/BF16/I8 matmuls with transpose_b=true via BrgemmCopyB* - *BrgemmCopyB emitter: handle tail iteration by N before the main body* - *Remove workaround on LDB and N dim rounding in brgemm emitters and related buffers* ### Tickets: - *CVS-114487* ## TODO: - [ ] BufferAllocation test for FP32 brgemm with repacking - [ ] SetBrgemmCopyBBuffersShape tests - [ ] MHA with transpose B for low precisions (FP32 already exists) - [ ] FuseTransposeBrgemm tests commit f48b30aab7ae2bb05c9f3709f9398eefe17ff66f Author: Andrei Kashchikhin <andrey.kashchikhin@intel.com> Date: Wed Jul 31 18:39:31 2024 +0100 [CI] [GHA] Introduce additional Ubuntu versions via separate workflows (#25234) ### Details: - This is a sister PR to #25202, the idea is the same: test more Linux flavours. This PR adds Ubuntu 22/24 as separate workflows instead of a matrix used in #25202. - The approach with separate workflows seems better as it does not require unique names for artefacts for matrix jobs and dependent jobs thus making it easier to write and maintain w/o magic strings. ### Tickets: - *144917* commit 161fce5d380e6ab3bdf0dcc6109ea904f11672bd Author: Zlobin Vladimir <vladimir.zlobin@intel.com> Date: Wed Jul 31 20:01:50 2024 +0400 Update open-model_zoo submodule (#25826) commit 25455a0dd97d9c724522dab43f2a019e2a6643d0 Author: Ujjayant Kadian <118752727+ujjayant-kadian@users.noreply.github.com> Date: Wed Jul 31 16:28:45 2024 +0100 NPUW: Change the sub-byte (i4) element order in the unpack procedure to match OpenVINO 2024.0 (#25827) ### Details: In the latest versions of OpenVINO the sub-byte order is defined as [1,0] meaning that first (MSB) 4 bits of an 8-bit vector form 1st element, and the last (LSB) 4 bits of an 8-bit vector form 0th element. 
Our unpack procedures for i4 were aligned with the older representation, where sub-byte order was defined as [0,1] meaning that first (MSB) 4 bits of an 8-bit vector form 0th element, and the last (LSB) 4 bits were the 1st element. **Updated these unpack functions to use this new order.** ### Tickets: - *121052* commit 3e058b90a891fee9e707dd9c2859492fa5166f71 Author: Roman Lyamin <Roman.Lyamin@intel.com> Date: Wed Jul 31 18:45:15 2024 +0400 [GPU] Fix lws calculation for reorder_kernel_bfyx_to_blocked_format kernel (#25830) ### Tickets: - *[146165](https://jira.devtools.intel.com/browse/CVS-146165)* commit a5d82f2ebf15bb11b452a4027c6b7ae54ca2951c Author: Sebastian Golebiewski <sebastianx.golebiewski@intel.com> Date: Wed Jul 31 15:04:21 2024 +0200 [DOCS] Updating Edit Button for articles for master (#25832) Porting: https://github.com/openvinotoolkit/openvino/pull/25831 commit 98956aa41354f0402bc7e84ad993efef21cb8cf8 Author: Alexandra Sidorova <alexandra.sidorova@intel.com> Date: Wed Jul 31 16:54:52 2024 +0400 [CPU][RISCV64] Fixed onednn build for RVV case (#24151) ### Details: - *Missed include `primitive.hpp` in RVV pooling implementation* - *oneDNN PR: https://github.com/openvinotoolkit/oneDNN/pull/259* - *It's not seen in CI since OV is built with default `-march=rv64imafdc` - without vector intrinsic support. 
Need to build with RVV support (`-march=rv64gcv0p7`)* ### Tickets: - *N/A* commit 10620e9fd68cbfb2f6ae2a1298e6af8425367bfe Author: Sun Xiaoxia <xiaoxia.sun@intel.com> Date: Wed Jul 31 19:54:29 2024 +0800 Fix executor memory leak when "-nstreams 0" (#25778) ### Details: - *create executor config when streams=0* ### Tickets: - *146686* commit cae739b96354aff83945767d2fad094e03ebebce Author: Edward Shogulin <edward.shogulin@intel.com> Date: Wed Jul 31 12:28:41 2024 +0100 [LPT] Dequantization precision reusage (#25668) ### Details: - *NNCF quantized fp16 model on GPU support* ### Tickets: - *CVS-126300* commit 3e49c22ff76f55304ea2bb1a832fce8b2a04ea69 Author: Alexandra Sidorova <alexandra.sidorova@intel.com> Date: Wed Jul 31 15:24:23 2024 +0400 [Snippets] Added auto sorting of LoopPorts (#25623) ### Details: - *Added support of expression enumeration - new attribute `m_exec_num` of `Expression`. Calculated as `exec_num_left + (exec_num_right - exec_num_left) / 2`. Now we can figure out which expression is executed earlier than another using `m_exec_num O(1)` instead of `find(begin(), end(), smth) == end() O(n)`* - *Refactored LoopInfo interface: united all `update` and `replace` into one `replace_with_new_ports`.* - *Added auto sorting of ports in LoopInfo: after port replacing, new expression/node insertion using helpers - loop ports are automatically reordered by expression execution numbers* - *Removed previous workarounds with `GetTopologicalOrder` from tokenization pass* ### Tickets: - *113536* - *142990* - *137819* commit 89b49c10ca719505712b53cf44370dbdb3782fbc Author: Karol Blaszczak <karol.blaszczak@intel.com> Date: Wed Jul 31 13:12:50 2024 +0200 [DOCS] 24.3 archives and final touches (#25829) port: https://github.com/openvinotoolkit/openvino/pull/25828 commit f0d7cd8c22e2a994a4371cc5e15d6be33c9e6785 Author: Sebastian Golebiewski <sebastianx.golebiewski@intel.com> Date: Wed Jul 31 13:05:07 2024 +0200 [DOCS] Updating Tool Ecosystem article (#25824) Adding 
information on OpenVINO-based AI projects. Co-authored-by: Maciej Smyk <maciejx.smyk@intel.com> Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com> commit ea6731f8a75b907eea1ee9317c2cd89a2d54e4c4 Merge: 70b8346d72 3c713d4aec Author: Ujjayant Kadian <118752727+ujjayant-kadian@users.noreply.github.com> Date: Wed Jul 31 11:56:25 2024 +0100 Merge branch 'master' into uk/changing-sub-byte-i4-element-order commit 11c01898f507c1abb7d64d70f89ffcc281081373 Author: Roman Kazantsev <roman.kazantsev@intel.com> Date: Wed Jul 31 14:19:01 2024 +0400 [TF FE] Support TensorListConcatV2 operation for multiple undefined dims in element_shape (#25814) **Details:** Support TensorListConcatV2 operation for multiple undefined dims in element_shape **Ticket:** 105671 --------- Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com> commit 3c713d4aec23c825baa71fd524f93140bc928ce9 Author: Chen Xu <chen.xu@intel.com> Date: Wed Jul 31 17:32:10 2024 +0800 [CPU] Avoid rounding to zero for Reduce node in quantized models (#25766) ### Details: - *If the Reduce node has both input and output precision to be integers from the original model, then rounding to zero should be done before converting intermediate floating point value to integer.* - *However, if such integer precisions are resulted from quantization, then we should not do such rounding, in order to maintain accuracy.* - *Add corresponding test cases.* ### Tickets: - *CVS-147352*
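The "[Snippets] Added auto sorting of LoopPorts (#25623)" entry above describes giving each expression an execution number computed as `exec_num_left + (exec_num_right - exec_num_left) / 2`, so that relative order can be checked in O(1). A minimal sketch of that idea follows. `Expr`, `ExprList`, `insert_between` and `executes_before` are hypothetical names chosen for illustration and are not the Snippets API; the key is assumed to be a `double`.

```cpp
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for a LinearIR expression; only the ordering key matters here.
struct Expr {
    std::string name;
    double exec_num;  // assumption: a floating-point ordering key
};

using ExprList = std::list<Expr>;

// Inserts a new expression before `pos` and assigns it the midpoint of its
// neighbours' execution numbers: left + (right - left) / 2.
// Assumes `pos` is neither begin() nor end(), so both neighbours exist.
inline ExprList::iterator insert_between(ExprList& ir, ExprList::iterator pos, const std::string& name) {
    const double left = std::prev(pos)->exec_num;
    const double right = pos->exec_num;
    return ir.insert(pos, Expr{name, left + (right - left) / 2});
}

// O(1) ordering check: compares the keys instead of searching the list.
inline bool executes_before(const Expr& a, const Expr& b) {
    return a.exec_num < b.exec_num;
}
```

Repeated insertion at the same position would eventually exhaust floating-point precision, so a real implementation presumably needs some renumbering fallback; that part is not shown here.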

### Details:
- *The PR enables dynamic FP32 MHA tokenization on x64 platforms 🎉*
- *`std::vector::resize()`, which was used to allocate the buffer scratchpad, is a very expensive operation because it value-initializes every new element. This PR replaces `std::vector::resize()` with CPU Node Scratchpad memory, which can be shared between nodes. Also, since each thread must have its own scratchpad memory, we used to allocate `size * threads_max`, although the thread count at execution time can be smaller (it depends on the amount of parallel work). Now we allocate only `size * n_threads`, where `n_threads` is the actual number of working threads.*
- *Fixed validation of dimension K in the `BrgemmBlocking` pass: one of the inputs can have a dynamic value of this dimension*
- *Fixed `utils::broadcast_merge_dim()` and added support for broadcasting of integer values in IterHandlers. Added unit tests for `utils::broadcast_merge_dim()`*

### Tickets:
- *149900*

### Prerequisites:
- [x] #25326
- [x] #25378
- [x] #25623
- [x] #25638
- [x] #25745
- [x] #25957
- [x] #25733
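The scratchpad sizing described above (one shared allocation of `size * n_threads` instead of `size * threads_max`, with each thread owning its own slice) can be sketched roughly as follows. This is only an illustration under stated assumptions: `working_threads`, `scratchpad_bytes` and `thread_slice` are hypothetical helpers, not functions from the CPU plugin, and the rule used to derive the working thread count from the work amount is assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical: number of threads that actually receive work.
// Assumption: at most one thread per unit of parallel work, at least one thread.
inline std::size_t working_threads(std::size_t work_amount, std::size_t threads_max) {
    return std::max<std::size_t>(1, std::min(work_amount, threads_max));
}

// Hypothetical: total bytes to request from the shared scratchpad.
inline std::size_t scratchpad_bytes(std::size_t size_per_thread, std::size_t n_threads) {
    return size_per_thread * n_threads;
}

// Hypothetical: start of the slice owned by thread `ithr`.
// Slices are laid out back to back, so they never overlap.
inline std::uint8_t* thread_slice(std::uint8_t* base, std::size_t size_per_thread, std::size_t ithr) {
    return base + ithr * size_per_thread;
}
```

With 256 bytes per thread, 16 available threads and only 4 units of parallel work, this sketch requests 1024 bytes rather than 4096.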

a-sidorova added this to the 2024.4 milestone Jul 26, 2024

github-actions bot added the category: CPU OpenVINO CPU plugin label Jul 26, 2024

a-sidorova commented Jul 26, 2024

a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from 0b9e837 to 9176ba5 Compare July 26, 2024 11:37

a-sidorova marked this pull request as ready for review July 26, 2024 11:37

a-sidorova requested review from a team as code owners July 26, 2024 11:37

a-sidorova assigned v-Golubev and IvanNovoselov Jul 26, 2024

a-sidorova mentioned this pull request Jul 26, 2024

[Snippets][CPU] Enabled dynamic MHA FP32 tokenization on x64 #25500

Merged

7 tasks

v-Golubev reviewed Jul 31, 2024

src/plugins/intel_cpu/src/emitters/snippets/x64/kernel_executors/brgemm.cpp Outdated

src/plugins/intel_cpu/src/emitters/snippets/x64/kernel_executors/brgemm.cpp

a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch 2 times, most recently from 28a5f4b to b1eedba Compare August 1, 2024 06:47

a-sidorova requested a review from v-Golubev August 1, 2024 06:48

a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from b1eedba to bd4225b Compare August 1, 2024 06:53

v-Golubev reviewed Aug 1, 2024

a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from 6071cee to c2dda80 Compare August 1, 2024 12:39

a-sidorova requested a review from v-Golubev August 1, 2024 12:41

a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from c2dda80 to 05909aa Compare August 1, 2024 13:14

v-Golubev approved these changes Aug 1, 2024

dmitry-gorokhov approved these changes Aug 2, 2024

dmitry-gorokhov added this pull request to the merge queue Aug 2, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 2, 2024

dmitry-gorokhov added this pull request to the merge queue Aug 2, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 2, 2024

a-sidorova added 4 commits August 2, 2024 15:09

[Snippets][CPU] Supported Brgemm blocking by dynamic N and K

705be2c

[Snippets][CPU] Supported Brgemm blocking by dynamic K [Snippets][CPU] Added validation checks [Snippets] fixed build

[Snippets][CPU] Moved beta from Brgemm to BrgemmTPP

fb920c2

[Snippets][CPU] Made simpler beta initialization in brgemm executor

1ad9ad4

[Snippets] Applied Vladislav comments

98a29c0

a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from 05909aa to 98a29c0 Compare August 2, 2024 11:09

dmitry-gorokhov enabled auto-merge August 2, 2024 11:10

dmitry-gorokhov added this pull request to the merge queue Aug 2, 2024

Merged via the queue into openvinotoolkit:master with commit b2319a5 Aug 2, 2024
136 checks passed

dmitry-gorokhov deleted the feature/snippets/dynamism/blocking_by_k_n branch August 2, 2024 15:02
