[Snippets][CPU] Added Brgemm FP32 blocking support by dynamic K, N dimensions #25745

@a-sidorova a-sidorova commented Jul 26, 2024

Details:

  • Added support for updating the K and N dimensions of a Brgemm block in BrgemmKernelExecutor::update_config

Tickets:

  • 147852

Prerequisites:

  • https://github.com/openvinotoolkit/openvino/pull/25378

@a-sidorova a-sidorova added this to the 2024.4 milestone Jul 26, 2024
@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Jul 26, 2024
@a-sidorova a-sidorova left a comment

Open question for discussion: should we remove the SetBrgemmBeta pass? This PR adds beta initialization support to BrgemmKernelExecutor::update_config so that the value can be updated at runtime for dynamic shapes. It also covers the static pipeline.
I have some thoughts:

  • I'd like to remove this pass in general. However, it's used by the TPP Brgemm emitter, so I decided to leave the pass for now.
  • There is an idea to set beta = dynamic_value for dynamic cases, because beta is stored in the Brgemm node while I update it only in BrgemmKernelExecutorConfig and ignore Brgemm::beta 🤔

I think we should discuss this.

UPD:
Discussed offline and decided to move beta to the TPP Brgemm and to create a dummy empty pass to force the first iteration for the CPU case (leaving the SetBrgemmBeta pass for TPP).
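
For readers unfamiliar with why beta matters when blocking by K, here is a minimal sketch. It is not the OpenVINO implementation: `gemm_block` and `blocked_matmul` are hypothetical names, and the naive loops only stand in for a real Brgemm kernel. It assumes row-major FP32 matrices and block sizes that are known only at runtime. The point it illustrates is that the first K block must run with beta = 0 (overwrite C) and every later K block with beta = 1 (accumulate into C).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one Brgemm kernel call: C = beta * C + A * B on a block.
// A is M x K (leading dimension lda), B is K x N (ldb), C is M x N (ldc), all row-major.
void gemm_block(const float* A, const float* B, float* C,
                std::size_t M, std::size_t N, std::size_t K,
                std::size_t lda, std::size_t ldb, std::size_t ldc, float beta) {
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            float acc = 0.f;
            for (std::size_t k = 0; k < K; ++k)
                acc += A[m * lda + k] * B[k * ldb + n];
            // With beta == 0 the old contents of C are ignored (they may be uninitialized).
            C[m * ldc + n] = (beta == 0.f ? 0.f : beta * C[m * ldc + n]) + acc;
        }
    }
}

// Blocking by N and K where N, K and the block sizes are runtime values.
void blocked_matmul(const std::vector<float>& A, const std::vector<float>& B, std::vector<float>& C,
                    std::size_t M, std::size_t N, std::size_t K,
                    std::size_t n_blk, std::size_t k_blk) {
    for (std::size_t n0 = 0; n0 < N; n0 += n_blk) {
        const std::size_t n_cur = std::min(n_blk, N - n0);  // the tail block may be smaller
        for (std::size_t k0 = 0; k0 < K; k0 += k_blk) {
            const std::size_t k_cur = std::min(k_blk, K - k0);
            // The first K iteration initializes C, the remaining ones accumulate.
            const float beta = (k0 == 0) ? 0.f : 1.f;
            gemm_block(A.data() + k0, B.data() + k0 * N + n0, C.data() + n0,
                       M, n_cur, k_cur, K, N, N, beta);
        }
    }
}
```

Calling `blocked_matmul(A, B, C, M, N, K, n_blk, k_blk)` with a `C` that holds arbitrary values still yields the plain product, because the beta = 0 iteration discards the old contents. This is presumably why the first K iteration needs a distinct handler even when that handler does nothing else.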

@a-sidorova a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from 0b9e837 to 9176ba5 Compare July 26, 2024 11:37
@a-sidorova a-sidorova marked this pull request as ready for review July 26, 2024 11:37
@a-sidorova a-sidorova requested review from a team as code owners July 26, 2024 11:37
@a-sidorova a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch 2 times, most recently from 28a5f4b to b1eedba Compare August 1, 2024 06:47
@a-sidorova a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from b1eedba to bd4225b Compare August 1, 2024 06:53
@a-sidorova a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from 6071cee to c2dda80 Compare August 1, 2024 12:39
@a-sidorova a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from c2dda80 to 05909aa Compare August 1, 2024 13:14

@v-Golubev v-Golubev left a comment

Great work 👍

Comment on lines +29 to +38
class DummyPass : public snippets::lowered::pass::RangedPass {
public:
    DummyPass() = default;
    OPENVINO_RTTI("DummyPass", "RangedPass")
    bool run(snippets::lowered::LinearIR& linear_ir,
             snippets::lowered::LinearIR::constExprIt begin,
             snippets::lowered::LinearIR::constExprIt end) override;
    std::shared_ptr<snippets::lowered::pass::PassBase> merge(const std::shared_ptr<snippets::lowered::pass::PassBase>& other) override;
};

The main remaining concern is that we have to keep DummyPass in the CPU blocking pass (and SetBrgemmBeta in the TPP one) in the public section just because we need these passes to build the reference LIRs in tests. Ideally, these iteration handlers should be in the private section.

However, there is no obvious solution for that, so we can probably return to this discussion later. I don't think this blocks the merge.

@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Aug 2, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 2, 2024
@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Aug 2, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 2, 2024
@a-sidorova a-sidorova force-pushed the feature/snippets/dynamism/blocking_by_k_n branch from 05909aa to 98a29c0 Compare August 2, 2024 11:09
@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Aug 2, 2024
Merged via the queue into openvinotoolkit:master with commit b2319a5 Aug 2, 2024
136 checks passed
@dmitry-gorokhov dmitry-gorokhov deleted the feature/snippets/dynamism/blocking_by_k_n branch August 2, 2024 15:02
ujjayant-kadian added a commit to ujjayant-kadian/openvino that referenced this pull request Aug 6, 2024
commit 67fd2eb6d83435b195ef56004d7d9f9c2a728502
Merge: 5f09ab51c0 9432b3d2a5
Author: Ujjayant Kadian <118752727+ujjayant-kadian@users.noreply.github.com>
Date:   Tue Aug 6 13:07:36 2024 +0100

    Merge branch 'master' into uk/changing-sub-byte-i4-element-order

commit 9432b3d2a577bc27e8008d85002ce57c4b0e3159
Author: Min, Byungil <byungil.min@intel.com>
Date:   Tue Aug 6 19:20:02 2024 +0900

    [GPU] Bugfix reorder for byfx format (#25782)

    + Reorder returns OOR error while handling byfx from a fused permute
    parent

    ### Tickets:
     - CVS-147330

    ---------

    Signed-off-by: Min, Byung-il <byungil.min@intel.com>

commit 606d909ab8ec130fd7c6a9d2d56a839978903a2f
Author: Bogdan Pereanu <bogdan.pereanu@intel.com>
Date:   Tue Aug 6 13:12:32 2024 +0300

    [NPU] Disable MCL in case of UD28 (#25903)

    ### Details:
    - *The UD28 Windows driver version does not support the
    MutableCommandList feature as expected, so the plugin disables this
    feature when this driver is used*

    ### Tickets:
     - *EISW-133845*

commit b6447980be06caf6bb6c1592eee4eb6de094218c
Author: Anastasiia Pnevskaia <anastasiia.pnevskaia@intel.com>
Date:   Tue Aug 6 10:26:04 2024 +0200

    [DOCS] Corrected build guides in docs. (#25922)

    ### Details:
     - Corrected build guides in docs.


commit 265dfad8ebcdae2b17611d833ec8da0f0ddc9bd2
Author: Przemyslaw Wysocki <przemyslaw.wysocki@intel.com>
Date:   Tue Aug 6 10:19:41 2024 +0200

    Change index precision from `i64` to `i32` in MaxPool14 to MaxPool8 downgrade transformation (#25514)

    ### Tickets:
     - CVS-146277

commit 9eeb7a18d5ae039d1b406cab405ad2083dc5680c
Author: Maciej Smyk <maciejx.smyk@intel.com>
Date:   Tue Aug 6 09:38:15 2024 +0200

    [DOCS] Dependencies and Building for OpenVINO GenAI article for master (#25908)

    Adding information on the OpenVINO GenAI Dependencies and ref-link to
    the GenAI building in user docs.

commit cbf4035c257042aec180102d434287c27d9cd2f6
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Aug 6 11:16:55 2024 +0400

    Bump hendrikmuhs/ccache-action from 1.2.13 to 1.2.14 (#25917)

    Bumps
    [hendrikmuhs/ccache-action](https://github.com/hendrikmuhs/ccache-action)
    from 1.2.13 to 1.2.14.

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit d9d5ace62609f909cdaac0ac8073d78a1f19607d
Author: Vladimir Paramuzov <vladimir.paramuzov@intel.com>
Date:   Tue Aug 6 10:24:43 2024 +0400

    [Transformations] Extend MoveEltwiseUpThroughData pass with per channel case (#24401)

    ### Details:
    - Added pass to swap Reshape/Squeeze/Unsqueeze -> Eltwise (per channel)

commit 513c812fcf5049cab4084b7a09e862f7df357880
Author: Gorokhov Dmitriy <dmitry.gorokhov@intel.com>
Date:   Tue Aug 6 09:33:29 2024 +0400

    [CPU] FullyConnected weights compression: mxfp4 (wei=f4e2m1, scales=f8e8m0) support (#25783)

    ### Details:
    - This PR extends FC weights compression support with mxfp4 (wei=f4e2m1,
    scales=f8e8m0) precision
     - ISA coverage: avx2, avx512
    - oneDNN fork changes:
    https://github.com/openvinotoolkit/oneDNN/pull/258

    ### Tickets:
     - [CVS-142986](https://jira.devtools.intel.com/browse/CVS-142986)

    ### Dependencies:
    oneDNN 3.5 migration:
    https://github.com/openvinotoolkit/openvino/pull/25153

commit 73e1b94625c277ad89d4a613eef889213a1b856e
Author: Vladimir Paramuzov <vladimir.paramuzov@intel.com>
Date:   Tue Aug 6 09:21:09 2024 +0400

    [GPU][TRANSFORMATIONS] Disable per pass validation in some cases (#25874)

    ### Details:
    - Disable per pass validation for GPU specific passes and mixed
    precision markup to improve model loading time

commit 7fd8b2ed77d4b31cab9556742320a793506f7327
Author: Vladimir Paramuzov <vladimir.paramuzov@intel.com>
Date:   Tue Aug 6 09:16:49 2024 +0400

    [GPU] Dynamic pipeline host opt (#25886)

    ### Details:
     - Reduce count of copies for layouts/shapes and other complex objects

commit d604f1d8b2a60fa68b704c2a8f81e283c4aa2f0f
Author: Michal Miotk <michal.miotk@intel.com>
Date:   Tue Aug 6 00:54:25 2024 +0200

    fix for confused input with output in assert error message (#25915)

    ### Details:
     - short fix for message

    ### Tickets:
     - N/A

commit f8d0e8c47c5be32b2e5e44e4449a337fcbc130fb
Author: Andrew Kwangwoong Park <andrew.park@intel.com>
Date:   Tue Aug 6 02:52:42 2024 +0900

    Revert "[GPU] Avoid crop buffer fusing when dynamic shape and squeeze/unsqueeze reshape mode" (#25895)

    ### Details:
     - This reverts https://github.com/openvinotoolkit/openvino/pull/25700
    - As support for Crop->Reshape(Squeeze/Unsqueeze modes) buffer
    optimization was added by
    https://github.com/openvinotoolkit/openvino/pull/25836

    ### Tickets:
     - 146626

commit 5264c9995f3a41b642a3359155edb719243944a1
Author: Karol Blaszczak <karol.blaszczak@intel.com>
Date:   Mon Aug 5 18:41:37 2024 +0200

    [DOCS] tiny article name changes (#25910)

commit 3cf27441ff5cd497499bc37d92e55d901e88ca59
Author: Ilya Lavrenov <ilya.lavrenov@intel.com>
Date:   Mon Aug 5 19:47:38 2024 +0400

    Removed GHA WA for older ONNX versions (#25912)

    ### Details:
    - Removed WA introduced here
    https://github.com/openvinotoolkit/openvino/pull/25234 because ONNX
    version is updated here
    https://github.com/openvinotoolkit/openvino/pull/24242

commit afb194f3747ed56ab524500842cb50281abe41a9
Author: Rinne <AsakusaRinne@gmail.com>
Date:   Mon Aug 5 22:33:17 2024 +0800

    [JAX FE] Add translation for more operations. (#25292)

    ### Details:
    - *Add the translation for reduce_window_max, reduce_window_sum, rsqrt,
    reshape , squeeze, slice, broadcast_in_dim, copy, dot_general and
    transpose of JAX frontend*
     - *Add corresponding test*

    ### Tickets:
     - *CVS-145575*
     - *CVS-145583*
     - *CVS-145580*
     - *CVS-145574*
     - *CVS-145581*
     - *CVS-145579*
     - *CVS-145582*
     - *CVS-145573*
     - *CVS-145578*

    NOTE: this PR should be merged after #25290

    ---

    @mvafin Could you please help to review this PR?
    cc @rkazants

    ---------

    Co-authored-by: Maxim Vafin <maxim.vafin@intel.com>
    Co-authored-by: Roman Kazantsev <roman.kazantsev@intel.com>

commit c30a0bcf6ba4f1b75412c353cefe63f97f6ee33c
Author: Georgy Krivoruchko <georgy.krivoruchko@intel.com>
Date:   Mon Aug 5 18:12:22 2024 +0400

    [ONNX] Aligned behavior for ReduceProd-11,13,18 (#25875)

    ### Details:
     - Aligned behavior of ReduceProd operation

    ### Tickets:
     - 143347

commit 10a2e91d2502bc7bc5aa7c2fbcc5b845c7a00975
Author: Aleksandr Voron <aleksandr.voron@intel.com>
Date:   Mon Aug 5 15:54:19 2024 +0200

    [CPU][ARM] Enable ACL MVN executor for `initAcrossChannels` option in NHWC layout (#25905)

    ### Details:
    - This configuration (initAcrossChannels is true and NHWC is used) was
    disabled for ACL executor to enable `yolo_v3_tiny`. The last check shows
    this restriction is not required anymore.


commit 12a5e5a505da2d793bed99efdfdc4bda42be9850
Author: Georgy Krivoruchko <georgy.krivoruchko@intel.com>
Date:   Mon Aug 5 17:06:03 2024 +0400

    [ONNX] Switched to ONNX 1.16.0 (#24242)

    ### Details:
     - Switched to ONNX 1.16.0
     - Removed WA for ONNX 1.15.0
     - ONNXRuntime for tests 1.18.1

    ### Tickets:
     - 136748, 138876

    ---------

    Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>

commit 7eedf84ef21918e84f4488a582645a82d921e507
Author: Luo Cheng <cheng.luo@intel.com>
Date:   Mon Aug 5 21:05:08 2024 +0800

    [CPU] Add score output for PagedAttention (#25594)

    ### Details:
     - *Add score output for PagedAttention*

    ### Tickets:
     - *[146969](https://jira.devtools.intel.com/browse/CVS-146969)*

commit a6413b415ff8cc7cd9eb9cf3cfe96334bd1907e4
Author: Przemyslaw Wysocki <przemyslaw.wysocki@intel.com>
Date:   Mon Aug 5 15:00:01 2024 +0200

    [PyOV] Replace `std::stringstream` with `std::fstream` in `import_model` (#25724)

    ### Details:
     - The current implementation breaks when the model size is > 2gb
     - `std::fstream` does not limit the model size
    - Tested in
    https://github.com/openvinotoolkit/openvino/blob/master/src/bindings/python/tests/test_runtime/test_compiled_model.py#L57
     - The fix has been verified

    ### TODO:
     - Should we simulate > 2gb model case in tests?

    ### Tickets:
     - EISW-130771

commit e35acf91e9a953ee081d0bae355a7e848ef41b86
Author: Attila Csok <attila.csok@intel.com>
Date:   Mon Aug 5 15:45:52 2024 +0300

    [intel-npu] Adding NPU_TURBO option to plugin (#25646)

    ### Details:
     - Adding npu_turbo option for intel-npu plugin
     - updating documentation with turbo and other missing properties

    Master backport of
    https://github.com/openvinotoolkit/openvino/pull/25603

    ### Tickets:
     - [*ticket-id*](https://jira.devtools.intel.com/browse/CVS-147038)

commit 64c5f67a5aa31b020d295c210b0345bdd74e4dbb
Author: Ilya Lavrenov <ilya.lavrenov@intel.com>
Date:   Mon Aug 5 14:08:22 2024 +0400

    Fixed compatibility with new version of 'wheel' (#25899)


commit c664ca7f288f59722d82e9bfbb994f0c7c1e232e
Author: Xuejun Zhai <xuejun.zhai@intel.com>
Date:   Sun Aug 4 17:32:04 2024 +0800

    Clean meta plugin tests from CPU/GPU plugin (#24477)

    ### Details:
     - Move BATCH related test out from CPU/GPU func test to BATCH func test
    - Move HETERO related test out from CPU/GPU func test to HETERO func
    test


    ---------

    Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>
    Co-authored-by: Chen Peter <peter.chen@intel.com>

commit 59a0f019913681287f82553a07e0b299404de821
Author: Peyara Nando <nandu45@outlook.com>
Date:   Sat Aug 3 05:45:30 2024 +0530

    Implemented getOutputElementType (#25760)

    Implemented Method on c++ side.
    Updated typescript definitions.
    Created unit tests.
    For Issue
    [https://github.com/openvinotoolkit/openvino/issues/25406](https://github.com/openvinotoolkit/openvino/issues/25406)

    Resolved merge errors

    ---------

    Co-authored-by: Alicja Miloszewska <alicja.miloszewska@intel.com>

commit 5f09ab51c00ed0d207bc02963783efe597dda5de
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 16:14:42 2024 +0100

    Modified comments

commit 0f1ad2b95de9d7985f8db93e99450bb490c260d0
Merge: 99523fc962 ae454eebbd
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 16:08:41 2024 +0100

    Merge branch 'uk/changing-sub-byte-i4-element-order' of github.com:ujjayant-kadian/openvino into uk/changing-sub-byte-i4-element-order

commit 99523fc9624738b9af5fdd1ca58aa301f44d49df
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 13:06:18 2024 +0100

    Added a new pattern in pattern matcher

    [CPU] Avoid rounding to zero for Reduce node in quantized models (#25766)

    - *If the Reduce node has both input and output precision to be integers
    from the original model, then rounding to zero should be done before
    converting intermediate floating point value to integer.*
    - *However, if such integer precisions are resulted from quantization,
    then we should not do such rounding, in order to maintain accuracy.*
     - *Add corresponding test cases.*

     - *CVS-147352*

    Correct clang format issues

    Tried to resolve the segmentation fault

    Corrected clang format error

    Tried to correct segmentation fault

    Removed std::move

    Using std::move with much more caution

commit ae454eebbdde2d2582cbe43e5a10e62a7ec61d50
Merge: 46b84b994e b2319a5bea
Author: Ujjayant Kadian <118752727+ujjayant-kadian@users.noreply.github.com>
Date:   Fri Aug 2 16:04:40 2024 +0100

    Merge branch 'openvinotoolkit:master' into uk/changing-sub-byte-i4-element-order

commit d29948c758501bafe807ff0feeed8875574545a6
Author: Roman Kazantsev <roman.kazantsev@intel.com>
Date:   Fri Aug 2 19:02:31 2024 +0400

    [TF FE][SDL] Fix performance inefficiencies (#25884)

    **Details:** Fix performance inefficiencies

    **Ticket:** 148599

    Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

commit 46b84b994e55306f89aa437fd2271b6164e548b1
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 15:45:34 2024 +0100

    Using std::move with much more caution

commit cb2814d2ee5e832b7e8a0809f55185f305133cad
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 15:43:28 2024 +0100

    Removed std::move

commit ae13bed22c9a222611dce04075f3bbe6ac87091e
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 15:34:40 2024 +0100

    Tried to correct segmentation fault

commit a98775ad74bcdf2bcc58820f2373adcaf3d98dff
Author: ujjayant-kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 14:31:54 2024 +0000

    Corrected clang format error

commit c2ba823ef43f6804b07781af3707220be184f541
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 15:24:24 2024 +0100

    Tried to resolve the segmentation fault

commit a33afe422f0ff9f655dd9f660d35f441e148433e
Author: Sergey Shlyapnikov <sergey.shlyapnikov@intel.com>
Date:   Fri Aug 2 18:17:31 2024 +0400

    [GPU] Fix Crop->Reshape (Squeeze/Unsqueeze modes) buffer optimization (#25836)

    These changes fix a significant accuracy issue (reducing perplexity from
    120 000 to 17) for Llama models with precalculated constant sin/cos
    values. However, there is still a problem with sin/cos representation in
    FP16 precision, which will be addressed in a separate PR.

    ### Details:
     - Fixed Crop->Reshape (Squeeze/Unsqueeze modes) buffer optimization
     - Update rope_ref kernel to support dynamic paddings for cos/sin inputs
     - Fix propagate_padding() function and update shape infer tests

    ### Tickets:
    - [CVS-148220](https://jira.devtools.intel.com/browse/CVS-148220),
    [CVS-146283](https://jira.devtools.intel.com/browse/CVS-146283)

commit b2319a5bea85fd057d1e3ea102e83d8d6af6c6db
Author: Alexandra Sidorova <alexandra.sidorova@intel.com>
Date:   Fri Aug 2 18:09:25 2024 +0400

    [Snippets][CPU] Added Brgemm FP32 blocking support by dynamic K, N dimensions (#25745)

    ### Details:
    - *Added update support of `K` and `N` dimensions for Brgemm block in
    `BrgemmKernelExecutor::update_config`*

    ### Tickets:
     - *147852*

    ### Prerequisites:
    - [x] https://github.com/openvinotoolkit/openvino/pull/25378

commit b625fcbfbf95be80d1fe57f471a02b8fd31d94ef
Author: Roman Kazantsev <roman.kazantsev@intel.com>
Date:   Fri Aug 2 17:44:16 2024 +0400

    [TF FE] Extend UnsortedSegmentSum for ND indices (#25877)

    **Details:** This extension is needed for some customer model

    **Ticket:** 148750

    ---------

    Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

commit 64072f63e7afc66a3e7a49f2bc00d3ae0f695b02
Author: Maxim Vafin <maxim.vafin@intel.com>
Date:   Fri Aug 2 15:38:01 2024 +0200

    [PT FE] Update GHA tests (#25868)

    ### Details:

commit ab1e8dec8341f7ada47d959de895056ddb93ff52
Author: Karol Blaszczak <karol.blaszczak@intel.com>
Date:   Fri Aug 2 15:18:03 2024 +0200

    [DOCS] ovms llm data master (#25880)

commit a9f670a1073b3ce7660b3dac133bed4b45e518d5
Author: mei, yang <yang.mei@intel.com>
Date:   Fri Aug 2 20:55:52 2024 +0800

    [CPU] Align cpu execution order before/after ResolveComplexInplaceConflicts() (#24937)

    ### Details:
    - *Align cpu execution order before/after
    ResolveComplexInplaceConflicts()*
    - *Keep order information of Results and Parameters when dump CPU graph
    to ov::Model*
    - *Let MemoryInput always execute first to avoid potential issue because
    it will update its sibling MemoryOutput memory after execution*

    ### Tickets:
     - *CVS-134638*
     - *CVS-148497*

    ### Description:
    - *The CPU execution order of some nodes may change after
    https://github.com/openvinotoolkit/openvino/blob/2024.2.0.dev20240513/src/plugins/intel_cpu/src/graph.cpp#L285.
    This can give ResolveComplexInplaceConflicts() incorrect
    execution order information and lead it to the wrong conclusion about
    which edge memory should be shared. So this PR adds SortTopologically() right before
    ResolveComplexInplaceConflicts() so that the execution order does not change much
    before/after ResolveComplexInplaceConflicts()*
    - *The node order of the CPU graph topology is not stable. For example, in the
    graph below*

    ![image](https://github.com/openvinotoolkit/openvino/assets/37289649/ca14e697-6986-4c30-9c2a-86603cc4a106)
    *If Parameter0 comes before Parameter1 in graphNodes, the original
    SortTopologically() first recurses down from Parameter0, so
    in the final sorted graphNodes Parameter0 is sorted after Parameter1.
    Then, in the second round of SortTopologically(), it first recurses from
    Parameter1, and in the final sorted graphNodes Parameter0 is sorted
    before Parameter1 again. As a result, ReduceProd is sometimes executed
    before ScatterNDUpdate and sometimes after it, which misleads
    ResolveComplexInplaceConflicts()*
    - *MemoryInput updates its sibling MemoryOutput memory after
    execution. To avoid memory changes during the execution of other nodes,
    always let MemoryInput execute first*

commit 2e95269d14cfb7c865f2fd5e2329d6c9523469a4
Author: ujjayant-kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 12:35:17 2024 +0000

    Correct clang format issues

commit 63e9e38413e223e645029baf18359bf5df21b076
Merge: 6bc933a4dd ea6731f8a7
Author: Kadian <ujjayant.kadian@intel.com>
Date:   Fri Aug 2 13:07:13 2024 +0100

    Merge branch 'uk/changing-sub-byte-i4-element-order' of github.com:ujjayant-kadian/openvino into uk/changing-sub-byte-i4-element-order

commit da2a4e770a163af6419e0d9e46594e58dbc8ef64
Author: Aleksandr Voron <aleksandr.voron@intel.com>
Date:   Fri Aug 2 13:33:21 2024 +0200

    [CPU][ARM] Added debug logs to ACL Interpolate executor (#25866)

    ### Details:
     - Added debug logs to ACL Interpolate executor to debug easier
    - Remove redundant check (since it duplicates the check
    https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/executors/acl/acl_interpolate.cpp#L135_L136)


commit 34d41aeb078eca2c0d55c556011bea1ba7729fdd
Author: Maksim Doronin <maksim.doronin@intel.com>
Date:   Fri Aug 2 11:40:44 2024 +0100

    Add folder parameter for reference blobs in SIT (#25651)

    ### Details:
    - Adding a new optional parameter for SIT to specify a directory with
    reference outputs. So instead of running NetsVal-CalcRef on CPU or
    TEMPLATE we can re-use predefined reference outputs. However, their
    names must comply with the existing name convention

    ### Tickets:
     - E-131878

commit 3c5a52e30f363cd25122ccd5b8bc081d717e8e03
Author: Damian Kurek <damian.kurek@intel.com>
Date:   Fri Aug 2 10:55:38 2024 +0200

    [GPU] Use parallel sum reduction in MVN BFYX OPT kernel (#25840)

    Optimize MVN BFYX OPT kernel

    ### Details:
     - Use parallel sum reduction in order to improve efficiency

    ### Tickets:
     - 148585

commit ab613e115267feece039e409dea6fb8e10371746
Author: Maxim Vafin <maxim.vafin@intel.com>
Date:   Fri Aug 2 10:55:20 2024 +0200

    [PT FE] Move sending telemetry to stage after conversion is done (#25855)

    ### Details:
    - *Previously telemetry was sent every time a `FrameworkNode` was created.
    Now we send it only when a `FrameworkNode` exists in the model and only
    once per op type*


commit 7bc7fb0cb64d09fd258724f6e3c5935f162cd129
Author: Georgy Krivoruchko <georgy.krivoruchko@intel.com>
Date:   Fri Aug 2 14:24:47 2024 +0400

    Changed dependency types-setuptools (#25872)

    ### Details:
     - Solution verification

    ### Tickets:
     - N/A

commit a9c004798b56eb5e74502a44ef320ff5333d23dd
Author: Wilson Seok <wilson.seok@intel.com>
Date:   Thu Aug 1 13:39:13 2024 -0700

    [GPU] Add crop check in optimize check of buffer fusing (#25850)

    ### Details:
    - Add crop check in optimize check of buffer fusing to pass through
    simple dynamic shape crop case

    ### Tickets:
     - Follow up PR25737

commit 7b11ba4c1bec8f7e2063f1c43b0c10a314d724bd
Author: Alexey Smirnov <alexey.smirnov@intel.com>
Date:   Thu Aug 1 20:32:47 2024 +0100

    [NPUW] Introduce new passes to online partitioning (#25679)

    Config (internal/extended):
    ```
            "NPU_COMPILATION_MODE_PARAMS" : "compute-layers-with-higher-precision=Sqrt,Power,ReduceMean,Add_RMSNorm",
            "NPU_USE_NPUW" : "YES",
            "NPUW_FOLD" : "YES",
            "NPUW_DCOFF_TYPE" : "f16",
            "NPUW_DCOFF_SCALE" : "YES",
            "NPUW_ONLINE_ISOLATE" : "P:DQMatMulGQ/compute,P:DQMatMulCW/compute,P:RMSNorm/compute",
            "NPUW_ONLINE_NOFOLD" : "compute"
    ```

    Config (user/basic):
    ```
            "NPU_COMPILATION_MODE_PARAMS" : "compute-layers-with-higher-precision=Sqrt,Power,ReduceMean,Add_RMSNorm",
            "NPU_USE_NPUW" : "YES",
            "NPUW_FOLD" : "YES",
            "NPUW_DCOFF_TYPE" : "f16",
            "NPUW_DCOFF_SCALE" : "YES",
            "NPUW_ONLINE_PIPELINE" : "COMPUTE"
    ```

    ---------

    Co-authored-by: Dmitry Matveev <dmitry.matveev@intel.com>

commit 3b4e747c8687d0e11501b63f3e425a335e8c9641
Author: Ilya Lavrenov <ilya.lavrenov@intel.com>
Date:   Thu Aug 1 21:23:30 2024 +0400

    Allow to override CPACK_ARCHIVE_COMPONENT_INSTALL (#25867)

    ### Details:
     - To override by external cmake options
     - Useful for GenAI to create a single archive

commit 605b13fbee58b48cf27cf0e64ac154148dfd8b39
Author: Alicja Miloszewska <alicja.miloszewska@intel.com>
Date:   Thu Aug 1 08:12:30 2024 -0700

    [PyOV] Add more ov.Model constructors  (#25635)

    ### Details:
    - Accept sinks as output ports in addition to generic nodes and op class
    instances in `ov.Model` ctors
     - Add test

     Added support for:
    - `Model(results: List[openvino._pyopenvino.op.Result], sinks:
    List[ov::Output<ov::Node>], parameters:
    List[openvino._pyopenvino.op.Parameter], name: str = '')`
    - `Model(results: List[ov::Output<ov::Node>], sinks:
    List[ov::Output<ov::Node>], parameters:
    List[openvino._pyopenvino.op.Parameter], name: str = '')`

    ### Tickets:
     - *[CVS-131037](https://jira.devtools.intel.com/browse/CVS-131037)*

    ---------

    Co-authored-by: Anastasia Kuporosova <anastasia.kuporosova@intel.com>

commit 754f48a0d96d0451fd7c7cf4a68019dfafd20c5e
Author: Pawel Raasz <pawel.raasz@intel.com>
Date:   Thu Aug 1 17:10:38 2024 +0200

    [core] Unify axis normalization/validation utils (#25614)

    ### Details:
    - Split the function into smaller, simpler utils, each responsible for
    validation, normalization, or (for the more complex ones) both.
     - Unify the functions parameters order
     - Remove redundant check of rank
     - Produce smaller binary size
     - Fix Coverity issue `Improper use of negative value`.
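
For readers unfamiliar with the terms, here is a minimal sketch of what axis validation and normalization usually mean. The helper names `is_axis_valid` and `normalize_axis` are hypothetical and are not claimed to match the utilities introduced by this commit. The sketch assumes the common convention that an axis in `[-rank, rank - 1]` is valid and that negative values count from the end.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: an axis is valid for a tensor of the given rank
// when it lies in the closed range [-rank, rank - 1].
bool is_axis_valid(std::int64_t axis, std::int64_t rank) {
    return axis >= -rank && axis < rank;
}

// Hypothetical helper: validate first, then map a negative axis onto its
// non-negative equivalent, so the returned value is never negative.
std::int64_t normalize_axis(std::int64_t axis, std::int64_t rank) {
    if (!is_axis_valid(axis, rank)) {
        throw std::out_of_range("axis is out of range for the given rank");
    }
    return axis < 0 ? axis + rank : axis;
}
```

Validating before normalizing is one way to guarantee that a negative value is never used as an index afterwards, which is the kind of defect the Coverity finding above refers to.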

    ### Tickets:
     - CVS-136544

commit 2e399de62eed4ab212e36032380e6972921b5cd9
Author: Alexandra Sidorova <alexandra.sidorova@intel.com>
Date:   Thu Aug 1 18:26:25 2024 +0400

    [Tests] Commented out debug prints in input range generation (#25848)

    ### Details:
    - *Commented out debug prints in input range generation in test
    infrastructure to avoid large outputs during test executions:*

    ![image](https://github.com/user-attachments/assets/8e19df2c-2bd2-4327-91cd-da439d0da544)

    ### Tickets:
     - *N/A*

commit 2f8c265b6cb9b078757b71b0a81d6b95bfd4bcb8
Author: Maciej Smyk <maciejx.smyk@intel.com>
Date:   Thu Aug 1 16:14:30 2024 +0200

    [DOCS] CODEOWNER update for master (#25863)

    JIRA: 148360

    Update of documentation paths for codeowner groups.

commit 81e7b21e6bec757398fdb4074e085799ee5c795c
Author: Andrei Kashchikhin <andrey.kashchikhin@intel.com>
Date:   Thu Aug 1 15:06:59 2024 +0100

    [CI] [GHA] Get VCPKG version from repository (#25862)

    ### Tickets:
     - *132496*

commit 504873014ccc800005504841d9819ccf04abc312
Author: Prakash <qxprakash@gmail.com>
Date:   Thu Aug 1 17:57:11 2024 +0530

    [OV JS] Add vision-background-removal sample notebook (#25714)

    ### Details:
    - added vision-background-removal notebook
    - added comments and formatting

    ### Things Remaining:

     - adding the sample in the readme
     - adding the weights download once the unet model ir gets uploaded

    @vishniakov-nikolai @almilosz  please give feedback

    With Regards
    Prakash

commit fb4e2d3e832d488f94012cc5e4cde1a6d4c4bf44
Author: Vishniakov Nikolai <nikolai.vishniakov@intel.com>
Date:   Thu Aug 1 14:26:31 2024 +0200

    [OVJS] Update openvino-node binaries to 2024.3 in master (#25823)

    ### Details:
     - update openvino-node package version to 2024.3.0 in master branch

commit 7e16d63b042371655f75869890a770aa9c01e703
Author: Andrei Kashchikhin <andrey.kashchikhin@intel.com>
Date:   Thu Aug 1 12:55:11 2024 +0100

    [CI] [GHA] Gather statistics on newly added Ubuntu workflows (#25856)

    New workflows were introduced in
    https://github.com/openvinotoolkit/openvino/pull/25234 but were not
    added to the workflow that gathers statistics.

    ### Tickets:
     - *144917*

commit 18e775ff8d7c56e0ba3bfbdb6c94494eddb2d4ce
Author: Aleksandr Voron <aleksandr.voron@intel.com>
Date:   Thu Aug 1 13:35:38 2024 +0200

    [CPU][ARM] MLAS transpose executor deprioritised (#25854)

    ### Details:
    - The latest performance reports on Ampere show ACL transpose executor
    provides better performance rather than MLAS Transpose executor (details
    are in the ticket). Therefore, MLAS Transpose executor priority has been
    decreased.
     - Redundant check has been deleted in ACL Transpose executor.

    ### Tickets:
     - CVS-148625

commit a0062533f09fc2362004cb7c179ca88d6a4549cd
Author: Ilya Lavrenov <ilya.lavrenov@intel.com>
Date:   Thu Aug 1 16:59:24 2024 +0400

    Added version for OpenVINO developer package local version (#25859)

    ### Details:
     - Allows selecting a developer package of a specific version
     - Required for GenAI build as part of OpenVINO extra modules

commit eda2f7f40598cce2f970ea635454546844a801ba
Author: Zhang Yi <yi3.zhang@intel.com>
Date:   Thu Aug 1 19:27:09 2024 +0800

    [Core][CPU]markup rope's sin/cos generation with f32 (#25662)

    ### Details:
    - *Sin/Cos table generation must run in f32; otherwise it causes accuracy
    issues*
     -  Reference : https://github.com/huggingface/transformers/pull/29285

    ### Tickets:
     - *CVS-146672*

commit 45b4737e706d0b06f5dd5c4e513fc181ddf4c3ba
Author: Karol Blaszczak <karol.blaszczak@intel.com>
Date:   Thu Aug 1 13:06:49 2024 +0200

    [DOCS] supportedmodels table fix 24.3 (#25860)

    port: https://github.com/openvinotoolkit/openvino/pull/25818

commit 546daf2959928457116fcb807337a511da37c8d9
Author: M <mortezaho.1376@gmail.com>
Date:   Thu Aug 1 03:26:00 2024 -0700

    [GSOC][CPU][ARM] Add NEON implementation for attention softmax (#25616)

    ### Details:
     - This PR aims to add NEON implementation for attention softmax

commit 7617b37f047b29c67e5010bc54b40ed6de858d76
Author: Karol Blaszczak <karol.blaszczak@intel.com>
Date:   Thu Aug 1 11:51:53 2024 +0200

    [DOCS] add benchmark results for phi (#25838) (#25851)

    port: https://github.com/openvinotoolkit/openvino/pull/25838

    Co-authored-by: Michael Frank Hansen <michael.f.hansen@intel.com>

commit 508795f44e301d5f848a212dbfc1257d8552a09b
Author: Prakash <qxprakash@gmail.com>
Date:   Thu Aug 1 15:03:25 2024 +0530

    [OV JS] Add vision-background-removal sample script (#25698)

    ### Details:

    - added script code and added the unet model weights inside the
    directory -- ```/openvino/samples/js/node/assets/models```
    @vishniakov-nikolai can you please upload it
     - focused on the implementation and formatting
    - output images will be saved in the same directory for now; I will
    change it later as per your feedback
    - @vishniakov-nikolai I am a bit doubtful about my naming convention so
    let me know if I need to modify any names

    ### Things remaining

    - [x] Proper comments remaining
    - [x] Bit of refactoring
    - [x] Readme

    Please provide Feedback @vishniakov-nikolai @almilosz

    With Regards
    Prakash

commit dc3eaf0a2b816fc32a59e79455bce33ec54f535c
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Thu Aug 1 07:26:11 2024 +0000

    Bump actions/upload-artifact from 4.3.3 to 4.3.4 (#25846)

    Bumps
    [actions/upload-artifact](https://github.com/actions/upload-artifact)
    from 4.3.3 to 4.3.4.
    <details>
    <summary>Release notes</summary>
    <p><em>Sourced from <a
    href="https://github.com/actions/upload-artifact/releases">actions/upload-artifact's
    releases</a>.</em></p>
    <blockquote>
    <h2>v4.3.4</h2>
    <h2>What's Changed</h2>
    <ul>
    <li>Update <code>@​actions/artifact</code> version, bump dependencies by
    <a href="https://github.com/robherley"><code>@​robherley</code></a> in
    <a
    href="https://redirect.github.com/actions/upload-artifact/pull/584">actions/upload-artifact#584</a></li>
    </ul>
    <p><strong>Full Changelog</strong>: <a
    href="https://github.com/actions/upload-artifact/compare/v4.3.3...v4.3.4">https://github.com/actions/upload-artifact/compare/v4.3.3...v4.3.4</a></p>
    </blockquote>
    </details>
    <details>
    <summary>Commits</summary>
    <ul>
    <li><a
    href="https://github.com/actions/upload-artifact/commit/0b2256b8c012f0828dc542b3febcab082c67f72b"><code>0b2256b</code></a>
    Merge pull request <a
    href="https://redirect.github.com/actions/upload-artifact/issues/584">#584</a>
    from actions/robherley/bump-pkgs</li>
    <li><a
    href="https://github.com/actions/upload-artifact/commit/488dcefb9bf01619ac19bad29c5c5409a1e4dd4c"><code>488dcef</code></a>
    licensed cache</li>
    <li><a
    href="https://github.com/actions/upload-artifact/commit/04c51f57662651dd3333286989e2db1111c0fd07"><code>04c51f5</code></a>
    ncc</li>
    <li><a
    href="https://github.com/actions/upload-artifact/commit/32a9e276a8f8ac18b4b2dce8213ed340ed4e5ed8"><code>32a9e27</code></a>
    bump <code>@​actions/artifact</code> and npm audit</li>
    <li><a
    href="https://github.com/actions/upload-artifact/commit/552bf3722c16e81001aea7db72d8cedf64eb5f68"><code>552bf37</code></a>
    new version</li>
    <li><a
    href="https://github.com/actions/upload-artifact/commit/79616d2ded92999fceefea2ca2e4bdf6101fa919"><code>79616d2</code></a>
    Merge pull request <a
    href="https://redirect.github.com/actions/upload-artifact/issues/565">#565</a>
    from actions/eggyhead/use-artifact-v2.1.6</li>
    <li>See full diff in <a
    href="https://github.com/actions/upload-artifact/compare/v4.3.3...0b2256b8c012f0828dc542b3febcab082c67f72b">compare
    view</a></li>
    </ul>
    </details>
    <br />

    [![Dependabot compatibility
    score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/upload-artifact&package-manager=github_actions&previous-version=4.3.3&new-version=4.3.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

    Dependabot will resolve any conflicts with this PR as long as you don't
    alter it yourself. You can also trigger a rebase manually by commenting
    `@dependabot rebase`.

    [//]: # (dependabot-automerge-start)
    [//]: # (dependabot-automerge-end)

    ---

    <details>
    <summary>Dependabot commands and options</summary>
    <br />

    You can trigger Dependabot actions by commenting on this PR:
    - `@dependabot rebase` will rebase this PR
    - `@dependabot recreate` will recreate this PR, overwriting any edits
    that have been made to it
    - `@dependabot merge` will merge this PR after your CI passes on it
    - `@dependabot squash and merge` will squash and merge this PR after
    your CI passes on it
    - `@dependabot cancel merge` will cancel a previously requested merge
    and block automerging
    - `@dependabot reopen` will reopen this PR if it is closed
    - `@dependabot close` will close this PR and stop Dependabot recreating
    it. You can achieve the same result by closing it manually
    - `@dependabot show <dependency name> ignore conditions` will show all
    of the ignore conditions of the specified dependency
    - `@dependabot ignore this major version` will close this PR and stop
    Dependabot creating any more for this major version (unless you reopen
    the PR or upgrade to it yourself)
    - `@dependabot ignore this minor version` will close this PR and stop
    Dependabot creating any more for this minor version (unless you reopen
    the PR or upgrade to it yourself)
    - `@dependabot ignore this dependency` will close this PR and stop
    Dependabot creating any more for this dependency (unless you reopen the
    PR or upgrade to it yourself)

    </details>

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit fa949478e149f17cce514ebd0d019e8766ef249d
Author: Karol Blaszczak <karol.blaszczak@intel.com>
Date:   Thu Aug 1 09:04:13 2024 +0200

    [DOCS] rn fixes and model table (#25835)

commit ba681ed72d7e30b2fe94e1cfc5a950a0bcf9bb54
Author: Wilson Seok <wilson.seok@intel.com>
Date:   Wed Jul 31 20:53:18 2024 -0700

    [GPU] Rollback while-loop structure for 2nd stage of optimize all crops (#25737)

    ### Details:
    - Rollback while-loop structure for 2nd stage of optimize all crops
    because it causes a regression for the reshape case with padding.

    ### Tickets:
     - 146653

commit 8cfd586e6128055b600e1abe9dcce263071dec7d
Author: Eddy Kim <eddy.kim@intel.com>
Date:   Thu Aug 1 10:05:32 2024 +0900

    [GPU] group_normalization for bfzyx (#25753)

    ### Details:
    - This PR updates the `group_normalization_bfyx` kernel to support bfzyx
    format.
    - Additionally, this PR fixes the output feature calculation logic of
    the group_norm_fsv16 kernel and a model caching related logic for
    dynamic model.

    ### Tickets:
     - 147841

commit 13b3e4703e32053797099256849b78ebfef6d49c
Author: Roman Kazantsev <roman.kazantsev@intel.com>
Date:   Thu Aug 1 01:44:49 2024 +0400

    [TF FE] Stabilize Bitwise layer tests on all platforms and fix u16 bug (#25843)

    **Details:** Fix u16 bug "Tensor data with element type u16, is not
    representable as pointer to i32"

    **Ticket:** 122716

    ---------

    Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

commit d2ab797a0fff1f95ec9ea39e444798dbba499cf6
Author: Ilya Lavrenov <ilya.lavrenov@intel.com>
Date:   Wed Jul 31 23:22:43 2024 +0400

    Fixed compilation with clang and libc++ (#25813)

    ### Details:
     - *item1*
     - *...*

    ### Tickets:
     - Closes https://github.com/openvinotoolkit/openvino/issues/25420

commit b26c533421b1ca3f3254df1de14300dbe928405b
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Jul 31 21:01:11 2024 +0200

    Update setuptools requirement from <72,>=65.6.1 to >=65.6.1,<73 in /src/bindings/python (#25792)

    Updates the requirements on
    [setuptools](https://github.com/pypa/setuptools) to permit the latest
    version.
    <details>
    <summary>Changelog</summary>
    <p><em>Sourced from <a
    href="https://github.com/pypa/setuptools/blob/main/NEWS.rst">setuptools's
    changelog</a>.</em></p>
    <blockquote>
    <h1>v72.1.0</h1>
    <h2>Features</h2>
    <ul>
    <li>Restore the tests command and deprecate access to the module. (<a
    href="https://redirect.github.com/pypa/setuptools/issues/4519">#4519</a>)
    (<a
    href="https://redirect.github.com/pypa/setuptools/issues/4520">#4520</a>)</li>
    </ul>
    <h1>v72.0.0</h1>
    <h2>Deprecations and Removals</h2>
    <ul>
    <li>The test command has been removed. Users relying on 'setup.py test'
    will need to migrate to another test runner or pin setuptools before
    this version. (<a
    href="https://redirect.github.com/pypa/setuptools/issues/931">#931</a>)</li>
    </ul>
    <h1>v71.1.0</h1>
    <h2>Features</h2>
    <ul>
    <li>
    <p>Added return types to typed public functions -- by
    :user:<code>Avasam</code></p>
    <p>Marked <code>pkg_resources</code> as <code>py.typed</code> -- by
    :user:<code>Avasam</code> (<a
    href="https://redirect.github.com/pypa/setuptools/issues/4409">#4409</a>)</p>
    </li>
    </ul>
    <h2>Misc</h2>
    <ul>
    <li><a
    href="https://redirect.github.com/pypa/setuptools/issues/4492">#4492</a></li>
    </ul>
    <h1>v71.0.4</h1>
    <h2>Bugfixes</h2>
    <ul>
    <li>Removed lingering unused code around Distribution._patched_dist. (<a
    href="https://redirect.github.com/pypa/setuptools/issues/4489">#4489</a>)</li>
    </ul>
    <h1>v71.0.3</h1>
    <h2>Bugfixes</h2>
    <!-- raw HTML omitted -->
    </blockquote>
    <p>... (truncated)</p>
    </details>
    <details>
    <summary>Commits</summary>
    <ul>
    <li><a
    href="https://github.com/pypa/setuptools/commit/441799f8b45a1a01c608db49333403db1b0d7100"><code>441799f</code></a>
    Bump version: 72.0.0 → 72.1.0</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/59aff448e79415ee3e491a8426553b373d7914e5"><code>59aff44</code></a>
    Merge pull request <a
    href="https://redirect.github.com/pypa/setuptools/issues/4522">#4522</a>
    from pypa/feature/graceful-drop-tests</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/c437aaa8d5b969a9fe8c8147463bfcb85b31ab26"><code>c437aaa</code></a>
    Restore the tests command and deprecate access to the module.</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/a6726b95f7a50dc5945e012050f00450c883fdcd"><code>a6726b9</code></a>
    Add celery and requests to the packages that test integration. Ref <a
    href="https://redirect.github.com/pypa/setuptools/issues/4520">#4520</a></li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/5e1b3c414779317bc3e105d9bae82ce70c22dbf9"><code>5e1b3c4</code></a>
    Bump version: 71.1.0 → 72.0.0</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/4c0b9f3ee6ee47c597572655567f215c08c90137"><code>4c0b9f3</code></a>
    Merge pull request <a
    href="https://redirect.github.com/pypa/setuptools/issues/4458">#4458</a>
    from pypa/debt/remove-test-command</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/be8e3a09812f0a3717045098ac6ce7b52fc7d202"><code>be8e3a0</code></a>
    Merge pull request <a
    href="https://redirect.github.com/pypa/setuptools/issues/4507">#4507</a>
    from pypa/docs/4483-install-core-extra</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/99d2c722ca5d58ef1360ed86a3252cc16bd84dfd"><code>99d2c72</code></a>
    Add documentation clarifying how to reliably install setuptools with its
    depe...</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/63c89f93d6d43ff96ce5f7f5a862395f924905d0"><code>63c89f9</code></a>
    👹 Feed the hobgoblins (delint).</li>
    <li><a
    href="https://github.com/pypa/setuptools/commit/c405ac1bf29b945db9af7ba9b0dd77e4d871f72a"><code>c405ac1</code></a>
    Merge branch 'main' into debt/remove-test-command</li>
    <li>Additional commits viewable in <a
    href="https://github.com/pypa/setuptools/compare/v65.6.1...v72.1.0">compare
    view</a></li>
    </ul>
    </details>
    <br />

    ---------

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
    Co-authored-by: Anastasia Kuporosova <anastasia.kuporosova@intel.com>

commit a60140ef5c60f60304ad2a70ebff0f80f97cd51c
Author: Dmitry Matveev <dmitry.matveev@intel.com>
Date:   Wed Jul 31 19:16:29 2024 +0100

    Add NPUW to labeler (#25839)

    ### Details:
    - Mark changes under "src/plugins/intel_npu/src/plugin/npuw" with NPUW
    label

    ### Tickets:
     - n/a

commit 3a9464dc34900b8ee11249f6f56f7a3636a796c8
Author: Vladislav Golubev <vladislav.golubev@intel.com>
Date:   Wed Jul 31 20:01:30 2024 +0200

    [Snippets] Support Brgemm with transposed_b via BrgemmCopyB (#24932)

    ### Details:
     - *Support FP32/BF16/I8 matmuls with transpose_b=true via BrgemmCopyB*
    - *BrgemmCopyB emitter: handle tail iteration by N before the main body*
    - *Remove workaround on LDB and N dim rounding in brgemm emitters and
    related buffers*

    ### Tickets:
     - *CVS-114487*

    ## TODO:
    - [ ] BufferAllocation test for FP32 brgemm with repacking
    - [ ] SetBrgemmCopyBBuffersShape tests
    - [ ] MHA with transpose B for low precisions (FP32 already exists)
    - [ ] FuseTransposeBrgemm tests

commit f48b30aab7ae2bb05c9f3709f9398eefe17ff66f
Author: Andrei Kashchikhin <andrey.kashchikhin@intel.com>
Date:   Wed Jul 31 18:39:31 2024 +0100

    [CI] [GHA] Introduce additional Ubuntu versions via separate workflows (#25234)

    ### Details:
    - This is a sister PR to #25202, the idea is the same: test more Linux
    flavours. This PR adds Ubuntu 22/24 as separate workflows instead of a
    matrix used in #25202.
    - The approach with separate workflows seems better as it does not
    require unique names for artefacts for matrix jobs and dependent jobs
    thus making it easier to write and maintain w/o magic strings.

    ### Tickets:
     - *144917*

commit 161fce5d380e6ab3bdf0dcc6109ea904f11672bd
Author: Zlobin Vladimir <vladimir.zlobin@intel.com>
Date:   Wed Jul 31 20:01:50 2024 +0400

    Update open-model_zoo submodule (#25826)

commit 25455a0dd97d9c724522dab43f2a019e2a6643d0
Author: Ujjayant Kadian <118752727+ujjayant-kadian@users.noreply.github.com>
Date:   Wed Jul 31 16:28:45 2024 +0100

    NPUW: Change the sub-byte (i4) element order in the unpack procedure to match OpenVINO 2024.0 (#25827)

    ### Details:
    In the latest versions of OpenVINO the sub-byte order is defined as

    [1,0]

    meaning that first (MSB) 4 bits of an 8-bit vector form 1st element, and
    the last (LSB) 4 bits of an 8-bit vector form 0th element.

    Our unpack procedures for i4 were aligned with the older representation,
    where sub-byte order was defined as

    [0,1]

    meaning that first (MSB) 4 bits of an 8-bit vector form 0th element, and
    the last (LSB) 4 bits were the 1st element.

    **Updated these unpack functions to use this new order.**
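
The `[1,0]` order described above can be sketched in plain C++ as follows. This is an illustrative sketch only, not the NPUW unpack code: the names `sign_extend_i4` and `unpack_i4` are hypothetical, and the sketch assumes i4 values are stored as 4-bit two's-complement.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sign-extend a 4-bit two's-complement value held in the low nibble.
std::int8_t sign_extend_i4(std::uint8_t nibble) {
    nibble &= 0x0F;
    return static_cast<std::int8_t>(nibble >= 8 ? nibble - 16 : nibble);
}

// Hypothetical unpack routine using the [1,0] order:
// the low (LSB) nibble is element 0, the high (MSB) nibble is element 1.
std::vector<std::int8_t> unpack_i4(const std::vector<std::uint8_t>& packed) {
    std::vector<std::int8_t> out;
    out.reserve(packed.size() * 2);
    for (std::uint8_t byte : packed) {
        out.push_back(sign_extend_i4(static_cast<std::uint8_t>(byte & 0x0F)));  // element 2*i
        out.push_back(sign_extend_i4(static_cast<std::uint8_t>(byte >> 4)));    // element 2*i + 1
    }
    return out;
}
```

Under the older `[0,1]` order the two `push_back` calls would simply be swapped, which is why data packed for one convention decodes with every pair of elements exchanged under the other.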

    ### Tickets:
     - *121052*

commit 3e058b90a891fee9e707dd9c2859492fa5166f71
Author: Roman Lyamin <Roman.Lyamin@intel.com>
Date:   Wed Jul 31 18:45:15 2024 +0400

    [GPU] Fix lws calculation for reorder_kernel_bfyx_to_blocked_format kernel (#25830)

    ### Tickets:
     - *[146165](https://jira.devtools.intel.com/browse/CVS-146165)*

commit a5d82f2ebf15bb11b452a4027c6b7ae54ca2951c
Author: Sebastian Golebiewski <sebastianx.golebiewski@intel.com>
Date:   Wed Jul 31 15:04:21 2024 +0200

    [DOCS] Updating Edit Button for articles for master (#25832)

    Porting: https://github.com/openvinotoolkit/openvino/pull/25831

commit 98956aa41354f0402bc7e84ad993efef21cb8cf8
Author: Alexandra Sidorova <alexandra.sidorova@intel.com>
Date:   Wed Jul 31 16:54:52 2024 +0400

    [CPU][RISCV64] Fixed onednn build for RVV case (#24151)

    ### Details:
     - *Added the missing include `primitive.hpp` in the RVV pooling implementation*
     - *oneDNN PR: https://github.com/openvinotoolkit/oneDNN/pull/259*
    - *It's not seen in CI since OV is built with default
    `-march=rv64imafdc` - without vector intrinsic support. Need to build
    with RVV support (`-march=rv64gcv0p7`)*

    ### Tickets:
     - *N/A*

commit 10620e9fd68cbfb2f6ae2a1298e6af8425367bfe
Author: Sun Xiaoxia <xiaoxia.sun@intel.com>
Date:   Wed Jul 31 19:54:29 2024 +0800

    Fix executor memory leak when "-nstreams 0" (#25778)

    ### Details:
     - *create executor config when streams=0*

    ### Tickets:
     - *146686*

commit cae739b96354aff83945767d2fad094e03ebebce
Author: Edward Shogulin <edward.shogulin@intel.com>
Date:   Wed Jul 31 12:28:41 2024 +0100

    [LPT] Dequantization precision reusage (#25668)

    ### Details:
     - *NNCF quantized fp16 model on GPU support*

    ### Tickets:
     - *CVS-126300*

commit 3e49c22ff76f55304ea2bb1a832fce8b2a04ea69
Author: Alexandra Sidorova <alexandra.sidorova@intel.com>
Date:   Wed Jul 31 15:24:23 2024 +0400

    [Snippets] Added auto sorting of LoopPorts (#25623)

    ### Details:
    - *Added support of expression enumeration - new attribute `m_exec_num`
    of `Expression`. Calculated as `exec_num_left + (exec_num_right -
    exec_num_left) / 2`. Now we can figure out which expression is executed
    earlier than another using `m_exec_num O(1)` instead of `find(begin(),
    end(), smth) == end() O(n)`*
    - *Refactored LoopInfo interface: united all `update` and `replace` into
    one `replace_with_new_ports`.*
    - *Added auto sorting of ports in LoopInfo: after port replacing, new
    expression/node insertion using helpers - loop ports are automatically
    reordered by expression execution numbers*
    - *Removed previous workarounds with `GetTopologicalOrder` from
    tokenization pass*
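
The midpoint rule quoted above can be sketched as follows. This is a simplified model, not the Snippets `Expression` class: the `Expr` struct, the `double` type for the execution number, and the helper names are all assumptions made for illustration.

```cpp
#include <iterator>
#include <list>

// Hypothetical stand-in for an expression that carries an execution number.
struct Expr {
    double exec_num;
};

// Midpoint rule: the new number lies between the neighbours' numbers,
// so existing expressions do not need to be renumbered on insertion
// (as long as floating-point precision allows a distinct midpoint).
double exec_num_between(double left, double right) {
    return left + (right - left) / 2;
}

// Insert a new expression before `pos`; both neighbours must exist.
std::list<Expr>::iterator insert_between(std::list<Expr>& exprs,
                                         std::list<Expr>::iterator pos) {
    const double left = std::prev(pos)->exec_num;
    const double right = pos->exec_num;
    return exprs.insert(pos, Expr{exec_num_between(left, right)});
}

// O(1) ordering check that replaces a linear search through the list.
bool executes_before(const Expr& a, const Expr& b) {
    return a.exec_num < b.exec_num;
}
```

With such numbers available, sorting loop ports by execution order reduces to an ordinary comparison-based sort keyed on `exec_num`.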

    ### Tickets:
     - *113536*
     - *142990*
     - *137819*

commit 89b49c10ca719505712b53cf44370dbdb3782fbc
Author: Karol Blaszczak <karol.blaszczak@intel.com>
Date:   Wed Jul 31 13:12:50 2024 +0200

    [DOCS] 24.3 archives and final touches (#25829)

    port: https://github.com/openvinotoolkit/openvino/pull/25828

commit f0d7cd8c22e2a994a4371cc5e15d6be33c9e6785
Author: Sebastian Golebiewski <sebastianx.golebiewski@intel.com>
Date:   Wed Jul 31 13:05:07 2024 +0200

    [DOCS] Updating Tool Ecosystem article (#25824)

    Adding information on OpenVINO-based AI projects.

    Co-authored-by: Maciej Smyk <maciejx.smyk@intel.com>
    Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>

commit ea6731f8a75b907eea1ee9317c2cd89a2d54e4c4
Merge: 70b8346d72 3c713d4aec
Author: Ujjayant Kadian <118752727+ujjayant-kadian@users.noreply.github.com>
Date:   Wed Jul 31 11:56:25 2024 +0100

    Merge branch 'master' into uk/changing-sub-byte-i4-element-order

commit 11c01898f507c1abb7d64d70f89ffcc281081373
Author: Roman Kazantsev <roman.kazantsev@intel.com>
Date:   Wed Jul 31 14:19:01 2024 +0400

    [TF FE] Support TensorListConcatV2 operation for multiple undefined dims in element_shape (#25814)

    **Details:** Support TensorListConcatV2 operation for multiple undefined
    dims in element_shape

    **Ticket:** 105671

    ---------

    Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

commit 3c713d4aec23c825baa71fd524f93140bc928ce9
Author: Chen Xu <chen.xu@intel.com>
Date:   Wed Jul 31 17:32:10 2024 +0800

    [CPU] Avoid rounding to zero for Reduce node in quantized models (#25766)

    ### Details:
    - *If the Reduce node has integer input and output precisions that come
    from the original model, then rounding to zero should be done before
    converting the intermediate floating-point value to integer.*
    - *However, if such integer precisions result from quantization,
    then we should not do such rounding, in order to maintain accuracy.*
     - *Add corresponding test cases.*

    ### Tickets:
     - *CVS-147352*
akladiev pushed a commit that referenced this pull request Aug 21, 2024
### Details:
- *The PR enables dynamic FP32 MHA tokenization on x64 platforms 🎉*
- *`std::vector.resize()`, which was used for buffer scratchpad
allocation, is a very expensive operation because it default-constructs the
elements. This PR replaces `std::vector.resize()` with CPU Node
Scratchpad memory, which can be shared between nodes. Also, since each
thread must have its own scratchpad memory, we used to allocate `size *
threads_max`; however, the thread count at execution time can be smaller
(it depends on the parallel work amount). Now we allocate only
`size * n_threads`, where `n_threads` is the real count of working threads.*
- *Fixed dimension K validation in the `BrgemmBlocking` pass: one of the inputs
can have a dynamic value of this dimension*
- *Fixed `utils::broadcast_merge_dim()` and supported broadcasting of
integer values in IterHandlers. Added unit tests for
`utils::broadcast_merge_dim()`*
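
The per-thread scratchpad layout described in the list above might look roughly like the sketch below. It is not the CPU plugin's Scratchpad API: the class `ThreadScratchpad` and its members are hypothetical, and a plain `new[]` stands in for the shared node memory. The sketch only shows one allocation of `size * n_threads` bytes, sliced by thread index and left uninitialized.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical per-thread scratchpad: one block of
// size_per_thread * n_threads bytes, sliced by thread index.
class ThreadScratchpad {
public:
    ThreadScratchpad(std::size_t size_per_thread, std::size_t n_threads)
        : m_size(size_per_thread),
          m_n_threads(n_threads),
          // new[] without "()" leaves the bytes uninitialized, which avoids
          // the per-element initialization done by std::vector::resize().
          m_data(new std::uint8_t[size_per_thread * n_threads]) {}

    // Start of the slice owned by the given thread.
    std::uint8_t* for_thread(std::size_t thread_idx) {
        return m_data.get() + thread_idx * m_size;
    }

    std::size_t total_size() const {
        return m_size * m_n_threads;
    }

private:
    std::size_t m_size;
    std::size_t m_n_threads;
    std::unique_ptr<std::uint8_t[]> m_data;
};
```

Sizing the block by the number of threads that actually run, rather than by the maximum thread count, is what reduces the allocation in the change described above.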

### Tickets:
 - *149900*


### Prerequisites:
- [x] #25326
- [x] #25378
- [x] #25623
- [x] #25638
- [x] #25745
- [x] #25957
- [x] #25733
Labels
category: CPU OpenVINO CPU plugin