Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][spirv] Reset LocalWorkSize when the device is changed #346

Merged
merged 1 commit into from
Mar 7, 2024

Conversation

jjfumero
Copy link
Member

@jjfumero jjfumero commented Mar 7, 2024

Description

This PR solves the problem of getting the wrong local work group sizes when the device is changed from the CPU using the OpenCL backend to the GPU using the SPIR-V backend.

Problem description

The problem was that, in the SPIR-V backend, once the number of blocks is computed, is then stored in the TaskMetaData object. However, while the global work size for the same task will not change, there is no guarantee that the local work group size is the same. This patch solves the issue by resetting the local work size only when the device is updated.

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • OSx
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

The use case for this is the task-migration in Client/Server setup:

https://github.com/jjfumero/tornadovm-examples?tab=readme-ov-file#live-task-migration-client-server-app

## Run Server in one terminal
./runServer.sh

## Client in another terminal
./runClient.sh  ## Change device during runtime 

## Note: the application selects the backend 0:0 (default backend)

# type different <backend:deviceNumber> version from the client. 
# Examples:
# 0:1 
# 1:0 
## etc

@jjfumero jjfumero added OpenCL spirv fix Provides a fix labels Mar 7, 2024
@jjfumero jjfumero self-assigned this Mar 7, 2024
Copy link
Collaborator

@mairooni mairooni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@stratika stratika left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jjfumero jjfumero merged commit 0edc2ae into beehive-lab:develop Mar 7, 2024
2 checks passed
@jjfumero jjfumero deleted the fix/scheduler/cpu branch March 7, 2024 09:17
jjfumero added a commit to jjfumero/TornadoVM that referenced this pull request Mar 27, 2024
Improvements
~~~~~~~~~~~~~~~~~~

- `beehive-lab#344 <https://github.com/beehive-lab/TornadoVM/pull/344>`_: Support for Multi-threaded Execution Plans.
- `beehive-lab#347 <https://github.com/beehive-lab/TornadoVM/pull/347>`_: Enhanced examples.
- `beehive-lab#350 <https://github.com/beehive-lab/TornadoVM/pull/350>`_: Obtain internal memory segment for the Tornado Native Arrays without the object header.
- `beehive-lab#357 <https://github.com/beehive-lab/TornadoVM/pull/357>`_: API extensions to query and apply filters to backends and devices from the ``TornadoExecutionPlan``.
- `beehive-lab#359 <https://github.com/beehive-lab/TornadoVM/pull/359>`_: Support Factory Methods for FFI-based array collections to be used/composed in TornadoVM Task-Graphs.

Compatibility
~~~~~~~~~~~~~~~~~~

- `beehive-lab#351 <https://github.com/beehive-lab/TornadoVM/pull/351>`_: Compatibility of TornadoVM Native Arrays with the Java Vector API.
- `beehive-lab#352 <https://github.com/beehive-lab/TornadoVM/pull/352>`_: Refactor memory limit to take into account primitive types and wrappers.
- `beehive-lab#354 <https://github.com/beehive-lab/TornadoVM/pull/354>`_: Add DFT-sample benchmark in FP32.
- `beehive-lab#356 <https://github.com/beehive-lab/TornadoVM/pull/356>`_: Initial support for Windows 11 using Visual Studio Development tools.
- `beehive-lab#361 <https://github.com/beehive-lab/TornadoVM/pull/361>`_: Compatibility with the SPIR-V toolkit v0.0.4.
- `beehive-lab#366 <https://github.com/beehive-lab/TornadoVM/pull/363>`_: Level Zero JNI Dependency updated to 0.1.3.

Bug Fixes
~~~~~~~~~~~~~~~~~~

- `beehive-lab#346 <https://github.com/beehive-lab/TornadoVM/pull/346>`_: Computation of local-work group sizes for the Level Zero/SPIR-V backend fixed.
- `beehive-lab#360 <https://github.com/beehive-lab/TornadoVM/pull/358>`_: Fix native tests to check the JIT compiler for each backend.
- `beehive-lab#355 <https://github.com/beehive-lab/TornadoVM/pull/355>`_: Fix custom exceptions when a driver/device is not found.
@jjfumero jjfumero mentioned this pull request Mar 27, 2024
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants