@SS-JIA (Contributor) commented Sep 4, 2024

Stack from ghstack (oldest at bottom):

## Context

Currently, shaders must explicitly declare the binding slot that each layout binding binds to, e.g.

```
${layout_declare_tensor(0, "w", "t_out", DTYPE, STORAGE)}
${layout_declare_buffer(1, "r", "nchw_in", DTYPE)}
${layout_declare_ubo(2, "ivec4", "sizes")}
```

However, this can get a little tedious when many layout declarations are needed. This diff improves the situation by adding the `B` variable, which automatically increments the binding slot whenever a layout binding is declared. Now we can write

```
${layout_declare_tensor(B, "w", "t_out", DTYPE, STORAGE)}
${layout_declare_buffer(B, "r", "nchw_in", DTYPE)}
${layout_declare_ubo(B, "ivec4", "sizes")}
```

I may make a follow-up diff later on to change all layout declarations across the shaders in the codebase to use `B`.

Differential Revision: [D62210119](https://our.internmc.facebook.com/intern/diff/D62210119/)

## Context

Add a simple test to track the sizes of various important objects in the Vulkan compute graph API over time. The test uses some loose thresholds to alert when an object has grown unexpectedly large.
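For illustration only (not the test added in this diff), a size-tracking check of this kind can be expressed as loose `sizeof` assertions in GoogleTest; the struct names and thresholds below are hypothetical stand-ins for the real compute graph API objects:

```
#include <cstddef>
#include <cstdint>

#include <gtest/gtest.h>

// Hypothetical stand-ins for compute graph API objects; the real test would
// include the actual headers and check the real classes.
struct ExampleValueRef {
  uint32_t idx;
};
struct ExampleDispatchNode {
  // ... shader handle, parameters, bindings ...
};

TEST(VulkanObjectSizeTest, ObjectsStayWithinLooseThresholds) {
  // Loose upper bounds: the goal is to catch unexpected growth over time,
  // not to pin down exact sizes, so the thresholds leave generous headroom.
  EXPECT_LE(sizeof(ExampleValueRef), 16u);
  EXPECT_LE(sizeof(ExampleDispatchNode), 256u);
}
```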

Differential Revision: [D62144400](https://our.internmc.facebook.com/intern/diff/D62144400/)

[ghstack-poisoned]
## Context

Introduce the `SymInt` class, which allows symbolic integers to be represented in a Vulkan graph.

Please see the documentation comments of the `SymInt` class for more details on why the `Int` type is not sufficient for representing symbolic integers.
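
As a rough sketch of the idea (hypothetical names, not the actual API): a plain `Int` is a value copied into the graph when it is built, whereas a symbolic integer refers to shared, mutable state that can be updated after the graph is built (for example, when input sizes change) and re-read by everything that references it.

```
#include <cstdint>
#include <memory>

// Hypothetical sketch: a plain Int is just a value baked in at graph build time.
using Int = int64_t;

// A symbolic integer instead refers to shared state that may change later,
// e.g. when input sizes change; all users observe the updated value.
class SymIntSketch {
 public:
  explicit SymIntSketch(int32_t initial)
      : value_(std::make_shared<int32_t>(initial)) {}

  void set(int32_t v) { *value_ = v; }
  int32_t get() const { return *value_; }

 private:
  std::shared_ptr<int32_t> value_;
};
```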

Differential Revision: [D62144399](https://our.internmc.facebook.com/intern/diff/D62144399/)

[ghstack-poisoned]
## Context

Normally, tensor memory is planned during the export stage: tensors whose lifetimes do not overlap may share a memory allocation. Memory planning, however, requires knowledge of each tensor's lifetime.

Some complex operators cannot perform all of the necessary computations in a single shader, or their implementation may require temporary tensors to be created while the op executes. Since these temporary tensors are not visible to the memory planning algorithm, they are excluded from memory planning.

This diff introduces the `TmpTensorVRef` object, which facilitates memory sharing between temporary tensors. The design principle is that the lifetime of a temporary tensor is restricted to the execution of the op within which it is created; that knowledge can therefore be used to plan its memory. Please see the documentation comments of `TmpTensorVRef` for more details.
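
The principle can be illustrated with a minimal sketch (hypothetical names, not the actual implementation): a scope-bound reference acquires an allocation from a shared pool when a temporary tensor is created inside an op, and returns it to the pool when the reference goes out of scope, so temporaries from different ops can reuse the same allocations.

```
#include <cstddef>
#include <vector>

// Hypothetical sketch of scope-bound temporary-tensor memory reuse.
class TempAllocationPool {
 public:
  // Returns the index of a free allocation of at least `nbytes`,
  // creating a new one if none is available.
  size_t acquire(size_t nbytes) {
    for (size_t i = 0; i < allocs_.size(); ++i) {
      if (!allocs_[i].in_use && allocs_[i].nbytes >= nbytes) {
        allocs_[i].in_use = true;
        return i;
      }
    }
    allocs_.push_back({nbytes, true});
    return allocs_.size() - 1;
  }

  void release(size_t idx) { allocs_[idx].in_use = false; }

 private:
  struct Alloc {
    size_t nbytes;
    bool in_use;
  };
  std::vector<Alloc> allocs_;
};

// RAII reference: the temporary tensor's allocation is held only for the
// duration of the scope in which the op creates and uses it.
class TmpTensorRefSketch {
 public:
  TmpTensorRefSketch(TempAllocationPool& pool, size_t nbytes)
      : pool_(pool), idx_(pool.acquire(nbytes)) {}
  ~TmpTensorRefSketch() { pool_.release(idx_); }

  TmpTensorRefSketch(const TmpTensorRefSketch&) = delete;
  TmpTensorRefSketch& operator=(const TmpTensorRefSketch&) = delete;

  size_t allocation_index() const { return idx_; }

 private:
  TempAllocationPool& pool_;
  size_t idx_;
};
```

With this scheme, two ops executed one after another reuse the same allocation for their temporaries, since each op's reference is destroyed before the next op constructs its own.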

Differential Revision: [D62144398](https://our.internmc.facebook.com/intern/diff/D62144398/)

[ghstack-poisoned]

pytorch-bot bot commented Sep 4, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5091

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1de43d2 with merge base 0c78a9d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Sep 4, 2024
@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D62210119

Base automatically changed from gh/SS-JIA/67/head to main September 4, 2024 23:22
@facebook-github-bot merged commit d23548b into main Sep 5, 2024
36 of 38 checks passed
@facebook-github-bot deleted the gh/SS-JIA/68/head branch September 5, 2024 01:53
Labels: CLA Signed, fb-exported
