
Number proxies #250

Draft · jjsjann123 wants to merge 87 commits into main

Conversation

jjsjann123 (Collaborator)

Before submitting
  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes # (issue).

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@@ -595,6 +595,9 @@ def proxify(self, value: WrappedValue) -> Any:
        co: CACHE_OPTIONS = get_cache_option()
        if co is CACHE_OPTIONS.CONSTANT_VALUES:
            self.add_constraint((clang.check_tensor_shape_and_metadata, p_orig))
        elif co is CACHE_OPTIONS.SYMBOLIC_VALUES:
Collaborator
What is our medium-term plan w.r.t. default caching? If we need this for correctly handling #231, it would seem that symbolic values should be the default, but that in turn would mean that we want to have it work for our supported use cases.

jjsjann123 (Collaborator, Author)

I'm hoping that #231 won't need the whole enable-CACHE_OPTIONS.SYMBOLIC_VALUES thing. It looks like #231 has passed all CI, which feels promising.

The first step here is to get number proxies plumbed through; I think that might be helpful for #231.
In terms of dynamic shapes, we can slowly expand support.

We would start off by allowing scalar inputs as number proxies, while still requiring tensors to have constant shapes. I'm trying to figure out how/where to properly insert the prologue_trace guards. Right now I'm considering doing that from the executor, i.e. nvFuser would require reduction dim(s) to be baked in as constants, while torchex doesn't care.

I'll try to come up with a design doc for review.
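
To make the executor-driven guard idea concrete, here is a minimal sketch. None of these names are Thunder APIs (make_prologue_guard and its arguments are hypothetical); it only illustrates an nvFuser-style executor baking reduction dims in as constants and guarding on them, while a torchex-style executor accepts any runtime value:

# Hypothetical sketch only -- not Thunder's prologue_trace machinery.
def make_prologue_guard(requires_constant_dims: bool, baked_dims: tuple):
    def guard(*runtime_dims: int) -> bool:
        if not requires_constant_dims:
            return True  # torchex-style: any runtime dims are fine
        # nvFuser-style: cache hit only when the runtime dims match the
        # constants baked into the compiled fusion.
        return runtime_dims == baked_dims
    return guard

nvfuser_guard = make_prologue_guard(True, baked_dims=(0, 1))
torchex_guard = make_prologue_guard(False, baked_dims=())

assert nvfuser_guard(0, 1)      # same reduction dims -> reuse the fusion
assert not nvfuser_guard(1, 2)  # different dims -> recompile needed
assert torchex_guard(1, 2)      # torchex doesn't care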

jjsjann123 commented Apr 24, 2024

Just adding a note to myself: the nvfuserex trace looked a bit misleading:

import torch
from thunder.executors.torchex import no_autocast

@torch.no_grad()
@no_autocast()
def computation(a, i0, i1):
  # a: "cuda:0 f32[8, 16, 32]"
  # i0: "int 0"
  # i1: "int 1"
  [t2] = nvFusion0(a, i0, i1)
    # t2 = prims.sum(a, (i0, i1))  # t2: "cuda:0 f32[32]"
  del a, i0, i1
  return t2

It looks like it's taking dynamic reduction axes, but they're just baked in as constants.

Similarly, i0 and i1 are runtime values; the trace probably shouldn't annotate them as constant numbers, as it currently does:

# i0: "int 0"
# i1: "int 1"

Similarly, the output tensor t2 needs to show a symbolic shape.
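
Purely as an illustration of what I mean (this is not an existing Thunder trace format), the annotations could look more like this, with the runtime scalars left unvalued and a symbolic extent on the output:

def computation(a, i0, i1):
  # a: "cuda:0 f32[8, 16, 32]"
  # i0: "int"   (runtime value, not baked in)
  # i1: "int"
  [t2] = nvFusion0(a, i0, i1)
    # t2 = prims.sum(a, (i0, i1))  # t2: "cuda:0 f32[s0]"  (symbolic extent)
  del a, i0, i1
  return t2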

jjsjann123 added a commit to jjsjann123/lightning-thunder that referenced this pull request Jun 4, 2024
jjsjann123 (Collaborator, Author)

Note: this one is still ongoing.
