Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NumberProxy is no Number #272

Closed
jjsjann123 opened this issue Apr 25, 2024 · 8 comments · Fixed by #286
Closed

NumberProxy is no Number #272

jjsjann123 opened this issue Apr 25, 2024 · 8 comments · Fixed by #286
Labels

Comments

@jjsjann123
Copy link
Collaborator

🚀 Feature

Derivatives of NumberProxy should not inherit from Number directly.

Motivation

It's error prone to have derivatives of NumberProxy identified as Number at the same time. We would want to identify NumberProxy as dynamic value, so interpreter would know not to bake in numbers or accidentally constant fold things away.

Pitch

We should remove the inheritance of float from FloatProxy here:

class FloatProxy(NumberProxy, float):

Same thing would apply to IntegerProxy / ComplexProxy. After this change, we will be able to distinguish NumberProxy from constant number in the program.

Alternatives

Alternative: we'll have to update existing checks where isinstance(x, Number) to not isinstance(x, NumberProxy) and isinstance(x, Number) to properly identify runtime values. The worse part is, if we miss anything in the code base, it'll ended up as an unintended behavior (like bake in static number) and possibly a silent wrong result.

comparing to the proposed approach here, if we change the inheritance hierarchy, existing legacy check isinstance(x, Number) would return false and we would NOT treating a NumberProxy as a regular compile time constant. This likely would lead us to some different code path where we'll try to use it as a TensorProxy instead, that'll give us an exception, which is much easier to patch.

Additional context

This is an incremental work supporting #262

@t-vi
Copy link
Collaborator

t-vi commented Apr 25, 2024

Makes total sense to me to follow TensorProxys here. We should think about what behaviour we want to show externally and implement that in jit_ext.py, e.g. TensorProxy instances report isinstance(t, torch.Tensor) as True:

def _general_jit_isinstance_lookaside(obj: Any, cls: type | UnionType | tuple[type | UnionType]):
uobj = unwrap(obj)
ucls = unwrap(cls)
if isinstance(uobj, TensorProxy):
res = issubclass(torch.Tensor, ucls)

@mruberry
Copy link
Collaborator

Removing the base types could require developers think carefully about their code and might remove the opportunity for some errors. I see a couple challenges with actually doing this:

  • a lot of code is ignorant to the distinction between a number and a numberproxy, just like it's ignorant to the split between a tensor and a tensorproxy. Code in the torch language definition and the clang language definition, for example. I wouldn't want to complicate writing operations in those languages. Maybe splitting these two things wouldn't actually complicate it, however?
  • we'll need to add explicit conversions from NumberProxy to the corresponding Python number type if we pass a NumberProxy to an opaque callable — but maybe we want to do that anyway?

There is already some code today that explicitly reasons about NumberProxies vs. Numbers, and it just queries for an object being a NumberProxy first. I guess what motivates this is that there'd be no opportunity to mistakenly identify a NumberProxy as a Number, and all the conversions to a Number would have to be explicit?

@jjsjann123
Copy link
Collaborator Author

  • a lot of code is ignorant to the distinction between a number and a numberproxy, just like it's ignorant to the split between a tensor and a tensorproxy. Code in the torch language definition and the clang language definition, for example. I wouldn't want to complicate writing operations in those languages. Maybe splitting these two things wouldn't actually complicate it, however?

That's a good point and I don't know the answer there yet. I would hope that it's just going to be some type check changes, which is what I'm pushing for in this issue anyway.
What would be a good way to check how much this would come into play? I'm planning to just play with some torch functions and see if anything shows up.

  • we'll need to add explicit conversions from NumberProxy to the corresponding Python number type if we pass a NumberProxy to an opaque callable — but maybe we want to do that anyway?

This sounds like a very reasonable thing to add.

There is already some code today that explicitly reasons about NumberProxies vs. Numbers, and it just queries for an object being a NumberProxy first. I guess what motivates this is that there'd be no opportunity to mistakenly identify a NumberProxy as a Number, and all the conversions to a Number would have to be explicit?

Yes that's the intention. It's too easy for existing code base to accidentally treat NumberProxy as a number.

@mruberry
Copy link
Collaborator

Sounds good! I think trying it and seeing what breaks would be a good start

@jjsjann123
Copy link
Collaborator Author

In the prototype PR #262 I managed to pass the CI at least. But there's a few hacky things I did that I'm not super happy with.

  1. I'm currently just spamming isinstance(x, (Number, NumberProxy)) in places where we have isinstance(x, Number) to make sure that we pass all tests. So on paper this isn't much different from what @mruberry suggested earlier on having checks as isinstance(x, NumberProxy) before isinstance(x, Number). But I still argue that it's cleaner to have the two not mixed-up.

  2. Since we are now going to have NumberProxy in the trace, we need to fill up how we'd propagate NumberProxy's meta info in the trace. It's done very ugly at this point and I'm sure there are so many rules I wasn't propagating properly. (currently doing that in NumberProxy)

  3. NumberProxy showing up in the trace exposed some holes in our grad rule. This isn't that big of a deal and we'll just need to patch those slowly.

Two trickier topic:

  1. how do we handle number checks in torch/__init__.py? when numbers are static, it's a straightforward check. But do we want to impose and bury that check in the trace when we have dynamic input? Overhead might become a problem in the future?

  2. I see a hint of memory usage increase, since I have to change the memory check here.
    I'm also seeing OOM failure on CI. But not sure if those are just flaky test or real issue.
    But we are only adding number in the trace, I'm surprised that I have to touch anything related to memory foot-print.

@mruberry
Copy link
Collaborator

  1. how do we handle number checks in torch/__init__.py? when numbers are static, it's a straightforward check. But do we want to impose and bury that check in the trace when we have dynamic input? Overhead might become a problem in the future?

What about supporting isinstance(x, NumberClasses), and x: NumberLike?

  1. I see a hint of memory usage increase, since I have to change the memory check here.
    I'm also seeing OOM failure on CI. But not sure if those are just flaky test or real issue.
    But we are only adding number in the trace, I'm surprised that I have to touch anything related to memory foot-print.

It would also be very surprising to me if these changes were causing OOMs because they shouldn't use any device memory

@jjsjann123
Copy link
Collaborator Author

  1. how do we handle number checks in torch/__init__.py? when numbers are static, it's a straightforward check. But do we want to impose and bury that check in the trace when we have dynamic input? Overhead might become a problem in the future?

What about supporting isinstance(x, NumberClasses), and x: NumberLike?

I think you mean NumberClasses = (Number, NumberProxy) and NumberLike = Number | NumberProxy?

It's common to have something like utils.check(number_variable >= 0). If we don't distinguish the two, I think you are suggesting that we should just treat NumberProxy and Number consistently and have the existing checks to be part of the trace (whether it's prologue or compute)?

@mruberry
Copy link
Collaborator

  1. how do we handle number checks in torch/__init__.py? when numbers are static, it's a straightforward check. But do we want to impose and bury that check in the trace when we have dynamic input? Overhead might become a problem in the future?

What about supporting isinstance(x, NumberClasses), and x: NumberLike?

I think you mean NumberClasses = (Number, NumberProxy) and NumberLike = Number | NumberProxy?

Yep! We're on the same page.

It's common to have something like utils.check(number_variable >= 0). If we don't distinguish the two, I think you are suggesting that we should just treat NumberProxy and Number consistently and have the existing checks to be part of the trace (whether it's prologue or compute)?

I think so? This might be best to review over a VC with a shared Colab notebook!

jjsjann123 added a commit that referenced this issue May 16, 2024
Fixes #272.

Changes in this PR:

Removing the inheritance of Number kinds in NumberProxy derivatives. The motivation is to avoid mistakenly identify a NumberProxy as Numbers. Specifically, we are removing complex / int / float from the inheritance of ComplexProxy / IntegerProxy / FloatProxy.
changing existing checks from isinstance(x, Number) to isinstance(x, (Number, NumberProxy)).
Handling proper type promotion in NumberProxy's binary/unary operations.

Co-authored-by: Thomas Viehmann <tv@beamnet.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants