
OpInfo: diag_embed, diagonal #58642

Closed
wants to merge 25 commits

Conversation

krshrimali
Contributor

See: #54261.

@facebook-github-bot
Contributor

facebook-github-bot commented May 20, 2021

💊 CI failures summary and remediations

As of commit a65db01 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 1/2 non-scanned failure(s)

1 failure not recognized by patterns:

  Job: CircleCI pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_test2
  Step: Run tests

This comment was automatically generated by Dr. CI.

Comment on lines 2917 to 2925
    make_arg = partial(make_tensor, dtype=dtype, device=device, requires_grad=requires_grad)

    cases = ((S, S),)

    def generator():
        for shape in cases:
            yield SampleInput(make_arg(shape))

    return list(generator())

Contributor Author

Keeping it in this format to maintain consistency with the other OpInfos. It also keeps things flexible in case any additions to the cases are required.

Collaborator

@kshitij12345 left a comment

Nice!

However, looking at the docs and the signature of torch.diag_embed, we should probably add more cases covering all of the function's arguments (to improve coverage). What do you think?

Thanks!!

cc: @mruberry what do you think?

Docs: https://pytorch.org/docs/stable/generated/torch.diag_embed.html

@mruberry
Collaborator

Extending the sample inputs to test offset, dim1, and dim2 sounds like a great idea
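
For reference, a quick illustration of what those arguments do, based on the linked docs (an illustrative snippet, not code from the PR):

    import torch

    x = torch.randn(2, 3)

    # Default: the last dim of x becomes the diagonal of a new trailing 3x3 matrix.
    torch.diag_embed(x).shape                            # torch.Size([2, 3, 3])

    # offset shifts the diagonal; dim1/dim2 choose which output dims form the matrix.
    # With offset=1 the embedded matrices grow to 4x4, placed across dims 0 and 2,
    # so the remaining batch dim of size 2 lands in the middle.
    torch.diag_embed(x, offset=1, dim1=0, dim2=2).shape  # torch.Size([4, 2, 4])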

krshrimali changed the title from "OpInfo: diag_embed" to "OpInfo: diag_embed, diagonal" on May 20, 2021
krshrimali changed the title from "OpInfo: diag_embed, diagonal" to "[WIP] OpInfo: diag_embed, diagonal" on May 20, 2021
@krshrimali
Contributor Author

Updating this PR's scope after discussing with @kshitij12345: trying to merge the sample inputs of diag_embed with those of diagonal. Also looking into whether diag can be merged as well, to keep the functions neat and clean. I'll remove the WIP tag once all tests pass and the solution is ready for review.

krshrimali changed the title from "[WIP] OpInfo: diag_embed, diagonal" to "OpInfo: diag_embed, diagonal" on May 20, 2021
@krshrimali
Contributor Author

krshrimali commented May 20, 2021

Hi @kshitij12345 - this is ready for review. There are a couple of tests failing, which I think should be resolved after merging with master and a submodule update.

Also, I decided not to merge sample_inputs_diag with sample_inputs_diagonal (for diagonal, diag_embed) because diag requires vec_sample = SampleInput(make_tensor((M, ), ...)) as an input sample, which isn't accepted by diagonal and diag_embed. We could still do something like this, though:

def sample_inputs_diagonal_functions(op_info, device, dtype, requires_grad, limit_to_2d=False, supports_vec=True, **kwargs):
    make_arg = partial(make_tensor, dtype=dtype, device=device, requires_grad=requires_grad, low=None, high=None)
     
    tensors_2d = ( ... 2d tensors ... )
    tensors_3d = ( ... 3d tensors ... )
    args_2d = ( ... args 2d ... )
    args_3d = ( ... args 3d ... ) 
   
    if limit_to_2d:
        # True for diag
        tensors = [*product(tensors_2d, args_2d)]
        samples = [SampleInput(tensor, args=arg) for tensor, arg in tensors]
    else:
        # For diagonal, diag_embed
        tensors = [*product(tensors_2d, args_2d), *product(tensors_3d, args_3d)]
        samples = [SampleInput(tensor, args=arg) for tensor, arg in tensors]
    if supports_vec:
        # For diag, diag_embed
        return samples + [SampleInput(make_arg((M, )))]
    return samples

But I'm not sure how neat this is. (A sketch of how the flags would map onto the three ops follows the snippet below.) Note that:

  • diag only accepts tensors up to 2D (a matrix or a vector).
  • diag_embed accepts vector input (it accepts input of any dim).
  • diagonal doesn't accept vector input (it accepts any input with dim > 1).
>>> torch.diag(torch.randn(1,))
tensor([[-0.9189]])
>>> torch.diag_embed(torch.randn(1,))
tensor([[-0.5603]])
>>> torch.diagonal(torch.randn(1,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
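
Under that sketch, the three ops would hook into the shared helper roughly like this (hypothetical wiring, just to illustrate how the flags map onto the ops; the OpInfo entries below are illustrative, not the actual diff):

    # diag: only matrices/vectors, and a vector sample is valid
    OpInfo('diag',
           sample_inputs_func=partial(sample_inputs_diagonal_functions,
                                      limit_to_2d=True, supports_vec=True)),
    # diag_embed: any-dim input, vector sample is valid
    OpInfo('diag_embed',
           sample_inputs_func=partial(sample_inputs_diagonal_functions,
                                      limit_to_2d=False, supports_vec=True)),
    # diagonal: input must have dim > 1, so no vector sample
    OpInfo('diagonal',
           sample_inputs_func=partial(sample_inputs_diagonal_functions,
                                      limit_to_2d=False, supports_vec=False)),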

@krshrimali
Contributor Author

On rethinking, it might just be a good idea to merge all the sample inputs into one. It's okay to add more args as long as their names make sense. What do you think, @kshitij12345 @mruberry?

Here is how the sample_inputs func for diag, diag_embed, and diagonal could look:

def sample_inputs_diagonal_functions(op_info, device, dtype, requires_grad, limit_to_2d=False, supports_vec=True, **kwargs):
    make_arg = partial(make_tensor, dtype=dtype, device=device, requires_grad=requires_grad, low=None, high=None)

    # 2D Tensors
    tensors_2d = (
        make_arg((M, M), low=None, high=None),
        make_arg((3, 5), low=None, high=None),
        make_arg((5, 3), low=None, high=None),
    )

    # 3D Tensors
    tensors_3d = (
        make_arg((M, M, M), low=None, high=None),
    )

    args_2d = ((), (2,), (-2,), (1,), (2,))
    args_3d = ((1, 1, 2), (2, 0, 1), (-2, 0, 1))

    if limit_to_2d:
        # True for diag
        tensors = [*product(tensors_2d, args_2d)]
        samples = [SampleInput(tensor, args=arg) for tensor, arg in tensors]
    else:
        # For diagonal, diag_embed
        tensors = [*product(tensors_2d, args_2d), *product(tensors_3d, args_3d)]
        samples = [SampleInput(tensor, args=arg) for tensor, arg in tensors]
    if supports_vec:
        # For diag, diag_embed
        return samples + [SampleInput(make_arg((M, )))]
    return samples

Collaborator

@kshitij12345 left a comment

Looks good overall. Left a few comments!

Thanks!!

@@ -2913,6 +2913,28 @@ def sample_inputs_diag(op_info, device, dtype, requires_grad, **kwargs):

    return samples + [vec_sample]

def sample_inputs_diagonal(op_info, device, dtype, requires_grad, **kwargs):
    make_arg = partial(make_tensor, dtype=dtype, device=device, requires_grad=requires_grad, low=None, high=None)

Collaborator

def make_tensor(size, device: torch.device, dtype: torch.dtype, *, low=None, high=None,

You can skip passing low and high

Contributor Author

Yes, done, thanks!


    # 2D Tensors
    tensors_2d = (
        make_arg((M, M), low=None, high=None),

Collaborator

You can skip passing low and high

Contributor Author

Done, thanks!

    args_2d = ((), (2,), (-2,), (1,), (2,))
    args_3d = ((1, 1, 2), (2, 0, 1), (-2, 0, 1))

    tensors = [*product(tensors_2d, args_2d), *product(tensors_3d, args_3d)]

Collaborator

I think using product with already materialised tensors is not good.

I tried without the clone_inputs function in the patch, but it looks to be necessary, as the sample_inputs for tile/repeat use itertools.product.

Ref: #52135 (comment)

So I'd suggest taking the product of shape_2d and args_2d and then iterating over the product, materializing a unique tensor for each arg (like we usually do).

Not sure if this is still an issue.
NOTE: This will lead to more tensors being materialised than the current approach.
cc: @mruberry

Collaborator

We should prioritize each sample input having an independent set of tensors that, if modified, won't affect other sample inputs. So I agree that taking the product over the non-tensor constructor args is preferable, even though it will generate more tensors. Reusing tensors in the test suite is a challenge I don't think we want to take on (at least not now).

Contributor Author

This should be resolved in the recent commit.
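
For reference, the per-sample materialization pattern being agreed on here looks roughly like this (a sketch reusing the shapes/args from the snippets above, with partial, product, make_tensor, SampleInput, and M as available in common_methods_invocations.py; not the exact committed code):

    def sample_inputs_diagonal(op_info, device, dtype, requires_grad, **kwargs):
        make_arg = partial(make_tensor, dtype=dtype, device=device, requires_grad=requires_grad)

        # Take the product over *shapes* and args rather than over pre-built tensors,
        # then materialize a fresh tensor per sample so no two SampleInputs share storage.
        shapes_2d = ((M, M), (3, 5), (5, 3))
        args_2d = ((), (2,), (-2,), (1,))

        return [SampleInput(make_arg(shape), args=arg)
                for shape, arg in product(shapes_2d, args_2d)]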

           sample_inputs_func=sample_inputs_diagonal),
    OpInfo('diagonal',
           dtypes=all_types_and_complex_and(torch.bool, torch.bfloat16, torch.float16),
           supports_out=False,

Collaborator

Surprise! I thought method_tests always had entries for operators which supported autograd.

I wonder what they do for operators which don't support autograd.

cc: @mruberry

Collaborator

Where do you see no autograd support?

Collaborator

Ouch! I saw it as support_autograd. My bad 😅

        make_arg((M, M, M), low=None, high=None),
    )

    args_2d = ((), (2,), (-2,), (1,), (2,))

Collaborator

2 is repeated here.

Contributor Author

Sorry, it should be fixed in the recent commit.

@kshitij12345
Collaborator

Here is how the sample_inputs func for diag, diag_embed, diagonal can look like:

I think it is better to keep diag separate, as the operator looks fairly different from diag_embed and diagonal to me.

@kshitij12345
Collaborator

@krshrimali
diag already has an OpInfo (so no need to worry about that).

    OpInfo('diag',
           dtypes=all_types_and_complex_and(torch.bool),
           dtypesIfCPU=all_types_and_complex_and(torch.bool),
           dtypesIfCUDA=all_types_and_complex_and(torch.bool, torch.half, torch.bfloat16),
           sample_inputs_func=sample_inputs_diag),

@krshrimali
Contributor Author

krshrimali commented May 20, 2021

Yes, thanks @kshitij12345. I'm aware of this; I was just wondering whether we should merge the sample_inputs functions for all 3. Looks like it's okay for them to stay separate, per your input. :)

Collaborator

@kshitij12345 left a comment

LGTM!
I have left a few minor nits.

Thanks!

    return samples

    def generator():
        for shapes, args in zip([shapes_2d, shapes_3d], [args_2d, args_3d]):
            for shape, arg in product(shapes, args):

Collaborator

@kshitij12345 May 20, 2021

minor:
How about

for shape, arg in chain(product(shapes_2d, args_2d), product(shapes_3d, args_3d)):

Contributor Author

Nice!!

@@ -2913,6 +2913,25 @@ def sample_inputs_diag(op_info, device, dtype, requires_grad, **kwargs):

    return samples + [vec_sample]

def sample_inputs_diagonal(op_info, device, dtype, requires_grad, **kwargs):

Collaborator

minor nit: sample_inputs_diagonal_diag_embed

Contributor Author

Makes sense! :)
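
Putting the review suggestions together (chain over the 2D and 3D products, fresh tensors per sample, and the sample_inputs_diagonal_diag_embed name), the helper might read roughly like this (a sketch based on the snippets above, assuming chain/product/partial and the usual test helpers are in scope; not the literal merged code):

    def sample_inputs_diagonal_diag_embed(op_info, device, dtype, requires_grad, **kwargs):
        make_arg = partial(make_tensor, dtype=dtype, device=device, requires_grad=requires_grad)

        shapes_2d = ((M, M), (3, 5), (5, 3))
        shapes_3d = ((M, M, M),)
        args_2d = ((), (2,), (-2,), (1,))
        args_3d = ((1, 1, 2), (2, 0, 1), (-2, 0, 1))

        def generator():
            # chain flattens the 2D and 3D (shape, args) products into one stream
            for shape, arg in chain(product(shapes_2d, args_2d), product(shapes_3d, args_3d)):
                yield SampleInput(make_arg(shape), args=arg)

        return list(generator())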

krshrimali requested a review from mruberry on May 20, 2021 18:08

Collaborator

@mruberry left a comment

Nice work, @krshrimali, and thank you for reviewing, @kshitij12345!

This just needs a rebase; ping me when the tests are passing, @krshrimali

@facebook-github-bot
Contributor

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@krshrimali
Contributor Author

Thanks, @mruberry! I'm not sure the failing test is caused by this PR: test_nccl_high_priority_stream is failing, and it doesn't look relevant. Do you think this is ready to be merged?

@facebook-github-bot
Contributor

@mruberry merged this pull request in b9d1ad9.

deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary:
See: pytorch#54261.

Pull Request resolved: pytorch#58642

Reviewed By: ngimel

Differential Revision: D28627226

Pulled By: mruberry

fbshipit-source-id: b96fa8410bd53937ddb72a46c02b949691ee9458
github-actions bot deleted the diag-embed branch on February 11, 2024 01:59