🚀 The feature, motivation and pitch
I am working on Graphs. Right now I have a model running that takes a subgraph and does some predictions.
To improve throughput I want to batch multiple subgraphs of different sizes together.
Padding them to the same size does not work in my case: I use an aggregation operation where I don't want to aggregate the padded neighbours, and masking out the padded neighbours is not possible.
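To illustrate the padding problem (a toy sketch with made-up numbers, not code from this issue): zero-padding changes the result of a mean aggregation, because the padded entries are counted as real neighbours:

```python
import torch

# Two real neighbour features for one node.
neighbors = torch.tensor([1.0, 3.0])
true_mean = neighbors.mean()  # mean over the real neighbours only

# Zero-padding the neighbourhood to a fixed size of 4.
padded = torch.tensor([1.0, 3.0, 0.0, 0.0])
padded_mean = padded.mean()   # the padded zeros skew the mean downwards
```

Without a way to mask the padded entries inside the aggregation, the two results differ, which is why variable-size batching (nested tensors) is attractive here.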
I tried modifying my model to support nested tensors as input, which somewhat worked, but I had to cut out some unsupported operations, specifically layer_norm.
Also, there are currently no supported loss functions, so a cross_entropy or nll_loss (and log_softmax) that supports nested tensors would be a big usability upgrade.
Also, some error messages related to nested tensors still point to https://github.com/pytorch/nestedtensor, which I suspect is no longer correct since nested tensors were moved into core.
Alternatives
I tried implementing layer_norm myself using the currently supported nested ops, but was not successful.
The issue is the "a/sqrt(b)" calculation, which I could not get to work without .pow() or element-wise division of two nested tensors.
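One possible fallback (a sketch, not the implementation attempted above, and assuming hypothetical subgraph sizes and a feature dimension of 8): unbind the nested tensor, apply the regular layer_norm to each subgraph separately, and re-nest the results. This sidesteps the missing nested .pow()/division at the cost of a Python-level loop:

```python
import torch
import torch.nn.functional as F

# Hypothetical nested input: two subgraphs with 4 and 6 nodes, feature dim 8.
nt = torch.nested.nested_tensor([torch.randn(4, 8), torch.randn(6, 8)])

# Normalize each constituent tensor on its own, then re-nest.
normed = torch.nested.nested_tensor(
    [F.layer_norm(t, normalized_shape=(8,)) for t in nt.unbind()]
)
```

Since layer_norm normalizes over the last (feature) dimension only, the per-subgraph results match what a native nested layer_norm would produce; the loop just makes it slow for large batches.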
For the loss function I can work around it by unbinding the output nested tensor and stacking the results, but this is very ugly.
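The workaround looks roughly like this (a sketch with hypothetical shapes: one logits vector per subgraph, all with the same class count, so stacking is valid):

```python
import torch
import torch.nn.functional as F

# Hypothetical per-subgraph logits: 2 subgraphs, 3 classes each.
out = torch.nested.nested_tensor([torch.randn(3), torch.randn(3)])
targets = torch.tensor([0, 2])

# Leave nested-tensor land before computing the loss:
logits = torch.stack(out.unbind())      # regular [B, C] tensor
loss = F.cross_entropy(logits, targets)
```

A cross_entropy/nll_loss that accepted nested tensors directly would remove this round-trip through regular tensors.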
Additional context
No response