More Nested Tensor Functionality (layer_norm, cross_entropy / log_softmax & nll_loss) #99142

@Foisunt

Description

🚀 The feature, motivation and pitch

I am working on graphs. Right now I have a model running that takes a subgraph and makes predictions for it.
To improve throughput I want to batch multiple subgraphs of different sizes together.
Padding them to a common size does not work in my case: I use an aggregation operation in which the padded neighbours must not be included, and masking out the padded neighbours is not possible.
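A rough sketch of the batching I have in mind (the shapes and the 64-dim node features are made up for illustration):

```python
import torch

# Per-subgraph node features with varying node counts; padding these to a
# common size would pull padded neighbours into the aggregation.
feats = [torch.randn(n, 64) for n in (5, 12, 7)]
batch = torch.nested.nested_tensor(feats)  # one batch, no padding
print(batch.size(0), batch.is_nested)      # 3 True
```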

I tried modifying my model to accept nested tensors as input, which somewhat worked, but I had to cut out some unsupported operations, specifically layer_norm.
There are also currently no loss functions that support nested tensors, so a cross_entropy or nll_loss (and log_softmax) that accepts nested tensors would be a big usability upgrade.
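For illustration, this is the kind of code I would like to be able to write (shapes made up; as of this issue, both calls fail on nested input):

```python
import torch
import torch.nn.functional as F

nt = torch.nested.nested_tensor([torch.randn(5, 64), torch.randn(12, 64)])
normed = F.layer_norm(nt, normalized_shape=(64,))  # currently unsupported for nested tensors
log_p = F.log_softmax(nt, dim=-1)                  # likewise unsupported
```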

Also, some error messages related to nested tensors point to https://github.com/pytorch/nestedtensor, which I suspect is no longer correct since nested tensors were moved into core.

Alternatives

I tried implementing layer_norm myself using the currently supported nested ops, but was not successful.
The sticking point is the "a / sqrt(b)" calculation, which I could not express without .pow() or element-wise division of two nested tensors.
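Roughly what I tried, as a minimal sketch (the function name and eps are mine; on a dense tensor it matches F.layer_norm, and the marked lines are exactly where it breaks for nested input):

```python
import torch
import torch.nn.functional as F

def my_layer_norm(x, eps=1e-5):
    mean = x.mean(dim=-1, keepdim=True)
    var = (x - mean).pow(2).mean(dim=-1, keepdim=True)  # .pow() fails on nested tensors
    return (x - mean) / torch.sqrt(var + eps)           # nested / nested division fails too

# sanity check on a dense tensor
t = torch.randn(4, 64)
print(torch.allclose(my_layer_norm(t), F.layer_norm(t, (64,)), atol=1e-5))
```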

For the loss function I can work around it by unbinding and stacking the output nested tensors, but this is very ugly.
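Sketched out, the workaround looks like this (it assumes every subgraph ends up with one fixed-shape logit vector after aggregation, so stacking applies):

```python
import torch
import torch.nn.functional as F

def loss_via_unbind(nt_logits, targets):
    # nested output -> list of per-subgraph tensors -> regular (B, C) batch
    logits = torch.stack(list(nt_logits.unbind()))
    return F.cross_entropy(logits, targets)

nt_logits = torch.nested.nested_tensor([torch.randn(10) for _ in range(4)])
targets = torch.randint(0, 10, (4,))
print(loss_via_unbind(nt_logits, targets))
```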

Additional context

No response

cc @cpuhrsch @jbschlosser @bhosmer @drisspg

Labels

module: nestedtensor (NestedTensor tag, see issue #25032), topic: new features, triaged
