Test memory use of NeMo model with out-of-place add instead of the in-place variant
Wondering how we are going to evaluate this one.
That is, if we are just using eager mode for the time being, the in-place version does save memory, but that might not translate to a compiled program, since we might have a chance to fuse the out-of-place operation.
Yeah, it's definitely not an apples-to-apples comparison.
But actually that's not the comparison we're interested in here. Here we'd want to evaluate NeMo alone (i.e. without Thunder) using in-place ops vs. NeMo alone using out-of-place ops.
While we can always modify any network to use Thunder (and this is the approach I would recommend to most users!), NeMo isn't going to adopt Thunder instantly. First we need to show some value to NeMo.
Until then, the NeMo team is happy to consider changes that make networks more amenable to Thunder, but only if they improve or at least do not regress non-Thunder-based use cases.
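For reference, here is a minimal sketch of the eager-mode comparison described above, using plain PyTorch with no Thunder involved. The helper names (`add_out_of_place`, `add_in_place`, `peak_extra_bytes`) are hypothetical and not part of NeMo or Thunder, and the tensor shape is arbitrary. It only illustrates the measurement on a single op; a real evaluation would have to run the actual NeMo model.

```python
import torch


def add_out_of_place(x, y):
    # Allocates a new tensor to hold the result.
    return x + y


def add_in_place(x, y):
    # Writes the result into x's existing storage and returns x.
    return x.add_(y)


def peak_extra_bytes(fn, x, y):
    """Peak CUDA bytes allocated above the starting level while running fn(x, y)."""
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    baseline = torch.cuda.memory_allocated()
    fn(x, y)
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() - baseline


device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.ones(1024, 1024, device=device)
y = torch.ones(1024, 1024, device=device)

# The caching-allocator statistics are only available on CUDA devices.
if device == "cuda":
    print("out-of-place peak extra bytes:", peak_extra_bytes(add_out_of_place, x, y))
    # Clone x first so the in-place run does not modify the tensor reused below.
    print("in-place peak extra bytes:", peak_extra_bytes(add_in_place, x.clone(), y))

# On any device, the storage pointers show whether the result reuses x's memory.
out = add_out_of_place(x, y)
out_aliases_x = out.data_ptr() == x.data_ptr()
res = add_in_place(x, y)
res_aliases_x = res.data_ptr() == x.data_ptr()
print("out-of-place result aliases x:", out_aliases_x)
print("in-place result aliases x:", res_aliases_x)
```

On a machine without CUDA the sketch skips the peak-memory numbers and only reports whether each result shares storage with its input.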
Gotcha. That makes sense. I guess that's also why the last item doesn't have a corresponding Thunder issue. Sorry for the noise.
🚀 Feature
NeMo's "MegatronLatentDiffusion" model implements Stable Diffusion
Initial examine: Found 37 distinct operations, of which 30 (81.1%) are supported
Motivation
Pitch
Work items
mse_loss #174
TensorBase.is_contiguous #172
Tensor.type #177
Test memory use of NeMo model with out-of-place add instead of the in-place variant

cc @tfogal