Conversation

Narsil
Contributor

@Narsil Narsil commented Sep 13, 2022

While investigating performance improvements, the main culprit I found was autocast, which actually adds a LOT of tensor copies all over the place in order to cast to float16. For now I have added a lot of custom-made casts in an off-branch.

This PR checks in one of those changes: removing this .float() cast, which breaks in fp16 mode (because the model is in fp16, while the tensor is now in float32).
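
For context, here is a minimal sketch of the kind of pattern being discussed; the norm module, tensor shapes, and device handling are illustrative stand-ins rather than the exact diffusers code, and it assumes half-precision LayerNorm is available on the chosen device (always true on CUDA, and on recent PyTorch CPU builds):

import torch
import torch.nn as nn

# Hypothetical stand-in for a diffusers block: fp16 weights, fp16 activations.
device = "cuda" if torch.cuda.is_available() else "cpu"
norm = nn.LayerNorm(64).to(device=device, dtype=torch.float16)
hidden_states = torch.randn(2, 77, 64, device=device, dtype=torch.float16)

# The pattern this PR removes: the explicit upcast yields a float32 tensor,
# which mismatches the fp16 weights outside autocast; under autocast it only
# works because autocast inserts extra casts/copies around the call.
# hidden_states = norm(hidden_states.float()).type(hidden_states.dtype)

# Without the .float(), normalization runs directly in the model's dtype and
# no extra copies are introduced.
hidden_states = norm(hidden_states)
print(hidden_states.dtype)  # torch.float16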

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 13, 2022

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten
Contributor

Just checked. You're 100% right, @Narsil: .float() makes no difference with autocast. I checked with both fp32 and fp16 weights; identical results.
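
For reference, one way such a check might look (hypothetical module and shapes; it needs a CUDA device so that autocast runs in fp16):

import torch
import torch.nn as nn

if torch.cuda.is_available():
    norm = nn.LayerNorm(64).cuda()  # fp32 weights
    x = torch.randn(2, 77, 64, device="cuda", dtype=torch.float16)

    with torch.autocast("cuda", dtype=torch.float16):
        with_cast = norm(x.float())   # explicit upcast, as in the old code
        without_cast = norm(x)        # let autocast pick the dtype itself

    # Autocast already decides the dtype layer_norm runs in, so the explicit
    # .float() is expected to change nothing.
    print(torch.equal(with_cast, without_cast))  # expected: True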

Feel free to merge whenever :-)

@Narsil Narsil merged commit 7c4b38b into huggingface:main Sep 14, 2022
@NouamaneTazi
Member

NouamaneTazi commented Sep 19, 2022

Sorry to re-open this, but another way to solve this is by using

# make sure hidden states is in float32
# when running in half-precision
hidden_states = self.norm2.float()(hidden_states.float()).type(hidden_states.dtype)

which respects the comment above saying that normalization must be run in fp32 (I don't know whether the comment is true, though).
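
As a side note, nn.Module.float() casts the module's parameters in place and returns the module itself, so after this line norm2 keeps fp32 parameters for subsequent calls. A small sketch with a stand-alone LayerNorm as a stand-in for self.norm2:

import torch
import torch.nn as nn

norm2 = nn.LayerNorm(64).half()  # half-precision model weights
hidden_states = torch.randn(2, 77, 64, dtype=torch.float16)

# Upcast both the module parameters and the input, run the normalization in
# float32, then cast the result back to the original activation dtype.
out = norm2.float()(hidden_states.float()).type(hidden_states.dtype)

print(out.dtype)           # torch.float16 -- cast back for the rest of the model
print(norm2.weight.dtype)  # torch.float32 -- .float() modified norm2 in place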

My PR #371 uses this fix, for example, and it works fine.

@patrickvonplaten @Narsil
