
bfloat16 cannot utilize some codes #114
Closed

AmitMY opened this issue Mar 11, 2024 · 6 comments
AmitMY commented Mar 11, 2024

When using FSQ with [8, 5, 5, 5] levels and bfloat16 training specified in pytorch-lightning, the codebook utilization plateaus just below 50%, whereas with float32 training it approaches 100%.

I don't know whether this is an issue with the implementation or an inherent limitation of FSQ; either way, I would suggest that this library force float32 for the quantization step.

Example:

```python
import torch

# bfloat16 keeps only 8 significant bits, so integers above 256 are not all
# exactly representable: neighboring values collapse onto the same code
torch.tensor([1000, 1001, 1002, 1003], dtype=torch.bfloat16).to(torch.int32)
# tensor([1000, 1000, 1000, 1004], dtype=torch.int32)
```
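A rough sketch of the suggested workaround (my guess at the idea, not the library's actual code, and omitting the offset that real FSQ applies for even level counts): upcast to float32 before bounding and rounding, then cast back.

```python
import torch

def fsq_quantize_fp32(z: torch.Tensor, levels: torch.Tensor) -> torch.Tensor:
    # hypothetical helper: do FSQ bounding + rounding in float32 so that
    # half-precision inputs do not collapse neighboring codes
    orig_dtype = z.dtype
    z = z.float()                            # upcast so round() is exact
    half_l = (levels.float() - 1) * 0.5      # half-width per dimension
    bounded = torch.tanh(z) * half_l         # squash into [-half_l, half_l]
    codes = torch.round(bounded)             # snap to the nearest level
    return (codes / half_l).to(orig_dtype)   # normalize, restore input dtype

z = torch.randn(2, 4, dtype=torch.bfloat16)
out = fsq_quantize_fp32(z, torch.tensor([8, 5, 5, 5]))
```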

AmitMY added a commit to AmitMY/vector-quantize-pytorch that referenced this issue Mar 16, 2024
lucidrains added a commit that referenced this issue Mar 19, 2024
lucidrains (Owner) commented
@AmitMY hey Amit! i put in a quick fix in 1.14.4

curious how well FSQ is performing for you otherwise. are you training an autoencoder?
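For context, basic FSQ usage with this library looks roughly like this (adapted from the README; the input width must equal len(levels), and the shapes here are illustrative):

```python
import torch
from vector_quantize_pytorch import FSQ

quantizer = FSQ(levels = [8, 5, 5, 5])

x = torch.randn(1, 1024, 4)   # last dim matches the number of levels
xhat, indices = quantizer(x)  # quantized output and integer codes
```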

AmitMY (Author) commented Mar 23, 2024

Hi! I was waiting for some compute to try this, but it actually fails (the network is now BFloat16, while the input is cast to float):

```
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
```
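If I understand the failure correctly, this is roughly what happens (a hypothetical minimal repro, not the library's code):

```python
import torch
import torch.nn as nn

# the quantizer now upcasts its output to float32, but the next
# layer's weights are still bfloat16, so the matmul dtypes disagree
proj = nn.Linear(4, 4).to(torch.bfloat16)
z = torch.randn(2, 4, dtype=torch.bfloat16)
z = z.float()  # what the quick fix effectively does to the activations
proj(z)        # RuntimeError: mat1 and mat2 must have the same dtype

# casting back after quantization, e.g. z.to(orig_dtype), would avoid this
```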

FSQ is performing amazingly well for me. Basically 100% codebook util, and the autoencoder can predict the input very well. I did have to normalize my data, but once that was done it was smooth sailing.

lucidrains (Owner) commented

@AmitMY besides code utilization, have you tried running it against regular VQ as an ablation / to compare?

AmitMY (Author) commented Mar 29, 2024

I only tried regular VQ at the beginning, saw that FSQ was better/more stable for my problem, and then scaled up the data/model size - so no, for my current problem I did not fully compare FSQ and VQ.

lucidrains (Owner) commented

@AmitMY ah got it, no biggie. just curious

lucidrains (Owner) commented Apr 16, 2024

@AmitMY finally had the chance to train FSQ myself yesterday evening and wow, it works great! so much more stable than VQ
