fix dimensions: the codebook must look at data by taking each time fr… #5

Closed
wants to merge 1 commit into from

Conversation

wesbz
Contributor

@wesbz wesbz commented Oct 26, 2021

…ame individually.
From the SoundStream article: "This vector quantizer learns a codebook of N vectors to encode each D-dimensional frame of enc(x)."
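
To make the quoted sentence concrete, here is a minimal sketch of per-frame quantization: each D-dimensional frame is matched independently to its nearest codebook vector. Plain Python lists stand in for tensors so the snippet runs without dependencies. The names `nearest_code` and `quantize_frames` and the toy values are illustrative assumptions, not code from this repository or from the SoundStream paper.

```python
def nearest_code(frame, codebook):
    """Return the index of the codebook vector closest to one D-dimensional frame."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two vectors of equal length.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))


def quantize_frames(frames, codebook):
    """Quantize each time frame on its own; `frames` holds T frames of length D."""
    return [nearest_code(frame, codebook) for frame in frames]


# Toy codebook: N = 3 vectors of dimension D = 2 (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
# T = 4 frames, each of dimension D = 2, i.e. a duration x features layout.
frames = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.9], [0.4, 0.4]]

indices = quantize_frames(frames, codebook)
print(indices)  # [0, 1, 2, 0]
```

The lookup runs along the feature axis of each frame, which is why the order of the duration and feature axes matters to the quantizer.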

@lucidrains
Owner

lucidrains commented Oct 27, 2021

@wesbz ohh, i'm actually operating with the input having dimensions batch x duration x features (see readme examples)

would you prefer to have it the other way around? (duration and features transposed)

@wesbz
Contributor Author

wesbz commented Oct 27, 2021

Oh! Alright, my bad, then!
Yes, it makes more sense to me to have batch × channels × duration ^^
Or I can just take that into account and correct it in my SoundStream implementation ;) I'll just try that, please do not close the PR yet ;)

@lucidrains
Owner

@wesbz I can introduce a `channel_last` option which can be set to `False` and do the transpose for you, if you wish
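
A rough sketch of what such a flag could do, assuming it only swaps the last two axes so that a batch × channels × duration input becomes the batch × duration × features layout the quantizer works on. The helper names `transpose_last_two` and `to_channel_last` are hypothetical, and nested lists stand in for tensors. In PyTorch the same swap would typically be `x.transpose(1, 2)`. This is not the library's actual implementation.

```python
def transpose_last_two(x):
    """Swap the last two axes of a nested list: (batch, A, B) -> (batch, B, A)."""
    return [[list(column) for column in zip(*item)] for item in x]


def to_channel_last(x, channel_last=True):
    """Return `x` laid out as batch x duration x features.

    With channel_last=False, the input is assumed to be
    batch x channels x duration and is transposed first.
    """
    return x if channel_last else transpose_last_two(x)


# One batch item with 2 channels and 3 time steps: batch x channels x duration.
x = [[[1, 2, 3],
      [4, 5, 6]]]

frames = to_channel_last(x, channel_last=False)
print(frames)  # [[[1, 4], [2, 5], [3, 6]]]
```

A module using this approach would presumably apply the inverse transpose to its quantized output, so callers get back the same layout they passed in.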

@wesbz
Contributor Author

wesbz commented Oct 27, 2021

This sounds perfect to me!

