Any more information about audio text aligner #6

Closed
YimengZhu opened this issue Aug 28, 2023 · 3 comments

Comments

@YimengZhu

Is there any more information about how the audio-text aligner is implemented? Since the audio and text sequences have different lengths, it is hard to imagine how they could be trained into the same embedding space.

Thanks.

@kobenaxie

Q-former in BLIP-2?

@YimengZhu
Author

> Q-former in BLIP-2?

It seems a Q-Former needs queries, but judging from the diagram, I don't think the model has them...

@TCL606
Member

TCL606 commented Oct 8, 2023

It's Q-Former, but used a little differently.
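
For readers wondering how a Q-Former handles the length mismatch raised above: in BLIP-2, a fixed set of learnable query vectors cross-attends to the encoder output, so the number of output vectors equals the number of queries regardless of input length. The sketch below illustrates only that generic idea in plain Python. It is not this repository's implementation, and the thread does not say how the aligner here uses the Q-Former "a little differently". The names `cross_attend`, `random_features`, `DIM`, and `NUM_QUERIES` are hypothetical, the random values stand in for learned parameters and real audio features, and the key/value projections of a real attention layer are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, features):
    """Single-head cross-attention with no learned projections (a simplification).

    Each query attends over every feature frame and returns a weighted
    average of the frames, so the output has one vector per query.
    """
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * fi for qi, fi in zip(q, frame)) / math.sqrt(dim)
            for frame in features
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * frame[d] for w, frame in zip(weights, features)) for d in range(dim)]
        )
    return outputs


rng = random.Random(0)
DIM, NUM_QUERIES = 8, 4  # hypothetical sizes, chosen only for illustration

# In a trained model these queries would be learned; here they are random stand-ins.
queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_QUERIES)]


def random_features(num_frames):
    """Stand-in for audio encoder output: num_frames vectors of size DIM."""
    return [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(num_frames)]


short_out = cross_attend(queries, random_features(50))   # short clip
long_out = cross_attend(queries, random_features(300))   # long clip

# Both inputs are reduced to the same fixed shape: NUM_QUERIES x DIM.
print(len(short_out), len(short_out[0]))
print(len(long_out), len(long_out[0]))
```

Because the output length is fixed by the queries rather than by the audio, the resulting vectors can be fed to a language model as a fixed number of tokens, whatever the clip length.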

TCL606 closed this as completed Oct 8, 2023