
longT5 BetterTransformer implementation #1506

Open
omri-sap opened this issue Nov 1, 2023 · 5 comments

Comments

omri-sap commented Nov 1, 2023

Feature request

longT5 BetterTransformer implementation

Motivation

Encoder-decoder models trained on large contexts enable long-context machine translation tasks.

Your contribution

I looked at the implementation for regular T5 and it doesn't look too complex. I tried to implement it myself but didn't succeed. If I can contribute, please let me know.
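For reference, this is roughly the call I would like to work; a sketch assuming the usual Optimum entry point (`google/long-t5-tglobal-base` is just an example checkpoint):

```python
# Rough sketch of the desired usage; today Optimum raises NotImplementedError
# because there is no BetterTransformer layer mapping for LongT5.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from optimum.bettertransformer import BetterTransformer

model_id = "google/long-t5-tglobal-base"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# The request: make this work for LongT5 the same way it already works for T5.
model = BetterTransformer.transform(model, keep_original_model=False)

inputs = tokenizer("summarize: " + "long document text " * 200, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```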

Thank you,
Omri


pszemraj commented Nov 8, 2023

Seconding this! It would be great.

@matvey-kolbasov-hs

Totally on board with this! Would love to see this feature added!

@omri-sap
Author

@fxmarty can we try to tackle this together?

Thanks in advance

Collaborator

fxmarty commented Dec 13, 2023

Hi, for reference, we are upstreaming SDPA in Transformers; it may be a better fit for longT5: huggingface/transformers#28005

Leaving this open as we may leverage nested tensors for longt5 (which are not in Transformers).
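Once an architecture has SDPA support in Transformers, it is selected through the `attn_implementation` argument of `from_pretrained`; a rough sketch (`t5-small` is only an example checkpoint here, and longT5 coverage depends on the PR above and follow-up work):

```python
# Sketch: selecting the PyTorch scaled_dot_product_attention ("sdpa") backend in Transformers.
# from_pretrained raises an error if the chosen architecture does not implement SDPA yet.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "t5-small"  # example checkpoint; the same flag would apply to LongT5 once supported
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="sdpa",  # routes attention through torch.nn.functional.scaled_dot_product_attention
).to("cuda")

inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```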

@ENate

ENate commented Dec 19, 2023

Hi @all. Is this still open, or are you planning to work on it?
