
fine_tune to a text classification task #32

Open
WilliamHoo opened this issue Dec 8, 2023 · 6 comments

Comments
@WilliamHoo

I am trying to get mamba working for a text classification task by adding a classification head after the model.

For transformer models, people usually use the last_hidden_state as the input to the classification head. Any suggestions for mamba?
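
For reference, the usual transformer pattern looks roughly like this (a minimal sketch; `bert-base-uncased`, the [CLS] pooling, and the 2-class head are arbitrary placeholders, not a recommendation):

```python
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
head = nn.Linear(encoder.config.hidden_size, 2)  # illustrative 2-class head

batch = tokenizer(["an example sentence"], return_tensors="pt")
hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, hidden_size)
logits = head(hidden[:, 0])                  # pool via the [CLS] token
```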

@WilliamHoo
Author

also, any recommendations on tokenizers?

@albertfgu
Contributor

I don't know much about tokenizers for fine-tuning. For the classification head, many variations are possible: you could grab the final recurrent state of the model (although that might be unsupported in the currently released version); you could grab the output at the last timestep; or you could average the outputs across all timesteps.
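
Sketched out, the last two options might look like this (a minimal sketch, not a released API: `backbone` is a placeholder for any Mamba module that returns per-timestep hidden states of shape (batch, seq_len, d_model)):

```python
import torch.nn as nn

class MambaClassifier(nn.Module):
    """Hypothetical wrapper: a Mamba backbone plus a linear classification head.

    Assumes `backbone(input_ids)` returns per-timestep hidden states of
    shape (batch, seq_len, d_model).
    """

    def __init__(self, backbone, d_model, num_classes, pooling="mean"):
        super().__init__()
        self.backbone = backbone
        self.pooling = pooling
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, input_ids):
        hidden = self.backbone(input_ids)  # (batch, seq_len, d_model)
        if self.pooling == "last":
            pooled = hidden[:, -1]         # output at the last timestep
        else:
            pooled = hidden.mean(dim=1)    # average outputs over all timesteps
        return self.head(pooled)           # (batch, num_classes)
```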

@turian

turian commented Dec 20, 2023

@WilliamHoo If you do figure out how to do this, I would be curious

@jmunozmendiFunditec

I would also be interested. Any news?

@maksymdolgikh

See my comment here #163 for a suggestion.

@getorca

getorca commented Apr 29, 2024

I've put https://github.com/getorca/mamba_for_sequence_classification together; it's compatible with HF, so you can use mamba for sequence classification.
