
Training with no pretrained encoder - just projection from ready embeddings #20

Open
tehila17-meet opened this issue May 21, 2024 · 6 comments


@tehila17-meet

tehila17-meet commented May 21, 2024

Do you have an example of training a modality that has no pretrained encoder? I want to only train the projector on ready embeddings.

My use case is a dataset of an array of numbers (each number indicating a voxel (from fmri data) intensity) and their corresponding english sentence.
I want to treat the voxel array as an embedding vector that needs to be projected into a higher dimension according to the textual embeddings of each array and its corresponding sentence.

Any help would be appreciated.

@sshh12
Owner

sshh12 commented May 28, 2024

Hey that sounds super cool!

I don't have an example off hand, but this is very doable. You'd essentially have "preprocess rows" return the raw voxel data, have "forward" do nothing and just return the voxel data, and then have the build-projector function create a custom torch module that converts your voxel data into the same shape as the tokens (your custom embedding plus a dense layer to get it to the right token shape).
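A minimal sketch of what this could look like, assuming hypothetical hook names (`preprocess_rows`, `forward`, `build_projector`) and a plain MLP projector; the actual base class and method signatures in the library may differ:

```python
import torch
import torch.nn as nn

class VoxelProjector(nn.Module):
    """Maps a raw 249-dim voxel vector to 8 tokens of the LLM's hidden size."""

    def __init__(self, voxel_dim=249, num_tokens=8, hidden_size=4096):
        super().__init__()
        self.num_tokens = num_tokens
        self.hidden_size = hidden_size
        self.proj = nn.Sequential(
            nn.Linear(voxel_dim, 1024),
            nn.GELU(),
            nn.Linear(1024, num_tokens * hidden_size),
        )

    def forward(self, voxels):
        # voxels: (batch, 249) -> (batch, num_tokens, hidden_size)
        out = self.proj(voxels)
        return out.view(-1, self.num_tokens, self.hidden_size)


class VoxelModality:
    """Hypothetical modality with no pretrained encoder."""

    def preprocess_rows(self, rows):
        # Return the raw voxel arrays as tensors; nothing to encode.
        return [torch.tensor(r["voxels"], dtype=torch.float32) for r in rows]

    def forward(self, voxels):
        # No encoder: pass the voxel data through unchanged.
        return voxels

    def build_projector(self, hidden_size):
        return VoxelProjector(voxel_dim=249, num_tokens=8, hidden_size=hidden_size)
```

The key point is that the "encoder" step is an identity function, so all trainable capacity lives in the projector.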

@tehila17-meet
Author

Hey, so it works, but with a relatively high loss. I'm thinking it's because the input is an embedding of size 249 and it's being projected into a dimension of [8, 4096] (8 tokens).
Do you have any ideas for how I can optimize this projector?

@sshh12
Owner

sshh12 commented Jun 10, 2024

More data? In theory, going from 249 dims to 8 tokens will actually overfit easily (so low training loss but high test loss).

You can also try pre-training the projector on some proxy task (e.g. train 249 -> part of the projector -> classifier, and then chop the classifier off). This could help debug the embedding quality as well.
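The proxy-task idea above could be sketched like this; the proxy labels (e.g. some per-sample class) and layer sizes here are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Pretrain the first part of the projector with a throwaway classifier head,
# then keep only the projector part for the full 249 -> 8x4096 projector.
projector_front = nn.Sequential(
    nn.Linear(249, 1024),
    nn.GELU(),
)
classifier = nn.Linear(1024, 10)  # 10 proxy classes, chosen arbitrarily

model = nn.Sequential(projector_front, classifier)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step on random placeholder data, just to show the shape of it.
x = torch.randn(32, 249)
y = torch.randint(0, 10, (32,))
loss = loss_fn(model(x), y)
loss.backward()
opt.step()

# Afterwards, "chop the classifier off": reuse projector_front as the
# first layers of the real projector and discard `classifier`.
```

If the classifier can't learn the proxy task at all, that's a signal the raw voxel embeddings themselves may not carry the information you need.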

@sshh12
Owner

sshh12 commented Jun 10, 2024

Will also note that loss, especially in the context of LoRA fine-tuning like this, can be misleading / not an accurate representation of performance. It's worth just sampling/testing your weights and seeing what gets spit out and whether it's at all coherent.

@tehila17-meet
Author

Thanks for replying :)

I have another question about the generate parameters: is there a reason you didn't configure top_p, top_k, and a specific temperature?

@sshh12
Owner

sshh12 commented Jun 13, 2024

This library was mainly a proof of concept for these different modalities, so I didn't mess with decoding params too much. There's no reason they're not included (they'd work the same as with any other Hugging Face model).
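For reference, the standard Hugging Face decoding knobs can just be passed to `generate`; a tiny randomly-initialized GPT-2 is used here purely so the snippet runs without downloading weights:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random model just to demonstrate the decoding parameters;
# swap in your own fine-tuned model in practice.
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=100)
model = GPT2LMHeadModel(config)

input_ids = torch.tensor([[1, 2, 3]])
out = model.generate(
    input_ids,
    max_new_tokens=5,
    do_sample=True,    # required for temperature/top_p/top_k to take effect
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    pad_token_id=0,
)
# out has shape (1, 3 + 5): the prompt tokens plus 5 sampled tokens
```

Note that without `do_sample=True`, `generate` defaults to greedy decoding and the sampling parameters are ignored.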
