Training with no pretrained encoder - just projection from ready embeddings #20
Hey, that sounds super cool! I don't have an example on hand, but this is very doable. You'd essentially have "preprocess rows" return the raw voxel data, have "forward" do nothing and return the voxel data unchanged, and then have the build projector function create a custom torch module that converts your voxel data into the same shape as the tokens (your custom embedding plus a dense layer to get it to the right token shape).
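A minimal sketch of what such a projector module might look like, assuming a 249-dimensional voxel vector projected to 8 tokens of hidden size 4096 (the class name, layer sizes, and shapes here are illustrative, not part of the library):

```python
import torch
import torch.nn as nn

class VoxelProjector(nn.Module):
    """Maps a flat voxel vector to a sequence of pseudo-token embeddings."""

    def __init__(self, voxel_dim=249, num_tokens=8, hidden_dim=4096):
        super().__init__()
        self.num_tokens = num_tokens
        self.hidden_dim = hidden_dim
        # Simple two-layer MLP; the output is reshaped into (num_tokens, hidden_dim).
        self.net = nn.Sequential(
            nn.Linear(voxel_dim, 1024),
            nn.GELU(),
            nn.Linear(1024, num_tokens * hidden_dim),
        )

    def forward(self, voxels):
        # voxels: (batch, voxel_dim) -> (batch, num_tokens, hidden_dim)
        out = self.net(voxels)
        return out.view(-1, self.num_tokens, self.hidden_dim)
```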
Hey, so it works, but with a relatively high loss. I'm thinking it's because the input is an embedding of size 249 and it's being projected into a dimension of [8, 4096] (8 tokens).
More data? In theory, 249 to 8 tokens will actually overfit easily (so low training loss but high test loss). You can also try pre-training the projector on some proxy task (e.g. train 249 -> part of projector -> classifier, and then chop the classifier off). This could help debug the embedding quality as well.
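A rough sketch of that proxy-task idea, assuming some labeled proxy target is available for each voxel vector (the classifier head, proxy labels, and training loop are illustrative placeholders):

```python
import torch
import torch.nn as nn

# Reuse the first part of the projector as a feature extractor.
projector_trunk = nn.Sequential(
    nn.Linear(249, 1024),
    nn.GELU(),
)
# Temporary classifier head for the proxy task (e.g. predicting a stimulus category).
classifier_head = nn.Linear(1024, 10)

model = nn.Sequential(projector_trunk, classifier_head)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Placeholder proxy data: voxels (batch, 249) and integer class labels.
voxels = torch.randn(32, 249)
labels = torch.randint(0, 10, (32,))

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(voxels), labels)
    loss.backward()
    optimizer.step()

# After pre-training, drop classifier_head and use projector_trunk's weights
# to initialize the corresponding layers of the real projector.
```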
Will also note that loss, especially in the context of LoRA fine-tuning like this, can be misleading / not an accurate representation of efficacy. It's worth just sampling/testing your weights and seeing what's getting spit out and whether it's in any way coherent.
Thanks for replying :) I have another question regarding the generation parameters - is there a reason you didn't configure top_p, top_k, and a specific temperature? And if so, why?
This library was mainly to proof-of-concept these different modalities, so I didn't mess with decoding params too much. No reason they're not included (they'd work the same as with any other Hugging Face model).
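For reference, a minimal sketch of how those decoding parameters would be passed, assuming the model exposes the standard Hugging Face generate interface (the checkpoint and prompt here are placeholders, not the fine-tuned multimodal model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any causal LM works here; a fine-tuned checkpoint would be loaded the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Describe the stimulus:", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,   # enable sampling instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    top_k=50,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```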
Do you have an example of training a modality that has no pretrained encoder? I want to train only the projector on ready-made embeddings.
My use case is a dataset of arrays of numbers (each number indicating the intensity of a voxel from fMRI data) and their corresponding English sentences.
I want to treat the voxel array as an embedding vector that needs to be projected into a higher dimension according to the textual embeddings of each array and its corresponding sentence; a sketch of what one training row could look like is below.
Any help would be appreciated.
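Purely for illustration, one row of such a dataset could be represented like this (the field names and values are hypothetical; the actual format depends on how "preprocess rows" is implemented):

```python
# One hypothetical training example: a flat fMRI voxel vector and its caption.
example_row = {
    "voxels": [0.12, -1.03, 0.56, 0.88],  # in practice, 249 values per scan
    "sentence": "The subject saw a dog running on a beach.",
}
```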