Training with no pretrained encoder - just projection from ready embeddings #20
Hey, that sounds super cool! I don't have an example on hand, but this is very doable. You'd essentially have "preprocess rows" return the raw voxel data, have "forward" do nothing and return the voxel data unchanged, and then have the build projector function create a custom torch module that converts your voxel data into the same shape as the tokens (your custom embedding plus a dense layer to get it to the right token shape).
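A minimal sketch of what such a projector module might look like, assuming a 249-dimensional voxel vector projected to 8 tokens of hidden size 4096 (the class name, layer sizes, and shapes here are illustrative, not part of the library):

```python
import torch
import torch.nn as nn

class VoxelProjector(nn.Module):
    """Maps a flat voxel vector to a sequence of pseudo-token embeddings."""

    def __init__(self, voxel_dim=249, num_tokens=8, hidden_dim=4096):
        super().__init__()
        self.num_tokens = num_tokens
        self.hidden_dim = hidden_dim
        # Simple two-layer MLP; the output is reshaped into (num_tokens, hidden_dim).
        self.net = nn.Sequential(
            nn.Linear(voxel_dim, 1024),
            nn.GELU(),
            nn.Linear(1024, num_tokens * hidden_dim),
        )

    def forward(self, voxels):
        # voxels: (batch, voxel_dim) -> (batch, num_tokens, hidden_dim)
        out = self.net(voxels)
        return out.view(-1, self.num_tokens, self.hidden_dim)
```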
Hey, so it works, but with a relatively high loss. I'm thinking it's because the input is an embedding of size 249 and it's being projected into a dimension of [8, 4096] (8 tokens).
More data? In theory, 249 to 8 tokens will actually overfit easily (so low training loss but high test loss). You can also try pre-training the projector on some proxy task (e.g. train 249 -> part of projector -> classifier, and then chop the classifier off). This could help debug the embedding quality as well.
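A rough sketch of that proxy-task idea, assuming some labeled proxy target is available for each voxel vector (the classifier head, proxy labels, and training loop are illustrative placeholders):

```python
import torch
import torch.nn as nn

# Reuse the first part of the projector as a feature extractor.
projector_trunk = nn.Sequential(
    nn.Linear(249, 1024),
    nn.GELU(),
)
# Temporary classifier head for the proxy task (e.g. predicting a stimulus category).
classifier_head = nn.Linear(1024, 10)

model = nn.Sequential(projector_trunk, classifier_head)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Placeholder proxy data: voxels (batch, 249) and integer class labels.
voxels = torch.randn(32, 249)
labels = torch.randint(0, 10, (32,))

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(voxels), labels)
    loss.backward()
    optimizer.step()

# After pre-training, drop classifier_head and use projector_trunk's weights
# to initialize the corresponding layers of the real projector.
```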
Will also note that loss, especially in the context of LoRA fine-tuning like this, can be misleading / not an accurate representation of efficacy. It's worth just sampling/testing your weights and seeing what's getting spit out and whether it's in any way coherent.
Thanks for replying :) I have another question regarding the generation parameters - is there a reason you didn't configure top_p, top_k, and a specific temperature? And if so, why?
This library was mainly to proof-of-concept these different modalities, so I didn't mess with decoding params too much. No reason they're not included (they'd work the same as with any other Hugging Face model).
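For reference, a minimal sketch of how those decoding parameters would be passed, assuming the model exposes the standard Hugging Face generate interface (the checkpoint and prompt here are placeholders, not the fine-tuned multimodal model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any causal LM works here; a fine-tuned checkpoint would be loaded the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Describe the stimulus:", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,   # enable sampling instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    top_k=50,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```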
Do you have an example of training a modality that has no pretrained encoder? I want to train only the projector on ready-made embeddings.
My use case is a dataset of arrays of numbers (each number indicating the intensity of a voxel from fMRI data) and their corresponding English sentences.
I want to treat the voxel array as an embedding vector that needs to be projected into a higher dimension according to the textual embeddings of each array and its corresponding sentence; a sketch of what one training row could look like is below.
Any help would be appreciated.
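Purely for illustration, one row of such a dataset could be represented like this (the field names and values are hypothetical; the actual format depends on how "preprocess rows" is implemented):

```python
# One hypothetical training example: a flat fMRI voxel vector and its caption.
example_row = {
    "voxels": [0.12, -1.03, 0.56, 0.88],  # in practice, 249 values per scan
    "sentence": "The subject saw a dog running on a beach.",
}
```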