Training sample from custom dataset #6

Open
axel588 opened this issue Feb 15, 2023 · 3 comments
Comments

axel588 commented Feb 15, 2023

I couldn't manage to train on a custom dataset; many parts of the sample training code load an external dataset.
Would it be possible to have sample training code for custom datasets? Using utils_load_dataset didn't work for the training case.
From what I've understood, the embedding is a CLIP-encoded list of strings produced with their tokenizer, but much of this is hard to set up.
The idea would be to have a simple training sample that can be run on a custom dataset; it's something truly missing in many repositories.

Thanks for the work you've done!

apapiu (Owner) commented Feb 16, 2023

Hi - what dataset are you trying to use? For the text embeddings you can use the get_text_encodings function, and the images can just be resized to the appropriate size and saved as a NumPy file.
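
For the image half of that suggestion, here is a minimal sketch that resizes a folder of PNGs and saves them as one `.npy` array. It assumes Pillow and NumPy are installed. The function name `images_to_npy`, the default `size`, and the `(N, height, width, 3)` uint8 layout are illustrative assumptions, not something taken from this repository, so check what shape and value range the training code actually expects.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def images_to_npy(image_dir, out_path, size=(64, 64)):
    """Resize every PNG in image_dir to `size` (width, height) and save one array.

    Hypothetical helper: the name, default size, and output layout are
    assumptions made for illustration.
    """
    paths = sorted(Path(image_dir).glob("*.png"))
    if not paths:
        raise ValueError(f"no .png files found in {image_dir}")

    frames = []
    for path in paths:
        with Image.open(path) as img:
            # Force 3 channels so every array has the same shape.
            resized = img.convert("RGB").resize(size)
            frames.append(np.asarray(resized, dtype=np.uint8))

    batch = np.stack(frames)  # shape: (N, height, width, 3)
    np.save(out_path, batch)
    return batch
```

Sorting the paths keeps the row order deterministic, which matters if the text encodings are saved to a separate file and have to line up row by row with the images.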

axel588 (Author) commented Feb 18, 2023

Thanks @apapiu for your answer.
The main issue I have is with 16_16_latent_embeddings.npy: I have no idea how to reproduce this kind of file, and I'm not sure how to transform images into a 'latent embedding'. I have a folder of images.png and images.txt ... and I don't know how to convert this to a latent embedding. My attempt so far was to create a dataset that returns a NumPy array of the image and calls get_text_encodings for the text in __getitem__, without success.
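
For the folder layout described above (one `.png` plus a same-named `.txt` per sample), one option is to encode everything once up front instead of inside `__getitem__`. The sketch below pairs the files by stem and saves two aligned arrays. `encode_image` and `encode_text` are placeholder callables supplied by the caller: which image encoder produced `16_16_latent_embeddings.npy` is not stated in this thread, and the signature of `get_text_encodings` is not shown here, so both are left as assumptions rather than guessed. The helper names and output file names are hypothetical too.

```python
from pathlib import Path

import numpy as np


def pair_images_with_captions(folder):
    """Return (image_path, caption) pairs for files that share a stem."""
    pairs = []
    for image_path in sorted(Path(folder).glob("*.png")):
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.exists():
            continue  # skip images that have no caption file
        caption = caption_path.read_text(encoding="utf-8").strip()
        pairs.append((image_path, caption))
    return pairs


def precompute_arrays(pairs, encode_image, encode_text, out_prefix):
    """Encode every pair once and save two row-aligned .npy files.

    encode_image(path) and encode_text(caption) are placeholders: each must
    return a fixed-shape array for a single sample.
    """
    if not pairs:
        raise ValueError("no image/caption pairs to encode")
    image_arr = np.stack([np.asarray(encode_image(p)) for p, _ in pairs])
    text_arr = np.stack([np.asarray(encode_text(c)) for _, c in pairs])
    np.save(f"{out_prefix}_images.npy", image_arr)
    np.save(f"{out_prefix}_text.npy", text_arr)
    return image_arr, text_arr
```

Row `i` of both saved arrays then refers to the same sample, so a training dataset only needs to index into the two files.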

