how to generate embeddings for all entities after we have the model? #21
Also, are you going to release instructions on how to train the models? Thanks!
Hello all, I would also like some tips on training this architecture from scratch, and information on using it as a pre-trained network on a custom dataset.
Yes, same issue here! I would like to know how to use this for a custom dataset, and how to generate embeddings from the linked documents.
Hello, I just re-implemented hard-negative mining and scripts for encoding entities with the zeshel dataset from [Logeswaran et al., '19].
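The idea behind hard-negative mining can be sketched in a few lines: score all candidate entities with the current biencoder, then keep the highest-scoring candidates that are *not* the gold entity as extra training negatives. The function and variable names below are illustrative, not the repository's actual API:

```python
# Sketch of hard-negative mining. Assumes we already have biencoder
# scores for each candidate entity; names here are illustrative only.

def mine_hard_negatives(scores, gold_id, k=2):
    """Pick the k highest-scoring candidates that are NOT the gold entity.

    scores: dict mapping candidate entity id -> biencoder score
    gold_id: id of the correct entity for this mention
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [c for c in ranked if c != gold_id][:k]

# Toy example: the gold entity scores highest, so the hard negatives
# are the two strongest wrong candidates.
scores = {"e1": 0.9, "e2": 0.7, "e3": 0.4, "e4": 0.1}
print(mine_hard_negatives(scores, gold_id="e1", k=2))  # ['e2', 'e3']
```

In practice these mined negatives are added to the in-batch random negatives for the next training round.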
Hi all, you can refer to my comment #106 (comment) regarding generating embeddings for new candidates with an existing model.
I wonder whether you have trained this model on a Chinese dataset. If so, could you share your Chinese training data? I would also like to use this model for Chinese, but I lack a suitable dataset. Thank you very much!
Regarding training a new model with custom data: yes, it is indeed possible. I would recommend first training a zero-shot entity linking (zeshel) model just to get the hang of the training process. The scripts to download and pre-process the zeshel data are in the repository. You can then replicate the same steps: bring your data into the same format as zeshel, modify any hyperparameters (such as context length or the choice of BERT base model), and train your own model.
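Bringing custom data into a zeshel-like layout can be sketched roughly as below. The exact field names should be checked against the repository's preprocessing scripts; treat the keys here (`context_left`, `mention`, `context_right`, `label_id`, `label_title`, `title`, `text`) as assumptions:

```python
import json

# Sketch: write mention and entity records as JSON lines, roughly in the
# shape zeshel preprocessing produces. Field names are assumptions; check
# them against the repo's data scripts before training.
mentions = [
    {
        "context_left": "The capital of France is",
        "mention": "Paris",
        "context_right": ", a major European city.",
        "label_id": 0,            # index of the gold entity below
        "label_title": "Paris",
    },
]
entities = [
    {"title": "Paris", "text": "Paris is the capital and largest city of France."},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for m in mentions:
        f.write(json.dumps(m, ensure_ascii=False) + "\n")
with open("entities.jsonl", "w", encoding="utf-8") as f:
    for e in entities:
        f.write(json.dumps(e, ensure_ascii=False) + "\n")

# Sanity check: read the training file back.
with open("train.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
print(rows[0]["mention"])  # Paris
```

Note `ensure_ascii=False`, which keeps non-Latin text (e.g. Chinese mentions) readable in the output files.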
I'm trying to train a biencoder model to support Chinese. Once I have the trained biencoder, how can I generate the embeddings for all entities, like the provided file models/all_entities_large.t7?
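The overall loop that produces such a cached embedding file is: run every entity's title and description through the biencoder's candidate encoder in batches, stack the vectors, and save the result (with a real model, via `torch.save`, which is how a `.t7` file is produced). Below is a dependency-free sketch of that loop; the hash-based `encode_entity` is a deterministic stand-in for the actual candidate tower, and `pickle` stands in for `torch.save`:

```python
import hashlib
import pickle
import struct

DIM = 8  # embedding size; the released file uses a much larger dimension

def encode_entity(title, text):
    """Stand-in for the biencoder's candidate encoder: maps an entity's
    title + description to a fixed-size deterministic vector. With a real
    model you would tokenize the title and text and run the candidate tower."""
    digest = hashlib.sha256((title + " " + text).encode("utf-8")).digest()
    # Interpret the first DIM*4 digest bytes as DIM values in [0, 1).
    ints = struct.unpack(f"{DIM}I", digest[:DIM * 4])
    return [v / 2**32 for v in ints]

def encode_all(entities, batch_size=2):
    """Encode every entity in batches and collect the vectors, mirroring
    the loop that builds a cache like all_entities_large.t7."""
    embeddings = []
    for i in range(0, len(entities), batch_size):
        batch = entities[i:i + batch_size]
        embeddings.extend(encode_entity(e["title"], e["text"]) for e in batch)
    return embeddings

entities = [
    {"title": "Paris", "text": "Capital of France."},
    {"title": "Berlin", "text": "Capital of Germany."},
    {"title": "Tokyo", "text": "Capital of Japan."},
]
embs = encode_all(entities)

# With a real model: torch.save(torch.tensor(embs), "all_entities.t7").
# pickle stands in here so the example has no external dependencies.
with open("all_entities.pkl", "wb") as f:
    pickle.dump(embs, f)
print(len(embs), len(embs[0]))  # 3 8
```

At lookup time, the same index order used when encoding must map entity ids back to rows of the saved matrix.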