Generating embeddings for Python and Java #104
Comments
Hi Avra, yes, in order to use code2vec for Python, you will have to train the model on a Python dataset. Uri
@urialon, the astminer team helped me produce training, testing and validation
I am not sure how your
Thank you. So I will train your model on the 150k Python dataset. How can I save the trained model so that I can use it later on another Python dataset to generate embeddings? Once the model is trained on the 150k Python dataset you specified, we would like to use it to generate one embedding vector for each Python file in our own dataset. Can we do that? We don't want to generate method names, but one embedding that is representative of a whole file. We would like to do this for our 20k Python files.
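For the "one embedding per file" part, one possible approach is to average the per-method vectors of each file into a single fixed-size vector. This is only a minimal sketch and rests on assumptions: it is not part of code2vec, it assumes the trained model has already produced one vector per method, and the names `mean_pool` and `embed_files` as well as the toy 3-dimensional vectors are hypothetical. Mean pooling is swapped in here as a simple aggregation; nothing in this thread confirms it is the recommended one.

```python
from typing import Dict, List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average equal-length vectors element-wise into one vector."""
    if not vectors:
        raise ValueError("need at least one method vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all method vectors must have the same dimension")
    # zip(*vectors) walks the vectors column by column.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def embed_files(
    method_vectors_by_file: Dict[str, List[List[float]]]
) -> Dict[str, List[float]]:
    """Map each file path to the mean of its method vectors.

    Files with no method vectors are skipped (hypothetical policy).
    """
    return {
        path: mean_pool(vecs)
        for path, vecs in method_vectors_by_file.items()
        if vecs
    }


# Toy input: made-up 3-dimensional method vectors for two files.
example = {
    "a.py": [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
    "b.py": [[0.5, 0.5, 0.5]],
}
file_vectors = embed_files(example)
```

The output vector has the same dimension as the method vectors, so the file embedding size is fixed by whatever vector size the model was trained with.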
Hi @Avra2,
Hello,
Thanks again for your work.
Can you please explain how to use the model to generate embeddings for a source file in Python and also in Java? Do we have to train your model on a Java dataset and on a Python dataset in order to use it to generate embeddings of source code? Also, is it possible to get embeddings of a fixed size, say 100, represented as numerical data for each file?