Source code parsing pipeline #4
Comments
I have the same problem.
@wangyu1997 - You may want to take a look at https://github.com/JetBrains-Research/astminer. They have a great parsing pipeline and a small implementation of Code2Vec, which can get you started.
@dhas Thank you for the reply. After reviewing your code, I noticed that all the variables in your terminal_idxs.txt are represented like "@var_xx". Could you explain this in more detail? Thanks!
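To make the question concrete, here is a minimal sketch of one way such placeholders could arise. It assumes that each distinct variable name in a method is replaced by an indexed token of the form "@var_N". That is only a guess at the convention, not something the repository confirms, and the function name `anonymize_variables` and the sample tokens are hypothetical.

```python
def anonymize_variables(tokens, variable_names):
    """Replace each distinct variable name with an indexed placeholder.

    Assumption: placeholders are numbered in order of first appearance,
    e.g. "@var_0", "@var_1". The real scheme may differ.
    """
    mapping = {}   # original variable name -> placeholder
    result = []
    for tok in tokens:
        if tok in variable_names:
            if tok not in mapping:
                mapping[tok] = f"@var_{len(mapping)}"
            result.append(mapping[tok])
        else:
            # Keywords, operators and literals are left untouched.
            result.append(tok)
    return result, mapping


# Hypothetical token stream for: int count = 0; count += step;
tokens = ["int", "count", "=", "0", ";", "count", "+=", "step", ";"]
anonymized, mapping = anonymize_variables(tokens, {"count", "step"})
print(anonymized)
print(mapping)
```

If the data was built this way, the same placeholder would refer to different variables in different methods, so the terminal vocabulary would stay small regardless of how variables are named.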
Hi @dhas, @wangyu1997,
Hi @sonoisa,
I wasn't able to understand how you arrived at the dataset you provide in your code2vec/data directory. Could you clarify your source code parsing pipeline? If I understand correctly, you started with the parsed tokens serialized as JSON from http://groups.inf.ed.ac.uk/cup/codeattention/ and converted them into the *.txt files in code2vec/data. Am I right?
Would you be able to add the code for doing this to the repo? I need to parse sources written in C, which is why I'm seeking a clearer picture of the parsing step.
Thanks