What is the difference between these three models? #110

Closed
wuyifan18 opened this issue Sep 17, 2021 · 4 comments

Comments

@wuyifan18

code2seq
typed-code2seq
code2class

@SpirinEgor
Contributor

Hi!

  • Code2seq is the vanilla model: it uses an LSTM to embed paths into vectors and then uses another LSTM to generate the output sequence (e.g. a method name).
  • Code2class uses the same encoding method, but its decoder is an MLP whose output size is the number of classes. It is useful for classification tasks or when you need to build an embedding of the code.
  • Typed-code2seq is an extended code2seq model. We describe it in the paper about PSIMiner.
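
To make the "MLP to the number of classes" part concrete, here is a minimal pure-Python sketch of such a classification head. It is an illustration only, not the repository's implementation: the layer sizes, the random weights, and the names `mlp_head`, `linear`, and `random_matrix` are all hypothetical, and the input vector is a stand-in for whatever the path encoder would produce.

```python
import random


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise ReLU non-linearity."""
    return [max(0.0, v) for v in x]


def mlp_head(code_vector, w1, b1, w2, b2):
    """Map a fixed-size code vector to one logit per class."""
    hidden = relu(linear(code_vector, w1, b1))
    return linear(hidden, w2, b2)


def random_matrix(rows, cols, rng):
    """Random weights, only so the sketch runs without training."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
embedding_dim, hidden_dim, num_classes = 8, 16, 3  # hypothetical sizes

w1, b1 = random_matrix(hidden_dim, embedding_dim, rng), [0.0] * hidden_dim
w2, b2 = random_matrix(num_classes, hidden_dim, rng), [0.0] * num_classes

# Stand-in for the encoder output; a real model would compute this from AST paths.
code_vector = [rng.uniform(-1, 1) for _ in range(embedding_dim)]

logits = mlp_head(code_vector, w1, b1, w2, b2)
predicted_class = max(range(num_classes), key=lambda i: logits[i])
print(len(logits), predicted_class)
```

The point of the sketch is the output shape: a sequence decoder emits a variable number of tokens, whereas this head always returns exactly `num_classes` logits, and the vector fed into it can double as an embedding of the code.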

@wuyifan18
Author

Thank you!

@Avv22

Avv22 commented Nov 29, 2021

@SpirinEgor, thank you very much. Do you have documentation for Code2class, or a published paper, so that we can read more about it and cite it?

@SpirinEgor
Contributor

Currently, we don't have a paper that uses the code2class model. We hope to have one soon :)
To better understand how it works, I suggest studying the difference between the vanilla code2seq and code2vec models. Code2Class uses the code2seq encoder (the path embedding algorithm) and the code2vec decoder (path aggregation and processing of the output vector).
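
As a rough illustration of the "path aggregation" step, here is a small pure-Python sketch of code2vec-style attention pooling: each path vector is scored against a learned attention vector, the scores are normalized with a softmax, and the path vectors are summed with those weights into a single code vector. This is a sketch under assumptions, not the repository's code: the function names and the toy numbers are hypothetical, and the path vectors are hard-coded here rather than produced by an LSTM encoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_paths(path_vectors, attention_vector):
    """Attention-pool a bag of path vectors into one code vector."""
    # Score each path by its dot product with the attention vector.
    scores = [
        sum(a * v for a, v in zip(attention_vector, vec)) for vec in path_vectors
    ]
    weights = softmax(scores)
    dim = len(path_vectors[0])
    # Weighted sum of the path vectors, one dimension at a time.
    code_vector = [
        sum(w * vec[d] for w, vec in zip(weights, path_vectors)) for d in range(dim)
    ]
    return code_vector, weights


# Toy values; a real model would get path vectors from the path encoder
# and learn the attention vector during training.
path_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention_vector = [0.5, -0.5]

code_vector, weights = aggregate_paths(path_vectors, attention_vector)
print(code_vector, weights)
```

The resulting `code_vector` has a fixed size regardless of how many paths the snippet contains, which is what lets a classifier consume it directly.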
