A curated list of awesome machine learning methods for neural code intelligence.
- code2vec: Learning Distributed Representations of Code, Alon et al., Proc. ACM Program. Lang. (2019): 40:1-40:29 [arXiv] [GitHub] [Demo]
- code2seq: Generating Sequences from Structured Representations of Code, Alon et al., ICLR (2019) [arXiv] [GitHub] [Demo]
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages, Feng et al., EMNLP (2020): 1536-1547 [arXiv] [GitHub]
- GraphCodeBERT: Pre-training Code Representations with Data Flow, Guo et al., ICLR (2021) [OpenReview] [GitHub]
- CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation, Wang et al., arXiv (2021) [arXiv] [GitHub]
- PyMT5: multi-mode translation of natural language and Python code with transformers, Clement et al., EMNLP (2020): 9052-9065 [arXiv]
- Evaluating Large Language Models Trained on Code, Chen et al., CoRR (2021) [arXiv] [GitHub]
- Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks, Mastropaolo et al., ICSE (2021): 336-347 [arXiv] [GitHub]
- Multi-task Learning based Pre-trained Language Model for Code Completion, Liu et al., ASE (2020): 473-485 [arXiv] [GitHub]
- Unsupervised Translation of Programming Languages, Rozière et al., NeurIPS (2020) [arXiv] [GitHub]
- DOBF: A Deobfuscation Pre-Training Objective for Programming Languages, Rozière et al., CoRR (2021) [arXiv] [GitHub]
- Leveraging Automated Unit Tests for Unsupervised Code Translation, Rozière et al., CoRR (2021) [arXiv] [GitHub]
- IntelliCode Compose: Code Generation Using Transformer, Svyatkovskiy et al., ESEC/SIGSOFT FSE (2020): 1433-1443 [arXiv]
- Exploring Software Naturalness through Neural Language Models, Buratti et al., CoRR (2020) [arXiv]
- Unified Pre-training for Program Understanding and Generation, Ahmad et al., NAACL-HLT (2021): 2655-2668 [arXiv]
- Learning Execution through Neural Code Fusion, Shi et al., ICLR (2020) [arXiv] [Talk]
- Learning to Represent Programs with Graphs, Allamanis et al., ICLR (2018) [arXiv] [GitHub]
- CodeBLEU: a Method for Automatic Evaluation of Code Synthesis, Ren et al., CoRR (2020) [arXiv]
- CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation, Lu et al., CoRR (2021) [arXiv] [GitHub] [Website]
- Measuring Coding Challenge Competence With APPS, Hendrycks et al., CoRR (2021) [arXiv]