Table of Contents

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis | 2308.03314 | https://github.com/metatrustlabs/falcon-metatrust | 2024-05-06 |
On Training a Neural Network to Explain Binaries | 2404.19631 | None | 2024-04-30 |
A Critical Review of Large Language Model on Software Engineering: An Example from ChatGPT and Automated Program Repair | 2310.08879 | None | 2024-04-17 |
How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models | 2404.09836 | None | 2024-04-16 |
Analyzing the Performance of Large Language Models on Code Summarization | 2404.08018 | https://github.com/rajarshihaldar/analyze-llm-code-summarization | 2024-04-10 |
Open-Source AI-based SE Tools: Opportunities and Challenges of Collaborative Software Learning | 2404.06201 | None | 2024-04-09 |
Testing the Effect of Code Documentation on Large Language Model Code Understanding | 2404.03114 | None | 2024-04-03 |
A Survey of using Large Language Models for Generating Infrastructure as Code | 2404.00227 | None | 2024-03-30 |
HiRoPE: Length Extrapolation for Code Models | 2403.19115 | None | 2024-03-28 |
Evaluating Large Language Models with Runtime Behavior of Program Execution | 2403.16437 | None | 2024-03-25 |
Read between the lines -- Functionality Extraction From READMEs | 2403.10205 | None | 2024-03-15 |
RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation | 2402.16667 | https://github.com/openbmb/repoagent | 2024-02-26 |
Automated Smart Contract Summarization via LLMs | 2402.04863 | None | 2024-02-21 |
Scaling Laws Behind Code Understanding Model | 2402.12813 | None | 2024-02-20 |
CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation | 2311.08588 | https://github.com/weixiangyan/codescope | 2024-02-06 |
DocChecker: Bootstrapping Code Large Language Model for Detecting and Resolving Code-Comment Inconsistencies | 2306.06347 | https://github.com/fsoft-ai4code/docchecker | 2024-02-03 |
Investigating the Efficacy of Large Language Models for Code Clone Detection | 2401.13802 | https://github.com/mkhfring/largelanguagemodels | 2024-01-30 |
Using an LLM to Help With Code Understanding | 2307.08177 | None | 2024-01-16 |
Automatic Semantic Augmentation of Language Model Prompts (for Code Summarization) | 2304.06815 | None | 2024-01-11 |
Mutation-based Consistency Testing for Evaluating the Code Understanding Capability of LLMs | 2401.05940 | None | 2024-01-11 |
Breaking the Silence: the Threats of Using LLMs in Software Engineering | 2312.08055 | https://github.com/llm4se/obfuscated-chatgpt-experiments | 2024-01-08 |
A Prompt Learning Framework for Source Code Summarization | 2312.16066 | https://github.com/wssun/promptcs | 2023-12-26 |
Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models | 2312.09601 | https://github.com/xinjin95/binsum | 2023-12-15 |
Exploring Distributional Shifts in Large Language Models for Code Analysis | 2303.09128 | None | 2023-12-05 |
Transfer Attacks and Defenses for Large Language Models on Coding Tasks | 2311.13445 | None | 2023-11-22 |
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval | 2303.03004 | https://github.com/ntunlp/xCodeEval | 2023-11-06 |
REAL: Resilience and Adaptation using Large Language Models on Autonomous Aerial Robots | 2311.01403 | None | 2023-11-02 |
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation | 2305.06156 | https://github.com/fsoft-ai4code/thevault | 2023-10-30 |
Exploring Large Language Models for Code Explanation | 2310.16673 | None | 2023-10-25 |
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs | 2310.09668 | None | 2023-10-14 |
An Assessment of ChatGPT on Log Data | 2309.07938 | None | 2023-09-14 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning | 2406.01587 | None | 2024-06-04 |
LDB: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | 2402.16906 | https://github.com/floridsleeves/llmdebugger | 2024-06-04 |
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models | 2406.01359 | None | 2024-06-04 |
Clover: Closed-Loop Verifiable Code Generation | 2310.17807 | https://github.com/ChuyueSun/Clover | 2024-06-03 |
Generate and Pray: Using SALLMS to Evaluate the Security of LLM Generated Code | 2311.00889 | None | 2024-06-03 |
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search | 2401.04514 | https://github.com/alex-haochenli/reco | 2024-06-03 |
SemCoder: Training Code Language Models with Comprehensive Semantics | 2406.01006 | None | 2024-06-03 |
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models | 2404.18400 | https://github.com/deep-symbolic-mathematics/llm-sr | 2024-06-02 |
A Survey on Large Language Models for Code Generation | 2406.00515 | None | 2024-06-01 |
Benchmarking the Communication Competence of Code Generation for LLMs and LLM Agent | 2406.00215 | None | 2024-05-31 |
Towards LLM-Powered Verilog RTL Assistant: Self-Verification and Self-Correction | 2406.00115 | None | 2024-05-31 |
Grammar-Aligned Decoding | 2405.21047 | None | 2024-05-31 |
Self-planning Code Generation with Large Language Models | 2303.06689 | None | 2024-05-31 |
From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers | 2405.19787 | None | 2024-05-31 |
Students' Perspective on AI Code Completion: Benefits and Challenges | 2311.00177 | None | 2024-05-31 |
AnalogCoder: Analog Circuit Design via Training-Free Code Generation | 2405.14918 | https://github.com/laiyao1/AnalogCoder | 2024-05-30 |
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation | 2405.20092 | None | 2024-05-30 |
KNOW: A Real-World Ontology for Knowledge Capture with Large Language Models | 2405.19877 | None | 2024-05-30 |
DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories | 2405.19856 | None | 2024-05-30 |
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff | 2405.17503 | None | 2024-05-30 |
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data | 2405.19265 | https://github.com/internlm/alchemistcoder | 2024-05-29 |
Large Language Models for Code Summarization | 2405.19032 | None | 2024-05-29 |
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement | 2402.06700 | https://github.com/morning9393/etpo | 2024-05-29 |
Sketch-Plan-Generalize: Continual Few-Shot Learning of Inductively Generalizable Spatial Concepts | 2404.07774 | None | 2024-05-29 |
Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap | 2401.10034 | https://github.com/wuxingyu-ai/llm4ec | 2024-05-29 |
Efficient Model-agnostic Alignment via Bayesian Persuasion | 2405.18718 | None | 2024-05-29 |
Assessing Economic Viability: A Comparative Analysis of Total Cost of Ownership for Domain-Adapted Large Language Models versus State-of-the-art Counterparts in Chip Design Coding Assistance | 2404.08850 | None | 2024-05-28 |
Exploiting LLM Quantization | 2405.18137 | None | 2024-05-28 |
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning | 2402.17453 | https://github.com/guosyjlu/ds-agent | 2024-05-28 |
RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects | 2405.17378 | https://github.com/AUCOHL/RTL-Repo | 2024-05-27 |
LLM-Assisted Static Analysis for Detecting Security Vulnerabilities | 2405.17238 | None | 2024-05-27 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
Code Comparison Tuning for Code Large Language Models | 2403.19121 | None | 2024-06-05 |
SemCoder: Training Code Language Models with Comprehensive Semantics | 2406.01006 | None | 2024-06-03 |
Mining Action Rules for Defect Reduction Planning | 2405.13740 | None | 2024-05-22 |
Test Oracle Automation in the era of LLMs | 2405.12766 | None | 2024-05-21 |
Fight Fire with Fire: How Much Can We Trust ChatGPT on Source Code-Related Tasks? | 2405.12641 | None | 2024-05-21 |
Automatic Programming: Large Language Models and Beyond | 2405.02213 | None | 2024-05-15 |
A Systematic Literature Review on Large Language Models for Automated Program Repair | 2405.01466 | https://github.com/isenglab/awesomellm4apr | 2024-05-12 |
Large Language Models for Blockchain Security: A Systematic Literature Review | 2403.14280 | None | 2024-05-11 |
A Deep Dive into Large Language Models for Automated Bug Localization and Repair | 2404.11595 | None | 2024-05-10 |
Automated Program Repair: Emerging trends pose and expose problems for benchmarks | 2405.05455 | None | 2024-05-08 |
Benchmarking Educational Program Repair | 2405.05347 | https://github.com/koutchemecharles/gaied_nips23 | 2024-05-08 |
NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair | 2405.04994 | None | 2024-05-08 |
Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study with Practitioners of GitHub Copilot | 2311.01020 | None | 2024-04-28 |
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models | 2404.15236 | None | 2024-04-23 |
NExT: Teaching Large Language Models to Reason about Code Execution | 2404.14662 | None | 2024-04-23 |
Multi-Objective Fine-Tuning for Enhanced Program Repair with LLMs | 2404.12636 | None | 2024-04-22 |
How Far Can We Go with Practical Function-Level Program Repair? | 2404.12833 | https://github.com/ghabix/srepair | 2024-04-19 |
CigaR: Cost-efficient Program Repair with LLMs | 2402.06598 | https://github.com/assert-kth/cigar | 2024-04-18 |
A Critical Review of Large Language Model on Software Engineering: An Example from ChatGPT and Automated Program Repair | 2310.08879 | None | 2024-04-17 |
An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications | 2404.11050 | https://github.com/mohannadcse/alloyspecrepair | 2024-04-17 |
AutoCodeRover: Autonomous Program Improvement | 2404.05427 | https://github.com/nus-apr/auto-code-rover | 2024-04-15 |
Aligning LLMs for FL-free Program Repair | 2404.08877 | None | 2024-04-13 |
LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks | 2312.12575 | None | 2024-04-13 |
The Fact Selection Problem in LLM-Based Program Repair | 2404.05520 | https://github.com/pyrepair/maniple | 2024-04-09 |
Large Language Model for Vulnerability Detection and Repair: Literature Review and the Road Ahead | 2404.02525 | None | 2024-04-06 |
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models | 2404.03543 | None | 2024-04-06 |
Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments | 2404.01754 | None | 2024-04-02 |
RAPGen: An Approach for Fixing Code Inefficiencies in Zero-Shot | 2306.17077 | None | 2024-03-28 |
Untangling Knots: Leveraging LLM for Error Resolution in Computational Notebooks | 2405.01559 | None | 2024-03-26 |
RepairAgent: An Autonomous, LLM-Based Agent for Program Repair | 2403.17134 | None | 2024-03-25 |
ChatDBG: An AI-Powered Debugging Assistant | 2403.16354 | https://github.com/plasma-umass/chatdbg | 2024-03-25 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
A Deep Dive into Large Language Models for Automated Bug Localization and Repair | 2404.11595 | None | 2024-05-10 |
A Unified Debugging Approach via LLM-Based Multi-Agent Synergy | 2404.17153 | None | 2024-04-26 |
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models | 2404.15236 | None | 2024-04-23 |
How Far Can We Go with Practical Function-Level Program Repair? | 2404.12833 | https://github.com/ghabix/srepair | 2024-04-19 |
A Critical Review of Large Language Model on Software Engineering: An Example from ChatGPT and Automated Program Repair | 2310.08879 | None | 2024-04-17 |
Can Large Language Models Transform Natural Language Intent into Formal Method Postconditions? | 2310.01831 | None | 2024-04-15 |
AutoCodeRover: Autonomous Program Improvement | 2404.05427 | https://github.com/nus-apr/auto-code-rover | 2024-04-15 |
Aligning LLMs for FL-free Program Repair | 2404.08877 | None | 2024-04-13 |
AgentFL: Scaling LLM-based Fault Localization to Project-Level Context | 2403.16362 | None | 2024-03-25 |
Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization | 2403.10507 | None | 2024-03-15 |
Feedback-Generation for Programming Exercises With GPT-4 | 2403.04449 | None | 2024-03-07 |
Exploring Interaction Patterns for Debugging: Enhancing Conversational Capabilities of AI-assistants | 2402.06229 | None | 2024-02-09 |
Generalized Planning in PDDL Domains with Pretrained Large Language Models | 2305.11014 | https://github.com/tomsilver/llm-genplan | 2023-12-18 |
ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair | 2310.16253 | None | 2023-10-25 |
AI-enhanced Auto-correction of Programming Exercises: How Effective is GPT-3.5? | 2311.10737 | None | 2023-10-24 |
Automatic Generation of Test Cases based on Bug Reports: a Feasibility Study with Large Language Models | 2310.06320 | None | 2023-10-10 |
Large Language Models for Test-Free Fault Localization | 2310.01726 | https://github.com/squareslab/llmao | 2023-10-03 |
Large Language Models in Fault Localisation | 2308.15276 | https://github.com/tempupload/flofchatgpt | 2023-10-02 |
A Preliminary Evaluation of LLM-Based Fault Localization | 2308.05487 | None | 2023-08-26 |
Explainable Automated Debugging via Large Language Model-driven Scientific Debugging | 2304.02195 | None | 2023-04-05 |
Automated Repair of Programs from Large Language Models | 2205.10583 | None | 2023-01-02 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study | 2405.15614 | None | 2024-05-24 |
Multi-role Consensus through LLMs Discussions for Vulnerability Detection | 2403.14274 | https://github.com/rockmao45/llmvulndetection | 2024-05-18 |
Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls | 2405.09318 | None | 2024-05-15 |
METAREFLECTION: Learning Instructions for Language Agents using Past Reflections | 2405.13009 | None | 2024-05-13 |
A Deep Dive into Large Language Models for Automated Bug Localization and Repair | 2404.11595 | None | 2024-05-10 |
Large Language Models for Cyber Security: A Systematic Literature Review | 2405.04760 | None | 2024-05-09 |
GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis | 2308.03314 | https://github.com/metatrustlabs/falcon-metatrust | 2024-05-06 |
DLAP: A Deep Learning Augmented Large Language Model Prompting Framework for Software Vulnerability Detection | 2405.01202 | https://github.com/yang-yanjing/dlap | 2024-05-02 |
Do Neutral Prompts Produce Insecure Code? FormAI-v2 Dataset: Labelling Vulnerabilities in Code Generated by Large Language Models | 2404.18353 | None | 2024-04-29 |
Evolutionary Large Language Models for Hardware Security: A Comparative Survey | 2404.16651 | None | 2024-04-25 |
When Fuzzing Meets LLMs: Challenges and Opportunities | 2404.16297 | None | 2024-04-25 |
Tasks People Prompt: A Taxonomy of LLM Downstream Tasks in Software Verification and Falsification Approaches | 2404.09384 | None | 2024-04-14 |
Prompt-Enhanced Software Vulnerability Detection Using ChatGPT | 2308.12697 | https://github.com/kdegroup/llmvulnerabilitydetection | 2024-04-12 |
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models | 2310.00322 | None | 2024-04-06 |
Large Language Model for Vulnerability Detection and Repair: Literature Review and the Road Ahead | 2404.02525 | None | 2024-04-06 |
The FormAI Dataset: Generative AI in Software Security Through the Lens of Formal Verification | 2307.02192 | None | 2024-03-28 |
SCALE: Constructing Structured Natural Language Comment Trees for Software Vulnerability Detection | 2403.19096 | https://github.com/xin-cheng-wen/comment4vul | 2024-03-28 |
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs | 2403.18403 | https://github.com/ch3nye/foc | 2024-03-27 |
A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection | 2403.17218 | None | 2024-03-25 |
A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly | 2312.02003 | None | 2024-03-20 |
Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing | 2403.03897 | https://github.com/asmitaj08/fuzzingbusybox_llm | 2024-03-06 |
Finetuning Large Language Models for Vulnerability Detection | 2401.17010 | https://github.com/rmusab/vul-llm-finetune | 2024-03-01 |
LuaTaint: A Static Taint Analysis System for Web Interface Framework Vulnerability of IoT Devices | 2402.16043 | None | 2024-02-25 |
CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation | 2402.12222 | None | 2024-02-19 |
LLbezpeky: Leveraging Large Language Models for Vulnerability Detection | 2401.01269 | None | 2024-02-13 |
ReposVul: A Repository-Level High-Quality Vulnerability Dataset | 2401.13169 | https://github.com/eshe0922/reposvul | 2024-02-08 |
Effective Bug Detection in Graph Database Engines: An LLM-based Approach | 2402.00292 | None | 2024-02-01 |
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning | 2401.16185 | None | 2024-01-29 |
Language Models are Better Bug Detector Through Code-Pair Classification | 2311.07957 | https://github.com/kamel773/code_pair_classification | 2024-01-28 |
LLM4Fuzz: Guided Fuzzing of Smart Contracts with Large Language Models | 2401.11108 | None | 2024-01-20 |
How Far Have We Gone in Vulnerability Detection Using Large Language Models | 2311.12420 | https://github.com/hustcw/vulbench | 2023-12-22 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks | 2406.02550 | https://github.com/ablghtianyi/ICL_Modular_Arithmetic | 2024-06-04 |
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding | 2401.07851 | https://github.com/hemingkx/speculativedecodingpapers | 2024-06-04 |
Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data | 2406.02394 | https://github.com/maximegmd/glianorex-gen | 2024-06-04 |
NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism | 2403.00862 | https://github.com/iaar-shanghai/newsbench | 2024-06-04 |
CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models | 2402.13109 | None | 2024-06-04 |
In-Context Unlearning: Language Models as Few Shot Unlearners | 2310.07579 | https://github.com/MartinPawel/In-Context-Unlearning | 2024-06-04 |
A Framework for Neurosymbolic Robot Action Planning using Large Language Models | 2303.00438 | https://github.com/alessiocpt/teriyaki | 2024-06-04 |
Distortions in Judged Spatial Relations in Large Language Models | 2401.04218 | None | 2024-06-04 |
Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data | 2406.02100 | None | 2024-06-04 |
I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering | 2406.02060 | None | 2024-06-04 |
QROA: A Black-Box Query-Response Optimization Attack on LLMs | 2406.02044 | https://github.com/qroa/qroa | 2024-06-04 |
Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model | 2305.16617 | None | 2024-06-04 |
LDB: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | 2402.16906 | https://github.com/floridsleeves/llmdebugger | 2024-06-04 |
Iterative Forward Tuning Boosts In-Context Learning in Language Models | 2305.13016 | None | 2024-06-04 |
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models | 2402.03181 | https://github.com/kangmintong/c-rag | 2024-06-04 |
Enhancing Trust in LLMs: Algorithms for Comparing and Interpreting LLMs | 2406.01943 | None | 2024-06-04 |
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models | 2406.01359 | None | 2024-06-04 |
REBUS: A Robust Evaluation Benchmark of Understanding Symbols | 2401.05604 | https://github.com/cvndsh/rebus | 2024-06-03 |
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs | 2402.17649 | None | 2024-06-03 |
What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores | 2406.01538 | https://github.com/ebrahimfeghhi/beyond-brainscore | 2024-06-03 |
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering | 2402.08277 | https://github.com/EdisonNi-hku/Robust_Evidence_Based_QA | 2024-06-03 |
Understanding Preference Fine-Tuning Through the Lens of Coverage | 2406.01462 | None | 2024-06-03 |
LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation | 2406.01441 | None | 2024-06-03 |
Large Language Models are Zero-Shot Next Location Predictors | 2405.20962 | https://github.com/ssai-trento/llm-zero-shot-nl | 2024-06-03 |
Human vs. Machine: Behavioral Differences Between Expert Humans and Language Models in Wargame Simulations | 2403.03407 | https://github.com/ancorso/llmwargaming | 2024-06-03 |
Aligner: Efficient Alignment by Learning to Correct | 2402.02416 | None | 2024-06-03 |
BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM Safeguards | 2406.01364 | None | 2024-06-03 |
The Dawn of Natural Language to SQL: Are We Fully Ready? | 2406.01265 | None | 2024-06-03 |
Using EEG to investigate the effectiveness of applying ChatGPT | 2403.16687 | None | 2024-06-03 |
CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge | 2402.07688 | None | 2024-06-03 |
Generating Exceptional Behavior Tests with Reasoning Augmented Large Language Models | 2405.14619 | None | 2024-05-24 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code | 2402.09299 | https://github.com/commissarsilver/trawic | 2024-02-14 |
Investigating the Efficacy of Large Language Models for Code Clone Detection | 2401.13802 | https://github.com/mkhfring/largelanguagemodels | 2024-01-30 |
Greening Large Language Models of Code | 2309.04076 | https://github.com/soarsmu/Avatar | 2024-01-12 |
Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey | 2308.01191 | None | 2023-08-06 |
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review | 2307.02503 | None | 2023-07-04 |
Understanding Programs by Exploiting (Fuzzing) Test Cases | 2305.13592 | https://github.com/rabbitjy/fuzztuning | 2023-06-12 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search | 2401.04514 | https://github.com/alex-haochenli/reco | 2024-06-03 |
ACES: Generating Diverse Programming Puzzles with Autotelic Generative Models | 2310.10692 | None | 2024-05-29 |
Natural Is The Best: Model-Agnostic Code Simplification for Pre-trained Large Language Models | 2405.11196 | https://github.com/gksajy/slimcode | 2024-05-18 |
REINFOREST: Reinforcing Semantic Code Similarity for Cross-Lingual Code Search Models | 2305.03843 | https://github.com/reinforest-team/reinforest | 2024-04-15 |
AutoCodeRover: Autonomous Program Improvement | 2404.05427 | https://github.com/nus-apr/auto-code-rover | 2024-04-15 |
GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding | 2311.09707 | None | 2023-11-16 |
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation | 2305.06156 | https://github.com/fsoft-ai4code/thevault | 2023-10-30 |
Language Models are Universal Embedders | 2310.08232 | https://github.com/izhx/uni-rep | 2023-10-12 |
Code Representation Pre-training with Complements from Program Executions | 2309.09980 | None | 2023-09-04 |
Explainable AI for Pre-Trained Code Models: What Do They Learn? When They Do Not Work? | 2211.12821 | None | 2023-08-28 |

Title | ArXiv Link | GitHub Link | Last Update |
---|---|---|---|
ChatDev: Communicative Agents for Software Development | 2307.07924 | https://github.com/openbmb/chatdev | 2024-06-05 |
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data | 2405.19265 | https://github.com/internlm/alchemistcoder | 2024-05-29 |
AI-Assisted Assessment of Coding Practices in Modern Code Review | 2405.13565 | None | 2024-05-22 |
Multi-role Consensus through LLMs Discussions for Vulnerability Detection | 2403.14274 | https://github.com/rockmao45/llmvulndetection | 2024-05-18 |
PrAIoritize: Automated Early Prediction and Prioritization of Vulnerabilities in Smart Contracts | 2308.11082 | None | 2024-05-15 |
An Empirical Study on Code Review Activity Prediction and Its Impact in Practice | 2404.10703 | None | 2024-05-13 |
Fine-Tuning and Prompt Engineering for Large Language Models-based Code Review Automation | 2402.00905 | None | 2024-05-02 |
AI-powered Code Review with LLMs: Early Results | 2404.18496 | None | 2024-04-29 |
When LLM-based Code Generation Meets the Software Development Process | 2403.15852 | None | 2024-03-23 |
Software Vulnerability and Functionality Assessment using LLMs | 2403.08429 | None | 2024-03-13 |
CodeAgent: Collaborative Agents for Software Engineering | 2402.02172 | None | 2024-02-15 |
How to Refactor this Code? An Exploratory Study on Developer-ChatGPT Refactoring Conversations | 2402.06013 | None | 2024-02-08 |
Assured LLM-Based Software Engineering | 2402.04380 | None | 2024-02-06 |
Improving Automated Code Reviews: Learning from Experience | 2402.03777 | None | 2024-02-06 |
CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents | 2402.01411 | None | 2024-02-02 |
The role of library versions in Developer-ChatGPT conversations | 2401.16340 | None | 2024-01-29 |
Security Code Review by LLMs: A Deep Dive into Responses | 2401.16310 | None | 2024-01-29 |
Code Review Automation: Strengths and Weaknesses of the State of the Art | 2401.05136 | https://github.com/codereviewautomationsota/code_review_automation_sota | 2024-01-10 |
The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model | 2312.17485 | None | 2023-12-29 |
Explaining Explanation: An Empirical Study on Explanation in Code Reviews | 2311.09020 | None | 2023-11-15 |
ChatGPT for Vulnerability Detection, Classification, and Repair: How Far Are We? | 2310.09810 | https://github.com/awsm-research/chatgpt4vul | 2023-10-15 |
LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning | 2308.11148 | None | 2023-09-05 |
Enhancing Automated Program Repair through Fine-tuning and Prompt Engineering | 2304.07840 | None | 2023-07-21 |
ChatGPT: A Study on its Utility for Ubiquitous Software Engineering Tasks | 2305.16837 | None | 2023-05-26 |
CoditT5: Pretraining for Source Code and Natural Language Editing | 2208.05446 | https://github.com/engineeringsoftware/coditt5 | 2022-09-14 |