RobustAPI

The official repo for the paper Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation (AAAI'24).

This dataset contains 1208 coding questions (dataset/question.jsonl) collected from Stack Overflow, covering 24 representative Java APIs (listed in dataset/api_list.txt). We summarize the usage patterns of these APIs (eval/pat_list.txt) and evaluate popular LLMs, including GPT-3.5, GPT-4, Llama, PolyCoder, and Vicuna, on this benchmark. Hugging Face
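As a quick way to inspect the questions, the file can be read as JSON Lines (one JSON object per line). The sketch below is illustrative only: `load_jsonl` is a hypothetical helper, not part of this repo, and it makes no assumption about the field names inside each record.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file: one JSON value per non-empty line.

    Hypothetical helper for illustration; it is not part of the repo.
    """
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines
                records.append(json.loads(line))
    return records
```

Calling `load_jsonl("dataset/question.jsonl")` from the repository root should return one record per question; print the first record to see which fields are available.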

Setup

See Llama, Vicuna, and GPT for setup instructions.

Install the dependencies:

pip install -r requirements.txt

Prompts

To generate responses from the large language models, see scripts in scripts/ask*.

Evaluator

To evaluate the API misuse rate in the generated answers, see scripts in scripts/eval*.

Since the API checker is written in Java, you need a Java Runtime Environment installed on your machine. In our experiments, the checker was validated with OpenJDK 11.0.20.1.
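Before running the evaluation scripts, it can help to confirm that a Java runtime is on the PATH. The following is a minimal sketch, not part of the repo: `parse_java_major` and `installed_java_major` are hypothetical helpers that read the output of `java -version`.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2):
        return int(match.group(2))
    return major


def installed_java_major():
    """Return the major version of the `java` on PATH, or None if absent."""
    if shutil.which("java") is None:
        return None
    # `java -version` writes its report to stderr; read both streams to be safe.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)
```

If `installed_java_major()` returns `None`, no Java runtime was found. Other versions than the validated OpenJDK 11 may work but are untested according to this README.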

We acknowledge the ICSE'18 paper ExampleCheck, on which our checker is built.

Results of Evaluation

The code responses are in results/. Each model has its own directory, in which each JSON file holds the model's response to one Stack Overflow question. The file numbering matches the question numbering in dataset/question.jsonl.
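To line responses up with questions, one option is to index a model's result files by the number in each file name. This is a sketch under assumptions: `index_responses` is a hypothetical helper, it assumes the question number is the last run of digits in the file name, and whether numbering starts at 0 or 1 should be checked against the actual files.

```python
import re
from pathlib import Path


def index_responses(model_dir):
    """Map the number found in each JSON file name to that file's path.

    Assumption (not confirmed by the repo): the question number is the
    last run of digits in the file name, e.g. "12.json" -> 12.
    """
    indexed = {}
    for path in Path(model_dir).glob("*.json"):
        digits = re.findall(r"\d+", path.stem)
        if digits:  # skip JSON files without a number in the name
            indexed[int(digits[-1])] = path
    return indexed
```

The resulting dictionary can then be looked up by question number when comparing a response against the corresponding entry in dataset/question.jsonl.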

If you find our work useful, please cite the paper:

@misc{zhong2023chatgpt,
      title={Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation}, 
      author={Li Zhong and Zilong Wang},
      year={2023},
      eprint={2308.10335},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
