
code_repair task evaluation #2

@scorpio-nova

Description


Dear Authors,
Thanks for the great work! I'm planning to do some work based on this dataset and have run into a small problem:
Following the README, I've run the code_repair inference for several open-source LLMs, and the results are saved to CodeScope/code_repair/inference/result/code_repair_eval_{model_name}.jsonl by default.
I'm not sure what the next step should be.

  1. The jsonl file looks like this:
    (screenshot of the output jsonl)
    If I'm not mistaken, source_code is the code for the LLM to debug, and code_repairing_0 is the LLM's output, containing code plus some explanations.
  2. However, the next step suggests that the Evaluator needs an input jsonl in a different format:
    (screenshot of the Evaluator's expected input format)

My question is how to convert the jsonl in 1 into the required input format of 2. Specifically:

  1. Should "source_code" in 2 be replaced by the code extracted from the model's output? Is there existing code to perform this conversion?
  2. For "lang_cluster", "lang", etc., should I replace the values with "{model_name}" as in the README? Or is that just a placeholder, and I should keep the original values from code_repair_eval_{model_name}.jsonl without changing anything?
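In case it helps clarify what I mean, here is a rough sketch of the conversion I have in mind. It assumes question 1's answer is "yes" (overwrite source_code with the extracted repair, keep all other fields as-is) and that the model wraps its repaired code in triple-backtick fences; the function names and both assumptions are mine, not from the repo:

```python
import json
import re

def extract_code(answer: str) -> str:
    # Pull the first triple-backtick fenced block out of the model's answer,
    # skipping any language tag on the opening fence. If no fence is found,
    # fall back to the raw answer text (assumption on my part).
    match = re.search(r"```[^\n]*\n(.*?)```", answer, re.DOTALL)
    return match.group(1).strip() if match else answer.strip()

def convert(in_path: str, out_path: str) -> None:
    # Rewrite the inference output jsonl into (what I believe is) the
    # Evaluator's input format: same records, but with source_code replaced
    # by the code extracted from code_repairing_0. lang, lang_cluster, etc.
    # are left untouched, per my guess in question 2.
    with open(in_path, encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            record = json.loads(line)
            record["source_code"] = extract_code(record["code_repairing_0"])
            fout.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Is this roughly the intended workflow, or is there an official conversion script I've missed?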

Thanks for your help!
