Critic Training pre-processing steps #47

Open
xylankant opened this issue Jun 27, 2023 · 3 comments

@xylankant

Hello,

Thanks for making the code for this project open source, it is a great resource!

We are using CodeRL as a starting point for student projects, and we have a few questions to make sure we understand the setup correctly.
In the "Critic Training" section, you say the following:

We can train a critic model as a classifier that predicts the test outcomes of generated samples. For each training sample, we can follow the prior processes (generating programs and running unit tests) to obtain synthetic samples and their annotations of unit test outcomes. On average, we generate 20 programs per training sample (we provided some example generated programs in data/APPS/train/).

  • You don't explicitly say, but from context I think you are using the CodeT5-large-ntp-py model for this?
  • What do you mean by "on average" 20 programs per training sample? The generation code does not allow for specifying an "average" number of generated solutions; it always produces the specified number of outputs per instance.
  • Related to that, when comparing the provided example outputs in data/APPS/train/, we see that all of the solutions in the gen_solutions.json files look like "good" code, and sometimes there are fewer than n=20 of them. However, when we use the CodeT5-large-ntp-py model to generate solutions ourselves, there are always n solutions, and while the model sometimes outputs code, much of the time it produces no code at all but some other output, such as repeated natural-language descriptions, e.g.:
print(gen_data['0']['code'][0])
�� the number of words that played the game.


ANSWER:


"""

class Solution(object):
    def reverse(self, n):
        """
        :type n: int
        :rtype: int
        """
        if n == 0:
            return -1
        l = list(bin(n))
        l.reverse()
        return sum(l)

if __name__ == '__main__':
    print Solution().reverse(int(raw_input()))

[...]

print(gen_data['0']['code'][2])
�� the answer.

ANSWER:

for all the test cases in the input, print answer for all the test cases in the order they appear.

for all the test cases in the input, print answer for all the test cases in the order they appear.

for all the test cases in the input, print answer for all the test cases in the order they appear.

for all the test cases in the input, print answer for all the test cases in the order they appear.
[...]
  • Is there some post-processing step that we are overlooking? A sketch of the kind of filter we have in mind is shown below.
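
For reference, the kind of filter we would naively try ourselves is sketched below. This is purely our own assumption (we did not find anything like it in the repo): drop any generation that does not parse as valid Python before writing gen_solutions.json.

import ast
import json

def keep_compilable(candidates):
    # Keep only generations that parse as valid Python source; everything
    # else (natural-language ramblings, truncated output) is dropped.
    kept = []
    for code in candidates:
        try:
            ast.parse(code)
            kept.append(code)
        except (SyntaxError, ValueError):
            continue
    return kept

# gen_data[problem_id]['code'] is the list of n generated programs,
# as in the snippets above; the file name is only an illustration.
with open("outputs/codes/gen_data.json") as f:
    gen_data = json.load(f)

filtered = keep_compilable(gen_data['0']['code'])
print(f"kept {len(filtered)} of {len(gen_data['0']['code'])} generations")

Note that ast.parse under Python 3 would also reject Python-2-style generations like the first example above, so this is clearly not exactly what was done; it only illustrates the kind of post-processing we are asking about.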
@Mucalinda2436

Hello, I also have a question about this section. I noticed that the generated programs and their evaluations are stored in the folders 'outputs/codes/' and 'outputs/test_results/'. The README also says: "For each training sample, we can follow the prior processes (generating programs and running unit tests) to obtain synthetic samples and their annotations of unit test outcomes." But why is the data in 'data/APPS/train' then used to train the critic model? Since you asked a question about the same section, maybe you can answer mine, thanks a lot!

@xylankant
Author

When training in critic mode, the dataset will load the generated solutions as well: see here in APPSBaseDataset.
For this, you'll need to have the gen_solutions.json that contains solutions to training problems generated by the original model, which you can obtain by generating programs and running unit tests.

Hope this helps.
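
In case it helps, the rough picture is that each training problem directory ends up with a gen_solutions.json holding the generated programs together with their unit test outcomes, and the critic is trained on those (code, outcome) pairs. A minimal sketch of that structure is below; the field names 'code' and 'result' are only an illustration, not copied from APPSBaseDataset.

import json
import os

def load_generated_solutions(prob_path):
    # Illustrative only: read the generated programs and their unit test
    # outcome labels for one training problem. Field names are assumptions.
    with open(os.path.join(prob_path, "gen_solutions.json")) as f:
        gen_solutions = json.load(f)
    return [(s["code"], s["result"]) for s in gen_solutions]

# e.g. pairs = load_generated_solutions("data/APPS/train/<prob_path>")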

@Mucalinda2436

> When training in critic mode, the dataset will load the generated solutions as well: see here in APPSBaseDataset. For this, you'll need to have the gen_solutions.json that contains solutions to training problems generated by the original model, which you can obtain by generating programs and running unit tests.
>
> Hope this helps.

Thanks for helping! But after reading the code for 'generating programs' and 'running unit tests', I noticed that the programs generated by the actor model are saved under 'outputs/codes/' and their evaluation results under 'outputs/test_results/'. So it seems that the 'gen_solutions.json' you mentioned under 'data/APPS/train/prob_path/' does not exist yet, since nothing is written to that file during 'generating programs' and 'running unit tests'. Did I miss some code that produces 'gen_solutions.json'? My current guess, sketched below, is that this mapping has to be written by hand. Thanks again for your answer 🙏
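
Something along these lines is what I have in mind for converting 'outputs/codes/' and 'outputs/test_results/' into per-problem gen_solutions.json files; all file names and dictionary keys below are my own assumptions based on the folder names above, not verified against CodeRL:

import json
import os
import pickle

codes_dir = "outputs/codes"
results_dir = "outputs/test_results"
train_dir = "data/APPS/train"

for fname in os.listdir(codes_dir):
    prob_id = os.path.splitext(fname)[0]
    # Generated programs for this problem (same layout as gen_data above).
    with open(os.path.join(codes_dir, fname)) as f:
        gen = json.load(f)
    # Unit test outcomes for the same programs, assumed to be pickled
    # in the same order as the generations.
    with open(os.path.join(results_dir, prob_id + ".pkl"), "rb") as f:
        results = pickle.load(f)

    solutions = [
        {"code": code, "result": results[i]}
        for i, code in enumerate(gen[prob_id]["code"])
    ]

    out_path = os.path.join(train_dir, prob_id, "gen_solutions.json")
    with open(out_path, "w") as f:
        json.dump(solutions, f)

If there is an existing script in the repo that already does this, a pointer would be much appreciated.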
