Raw Data:

Description:

We release our validation and test datasets. You can download the raw data here.

Each line in each file has the format <FUNCTION_ID> | <function>. The functions are tokenized. You can detokenize them with the script preprocessing/detokenize.py. You can extract the function ID and use it to find the corresponding test script in data/evaluation/geeks_for_geeks_successful_test_scripts/<language>, if it exists.

For instance, for the line COUNT_SET_BITS_IN_AN_INTEGER_3 | <function> in the file test.cpp.shuf.valid.tok, the corresponding test script can be found in data/evaluation/geeks_for_geeks_successful_test_scripts/cpp/COUNT_SET_BITS_IN_AN_INTEGER_3.cpp. If the script is missing, it means there was an issue with our automatically created tests for the corresponding function.
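As a minimal sketch of the lookup described above, the Python snippet below splits one line into its function ID and tokenized function, then builds the expected test script path. The helper names `parse_line` and `expected_script_path` are hypothetical and not part of the repository. The sketch also assumes the file extension equals the language folder name, which the README confirms only for `cpp`; pass `extension` explicitly for other languages.

```python
from pathlib import Path

# Directory named in the README; the language subfolder is appended per call.
SCRIPTS_ROOT = Path("data/evaluation/geeks_for_geeks_successful_test_scripts")


def parse_line(line):
    """Split '<FUNCTION_ID> | <function>' at the first ' | ' separator.

    Only the first separator is used, because the tokenized function
    itself may contain '|' tokens (for example a bitwise OR).
    """
    function_id, _, tokenized = line.rstrip("\n").partition(" | ")
    return function_id.strip(), tokenized


def expected_script_path(function_id, language, extension=None, root=SCRIPTS_ROOT):
    """Build the path where the test script should live (hypothetical helper).

    Assumption: the extension defaults to the language folder name, which
    matches the README's cpp example; other languages may differ.
    """
    ext = extension if extension is not None else language
    return root / language / f"{function_id}.{ext}"


if __name__ == "__main__":
    fid, func = parse_line("COUNT_SET_BITS_IN_AN_INTEGER_3 | int f ( int n ) { return n | 1 ; }")
    path = expected_script_path(fid, "cpp")
    # The script may be missing if test generation failed for this function.
    print(fid, path, path.exists())
```

Checking `path.exists()` before use covers the case where no test script was produced for a function.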

The code generated by your model can be tested by injecting it where the TO_FILL comment is in the test script.
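The injection step could be sketched as follows. The function `inject_function` is a hypothetical helper, and the exact spelling of the marker inside the test scripts is an assumption, so it is exposed as a parameter; check a real script and adjust `marker` if needed. The sketch replaces the whole line containing the marker with the generated code.

```python
def inject_function(script_text, generated_code, marker="TO_FILL"):
    """Replace the first line containing `marker` with the generated code.

    Assumption: the placeholder sits on a line of its own as a comment;
    the marker string is configurable because its exact spelling in the
    test scripts is not confirmed here.
    """
    lines = script_text.splitlines()
    for i, line in enumerate(lines):
        if marker in line:
            lines[i] = generated_code
            return "\n".join(lines) + "\n"
    raise ValueError(f"marker {marker!r} not found in test script")


if __name__ == "__main__":
    # Toy script used only for illustration, not taken from the dataset.
    script = "#include <iostream>\n// TO_FILL\nint main() { return 0; }\n"
    print(inject_function(script, "int f(int n) { return n; }"))
```

Raising an error when the marker is absent avoids silently running a test script that does not contain the model output.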

wget https://dl.fbaipublicfiles.com/transcoder/TransCoder_tokenized_test_set_functions.zip

Test & Evaluation Datasets:

wget https://dl.fbaipublicfiles.com/transcoder/TransCoder_test_val_data.zip