zkcpku/HiT-hierarchy-transformer

Implant Global and Local Hierarchy Information to Sequence based Code Representation Models

Accepted by ICPC 2023: https://conf.researchr.org/details/icpc-2023/icpc-2023-research/1/Implant-Global-and-Local-Hierarchy-Information-to-Sequence-based-Code-Representation-

Link to Preprint: https://arxiv.org/pdf/2303.07826.pdf

  • We provide source code for four tasks: code classification, clone detection, method name prediction, and variable scope detection.
  • We have edited the config files that contained personal information, so be careful when using them. We will release the full version of the source code as soon as possible. Small adjustments to the file structure on GitHub may have introduced bugs; I will clean it up when I have time. If you find any issues, feel free to open an issue or send me an email.
  • The preprocessing script for each task is in that task's directory.
  • The raw dataset links come from open-source repos:

Clone Detection

In the directory Clone-detection-POJ-104\

Run python preprocess_path.py to process the datasets.

Run python run.py to train our model.

To test instead, edit config.py and set args.test = True.
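The following is a minimal sketch of how a flag like args.test is commonly used to switch between training and testing. It assumes config.py exposes an `args` object with a boolean `test` attribute; the `train`, `evaluate`, and `main` functions are hypothetical placeholders and are not taken from this repository.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the `args` object defined in config.py.
args = SimpleNamespace(test=False)


def train(args):
    # Placeholder for the training loop (hypothetical).
    return "train"


def evaluate(args):
    # Placeholder for the evaluation routine (hypothetical).
    return "test"


def main(args):
    # Dispatch on the flag: evaluate when args.test is True, otherwise train.
    return evaluate(args) if args.test else train(args)


if __name__ == "__main__":
    print(main(args))
```

With this layout, switching modes only requires changing one value in the config file rather than editing the entry script.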

Code Classification

In the directory classification\

Run python code_classfication_preprocess.py and python processed_tokens_with_path.py to process the datasets. Be sure to change the dataset path in each Python file first.

Run python generate_vocab.py to create the vocab file.

Run python run_xxx.py to train or test the different models; run.sh gives an example shell script.
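The preprocessing steps above can be chained in a small driver script. This is only a sketch: the script names are those listed above, but the run order, the `classification` working directory argument, and the helper names `build_commands` and `run_all` are assumptions for illustration. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in this README; the run order is an assumption.
STEPS = [
    "code_classfication_preprocess.py",
    "processed_tokens_with_path.py",
    "generate_vocab.py",
]


def build_commands(steps=STEPS):
    # One command per script, using the current Python interpreter.
    return [[sys.executable, name] for name in steps]


def run_all(workdir="classification", dry_run=True):
    commands = build_commands()
    for cmd in commands:
        if dry_run:
            # Only show what would be executed.
            print("would run:", " ".join(cmd))
        else:
            # check=True stops at the first script that fails.
            subprocess.run(cmd, cwd=Path(workdir), check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call `run_all(dry_run=False)` only after the dataset paths inside each script have been updated.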

Method Name Prediction

In the directory methodname\

Run python preprocess_seq_path.py to process the datasets.

Run python __main__.py to train the model.

To test instead, edit config.py and set args.test = True.

Variable Scope Detection

In the directory variable scope detection\

Use the same datasets as the classification task, with minor changes.

Run python probe_for_HiT.py to train and test the model.

To test only, edit probe_config.py and set args.test = True.
