linrenmeng/Trans_code

TransformCode

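Source paper: TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation | IEEE Transactions on Software Engineering

Original repository: iamfaith/TransformCode

This project uses the clone-detection model that the original repository open-sourced (see the download link below) to identify similar code snippets.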

CodeBERT

method name prediction with CodeBERT

The model file is codebert_predictor.py

The weights can be downloaded from the link below (GitHub LFS space is limited to 1 GB, so a netdisk is used instead). Link: https://pan.baidu.com/s/1IMaBapXZ6_tXSdxYMbQIdg?pwd=csci Extraction code: csci

Usage

Install the dependencies:

pip install -r requirements.txt

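Then edit the paths in code_embedder_full.py so they point to your local copies.

The codebert_clone_model.bin file referenced here must be downloaded first. It is available from the netdisk link above and also from https://drive.google.com/drive/folders/1KhRi9evmwf-GvydsobV73f5uW3xAi89z?usp=drive_link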

class CodeEmbedder:
    def __init__(
        self,
        tokenizer_path: str = "/yourpath/TransformCode/custom_tokenizer/WordPiece_tokenizer.json",
        weight_path: str = "/yourpath/TransformCode/weight/codebert_clone_model.bin",
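
Because both paths must point to files that were downloaded separately, it can help to confirm they exist before constructing the embedder. The following is a minimal sketch, not part of the repository: `missing_files` is a hypothetical helper, and the two paths are the placeholders from the snippet above.

```python
from pathlib import Path


def missing_files(paths):
    """Return the paths (as strings) that do not exist as regular files."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


if __name__ == "__main__":
    # Placeholder paths copied from the snippet above; replace /yourpath.
    required = [
        "/yourpath/TransformCode/custom_tokenizer/WordPiece_tokenizer.json",
        "/yourpath/TransformCode/weight/codebert_clone_model.bin",
    ]
    missing = missing_files(required)
    if missing:
        print("Missing files; download them or fix the paths:")
        for path in missing:
            print("  " + path)
    else:
        print("All required files found.")
```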


Then run code_resource_pool.py directly to get reference optimization snippets:

if __name__ == "__main__":

    # Initialize the resource pool (replace the path with your own api.json)
    pool = CodeResourcePool(json_path="/data/junwan/TransformCode/data/api.json")

    test_code = "user_input = input()"

    # Fetch the 3 most similar code snippets
    top3_codes = pool.get_top3_similar_codes(test_code)
    print("\n3 optimization snippets, for reference only:")
    for i, code in enumerate(top3_codes, 1):
        # print(code)
        before = code["before"]
        after = code["after"]
        print(f"{i}. {before} -> {after}")
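
The lookup above returns entries with `before` and `after` fields, ranked by similarity to the query. As a rough illustration of that kind of retrieval, the sketch below ranks entries by cosine similarity. It is not the repository's implementation: `toy_embed` is a hypothetical token-count stand-in for the CodeBERT-based clone model, `top_k_similar` is a hypothetical name, and the sample entries are invented for the demonstration.

```python
import math
import re
from collections import Counter


def toy_embed(code):
    """Stand-in embedding: token counts (NOT the CodeBERT clone model)."""
    return Counter(re.findall(r"\w+", code))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_similar(query, entries, k=3):
    """Return the k entries whose 'before' code is most similar to query."""
    q = toy_embed(query)
    ranked = sorted(
        entries,
        key=lambda e: cosine(q, toy_embed(e["before"])),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Invented sample entries; the real data lives in data/api.json.
    entries = [
        {"before": "f = open(path)", "after": "with open(path) as f: pass"},
        {"before": "x = input()", "after": "x = input().strip()"},
        {"before": "s = s + str(n)", "after": "s = ''.join([s, str(n)])"},
    ]
    for i, e in enumerate(top_k_similar("user_input = input()", entries), 1):
        print(f"{i}. {e['before']} -> {e['after']}")
```

In the actual project the embedding would come from the clone-detection model rather than token counts; only the ranking step is illustrated here.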

To run inference directly, use llm_test.py:

    # Initialize the resource pool
    pool = CodeResourcePool(json_path="/yourpath/TransformCode/data/api.json")
    
    # Example code to optimize
    code_to_optimize = "xxx"
    
    # Run the optimization
    optimized_result = code_optimization_pipeline(code_to_optimize, pool)
    
    if optimized_result:
        print(optimized_result)
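
The internals of `code_optimization_pipeline` are not shown in this README. One plausible step in such a pipeline is turning the retrieved `before`/`after` pairs into a prompt for the language model. The sketch below illustrates that step only; `build_prompt` and its wording are assumptions, not code from llm_test.py.

```python
def build_prompt(code, examples):
    """Format retrieved before/after pairs and the target code as one prompt.

    Hypothetical helper: the prompt wording is an assumption for illustration.
    """
    lines = ["Optimize the following code. Reference rewrites (guidance only):"]
    for i, ex in enumerate(examples, 1):
        lines.append(f"{i}. {ex['before']} -> {ex['after']}")
    lines.append("Code to optimize:")
    lines.append(code)
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented example pair, standing in for the pool's top-3 results.
    refs = [{"before": "x = input()", "after": "x = input().strip()"}]
    print(build_prompt("user_input = input()", refs))
```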
