Code translation converts source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code translation. In our recent paper [https://dl.acm.org/doi/10.1145/3597503.3639226], published at ICSE'24, we found LLM-based code translation to be very promising. In this example, we walk through the steps of translating each Java class to Python and checking various properties of the translated code, such as the number of methods, the number of fields, and the formal arguments.

(Step 1) First, we import all the necessary libraries.

In [None]:
import ollama
from cldk import CLDK
from cldk.analysis import AnalysisLevel

(Step 2) Second, we build the prompt for the model. The prompt includes the body of the Java class with all comments and import statements removed.

In [None]:
def format_inst(code, focal_class, language):
    """
    Build the translation prompt for the class `focal_class`, embedding `code` in a fenced block labeled `language`.
    """
    inst = f"Question: Can you translate the Java class `{focal_class}` below to Python and generate under code block (```)?\n"

    inst += "\n"
    inst += f"```{language}\n"
    inst += code
    inst += "```" if code.endswith("\n") else "\n```"
    inst += "\n"
    return inst
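The prompt builder above expects `code` to already be free of comments and import statements. As a rough illustration of that preprocessing, here is a minimal regex-based sketch. The helper name `strip_comments_and_imports` and the sample class are hypothetical, and this is not necessarily how the original pipeline sanitizes code. A parser-based approach would be more robust, and this sketch does not handle Java text blocks.

```python
import re

# Match string/char literals first so comment markers inside them survive.
_TOKEN = re.compile(
    r'''("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')  # group 1: string or char literal
      | /\*.*?\*/                               # block comment (incl. Javadoc)
      | //[^\n]*                                # line comment
    ''',
    re.DOTALL | re.VERBOSE,
)
_IMPORT = re.compile(r"\s*import\s+(static\s+)?[\w.]+(\.\*)?\s*;\s*$")


def strip_comments_and_imports(java_source: str) -> str:
    """Hypothetical helper: drop comments, import lines, and blank lines."""
    # Keep literals (group 1) untouched; replace comments with nothing.
    no_comments = _TOKEN.sub(lambda m: m.group(1) or "", java_source)
    kept = [
        line.rstrip()
        for line in no_comments.splitlines()
        if line.strip() and not _IMPORT.match(line)
    ]
    return "\n".join(kept) + "\n"


example = '''import java.util.List;

/** A counter. */
public class Counter {
    private int count = 0; // current value
    String url = "http://example.com";
    public void increment() { count++; }
}
'''
cleaned = strip_comments_and_imports(example)
print(cleaned)
```

The resulting string can be passed as the `code` argument of `format_inst`. Note that the `//` inside the string literal is preserved because literals are matched before comments.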

(Step 3) Third, we create a function to call the LLM. There are various ways to do this. For illustrative purposes, we use ollama, a library for communicating with locally downloaded models.

In [None]:
def prompt_ollama(message: str, model_id: str = "granite-code:8b-instruct") -> str:
    """Send a prompt to a local model served by Ollama and return the generated text."""
    response_object = ollama.generate(model=model_id, prompt=message)
    return response_object["response"]
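Because the prompt asks the model to answer inside a code block, the raw response usually mixes prose with fenced code. The sketch below shows one way to pull out the translated Python. The helper `extract_code_block` and the sample response are hypothetical and are not part of ollama or CLDK. It assumes the model wraps its answer in a triple-backtick fence and falls back to the full response otherwise.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly so this example nests cleanly

# Opening fence, optional language tag, then everything up to the closing fence.
_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_block(response: str) -> str:
    """Hypothetical helper: return the first fenced block, or the whole response."""
    match = _BLOCK.search(response)
    return match.group(1).strip() if match else response.strip()


# Stand-in for the string returned by the model.
sample_response = (
    "Here is the translation:\n"
    f"{FENCE}python\n"
    "class Counter:\n"
    "    pass\n"
    f"{FENCE}\n"
    "Done."
)
python_code = extract_code_block(sample_response)
print(python_code)
```

In the notebook, the input would be the string returned by `prompt_ollama` rather than `sample_response`.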