# **Hands-On: Legacy Code Translation - Intern Exercise Workbook**

---

### 📋 Problem Statement

You are given a piece of **legacy code** written in an outdated programming language (COBOL). Your tasks are:

1. **Understand** the code's functionality: aim, core features, and logic
2. **Explain** it clearly for a modern developer
3. **Translate** it into a modern programming language (Java)
4. **Generate test cases** to verify the translation

You will use **prompt engineering** to accomplish each task with an LLM.

---

### 📋 Instructions
- Fill in all `___` blanks and `# TODO` sections
- 🎯 = Hint | 🏆 = Bonus Challenge

---

## Section 1: OpenAI Setup

### Q1.1 - Install the OpenAI Library

**Task:** Install the latest `openai` package.

In [None]:
# TODO: Install the openai package

!pip install ___

### Q1.2 - Create the Completion Helper

**Task:** Complete the helper function that:
- Uses the newer `openai.chat.completions.create()` API (available in `openai` 1.0 and later)
- Uses `gpt-4o` model
- Sends the prompt as a user message
- Sets `max_tokens=1000`
- Returns only the text content

In [None]:
import openai

# TODO: Set your API key
openai.api_key = "___"

def completion(prompt):
    """Send a prompt to GPT-4o and return the response text."""

    # TODO: Call the chat completions API
    response = openai.chat.completions.create(
        model="___",  # Which model?
        messages=[{"role": "___", "content": ___}],
        max_tokens=___
    )

    # TODO: Return the text content
    # 🎯 Hint: response.choices[0].message.content
    return ___
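Hardcoding the key in the notebook makes it easy to leak when the file is shared. As a minimal sketch of one alternative, the key could be read from an environment variable instead. The helper name `load_api_key` is hypothetical, and `OPENAI_API_KEY` is assumed here as the variable name; adjust both to your setup.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper: raises a clear error instead of sending
    an empty key to the API.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key


# Possible usage in the setup cell (left commented so it does not run here):
# openai.api_key = load_api_key()
```

This keeps the secret out of the saved notebook while leaving the rest of the helper cell unchanged.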

---

## Section 2: Legacy Code (Small Example)

Here is a small COBOL program that demonstrates Language Environment callable services for date/time handling and message output.

### Q2.1 - Load the Legacy Code

**Task:** Run this cell to load the COBOL code. Read through it carefully.

In [None]:
# Run as-is: this is the legacy COBOL code

legacy_code = """
      ****************************************************************
      *  This program demonstrates the following Language            *
      *  Environment callable                                        *
      *  services : CEEMOUT, CEELOCT, CEEDATE                        *
      ****************************************************************
      ****************************************************************
      **           I D          D I V I S I O N                    ***
      ****************************************************************
       Identification Division.
       Program-id.    AWIXMP.
      ****************************************************************
      **           D A T A      D I V I S I O N                    ***
      ****************************************************************
       Data Division.
       Working-Storage Section.
      ****************************************************************
      **  Declarations for the local date/time service.
      ****************************************************************
       01   Feedback.
       COPY CEEIGZCT
        02   Fb-severity      PIC 9(4) Binary.
        02   Fb-detail        PIC X(10).
       77   Dest-output       PIC S9(9) Binary.
       77   Lildate           PIC S9(9) Binary.
       77   Lilsecs           COMP-2.
       77   Greg              PIC X(17).
      ****************************************************************
      **  Declarations for messages and pattern for date formatting.
      ****************************************************************
       01   Pattern.
        02                    PIC 9(4) Binary Value 45.
        02                    PIC X(45) Value
            "Today is Wwwwwwwwwwwwz, Mmmmmmmmmmz ZD, YYYY.".

       77   Start-Msg         PIC X(80) Value
            "Callable Service example starting.".

       77   Ending-Msg        PIC X(80) Value
            "Callable Service example ending.".

       01 Msg.
         02 Stringlen         PIC S9(4) Binary.
         02 Str               .
          03                  PIC X Occurs 1 to 80 times
                                     Depending on Stringlen.
      ****************************************************************
      **           P R O C      D I V I S I O N                    ***
      ****************************************************************
       Procedure Division.
       000-Main-Logic.
           Perform 100-Say-Hello.
           Perform 200-Get-Date.
           Perform 300-Say-Goodbye.
           Stop Run.
      **
      ** Setup initial values and say we are starting.
      **
       100-Say-Hello.
           Move 80 to Stringlen.
           Move 02 to Dest-output.
           Move Start-Msg to Str.
           CALL "CEEMOUT" Using Msg   Dest-output Feedback.
           Move Spaces to Str.
           CALL "CEEMOUT" Using Msg   Dest-output Feedback.
      **
      ** Get the local date and time and display it.
      **
       200-Get-Date.
           CALL "CEELOCT" Using Lildate Lilsecs     Greg      Feedback.
           CALL "CEEDATE" Using Lildate Pattern     Str       Feedback.
           CALL "CEEMOUT" Using Msg     Dest-output Feedback.
           Move Spaces to Str.
           CALL "CEEMOUT" Using Msg     Dest-output Feedback.
      **
      ** Say Goodbye.
      **
       300-Say-Goodbye.
           Move Ending-Msg to Str.
           CALL "CEEMOUT" Using Msg     Dest-output Feedback.
       End program AWIXMP.
"""

print("✅ Legacy COBOL code loaded! Length:", len(legacy_code), "characters")
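
As a reading aid for the `200-Get-Date` paragraph, the sketch below shows in Python what the `Pattern` picture string appears to produce. It assumes that `Wwwwwwwwwwwwz` and `Mmmmmmmmmmz` stand for the mixed-case weekday and month names with trailing blanks suppressed, that `ZD` is the day of the month without a leading zero, and that `YYYY` is the four-digit year. The function name `format_like_pattern` is hypothetical, and this is not a translation of the whole program.

```python
from datetime import date

# Explicit English names so the result does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def format_like_pattern(d: date) -> str:
    """Format a date the way the COBOL picture string is assumed to."""
    weekday = WEEKDAYS[d.weekday()]   # date.weekday(): Monday is 0
    month = MONTHS[d.month - 1]       # months are 1-based
    return f"Today is {weekday}, {month} {d.day}, {d.year:04d}."


print(format_like_pattern(date(2024, 3, 5)))  # Today is Tuesday, March 5, 2024.
```

Comparing a line like this with the LLM's Java output is a quick sanity check that the date formatting survived the translation.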

### Q2.2 ‚Äî Write a Basic Translation Prompt

**Task:** Write a simple prompt that asks the LLM to convert the COBOL code to Java. Use an f-string to inject the `legacy_code` variable and the target `language`.

🎯 **Hint:** This is a minimal first attempt; we'll improve it in later questions.

In [None]:
# TODO: Write a basic translation prompt
language = "java"

prompt = f"""___
  {legacy_code}
  code:"""

response = completion(prompt)
print(response)

### Q2.3 - 🏆 Improved Prompt with P.I.A.R.O

**Task:** Rewrite the prompt using the **P.I.A.R.O** framework to get a better translation:

- **Persona:** Expert legacy code migration specialist
- **Information:** The COBOL code + target language
- **Action:** Analyze the code, explain its purpose, then translate it
- **Rules:** Preserve all functionality, add comments explaining the mapping, use modern Java best practices
- **Output:** First a brief explanation, then the complete Java code
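
The five parts above can also be assembled programmatically, which keeps the section labels consistent across prompts. The following is a minimal sketch; `build_piaro_prompt` is a hypothetical helper, and the example values describe an unrelated toy task so the exercise blanks stay yours to fill in.

```python
def build_piaro_prompt(persona, information, action, rules, output):
    """Assemble a P.I.A.R.O prompt from its five parts.

    `rules` and `output` are lists of strings; each item becomes a bullet.
    """
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    output_lines = "\n".join(f"- {item}" for item in output)
    return (
        f"[PERSONA]: {persona}\n\n"
        f"[INFORMATION]:\n{information}\n\n"
        f"[ACTION]: {action}\n\n"
        f"[RULES]:\n{rule_lines}\n\n"
        f"[OUTPUT]:\n{output_lines}\n"
    )


# Toy example, deliberately unrelated to the COBOL exercise.
example = build_piaro_prompt(
    persona="Patient technical writer",
    information="A short shell script that renames files",
    action="Summarize what the script does",
    rules=["Use plain language", "Keep it under 100 words"],
    output=["One summary paragraph"],
)
print(example)
```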

In [None]:
# TODO: Write an improved prompt using P.I.A.R.O framework
language = "java"

prompt = f"""
[PERSONA]: ___

[INFORMATION]:
Target language: {language}
Legacy COBOL code:
```{legacy_code}```

[ACTION]: ___

[RULES]:
- ___
- ___
- ___

[OUTPUT]:
- ___
- ___
"""

response = completion(prompt)
print(response)

### Q2.4 - 🏆 Generate Test Cases

**Task:** Write a prompt that generates **JUnit test cases** for the translated Java code to verify correctness.

Your prompt should:
- Reference the original COBOL functionality
- Ask for test cases that verify the translation is accurate
- Request at least 3 test methods
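
A test-generation prompt needs the translated Java code, but the model's reply usually mixes explanation with fenced code. As a sketch under that assumption, the hypothetical helper `extract_code_blocks` below pulls the fenced blocks out of a reply so the Java source can be saved in its own variable before `response` is overwritten.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable

# Opening fence, optional language tag, newline, then the shortest body up to the closing fence.
_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(text: str) -> list[str]:
    """Return the contents of every fenced code block found in `text`."""
    return [body.strip() for body in _BLOCK.findall(text)]


sample_reply = (
    "Here is the translation:\n"
    + FENCE + "java\npublic class Demo {}\n" + FENCE
    + "\nLet me know if you need tests."
)
print(extract_code_blocks(sample_reply))  # ['public class Demo {}']
```

If the reply contains no fences, the function returns an empty list, which is a signal to inspect the raw text by hand.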

In [None]:
# TODO: Write a test case generation prompt

prompt = f"""
___
"""

response = completion(prompt)
print(response)

---

## Section 3: Bigger Legacy Code Challenge

Now tackle a **much larger** COBOL program: an IMS batch processing application that reads database segments and builds output records.

### Q3.1 - Load the Bigger Legacy Code

**Task:** Run this cell to load the larger COBOL program.

In [None]:
# Run as-is: this is the larger legacy COBOL code

bigger_legacy_code = """
******************************************************************
IDENTIFICATION DIVISION.
******************************************************************
PROGRAM-ID. IMSBATCH.
AUTHOR. ABC.
INSTALLATION. DEPARTMENT.
DATE-WRITTEN. DATE XX/XX/XX.
DATE-COMPILED.
******************************************************************
ENVIRONMENT DIVISION.
****************************************************************
CONFIGURATION SECTION.
SOURCE-COMPUTER. IBM-370.
OBJECT-COMPUTER. IBM-370.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT OUTFILE ASSIGN TO UT-S-OUTFILE.
******************************************************************
DATA DIVISION.
******************************************************************
FILE SECTION.
FD OUTFILE
RECORD IS VARYING IN SIZE FROM 383 TO 22727 CHARACTERS RECORDING MODE IS V
BLOCK CONTAINS 0 RECORDS
LABEL RECORDS ARE STANDARD
DATA RECORD IS O-PLT.
01 O-PLT.
05 O-PLT-DATA-AREA.
 10 O-PLTROOT-PREFIX PIC X(001).
 10 O-PLTROOT-NAME PIC X(008).
 10 O-PLTROOT-TITLE PIC X(050).
 10 O-PLTROOT-TERM-PRT-TABLE PIC X(200).
 10 O-PLT-SEQ PIC 9(008).
 10 O-PLTPE-SEQ-LAST PIC 9(008).
 10 O-SEGPROF-SEQ-LAST PIC 9(008).
05 O-PLTPE.
 10 O-PLTPE-KEY-SEQ PIC 9(03).
 10 O-PLTPE-TYPE PIC X(01).
 10 O-PLTPE-NAME PIC X(08).
 10 O-PLTPE-ACCESS PIC X(01).
 10 O-PLTPE-TITLE PIC X(30).
 10 O-PLTPE-COURTPRF PIC X(01).
05 O-SEGPROF OCCURS 400 TIMES DEPENDING ON W-SEGPROF-SEQ-LAST
INDEXED BY O-SEGPROF-INDEX.
 10 O-SEGPROF-SEQ PIC 9(08).
 10 O-SEGPROF-KEY-PREFIX PIC X(01).
 10 O-SEGPROF-KEY-SUFFIX PIC X(08).
 10 O-SEGPROF-PLT-NAME PIC X(08).
 10 O-SEGPROF-NOTE PIC X(30).
 10 O-SEGPROF-ACCESS PIC X(01).
WORKING-STORAGE SECTION.
... [truncated for display; full code available in variable]
PROCEDURE DIVISION.
ENTRY 'DLITCBL' USING DLIPCB1 DLIPCB2.
DISPLAY '*** START IMSBATCH'.
OPEN OUTPUT OUTFILE.
MOVE 0 TO O-PLT-SEQ.
MOVE GU TO READ-PLT-OPCODE.
MOVE GU TO READ-SEC-OPCODE.
... [full program reads IMS segments, builds records, writes output]
END-OF-DATABASE.
DISPLAY '*** END-OF-DATABASE'.
CLOSE OUTFILE.
DISPLAY '*** END-OF-IMSBATCH'.
GOBACK.
"""

print("✅ Bigger legacy code loaded! Length:", len(bigger_legacy_code), "characters")
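
The `RECORD IS VARYING IN SIZE FROM 383 TO 22727 CHARACTERS` clause can be checked against the `O-PLT` layout by adding up the field widths. The sketch below assumes every field shown is stored one byte per character or digit, since no `COMP` or `BINARY` usage appears in this part of the record. The dictionary and function names are hypothetical.

```python
# Field widths copied from the PIC clauses of the O-PLT record.
DATA_AREA = {
    "O-PLTROOT-PREFIX": 1, "O-PLTROOT-NAME": 8, "O-PLTROOT-TITLE": 50,
    "O-PLTROOT-TERM-PRT-TABLE": 200, "O-PLT-SEQ": 8,
    "O-PLTPE-SEQ-LAST": 8, "O-SEGPROF-SEQ-LAST": 8,
}
PLTPE = {
    "O-PLTPE-KEY-SEQ": 3, "O-PLTPE-TYPE": 1, "O-PLTPE-NAME": 8,
    "O-PLTPE-ACCESS": 1, "O-PLTPE-TITLE": 30, "O-PLTPE-COURTPRF": 1,
}
SEGPROF_ENTRY = {
    "O-SEGPROF-SEQ": 8, "O-SEGPROF-KEY-PREFIX": 1, "O-SEGPROF-KEY-SUFFIX": 8,
    "O-SEGPROF-PLT-NAME": 8, "O-SEGPROF-NOTE": 30, "O-SEGPROF-ACCESS": 1,
}

FIXED_BYTES = sum(DATA_AREA.values()) + sum(PLTPE.values())  # 283 + 44
ENTRY_BYTES = sum(SEGPROF_ENTRY.values())                    # one O-SEGPROF occurrence


def record_length(segprof_count: int) -> int:
    """Length of one O-PLT record holding `segprof_count` O-SEGPROF entries."""
    return FIXED_BYTES + segprof_count * ENTRY_BYTES


print(record_length(1))    # 383
print(record_length(400))  # 22727
```

The two results match the bounds in the `FD` entry, so a translated Java class needs a fixed header plus a list of between 1 and 400 `O-SEGPROF` entries.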

### Q3.2 - Step 1: Understand the Code

Before translating, we need to **understand** what this program does.

**Task:** Write a prompt that asks the LLM to:
1. Explain the **aim** of this COBOL program
2. List the **core functionalities**
3. Describe the **logic flow** (step by step)
4. Keep the explanation suitable for a developer unfamiliar with COBOL

In [None]:
# TODO: Write a prompt to understand/explain the legacy code

prompt = f"""
___

```{bigger_legacy_code}```

___
"""

response = completion(prompt)
print(response)

### Q3.3 - Step 2: Translate to Java

**Task:** Write a comprehensive prompt that translates the bigger COBOL program to Java. Since this is a complex IMS batch program, your prompt should address:
- How to handle IMS DL/I calls (suggest modern alternatives like JDBC or JPA)
- How to map COBOL data structures to Java classes
- How to handle the file output

In [None]:
# TODO: Write a comprehensive translation prompt for the bigger code
language = "java"

prompt = f"""
___

```{bigger_legacy_code}```

___
"""

response = completion(prompt)
print(response)
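
Because the helper from Q1.2 caps replies at `max_tokens=1000`, a long Java translation may be cut off partway. One way to detect this, sketched below, is to inspect `finish_reason` on the raw API response, which is `"length"` when the token limit ended the reply. The function `was_truncated` is hypothetical, and using it would mean keeping the raw response object rather than only the text the helper returns. The stand-in object here only mimics the response shape for illustration.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """Return True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in for an API response, shaped like the fields used above (no network call).
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="public class ImsBatch {"),
        )
    ]
)
print(was_truncated(fake_response))  # True
```

When the check returns True, options include raising `max_tokens` or asking for the translation one section at a time.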

### Q3.4 - 🏆 Step 3: Generate Test Cases

**Task:** Write a prompt to generate test cases that verify the translated Java code preserves the original COBOL logic.

In [None]:
# TODO: Write a test case generation prompt for the bigger code

prompt = f"""
___
"""

response = completion(prompt)
print(response)

---

## 📝 Self-Assessment Checklist

| # | Checkpoint | Done? |
|---|-----------|-------|
| 1 | OpenAI setup works with `gpt-4o` | ☐ |
| 2 | Basic translation prompt produces Java code | ☐ |
| 3 | Improved P.I.A.R.O prompt produces better, documented code | ☐ |
| 4 | Test case generation prompt produces runnable JUnit tests | ☐ |
| 5 | Code explanation prompt identifies aim, features, and logic | ☐ |
| 6 | Bigger code translation handles IMS DL/I concepts | ☐ |