# RPGLE to Spring Boot Java Converter

This notebook provides a tool to convert modern RPGLE programs to Spring Boot Java applications using large language models (LLMs).

## Overview
1. Upload RPGLE files
2. Analyze dependencies between files
3. Detect and classify formats
4. Extract program metadata
5. Build and plot the program relationship graph
6. Display analysis results

## Setup and Dependencies

First, let's install the required libraries.

In [3]:
!pip install openai pandas matplotlib networkx plotly ipywidgets google-generativeai

In [4]:
import os
import re
import json
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output
import openai
from google.colab import files
import glob
from collections import defaultdict
import time
import shutil
try:
    import google.generativeai as genai
except ImportError:
    pass

## Set API Key Directly

Enter your Gemini API key below. It is used for every RPGLE analysis call in this notebook.

In [1]:
# Install the required packages first
!pip install -q google-generativeai

In [2]:
# Simple Gemini API setup
import google.generativeai as genai
from IPython.display import display
import ipywidgets as widgets

# Input for Gemini API key
api_key_input = widgets.Password(
    description='Gemini API Key:',
    placeholder='Enter your Gemini API key here',
    layout=widgets.Layout(width='500px')
)

submit_button = widgets.Button(description="Set API Key")
output_area = widgets.Output()

# Global variable to track if API is configured
api_configured = False

def on_submit_button_clicked(b):
    global api_configured
    with output_area:
        output_area.clear_output()
        api_key = api_key_input.value

        if not api_key:
            print("Error: API key cannot be empty")
            api_configured = False
            return

        try:
            # Configure the Gemini API
            genai.configure(api_key=api_key)

            # Simple test to see if it works
            model = genai.GenerativeModel('gemini-1.5-flash')
            response = model.generate_content("Say hello")

            print("✅ Gemini API configured successfully!")
            api_configured = True

        except Exception as e:
            print(f"❌ Error configuring Gemini API: {e}")
            api_configured = False

submit_button.on_click(on_submit_button_clicked)

display(api_key_input)
display(submit_button)
display(output_area)

Password(description='Gemini API Key:', layout=Layout(width='500px'), placeholder='Enter your Gemini API key h…

Button(description='Set API Key', style=ButtonStyle())

Output()

In [7]:
# Simple function to query the Gemini API
def query_llm(prompt, max_tokens=None):
    """Send a prompt to Google's Gemini API and return the response text (None on failure)."""
    if not api_configured:
        print("Error: API not configured. Please set your Gemini API key first.")
        return None

    try:
        # Create Gemini model (using a simpler model for reliability)
        model = genai.GenerativeModel('gemini-1.5-flash')

        # Apply the output-token cap only when the caller supplied one
        generation_config = {"max_output_tokens": max_tokens} if max_tokens else None

        # Call the API
        response = model.generate_content(prompt, generation_config=generation_config)

        # Return the response text
        return response.text

    except Exception as e:
        print(f"Error querying Gemini API: {e}")
        return None

In [6]:
## Simple Test of API Connection

# Run this cell to test your API connection
def test_api():
    # api_configured is set by the "Set API Key" button handler above
    if not api_configured:
        print("API key is not set. Please run the cell above to set your API key.")
        return

    test_prompt = "Say hello and confirm you can process RPGLE code. Keep it very brief."

    print("Testing API connection...")
    result = query_llm(test_prompt, max_tokens=100)

    if result:
        print("\nAPI Test Successful! Response:")
        print("----------------------------")
        print(result)
        print("----------------------------")
        print("You can now proceed with the RPGLE analysis.")
    else:
        print("\nAPI test failed. Please check your API key and try again.")

# Run the test
test_api()

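## Step 1: Upload RPGLE Files

Upload one or more RPGLE source files from your machine. Files that were already uploaded are skipped.

In [8]: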
uploaded_files = []
file_contents = {}

def upload_rpgle_files():
    global uploaded_files, file_contents
    uploaded = files.upload()

    for filename, content in uploaded.items():
        if filename not in file_contents:
            file_contents[filename] = content.decode('utf-8')
            uploaded_files.append(filename)

    print(f"Total files uploaded: {len(uploaded_files)}")
    for filename in uploaded_files:
        print(f"- {filename}")

upload_button = widgets.Button(description="Upload RPGLE Files")
upload_output = widgets.Output()

def on_upload_button_clicked(b):
    with upload_output:
        clear_output()
        upload_rpgle_files()

upload_button.on_click(on_upload_button_clicked)

display(upload_button, upload_output)

Button(description='Upload RPGLE Files', style=ButtonStyle())

Output()
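
`upload_rpgle_files` decodes every upload as UTF-8, so a source member exported in another encoding raises `UnicodeDecodeError`. The following is a minimal sketch of a more forgiving decoder. The helper name `decode_source` and the choice of Latin-1 as the fallback are assumptions for illustration, not part of the notebook.

```python
def decode_source(raw: bytes, encodings=("utf-8", "latin-1")) -> str:
    """Try each encoding in order and return the first successful decode."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Reached only when every listed encoding failed (latin-1 itself never fails).
    return raw.decode("utf-8", errors="replace")
```

If adopted, the call `content.decode('utf-8')` in the upload loop could become `decode_source(content)`.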

## Step 2: Analyze Dependencies Between Files

Identify relationships and dependencies between RPGLE programs using the LLM. Only the first 15,000 characters of each file are sent in the prompt.

In [39]:
def analyze_dependencies(files_dict):
    """Analyze dependencies between RPGLE files using LLM."""
    dependencies = {}
    file_summaries = {}

    # For each file, ask LLM to identify dependencies
    for filename, content in files_dict.items():
        print(f"Analyzing dependencies for {filename}...")

        # Prepare a prompt for the LLM
        prompt = f"""Analyze this RPGLE code and identify all dependencies:
        1. External program calls (CALL, CALLP)
        2. File/database accesses
        3. Data structure includes or copybooks
        4. Other module imports

        Return the results in JSON format with these keys:
        - program_calls: [list of called programs]
        - file_accesses: [list of files/tables accessed]
        - copybooks: [list of included copybooks/data structures]
        - imports: [list of imported modules]
        - brief_summary: short description of what this program does

        Here's the RPGLE code:
        ```
        {content[:15000]}
        ```
        """

        result = query_llm(prompt)

        if result:
            try:
                # Extract the JSON part from the response
                json_match = re.search(r'\{[\s\S]*\}', result)
                if json_match:
                    json_str = json_match.group(0)
                    deps = json.loads(json_str)
                    dependencies[filename] = deps
                    file_summaries[filename] = deps.get('brief_summary', 'No summary available')
                else:
                    print(f"Could not extract JSON from LLM response for {filename}")
            except Exception as e:
                print(f"Error processing dependencies for {filename}: {e}")
                print("LLM Response:", result)
        else:
            print(f"No response from LLM for {filename}")

    return dependencies, file_summaries

analyze_deps_button = widgets.Button(description="Analyze Dependencies")
analyze_deps_output = widgets.Output()

dependencies_result = {}
file_summaries = {}

def on_analyze_deps_button_clicked(b):
    global dependencies_result, file_summaries
    with analyze_deps_output:
        clear_output()
        if not file_contents:
            print("Please upload RPGLE files first.")
            return

        print("Analyzing dependencies between files...")
        dependencies_result, file_summaries = analyze_dependencies(file_contents)
        print("\nDependency analysis complete!")
        print(f"Analyzed {len(dependencies_result)} files.")

analyze_deps_button.on_click(on_analyze_deps_button_clicked)

display(analyze_deps_button, analyze_deps_output)

Button(description='Analyze Dependencies', style=ButtonStyle())

Output()
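
The analysis functions in this notebook pull JSON out of the LLM reply with a single greedy regex, which fails when the model adds braces in prose after the JSON. The sketch below shows one way to make that step more tolerant: it first looks inside a fenced block, then falls back to the widest `{...}` span. The helper name `extract_json` is hypothetical, and it still returns `None` when neither candidate parses.

```python
import json
import re

FENCE = "`" * 3  # the Markdown code-fence marker, built here to keep this block readable

def extract_json(text):
    """Return the first JSON object found in an LLM reply, or None."""
    if not text:
        return None
    candidates = []
    # Prefer the contents of a fenced block (optionally tagged "json").
    fenced = re.search(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))
    # Fall back to the widest {...} span, as the notebook's own regex does.
    widest = re.search(r"\{.*\}", text, re.DOTALL)
    if widest:
        candidates.append(widest.group(0))
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

With such a helper, each analysis function could replace its `re.search` / `json.loads` pair with one call and treat `None` as "could not extract JSON".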

## Step 3: Detect and Classify Formats

Detect and classify the specification formats (fixed, free, or mixed) used in each RPGLE file using the LLM.

In [26]:
def detect_formats(files_dict):
    """Detect and classify formats in RPGLE files using LLM."""
    format_results = {}

    for filename, content in files_dict.items():
        print(f"Detecting formats in {filename}...")

        # Important: Escape the curly braces in the JSON example with double braces
        prompt = f"""Analyze this RPGLE code and identify all format specifications and their usage:
        1. F-spec (file specifications)
        2. D-spec (definition specifications)
        3. P-spec (procedure specifications)
        4. C-spec (calculation specifications)
        5. Modern free-format statements

        Return the results in JSON format with these keys:
        - format_type: "fixed" or "free" or "mixed"
        - spec_counts: {{\"F\": 0, \"D\": 0, \"P\": 0, \"C\": 0, \"free\": 0}}
        - complex_formats: [list of complex format types found]
        - data_structures: [list of data structure names and their purpose]
        - file_formats: [list of file formats used]

        Here's the RPGLE code:
        ```
        {content[:15000]}
        ```
        """

        result = query_llm(prompt)

        if result:
            try:
                # Extract the JSON part from the response
                json_match = re.search(r'\{[\s\S]*\}', result)
                if json_match:
                    json_str = json_match.group(0)
                    formats = json.loads(json_str)
                    format_results[filename] = formats
                else:
                    print(f"Could not extract JSON from LLM response for {filename}")
                    print("LLM response:", result[:500])
            except Exception as e:
                print(f"Error processing formats for {filename}: {e}")
                print("LLM Response:", result[:500] if result else "None")
        else:
            print(f"No response from LLM for {filename}")

    return format_results

formats_button = widgets.Button(description="Detect Formats")
formats_output = widgets.Output()

format_results = {}

def on_formats_button_clicked(b):
    global format_results
    with formats_output:
        clear_output()
        if not file_contents:
            print("Please upload RPGLE files first.")
            return

        print("Detecting and classifying formats...")
        format_results = detect_formats(file_contents)
        print("\nFormat detection complete!")
        print(f"Analyzed formats in {len(format_results)} files.")

formats_button.on_click(on_formats_button_clicked)

display(formats_button, formats_output)

Button(description='Detect Formats', style=ButtonStyle())

Output()
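
The `format_type` and `spec_counts` values returned by the LLM can be sanity-checked locally. The sketch below is a rough heuristic, not a parser: it assumes the fixed-format convention of a specification letter in position 6 and `*` in position 7 for comments, and treats a first line starting with `**FREE` as fully free-form source. The function name `count_specs` is hypothetical.

```python
FIXED_SPEC_LETTERS = set("HFDICOP")  # form types found in position 6 of fixed-format lines

def count_specs(source):
    """Heuristically classify RPGLE source; returns (format_type, spec_counts)."""
    counts = {"F": 0, "D": 0, "P": 0, "C": 0, "free": 0}
    fixed_lines = 0
    lines = [ln for ln in source.splitlines() if ln.strip()]
    fully_free = bool(lines) and lines[0].upper().startswith("**FREE")
    for line in (lines[1:] if fully_free else lines):
        if fully_free:
            counts["free"] += 1       # every remaining line is free-form
            continue
        padded = line.ljust(7)
        if padded[6] in "*/":         # comment line or compiler directive
            continue
        spec = padded[5].upper()
        if spec in FIXED_SPEC_LETTERS:
            fixed_lines += 1
            if spec in counts:        # H, I and O lines are fixed but not tallied
                counts[spec] += 1
        elif spec == " ":
            counts["free"] += 1       # blank position 6: assume a free-form statement
    if fixed_lines and counts["free"]:
        format_type = "mixed"
    elif fixed_lines:
        format_type = "fixed"
    else:
        format_type = "free"
    return format_type, counts
```

A large gap between these counts and the LLM's `spec_counts` for the same file is a hint that the response deserves a second look.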

## Step 4: Extract Program Metadata

Extract detailed metadata from each RPGLE program using LLM.

In [28]:
def extract_metadata(files_dict):
    """Extract program metadata from RPGLE files using LLM."""
    metadata_results = {}

    for filename, content in files_dict.items():
        print(f"Extracting metadata from {filename}...")

        prompt = f"""Extract detailed metadata from this RPGLE program:
        1. Program name and purpose
        2. Author information (if available)
        3. Creation date and modification history (if available)
        4. Input parameters and return values
        5. Global variables and constants
        6. Main procedures/subroutines and their purpose
        7. Business rules implemented

        Return the results in JSON format with these keys:
        - program_name: name of the program
        - purpose: main purpose of the program
        - author: author information
        - creation_date: creation date
        - parameters: [list of input parameters]
        - return_values: [list of return values]
        - globals: [list of global variables]
        - procedures: [list of procedures and their purpose]
        - business_rules: [list of business rules implemented]

        Here's the RPGLE code:
        ```
        {content[:15000]}
        ```
        """

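        result = query_llm(prompt)

        # query_llm returns None on failure; skip the file instead of passing None to re.search
        if not result:
            print(f"No response from LLM for {filename}")
            continue

        try: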
            # Extract the JSON part from the response
            json_match = re.search(r'\{[\s\S]*\}', result)
            if json_match:
                json_str = json_match.group(0)
                metadata = json.loads(json_str)
                metadata_results[filename] = metadata
            else:
                print(f"Could not extract JSON from LLM response for {filename}")
        except Exception as e:
            print(f"Error processing metadata for {filename}: {e}")
            print("LLM Response:", result)

    return metadata_results

metadata_button = widgets.Button(description="Extract Metadata")
metadata_output = widgets.Output()

metadata_results = {}

def on_metadata_button_clicked(b):
    global metadata_results
    with metadata_output:
        clear_output()
        if not file_contents:
            print("Please upload RPGLE files first.")
            return

        print("Extracting program metadata...")
        metadata_results = extract_metadata(file_contents)
        print("\nMetadata extraction complete!")
        print(f"Extracted metadata from {len(metadata_results)} files.")

metadata_button.on_click(on_metadata_button_clicked)

display(metadata_button, metadata_output)

Button(description='Extract Metadata', style=ButtonStyle())

Output()
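
The LLM does not always return every key requested in the metadata prompt, and list fields sometimes arrive as a single string. As a sketch under those assumptions, the hypothetical helper below fills in missing keys and wraps bare values in lists so that later cells can rely on a fixed shape. The key names are the ones listed in the prompt above.

```python
EXPECTED_TEXT_KEYS = ("program_name", "purpose", "author", "creation_date")
EXPECTED_LIST_KEYS = ("parameters", "return_values", "globals",
                      "procedures", "business_rules")

def normalize_metadata(raw):
    """Return a dict with every expected key present and list fields as lists."""
    normalized = {}
    for key in EXPECTED_TEXT_KEYS:
        value = raw.get(key)
        normalized[key] = str(value) if value else "Unknown"
    for key in EXPECTED_LIST_KEYS:
        value = raw.get(key)
        if value is None:
            normalized[key] = []
        elif isinstance(value, list):
            normalized[key] = value
        else:
            normalized[key] = [value]  # wrap a bare string or dict
    return normalized
```

It could be applied to each parsed response before storing it in `metadata_results`.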

## Step 5: Build Program Relationship Graph

Create a visualization of program relationships and dependencies.

In [29]:
def build_relationship_graph(dependencies_data):
    """Build a graph representing program relationships."""
    G = nx.DiGraph()

    # Add nodes for all files
    for filename in dependencies_data.keys():
        G.add_node(filename, type='program')

    # Add nodes and edges for dependencies
    for filename, deps in dependencies_data.items():
        # Add program calls
        for called_program in deps.get('program_calls', []):
            if called_program not in G:
                G.add_node(called_program, type='external_program')
            G.add_edge(filename, called_program, type='calls')

        # Add file accesses
        for file_access in deps.get('file_accesses', []):
            if file_access not in G:
                G.add_node(file_access, type='file')
            G.add_edge(filename, file_access, type='accesses')

        # Add copybooks
        for copybook in deps.get('copybooks', []):
            if copybook not in G:
                G.add_node(copybook, type='copybook')
            G.add_edge(filename, copybook, type='includes')

        # Add imports
        for import_module in deps.get('imports', []):
            if import_module not in G:
                G.add_node(import_module, type='module')
            G.add_edge(filename, import_module, type='imports')

    return G

def plot_relationship_graph(G, file_summaries):
    """Plot the program relationship graph using Plotly."""
    # Node positions using a spring layout
    pos = nx.spring_layout(G, seed=42)

    # Node colors based on type
    node_colors = {
        'program': 'blue',
        'external_program': 'green',
        'file': 'red',
        'copybook': 'purple',
        'module': 'orange'
    }

    # Edge colors based on type
    edge_colors = {
        'calls': 'blue',
        'accesses': 'red',
        'includes': 'purple',
        'imports': 'orange'
    }

    # Create the plot
    fig = make_subplots(rows=1, cols=1, specs=[[{'type': 'scatter'}]])

    # Add nodes
    node_trace_data = {}
    for node_type in node_colors.keys():
        node_trace_data[node_type] = {
            'x': [],
            'y': [],
            'text': [],
            'hovertext': []
        }

    for node in G.nodes():
        node_type = G.nodes[node].get('type', 'program')
        x, y = pos[node]
        node_trace_data[node_type]['x'].append(x)
        node_trace_data[node_type]['y'].append(y)
        node_trace_data[node_type]['text'].append(node)

        # Add summary to hovertext if available
        hover_text = node
        if node in file_summaries:
            hover_text += f"<br>{file_summaries[node]}"
        node_trace_data[node_type]['hovertext'].append(hover_text)

    # Create a trace for each node type
    for node_type, data in node_trace_data.items():
        if data['x']:
            fig.add_trace(
                go.Scatter(
                    x=data['x'],
                    y=data['y'],
                    mode='markers',
                    marker=dict(size=15, color=node_colors[node_type]),
                    text=data['hovertext'],
                    hoverinfo='text',
                    name=node_type
                )
            )

    # Add edges
    edge_trace_data = {}
    for edge_type in edge_colors.keys():
        edge_trace_data[edge_type] = {
            'x': [],
            'y': [],
            'text': []
        }

    for edge in G.edges(data=True):
        source, target, attr = edge
        edge_type = attr.get('type', 'calls')
        x0, y0 = pos[source]
        x1, y1 = pos[target]

        # Add two points for a line
        edge_trace_data[edge_type]['x'].extend([x0, x1, None])
        edge_trace_data[edge_type]['y'].extend([y0, y1, None])
        edge_trace_data[edge_type]['text'].append(f"{source} {edge_type} {target}")

    # Create a trace for each edge type
    for edge_type, data in edge_trace_data.items():
        if data['x']:
            fig.add_trace(
                go.Scatter(
                    x=data['x'],
                    y=data['y'],
                    mode='lines',
                    line=dict(width=1, color=edge_colors[edge_type]),
                    hoverinfo='none',
                    name=edge_type
                )
            )

    # Update layout
    fig.update_layout(
        title='Program Relationship Graph',
        showlegend=True,
        hovermode='closest',
        margin=dict(b=20, l=5, r=5, t=40),
        xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
        yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
        height=800,
        legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01)
    )

    return fig

graph_button = widgets.Button(description="Build Relationship Graph")
graph_output = widgets.Output()

relationship_graph = None

def on_graph_button_clicked(b):
    global relationship_graph
    with graph_output:
        clear_output()
        if not dependencies_result:
            print("Please analyze dependencies first.")
            return

        print("Building program relationship graph...")
        relationship_graph = build_relationship_graph(dependencies_result)
        print(f"Created graph with {len(relationship_graph.nodes())} nodes and {len(relationship_graph.edges())} edges.")

        print("\nGenerating visualization...")
        fig = plot_relationship_graph(relationship_graph, file_summaries)
        display(fig)

graph_button.on_click(on_graph_button_clicked)

display(graph_button, graph_output)

Button(description='Build Relationship Graph', style=ButtonStyle())

Output()
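
Besides the plot, the dependency data can be summarized numerically. The sketch below counts how many uploaded programs reference each dependency, which roughly corresponds to the in-degree of that node in the graph built by `build_relationship_graph`. The function name `shared_dependencies` and the `min_users` threshold are hypothetical.

```python
from collections import Counter

DEPENDENCY_KEYS = ("program_calls", "file_accesses", "copybooks", "imports")

def shared_dependencies(dependencies_data, min_users=2):
    """Return {dependency: number of programs referencing it} for shared items."""
    usage = Counter()
    for deps in dependencies_data.values():
        seen = set()  # count each dependency once per program
        for key in DEPENDENCY_KEYS:
            for item in deps.get(key) or []:
                seen.add(str(item))
        usage.update(seen)
    return {name: count for name, count in usage.most_common() if count >= min_users}
```

Files or copybooks used by many programs are natural candidates for shared entities or repositories in the converted application, so this list is a reasonable place to start reviewing.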

## Step 6: Display Analysis Results

Summarize and display all analysis results.

In [ ]:
def display_analysis_results():
    """Display comprehensive analysis results."""
    if not file_contents or not dependencies_result or not format_results or not metadata_results:
        print("Please complete all analysis steps first.")
        return
    
    # Create a tabbed interface for results
    tab_titles = ['Overview', 'Dependencies', 'Formats', 'Metadata', 'Spring Boot Conversion']
    tabs = widgets.Tab()
    tabs.children = [widgets.Output() for _ in tab_titles]
    for i, title in enumerate(tab_titles):
        tabs.set_title(i, title)
    
    # Overview tab
    with tabs.children[0]:
        print(f"## RPGLE Analysis Overview")
        print(f"Total files analyzed: {len(file_contents)}")
        for filename in file_contents.keys():
            print(f"\n### {filename}")
            if filename in file_summaries:
                print(f"Purpose: {file_summaries[filename]}")
    
    # Dependencies tab
    with tabs.children[1]:
        print(f"## Program Dependencies")
        for filename, deps in dependencies_result.items():
            print(f"\n### {filename}")
            # Coerce items to str and fall back to 'None' for missing or empty lists
            print(f"Program calls: {', '.join(map(str, deps.get('program_calls') or ['None']))}")
            print(f"File accesses: {', '.join(map(str, deps.get('file_accesses') or ['None']))}")
            print(f"Copybooks: {', '.join(map(str, deps.get('copybooks') or ['None']))}")
            print(f"Imports: {', '.join(map(str, deps.get('imports') or ['None']))}")
    
    # Formats tab
    with tabs.children[2]:
        print(f"## Format Classification")
        for filename, fmt in format_results.items():
            print(f"\n### {filename}")
            print(f"Format type: {fmt.get('format_type', 'Unknown')}")
            if 'spec_counts' in fmt:
                print(f"Specification counts: {json.dumps(fmt['spec_counts'], indent=2)}")
            
            # Complex formats
            if 'complex_formats' in fmt:
                if isinstance(fmt['complex_formats'], list):
                    print(f"Complex formats: {', '.join([str(cf) for cf in fmt.get('complex_formats', ['None'])])}")
                else:
                    print(f"Complex formats: {fmt['complex_formats']}")
            
            # Data structures - handle both list of strings and list of dicts
            if 'data_structures' in fmt:
                if isinstance(fmt['data_structures'], list):
                    data_structures_formatted = []
                    for ds in fmt['data_structures']:
                        if isinstance(ds, dict) and 'name' in ds:
                            ds_str = ds['name']
                            if 'purpose' in ds:
                                ds_str += f" - {ds['purpose']}"
                            data_structures_formatted.append(ds_str)
                        elif isinstance(ds, str):
                            data_structures_formatted.append(ds)
                        else:
                            data_structures_formatted.append(str(ds))
                    
                    # Truncate long strings
                    data_structures_display = [ds[:50] + '...' if len(ds) > 50 else ds for ds in data_structures_formatted]
                    print(f"Data structures: {', '.join(data_structures_display)}")
                else:
                    print(f"Data structures: {fmt['data_structures']}")
    
    # Metadata tab
    with tabs.children[3]:
        print(f"## Program Metadata")
        for filename, meta in metadata_results.items():
            print(f"\n### {filename}")
            print(f"Program name: {meta.get('program_name', 'Unknown')}")
            print(f"Purpose: {meta.get('purpose', 'Unknown')}")
            print(f"Author: {meta.get('author', 'Unknown')}")
            print(f"Creation date: {meta.get('creation_date', 'Unknown')}")
            
            # Handle parameters
            if 'parameters' in meta:
                if isinstance(meta['parameters'], list):
                    params_str = ', '.join([str(p) for p in meta['parameters']])
                else:
                    params_str = str(meta['parameters'])
                print(f"\nParameters: {params_str}")
            else:
                print(f"\nParameters: None")
            
            # Handle return values
            if 'return_values' in meta:
                if isinstance(meta['return_values'], list):
                    returns_str = ', '.join([str(rv) for rv in meta['return_values']])
                else:
                    returns_str = str(meta['return_values'])
                print(f"Return values: {returns_str}")
            else:
                print(f"Return values: None")
            
            # Handle procedures
            if 'procedures' in meta:
                if isinstance(meta['procedures'], list):
                    procedures_formatted = []
                    for proc in meta['procedures']:
                        if isinstance(proc, dict) and 'name' in proc:
                            proc_str = proc['name']
                            if 'purpose' in proc:
                                proc_str += f" - {proc['purpose']}"
                            procedures_formatted.append(proc_str)
                        elif isinstance(proc, str):
                            procedures_formatted.append(proc)
                        else:
                            procedures_formatted.append(str(proc))
                    
                    # Truncate long strings
                    procedures_display = [p[:50] + '...' if len(p) > 50 else p for p in procedures_formatted]
                    print(f"\nProcedures: {', '.join(procedures_display)}")
                else:
                    print(f"\nProcedures: {meta['procedures']}")
            else:
                print(f"\nProcedures: None")
    
    # Spring Boot Conversion tab
    with tabs.children[4]:
        generate_spring_boot_conversion()
    
    # Save results to markdown files
    save_results_to_markdown()
    
    return tabs

def save_results_to_markdown():
    """Save analysis results to markdown files."""
    print("Saving analysis results to markdown files...")
    
    # Save architecture recommendations
    arch_md = "# RPGLE to Spring Boot Architecture Recommendations\n\n"
    arch_md += "## Analysis Overview\n\n"
    arch_md += f"Total files analyzed: {len(file_contents)}\n\n"
    
    for filename in file_contents.keys():
        if filename in file_summaries:
            arch_md += f"### {filename}\n"
            arch_md += f"Purpose: {file_summaries[filename]}\n\n"
    
    arch_md += "## Recommended Architecture\n\n"
    arch_md += "### Layered Architecture\n\n"
    arch_md += "Based on the RPGLE analysis, a standard Spring Boot layered architecture is recommended:\n\n"
    arch_md += "1. **Presentation Layer**: REST controllers\n"
    arch_md += "2. **Service Layer**: Business logic (converted from RPGLE procedures)\n"
    arch_md += "3. **Repository Layer**: Data access (converted from RPGLE file operations)\n"
    arch_md += "4. **Domain Layer**: Entity classes (converted from RPGLE data structures)\n\n"
    
    arch_md += "### Component Mapping\n\n"
    arch_md += "| RPGLE Component | Spring Boot Component |\n"
    arch_md += "| --- | --- |\n"
    arch_md += "| Program | Service Class |\n"
    arch_md += "| File Declaration | Repository Interface |\n"
    arch_md += "| Data Structure | Entity/DTO Class |\n"
    arch_md += "| Procedure | Service Method |\n"
    arch_md += "| Subroutine | Private Helper Method |\n"
    
    with open("architecture_recommendations.md", "w") as f:
        f.write(arch_md)
    print("✅ Architecture recommendations saved to architecture_recommendations.md")
    
    # Save domain package recommendations
    domain_md = "# Domain Package Recommendations\n\n"
    domain_md += "## Domain Model\n\n"
    
    # Extract data structures from format results
    all_data_structures = []
    for filename, fmt in format_results.items():
        if 'data_structures' in fmt and fmt['data_structures']:
            for ds in fmt['data_structures']:
                if isinstance(ds, dict) and 'name' in ds:
                    ds_name = ds['name']
                    ds_purpose = ds.get('purpose', 'Unknown purpose')
                    all_data_structures.append((ds_name, ds_purpose))
                elif isinstance(ds, str):
                    parts = ds.split(' - ') if ' - ' in ds else ds.split(': ') if ': ' in ds else [ds, 'Unknown purpose']
                    all_data_structures.append((parts[0], parts[1] if len(parts) > 1 else 'Unknown purpose'))
    
    domain_md += "### Domain Entities\n\n"
    domain_md += "| Entity Name | Purpose |\n"
    domain_md += "| --- | --- |\n"
    for name, purpose in all_data_structures:
        domain_md += f"| {name} | {purpose} |\n"
    
    domain_md += "\n## Recommended Package Structure\n\n"
    domain_md += "```\n"
    domain_md += "com.example.application\n"
    domain_md += "├── domain         # Domain entities\n"
    domain_md += "├── repository     # Data access interfaces\n"
    domain_md += "├── service        # Business logic services\n"
    domain_md += "├── controller     # REST API controllers\n"
    domain_md += "└── config         # Application configuration\n"
    domain_md += "```\n"
    
    with open("domain_package_recommendations.md", "w") as f:
        f.write(domain_md)
    print("✅ Domain package recommendations saved to domain_package_recommendations.md")
    
    # Save service boundary recommendations
    service_md = "# Service Boundary Recommendations\n\n"
    service_md += "## Service Boundaries\n\n"
    
    # Extract procedures from metadata results
    all_procedures = []
    for filename, meta in metadata_results.items():
        program_name = meta.get('program_name', filename)
        if 'procedures' in meta and meta['procedures']:
            for proc in meta['procedures']:
                if isinstance(proc, dict) and 'name' in proc:
                    proc_name = proc['name']
                    proc_purpose = proc.get('purpose', 'Unknown purpose')
                    all_procedures.append((program_name, proc_name, proc_purpose))
                elif isinstance(proc, str):
                    parts = proc.split(' - ') if ' - ' in proc else proc.split(': ') if ': ' in proc else [proc, 'Unknown purpose']
                    all_procedures.append((program_name, parts[0], parts[1] if len(parts) > 1 else 'Unknown purpose'))
    
    service_md += "### Service Methods\n\n"
    service_md += "| Program | Procedure | Purpose |\n"
    service_md += "| --- | --- | --- |\n"
    for program, proc, purpose in all_procedures:
        service_md += f"| {program} | {proc} | {purpose} |\n"
    
    service_md += "\n## Service Interface Recommendations\n\n"
    service_md += "```java\n"
    service_md += "public interface ExampleService {\n"
    for _, proc, purpose in all_procedures[:5]:  # Show first 5 procedures as examples
        service_md += f"    // {purpose}\n"
        service_md += f"    void {proc}();\n\n"
    service_md += "}\n"
    service_md += "```\n"
    
    with open("service_boundary_recommendations.md", "w") as f:
        f.write(service_md)
    print("✅ Service boundary recommendations saved to service_boundary_recommendations.md")
    
    # Save Spring Boot project structure
    structure_md = "# Spring Boot Project Structure\n\n"
    structure_md += "## Project Structure\n\n"
    structure_md += "```\n"
    structure_md += "src/\n"
    structure_md += "├── main/\n"
    structure_md += "│   ├── java/\n"
    structure_md += "│   │   └── com/example/application/\n"
    structure_md += "│   │       ├── Application.java           # Main application class\n"
    structure_md += "│   │       ├── controller/                # REST controllers\n"
    
    # Add controllers based on RPGLE programs
    for filename, meta in metadata_results.items():
        program_name = meta.get('program_name', filename)
        if program_name:
            # Convert to CamelCase
            controller_name = ''.join([p.capitalize() for p in program_name.split('_')]) + 'Controller'
            structure_md += f"│   │       │   └── {controller_name}.java\n"
    
    structure_md += "│   │       ├── domain/                   # Domain entities\n"
    
    # Add entities based on data structures
    for name, _ in all_data_structures[:5]:  # Show first 5 as examples
        # Convert to CamelCase
        entity_name = ''.join([p.capitalize() for p in name.split('_')])
        structure_md += f"│   │       │   └── {entity_name}.java\n"
    
    structure_md += "│   │       ├── repository/               # Data repositories\n"
    
    # Add repositories based on file accesses
    all_files = set()
    for filename, deps in dependencies_result.items():
        for file_access in deps.get('file_accesses', []):
            all_files.add(file_access)
    
    for file in sorted(all_files)[:5]:  # Show first 5 (sorted, so the output is stable across runs)
        # Convert to CamelCase
        repo_name = ''.join([p.capitalize() for p in file.split('_')]) + 'Repository'
        structure_md += f"│   │       │   └── {repo_name}.java\n"
    
    structure_md += "│   │       └── service/                  # Business services\n"
    
    # Add services based on RPGLE programs
    for filename, meta in metadata_results.items():
        program_name = meta.get('program_name', filename)
        if program_name:
            # Convert to CamelCase
            service_name = ''.join([p.capitalize() for p in program_name.split('_')]) + 'Service'
            structure_md += f"│   │           └── {service_name}.java\n"
    
    structure_md += "│   └── resources/\n"
    structure_md += "│       ├── application.properties        # Application configuration\n"
    structure_md += "│       └── schema.sql                    # Database schema\n"
    structure_md += "└── test/\n"
    structure_md += "    └── java/\n"
    structure_md += "        └── com/example/application/\n"
    structure_md += "            ├── controller/               # Controller tests\n"
    structure_md += "            ├── repository/               # Repository tests\n"
    structure_md += "            └── service/                  # Service tests\n"
    structure_md += "```\n"
    
    with open("spring_boot_structure.md", "w") as f:
        f.write(structure_md)
    print("✅ Spring Boot project structure saved to spring_boot_structure.md")

def generate_spring_boot_conversion():
    """Generate Spring Boot conversion recommendations."""
    print("## Spring Boot Conversion Plan")
    
    for filename, content in file_contents.items():
        print(f"\n### {filename} Conversion Plan")
        
        # Check if we have all the analysis data for this file
        if filename not in dependencies_result or filename not in format_results or filename not in metadata_results:
            print("Complete analysis data not available. Run all analysis steps first.")
            continue
        
        # Get the metadata for this file
        meta = metadata_results[filename]
        deps = dependencies_result[filename]
        
        # Generate Spring Boot conversion plan using LLM
        prompt = f"""Based on the RPGLE analysis, generate a detailed Spring Boot conversion plan for this program. Consider:
        
        1. Program metadata:
        - Program name: {meta.get('program_name', 'Unknown')}
        - Purpose: {meta.get('purpose', 'Unknown')}
        - Procedures: {', '.join([str(p) for p in meta.get('procedures', [])])}
        
        2. Dependencies:
        - Program calls: {', '.join([str(p) for p in deps.get('program_calls', [])])}
        - File accesses: {', '.join([str(f) for f in deps.get('file_accesses', [])])}
        
        3. Business rules: {', '.join([str(r) for r in meta.get('business_rules', [])])}
        
        Create a detailed conversion plan that includes:
        - Spring Boot project structure (packages, classes)
        - How to map RPGLE procedures to Java methods
        - How to handle database access
        - How to implement business rules
        - Sample Java code for 1-2 key procedures
        
        Return the results in markdown format suitable for display.
        """
        
        result = query_llm(prompt)
        
        if result:
            # Save the conversion plan to a file
            conversion_filename = f"{filename}_conversion_plan.md"
            with open(conversion_filename, "w") as f:
                f.write(f"# Conversion Plan for {filename}\n\n")
                f.write(result)
            print(f"Conversion plan saved to {conversion_filename}")
            
            # Display the results (the prompt asks for markdown, so render it as markdown)
            from IPython.display import Markdown
            display(Markdown(result))
        else:
            print("Failed to generate conversion plan.")

# Display the analysis results
if all([file_contents, dependencies_result, format_results, metadata_results]):
    print("Displaying analysis results...")
    tabs = display_analysis_results()
    display(tabs)
else:
    print("Please complete all analysis steps first.")
    missing_steps = []
    if not file_contents:
        missing_steps.append("Upload RPGLE Files")
    if not dependencies_result:
        missing_steps.append("Analyze Dependencies")
    if not format_results:
        missing_steps.append("Detect Formats")
    if not metadata_results:
        missing_steps.append("Extract Metadata")
    
    print(f"Missing steps: {', '.join(missing_steps)}")
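
The project-structure code above derives Java class names by splitting a program or file name on underscores and capitalizing each part. The helper below is a minimal sketch of the same idea that also drops a file extension and any characters that are not valid in a Java identifier. The function name `to_java_class_name` and the `Pgm` prefix for names starting with a digit are illustrative assumptions, not part of the notebook.

```python
import re

def to_java_class_name(raw_name, suffix=""):
    """Turn an RPGLE-style name such as 'CUST_MASTER.rpgle' into 'CustMaster' + suffix."""
    # Drop a trailing file extension, if any
    base = raw_name.rsplit(".", 1)[0] if "." in raw_name else raw_name
    # Split on anything that is not a letter or digit (underscores, spaces, '#', ...)
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", base) if p]
    name = "".join(p.capitalize() for p in parts)
    # Java identifiers cannot start with a digit (the 'Pgm' prefix is an arbitrary choice)
    if not name or name[0].isdigit():
        name = "Pgm" + name
    return name + suffix

print(to_java_class_name("CUST_MASTER", "Controller"))   # CustMasterController
print(to_java_class_name("ord_entry.rpgle", "Service"))  # OrdEntryService
```

Note that `str.capitalize()` lowercases everything after the first character, so an all-caps RPGLE name such as `CUSTMAST` becomes `Custmast` rather than `CustMast`.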

## Download Results

Export the analysis results to a JSON file.

In [31]:
def export_results():
    """Export all analysis results to a JSON file."""
    if not file_contents or not dependencies_result or not format_results or not metadata_results:
        print("Please complete all analysis steps first.")
        return

    # Compile all results
    all_results = {}
    for filename in file_contents.keys():
        all_results[filename] = {
            'summary': file_summaries.get(filename, ''),
            'dependencies': dependencies_result.get(filename, {}),
            'formats': format_results.get(filename, {}),
            'metadata': metadata_results.get(filename, {})
        }

    # Save to a JSON file
    with open('rpgle_analysis_results.json', 'w') as f:
        json.dump(all_results, f, indent=2)

    # Trigger the browser download (Colab only) and confirm only after a real export
    files.download('rpgle_analysis_results.json')
    print("Analysis results exported to rpgle_analysis_results.json")

export_button = widgets.Button(description="Export Results to JSON")
export_output = widgets.Output()

def on_export_button_clicked(b):
    with export_output:
        clear_output()
        export_results()

export_button.on_click(on_export_button_clicked)

display(export_button, export_output)

Button(description='Export Results to JSON', style=ButtonStyle())

Output()
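
Once the JSON file has been downloaded, it can be loaded again and summarized outside the notebook. The sketch below assumes the layout written by `export_results()` (one entry per file with `dependencies` and `metadata` keys); the sample data and the function name `summarize_export` are made up for illustration.

```python
import json

def summarize_export(all_results):
    """Return one summary row per file from the exported analysis results."""
    rows = []
    for filename, result in sorted(all_results.items()):
        deps = result.get("dependencies", {})
        meta = result.get("metadata", {})
        rows.append({
            "file": filename,
            "program_calls": len(deps.get("program_calls", [])),
            "file_accesses": len(deps.get("file_accesses", [])),
            "procedures": len(meta.get("procedures", [])),
        })
    return rows

# Hypothetical sample in the same shape as rpgle_analysis_results.json
sample = {
    "ORDENTRY.rpgle": {
        "summary": "",
        "dependencies": {"program_calls": ["CUSTCHK"], "file_accesses": ["ORDHDR", "ORDDTL"]},
        "formats": {},
        "metadata": {"procedures": ["ValidateOrder - checks order header"]},
    }
}

# Round-trip through JSON to mimic saving and reloading the export
restored = json.loads(json.dumps(sample))
for row in summarize_export(restored):
    print(row)
```

With a real export, replace `restored` with `json.load(open('rpgle_analysis_results.json'))`.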

In [32]:
def generate_architecture_recommendations():
    """Generate Java architecture recommendations based on RPGLE analysis."""
    if not dependencies_result or not metadata_results:
        print("Please complete the dependency and metadata analysis steps first.")
        return

    # Prepare a summary of all files for the LLM
    program_summary = []
    for filename, meta in metadata_results.items():
        deps = dependencies_result.get(filename, {})
        program_info = {
            'filename': filename,
            'program_name': meta.get('program_name', 'Unknown'),
            'purpose': meta.get('purpose', 'Unknown'),
            'procedures': meta.get('procedures', []),
            'program_calls': deps.get('program_calls', []),
            'file_accesses': deps.get('file_accesses', [])
        }
        program_summary.append(program_info)

    # Format the summary for the prompt
    programs_txt = json.dumps(program_summary, indent=2)

    prompt = f"""Based on the analysis of these RPGLE programs, generate comprehensive Java architecture recommendations for a Spring Boot conversion.

    Program details:
    ```
    {programs_txt}
    ```

    Generate detailed architectural recommendations including:
    1. Overall architecture pattern (e.g., layered, hexagonal, microservices, etc.)
    2. Component structure
    3. Dependency management approach
    4. Data access strategy
    5. Service organization
    6. Error handling strategy
    7. Cross-cutting concerns (logging, security, etc.)
    8. Testing strategy

    Return the recommendations in markdown format with clear headings and explanations.
    """

    result = query_llm(prompt, max_tokens=5000)
    return result

arch_button = widgets.Button(description="Generate Architecture Recommendations")
arch_output = widgets.Output()

def on_arch_button_clicked(b):
    with arch_output:
        clear_output()
        print("Generating Java architecture recommendations...")
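        recommendations = generate_architecture_recommendations()
        if recommendations:
            # The prompt asks for markdown, so render it as markdown
            from IPython.display import Markdown
            display(Markdown(recommendations))
        else:
            print("No recommendations were generated.")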

arch_button.on_click(on_arch_button_clicked)

display(arch_button, arch_output)

Button(description='Generate Architecture Recommendations', style=ButtonStyle())

Output()

In [33]:
def generate_domain_package_recommendations():
    """Generate domain package recommendations based on RPGLE analysis."""
    if not metadata_results or not format_results:
        print("Please complete the metadata and format analysis steps first.")
        return

    # Extract business concepts and data structures
    business_concepts = []
    data_structures = []

    for filename, meta in metadata_results.items():
        # Extract business rules
        if 'business_rules' in meta and meta['business_rules']:
            business_concepts.extend(meta['business_rules'])

        # Get data structures from format results
        if filename in format_results and 'data_structures' in format_results[filename]:
            data_structures.extend(format_results[filename]['data_structures'])

    # Prepare input for the LLM
    prompt = f"""Based on the RPGLE analysis, recommend domain package organization for a Spring Boot application.

    Business concepts identified:
    ```
    {json.dumps(business_concepts, indent=2)}
    ```

    Data structures identified:
    ```
    {json.dumps(data_structures, indent=2)}
    ```

    Please provide:
    1. Main domain entities that should be created
    2. Package structure organization (e.g., by business function, by entity, etc.)
    3. Entity relationships and recommendations
    4. Java class diagrams (in text format) for key domain objects
    5. Recommendations for using Spring Data JPA entities

    Return the recommendations in markdown format with clear headings and explanations.
    """

    result = query_llm(prompt, max_tokens=5000)
    return result

domain_button = widgets.Button(description="Generate Domain Package Recommendations")
domain_output = widgets.Output()

def on_domain_button_clicked(b):
    with domain_output:
        clear_output()
        print("Generating domain package recommendations...")
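        recommendations = generate_domain_package_recommendations()
        if recommendations:
            # The prompt asks for markdown, so render it as markdown
            from IPython.display import Markdown
            display(Markdown(recommendations))
        else:
            print("No recommendations were generated.")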

domain_button.on_click(on_domain_button_clicked)

display(domain_button, domain_output)

Button(description='Generate Domain Package Recommendations', style=ButtonStyle())

Output()

In [34]:
def generate_service_boundary_recommendations():
    """Generate service boundary recommendations based on RPGLE analysis."""
    if not dependencies_result or not metadata_results:
        print("Please complete the dependency and metadata analysis steps first.")
        return

    # Group procedures by related functionality
    all_procedures = []
    for filename, meta in metadata_results.items():
        if 'procedures' in meta and meta['procedures']:
            file_procedures = [
                {'filename': filename, 'procedure': proc}
                for proc in meta['procedures']
            ]
            all_procedures.extend(file_procedures)

    # Extract program call dependencies
    call_deps = {}
    for filename, deps in dependencies_result.items():
        if 'program_calls' in deps and deps['program_calls']:
            call_deps[filename] = deps['program_calls']

    # Prepare input for the LLM
    prompt = f"""Based on the RPGLE analysis, recommend service boundaries for a Spring Boot application.

    Procedures identified across all programs:
    ```
    {json.dumps(all_procedures, indent=2)}
    ```

    Program call dependencies:
    ```
    {json.dumps(call_deps, indent=2)}
    ```

    Please provide:
    1. Recommended service boundaries and their justification
    2. Service interface definitions (in Java format)
    3. Service implementation recommendations
    4. Communication patterns between services
    5. Recommendations for API design
    6. Transaction boundary considerations

    Focus on creating cohesive services with clear responsibilities and minimal coupling.
    Return the recommendations in markdown format with clear headings and explanations.
    """

    result = query_llm(prompt, max_tokens=5000)
    return result

service_button = widgets.Button(description="Generate Service Boundary Recommendations")
service_output = widgets.Output()

def on_service_button_clicked(b):
    with service_output:
        clear_output()
        print("Generating service boundary recommendations...")
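        recommendations = generate_service_boundary_recommendations()
        if recommendations:
            # The prompt asks for markdown, so render it as markdown
            from IPython.display import Markdown
            display(Markdown(recommendations))
        else:
            print("No recommendations were generated.")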

service_button.on_click(on_service_button_clicked)

display(service_button, service_output)

Button(description='Generate Service Boundary Recommendations', style=ButtonStyle())

Output()
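
The LLM's service boundaries can be compared against a deterministic baseline: programs that call each other, directly or transitively, are candidates for the same service. The sketch below groups a `call_deps`-style dictionary into connected components, ignoring call direction. It assumes callers and callees use the same naming scheme (in the notebook the keys are filenames, which may need normalizing first); the program names are invented.

```python
from collections import defaultdict, deque

def group_by_calls(call_deps):
    """Return sorted groups of programs connected by calls (direction ignored)."""
    neighbours = defaultdict(set)
    for caller, callees in call_deps.items():
        neighbours[caller]  # make sure programs without calls still appear
        for callee in callees:
            neighbours[caller].add(callee)
            neighbours[callee].add(caller)

    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`
        queue, group = deque([start]), []
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.append(node)
            for nxt in sorted(neighbours[node] - seen):
                seen.add(nxt)
                queue.append(nxt)
        groups.append(sorted(group))
    return groups

# Hypothetical call dependencies
example_deps = {
    "ORDENTRY": ["CUSTCHK", "PRICECALC"],
    "INVUPD": ["STOCKCHK"],
    "RPTGEN": [],
}
for group in group_by_calls(example_deps):
    print(group)
```

Each printed group is only a starting point: a large component may still need to be split along business lines, which is the judgment the LLM prompt above asks for.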

In [35]:
def generate_project_structure():
    """Generate Spring Boot project structure based on RPGLE analysis."""
    if not dependencies_result or not metadata_results:
        print("Please complete the dependency and metadata analysis steps first.")
        return

    # Extract program names for base package recommendation
    program_names = []
    program_purposes = []
    for filename, meta in metadata_results.items():
        if 'program_name' in meta and meta['program_name'] != 'Unknown':
            program_names.append(meta['program_name'])
        if 'purpose' in meta and meta['purpose'] != 'Unknown':
            program_purposes.append(meta['purpose'])

    # Get unique file accesses for determining database entities
    all_file_accesses = set()
    for filename, deps in dependencies_result.items():
        if 'file_accesses' in deps:
            all_file_accesses.update(deps['file_accesses'])

    # Prepare input for the LLM
    prompt = f"""Based on the RPGLE analysis, generate a complete Spring Boot project structure.

    Program names: {', '.join(program_names)}
    Program purposes: {', '.join(program_purposes)}
    Database files accessed: {', '.join(sorted(all_file_accesses))}

    Please provide:
    1. Project configuration (build.gradle or pom.xml)
    2. Complete package structure with explanations
    3. Main application class
    4. Configuration classes
    5. Controller, Service, and Repository layer organization
    6. Sample implementations for key components
    7. Database configuration
    8. Testing structure

    Return the recommendations as a complete project structure with file paths and code examples.
    Use markdown format with clear headings, and include code blocks for each file.
    """

    result = query_llm(prompt, max_tokens=6000)
    return result

# Functions to generate actual project folders and files
def create_project_files(project_structure):
    """Convert the LLM's project structure recommendation into actual files and directories."""
    # Matches a file path mentioned in the text (markdown emphasis, backticks, etc. excluded)
    path_pattern = r'[^\s`*#:()]+\.(?:java|xml|properties|gradle|yaml|yml)'

    # Create a project directory
    project_dir = './spring_boot_project'
    os.makedirs(project_dir, exist_ok=True)

    # Parse the structure: the most recent file path seen outside a code block
    # names the file that the next fenced code block is written to
    current_file = None
    current_content = []
    in_code_block = False

    for line in project_structure.split('\n'):
        if line.strip().startswith('```'):
            if in_code_block and current_file:
                # Closing fence: write the collected block to the pending file
                file_path = os.path.join(project_dir, current_file.lstrip('/'))
                os.makedirs(os.path.dirname(file_path), exist_ok=True)
                with open(file_path, 'w') as f:
                    f.write('\n'.join(current_content) + '\n')
                print(f"Created file: {current_file}")
                current_file = None
            in_code_block = not in_code_block
            current_content = []
        elif in_code_block:
            current_content.append(line)
        else:
            # Outside code blocks, remember the latest file path mentioned
            file_match = re.search(path_pattern, line)
            if file_match:
                current_file = file_match.group(0)

    # Create a zip file of the project
    import shutil
    shutil.make_archive('spring_boot_project', 'zip', '.', 'spring_boot_project')

    return 'spring_boot_project.zip'

structure_button = widgets.Button(description="Generate Project Structure")
structure_output = widgets.Output()

download_button = widgets.Button(description="Generate & Download Project Files", disabled=True)
download_output = widgets.Output()

project_structure_result = None

def on_structure_button_clicked(b):
    global project_structure_result
    with structure_output:
        clear_output()
        print("Generating Spring Boot project structure...")
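        project_structure_result = generate_project_structure()
        if project_structure_result:
            # Enable the download only when there is something to turn into files
            from IPython.display import Markdown
            download_button.disabled = False
            display(Markdown(project_structure_result))
        else:
            print("No project structure was generated.")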

def on_download_button_clicked(b):
    with download_output:
        clear_output()
        print("Creating project files...")
        if project_structure_result:
            zip_path = create_project_files(project_structure_result)
            files.download(zip_path)
            print("Project files created and available for download as spring_boot_project.zip")
        else:
            print("Please generate the project structure first.")

structure_button.on_click(on_structure_button_clicked)
download_button.on_click(on_download_button_clicked)

display(structure_button, structure_output)
display(download_button, download_output)

Button(description='Generate Project Structure', style=ButtonStyle())

Output()

Button(description='Generate & Download Project Files', disabled=True, style=ButtonStyle())

Output()
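
Because the file paths come from LLM output, it is worth checking that each one stays inside the project directory before writing to it. The helper below is a sketch of one way to do that with `os.path`; the name `resolve_inside` is an assumption and it is not wired into `create_project_files` as written.

```python
import os

def resolve_inside(project_dir, relative_path):
    """Return the absolute target path, or None if it would leave project_dir."""
    base = os.path.realpath(project_dir)
    # os.path.join discards `base` when relative_path is absolute, which the check below catches
    target = os.path.realpath(os.path.join(base, relative_path))
    if os.path.commonpath([base, target]) != base:
        return None
    return target

print(resolve_inside("spring_boot_project", "src/main/java/App.java") is not None)
print(resolve_inside("spring_boot_project", "../outside.java"))
```

A caller would skip any path for which the function returns `None`. On Windows, `os.path.commonpath` raises `ValueError` for paths on different drives, so that case would need its own handling.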

In [36]:
def parse_rpgle_program(rpgle_code):
    """
    Parse RPGLE program to extract detailed structured information.
    """
    print("Parsing RPGLE program...")

    # Create the enhanced prompt for parsing
    prompt = """# Enhanced Prompt for Parsing Modern RPGLE Code

You are a specialized RPGLE code parser tasked with extracting structured information from modern RPGLE source code. Your output MUST strictly adhere to the requested JSON format and include ALL required lists.

## Primary Task

Analyze the provided RPGLE code and produce a comprehensive JSON representation that captures all structural elements, focusing especially on the 11 MANDATORY lists specified below.

## CRITICAL INSTRUCTIONS

1. **YOU MUST return the EXACT JSON structure specified.** Do not use alternative key names or structures.
2. **STRUCTURE IS MORE IMPORTANT THAN COMPLETENESS**: If you cannot analyze all details, prioritize providing all 11 required lists in the correct structure with whatever information you can extract.
3. **The top-level structure MUST include "programName", "programType", "programPurpose", and "requiredLists" with all 11 mandatory sublists.**
4. **If any list would be empty, include it with an empty array**: `"bindingDirectories": []`

## IMPORTANT: Required Lists

THE FOLLOWING 11 LISTS ARE MANDATORY AND MUST BE INCLUDED UNDER "requiredLists":

1. **Subprocedures List**:
   ```json
   "subprocedures": [
     {
       "name": "PGM_Pre_Open",
       "export": true,
       "purpose": "Initializes service program and opens required files",
       "returnType": "void",
       "parameterCount": 0
     }
   ]
   ```

2. **Databases List**:
   ```json
   "databases": [
     {
       "name": "RCPRD",
       "type": "logical",
       "purpose": "Product database with primary key access"
     }
   ]
   ```

3. **Database Keys List**:
   ```json
   "databaseKeys": [
     {
       "database": "RCPGM",
       "keyFields": ["PRD_Key"]
     }
   ]
   ```

4. **Modules List**:
   ```json
   "modules": [
     {
       "name": "RCPGM",
       "purpose": "Database access layer for product information"
     }
   ]
   ```

5. **Binding Directories List**:
   ```json
   "bindingDirectories": [
     {
       "name": "RCPGM",
       "purpose": "Common utilities binding directory"
     }
   ]
   ```

6. **Copy Books List**:
   ```json
   "copyBooks": [
     {
       "name": "*libl/qRpgSrc,StCommonDS",
       "purpose": "Common data structure definitions"
     }
   ]
   ```

7. **Indexes and Key Sets List**:
   ```json
   "indexesAndKeySets": [
     {
       "name": "T_RCPGM_R01_KDS",
       "keyFields": ["PRD_Key"],
       "usedIn": "RCPRDD01_Retrieve_PRD_Record"
     }
   ]
   ```

8. **Input Parameters List**:
   ```json
   "inputParameters": [
     {
       "name": "pi_RCPGM_R01_KDS",
       "type": "likeDS(T_RCPGM_R01_KDS)",
       "procedure": "RCPGM_Retrieve_PRD_Record",
       "purpose": "Key data structure for record retrieval"
     }
   ]
   ```

9. **Core Logic Sections List**:
   ```json
   "coreLogicSections": [
     {
       "location": "RCPGM_Get_PRD_Code_On_Fill_Date_R08",
       "description": "Date range validation between product effective/expiry dates",
       "purpose": "Determine if product is valid on the fill date"
     }
   ]
   ```

10. **External Programs List**:
    ```json
    "externalPrograms": [
      {
        "name": "RCPGM",
        "calledFrom": "RCPGM_Retrieve_PRD_Record",
        "purpose": "Common utility program for error handling"
      }
    ]
    ```

11. **File Operations List**:
    ```json
    "fileOperations": [
      {
        "file": "RCPGM",
        "operations": ["read"],
        "usedIn": "RCPGM_Retrieve_PRD_Record"
      }
    ]
    ```

## YOU MUST EXTRACT From Code

- **All binding directories**: Find all `Ctl-Opt BndDir` statements
- **All copy books**: Look for all `/copy` and `/include` statements
- **All modules**: Extract from comments, compilation options (CRTRPGMOD MODULE)
- **Database keys**: Identify from key data structures (`*key`) and file declarations
- **File operations**: Determine read/write/update operations for each file

## Analysis Instructions

1. **Scan for Program Structure Elements**:
   - Free format directive (`**FREE`)
   - Compiler directives (`/if`, `/define`, `/copy`, etc.)
   - Control options (`Ctl-Opt`)
   - Binding directories (`BndDir`)

2. **Identify All Data Definitions**:
   - Data structures (`Dcl-Ds`)
   - File declarations (`Dcl-F`)
   - Variables (`Dcl-S`)
   - Constants (`Dcl-C`)

3. **Extract All Procedures**:
   - Detect procedure definitions (`Dcl-Proc` and `End-Proc`)
   - Identify export status (`Export` keyword)
   - Capture procedure interfaces (`Dcl-PI`)
   - List all parameters with their directions and types

4. **Map Database Operations**:
   - Record file operations (`Setll`, `Chain`, `ReadE`, `Write`, `Update`, `Delete`)
   - Identify files used with each operation
   - Note key data structures used with `%kds` function
   - Track conditions checked (`%found`, `%Equal`, `%Eof`)

5. **Document Control Flow Patterns**:
   - If-Then-Else structures
   - Select-When blocks
   - Loops (Dow, Dou, For)
   - Subroutines (BegSr/EndSr)
   - Error handling (Monitor/On-Error)

6. **Identify Business Logic Patterns**:
   - Date range validations
   - Hierarchical lookups
   - Status setting and checking
   - Comparison logic

7. **Detect External Integrations**:
   - Service program calls
   - External program calls (`Callp`)
   - External procedure prototypes (`Extpgm`)

## Validation Steps

BEFORE RETURNING YOUR RESPONSE:

1. **Verify JSON Structure**: Confirm your output follows the exact JSON structure specified
2. **Check Required Lists**: Ensure ALL 11 required lists are present under the "requiredLists" key
3. **Validate Content**: Make sure each list contains appropriate entries (or empty arrays if none found)
4. **Count Parameters**: Double-check parameter counts for all procedures

## Output Format

Your analysis MUST be provided as a JSON object using this EXACT structure:

```json
{
  "programName": "PROGRAM_NAME",
  "programType": "Module/Program/Service Program",
  "programPurpose": "Brief description inferred from initial comments",
  "structure": {
    "directives": [
      {
        "type": "free-format/conditional/copy",
        "text": "...",
        "purpose": "..."
      }
    ],
    "controlOptions": [
      {
        "option": "...",
        "value": "...",
        "purpose": "..."
      }
    ],
    "bindingDirectories": [
      "..."
    ]
  },
  "dataDefinitions": {
    "files": [
      {
        "name": "...",
        "usage": "Input/Output/Update",
        "options": ["..."],
        "keys": ["..."]
      }
    ],
    "dataStructures": [
      {
        "name": "...",
        "type": "template/qualified/...",
        "purpose": "...",
        "fields": [
          {
            "name": "...",
            "type": "...",
            "purpose": "..."
          }
        ]
      }
    ],
    "variables": [
      {
        "name": "...",
        "type": "...",
        "scope": "global/local",
        "purpose": "..."
      }
    ],
    "constants": [
      {
        "name": "...",
        "value": "...",
        "purpose": "..."
      }
    ]
  },
  "procedures": [
    {
      "name": "...",
      "export": true/false,
      "returnType": "...",
      "parameters": [
        {
          "name": "...",
          "direction": "input/output/both",
          "type": "..."
        }
      ],
      "logic": {
        "summary": "Brief description of what the procedure does",
        "patterns": ["hierarchical lookup", "date validation", ...],
        "controlFlow": "main logical structure"
      }
    }
  ],
  "businessLogic": {
    "rules": [
      {
        "description": "...",
        "pattern": "...",
        "location": "..."
      }
    ],
    "dataFlow": {
      "inputs": ["..."],
      "transformations": ["..."],
      "outputs": ["..."]
    }
  },
  "externalIntegrations": {
    "servicePrograms": ["..."],
    "externalCalls": [
      {
        "program": "...",
        "purpose": "..."
      }
    ]
  },
  "requiredLists": {
    "subprocedures": [
      {
        "name": "...",
        "export": true/false,
        "purpose": "...",
        "returnType": "...",
        "parameterCount": 0
      }
    ],
    "databases": [
      {
        "name": "...",
        "type": "physical/logical",
        "purpose": "..."
      }
    ],
    "databaseKeys": [
      {
        "database": "...",
        "keyFields": ["...", "..."]
      }
    ],
    "modules": [
      {
        "name": "...",
        "purpose": "..."
      }
    ],
    "bindingDirectories": [
      {
        "name": "...",
        "purpose": "..."
      }
    ],
    "copyBooks": [
      {
        "name": "...",
        "purpose": "..."
      }
    ],
    "indexesAndKeySets": [
      {
        "name": "...",
        "keyFields": ["...", "..."],
        "usedIn": "..."
      }
    ],
    "inputParameters": [
      {
        "name": "...",
        "type": "...",
        "procedure": "...",
        "purpose": "..."
      }
    ],
    "coreLogicSections": [
      {
        "location": "...",
        "description": "...",
        "purpose": "..."
      }
    ],
    "externalPrograms": [
      {
        "name": "...",
        "calledFrom": "...",
        "purpose": "..."
      }
    ],
    "fileOperations": [
      {
        "file": "...",
        "operations": ["read", "write", "update"],
        "usedIn": "..."
      }
    ]
  }
}
```

## RPGLE Code to Parse:

```rpgle
{rpgle_code}
```

REMEMBER:
- You MUST include ALL 11 required lists in your output exactly as specified
- The JSON structure MUST be followed precisely
- All cross-references must be accurate
- If you can't find information for a list, include it with an empty array
- Prioritize structure compliance over comprehensive analysis
"""

    # Call the LLM with the prompt
    parsed_result = query_llm(prompt.replace("{rpgle_code}", rpgle_code), max_tokens=8000)

    if not parsed_result:
        # query_llm returned nothing, so there is no text to search or slice
        print("No response received from the LLM.")
        return None

    try:
        # Extract the outermost {...} span from the response and parse it as JSON
        json_match = re.search(r'\{[\s\S]*\}', parsed_result)
        if json_match:
            return json.loads(json_match.group(0))
        print("Could not extract JSON from LLM response.")
        return None
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON response: {e}")
        print("Raw LLM response:", parsed_result[:500] + "...")
        return None

# Create UI elements for parsing
parse_output = widgets.Output()
parsed_results = {}

def parse_selected_file():
    with parse_output:
        clear_output()
        if not file_contents:
            print("Please upload RPGLE files first.")
            return

        # Create a dropdown for file selection
        file_dropdown = widgets.Dropdown(
            options=list(file_contents.keys()),
            description='Select file:',
            disabled=False,
        )

        parse_file_button = widgets.Button(description="Parse File")
        parse_status = widgets.Output()

        display(file_dropdown, parse_file_button, parse_status)

        def on_parse_file_button_clicked(b):
            with parse_status:
                clear_output()
                filename = file_dropdown.value
                print(f"Parsing {filename}...")

                # Get the code
                code = file_contents[filename]

                # Parse the code
                result = parse_rpgle_program(code)
                if result:
                    # Store the result
                    parsed_results[filename] = result
                    print(f"Successfully parsed {filename}!")

                    # Display summary
                    print("\nSummary of parsed elements:")
                    print(f"Program name: {result.get('programName', 'Unknown')}")
                    print(f"Program type: {result.get('programType', 'Unknown')}")
                    print(f"Program purpose: {result.get('programPurpose', 'Unknown')}")

                    req_lists = result.get('requiredLists', {})
                    print(f"Subprocedures: {len(req_lists.get('subprocedures', []))}")
                    print(f"Databases: {len(req_lists.get('databases', []))}")
                    print(f"Database Keys: {len(req_lists.get('databaseKeys', []))}")
                    print(f"Copy Books: {len(req_lists.get('copyBooks', []))}")
                    print(f"External Programs: {len(req_lists.get('externalPrograms', []))}")
                    print(f"File Operations: {len(req_lists.get('fileOperations', []))}")

                    # Show first few subprocedures if available
                    subprocs = req_lists.get('subprocedures', [])
                    if subprocs:
                        print("\nSome subprocedures found:")
                        for proc in subprocs[:3]:
                            print(f"- {proc.get('name', 'Unknown')} - {proc.get('purpose', 'No purpose specified')}")
                        if len(subprocs) > 3:
                            print(f"... and {len(subprocs) - 3} more.")
                else:
                    print("Failed to parse the file.")

        parse_file_button.on_click(on_parse_file_button_clicked)

# Add button to show parsing interface
parse_button = widgets.Button(description="Parse RPGLE Programs")
parse_button.on_click(lambda b: parse_selected_file())

display(parse_button, parse_output)

Button(description='Parse RPGLE Programs', style=ButtonStyle())

Output()
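
The parsing prompt insists on three top-level fields plus 11 lists under `requiredLists`, but nothing in the cell above verifies that the model complied. The following sketch checks a parsed result against those names, which are taken from the prompt; the function name `check_parsed_structure` is an assumption.

```python
TOP_LEVEL_KEYS = ["programName", "programType", "programPurpose", "requiredLists"]
REQUIRED_LISTS = [
    "subprocedures", "databases", "databaseKeys", "modules",
    "bindingDirectories", "copyBooks", "indexesAndKeySets",
    "inputParameters", "coreLogicSections", "externalPrograms",
    "fileOperations",
]

def check_parsed_structure(parsed):
    """Return a list of human-readable problems; an empty list means the structure is complete."""
    problems = []
    for key in TOP_LEVEL_KEYS:
        if key not in parsed:
            problems.append(f"missing top-level key: {key}")
    required = parsed.get("requiredLists", {})
    if not isinstance(required, dict):
        return problems + ["requiredLists is not an object"]
    for name in REQUIRED_LISTS:
        if name not in required:
            problems.append(f"missing list: {name}")
        elif not isinstance(required[name], list):
            problems.append(f"not a list: {name}")
    return problems

# Hypothetical, deliberately incomplete result
incomplete = {
    "programName": "RCPGM",
    "programType": "Service Program",
    "requiredLists": {"subprocedures": [], "databases": {}},
}
for problem in check_parsed_structure(incomplete):
    print(problem)
```

Calling this on the value returned by `parse_rpgle_program()` before storing it in `parsed_results` would make it possible to retry or flag a response instead of silently reporting zero counts.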

In [ ]:
def generate_business_logic_doc(program_data, rpgle_code):
    """
    Generate comprehensive business documentation for an RPGLE program.
    """
    print("Generating business logic documentation...")
    
    # Create the prompt for business documentation generation
    prompt = """# RPGLE Program Business Documentation Generation Prompt

## Task Overview

Generate comprehensive business documentation for an RPGLE program that explains its purpose, functionality, and business rules in a format accessible to both technical and non-technical stakeholders. The documentation should present the program's business logic with clear explanations, drawing on both code analysis and embedded comments, and include visual diagrams to aid understanding.

## Document Structure Requirements

Your documentation must include the following sections in this order:

### 1. Program Overview
   - **Program Name and Type**: Identify the program name, type (service program, module, etc.)
   - **Business Purpose**: Summarize the primary business function in 1-2 paragraphs
   - **System Context**: Explain where this program fits in the broader application landscape
   - **Key Business Functions**: Bullet list of main business capabilities

### 2. Business Process Flow
   - **Mermaid Process Diagram**: Create a Mermaid flowchart showing the main business process steps
   - **Process Triggers**: What business events initiate this program
   - **Process Outcomes**: Expected business results after successful execution
   - **Integration Points**: Other systems or programs this interacts with

### 3. Business Rules Inventory
   - Categorize all business rules by functional area
   - For each rule include:
     - Rule ID and descriptive name
     - Plain English description
     - Business purpose/justification
     - Conditions when the rule applies
     - Exceptions to the rule
     - Implementation notes (which procedures/subroutines implement this rule)
   - **Decision Tree Diagram**: Include a Mermaid diagram for complex rule hierarchies

### 4. Data Structures and Business Entities
   - Document the key business entities represented
   - Explain the purpose of major data structures from a business perspective
   - Define important fields and their business significance
   - Note any business validation rules applied to fields
   - **Entity Relationship Diagram**: Mermaid diagram showing relationships between key data structures

### 5. Calculation Logic
   - Document all business calculations with:
     - Purpose of the calculation
     - Business formula in plain English
     - Variables used and their business meaning
     - Sample calculation examples where possible
   - **Algorithm Flowchart**: Mermaid diagram for complex calculation workflows

### 6. Error Handling and Business Exceptions
   - List all business error scenarios
   - Explain the business impact of each error
   - Document recovery paths and alternative flows
   - Explain rejection codes in business terms
   - **Error Flow Diagram**: Mermaid diagram showing error paths and recovery options

### 7. Integration Dependencies
   - Document all external systems this program connects with
   - Explain what business data is exchanged
   - Note any special business handling for integration failures
   - **Integration Map**: Mermaid diagram showing system integration points

### 8. Program Architecture
   - **Procedures Map**: Mermaid diagram showing relationships between procedures
   - **Data Flow Diagram**: Visual representation of how data moves through the program
   - **Component Diagram**: Show relationships between program components

### 9. Required Program Elements
   - **Subprocedures**: List all subprocedures with their business function, complexity, and error handling
   - **Databases**: Document all databases with their business purpose and access patterns
   - **Database Keys**: Explain key fields and their business significance
   - **Modules**: List all modules with their business purpose and dependencies
   - **Binding Directories**: Document binding directories and their business context
   - **Copy Books**: Explain copy books and their business significance
   - **Indexes and Key Sets**: Document their business purpose and usage
   - **Input Parameters**: Explain from a business perspective
   - **Core Logic Sections**: Identify critical business logic areas
   - **External Programs**: Document business integration points
   - **File Operations**: Explain business purpose of file operations

### 10. Business Glossary
   - Define all business-specific terms used in the program
   - Explain technical terms in business language
   - Map technical field names to business concepts

### 11. Change History
   - Document significant business functionality changes
   - Note the business reasons for major modifications
   - Track evolution of business rules

## Special Instructions

1. **Code Comment Integration**:
   - Extract and incorporate meaningful code comments that explain business intent
   - Present comments alongside their related business functionality
   - Use comments to clarify complex business logic

2. **Technical-to-Business Translation**:
   - Translate technical constructs into business language
   - Explain "why" not just "what" the code does
   - Make functionality understandable to business users

3. **Business Rule Consolidation**:
   - Identify duplicate/related rules spread across procedures
   - Group and consolidate rules by business function
   - Highlight dependencies between rules

4. **Visual Elements**:
   - **Mermaid Diagrams**: Create Mermaid syntax for all required diagrams
   - Use the appropriate diagram type for each visualization need:
     - flowchart: For process flows and decision logic
     - sequenceDiagram: For interaction sequences
     - classDiagram: For data structures and entity relationships
     - stateDiagram: For state transitions in the business process
   - Format diagrams for maximum clarity and readability

5. **Critical Path Identification**:
   - Highlight the main business processing path
   - Distinguish primary rules from edge cases
   - Emphasize high-priority business functions

## Mermaid Diagram Specifications

Include the following Mermaid diagrams in your documentation:

1. **Business Process Flowchart**:
   ```
   flowchart TD
     Start[Business Trigger] --> Process1[First Process Step]
     Process1 --> Decision{Decision Point}
     Decision --> |Condition A| Process2[Process A]
     Decision --> |Condition B| Process3[Process B]
     Process2 --> End[Business Outcome]
     Process3 --> End
   ```

2. **Decision Tree for Business Rules**:
   ```
   flowchart TD
     Start[Rule Evaluation] --> Condition1{Condition 1}
     Condition1 --> |True| Action1[Execute Action 1]
     Condition1 --> |False| Condition2{Condition 2}
     Condition2 --> |True| Action2[Execute Action 2]
     Condition2 --> |False| Action3[Execute Action 3]
   ```

3. **Data Flow Diagram**:
   ```
   flowchart LR
     InputData[Input Data] --> Process1[Process 1]
     Process1 --> DataStore[(Data Store)]
     DataStore --> Process2[Process 2]
     Process2 --> OutputData[Output Data]
   ```

4. **System Integration Map**:
   ```
   flowchart TD
     ThisProgram[This Program] --> |Data Exchange 1| ExternalSystem1[External System 1]
     ThisProgram --> |Data Exchange 2| ExternalSystem2[External System 2]
     ExternalSystem1 --> |Response Data| ThisProgram
   ```

5. **Component Relationship Diagram**:
   ```
   classDiagram
     class MainProgram
     class Subprocedure1
     class Subprocedure2
     class ExternalProgram
    
     MainProgram --> Subprocedure1 : calls
     MainProgram --> Subprocedure2 : calls
     Subprocedure2 --> ExternalProgram : integrates with
   ```

## Program Analysis Input

Here is the parsed JSON representation of the RPGLE program to analyze:

```json
{program_json}
```

And here is the original RPGLE code:

```rpgle
{rpgle_code}
```

Please generate the business documentation based on both the parsed representation and the original source code.
"""

    # Prepare the JSON string with proper indentation
    program_json = json.dumps(program_data, indent=2)
    
    # Call the LLM with the prompt
    doc_result = query_llm(prompt
                        .replace("{program_json}", program_json)
                        .replace("{rpgle_code}", rpgle_code), 
                        max_tokens=8000)
    
    return doc_result

# Function to render Mermaid diagrams
def render_mermaid_diagrams(markdown_text):
    """
    Parse markdown text and replace Mermaid code blocks with rendered diagrams.

    Blocks tagged ```mermaid are always rendered. Untagged blocks are rendered
    only when they start with a Mermaid diagram keyword; every other fenced
    block (json, rpgle, java, ...) is returned unchanged.
    """
    import base64
    import re

    # Diagram types requested by the documentation prompts
    mermaid_keywords = ('flowchart', 'graph', 'sequenceDiagram',
                        'classDiagram', 'stateDiagram', 'erDiagram')

    # Match every fenced block, not just Mermaid ones, so that opening and
    # closing fences pair up correctly and a closing fence is never mistaken
    # for the start of the next block
    fence_pattern = re.compile(
        r'^[ \t]*```([^\s`]*)[ \t]*\n(.*?)\n[ \t]*```[ \t]*$',
        re.DOTALL | re.MULTILINE
    )

    def render_mermaid(mermaid_code):
        # mermaid.ink renders a diagram from the URL-safe base64 of its source;
        # standard base64 can emit '+' and '/', which break the URL
        mermaid_base64 = base64.urlsafe_b64encode(mermaid_code.encode('utf-8')).decode('ascii')
        img_url = f"https://mermaid.ink/img/{mermaid_base64}"
        return f'<img src="{img_url}" alt="Mermaid Diagram">'

    def replace_block(match):
        lang, body = match.group(1).lower(), match.group(2)
        is_mermaid = lang == 'mermaid' or (not lang and body.lstrip().startswith(mermaid_keywords))
        return render_mermaid(body) if is_mermaid else match.group(0)

    return fence_pattern.sub(replace_block, markdown_text)

# Create UI elements for business logic documentation
doc_output = widgets.Output()
business_logic_docs = {}

def generate_selected_file_doc():
    with doc_output:
        clear_output()
        if not parsed_results:
            print("Please parse RPGLE programs first.")
            return
        
        # Create a dropdown for file selection
        file_dropdown = widgets.Dropdown(
            options=list(parsed_results.keys()),
            description='Select file:',
            disabled=False,
        )
        
        generate_doc_button = widgets.Button(description="Generate Documentation")
        doc_status = widgets.Output()
        
        display(file_dropdown, generate_doc_button, doc_status)
        
        def on_generate_doc_button_clicked(b):
            with doc_status:
                clear_output()
                filename = file_dropdown.value
                print(f"Generating business logic documentation for {filename}...")
                
                # Get the parsed data and original code
                program_data = parsed_results[filename]
                rpgle_code = file_contents[filename]
                
                # Generate the documentation
                doc_result = generate_business_logic_doc(program_data, rpgle_code)
                if doc_result:
                    # Store the result
                    business_logic_docs[filename] = doc_result
                    
                    # Save original markdown
                    md_content = f"""# Business Logic Documentation for {filename}

{doc_result}
"""
                    with open(f"{filename}_business_doc.md", "w", encoding="utf-8") as f:
                        f.write(md_content)
                    
                    print(f"\nDocumentation saved to {filename}_business_doc.md")
                    
                    # Render Mermaid diagrams
                    print("Rendering Mermaid diagrams...")
                    html_content = render_mermaid_diagrams(doc_result)
                    
                    # Display the documentation with rendered diagrams
                    display(HTML(f'<h2>Business Logic Documentation for {filename}</h2>'))
                    display(HTML(html_content))
                    
                    # Create PDF export
                    try:
                        import pdfkit
                        pdfkit.from_string(html_content, f"{filename}_business_doc.pdf")
                        print(f"Documentation PDF saved to {filename}_business_doc.pdf")
                        files.download(f"{filename}_business_doc.pdf")
                    except Exception:
                        print("PDF export not available. Downloading markdown file instead.")
                        files.download(f"{filename}_business_doc.md")
                else:
                    print("Failed to generate documentation.")
        
        generate_doc_button.on_click(on_generate_doc_button_clicked)

# Add button to show documentation interface
doc_button = widgets.Button(description="Generate Business Logic Documentation")
doc_button.on_click(lambda b: generate_selected_file_doc())

display(doc_button, doc_output)
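
The `generate_business_logic_doc` function above fills `{program_json}` and `{rpgle_code}` with chained `str.replace` calls rather than `str.format`. That choice matters because the prompt also contains literal braces in its Mermaid examples (for instance `Decision{Decision Point}`), which `str.format` would try to interpret as fields. The sketch below shows a single-pass alternative. `fill_template` is a hypothetical helper, not part of this notebook, and the sample values are invented for illustration. Substituting in one pass also means a value that happens to contain the text `{rpgle_code}` is not substituted a second time.

```python
import re


def fill_template(template, values):
    """Replace {name} placeholders for the given names in a single pass.

    Braces that do not belong to a known placeholder (such as Mermaid
    decision nodes) are left untouched. Hypothetical helper for illustration.
    """
    if not values:
        return template
    # Build one alternation such as \{program_json\}|\{rpgle_code\}
    pattern = re.compile("|".join(re.escape("{" + name + "}") for name in values))
    # Strip the surrounding braces from the match to look up the value
    return pattern.sub(lambda match: values[match.group(0)[1:-1]], template)


# Invented sample data: the JSON value deliberately contains "{rpgle_code}"
template = "Decision{Decision Point}\n{program_json}\n{rpgle_code}"
filled = fill_template(
    template,
    {
        "program_json": '{"note": "{rpgle_code}"}',
        "rpgle_code": "dcl-s counter int(10);",
    },
)
print(filled)
```

With chained `replace` calls the second call would also rewrite the `{rpgle_code}` text inside the JSON value; whether that case ever occurs depends on what the parser emits.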

In [21]:
def generate_combined_documentation():
    """
    Create a combined document with parsed code and business logic documentation.
    """
    if not parsed_results or not business_logic_docs:
        print("Please parse RPGLE programs and generate business logic documentation first.")
        return

    # Find files that have both parsed results and business logic docs
    common_files = set(parsed_results.keys()) & set(business_logic_docs.keys())
    if not common_files:
        print("No files have both parsed results and business logic documentation. Please generate both first.")
        return

    # Create UI elements
    file_dropdown = widgets.Dropdown(
        options=list(common_files),
        description='Select file:',
        disabled=False,
    )

    generate_combined_button = widgets.Button(description="Generate Combined Document")
    combined_status = widgets.Output()

    display(file_dropdown, generate_combined_button, combined_status)

    def on_generate_combined_button_clicked(b):
        with combined_status:
            clear_output()
            filename = file_dropdown.value
            print(f"Generating combined documentation for {filename}...")

            # Get the parsed data and business logic doc
            program_data = parsed_results[filename]
            business_doc = business_logic_docs[filename]

            # Create a combined document
            combined_doc = f"""# Combined Documentation for {filename}

## Part 1: Business Logic Documentation

{business_doc}

## Part 2: Parsed RPGLE Program Structure

```json
{json.dumps(program_data, indent=2)}
```

## Part 3: Original RPGLE Code

```rpgle
{file_contents[filename]}
```
"""

            # Save the combined document
            combined_filename = f"{filename}_combined_doc.md"
            with open(combined_filename, "w", encoding="utf-8") as f:
                f.write(combined_doc)

            print(f"Combined documentation saved to {combined_filename}")

            # Convert to PDF if installed
            try:
                import pypandoc
                print("Converting to PDF...")
                pypandoc.convert_file(combined_filename, 'pdf', outputfile=f"{filename}_combined_doc.pdf")
                print(f"PDF saved to {filename}_combined_doc.pdf")
                files.download(f"{filename}_combined_doc.pdf")
            except (ImportError, OSError, RuntimeError):
                print("Could not convert to PDF. pypandoc or pandoc might not be installed.")
                print("Downloading markdown file instead.")
                files.download(combined_filename)

    generate_combined_button.on_click(on_generate_combined_button_clicked)

# Create UI elements for combined documentation
combined_output = widgets.Output()
combined_button = widgets.Button(description="Prepare Combined Documentation")

def show_combined_doc_interface(b):
    with combined_output:
        clear_output()
        generate_combined_documentation()

combined_button.on_click(show_combined_doc_interface)

display(combined_button, combined_output)

# Optional: Install pypandoc for PDF conversion
install_pandoc_button = widgets.Button(description="Install PDF Conversion Tools")
install_output = widgets.Output()

def install_pandoc(b):
    with install_output:
        clear_output()
        print("Installing pypandoc for PDF conversion...")
        !pip install pypandoc
        !apt-get update && apt-get install -y pandoc texlive-xetex
        print("Installation complete!")

install_pandoc_button.on_click(install_pandoc)

display(install_pandoc_button, install_output)
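
The combined-document cell only discovers that PDF conversion is unavailable when the conversion itself fails. A small pre-flight check can report what is missing before the attempt. This is a sketch under stated assumptions: `missing_pdf_tools` is a hypothetical helper, and it assumes the tools of interest are the `pypandoc` package plus the `pandoc` and `xelatex` executables that the install cell above sets up. The `which` and `find_spec` parameters exist only so the check can be exercised without touching the real system.

```python
import importlib.util
import shutil


def missing_pdf_tools(which=shutil.which, find_spec=importlib.util.find_spec):
    """Return the names of PDF conversion tools that cannot be found.

    Hypothetical helper: checks the pypandoc package and the pandoc and
    xelatex executables. An empty list means everything was located.
    """
    missing = []
    # find_spec returns None when a top-level package is not importable
    if find_spec("pypandoc") is None:
        missing.append("pypandoc")
    # which returns None when an executable is not on PATH
    for executable in ("pandoc", "xelatex"):
        if which(executable) is None:
            missing.append(executable)
    return missing


missing = missing_pdf_tools()
if missing:
    print("Missing PDF tools: " + ", ".join(missing))
else:
    print("PDF conversion tools look available.")
```

Finding the executables on `PATH` does not guarantee that a conversion will succeed, so the existing `try`/`except` fallback to the markdown download is still needed.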


In [22]:
def generate_application_business_logic():
    """
    Generate comprehensive application business logic by analyzing relationships between RPGLE programs.
    """
    if not parsed_results or len(parsed_results) < 2:
        print("Please parse at least two RPGLE programs to analyze relationships.")
        return

    print("Analyzing relationships between RPGLE programs...")

    # Extract relationships from parsed results
    relationships = {}
    programs_info = {}

    for filename, data in parsed_results.items():
        # Basic program info
        program_name = data.get('programName', filename)
        programs_info[program_name] = {
            'filename': filename,
            'type': data.get('programType', 'Unknown'),
            'purpose': data.get('programPurpose', 'Unknown')
        }

        # Extract calls to external programs
        external_calls = []
        if 'requiredLists' in data and 'externalPrograms' in data['requiredLists']:
            external_calls = [(ep.get('name', ''), ep.get('calledFrom', ''), ep.get('purpose', ''))
                             for ep in data['requiredLists']['externalPrograms']]

        # Store relationships
        relationships[program_name] = {
            'calls': external_calls,
            'databases': [db.get('name', '') for db in data.get('requiredLists', {}).get('databases', [])],
            'copyBooks': [cb.get('name', '') for cb in data.get('requiredLists', {}).get('copyBooks', [])]
        }

    # Compile information for the application logic analysis
    app_info = {
        'programs': programs_info,
        'relationships': relationships,
        'parsed_details': parsed_results,
        'business_docs': business_logic_docs
    }

    # Create prompt for application business logic generation
    prompt = """# Application Business Logic Analysis

## Task Overview

Create a comprehensive application-level business logic document by analyzing the relationships between multiple RPGLE programs. This document should describe the overall business functionality of the application, how different programs interact, and provide a holistic view of the business processes implemented by the set of programs.

## Input Data

I have analyzed multiple RPGLE programs and identified the following relationships and business logic:

```json
{app_info_json}
```

## Required Document Sections

Please create a comprehensive application business logic document with the following sections:

### 1. Application Overview
- Application name (infer from program names/purposes)
- Overall business purpose
- Key business capabilities
- Primary business processes supported

### 2. System Architecture
- High-level architecture diagram (Mermaid format)
- Program interactions and dependencies
- Data flow between components
- External system integrations

### 3. Business Domain Model
- Core business entities and their relationships
- Key business concepts
- Domain terminology

### 4. Primary Business Workflows
- Main business processes from start to finish
- Process flow diagrams (Mermaid format)
- Decision points and business rules
- Exception handling paths

### 5. Data Management
- Database usage patterns
- Key data entities and their business purpose
- Data validation and business rules
- Data transformation processes

### 6. Integration Points
- External system dependencies
- Data exchange patterns
- Integration challenges and solutions

### 7. Business Rules Catalog
- Consolidated list of business rules across programs
- Rule categorization by business area
- Rule implementation details

### 8. Modernization Considerations
- Legacy design patterns identified
- Suggested improvements for modern architecture
- Business function to microservice mapping
- Suggested Java/Spring Boot implementation approach

## Mermaid Diagram Requirements

Include at least these Mermaid diagrams:

1. **Application Component Diagram**:
```
flowchart TD
  subgraph "Application Components"
    Program1[Program 1] --> Program2[Program 2]
    Program1 --> Program3[Program 3]
    Program2 --> Database[(Database)]
  end
  subgraph "External Systems"
    Program3 --> ExternalSystem[External System]
  end
```

2. **Business Process Workflow**:
```
flowchart TD
  Start[Business Trigger] --> Process1[Process 1]
  Process1 --> Decision{Decision Point}
  Decision -->|Condition A| Process2[Process 2]
  Decision -->|Condition B| Process3[Process 3]
  Process2 --> End[Business Outcome]
  Process3 --> End
```

3. **Domain Entity Relationship Diagram**:
```
classDiagram
  class Entity1 {
    +attribute1
    +attribute2
  }
  class Entity2 {
    +attribute1
    +attribute2
  }
  Entity1 "1" --> "many" Entity2: contains
```

## Output Format

The document should be formatted in Markdown with proper headings, lists, tables, and Mermaid diagrams.
"""

    # Convert app_info to JSON
    app_info_json = json.dumps(app_info, indent=2)

    # Call the LLM with the prompt
    app_logic_doc = query_llm(prompt.replace("{app_info_json}", app_info_json), max_tokens=8000)
    if not app_logic_doc:
        print("Failed to generate application business logic documentation.")
        return

    # Save the application logic document
    app_logic_filename = "application_business_logic.md"
    with open(app_logic_filename, "w", encoding="utf-8") as f:
        f.write(app_logic_doc)

    print(f"Application business logic documentation saved to {app_logic_filename}")

    # Display the documentation
    display(HTML('<h2>Application Business Logic</h2>'))
    display(HTML(app_logic_doc))

    # Download link
    files.download(app_logic_filename)

    return app_logic_doc

# Create UI elements for application business logic
app_logic_output = widgets.Output()
app_logic_button = widgets.Button(description="Generate Application Business Logic")

def show_app_logic_interface(b):
    with app_logic_output:
        clear_output()
        generate_application_business_logic()

app_logic_button.on_click(show_app_logic_interface)

display(app_logic_button, app_logic_output)
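
The `relationships` dictionary built in `generate_application_business_logic` is passed to the LLM as JSON, but some cross-program facts can be derived from it directly. The sketch below assumes the same shape as that dictionary (`calls` as `(name, calledFrom, purpose)` tuples, `databases` and `copyBooks` as lists of names). The helper names `shared_resources` and `call_edges`, and the program and file names in the sample, are hypothetical.

```python
from collections import defaultdict


def shared_resources(relationships, kind="databases"):
    """Map each resource name to the programs using it, keeping only shared ones."""
    users = defaultdict(set)
    for program, info in relationships.items():
        for name in info.get(kind, []):
            if name:  # the notebook stores '' when a name is missing
                users[name].add(program)
    return {name: sorted(programs) for name, programs in users.items() if len(programs) > 1}


def call_edges(relationships):
    """Return sorted, de-duplicated (caller, callee) pairs from the 'calls' tuples."""
    edges = set()
    for caller, info in relationships.items():
        for callee, _called_from, _purpose in info.get("calls", []):
            if callee:
                edges.add((caller, callee))
    return sorted(edges)


# Invented sample in the same shape as the notebook's relationships dict
sample = {
    "ORDENTRY": {
        "calls": [("PRICECALC", "main", "price lookup")],
        "databases": ["ORDHDR", "CUSTMAST"],
        "copyBooks": [],
    },
    "PRICECALC": {
        "calls": [],
        "databases": ["CUSTMAST", "PRICES"],
        "copyBooks": [],
    },
}
print(shared_resources(sample))
print(call_edges(sample))
```

Files touched by more than one program are likely candidates for shared entities or repositories in the Spring Boot design, and the call edges could feed the `networkx` graph used elsewhere in this notebook.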


In [ ]:
## Step 15: Generate Spring Boot Java Code

def generate_java_code():
    """
    Generate Spring Boot Java code from the RPGLE programs based on the project structure and business logic analysis.
    """
    if not parsed_results and not file_contents:
        print("Please upload and parse RPGLE programs first.")
        return
    
    print("Generating Spring Boot Java code...")
    
    # Create directory structure for Java files
    java_project_dir = './java_project'
    os.makedirs(java_project_dir, exist_ok=True)
    
    # Track generated files
    generated_files = {}
    rpgle_files = parsed_results if parsed_results else {filename: {} for filename in file_contents.keys()}
    
    # For each RPGLE file, generate Java code
    for filename, parsed_data in rpgle_files.items():
        if filename not in file_contents:
            print(f"Warning: File content for {filename} not found. Skipping...")
            continue
            
        rpgle_code = file_contents[filename]
        print(f"Converting {filename} to Java code...")
        
        # Extract program info from parsed data (if available) or filename
        program_name = parsed_data.get('programName', os.path.splitext(filename)[0]) if parsed_data else os.path.splitext(filename)[0]
        
        # Create Java class names
        base_name = ''.join(word.title() for word in program_name.split('_') if word)
        if not base_name:
            base_name = ''.join(word.title() for word in filename.split('.')[0].split('_') if word)
        if not base_name or not base_name[0].isalpha():
            base_name = 'R' + base_name if base_name else 'RpgProgram'  # Default name if nothing works
            
        # Get business logic analysis if available
        business_logic = ""
        if business_logic_docs and filename in business_logic_docs:
            business_logic = business_logic_docs[filename]
        
        # Extract procedure and database info if available
        procedures = []
        databases = []
        data_structures = []
        
        if parsed_data and 'requiredLists' in parsed_data:
            req_lists = parsed_data['requiredLists']
            procedures = req_lists.get('subprocedures', [])
            databases = req_lists.get('databases', [])
            data_structures = req_lists.get('dataStructures', [])
        
        # RPGLE to Java conversion prompt
        prompt = f"""# RPGLE to Spring Boot Java Conversion

## Task
Convert this RPGLE program to Spring Boot Java code. Generate ALL necessary Java files for a complete working application.

## RPGLE Program Details
Program Name: {program_name}
Filename: {filename}

## Business Logic Summary
{business_logic[:2000] if business_logic else "No business logic analysis available."}

## RPGLE Code to Convert
```rpgle
{rpgle_code[:10000]}
```

## Required Output:
Generate ALL of the following Java files:

1. Entity Classes: Convert all RPGLE data structures to Java entity classes
2. Service Class: Convert main program and procedures to a service class 
3. Repository Interface: Create interface for database operations
4. Controller Class: Create a REST controller for API access
5. Configuration: Any needed configuration classes

Use these naming conventions:
- Entity: {base_name}Entity and other entity names based on data structures
- Service: {base_name}Service
- Repository: {base_name}Repository
- Controller: {base_name}Controller

For each file:
1. Provide the full file path (e.g., src/main/java/com/example/controller/MyController.java)
2. Provide the COMPLETE Java code for that file including ALL package declarations, imports, and code

IMPORTANT GUIDELINES:
- Include Spring Boot annotations (@Service, @Repository, @RestController, etc.)
- Convert ALL RPGLE business logic to equivalent Java code
- Map ALL RPGLE data structures to Java classes
- Map ALL RPGLE procedures to Java methods
- Convert ALL RPGLE database operations to Spring Data JPA
- Include ALL Javadoc comments
- Follow Spring Boot best practices
- Make sure to properly handle types, enums, constants, etc.
- Include error handling
- Code must be COMPLETE and READY TO USE - no placeholder comments
"""
        
        print(f"Sending conversion request to LLM for {filename}...")
        result = query_llm(prompt, max_tokens=10000)
        
        if not result:
            print(f"Error: No response from LLM for {filename}")
            continue
            
        print(f"Received conversion result. Processing Java files...")
        
        # Process the response to extract Java files
        # Format is typically:
        # file_path
        # ```java
        # code
        # ```
        
        java_file_pattern = r'```(?:java)?\s*\n([\s\S]*?)```'
        file_path_pattern = r'(?:^|\n)(?:src|java|com)/[^\n]+\.java'
        
        # First try to extract file paths followed by code blocks
        current_path = None
        all_paths = re.findall(file_path_pattern, result)
        all_code_blocks = re.findall(java_file_pattern, result)
        
        created_files = []
        
        # If we have matching numbers of paths and code blocks, use them directly
        if len(all_paths) == len(all_code_blocks):
            for i, (path, code) in enumerate(zip(all_paths, all_code_blocks)):
                path = path.strip()
                
                # Clean up the path
                if not path.startswith('src/'):
                    path = 'src/main/java/' + path
                
                # Full path within project directory
                full_path = os.path.join(java_project_dir, path)
                
                # Create directory if it doesn't exist
                os.makedirs(os.path.dirname(full_path), exist_ok=True)
                
                # Write the file
                with open(full_path, 'w') as f:
                    f.write(code)
                
                generated_files[path] = code
                created_files.append(path)
                print(f"  Created Java file: {path}")
        else:
            # If the numbers don't match, try to extract using file markers in the text
            print(f"  Mismatched file paths and code blocks. Using alternative extraction method...")
            
            lines = result.split('\n')
            current_file = None
            current_code = []
            in_code_block = False
            
            for line in lines:
                # Check for file path marker
                if re.fullmatch(r'(?:src|java|com)/\S+\.java', line.strip()) and not in_code_block:
                    # Save previous file if any
                    if current_file and current_code:
                        # Clean up the path
                        if not current_file.startswith('src/'):
                            current_file = 'src/main/java/' + current_file
                        
                        # Full path within project directory
                        full_path = os.path.join(java_project_dir, current_file)
                        
                        # Create directory if it doesn't exist
                        os.makedirs(os.path.dirname(full_path), exist_ok=True)
                        
                        # Write the file
                        with open(full_path, 'w') as f:
                            f.write('\n'.join(current_code))
                        
                        generated_files[current_file] = '\n'.join(current_code)
                        created_files.append(current_file)
                        print(f"  Created Java file: {current_file}")
                    
                    # Start new file
                    current_file = line.strip()
                    current_code = []
                    in_code_block = False
                # Check for code block markers
                elif line.strip() == '```java' or line.strip() == '```':
                    if in_code_block:
                        in_code_block = False
                    else:
                        in_code_block = True
                # Collect code
                elif in_code_block and current_file:
                    current_code.append(line)
            
            # Save the last file if any
            if current_file and current_code:
                # Clean up the path
                if not current_file.startswith('src/'):
                    current_file = 'src/main/java/' + current_file
                
                # Full path within project directory
                full_path = os.path.join(java_project_dir, current_file)
                
                # Create directory if it doesn't exist
                os.makedirs(os.path.dirname(full_path), exist_ok=True)
                
                # Write the file
                with open(full_path, 'w') as f:
                    f.write('\n'.join(current_code))
                
                generated_files[current_file] = '\n'.join(current_code)
                created_files.append(current_file)
                print(f"  Created Java file: {current_file}")
        
        # Create default files if none were created
        if not created_files:
            print("  No files extracted from LLM response. Creating default Java files...")
            
            # Create a default entity class
            entity_path = f"src/main/java/com/example/domain/{base_name}.java"
            entity_code = """package com.example.domain;

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
import lombok.Data;

/**
 * Entity class for PROGRAM_NAME
 * Generated from RPGLE program
 */
@Entity
@Table(name = "PROGRAM_TABLE")
@Data
public class CLASSNAME {
    @Id
    private Long id;
    
    // Add fields based on RPGLE data structures
}
""".replace("PROGRAM_NAME", program_name).replace("PROGRAM_TABLE", program_name.upper()).replace("CLASSNAME", base_name)

            full_path = os.path.join(java_project_dir, entity_path)
            os.makedirs(os.path.dirname(full_path), exist_ok=True)
            with open(full_path, 'w') as f:
                f.write(entity_code)
            generated_files[entity_path] = entity_code
            print(f"  Created Java file: {entity_path}")
            
            # Create a default service class
            service_path = f"src/main/java/com/example/service/{base_name}Service.java"
            service_code = """package com.example.service;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import com.example.domain.CLASSNAME;
import com.example.repository.CLASSNAMERepository;
import java.util.List;

/**
 * Service class for PROGRAM_NAME
 * Generated from RPGLE program
 */
@Service
public class CLASSNAMEService {
    
    @Autowired
    private CLASSNAMERepository repository;
    
    /**
     * Get all CLASSNAME records
     */
    public List<CLASSNAME> getAllCLASSNAMEs() {
        return repository.findAll();
    }
    
    /**
     * Get CLASSNAME by id
     */
    public CLASSNAME getCLASSNAMEById(Long id) {
        return repository.findById(id).orElse(null);
    }
    
    /**
     * Save CLASSNAME
     */
    public CLASSNAME saveCLASSNAME(CLASSNAME entity) {
        return repository.save(entity);
    }
    
    /**
     * Delete CLASSNAME
     */
    public void deleteCLASSNAME(Long id) {
        repository.deleteById(id);
    }
    
    // Add methods based on RPGLE procedures
}
""".replace("PROGRAM_NAME", program_name).replace("CLASSNAME", base_name)

            full_path = os.path.join(java_project_dir, service_path)
            os.makedirs(os.path.dirname(full_path), exist_ok=True)
            with open(full_path, 'w') as f:
                f.write(service_code)
            generated_files[service_path] = service_code
            print(f"  Created Java file: {service_path}")
            
            # Create a default repository interface
            repo_path = f"src/main/java/com/example/repository/{base_name}Repository.java"
            repo_code = """package com.example.repository;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Repository;
import com.example.domain.CLASSNAME;

/**
 * Repository interface for PROGRAM_NAME
 * Generated from RPGLE program
 */
@Repository
public interface CLASSNAMERepository extends JpaRepository<CLASSNAME, Long> {
    // Add query methods based on RPGLE file operations
}
""".replace("PROGRAM_NAME", program_name).replace("CLASSNAME", base_name)

            full_path = os.path.join(java_project_dir, repo_path)
            os.makedirs(os.path.dirname(full_path), exist_ok=True)
            with open(full_path, 'w') as f:
                f.write(repo_code)
            generated_files[repo_path] = repo_code
            print(f"  Created Java file: {repo_path}")
            
            # Create a default controller class
            controller_path = f"src/main/java/com/example/controller/{base_name}Controller.java"
            controller_code = """package com.example.controller;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;
import com.example.domain.CLASSNAME;
import com.example.service.CLASSNAMEService;
import java.util.List;

/**
 * REST Controller for PROGRAM_NAME
 * Generated from RPGLE program
 */
@RestController
@RequestMapping("/api/LOWERNAME")
public class CLASSNAMEController {
    
    @Autowired
    private CLASSNAMEService service;
    
    /**
     * Get all CLASSNAMEs
     */
    @GetMapping
    public List<CLASSNAME> getAllCLASSNAMEs() {
        return service.getAllCLASSNAMEs();
    }
    
    /**
     * Get CLASSNAME by id
     */
    @GetMapping("/{id}")
    public CLASSNAME getCLASSNAMEById(@PathVariable Long id) {
        return service.getCLASSNAMEById(id);
    }
    
    /**
     * Create CLASSNAME
     */
    @PostMapping
    public CLASSNAME createCLASSNAME(@RequestBody CLASSNAME entity) {
        return service.saveCLASSNAME(entity);
    }
    
    /**
     * Update CLASSNAME
     */
    @PutMapping("/{id}")
    public CLASSNAME updateCLASSNAME(@PathVariable Long id, @RequestBody CLASSNAME entity) {
        entity.setId(id);
        return service.saveCLASSNAME(entity);
    }
    
    /**
     * Delete CLASSNAME
     */
    @DeleteMapping("/{id}")
    public void deleteCLASSNAME(@PathVariable Long id) {
        service.deleteCLASSNAME(id);
    }
}
""".replace("PROGRAM_NAME", program_name).replace("CLASSNAME", base_name).replace("LOWERNAME", base_name.lower())

            full_path = os.path.join(java_project_dir, controller_path)
            os.makedirs(os.path.dirname(full_path), exist_ok=True)
            with open(full_path, 'w') as f:
                f.write(controller_code)
            generated_files[controller_path] = controller_code
            print(f"  Created Java file: {controller_path}")
    
    # Create main application class
    app_path = "src/main/java/com/example/Application.java"
    app_full_path = os.path.join(java_project_dir, app_path)
    
    if not os.path.exists(app_full_path):
        app_code = """package com.example;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

/**
 * Main Spring Boot Application class
 * Generated from RPGLE conversion
 */
@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
"""
        os.makedirs(os.path.dirname(app_full_path), exist_ok=True)
        with open(app_full_path, 'w') as f:
            f.write(app_code)
        generated_files[app_path] = app_code
        print(f"Created Java file: {app_path}")
    
    # Create application.properties
    props_path = "src/main/resources/application.properties"
    props_full_path = os.path.join(java_project_dir, props_path)
    
    if not os.path.exists(props_full_path):
        props_content = """# Spring Boot application properties
# Generated from RPGLE conversion

# Server configuration
server.port=8080

# Database configuration
spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=password
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect

# JPA/Hibernate configuration
spring.jpa.hibernate.ddl-auto=update
spring.jpa.show-sql=true

# H2 Console configuration
spring.h2.console.enabled=true
spring.h2.console.path=/h2-console
"""
        os.makedirs(os.path.dirname(props_full_path), exist_ok=True)
        with open(props_full_path, 'w') as f:
            f.write(props_content)
        generated_files[props_path] = props_content
        print(f"Created file: {props_path}")
    
    # Create pom.xml if it doesn't exist
    pom_path = "pom.xml"
    pom_full_path = os.path.join(java_project_dir, pom_path)
    
    if not os.path.exists(pom_full_path):
        pom_content = """<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.17</version>
        <relativePath/>
    </parent>
    
    <groupId>com.example</groupId>
    <artifactId>rpgle-converted-app</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>RPGLE Converted Application</name>
    <description>Spring Boot application converted from RPGLE code</description>
    
    <properties>
        <java.version>11</java.version>
    </properties>
    
    <dependencies>
        <!-- Spring Boot Starters -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-jpa</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-validation</artifactId>
        </dependency>
        
        <!-- Database -->
        <dependency>
            <groupId>com.h2database</groupId>
            <artifactId>h2</artifactId>
            <scope>runtime</scope>
        </dependency>
        
        <!-- Lombok for boilerplate reduction -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        
        <!-- Testing -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
"""
        with open(pom_full_path, 'w') as f:
            f.write(pom_content)
        generated_files[pom_path] = pom_content
        print(f"Created file: {pom_path}")
    
    # Create README.md with information about the converted project
    readme_path = "README.md"
    readme_full_path = os.path.join(java_project_dir, readme_path)
    
    readme_content = """# RPGLE to Spring Boot Converted Application

This Spring Boot application was automatically generated from RPGLE source code using an LLM-based conversion tool.

## Project Structure

The project follows a standard Spring Boot architecture:

- `src/main/java/com/example/domain/` - Entity classes converted from RPGLE data structures
- `src/main/java/com/example/repository/` - Data access interfaces for database operations
- `src/main/java/com/example/service/` - Business logic services converted from RPGLE programs
- `src/main/java/com/example/controller/` - REST API controllers for accessing the services

## Converted RPGLE Programs

The following RPGLE programs were converted:

"""
    
    for filename in file_contents.keys():
        readme_content += f"- `{filename}` - Converted to Java classes\n"
    
    readme_content += """
## How to Run

1. Make sure you have Java 11+ and Maven installed
2. Unzip `java_project.zip` and change into the `java_project` directory
3. Run `mvn spring-boot:run` to start the application
4. Access the H2 console at http://localhost:8080/h2-console (JDBC URL: jdbc:h2:mem:testdb, Username: sa, Password: password)
5. Access the REST API at http://localhost:8080/api/...

## Notes on Conversion

- The conversion process attempted to maintain the business logic from the original RPGLE code
- Data structures were converted to JPA entities
- File operations were converted to Spring Data repository methods
- Business logic in procedures was converted to service methods
- REST API endpoints were added for accessing the functionality
"""
    
    with open(readme_full_path, 'w') as f:
        f.write(readme_content)
    generated_files[readme_path] = readme_content
    print(f"Created file: {readme_path}")
    
    # Create a zip file of the Java project
    shutil.make_archive('java_project', 'zip', '.', 'java_project')
    print("\nJava code generation complete!")
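    # generated_files also holds pom.xml, application.properties and README.md,
    # so report a file count rather than a Java-file count
    print(f"Created {len(generated_files)} files.")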
    print("\nAll files have been saved to java_project.zip")
    
    # Provide download link
    try:
        files.download('java_project.zip')
    except Exception as e:
        print(f"Error providing download: {e}")
        print("You can manually download the generated zip file.")
    
    return generated_files

# Execute the function when the cell is run
generated_java_code = generate_java_code()
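
The response-parsing loop inside `generate_java_code()` is easier to check when it is pulled out as a pure function that never touches the disk. The following is a minimal sketch under assumptions, not the notebook's implementation: `extract_java_files`, `PATH_MARKER`, and `FENCE` are hypothetical names, and the LLM response is assumed to put each file path on its own line, followed by a fenced `java` block.

```python
import re

# Hypothetical names, not defined elsewhere in the notebook.
# \S+ allows the path to span several directories.
PATH_MARKER = re.compile(r'(?:src|java|com)/\S+\.java$')
FENCE = "`" * 3  # three backticks, built this way to keep this example readable


def _store(files_found, path, code_lines):
    """Record one parsed file, normalizing its path under src/main/java/."""
    if path and code_lines:
        if not path.startswith('src/'):
            path = 'src/main/java/' + path
        files_found[path] = '\n'.join(code_lines)


def extract_java_files(response_text):
    """Return {relative_path: source} parsed from an LLM response string."""
    files_found = {}
    current_file, current_code, in_code_block = None, [], False

    for line in response_text.split('\n'):
        stripped = line.strip()
        if not in_code_block and PATH_MARKER.match(stripped):
            # A new path marker closes the previous file and starts a new one
            _store(files_found, current_file, current_code)
            current_file, current_code = stripped, []
        elif stripped in (FENCE, FENCE + 'java'):
            # Opening and closing fences toggle the collection state
            in_code_block = not in_code_block
        elif in_code_block and current_file:
            current_code.append(line)

    # Save the last file, if any
    _store(files_found, current_file, current_code)
    return files_found


# Usage with a made-up response
sample_response = '\n'.join([
    'src/main/java/com/example/domain/Order.java',
    FENCE + 'java',
    'package com.example.domain;',
    'public class Order {}',
    FENCE,
])
for path, source in extract_java_files(sample_response).items():
    print(path, len(source.splitlines()), 'lines')
```

Keeping the parser separate from the file writes means the path pattern can be exercised against saved LLM responses before any project files are generated.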

## Summary

This notebook converts RPGLE programs to Spring Boot Java applications using LLM-assisted analysis. It:

1. Uploads and analyzes RPGLE files to identify dependencies
2. Detects and classifies formats used in the code
3. Extracts detailed program metadata
4. Visualizes program relationships and dependencies
5. Generates Spring Boot conversion recommendations
6. Exports results for further processing
7. Generates Java architecture recommendations
8. Provides domain package organization suggestions
9. Recommends service boundaries
10. Creates a complete Spring Boot project structure
11. Parses RPGLE programs to extract structured information
12. Generates comprehensive business logic documentation
13. Creates combined documentation with parsed code and business logic
14. Produces application-level business logic analysis
15. Generates Spring Boot Java code implementing the RPGLE functionality

The tool uses LLMs to interpret RPGLE source code and guide the conversion, which makes it easier to modernize legacy RPG applications as Spring Boot services.
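
As a quick sanity check on the final step, the exported archive can be inspected for the scaffold files before it is opened in an IDE. This is a minimal sketch, assuming the layout produced by `shutil.make_archive('java_project', 'zip', '.', 'java_project')`, where every entry is prefixed with `java_project/`. `missing_project_files` and `REQUIRED_FILES` are hypothetical names, not part of the notebook.

```python
import os
import zipfile

# Assumption: these are the scaffold files generate_java_code() writes on every run.
REQUIRED_FILES = (
    "pom.xml",
    "README.md",
    "src/main/resources/application.properties",
    "src/main/java/com/example/Application.java",
)


def missing_project_files(zip_path, project_dir="java_project", required=REQUIRED_FILES):
    """Return the required files (relative to project_dir) absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    # Zip entries always use forward slashes, regardless of platform
    return [rel for rel in required if f"{project_dir}/{rel}" not in names]


# Usage: only runs when the archive from the previous cell is present
if os.path.exists("java_project.zip"):
    missing = missing_project_files("java_project.zip")
    print("Missing files:", missing if missing else "none")
```

An empty result only confirms that the scaffold is present. It says nothing about whether the generated Java sources compile, so running `mvn compile` on the unzipped project remains the real test.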
