# Chained Code Remediation Pipeline
This notebook chains two stages:
1. **Scan the code to extract SELECT queries, data types, and function module names.**
2. **Map the data types to their S4 equivalents using an external Excel sheet, and rewrite a code line only when a mapping is found.**

Input files required in the working directory:
- `Scan-Program.ipynb` (for scanning ABAP code)
- `Transform-DE-DT.ipynb` (for data element mapping logic)
- `Mapping-DE-DT.xlsx` (for mapping data elements to S4 types)
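
Before running the cells, it can help to confirm that these three files are actually in place. The following is a minimal sketch of such a pre-flight check; the helper name `missing_inputs` and the constant `REQUIRED_FILES` are hypothetical and not part of the original notebooks.

```python
from pathlib import Path

# File names taken from the list above
REQUIRED_FILES = ["Scan-Program.ipynb", "Transform-DE-DT.ipynb", "Mapping-DE-DT.xlsx"]

def missing_inputs(directory="."):
    """Return the required file names that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]

# Report anything missing from the current working directory
missing = missing_inputs()
if missing:
    print("Missing input files:", ", ".join(missing))
```

An empty list means every input was found.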

You can adapt the `sample_code` variable to your own ABAP code for processing.

In [None]:
# Step 1: Import scan logic from Scan-Program.ipynb
# (copy-paste here, or use %run if on local Jupyter)
import re

def scan_keywords(code_lines):
    select_queries = []
    data_types = []
    function_names = []
    
    in_select = False
    current_select = []
    for line in code_lines:
        l = line.rstrip()
        # SELECT block
        if not in_select and re.search(r'\bSELECT\b', l, re.IGNORECASE):
            in_select = True
            current_select = [l]
            if l.strip().endswith(('.', ';')):
                select_queries.append(' '.join(current_select))
                in_select = False
                current_select = []
            continue
        if in_select:
            current_select.append(l)
            if l.strip().endswith(('.', ';')):
                select_queries.append(' '.join(current_select))
                in_select = False
                current_select = []
            continue
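        # DATA: capture the type name after the TYPE keyword.
        # The pattern stops before a trailing '.', so the statement
        # terminator is not stored as part of the type name.
        if re.search(r'\bDATA\b', l, re.IGNORECASE):
            match = re.search(r'\bTYPE\s+(\w+(?:\.\w+)*)', l, re.IGNORECASE)
            if match:
                data_types.append(match.group(1))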
        # CALL FUNCTION: function name
        if re.search(r'\bCALL FUNCTION\b', l, re.IGNORECASE):
            match = re.search(r'CALL FUNCTION\s+["\']?([\w\_]+)["\']?', l, re.IGNORECASE)
            if match:
                function_names.append(match.group(1))
    return select_queries, data_types, function_names

## Sample ABAP Code
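Replace `sample_code` with your own code, supplied as a list of lines. The sample below mixes `.` and `;` terminators and both quote styles, all of which the scanner accepts; real ABAP ends statements with a period and puts function names in single quotes.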
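
If your code lives in a file rather than in a hand-typed list, one way to get the list of lines is sketched below. The helper `load_code_lines` and the file name `z_demo_report.abap` are hypothetical; the demo writes a throwaway file only so that the snippet runs on its own.

```python
import tempfile
from pathlib import Path

def load_code_lines(path, encoding="utf-8"):
    """Read a source file and return its lines without trailing newlines."""
    return Path(path).read_text(encoding=encoding).splitlines()

# Demo with a temporary file; point the path at your own export instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "z_demo_report.abap"  # hypothetical file name
    demo_path.write_text("DATA lv_name TYPE string.\nWRITE lv_name.\n", encoding="utf-8")
    demo_lines = load_code_lines(demo_path)

print(demo_lines)  # ['DATA lv_name TYPE string.', 'WRITE lv_name.']
```

The returned list can be assigned to `sample_code` directly.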

In [None]:
sample_code = [
    'SELECT *',
    '  FROM users',
    '  WHERE id = 1;',
    'DATA lv_name TYPE string.',
    'CALL FUNCTION "BAPI_USER_CREATE".',
    'WRITE lv_name.',
    'SELECT id, name',
    '  FROM customers',
    '  WHERE active = 1',
    '  ORDER BY name.',
    'data: lt_orders TYPE TABLE_OF_ORDERS.',
    'call function "Z_CUSTOM_MODULE".',
    'DATA number TYPE i.',
    'CALL FUNCTION \'RFC_PING\'.'
]

In [None]:
# Step 2: Run scan to extract SELECTs, Data Types, Functions
selects, data_types, functions = scan_keywords(sample_code)
print('SELECT Queries:')
for s in selects:
    print(s)
print('\nDATA TYPEs:')
for d in data_types:
    print(d)
print('\nFUNCTION Names:')
for f in functions:
    print(f)

## Step 3: Import the Data Element Mapping Logic
We use the mapping function from `Transform-DE-DT.ipynb` for S4 mapping.

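Make sure `Mapping-DE-DT.xlsx` is present in the working directory. The lookup below expects the sheet to have `CRM` and `S4` columns; `LENGTH` and `DECIMAL` columns are optional.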
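
To make the expected sheet layout concrete, here is a small sketch that builds an in-memory table with the same column names and runs the same kind of lookup against it. The column names come from the lookup code in this notebook; the two rows and the helper name `describe` are invented for illustration and do not reflect the real mapping file.

```python
import pandas as pd

# Hypothetical rows; only the column names match the real sheet.
df_example = pd.DataFrame(
    {
        "CRM": ["ZCRM_AMOUNT", "ZCRM_NAME"],
        "S4": ["P", "C"],
        "LENGTH": [13, 40],
        "DECIMAL": [2, None],  # blank cell: no DECIMALS addition
    }
)

def describe(data_element, table):
    """Return the S4 declaration for one data element, or None if unmapped."""
    rows = table.loc[table["CRM"] == data_element]
    if rows.empty:
        return None
    row = rows.iloc[0]
    parts = [str(row["S4"]).strip()]
    if pd.notnull(row["LENGTH"]):
        parts.append(f"LENGTH {int(row['LENGTH'])}")
    if pd.notnull(row["DECIMAL"]):
        parts.append(f"DECIMALS {int(row['DECIMAL'])}")
    return " ".join(parts)

print(describe("ZCRM_AMOUNT", df_example))  # P LENGTH 13 DECIMALS 2
print(describe("ZCRM_NAME", df_example))    # C LENGTH 40
print(describe("UNKNOWN", df_example))      # None
```

Note that the comparison on `CRM` is case-sensitive, so a type scanned as `zcrm_name` would not match a sheet entry written as `ZCRM_NAME`.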

In [None]:
import pandas as pd
mapping_file = 'Mapping-DE-DT.xlsx'
df_map = pd.read_excel(mapping_file)

def get_s4_declaration(data_element):
    """Return the S4 declaration for a CRM data element, or None if unmapped."""
    row = df_map.loc[df_map['CRM'] == data_element]
    if row.empty or pd.isnull(row.iloc[0]['S4']):
        return None  # No mapping row, or the row has an empty S4 cell
    first = row.iloc[0]
    result = str(first['S4']).strip()
    # LENGTH and DECIMAL are optional columns; skip them when absent or blank
    length = first['LENGTH'] if 'LENGTH' in row.columns else None
    decimal = first['DECIMAL'] if 'DECIMAL' in row.columns else None
    if pd.notnull(length):
        result += f" LENGTH {int(float(length))}"
    if pd.notnull(decimal):
        result += f" DECIMALS {int(float(decimal))}"
    return result

## Step 4: Create a Remediation Plan
- For each data type found, get the mapped S4 declaration (if available).
- For each line in code, replace original data type with mapped S4 declaration **only if mapping is found**.
- If no mapping, leave code unchanged.
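
The plan described above can be sketched as a function that also reports which types had no mapping, so a reviewer can see what was left untouched. This is an illustrative variant, not the notebook's own cell: `build_plan` is a hypothetical name, and a plain dictionary stands in for the Excel-backed lookup.

```python
def build_plan(found_types, lookup):
    """Split scanned types into a mapping dict and a sorted list of unmapped names."""
    plan, unmapped = {}, []
    for name in sorted(set(found_types)):
        declaration = lookup(name)
        if declaration:
            plan[name] = declaration
        else:
            unmapped.append(name)
    return plan, unmapped

# Hypothetical stand-in for the Excel-backed lookup
demo_table = {"ZCRM_NAME": "C LENGTH 40"}
plan, unmapped = build_plan(["ZCRM_NAME", "i", "ZCRM_NAME", "string"], demo_table.get)

print(plan)      # {'ZCRM_NAME': 'C LENGTH 40'}
print(unmapped)  # ['i', 'string']
```

Passing the lookup as an argument keeps the sketch testable without the Excel file.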

In [None]:
# Build mapping dict: {original_type: mapped_declaration}
mapping_results = {}
for dt in set(data_types):
    mapped = get_s4_declaration(dt)
    if mapped:
        mapping_results[dt] = mapped

print('Mapping Results:')
for k, v in mapping_results.items():
    print(f'{k} -> {v}')

## Step 5: Replace Code with Mapped Data Types
- Only change the TYPE clause if mapping is found.
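
As a worked example of this rule on single lines, the sketch below rewrites the TYPE clause of a `DATA` statement and leaves everything else alone, including the trailing period. The helper `remap_type` and the mapping entry are hypothetical, and the pattern assumes type names consist of word characters only.

```python
import re

# Group 1 keeps "TYPE " as written; group 2 is the type name without the final '.'
TYPE_CLAUSE = re.compile(r"(\bTYPE\s+)(\w+)", re.IGNORECASE)

def remap_type(line, mapping):
    """Rewrite the first TYPE clause of a DATA line when its type has a mapping."""
    if not re.search(r"\bDATA\b", line, re.IGNORECASE):
        return line

    def swap(match):
        mapped = mapping.get(match.group(2))
        # Fall back to the original text when there is no mapping
        return match.group(1) + mapped if mapped else match.group(0)

    return TYPE_CLAUSE.sub(swap, line, count=1)

demo_mapping = {"ZCRM_NAME": "C LENGTH 40"}  # hypothetical mapping entry
print(remap_type("DATA lv_name TYPE ZCRM_NAME.", demo_mapping))  # DATA lv_name TYPE C LENGTH 40.
print(remap_type("DATA lv_count TYPE i.", demo_mapping))         # DATA lv_count TYPE i.
print(remap_type("WRITE lv_name.", demo_mapping))                # WRITE lv_name.
```

Using a function as the replacement means the mapped text is inserted literally, so backslashes or digits in a mapped declaration cannot be misread as group references.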

In [None]:
remediated_code = list(sample_code)  # a shallow copy is enough for a list of strings

# Match the type name only, so the trailing statement terminator ('.') is preserved
type_clause = re.compile(r'(\bTYPE\s+)(\w+(?:\.\w+)*)', re.IGNORECASE)

for i, line in enumerate(remediated_code):
    if re.search(r'\bDATA\b', line, re.IGNORECASE):
        match = type_clause.search(line)
        if match and match.group(2) in mapping_results:
            mapped_decl = mapping_results[match.group(2)]
            # A function replacement inserts mapped_decl literally, so backslashes
            # or leading digits in it are not read as group references
            remediated_code[i] = type_clause.sub(
                lambda m: m.group(1) + mapped_decl, line, count=1)

print('--- Remediated Code ---')
for l in remediated_code:
    print(l)

## Summary
- Extracted SELECT queries, Data Types, and Function Names
- Mapped and replaced Data Types in code using the mapping sheet, only where mapping found
- Output is a remediated code list (ready for review or export)
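
For the review step, a unified diff between the original and remediated lines is often easier to read than two full listings. The following sketch uses the standard library's `difflib`; the helper name `review_diff` and the two demo lists are hypothetical.

```python
import difflib

def review_diff(before, after):
    """Return a unified diff of two lists of code lines as one string."""
    diff = difflib.unified_diff(
        before, after, fromfile="original", tofile="remediated", lineterm=""
    )
    return "\n".join(diff)

# Hypothetical before/after lists standing in for sample_code and remediated_code
demo_before = ["DATA lv_name TYPE ZCRM_NAME.", "WRITE lv_name."]
demo_after = ["DATA lv_name TYPE C LENGTH 40.", "WRITE lv_name."]

print(review_diff(demo_before, demo_after))
```

When nothing was remediated, the function returns an empty string.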
