<a href="https://colab.research.google.com/github/odanielgp/sudolang-file-merger/blob/main/SudoLang_XML_document_merger.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Text File Merger for SudoLang Compliance

**Description:**
This notebook provides a small widget-based interface for merging multiple text files into a single XML file in the format that [SudoLang](https://github.com/paralleldrive/sudolang-llm-support/blob/main/sudolang.sudo.md) requires for including files in an instruction or prompt. The code is designed to run in a Google Colab environment.

The main functionality of the code is as follows:

1. File Upload:
   - The user is presented with an "Upload Files" button.
   - Upon clicking the button, the user can select multiple text files to upload.
   - The uploaded files are stored in a dictionary, with the file names as keys and the raw file contents (bytes) as values.

2. File Processing:
   - After the files are uploaded successfully, the "Process Files" button becomes enabled.
   - When the user clicks the "Process Files" button, the code processes the uploaded files.
   - Each file is converted into a `<document>` element with a 1-based `index` attribute. The file name goes in the `<source>` element and the UTF-8-decoded file content goes in the `<document_content>` element.
   - The XML structures for each file are stored in a list.

3. XML Structure Generation:
   - The code then generates the complete XML structure by combining the individual file XML structures.
   - The XML structures are joined together and wrapped with the `<documents>` root element.
   - The generated XML structure is displayed in the output area of the user interface.

4. File Download:
   - After the XML structure is generated, the "Download XML" button becomes enabled.
   - When the user clicks the "Download XML" button, the code initiates the download of the generated XML file.
   - The XML file is saved with the name "merged_files.xml" and automatically downloaded through the Colab interface.
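
As a rough illustration of steps 2 and 3, the sketch below builds the merged structure from an in-memory dictionary instead of a Colab upload, so it runs anywhere. The file names and contents are made up, and `merge_to_xml` is a hypothetical, condensed stand-in for the notebook's own functions; its whitespace is slightly tighter than what the notebook produces.

```python
# Hypothetical stand-in for the Colab upload result: file name -> raw bytes.
sample_uploaded = {
    "notes.txt": b"First file.",
    "todo.txt": b"Second file.",
}

def merge_to_xml(uploaded):
    """Wrap each file in a <document> element and join them under <documents>."""
    documents = []
    for i, (file_name, file_data) in enumerate(uploaded.items(), start=1):
        # Uploaded contents arrive as bytes, so decode them before embedding.
        file_content = file_data.decode("utf-8")
        documents.append(
            f'<document index="{i}">\n'
            f"<source>{file_name}</source>\n"
            f"<document_content>{file_content}</document_content>\n"
            "</document>"
        )
    return "<documents>\n" + "\n".join(documents) + "\n</documents>"

merged = merge_to_xml(sample_uploaded)
print(merged)
```

Each file becomes one numbered `<document>` entry, in the order the dictionary yields them.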

The code uses the `ipywidgets` library to create an interactive user interface. The interface consists of the following components:
- Title label: Displays the title of the application.
- Upload Files button: Allows the user to select and upload text files.
- Process Files button: Initiates the processing of uploaded files and generates the XML structure.
- Download XML button: Enables the user to download the generated XML file.
- Output area: Displays relevant messages and the generated XML structure.

In [7]:
from google.colab import files
import ipywidgets as widgets
from IPython.display import display

def process_uploaded_files(uploaded):
    """
    Process the uploaded files and create an XML structure for each file.

    Args:
        uploaded (dict): Maps each uploaded file name (str) to its raw content (bytes).

    Returns:
        list: A list of XML structures for each uploaded file.
    """
    documents = []
    for i, (file_name, file_data) in enumerate(uploaded.items(), start=1):
        file_content = file_data.decode("utf-8")
        document = f'''
<document index="{i}">
<source>{file_name}</source>
<document_content>{file_content}</document_content>
</document>
'''
        documents.append(document)
    return documents

def generate_xml_structure(documents):
    """
    Generate the complete XML structure by combining the individual document XML structures.

    Args:
        documents (list): A list of XML structures for each document.

    Returns:
        str: The complete XML structure.
    """
    xml_structure = "<documents>\n"
    xml_structure += "\n".join(documents)
    xml_structure += "\n</documents>"
    return xml_structure

def download_file(file_name, file_content):
    """
    Download a file with the given file name and content.

    Args:
        file_name (str): The name of the file to be downloaded.
        file_content (str): The content of the file to be downloaded.
    """
    # Write as UTF-8 explicitly so the output matches the decoding used on upload.
    with open(file_name, "w", encoding="utf-8") as file:
        file.write(file_content)
    files.download(file_name)

def on_upload_button_clicked(b):
    """
    Event handler for the "Upload Files" button click.

    Args:
        b (ipywidgets.Button): The button that triggered the event.
    """
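    global uploaded
    uploaded = files.upload()
    output_area.clear_output()
    with output_area:
        if uploaded:
            print("Files uploaded successfully!")
        else:
            print("No files were uploaded.")
    # Only enable processing when at least one file is available.
    process_button.disabled = not uploaded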

def on_process_button_clicked(b):
    """
    Event handler for the "Process Files" button click.

    Args:
        b (ipywidgets.Button): The button that triggered the event.
    """
    global xml_output
    documents = process_uploaded_files(uploaded)
    xml_output = generate_xml_structure(documents)
    output_area.clear_output()
    with output_area:
        print("Generated XML structure:")
        print(xml_output)
    download_button.disabled = False

def on_download_button_clicked(b):
    """
    Event handler for the "Download XML" button click.

    Args:
        b (ipywidgets.Button): The button that triggered the event.
    """
    file_name = "merged_files.xml"
    download_file(file_name, xml_output)
    output_area.clear_output()
    with output_area:
        print(f"The XML file '{file_name}' has been downloaded.")

# Create UI components
title_label = widgets.Label(value="Text File Merger", style={'description_width': 'initial'})
upload_button = widgets.Button(description="Upload Files")
process_button = widgets.Button(description="Process Files", disabled=True)
download_button = widgets.Button(description="Download XML", disabled=True)
output_area = widgets.Output()

# Set up event handlers
upload_button.on_click(on_upload_button_clicked)
process_button.on_click(on_process_button_clicked)
download_button.on_click(on_download_button_clicked)

# Create the UI layout
ui_components = [
    title_label,
    upload_button,
    process_button,
    download_button,
    output_area
]

ui = widgets.VBox(ui_components)

# Display the UI
display(ui)
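
One caveat with the code above: file names and contents are inserted verbatim, so a file containing `<`, `>`, or `&` produces output that a strict XML parser would reject. Whether that matters depends on how the merged file is consumed. If well-formed XML is needed, a variant along the following lines could be used. This is an assumption, not part of the original notebook: `build_escaped_document` is a hypothetical helper that escapes text with the standard library's `xml.sax.saxutils.escape`, and the result is checked with `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def build_escaped_document(index, file_name, file_content):
    """Return one <document> element with XML special characters escaped."""
    return (
        f'<document index="{index}">\n'
        f"<source>{escape(file_name)}</source>\n"
        f"<document_content>{escape(file_content)}</document_content>\n"
        "</document>"
    )

# Made-up content containing characters that are special in XML.
tricky = "if a < b && b > c: pass"
xml_text = (
    "<documents>\n"
    + build_escaped_document(1, "code.txt", tricky)
    + "\n</documents>"
)

# Parsing succeeds only if the text is well-formed; the parser also
# converts the escaped entities back to the original characters.
root = ET.fromstring(xml_text)
recovered = root.find("document/document_content").text
print(recovered == tricky)
```

The trade-off is that the stored text now contains entities such as `&lt;`, so a consumer that reads the file as plain text rather than parsing it will see the escaped form.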
