# 🎯 **Step 0: Installing Required Packages**
---
To run this notebook, you need to install the following Python packages:

```bash
pip install langchain-community reportlab python-dotenv
pip install -U langchain-openai
```

In [None]:
!pip install -q langchain-community reportlab python-dotenv
!pip install -U -q langchain-openai

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


# 🎯 **Step 1: Importing Required Libraries**
---

In [None]:
from abc import ABC, abstractmethod
from reportlab.lib.pagesizes import A4
from reportlab.platypus import BaseDocTemplate, PageTemplate, Frame, Paragraph, Spacer, HRFlowable, Image
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.enums import TA_LEFT
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.lib.units import inch, cm
import re
from google.colab import userdata
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.memory import ConversationBufferMemory, ConversationSummaryMemory
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading

# 🎯 **Step 2: Abstract Base Class for Document Generators**
---

In [None]:
class DocumentGenerator(ABC):
    """
    Abstract base class for document generators that defines a consistent interface
    for generating documents based on user input prompts.

    This class should be subclassed by concrete implementations that provide the
    actual logic for generating documents. It enforces the implementation of the
    `generate` method which takes a user prompt and an output filename.

    Methods:
        generate(user_prompt: str, output_filename: str) -> None:
            Abstract method to be implemented by subclasses to handle the document
            generation process based on the user's input.

    Raises:
        TypeError: If a subclass that does not implement `generate` is instantiated.
    """

    @abstractmethod
    def generate(self, user_prompt, output_filename):
        """
        Abstract method to generate a document based on the given user prompt and
        save it to the specified output filename.

        Parameters:
            user_prompt (str): The user's input prompt used for generating content.
            output_filename (str): The filename where the generated document will be saved.

        Note:
            Subclasses must override this method; the base implementation does nothing.
        """
        pass
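A minimal sketch of how the abstract interface behaves in practice. `IncompleteGenerator`, `EchoGenerator`, and `try_instantiate` are hypothetical names used only for illustration; the base class is redefined here in reduced form so the snippet runs on its own.

```python
from abc import ABC, abstractmethod


class DocumentGenerator(ABC):
    """Reduced stand-in for the abstract base class defined in this step."""

    @abstractmethod
    def generate(self, user_prompt, output_filename):
        ...


class IncompleteGenerator(DocumentGenerator):
    """Hypothetical subclass that forgets to implement generate()."""


class EchoGenerator(DocumentGenerator):
    """Hypothetical subclass that only records what it was asked to do."""

    def __init__(self):
        self.calls = []

    def generate(self, user_prompt, output_filename):
        self.calls.append((user_prompt, output_filename))


def try_instantiate(cls):
    """Return an instance, or the TypeError raised by the ABC machinery."""
    try:
        return cls()
    except TypeError as exc:
        return exc


incomplete = try_instantiate(IncompleteGenerator)  # a TypeError instance
complete = try_instantiate(EchoGenerator)          # a usable generator
complete.generate("Tulis artikel tentang AI", "ai.pdf")
```

The check happens at instantiation time, so a missing `generate` is caught before any document work starts.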

# 🎯 **Step 3: Custom Style Manager for PDF Document Generation**
---

In [None]:
class StyleManager:
    """
    Manages and applies custom paragraph styles and fonts for PDF document generation.

    This class is responsible for registering fonts, creating paragraph styles (such as
    headers and body text), and exposing them for use in document rendering using
    ReportLab. It ensures consistent typography and styling throughout the generated PDF.

    Attributes:
        styles (StyleSheet1): A ReportLab stylesheet containing predefined styles.

    Methods:
        register_font(font_name: str, font_path: str) -> None:
            Registers a font to be used in the document styles.

        get_styles() -> StyleSheet1:
            Returns the stylesheet containing all custom paragraph styles.
    """

    def __init__(self):
        """
        Initializes the StyleManager by setting up the stylesheet,
        registering required fonts, and defining custom styles.
        """
        self.styles = getSampleStyleSheet()
        self.__register_fonts()
        self.__create_custom_styles()

    @classmethod
    def register_font(cls, font_name: str, font_path: str):
        """
        Registers a new font for use in PDF generation.

        Parameters:
            font_name (str): The name to assign to the registered font.
            font_path (str): The file path to the font file (.ttf).
        """
        pdfmetrics.registerFont(TTFont(font_name, font_path))


    def __register_fonts(self):
        """
        Internal method to register all required fonts used in the custom styles.
        """
        StyleManager.register_font('Poppins', '/content/drive/MyDrive/DIBIMBING/DATA ENGINEER/PROJECT/Day 2 - Assignment - Python/Fonts/Poppins-Regular.ttf')
        StyleManager.register_font('Poppins-Bold', '/content/drive/MyDrive/DIBIMBING/DATA ENGINEER/PROJECT/Day 2 - Assignment - Python/Fonts/Poppins-Bold.ttf')
        StyleManager.register_font('Poppins-Italic', '/content/drive/MyDrive/DIBIMBING/DATA ENGINEER/PROJECT/Day 2 - Assignment - Python/Fonts/Poppins-Italic.ttf')

    def __create_custom_styles(self):
        """
        Internal method to define and add custom paragraph styles
        (e.g., headers, body text, and title styles) to the stylesheet.
        """
        self.styles.add(ParagraphStyle(
            name='CustomBodyText',
            fontName='Poppins',
            fontSize=11,
            leading=14,
            alignment=TA_LEFT,
            spaceAfter=6
        ))
        self.styles.add(ParagraphStyle(
            name='Header1',
            fontName='Poppins-Bold',
            fontSize=18,
            textColor='#1a356e',
            spaceAfter=6,
            spaceBefore=12,
            leading=22
        ))
        self.styles.add(ParagraphStyle(
            name='Header2',
            fontName='Poppins-Bold',
            fontSize=14,
            textColor='#1a356e',
            spaceAfter=8,
            spaceBefore=8,
            leading=18
        ))
        self.styles.add(ParagraphStyle(
            name='Header3',
            fontName='Poppins-Bold',
            fontSize=12,
            textColor='#1a356e',
            spaceAfter=6,
            spaceBefore=6,
            leading=16
        ))
        self.styles.add(ParagraphStyle(
            name='TitleLarge',
            fontName='Poppins-Bold',
            fontSize=24,
            alignment=TA_LEFT,
            spaceAfter=14,
            leading=28
        ))

    def get_styles(self):
        """
        Returns the customized stylesheet for use in PDF document generation.

        Returns:
            StyleSheet1: A ReportLab stylesheet containing all defined styles.
        """
        return self.styles
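Because the fonts are loaded from fixed Google Drive paths, it can help to confirm the files exist before registering them. The sketch below uses only the standard library; `missing_font_files` is a hypothetical helper, and the temporary directory stands in for the real font folder.

```python
from pathlib import Path
import tempfile


def missing_font_files(font_paths: dict) -> list:
    """Return the font names whose file path does not point to an existing file."""
    return sorted(name for name, path in font_paths.items()
                  if not Path(path).is_file())


with tempfile.TemporaryDirectory() as tmp:
    # Create one placeholder file; the second path is left missing on purpose.
    regular = Path(tmp) / "Poppins-Regular.ttf"
    regular.write_bytes(b"")
    fonts = {
        "Poppins": str(regular),
        "Poppins-Bold": str(Path(tmp) / "Poppins-Bold.ttf"),
    }
    missing = missing_font_files(fonts)
```

Running such a check first gives a clear list of missing fonts instead of an error raised from inside the font loader.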

# 🎯 **Step 4: PDF Document Template with Custom Header and Footer**
---

In [None]:
class DocumentTemplate:
    """
    Represents a document template that includes a customizable header and footer.

    This class sets up a PDF layout using ReportLab's BaseDocTemplate. It includes
    configuration for document metadata, margins, frames, and predefined header/footer
    layout including a logo, title, and page number.

    Attributes:
        filename (str): The target output filename for the generated PDF.
        title (str): The title of the document, used in headers or metadata.
        theme_general (str): The general theme of the document, displayed in the header.
        metadata (str): Metadata for the document (e.g., author, date).
        styles (StyleSheet1): A ReportLab stylesheet instance with custom styles.
        doc (BaseDocTemplate): The constructed document template object.

    Methods:
        header_footer(canvas, doc):
            Adds a header and footer to each page, including logo, theme, and page number.
    """

    def __init__(self, filename: str, title: str, theme_general: str, metadata: str):
        """
        Initializes the DocumentTemplate with metadata and prepares styles and layout.

        Parameters:
            filename (str): The filename for the document.
            title (str): The title of the document.
            theme_general (str): The general theme of the document.
            metadata (str): The metadata for the document.
        """
        self.__filename = filename
        self.title = title
        self.theme_general = theme_general
        self.metadata = metadata
        self.styles = StyleManager().get_styles()
        self.__doc = self.__create_template()

    @property
    def filename(self) -> str:
        """
        Returns the filename where the PDF will be saved.

        Returns:
            str: The document filename.
        """
        return self.__filename

    @property
    def doc(self):
        """
        Returns the constructed BaseDocTemplate object.

        Returns:
            BaseDocTemplate: The document template with configured layout.
        """
        return self.__doc

    def __create_template(self):
        """
        Creates and returns the BaseDocTemplate with margin, frame, and header/footer template.

        Returns:
            BaseDocTemplate: The created document template.
        """
        frame = Frame(
            2 * cm,
            2.5 * cm,
            A4[0] - 4 * cm,
            A4[1] - 5 * cm,
            id='normal'
        )

        page_template = PageTemplate(
            id='PageWithHF',
            frames=frame,
            onPage=self.header_footer
        )

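        # Attach the page template so the frame and header/footer are actually used.
        return BaseDocTemplate(
            self.filename,
            pagesize=A4,
            pageTemplates=[page_template],
            leftMargin=1.2*inch,
            rightMargin=1.2*inch,
            topMargin=1.5*inch,
            bottomMargin=1*inch
        )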

    def header_footer(self, canvas, doc):
        """
        Draws the header and footer for each page of the PDF.

        Header includes a logo and the theme. Footer includes the page number.

        Parameters:
            canvas (Canvas): The canvas to draw on.
            doc (BaseDocTemplate): The document being generated.
        """
        canvas.saveState()

        logo_path = '/content/drive/MyDrive/DIBIMBING/DATA ENGINEER/PROJECT/Day 2 - Assignment - Python/Logo/Logo.png'
        try:
            logo = Image(logo_path, width=1*inch, height=1*inch)
            logo.drawOn(canvas, 2 * cm, A4[1] - 1.5 * cm)
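        except Exception as exc:
            # Any failure to load or draw the logo is non-fatal; the page is still rendered.
            print(f"Warning: could not draw logo, skipping it ({exc}).")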

        canvas.setFont('Poppins', 10)
        canvas.drawRightString(A4[0] - 2 * cm, A4[1] - 1.5 * cm, f"Tema: {self.theme_general}")

        canvas.setFont('Poppins', 10)
        canvas.drawRightString(A4[0] - 2 * cm, 1.5 * cm, f"Halaman {doc.page}")

        canvas.restoreState()
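The frame geometry used by the template can be checked with plain arithmetic. The sketch below recomputes the content box in points without importing ReportLab, assuming the usual definitions of 72 points per inch and an A4 page of 21.0 cm by 29.7 cm; `frame_box` is a hypothetical helper.

```python
INCH = 72.0                      # points per inch
CM = INCH / 2.54                 # points per centimetre
A4 = (21.0 * CM, 29.7 * CM)      # page width and height in points


def frame_box(page_size, left, bottom, right, top):
    """Return (x, y, width, height) of the content frame inside the page."""
    width = page_size[0] - left - right
    height = page_size[1] - bottom - top
    return left, bottom, width, height


# Same numbers as the Frame in the template: 2 cm side gaps, 2.5 cm top and bottom.
x, y, width, height = frame_box(A4, 2 * CM, 2.5 * CM, 2 * CM, 2.5 * CM)
```

With these values the body text occupies a 17 cm by 24.7 cm box, leaving 2.5 cm at the top and bottom for the header and footer drawn in `header_footer`. Since the frame is built from explicit coordinates, it is this box, rather than the margin arguments, that determines where content flows.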

# 🎯 **Step 5: Memory Manager for LLM-Based Report Generation**
---

In [None]:
class MemoryManager:
    """
    Manages conversation memory for AI interactions using buffer and summary memory.

    This class helps maintain context across interactions with a language model (LLM)
    by saving user inputs and AI outputs. It supports both short-term (buffered)
    and long-term (summarized) memory using Langchain's memory utilities.

    Attributes:
        llm: The language model instance used for summarization.
        buffer_memory (ConversationBufferMemory): Stores recent conversation history.
        summary_memory (ConversationSummaryMemory): Stores a summarized version of the conversation.
        __sections_content (list): Internal list to store AI-generated sections.

    Methods:
        update_memory(human_input, ai_output):
            Saves a new pair of user and AI messages to both memory types.

        clear_memory():
            Clears both buffer and summary memory along with internal content cache.

        memory_context:
            Returns the concatenated context from stored AI outputs.
    """

    def __init__(self, llm):
        """
        Initializes the MemoryManager with a language model for memory handling.

        Parameters:
            llm: The language model used for AI interactions and memory summarization.
        """
        self.llm = llm
        self.buffer_memory = ConversationBufferMemory()
        self.summary_memory = ConversationSummaryMemory(llm=llm)
        self.__sections_content = []

    def update_memory(self, human_input: str, ai_output: str) -> bool:
        """
        Updates the memory with a new human input and AI output.

        Parameters:
            human_input (str): The input provided by the user.
            ai_output (str): The response generated by the AI.

        Returns:
            bool: True if memory update was successful, False otherwise.
        """
        try:
            self.buffer_memory.save_context(
                inputs={"human_input": human_input},
                outputs={"ai_output": ai_output}
            )
            self.summary_memory.save_context(
                inputs={"human_input": human_input},
                outputs={"ai_output": ai_output}
            )
            self.__sections_content.append(ai_output)
            return True
        except Exception as e:
            print(f"Error updating memory: {str(e)}")
            return False

    @property
    def memory_context(self) -> str:
        """
        Returns the concatenated memory context consisting of previous AI outputs.

        Returns:
            str: Combined string of stored AI-generated sections.
        """
        return "\n".join(self.__sections_content)

    def clear_memory(self) -> bool:
        """
        Clears both the buffer and summary memory, along with internal section storage.

        Returns:
            bool: True if all memory was cleared successfully, False otherwise.
        """
        try:
            self.buffer_memory.clear()
            self.summary_memory.clear()
            self.__sections_content = []
            return True
        except Exception as e:
            print(f"Error clearing memory: {str(e)}")
            return False
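The way `memory_context` is assembled can be illustrated without LangChain. `SectionMemory` below is a hypothetical stand-in that keeps only the list of AI outputs; it is a sketch of the accumulation logic, not a replacement for the buffer and summary memories.

```python
class SectionMemory:
    """Hypothetical stand-in for MemoryManager without the LangChain memories."""

    def __init__(self):
        self._sections = []

    def update(self, human_input: str, ai_output: str) -> None:
        # Only the AI output feeds the context string, as in MemoryManager.
        self._sections.append(ai_output)

    @property
    def context(self) -> str:
        return "\n".join(self._sections)

    def clear(self) -> None:
        self._sections = []


memory = SectionMemory()
memory.update("Tulis bagian 1. Pengantar", "AI berkembang pesat.")
memory.update("Tulis bagian 1.1 Apa itu AI?", "AI adalah cabang ilmu komputer.")
context_before = memory.context
memory.clear()
context_after = memory.context
```

The context grows with every section written, so the prompt sent to later agents grows too; that is worth keeping in mind for long documents.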

# 🎯 **Step 6: Quality Controller for Section Review in AI-Generated Documents**
---

In [None]:
class QualityController:
    """
    Controls the quality of generated document sections using an external QC agent.

    This class is responsible for checking whether each section in a generated
    document meets quality standards. It communicates with a quality control agent,
    maintains a log of checks, and can export a summary report.

    Attributes:
        qc_agent: The agent used to evaluate content quality.
        document_title (str): The title of the document being reviewed.
        memory_manager: The memory manager for contextual information.
        qc_logs (list): Stores the logs of quality check results.

    Methods:
        check_section(heading, content, previous_heading):
            Evaluates the quality of a document section and logs the result.

        generate_report(filename):
            Saves a report file summarizing the quality control checks.

        __extract_text(content_elements):
            Extracts plain text from a list of paragraph elements.

        __parse_result(result):
            Parses the QC result into a status and note.
    """

    def __init__(self, qc_agent, document_title: str, memory_manager):
        """
        Initializes the QualityController.

        Parameters:
            qc_agent: The quality control agent.
            document_title (str): The title of the document.
            memory_manager: The memory manager for conversational context.
        """
        self.qc_agent = qc_agent
        self.document_title = document_title
        self.memory_manager = memory_manager
        self.qc_logs = []

    def check_section(self, heading: str, content, previous_heading: str = None) -> bool:
        """
        Checks the quality of a section using the quality control agent.

        Parameters:
            heading (str): The current section heading.
            content: A list of paragraph elements (reportlab Paragraphs).
            previous_heading (str, optional): Heading of the previous section.

        Returns:
            bool: True if the section passes the quality check, False otherwise.
        """
        try:
            raw_content = self.__extract_text(content)
            memory_context = self.memory_manager.memory_context

            context = f"Judul Utama: {self.document_title}"
            if previous_heading:
                context += f"\nHeading Sebelumnya: {previous_heading}"
            context += f"\n{memory_context}"

            result = self.qc_agent.invoke({
                "heading": heading,
                "content": raw_content,
                "memory_context": memory_context,
                "context": context
            }).content

            status, notes = self.__parse_result(result)
            self.qc_logs.append({
                'status': status,
                'heading': heading,
                'content': raw_content[:500] + "..." if len(raw_content) > 500 else raw_content,
                'notes': notes
            })
            return status.upper() == "LULUS"
        except Exception as e:
            print(f"QC Error: {str(e)}")
            return False

    def generate_report(self, filename: str = "qc_report.txt"):
        """
        Generates a quality control report and saves it to a file.

        Parameters:
            filename (str, optional): The name of the file to write the report to.
        """
        with open(filename, "w", encoding="utf-8") as f:
            f.write("=== LAPORAN QUALITY CONTROL ===\n")
            f.write(f"Judul Dokumen: {self.document_title}\n")
            f.write(f"Total Section: {len(self.qc_logs)}\n")
            passed = sum(1 for log in self.qc_logs if log['status'].upper() == "LULUS")
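            # Guard against division by zero when no section has been checked yet.
            pass_rate = (passed / len(self.qc_logs)) * 100 if self.qc_logs else 0.0
            f.write(f"Pass Rate: {pass_rate:.1f}%\n\n")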
            for idx, log in enumerate(self.qc_logs, 1):
                f.write(f"Section {idx}:\n")
                f.write(f"Judul: {log['heading']}\n")
                f.write(f"Status: {log['status']}\n")
                f.write(f"Catatan: {log['notes']}\n")
                f.write("-"*50 + "\n")

    def __extract_text(self, content_elements):
        """
        Extracts plain text from a list of paragraph elements.

        Parameters:
            content_elements: A list of Paragraph instances.

        Returns:
            str: The combined text from all paragraphs.
        """
        return "\n".join([p.getPlainText() for p in content_elements if isinstance(p, Paragraph)])

    def __parse_result(self, result: str):
        """
        Parses the result of the quality check to extract status and notes.

        Parameters:
            result (str): The response string from the QC agent.

        Returns:
            tuple: (status, notes), where both are strings.
        """
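        # The status ends at the [CATATAN] marker, a newline, or the end of the string,
        # so single-line responses such as "[STATUS] LULUS [CATATAN] ..." parse correctly.
        status_match = re.search(r'\[STATUS\]\s*(.+?)\s*(?=\[CATATAN\]|\n|$)', result)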
        notes_match = re.search(r'\[CATATAN\]\s*(.+)', result, re.DOTALL)
        status = status_match.group(1).strip() if status_match else "TIDAK LULUS"
        notes = notes_match.group(1).strip() if notes_match else "Tidak ada catatan"
        return status, notes
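The QC prompt asks for `[STATUS] ... [CATATAN] ...`, and the model may return both markers on one line or on separate lines. The sketch below shows one way to parse either layout; `parse_qc_result` is a hypothetical standalone function, and the sample responses are made up for illustration.

```python
import re


def parse_qc_result(result: str):
    """Split a QC response into (status, notes), tolerating one-line responses."""
    # Stop the status at the [CATATAN] marker, a newline, or the end of the string.
    status_match = re.search(r'\[STATUS\]\s*(.+?)\s*(?=\[CATATAN\]|\n|$)', result)
    notes_match = re.search(r'\[CATATAN\]\s*(.+)', result, re.DOTALL)
    status = status_match.group(1).strip() if status_match else "TIDAK LULUS"
    notes = notes_match.group(1).strip() if notes_match else "Tidak ada catatan"
    return status, notes


one_line = parse_qc_result("[STATUS] LULUS [CATATAN] Konten relevan.")
two_lines = parse_qc_result("[STATUS] TIDAK LULUS\n[CATATAN] Ada typo.\nPerbaiki.")
no_markers = parse_qc_result("tanpa format")
```

A response without markers falls back to a failing status, which keeps unparseable output from being counted as a pass.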

# 🎯 **Step 7: Agent Manager for Modular AI-Driven Document Workflow**
---

In [None]:
class AgentManager:
    """
    Manages multiple AI agents for different document generation tasks.

    This class serves as a central hub for initializing and retrieving agents
    responsible for various stages of document generation, such as metadata extraction,
    theme generation, outline creation, content writing, and quality control.

    Attributes:
        llm: The language model used to power all agents.
        agents (dict): A dictionary of agents mapped by task name.

    Methods:
        __initialize_agents():
            Initializes and stores all agents in the manager.

        __create_chain(system_msg, human_template):
            Constructs a single agent chain using a system and human message prompt.
    """

    def __init__(self, llm):
        """
        Initializes the AgentManager with a language model.

        Parameters:
            llm: The language model instance used for creating agent chains.
        """
        self.llm = llm
        self.agents = self.__initialize_agents()

    def __initialize_agents(self):
        """
        Initializes predefined agents for each document processing task.

        Returns:
            dict: A dictionary mapping task names to their respective agents.
        """
        return {
            "metadata": self.__create_chain(
                "Sistem ekstraksi metadata dokumen",
                """
                    (C) Context:
                    Kamu adalah asisten AI yang bertugas mengekstraksi metadata dari permintaan pengguna. Metadata akan digunakan untuk membangun struktur dan konten dokumen.

                    (O) Objective:
                    Ekstrak informasi penting dari permintaan pengguna dan kembalikan dalam format yang terstruktur. Informasi yang perlu diekstrak:
                    - **Judul**: Judul utama dokumen
                    - **Tema**: Tema umum yang akan diangkat
                    - **Bahasa**: Bahasa yang digunakan (misalnya: Bahasa Indonesia, English)
                    - **Jumlah Halaman**: Jumlah halaman yang diminta (jika ada)
                    - **Tipe Konten**: Tipe konten (misalnya: akademik, blog, artikel)

                    (S) Style:
                    Gunakan gaya penulisan yang singkat, jelas, dan terstruktur. Hindari penggunaan bahasa yang ambigu.

                    (T) Tone:
                    Netral, profesional, dan informatif.

                    (A) Audience:
                    Sistem generasi dokumen dan pengguna akhir (misalnya: penulis, pembaca).

                    (R) Response:
                    Berikan hasil dalam format terstruktur seperti berikut:
                    - Judul: [Judul]
                    - Tema: [Tema]
                    - Bahasa: [Bahasa]
                    - Jumlah Halaman: [Jumlah Halaman]
                    - Tipe Konten: [Tipe Konten]

                    Contoh:
                    - Judul: Perkembangan Kecerdasan Buatan 2024
                    - Tema: Teknologi AI
                    - Bahasa: Bahasa Indonesia
                    - Jumlah Halaman: 1
                    - Tipe Konten: Artikel

                    Input Pengguna:
                    {user_prompt}
                """
            ),
            "theme": self.__create_chain(
                "Pembuat tema umum",
                """
                    (C) Context:
                    Kamu adalah asisten AI yang bertugas membuat tema umum untuk konten.

                    (O) Objective:
                    Buat tema umum berdasarkan judul berikut: {title}. Tema harus berbeda dari judul dan terdiri dari 2-3 kata.

                    (S) Style:
                    Gunakan Bahasa Indonesia formal.

                    (T) Tone:
                    Netral dan profesional.

                    (A) Audience:
                    Pembaca umum, blogger, atau penulis artikel.

                    (R) Response:
                    Berikan tema umum dalam format teks sederhana.
                """
            ),
            "structure": self.__create_chain(
                "Pembuat outline konten",
                """
                    (C) Context:
                    Kamu adalah asisten AI yang bertugas menyusun struktur konten untuk blog atau artikel.

                    (O) Objective:
                    Buat outline konten dari metadata berikut: {metadata}. Struktur harus fleksibel, informatif, dan menarik. Gunakan format header yang diberi nomor (contoh: `1. Pengantar`, `1.1 Apa itu AI?`).

                    (S) Style:
                    Gunakan Bahasa Indonesia yang ringkas dan menarik. Konten harus sesuai dengan gaya blog atau artikel.

                    (T) Tone:
                    Ramah, informatif, dan sedikit kasual.

                    (A) Audience:
                    Pembaca umum, blogger, atau penulis artikel.

                    (R) Response:
                    Tampilkan outline dalam format markdown heading:
                    - Gunakan heading level 1 dengan tanda `#` untuk bagian utama (misalnya: `# 1. Pengantar`)
                    - Gunakan heading level 2 dengan tanda `##` untuk subbagian (misalnya: `## 1.1 Apa itu AI?`)
                    - Maksimum dua level saja (contoh: `# 2.`, `## 2.1`) — tidak lebih dalam
                    - Gunakan penomoran yang konsisten dan rapi

                    Contoh format:
                    # 1. Pengantar
                    ## 1.1 Apa itu AI?
                    ## 1.2 Mengapa AI Penting?
                    # 2. Pengembangan AI
                    ...
                """
            ),
            "content": self.__create_chain(
                "Penulis konten",
                """
                    (C) Context:
                    {memory_context}

                    Kamu adalah penulis konten AI yang bertugas menulis isi dari sebuah artikel atau blog, dengan struktur heading yang sudah ditentukan sebelumnya.

                    (O) Objective:
                    Tuliskan isi dari bagian berikut: {section}. Panjang maksimal: {max_words} kata. Fokus pada isi, jangan buat heading. Pastikan konten informatif, menarik, dan mudah dipahami. Konten harus konsisten dengan konteks sebelumnya dan judul utama.

                    (S) Style:
                    Bahasa Indonesia yang ringkas dan menarik, dengan gaya penulisan yang sesuai untuk blog atau artikel.

                    (T) Tone:
                    Ramah, informatif, dan sedikit kasual.

                    (A) Audience:
                    Pembaca umum, blogger, atau penulis artikel.

                    (R) Response:
                    Kembalikan hasil dalam format markdown, TANPA heading. Gunakan aturan berikut:
                    - Paragraf harus terdiri dari 3-5 kalimat
                    - Gunakan bullet (`-`) jika perlu menjabarkan poin
                    - Gunakan **bold** untuk menekankan istilah penting atau konsep utama
                    - Gunakan *italic* hanya untuk istilah asing atau serapan jika relevan
                    - Jangan menambahkan referensi atau sumber
                """
            ),
            "qc": self.__create_chain(
                "Quality Control konten",
                """
                    (C) Context:
                    {memory_context}

                    Kamu adalah editor quality control (QC) untuk konten blog atau artikel dalam format markdown. Heading pada konten sudah ditentukan sebelumnya, dan kamu akan menerima teks isi dari setiap bagian TANPA heading-nya.

                    (O) Objective:
                    Tugas utama kamu adalah:
                    - Memastikan isi konten memiliki **keterkaitan dan relevansi yang kuat dengan heading/topik bagian** (judul heading akan diberikan)
                    - Memastikan konten bebas dari kesalahan tata bahasa, ejaan, dan logika
                    - Menjaga konsistensi istilah, kejelasan alur, dan keterbacaan
                    - Tidak menambahkan konten baru atau mengubah struktur besar
                    - Tidak membuat atau mengedit heading

                    (S) Style:
                    - Bahasa Indonesia yang ringkas dan menarik
                    - Gunakan markdown bila perlu: **bold**, *italic*, list
                    - Jaga format dan struktur paragraf agar mudah dibaca
                    - Gunakan baris kosong antar paragraf

                    (T) Tone:
                    Ramah, informatif, dan mudah dipahami

                    (A) Audience:
                    Pembaca umum, blogger, atau penulis artikel

                    (I) Input:
                    Kamu akan diberikan dua hal:
                    1. Heading bagian sebagai konteks
                    2. Konten isi dari bagian tersebut (dalam format markdown)

                    (R) Response:
                    Berikan penilaian kualitas terhadap konten berdasarkan heading yang diberikan, dengan **format berikut**:
                    [STATUS] LULUS / TIDAK LULUS [CATATAN] Penjelasan alasan kelulusan atau ketidaklulusan, bisa berupa:

                    - Apakah konten sudah relevan dengan heading
                    - Catatan perbaikan jika ada (misalnya: typo, grammar, logika, dll)
                    - Saran tambahan untuk meningkatkan keterbacaan atau konsistensi

                    Catatan: Tidak perlu menambahkan atau menyunting heading maupun isi konten. Fokus hanya pada evaluasi QC.

                    ---
                    **Heading Bagian:** {heading}

                    **Konten:**
                    {content}
                """
            )
        }

    def __create_chain(self, system_msg, human_template):
        """
        Creates a chain for a specific agent using prompt templates.

        Parameters:
            system_msg (str): The system message to define agent behavior.
            human_template (str): The template for user inputs.

        Returns:
            A chain composed of prompt and LLM for execution.
        """
        prompt = ChatPromptTemplate.from_messages([
            ("system", system_msg),
            ("human", human_template)
        ])
        return prompt | self.llm
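Each agent's human template names the variables it needs in `{braces}`, so the inputs passed to `invoke` can be checked against the template beforehand. The sketch below does this with the standard library's `string.Formatter`; `template_variables` is a hypothetical helper, and the shortened template only mimics the placeholders of the QC prompt.

```python
from string import Formatter


def template_variables(template: str) -> set:
    """Collect the placeholder names that appear in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


# Shortened stand-in for the QC prompt; only the placeholders matter here.
qc_template = "{memory_context}\nHeading Bagian: {heading}\nKonten:\n{content}"

required = template_variables(qc_template)
supplied = {"heading", "content", "memory_context", "context"}

missing = required - supplied   # keys the template needs but the caller omits
unused = supplied - required    # keys the caller sends that the template ignores
```

Here nothing is missing, while `context` shows up as unused: a key that is passed in but never referenced by the template has no effect on the prompt.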

# 🎯 **Step 8: Metadata Handler for Metadata & Document Structure Parser**
---

In [None]:
class MetadataHandler:
    """
    Handles metadata extraction and processing tasks for document generation.

    This class provides static utility methods to extract titles and parse
    outline structures from raw metadata or user-defined strings. It's designed
    to assist in structuring document headers and organizing content sections.

    Methods:
        extract_title(metadata):
            Extracts the document title from raw metadata text.

        parse_outline(outline):
            Parses a markdown-style outline into structured sections.

        __detect_level(text):
            Detects the heading level based on markdown symbols (#).
    """

    @staticmethod
    def extract_title(metadata):
        """
        Extracts the title from a metadata string.

        Parameters:
            metadata (str): The raw metadata text containing the title.

        Returns:
            str: The extracted title if found, otherwise a default title.
        """
        match = re.search(r'Judul:\s*(.+)', metadata)
        if match:
            return match.group(1).strip()
        else:
            print("Peringatan: Metadata tidak mengandung judul, menggunakan judul default.")
            return "Dokumen Tanpa Judul"

    @staticmethod
    def parse_outline(outline):
        """
        Parses an outline string into structured section headings.

        Parameters:
            outline (str): A string containing markdown-style headings.

        Returns:
            list: A list of (heading_level, heading_line) tuples; each heading_line
                keeps its leading '#' markers.
        """
        sections = []
        for line in outline.split('\n'):
            line = line.strip()
            if not line:
                continue
            # Use the double-underscore name so name mangling resolves to the private helper.
            level = MetadataHandler.__detect_level(line)
            if level:
                sections.append((level, line))
        return sections

    @staticmethod
    def __detect_level(text):
        """
        Detects the markdown heading level based on '#' symbols.

        Parameters:
            text (str): The heading line to analyze.

        Returns:
            int or None: The heading level (1–3) or None if not a heading.
        """
        match = re.match(r'^#+', text)
        if match:
            level = len(match.group(0))
            return min(level, 3)
        return None
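A small sketch of what the title extraction and outline parsing produce. The functions below are standalone copies of the same logic with hypothetical names (`extract_title`, `detect_level`, `parse_outline` as module-level functions), and the sample metadata and outline are invented for illustration.

```python
import re


def extract_title(metadata: str) -> str:
    """Return the text after 'Judul:' or a default title when it is absent."""
    match = re.search(r'Judul:\s*(.+)', metadata)
    return match.group(1).strip() if match else "Dokumen Tanpa Judul"


def detect_level(text: str):
    """Return the heading level (capped at 3) or None for non-heading lines."""
    match = re.match(r'^#+', text)
    return min(len(match.group(0)), 3) if match else None


def parse_outline(outline: str) -> list:
    """Collect (level, line) pairs for every markdown heading in the outline."""
    sections = []
    for line in outline.split('\n'):
        line = line.strip()
        level = detect_level(line) if line else None
        if level:
            sections.append((level, line))
    return sections


title = extract_title("- Judul: Perkembangan AI\n- Tema: Teknologi")
sections = parse_outline("""
# 1. Pengantar
## 1.1 Apa itu AI?
Catatan tanpa heading
#### 1.1.1.1 Terlalu dalam
""")
```

Lines without a leading `#` are dropped, and headings deeper than three levels are treated as level 3.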

# 🎯 **Step 9: Multi Thread Manager for Executing Parallel Tasks Using ThreadPoolExecutor**
---

In [None]:
class MultiThreadManager:
    """
    MultiThreadManager — Executing Parallel Tasks Using ThreadPoolExecutor.

    This class handles the execution of multiple tasks in parallel using `ThreadPoolExecutor`.
    It is suitable for speeding up operations where multiple functions or processes can be run concurrently.

    Attributes:
        lock (threading.Lock): A lock object to ensure thread-safe access when collecting results.

    Methods:
        __init__(lock):
            Initializes the MultiThreadManager with a lock object.

        execute_parallel_tasks(tasks):
            Executes multiple functions in parallel and collects their results in a thread-safe manner.
    """

    def __init__(self, lock):
        """
        Initializes the MultiThreadManager with a lock object.

        Parameters:
            lock (threading.Lock): A lock object to ensure thread safety during shared data access.
        """
        self.lock = lock

    def execute_parallel_tasks(self, tasks):
        """
        Executes multiple functions in parallel and collects their results in a thread-safe manner.

        Parameters:
            tasks (list[Callable]): A list of functions to be executed concurrently.

        Returns:
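            list: The results of all executed functions, in the same order as `tasks`.
        """
        with ThreadPoolExecutor() as executor:
            # Submit every task up front so they run concurrently.
            futures = [executor.submit(task) for task in tasks]
            results = []
            # Collect in submission order so results[i] belongs to tasks[i];
            # as_completed() would yield futures in completion order instead.
            for future in futures:
                result = future.result()
                with self.lock:
                    results.append(result)
            return results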
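
`PDFGenerator.generate` reads the returned list by position (`generated_contents[i]`), so it matters whether results are gathered in submission order or in completion order. The following is a minimal, standalone sketch of that difference using only the standard library; `make_task` and the delays are hypothetical stand-ins for the real section-generation calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def make_task(label, delay):
    """Build a zero-argument task that sleeps for `delay` seconds, then returns `label`."""
    def task():
        time.sleep(delay)
        return label
    return task

# The first task is deliberately the slowest one
tasks = [make_task("first", 0.3), make_task("second", 0.1), make_task("third", 0.2)]

with ThreadPoolExecutor() as executor:
    futures = [executor.submit(task) for task in tasks]
    # Yields futures as they finish, so faster tasks tend to come first
    completion_order = [future.result() for future in as_completed(futures)]
    # Iterating the original list keeps results aligned with `tasks`
    submission_order = [future.result() for future in futures]

print("as_completed:", completion_order)
print("submission  :", submission_order)  # ['first', 'second', 'third']
```

Only the submission-order list is guaranteed to line up with the input tasks; the `as_completed` order depends on timing.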

# 🎯 **Step 10: PDFGenerator — Automated PDF Document Generation with AI and Quality Control**
---

In [None]:
class PDFGenerator:
    """Main PDF document generator class.

    This class is responsible for generating PDF documents using AI for content generation,
    metadata extraction, and quality control. It integrates several AI agents and handles
    content generation, validation, and quality control processes.

    Attributes:
        llm (ChatOpenAI): The language model used for generating content.
        memory_manager (MemoryManager): Handles memory for AI conversations.
        agent_manager (AgentManager): Manages the AI agents for various tasks.
        agents (dict): The dictionary of AI agents (e.g., metadata extraction, content generation).
        styles (dict): Styles used for formatting the document.
        title (str): The title of the document.
        lock (threading.Lock): A lock object for managing multi-threading.
        content_generator (ContentGenerator): Generates content for the document sections.
        qc (QualityController): Manages the quality control of generated content.
        thread_manager (MultiThreadManager): Manages the parallel execution of tasks.

    Methods:
        __init__():
            Initializes the PDFGenerator with necessary components.

        generate(user_prompt, output_filename):
            Initiates the document generation process.
    """

    def __init__(self):
        """
        Initializes the PDF generator with necessary components including AI models, memory,
        agents, content generator, and quality control manager.

        """
        self.llm = ChatOpenAI(
            openai_api_key=userdata.get("OPENAI_API_KEY"),
            model_name="gpt-4o-mini",
            temperature=0.7,
            max_tokens=2000
        )
        self.memory_manager = MemoryManager(llm=self.llm)
        self.agent_manager = AgentManager(self.llm)
        self.agents = self.agent_manager.agents
        self.styles = StyleManager().get_styles()
        self.title = ""
        self.lock = threading.Lock()
        self.content_generator = ContentGenerator(
            self.agents,
            self.memory_manager,
            self.styles,
            self
        )
        self.qc = QualityController(
            self.agents["qc"],
            self.title,
            self.memory_manager
        )
        self.thread_manager = MultiThreadManager(self.lock)

    def generate(self, user_prompt, output_filename):
        """
        Generates a PDF document based on the provided user prompt.

        The process involves metadata extraction, content generation for sections,
        quality control checks, and the final PDF assembly.

        Parameters:
            user_prompt (str): The user's input specifying the document details.
            output_filename (str): The filename for the generated PDF document.

        Returns:
            bool: True if the document is successfully generated, False otherwise.
        """
        try:
            self.memory_manager.clear_memory()
            self.memory_manager.update_memory(
                human_input="Mulai generasi dokumen",
                ai_output=f"Judul: {user_prompt}"
            )

            metadata = self.agents["metadata"].invoke({"user_prompt": user_prompt}).content
            doc_title = MetadataHandler.extract_title(metadata)
            self.title = doc_title
            self.qc.doc_title = doc_title

            theme_general = self.agents["theme"].invoke({"title": doc_title}).content
            theme_general = theme_general.strip()

            if theme_general == doc_title:
                theme_general = "Tema Umum"

            outline = self.agents["structure"].invoke({"metadata": metadata}).content
            sections = MetadataHandler.parse_outline(outline)

            page_match = re.search(r'(\d+)\s*halaman', user_prompt.lower())
            max_pages = int(page_match.group(1)) if page_match else 2
            total_words = max_pages * 400
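            if not sections:
                # Without any parsed headings the division below would fail with ZeroDivisionError
                raise ValueError("The outline contains no headings; cannot generate sections.")
            words_per_section = max(80, total_words // len(sections))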

            doc = DocumentTemplate(output_filename, doc_title, theme_general, metadata)
            story = []

            story.append(Paragraph(doc_title, self.styles['TitleLarge']))
            story.append(Spacer(1, 0.2 * cm))

            section_tasks = []
            previous_headings = [None] + [s[1] for s in sections[:-1]]

            for i, (level, section_text) in enumerate(sections):
                max_words = words_per_section
                prev_heading = previous_headings[i]
                section_tasks.append(
                    lambda l=level, st=section_text, mw=max_words, ph=prev_heading:
                    self.content_generator.generate_section(l, st, mw, ph)
                )

            generated_contents = self.thread_manager.execute_parallel_tasks(section_tasks)

            # Pair each section with its generated content and the heading that precedes it
            story_sections = [
                (level, section_text, generated_contents[i], previous_headings[i])
                for i, (level, section_text) in enumerate(sections)
            ]

            for level, section_text, content, prev_heading in story_sections:
                header_style = self.styles.get(f'Header{level}', self.styles['Header3'])
                section_text_without_hashtag = re.sub(r'^#+\s*', '', section_text)

                story.append(Paragraph(section_text_without_hashtag, header_style))

                if level == 1:
                    story.append(Spacer(1, 0.05 * cm))
                    story.append(HRFlowable(width="100%", thickness=1, color="#cccccc", spaceBefore=6, spaceAfter=6))
                    story.append(Spacer(1, 0.1 * cm))

                qc_status = False
                attempts = 0
                max_attempts = 5

                while not qc_status and attempts < max_attempts:
                    attempts += 1
                    qc_status = self.qc.check_section(section_text, content, prev_heading)

                    if qc_status:
                        story.extend(content)
                        self.memory_manager.update_memory(
                            human_input=f"Generating content for section: {section_text}",
                            ai_output="\n".join([p.getPlainText() for p in content if isinstance(p, Paragraph)]),
                        )
                    else:
                        print(f"Bagian '{section_text}' tidak lolos QC (Percobaan {attempts}/{max_attempts}). Mengenerate ulang...")
                        content = self.content_generator.regenerate_section(
                            level, section_text, words_per_section, prev_heading
                        )

                if not qc_status:
                    print(f"Gagal mengenerate bagian '{section_text}' setelah {max_attempts} percobaan. Melanjutkan ke bagian berikutnya.")

            content_frame = Frame(2 * cm, 2.5 * cm, A4[0] - 4 * cm, A4[1] - 5 * cm, id='normal')
            doc.doc.addPageTemplates([
                PageTemplate(id='PageWithHF', frames=content_frame, onPage=doc.header_footer)
            ])
            doc.doc.build(story)
            self.qc.generate_report()
            return True
        except Exception as e:
            print(f"Generation Error: {str(e)}")
            return False
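
The per-section word budget in `generate` is plain arithmetic: the page count is read from the prompt (the word "halaman" means "pages"), multiplied by 400 words per page, and divided evenly across the sections with a floor of 80 words. Below is a small sketch of that calculation as a standalone function; the name `section_word_budget` and its keyword parameters are hypothetical and only wrap the constants used in the method.

```python
import re

def section_word_budget(user_prompt, section_count,
                        default_pages=2, words_per_page=400, minimum=80):
    """Return the word budget per section, mirroring the arithmetic in generate()."""
    if section_count < 1:
        raise ValueError("section_count must be at least 1")
    # Look for a page request such as "5 halaman" in the prompt
    page_match = re.search(r'(\d+)\s*halaman', user_prompt.lower())
    max_pages = int(page_match.group(1)) if page_match else default_pages
    total_words = max_pages * words_per_page
    return max(minimum, total_words // section_count)

# 5 pages * 400 words = 2000 words, shared by 10 sections
print(section_word_budget("Buatkan artikel 5 halaman tentang AI", 10))  # 200
# No page count given: 2 pages * 400 = 800 words; 800 // 16 = 50, raised to the floor of 80
print(section_word_budget("Buatkan artikel tentang AI", 16))            # 80
```

The floor means that outlines with many headings can exceed the requested page count, since every section still receives at least 80 words.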

# 🎯 **Step 11: Content Generator for Content Generation and Markdown Processing for PDF**
---

In [None]:
class ContentGenerator:
    """Handles content generation and processing for sections of the PDF document.

    This class is responsible for generating content for specific sections, processing it into
    paragraphs, and converting markdown-like syntax into formatted text. It integrates with
    AI agents and uses the memory manager to generate coherent content while adhering to
    the required structure.

    Attributes:
        agents (dict): Dictionary of AI agents for content generation and other tasks.
        memory_manager (MemoryManager): Manages memory context used for content generation.
        styles (dict): Styles used for formatting the content.
        pdf_generator (PDFGenerator): The main PDF generator instance that provides the document's title.

    Methods:
        __init__(agents, memory_manager, styles, pdf_generator):
            Initializes the ContentGenerator with necessary components.

        generate_section(level, section_text, max_words, previous_heading=None):
            Generates content for a specific section of the document.

        regenerate_section(level, section_text, max_words, previous_heading=None):
            Regenerates content for a specific section if quality control fails.

        __markdown_to_paragraphs(text, level):
            Converts markdown-like text into paragraphs with specified formatting.
    """

    def __init__(self, agents, memory_manager, styles, pdf_generator):
        """
        Initializes the content generator with the necessary components.

        Parameters:
            agents (dict): Dictionary of AI agents used for various tasks like content generation.
            memory_manager (MemoryManager): Manages the memory context for content generation.
            styles (dict): Styles to be applied to the generated content.
            pdf_generator (PDFGenerator): The PDFGenerator instance used to retrieve the document's title.
        """
        self.agents = agents
        self.memory_manager = memory_manager
        self.styles = styles
        self.pdf_generator = pdf_generator  # Store the PDFGenerator instance

    def generate_section(self, level, section_text, max_words, previous_heading=None):
        """
        Generates content for a specific section.

        Uses the AI agent for content generation based on the given section text,
        maximum word limit, and memory context. The generated content is then
        processed into paragraphs and returned.

        Parameters:
            level (int): The heading level of the section (e.g., 1 for main headings, 2 for subheadings).
            section_text (str): The text of the section to be generated.
            max_words (int): The maximum number of words for the generated section.
            previous_heading (str, optional): The title of the previous section. Defaults to None.

        Returns:
            list: A list of Paragraph objects representing the generated section content.
        """
        memory_context = self.memory_manager.memory_context
        context = f"Judul Utama: {self.pdf_generator.title}"  # Use PDFGenerator's title
        if previous_heading:
            context += f"\nHeading Sebelumnya: {previous_heading}"
        context += f"\n{memory_context}"

        raw_content = self.agents["content"].invoke({
            "section": section_text,
            "max_words": max_words,
            "memory_context": memory_context,
            "context": context
        }).content

        return self.__markdown_to_paragraphs(raw_content, level)

    def regenerate_section(self, level, section_text, max_words, previous_heading=None):
        """
        Regenerates content for a specific section.

        This method simply calls the `generate_section` method to regenerate content if the
        original content did not meet the quality control or other requirements.

        Parameters:
            level (int): The heading level of the section.
            section_text (str): The text of the section to be regenerated.
            max_words (int): The maximum number of words for the regenerated section.
            previous_heading (str, optional): The title of the previous section.

        Returns:
            list: A list of Paragraph objects representing the regenerated section content.
        """
        print(f"Mengenerate ulang bagian: {section_text}")
        return self.generate_section(level, section_text, max_words, previous_heading)

    def __markdown_to_paragraphs(self, text, level):
        """
        Converts markdown-like syntax into formatted paragraphs.

        This method processes raw text by applying basic markdown formatting (e.g., bold, italics,
        bullet points) and splitting the content into individual paragraphs. It also applies the
        appropriate styles for each paragraph based on the section's heading level.

        Parameters:
            text (str): The raw text to be converted into paragraphs.
            level (int): The heading level for the section.

        Returns:
            list: A list of Paragraph objects representing the formatted text.
        """
        body_style = self.styles['CustomBodyText']
        header_style = self.styles.get(f'Header{level}', self.styles['Header3'])

        # Basic markdown replacements
        text = re.sub(r'\*\*(.+?)\*\*', r'<b>\1</b>', text)
        text = re.sub(r'\*(.+?)\*', r'<i>\1</i>', text)
        text = re.sub(r'^#+\s*', '', text, flags=re.MULTILINE)
        text = re.sub(r'(^|\s)-\s+', r'\1• ', text)

        raw_paragraphs = re.split(r'\n{2,}', text.strip())
        paragraphs = []

        previous_was_bullet = False

        for para in raw_paragraphs:
            para = para.strip()
            if not para:
                continue

            if '• ' in para:
                # Bullet points detected; keep any text before the first bullet
                lead = para.split('•', 1)[0].strip()
                if lead:
                    paragraphs.append(Paragraph(lead, body_style))
                bullet_points = re.findall(r'•\s[^•]+', para)
                for bullet in bullet_points:
                    paragraphs.append(Paragraph(bullet.strip(), body_style))
                previous_was_bullet = True
            else:
                sentences = re.split(r'(?<=[.!?])\s+', para)
                for i in range(0, len(sentences), 5):
                    chunk = ' '.join(sentences[i:i+5])
                    if chunk:
                        paragraphs.append(Paragraph(chunk.strip(), body_style))
                # Add a Spacer after a prose (non-bullet) paragraph that has more than one sentence
                if len(sentences) > 1:
                    previous_was_bullet = False
                    paragraphs.append(Spacer(1, 0.1 * cm))

        return paragraphs
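
To see what the markdown replacements produce before any `Paragraph` objects are built, the text-only part of the conversion can be run on its own. This is a sketch that copies the regular expressions from `__markdown_to_paragraphs` into two hypothetical helpers, `markdown_to_markup` and `chunk_sentences`; the sample text is invented and no ReportLab objects are involved.

```python
import re

def markdown_to_markup(text):
    """Apply the same markdown-to-markup substitutions, in the same order."""
    text = re.sub(r'\*\*(.+?)\*\*', r'<b>\1</b>', text)      # **bold**   -> <b>bold</b>
    text = re.sub(r'\*(.+?)\*', r'<i>\1</i>', text)          # *italic*   -> <i>italic</i>
    text = re.sub(r'^#+\s*', '', text, flags=re.MULTILINE)   # drop leading '#' markers
    text = re.sub(r'(^|\s)-\s+', r'\1• ', text)              # '- item'   -> '• item'
    return text

def chunk_sentences(paragraph, size=5):
    """Split a paragraph into sentences and regroup them in chunks of `size`."""
    sentences = re.split(r'(?<=[.!?])\s+', paragraph)
    return [' '.join(sentences[i:i + size]) for i in range(0, len(sentences), size)]

# Hypothetical model output, for illustration only
sample = (
    "## Ringkasan\n\n"
    "**Data Engineering** menyiapkan *pipeline* data.\n\n"
    "- Kualitas data\n- Skalabilitas"
)

converted = markdown_to_markup(sample)
blocks = re.split(r'\n{2,}', converted.strip())
bullets = [b.strip() for b in re.findall(r'•\s[^•]+', blocks[2])]

print(blocks[0])  # Ringkasan
print(blocks[1])  # <b>Data Engineering</b> menyiapkan <i>pipeline</i> data.
print(bullets)    # ['• Kualitas data', '• Skalabilitas']
print(chunk_sentences("A. B. C. D. E. F."))  # ['A. B. C. D. E.', 'F.']
```

The bullet substitution matches any whitespace-delimited hyphen, so a dash used mid-sentence (for example `kata - kata`) is also turned into a bullet marker.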

# 🎯 **Step 12: Test**
---

In [None]:
generator = PDFGenerator()

In [None]:
user_request = """
Buatkan artikel mengenai hubungan Data Engineering dan Artificial Intelligence secara lengkap dan bagaimana dampaknya jika diterapkan pada suatu perusahaan
"""

In [None]:
generator.generate(
    user_prompt=user_request,
    output_filename="laporan_DE_AI.pdf"
)

Bagian '## 1.1 Apa itu Data Engineering?' tidak lolos QC (Percobaan 1/5). Mengenerate ulang...
Mengenerate ulang bagian: ## 1.1 Apa itu Data Engineering?
Bagian '## 1.1 Apa itu Data Engineering?' tidak lolos QC (Percobaan 2/5). Mengenerate ulang...
Mengenerate ulang bagian: ## 1.1 Apa itu Data Engineering?
Bagian '## 1.2 Apa itu Artificial Intelligence?' tidak lolos QC (Percobaan 1/5). Mengenerate ulang...
Mengenerate ulang bagian: ## 1.2 Apa itu Artificial Intelligence?
Bagian '## 2.1 Penyediaan Data Berkualitas' tidak lolos QC (Percobaan 1/5). Mengenerate ulang...
Mengenerate ulang bagian: ## 2.1 Penyediaan Data Berkualitas
Bagian '## 2.2 Pengolahan dan Pembersihan Data' tidak lolos QC (Percobaan 1/5). Mengenerate ulang...
Mengenerate ulang bagian: ## 2.2 Pengolahan dan Pembersihan Data
Bagian '## 3.3 Peningkatan Pengalaman Pelanggan' tidak lolos QC (Percobaan 1/5). Mengenerate ulang...
Mengenerate ulang bagian: ## 3.3 Peningkatan Pengalaman Pelanggan
Bagian '# 4. Tantangan dalam Int

True