### System for generating, evaluating, and refining risk policy documents

This code implements a system for generating, evaluating, and refining risk policy documents   
using large language models (LLMs) and vector databases. The primary goal is to automate the   
creation of high-quality policy documents that adhere to regulatory standards and organizational needs.   
The system is designed with a modular architecture, dividing the task into distinct, manageable components,   
each with a specific role in the document creation pipeline.

The core components of the system include: a DocumentCreator responsible for drafting and refining policy   
sections based on retrieved context, a DocumentEvaluator which provides feedback on the correctness, clarity,   
and compliance of individual sections, and a FinalReviewer that ensures the overall document is coherent,   
consistent, and complete. Additionally, a ChecklistEvaluator assesses the final document against a predefined   
checklist of requirements, providing a structured analysis for quality assurance. These components work   
together in a sequential manner, building and refining the document iteratively.

The process starts with the DocumentCreator retrieving relevant context from a Chroma vector database using a   
query constructed from the overall document description, section title, and section description. This context   
is then passed to the LLM to draft an initial section using Markdown, optimized for clarity and future LaTeX   
conversion. The DocumentEvaluator then reviews the generated draft and identifies areas   
for improvement. The DocumentCreator uses this feedback to refine the section, producing a more   
polished result. This draft-evaluate-refine cycle runs once for each section in the table of contents.
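
The control flow of that cycle can be sketched without any LLM dependency. The snippet below is a minimal illustration, not the notebook's implementation: `build_sections` and the three callables are hypothetical stand-ins for the LLM-backed draft, evaluate, and refine steps.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the LLM-backed steps; each is a plain callable here.
DraftFn = Callable[[str, str], str]      # (title, description) -> draft
EvaluateFn = Callable[[str], str]        # draft -> feedback
RefineFn = Callable[[str, str], str]     # (draft, feedback) -> refined text


def build_sections(toc: Dict[str, Dict[str, str]],
                   draft: DraftFn,
                   evaluate: EvaluateFn,
                   refine: RefineFn) -> Dict[str, str]:
    """Run one draft -> evaluate -> refine pass per table-of-contents entry."""
    refined = {}
    for entry in toc.values():
        title, desc = entry["title"], entry["description"]
        first_draft = draft(title, desc)
        feedback = evaluate(first_draft)
        refined[title] = refine(first_draft, feedback)
    return refined


# Toy callables so the control flow can be exercised without an API key.
toc = {"section_1": {"title": "Introduction and Scope",
                     "description": "Define all key terms"}}
sections = build_sections(
    toc,
    draft=lambda title, desc: f"# {title}\n{desc}.",
    evaluate=lambda text: "Add a definitions list.",
    refine=lambda text, fb: f"{text}\n<!-- addressed: {fb} -->",
)
print(sections["Introduction and Scope"])
```

Passing the steps in as callables keeps the loop testable in isolation; in the notebook the same roles are played by methods on `DocumentCreator` and `DocumentEvaluator`.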

Once all sections have been created, they are assembled into a complete document. The FinalReviewer then   
conducts a holistic assessment of the entire policy, checking for consistency and completeness, and suggesting   
improvements to the overall structure. After a final review, a ChecklistEvaluator assesses the document against   
a predefined checklist of requirements. The results of this checklist evaluation are returned as a Pandas   
DataFrame, providing a structured overview of the document’s compliance.
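
As a rough illustration of the structured result, the sketch below parses a checklist response of the form `{"items": [...]}` and tallies the statuses. It uses only the standard library instead of Pandas, and the `ChecklistRow` dataclass, helper names, and sample payload are made up for this example.

```python
import json
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class ChecklistRow:
    requirement: str
    status: str      # expected: "yes", "no", or "partially"
    reasoning: str


def parse_checklist(raw_json: str) -> List[ChecklistRow]:
    """Turn a JSON payload of the form {"items": [...]} into typed rows."""
    payload = json.loads(raw_json)
    return [ChecklistRow(**item) for item in payload["items"]]


def summarize(rows: List[ChecklistRow]) -> Counter:
    """Count how many requirements fall under each (normalized) status."""
    return Counter(row.status.strip().lower() for row in rows)


# Made-up example response, for illustration only.
raw = json.dumps({"items": [
    {"requirement": "Objectives stated", "status": "yes", "reasoning": "Section 1."},
    {"requirement": "Collateral management", "status": "no", "reasoning": "Not covered."},
    {"requirement": "Key terms defined", "status": "Partially", "reasoning": "Some missing."},
]})
rows = parse_checklist(raw)
summary = summarize(rows)
print(dict(summary))
```

Normalizing the status before counting guards against the model returning "Partially" in one row and "partially" in another.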

To facilitate the transformation of the generated document, a LatexConverter converts the Markdown output into   
a well-formatted LaTeX document, complete with a table of contents, sections, and escaped special characters.   
The LaTeX document is then compiled into a PDF, allowing for a high-quality, professional output. This output   
process ensures the generated policies are visually appealing and ready for further use.
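
Escaping is the step most likely to go wrong in this conversion: if characters are replaced one after another, the backslashes and braces introduced by an earlier replacement can be escaped again by a later one. A minimal sketch of a single-pass alternative follows; the function name and character table are illustrative, not the notebook's exact code.

```python
import re

# Mapping of LaTeX special characters to their escaped forms.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_PATTERN = re.compile("|".join(re.escape(ch) for ch in _LATEX_SPECIALS))


def escape_latex(text: str) -> str:
    """Escape every special character in a single pass over the input."""
    return _PATTERN.sub(lambda m: _LATEX_SPECIALS[m.group(0)], text)


escaped = escape_latex(r"5% of A&B \ {x}")
print(escaped)
```

Because each input character is visited exactly once, the output of one substitution is never fed into another.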

Finally, the main script reads the OpenAI API key from an environment variable, initializes the components,   
and orchestrates the entire policy generation process. It manages all the stages described,   
from context retrieval through to the final PDF compilation, writing every prompt and response to a JSON   
log file and logging token usage and cost for each step. The program returns the final document and the   
checklist evaluation results as a Pandas DataFrame.
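
One way to make the configuration step explicit is to gather the settings in one place and fail early when the API key is missing. This is a sketch under assumptions: only `OPENAI_API_KEY` appears in the notebook, while the `Settings` class, `load_settings`, and the `POLICY_*` variable names are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    api_key: str
    chroma_collection: str
    log_file: str
    pdf_output: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from the environment, failing early if the key is missing."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return Settings(
        api_key=api_key,
        # The POLICY_* variable names are hypothetical; defaults mirror the notebook.
        chroma_collection=env.get("POLICY_CHROMA_COLLECTION", "risk_universe"),
        log_file=env.get("POLICY_LOG_FILE", "openai_history.json"),
        pdf_output=env.get("POLICY_PDF_OUTPUT", "policy.pdf"),
    )


# A plain dict stands in for os.environ so the example runs anywhere.
settings = load_settings({"OPENAI_API_KEY": "sk-example"})
print(settings.chroma_collection)
```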

#### High-level possible improvement points
At a high level, the agent already covers the core lifecycle of drafting sections, evaluating them,   
refining with feedback, and finally reviewing holistically. From a “principal components” standpoint,  
here are some additional building blocks you might consider:

**Workflow and Approval Stage**  
Currently, the agent handles creation, evaluation, and refinement. Yet many policy and document    
workflows require an approval stage involving multiple stakeholders. A “Workflow Orchestrator” or    
“Approval Manager” could track different approval gates, requiring sign-off from designated roles. 

**Policy Governance / Versioning System**  
Version control is often critical when generating business or regulatory documents. Keeping an    
auditable record of changes (including timestamps, reasons for changes, and who initiated them)    
would be valuable. A component that automatically logs versions, diffs, and rationales behind    
revisions would address governance needs.
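
A minimal sketch of such a component, assuming an in-memory store is enough for illustration: the `Revision` and `SectionHistory` names are hypothetical, and the diff relies on the standard library's `difflib`.

```python
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Revision:
    text: str
    author: str
    reason: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class SectionHistory:
    """Hypothetical in-memory revision log for one policy section."""

    def __init__(self) -> None:
        self.revisions: List[Revision] = []

    def commit(self, text: str, author: str, reason: str) -> int:
        """Store a new revision and return its 1-based version number."""
        self.revisions.append(Revision(text, author, reason))
        return len(self.revisions)

    def diff(self, old: int, new: int) -> str:
        """Unified diff between two stored versions (1-based)."""
        a = self.revisions[old - 1].text.splitlines()
        b = self.revisions[new - 1].text.splitlines()
        return "\n".join(
            difflib.unified_diff(a, b, f"v{old}", f"v{new}", lineterm="")
        )


history = SectionHistory()
history.commit("Credit limits are reviewed yearly.", "drafter", "initial draft")
history.commit("Credit limits are reviewed quarterly.", "reviewer", "align with feedback")
change = history.diff(1, 2)
print(change)
```

A production version would persist revisions (for example in a database or a Git repository) rather than keep them in memory.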

**Compliance Validator**  
Beyond evaluating clarity and correctness, you might want a module that specifically checks compliance    
with relevant regulations (e.g., Basel III specifics, local legal requirements). This validator could    
highlight noncompliant or missing elements, referencing actual regulatory texts.

**Risk Assessment and Gap Analysis**  
Especially for a Credit Risk Policy, an automated gap analysis module could compare the drafted document   
to recognized regulatory frameworks or best practices libraries. It would flag coverage gaps where   
crucial topics or sections are missing, prompting the user to address them.
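
As a deliberately naive sketch of the idea, the snippet below flags required topics whose keywords appear in none of the document's Markdown headings. The function names, topic list, and keywords are invented for illustration; a real gap analysis would need semantic matching against the regulatory text itself.

```python
import re
from typing import Dict, List


def extract_headings(markdown_text: str) -> List[str]:
    """Collect the text of every Markdown heading (#, ##, ###, ...)."""
    return [m.group(1).strip()
            for m in re.finditer(r"^#{1,6}\s+(.+)$", markdown_text, flags=re.MULTILINE)]


def find_gaps(markdown_text: str, required_topics: Dict[str, List[str]]) -> List[str]:
    """Return topics for which none of the keywords appear in any heading."""
    headings = " ".join(extract_headings(markdown_text)).lower()
    return [topic for topic, keywords in required_topics.items()
            if not any(kw.lower() in headings for kw in keywords)]


policy = (
    "# Introduction and Scope\n"
    "This policy defines key terms.\n"
    "# Governance and Compliance\n"
    "Roles and responsibilities are set out here.\n"
)
# Hypothetical topic-to-keyword map, for illustration only.
required = {
    "Governance": ["governance"],
    "Collateral management": ["collateral"],
    "Problem loans": ["problem loan", "impairment"],
}
gaps = find_gaps(policy, required)
print(gaps)
```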

**Specialized Knowledge Base Integration**  
If you have proprietary or domain-specific data sources beyond Chroma, you might integrate additional   
retrievers and vector stores, or even direct API calls to knowledge bases or compliance databases.   
This ensures the final document aligns with all internal and external mandates.

**Collaboration Layer**  
Large-scale policy creation often involves multiple contributors. A collaboration layer could store   
and display each contributor’s feedback, track assigned tasks, and unify the agent’s automated   
suggestions with the team’s manual edits.

**Monitoring & Auditing**  
Adding logging, alerting, and an audit trail for each stage of document creation could be crucial  
for compliance. It would track who ran each step, which context or data was retrieved, how many tokens   
were used, and what the text looked like after each iteration.
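
A small sketch of an append-only audit trail, assuming a JSON Lines file (one JSON object per line) is acceptable. The helper names, field names, and token counts are illustrative and not taken from the notebook.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from typing import Any, Dict, List


def append_audit_event(path: str, stage: str, user: str, **details: Any) -> None:
    """Append one event as a single JSON line so the file stays parseable."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "stage": stage,
        "user": user,
        **details,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event, ensure_ascii=False) + "\n")


def read_audit_log(path: str) -> List[Dict[str, Any]]:
    """Load every event back, one JSON object per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Example run against a throwaway file; the values are made up.
audit_path = os.path.join(tempfile.mkdtemp(), "audit.jsonl")
append_audit_event(audit_path, "draft", "analyst",
                   section="Introduction and Scope", tokens=812)
append_audit_event(audit_path, "eval", "analyst",
                   section="Introduction and Scope", tokens=430)
events = read_audit_log(audit_path)
print([e["stage"] for e in events])
```

Writing each event on a single line means the log can be read back incrementally, which is harder when indented JSON objects are concatenated in one file.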

All of these represent higher-level, “principal” enhancements that would strengthen the end-to-end   
management and governance of your policy creation workflow, rather than just tweaking or refining the existing steps.


#### Lower-level Missing Components / Improvement Points:
1) Error Handling: No explicit exception handling (e.g., API failures, empty retrievals).
2) Async or Parallelism: Large documents might benefit from concurrency or async retrieval/drafting.
3) Prompt Customization: Could add parameters like temperature, max_tokens, or system-level instructions.
4) Token/Cost Tracking: Nice approach, but consider storing usage stats persistently or summarizing them.
5) Versioning / Persistence: Consider storing intermediate drafts or retrieval results (e.g., DB or file).
6) Validation / Testing: A minimal testing suite for each class would help ensure reliability.
7) Security & Confidentiality: If dealing with sensitive policies, ensure secure storage of API keys/logs.
8) Explanation / Guidance to End Users: Possibly add docstrings or usage instructions for each step.
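
For the error-handling point, one common pattern is to wrap each external call in a retry helper with exponential backoff. The sketch below is an assumption about how this could look, not part of the notebook: `call_with_retry` is a hypothetical helper, and the flaky function only simulates a transient API failure.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T],
                    retries: int = 3,
                    base_delay: float = 1.0,
                    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn(), retrying with exponential backoff on the given exceptions."""
    for attempt in range(1, retries + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            logging.warning("Attempt %d/%d failed (%s); retrying in %.1fs",
                            attempt, retries, exc, delay)
            sleep(delay)
    raise RuntimeError("retries must be at least 1")


# Toy flaky function: fails twice, then succeeds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = call_with_retry(flaky, sleep=lambda _: None)  # skip real sleeping in the demo
print(result, calls["count"])
```

In the notebook, a call such as `self.llm.invoke(formatted_prompt)` could be passed in as `lambda: self.llm.invoke(formatted_prompt)`, ideally with `retry_on` narrowed to the exception types that are actually transient.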

In [1]:
import os
import logging
import json
from typing import Any, Dict, List
from datetime import datetime
import subprocess
import pandas as pd  # Import pandas
import re

# LangChain imports
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate
)
from langchain_core.documents import Document
from langchain_chroma import Chroma
from langchain.callbacks import get_openai_callback
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

# Ensure no limitations in Jupyter Notebook output
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

In [2]:
# Parametrisation
LOG_FILE = "openai_history.json"
PDF_OUTPUT = "policy.pdf"

In [3]:
class DocumentCreator:
    """
    Creates and refines policy sections.
    Pulls context from Chroma by passing heading, overall doc, and section desc as the query.
    """
    def __init__(self, llm, retriever):
        logging.info("Initializing DocumentCreator...")
        self.llm = llm
        self.retriever = retriever
        logging.info("DocumentCreator initialized.")

    def retrieve_context(self, overall_desc, section_title, section_desc):
        logging.info(f"Retrieving context for section: '{section_title}'...")
        query_text = (
            f"Overall Document: {overall_desc}\n"
            f"Section Title: {section_title}\n"
            f"Section Description: {section_desc}"
        )
        # Retrieve relevant documents (retrievers are called via .invoke())
        docs = self.retriever.invoke(query_text)

        if not docs:
            logging.warning("No context retrieved from Chroma. Using dummy context.")
            context = "This is a dummy context. Please ensure your ChromaDB is populated for better results."
        else:
            context = "\n".join([d.page_content for d in docs])
            logging.info(f"Context length for section '{section_title}': {len(context)} characters.")
       
        return context

    def draft_section(self, overall_desc, section_title, section_desc):
        logging.info(f"Drafting section: '{section_title}'...")
        # Create a chat prompt with system and user instructions
        system_template = (
            "You are drafting a Credit Risk Policy section. "
            "Incorporate best practices and references. "
            "Format your output as Markdown, emphasizing clarity and readability for LaTeX conversion. "
            "Use markdown headings (e.g., # for sections, ## for subsections) for structure and organize information logically. "
            "Ensure the text is in well-structured paragraphs with correct grammar and punctuation."
        )
        user_template = (
            "Section Title: {section_title}\n"
            "Overall Description: {overall_desc}\n"
            "Section Description: {section_desc}\n"
            "Relevant Context:\n{context}\n\n"
            "Draft the policy section using markdown."
        )

        system_msg = SystemMessagePromptTemplate.from_template(system_template)
        human_msg = HumanMessagePromptTemplate.from_template(user_template)

        # Retrieve any relevant context from Chroma
        context = self.retrieve_context(overall_desc, section_title, section_desc)

        chat_prompt = ChatPromptTemplate.from_messages([system_msg, human_msg])
        formatted_prompt = chat_prompt.format_messages(
            section_title=section_title,
            overall_desc=overall_desc,
            section_desc=section_desc,
            context=context
        )

        with get_openai_callback() as cb:
            response = self.llm.invoke(formatted_prompt)
            logging.info(f"LLM call for drafting section '{section_title}' completed.")
            logging.info(f"Token usage: {cb.total_tokens} (Prompt: {cb.prompt_tokens}, Completion: {cb.completion_tokens}, Cost: ${cb.total_cost:.4f})")
            draft_cost = cb.total_cost
            draft_tokens = cb.total_tokens

        self.log_message(formatted_prompt, response.content, "draft")
        logging.info(f"Section '{section_title}' drafted.")
        return response.content, draft_cost, draft_tokens # Return cost and tokens

    def refine_with_feedback(self, draft, feedback):
        logging.info(f"Refining section with feedback...")
        system_template = (
            "You are refining a policy section draft based on reviewer feedback. "
            "Format your output as Markdown, ensuring that the structure is consistent with the original draft. "
            "Use markdown headings and structured paragraphs for clarity and easy conversion to LaTeX. "
            "Address all feedback points carefully."
        )
        user_template = (
            "Original Draft:\n{draft}\n\n"
            "Reviewer Feedback:\n{feedback}\n\n"
            "Refine the draft, addressing all feedback. Preserve original structure."
        )

        system_msg = SystemMessagePromptTemplate.from_template(system_template)
        human_msg = HumanMessagePromptTemplate.from_template(user_template)

        chat_prompt = ChatPromptTemplate.from_messages([system_msg, human_msg])
        formatted_prompt = chat_prompt.format_messages(draft=draft, feedback=feedback)

        with get_openai_callback() as cb:
            response = self.llm.invoke(formatted_prompt)
            logging.info(f"LLM call for refining section with feedback completed.")
            logging.info(f"Refinement token usage: {cb.total_tokens} (Prompt: {cb.prompt_tokens}, Completion: {cb.completion_tokens}, Cost: ${cb.total_cost:.4f})")
            refine_cost = cb.total_cost
            refine_tokens = cb.total_tokens
        
        self.log_message(formatted_prompt, response.content, "refine")
        logging.info(f"Section refined with feedback.")
        return response.content, refine_cost, refine_tokens # Return cost and tokens
    
    def log_message(self, prompt, response, stage):
        """Logs the prompt and response to the JSON file immediately."""
        timestamp = datetime.now().isoformat()
        log_entry = {
            "timestamp": timestamp,
            "stage": stage,
            "prompt": [msg.content for msg in prompt],
            "response": response,
        }
        with open(LOG_FILE, 'a', encoding='utf-8') as f:
            json.dump(log_entry, f, indent=2)
            f.write('\n')  # Add a newline to separate JSON objects in the file

In [4]:
class DocumentEvaluator:
    """
    Evaluates individual sections for correctness, clarity, and compliance.
    Provides feedback for improvement.
    """
    def __init__(self, llm):
        logging.info("Initializing DocumentEvaluator...")
        self.llm = llm
        logging.info("DocumentEvaluator initialized.")

    def evaluate_section(self, draft_section):
        logging.info("Evaluating section...")
        system_template = "You are an independent reviewer of a policy draft."
        user_template = (
            "Draft Section:\n{draft_section}\n\n"
            "1) Evaluate correctness, clarity, and compliance.\n"
            "2) Suggest improvements to the structure and content for better LaTeX conversion, such as proper use of headings.\n" #Added
            "3) Provide concise feedback."
        )

        system_msg = SystemMessagePromptTemplate.from_template(system_template)
        human_msg = HumanMessagePromptTemplate.from_template(user_template)

        chat_prompt = ChatPromptTemplate.from_messages([system_msg, human_msg])
        formatted_prompt = chat_prompt.format_messages(draft_section=draft_section)

        with get_openai_callback() as cb:
            response = self.llm.invoke(formatted_prompt)
            logging.info("LLM call for section evaluation completed.")
            logging.info(f"Evaluation token usage: {cb.total_tokens} (Prompt: {cb.prompt_tokens}, Completion: {cb.completion_tokens}, Cost: ${cb.total_cost:.4f})")
            eval_cost = cb.total_cost
            eval_tokens = cb.total_tokens

        self.log_message(formatted_prompt, response.content, "eval")
        logging.info("Section evaluated, feedback generated.")
        return response.content, eval_cost, eval_tokens # Return cost and tokens

    def log_message(self, prompt, response, stage):
        """Logs the prompt and response to the JSON file immediately."""
        timestamp = datetime.now().isoformat()
        log_entry = {
            "timestamp": timestamp,
            "stage": stage,
            "prompt": [msg.content for msg in prompt],
            "response": response,
        }
        with open(LOG_FILE, 'a', encoding='utf-8') as f:
            json.dump(log_entry, f, indent=2)
            f.write('\n')  # Add a newline to separate JSON objects in the file

In [5]:
class FinalReviewer:
    """
    Conducts a holistic review of the entire assembled policy.
    Ensures coherence, consistency, and completeness.
    """
    def __init__(self, llm):
        logging.info("Initializing FinalReviewer...")
        self.llm = llm
        logging.info("FinalReviewer initialized.")

    def review_document(self, full_document):
        logging.info("Conducting final document review...")
        system_template = (
            "You are a senior reviewer conducting a final review "
            "of the entire policy. "
            "Format your output as Markdown, ensure proper structure with markdown headings."
        )
        user_template = (
            "Below is the entire policy:\n\n"
            "{full_document}\n\n"
            "1) Check consistency, completeness, and proper use of headings.\n"
            "2) Suggest improvements.\n"
            "3) Provide the final revised text. Make sure all text is in a readable format for latex conversion."
        )

        system_msg = SystemMessagePromptTemplate.from_template(system_template)
        human_msg = HumanMessagePromptTemplate.from_template(user_template)

        chat_prompt = ChatPromptTemplate.from_messages([system_msg, human_msg])
        formatted_prompt = chat_prompt.format_messages(full_document=full_document)

        with get_openai_callback() as cb:
            response = self.llm.invoke(formatted_prompt)
            logging.info("LLM call for final document review completed.")
            logging.info(f"Final review token usage: {cb.total_tokens} (Prompt: {cb.prompt_tokens}, Completion: {cb.completion_tokens}, Cost: ${cb.total_cost:.4f})")
            final_review_cost = cb.total_cost
            final_review_tokens = cb.total_tokens
        
        self.log_message(formatted_prompt, response.content, "final_review")
        logging.info("Final document review completed.")
        return response.content, final_review_cost, final_review_tokens # Return cost and tokens
    
    def log_message(self, prompt, response, stage):
        """Logs the prompt and response to the JSON file immediately."""
        timestamp = datetime.now().isoformat()
        log_entry = {
            "timestamp": timestamp,
            "stage": stage,
            "prompt": [msg.content for msg in prompt],
            "response": response,
        }
        with open(LOG_FILE, 'a', encoding='utf-8') as f:
            json.dump(log_entry, f, indent=2)
            f.write('\n')  # Add a newline to separate JSON objects in the file

In [6]:
class ChecklistItem(BaseModel):
    """Pydantic model for a single checklist item."""
    requirement: str = Field(description="The requirement from the checklist")
    status: str = Field(description="yes/no/partially")
    reasoning: str = Field(description="The reasoning for the status")

class ChecklistOutput(BaseModel):
    """Pydantic model for a list of checklist items."""
    items: List[ChecklistItem] = Field(description="List of checklist items")

class ChecklistEvaluator:
    """
    Evaluates the final document against a checklist of requirements.
    Returns results as a Pandas DataFrame directly.
    """
    def __init__(self, llm):
        logging.info("Initializing ChecklistEvaluator...")
        self.llm = llm
        logging.info("ChecklistEvaluator initialized.")
        self.output_parser = PydanticOutputParser(pydantic_object=ChecklistOutput)

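    def _fix_json(self, json_str: str) -> str:
        """Attempt to fix common JSON errors."""
        # Keep only the outermost JSON object: the parser expects {"items": [...]},
        # so drop any prose or code fences around it.
        start = json_str.find('{')
        end = json_str.rfind('}')
        if start == -1 or end < start:
            return json_str
        # Backslashes are left alone; stripping them would corrupt escaped quotes.
        return json_str[start:end + 1]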

    def evaluate_with_checklist(self, document, checklist):
        logging.info("Evaluating document against checklist...")
        system_template = (
            "You are an expert reviewer evaluating a policy document "
            "against a checklist of requirements. "
            "Provide the output in a structured JSON format."
        )
        user_template = (
            "Policy Document:\n{document}\n\n"
            "Checklist of Requirements:\n{checklist}\n\n"
            "For each requirement, determine if it has been incorporated "
            "into the document (yes/no/partially), and give a reasoning "
            "for your opinion. Provide your reasoning in a clear and concise manner.\n\n"
            "Output in the following JSON format: \n{format_instructions}\n"
        )

        system_msg = SystemMessagePromptTemplate.from_template(system_template)
        human_msg = HumanMessagePromptTemplate.from_template(user_template)

        chat_prompt = ChatPromptTemplate.from_messages([system_msg, human_msg])
        formatted_prompt = chat_prompt.format_messages(
            document=document,
            checklist=checklist,
            format_instructions = self.output_parser.get_format_instructions()
        )
        
        with get_openai_callback() as cb:
            response = self.llm.invoke(formatted_prompt)
            logging.info("LLM call for checklist evaluation completed.")
            logging.info(f"Checklist evaluation token usage: {cb.total_tokens} (Prompt: {cb.prompt_tokens}, Completion: {cb.completion_tokens}, Cost: ${cb.total_cost:.4f})")
            checklist_eval_cost = cb.total_cost
            checklist_eval_tokens = cb.total_tokens
        
        # Parsing is deterministic, so re-parsing the same text repeatedly cannot help.
        # Try the response as-is, then once more after cleaning it up.
        candidates = [
            ("raw", response.content),
            ("cleaned", self._fix_json(response.content)),
        ]
        for label, candidate in candidates:
            try:
                parsed_response = self.output_parser.parse(candidate)
                df = pd.DataFrame([item.model_dump() for item in parsed_response.items])
                self.log_message(formatted_prompt, response.content, "checklist_eval")
                logging.info(f"Checklist evaluation completed successfully ({label} response).")
                return df, checklist_eval_cost, checklist_eval_tokens
            except Exception as e:
                logging.warning(f"Failed to parse {label} checklist eval response: {e}")

        # If both attempts fail, log the error and return an empty DataFrame
        logging.error("Failed to parse checklist eval response, returning empty dataframe")
        self.log_message(formatted_prompt, response.content, "checklist_eval_fail")
        return pd.DataFrame(), checklist_eval_cost, checklist_eval_tokens

    
    def log_message(self, prompt, response, stage):
        """Logs the prompt and response to the JSON file immediately."""
        timestamp = datetime.now().isoformat()
        log_entry = {
            "timestamp": timestamp,
            "stage": stage,
            "prompt": [msg.content for msg in prompt],
            "response": response,
        }
        with open(LOG_FILE, 'a', encoding='utf-8') as f:
            json.dump(log_entry, f, indent=2)
            f.write('\n')  # Add a newline to separate JSON objects in the file

In [7]:
class LatexConverter:
    def __init__(self):
        logging.info("Initializing LatexConverter...")
        logging.info("LatexConverter initialized.")

    def _escape_latex(self, text):
        """Escapes LaTeX special characters in a single pass."""
        # Special characters that need to be escaped in LaTeX.
        specials = {
            '&':  r'\&',
            '%':  r'\%',
            '$':  r'\$',
            '#':  r'\#',
            '_':  r'\_',
            '{':  r'\{',
            '}':  r'\}',
            '~':  r'\textasciitilde{}',
            '^':  r'\textasciicircum{}',
            '\\': r'\textbackslash{}',
            '<':  r'\textless{}',
            '>':  r'\textgreater{}'
        }
        # Substitute in one pass: sequential str.replace calls would re-escape
        # the backslashes introduced by earlier replacements.
        pattern = re.compile("|".join(re.escape(char) for char in specials))
        return pattern.sub(lambda m: specials[m.group(0)], text)

    def markdown_to_latex(self, markdown_text, toc_list):
        """Converts a document in markdown format to LaTeX format."""
        latex_output = ""

        # Add Document Class and Packages
        latex_output += "\\documentclass{article}\n"
        latex_output += "\\usepackage[utf8]{inputenc}\n"
        latex_output += "\\usepackage{graphicx}\n"  # Include graphicx package
        latex_output += "\\usepackage{hyperref}\n" #For hyperreferences
        latex_output += "\\title{Credit Risk Policy Document}\n"
        latex_output += "\\author{AI-Generated Document}\n"
        latex_output += "\\date{\\today}\n"
        latex_output += "\\begin{document}\n"
        latex_output += "\\maketitle\n"
        
        # Add table of contents
        latex_output += "\\tableofcontents\n"

        # \tableofcontents builds the ToC from the \section commands emitted below,
        # so toc_list is not written out here (that would add empty duplicate sections).
    
        # Convert markdown into latex
        lines = markdown_text.split("\n")
        for line in lines:
            line = line.strip()
            if line.startswith("# "):
                line = line[2:]
                safe_line = self._escape_latex(line)
                latex_output += f"\\section{{{safe_line}}}\n" #No * as this will appear in ToC
            elif line.startswith("## "):
                line = line[3:]
                safe_line = self._escape_latex(line)
                latex_output += f"\\subsection{{{safe_line}}}\n" #No * as this will appear in ToC
            elif line.startswith("### "):
                line = line[4:]
                safe_line = self._escape_latex(line)
                latex_output += f"\\subsubsection{{{safe_line}}}\n" #No * as this will appear in ToC
            elif line:
                line = line.replace("``","`") # Collapse doubled backticks
                # Escape first, then convert Markdown emphasis, so the LaTeX
                # commands added here are not escaped themselves.
                safe_line = self._escape_latex(line)
                safe_line = re.sub(r"\*\*(\S(?:.*?\S)?)\*\*", r"\\textbf{\1}", safe_line)
                safe_line = re.sub(r"\*(\S(?:.*?\S)?)\*", r"\\textit{\1}", safe_line)
                latex_output += f"{safe_line}\n\n" #Add paragraph
            else:
                 latex_output += "\n" #Keep blank lines
        latex_output += "\\end{document}\n" #End document
        return latex_output

    def generate_pdf(self, latex_string, output_path="output.pdf"):
        """Generates a PDF file from a LaTeX string."""
        try:
            logging.info("Generating PDF...")
            # Create a temporary LaTeX file
            with open("temp.tex", "w", encoding="utf-8") as f:
                f.write(latex_string)

            # Compile twice so \tableofcontents is populated from the first run's
            # .toc file; nonstopmode keeps pdflatex from waiting for input on errors.
            for _ in range(2):
                subprocess.run(
                    ["pdflatex", "-interaction=nonstopmode", "temp.tex"],
                    check=True, capture_output=True
                )
            logging.info("PDF compilation completed successfully")

            # Move the resulting PDF to the desired path
            os.replace("temp.pdf", output_path)
            logging.info(f"PDF saved to {output_path}")

        except subprocess.CalledProcessError as e:
            logging.error(f"Error compiling LaTeX to PDF: {e}")
            # pdflatex reports most errors on stdout, so log both streams
            output = (e.stdout or b"") + (e.stderr or b"")
            logging.error(f"LaTeX compilation error output: {output.decode('utf-8', errors='replace')}")

        except Exception as e:
            logging.error(f"An error occurred: {e}")

        finally:
            # Clean up the intermediate LaTeX files, whether or not compilation succeeded
            for ext in ("tex", "log", "aux", "toc", "out"):
                if os.path.exists(f"temp.{ext}"):
                    os.remove(f"temp.{ext}")

In [None]:
class CreditRiskPolicyAgent:
    """
    Orchestrates creation, section-level evaluation, refinement,
    and a final holistic review.
    """
    def __init__(self, api_key, chroma_collection):
        logging.info("Initializing CreditRiskPolicyAgent...")
        embeddings = OpenAIEmbeddings(api_key=api_key)
        store = Chroma(collection_name=chroma_collection, embedding_function=embeddings)

        # ChatOpenAI is the newer recommended approach for OpenAI chat models
        self.llm = ChatOpenAI(model_name="gpt-4", api_key=api_key)
        
        self.creator = DocumentCreator(self.llm, store.as_retriever())
        self.evaluator = DocumentEvaluator(self.llm)
        self.final_reviewer = FinalReviewer(self.llm)
        self.checklist_evaluator = ChecklistEvaluator(self.llm)
        logging.info("CreditRiskPolicyAgent initialized.")

    def build_policy(self, overall_desc, toc_dict, checklist):
        logging.info("Starting policy building process...")
        refined_sections = {}
        total_tokens_policy = 0 # Initialize total tokens
        total_cost_policy = 0 # Initialize total cost
        toc_list = []  #list to store titles for LaTeX ToC
        
        for key, s_info in toc_dict.items():
            s_title = s_info["title"]
            s_desc = s_info["description"]
            logging.info(f"Processing section: '{s_title}'...")
            toc_list.append(s_title) #Append the titles to the list

            # 1) Draft
            draft, draft_cost, draft_tokens = self.creator.draft_section(overall_desc, s_title, s_desc)
            total_cost_policy += draft_cost # Accumulate cost
            total_tokens_policy += draft_tokens # Accumulate tokens

            # 2) Evaluate
            feedback, eval_cost, eval_tokens = self.evaluator.evaluate_section(draft)
            total_cost_policy += eval_cost # Accumulate cost
            total_tokens_policy += eval_tokens # Accumulate tokens

            # 3) Refine
            refined, refine_cost, refine_tokens = self.creator.refine_with_feedback(draft, feedback)
            refined_sections[s_title] = refined
            total_cost_policy += refine_cost # Accumulate cost
            total_tokens_policy += refine_tokens # Accumulate tokens
            logging.info(f"Section '{s_title}' processing complete.")

        # Combine all sections for final review
        entire_doc = ""
        for s_title, text in refined_sections.items():
            entire_doc += f"# {s_title}\n\n{text}\n\n"

        final_policy_output, final_review_cost, final_review_tokens = self.final_reviewer.review_document(entire_doc)
        total_cost_policy += final_review_cost # Accumulate cost
        total_tokens_policy += final_review_tokens # Accumulate tokens
       
        # Checklist Evaluation
        checklist_evaluation, checklist_eval_cost, checklist_eval_tokens = self.checklist_evaluator.evaluate_with_checklist(final_policy_output, checklist)
        total_cost_policy += checklist_eval_cost
        total_tokens_policy += checklist_eval_tokens
        
        final_policy = final_policy_output
        logging.info("Policy building process completed.")
        logging.info(f"Total document creation cost: ${total_cost_policy:.4f}") # Log total cost
        logging.info(f"Total tokens used for document creation: {total_tokens_policy}") # Log total tokens
        return final_policy, checklist_evaluation, toc_list

def main():
    api_key = os.environ["OPENAI_API_KEY"]
    chroma_collection = "risk_universe"

    overall_doc_description = (
        "This document sets out the company's Credit Risk Policy, "
        "aligned with Basel III and internal guidelines."
    )

    toc = {
        "section_1": {
            "title": "Introduction and Scope",
            "description": "Define all key terms"
        },
        "section_2": {
            "title": "Credit Risk Appetite and Strategy",
            "description": "Define the organisation's appetite for credit risk"
        },
        "section_3": {
            "title": "Governance and Compliance",
            "description": "Governance structure and compliance needs"
        }
    }
   
    checklist = """
    1. The policy clearly states its objectives.
    2. The policy aligns with Basel III regulatory requirements.
    3. Key credit risk terms are clearly defined.
    4. Roles and responsibilities for credit risk management are specified.
    5. Credit approval process is clearly defined.
    6. The policy covers credit risk assessment methodologies.
    7. Credit monitoring and reporting procedures are outlined.
    8. The policy addresses handling of problem loans and impairments.
    9. The policy includes details on collateral management.
    10. The policy has a process for regular review and updates.
    """

    agent = CreditRiskPolicyAgent(api_key, chroma_collection)
    logging.info("Starting to build final document...")
    final_document, checklist_evaluation, toc_list = agent.build_policy(overall_doc_description, toc, checklist)
    logging.info("Final document build complete.")
       
    # Convert to latex and generate pdf
    latex_converter = LatexConverter()
    latex_document = latex_converter.markdown_to_latex(final_document, toc_list)
    latex_converter.generate_pdf(latex_document, PDF_OUTPUT)
    logging.info("PDF creation completed")

    return final_document, checklist_evaluation

final_document, checklist_evaluation = main()
checklist_evaluation

2025-01-25 07:35:06,147 - INFO - Initializing CreditRiskPolicyAgent...
2025-01-25 07:35:07,047 - INFO - Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.
2025-01-25 07:35:08,063 - INFO - Initializing DocumentCreator...
2025-01-25 07:35:08,064 - INFO - DocumentCreator initialized.
2025-01-25 07:35:08,064 - INFO - Initializing DocumentEvaluator...
2025-01-25 07:35:08,065 - INFO - DocumentEvaluator initialized.
2025-01-25 07:35:08,065 - INFO - Initializing FinalReviewer...
2025-01-25 07:35:08,065 - INFO - FinalReviewer initialized.
2025-01-25 07:35:08,066 - INFO - Initializing ChecklistEvaluator...
2025-01-25 07:35:08,066 - INFO - ChecklistEvaluator initialized.
2025-01-25 07:35:08,067 - INFO - CreditRiskPolicyAgent initialized.
2025-01-25 07:35:08,067 - INFO - Starting to build final document...
2025-01-25 07:35:08,069 - INFO - Starting policy building process...
2025-01-25 07:35:08,069 - INFO - Processing section: 'Introduct