From 3fa4b06f6acf72c6b344594f24f66214f171473b Mon Sep 17 00:00:00 2001 From: Brona Nilsson Date: Mon, 3 Nov 2025 12:50:57 +0100 Subject: [PATCH 1/5] Add .gitignore for complex-document-rag folder --- .../complex-document-rag/.gitignore | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 ai/generative-ai-service/complex-document-rag/.gitignore diff --git a/ai/generative-ai-service/complex-document-rag/.gitignore b/ai/generative-ai-service/complex-document-rag/.gitignore new file mode 100644 index 000000000..01b77f130 --- /dev/null +++ b/ai/generative-ai-service/complex-document-rag/.gitignore @@ -0,0 +1,26 @@ +# macOS system files +.DS_Store + +.env +# Python cache +**/__pycache__/ + +# Virtual environments +venv/ + +# Local config +config.py + +# Data folders +data/ +embeddings/ +charts/ +reports/ + +# Logs +*.log +logs/ + +# Text files (except requirements.txt) +*.txt +!requirements.txt \ No newline at end of file From d40ffb5edd356e94517e2bffed2aaa5cc2c15ab7 Mon Sep 17 00:00:00 2001 From: Brona Nilsson Date: Mon, 3 Nov 2025 13:30:26 +0100 Subject: [PATCH 2/5] Update PDF chunker and entity extraction --- .../complex-document-rag/README.md | 379 ++++----- .../files/agents/agent_factory.py | 416 +++++---- .../files/agents/report_writer_agent.py | 786 ++++++++++-------- .../complex-document-rag/files/gradio.css | 202 ++++- .../complex-document-rag/files/gradio_app.py | 109 +-- .../files/handlers/pdf_handler.py | 4 +- .../files/handlers/query_handler.py | 34 +- .../files/handlers/vector_handler.py | 55 +- .../files/handlers/xlsx_handler.py | 26 +- .../complex-document-rag/files/ingest_pdf.py | 432 +++++++--- .../complex-document-rag/files/ingest_xlsx.py | 98 ++- .../files/local_rag_agent.py | 95 ++- .../files/oci_embedding_handler.py | 7 + .../files/requirements.txt | 2 +- .../files/vector_store.py | 606 +++++++------- 15 files changed, 1875 insertions(+), 1376 deletions(-) diff --git a/ai/generative-ai-service/complex-document-rag/README.md b/ai/generative-ai-service/complex-document-rag/README.md index dc0a5c3d7..71659174a 100644 --- a/ai/generative-ai-service/complex-document-rag/README.md +++ b/ai/generative-ai-service/complex-document-rag/README.md @@ -1,261 +1,240 @@ -# RAG Report Generator +# Enterprise RAG Report Generator with Oracle OCI Gen AI -An enterprise-grade Retrieval-Augmented Generation (RAG) system for generating comprehensive business reports from multiple document sources using Oracle Cloud Infrastructure (OCI) Generative AI services. +A sophisticated Retrieval-Augmented Generation (RAG) system built with Oracle OCI Generative AI, designed for enterprise document analysis and automated report generation. This application processes complex documents (PDFs, Excel files) and generates comprehensive analytical reports using multi-agent workflows. 
-Reviewed date: 22.09.2025 +**Reviewed: 19.09.2025** -## Features +## Features -- **Multi-Document Processing**: Ingest and process PDF and XLSX documents -- **Multiple Embedding Models**: Support for Cohere multilingual and v4.0 embeddings -- **Advanced LLM Support**: Integration with OCI models (Grok-3, Grok-4, Llama 3.3, Cohere Command) -- **Agentic Workflows**: Multi-agent system for intelligent report generation -- **Hierarchical Report Structure**: Automatically organizes content based on user queries -- **Citation Tracking**: Source attribution with references -- **Multi-Language Support**: Generate reports in English, Arabic, Spanish, and French -- **Visual Analytics**: Automatic chart and table generation from data +### Document Processing +- **Multi-format Support**: Process PDF documents and Excel spreadsheets (.xlsx, .xls) +- **Entity-aware Ingestion**: Automatically detect and tag entities within documents +- **Smart Chunking**: Intelligent document segmentation with context preservation +- **Multi-language Support**: Powered by Cohere's multilingual embedding models + +### Advanced RAG Capabilities +- **Multi-Collection Search**: Query across different document collections simultaneously +- **Hybrid Search**: Combine vector similarity and keyword matching for optimal results +- **Entity Filtering**: Filter search results by specific organizations or entities +- **Dimension-aware Storage**: Automatic handling of different embedding model dimensions + +### Intelligent Report Generation +- **Multi-Agent Architecture**: Specialized agents for planning, research, and writing +- **Comparison Reports**: Generate side-by-side comparisons of multiple entities +- **Structured Output**: Automated section generation with tables and charts +- **Chain-of-Thought Reasoning**: Advanced reasoning capabilities for complex queries + +### Model Flexibility +- **Multiple LLM Support**: + - Grok-3 and Grok-4 + - Llama 3.3 + - Cohere Command + - Dedicated AI Clusters (DAC) +- **Embedding Model Options**: + - Cohere Multilingual (1024D) + - ChromaDB Default (384D) + - Custom OCI embeddings + +### User Interface +- **Gradio Web Interface**: Clean, intuitive UI for document processing and querying +- **Vector Store Viewer**: Explore and manage your document collections +- **Real-time Progress Tracking**: Monitor processing and generation status +- **Report Downloads**: Export generated reports in Markdown format ## Prerequisites -- Python 3.11+ -- OCI Account with Generative AI service access -- OCI CLI configured with appropriate credentials +### Oracle OCI Configuration +- Set up your Oracle Cloud Infrastructure (OCI) account +- Obtain the following: + - Compartment OCID + - Generative AI Service Endpoint + - Model IDs for your chosen LLMs + - API keys and authentication credentials +- Configure your `~/.oci/config` file with your profile details -## Installation +### Python Environment +- Python 3.8 or later +- Virtual environment recommended +- Sufficient disk space for vector storage -1. Clone the repository: -```bash -git clone -cd agentic_rag -``` +## Installation + +1. **Clone the repository:** -2. Create a virtual environment: +2. **Create and activate a virtual environment:** ```bash python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate ``` -3. Install dependencies: +3. **Install dependencies:** ```bash -pip install -r requirements.txt +pip install -r files/requirements.txt ``` -4. Configure OCI credentials: +4. 
**Configure environment variables:** ```bash -# Create OCI config directory if it doesn't exist -mkdir -p ~/.oci - -# Add your OCI configuration to ~/.oci/config -# See: https://docs.oracle.com/en-us/iaas/Content/API/Concepts/sdkconfig.htm +cp files/.env.example files/.env +# Edit .env with your OCI credentials and model IDs ``` -5. Set up environment variables: -```bash -# Create .env file with your configuration -cat > .env << EOF -# OCI Configuration -OCI_COMPARTMENT_ID=your-compartment-id -COMPARTMENT_ID_DAC=your-dac-compartment-id # If using dedicated cluster - -# Model IDs (get from OCI Console) -OCI_GROK_3_MODEL_ID=your-grok3-model-id -OCI_GROK_4_MODEL_ID=your-grok4-model-id -OCI_LLAMA_3_3_MODEL_ID=your-llama-model-id -OCI_COHERE_COMMAND_A_MODEL_ID=your-cohere-model-id - -# Default Models (optional) -DEFAULT_EMBEDDING_MODEL=cohere-embed-multilingual-v3.0 -DEFAULT_LLM_MODEL=grok-3 -EOF +5. **Set up OCI configuration:** +Ensure your `~/.oci/config` file contains: +```ini +[DEFAULT] +user=ocid1.user.oc1..xxxxx +fingerprint=xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx +tenancy=ocid1.tenancy.oc1..xxxxx +region=us-chicago-1 +key_file=~/.oci/oci_api_key.pem ``` -## Quick Start +## Usage -1. Launch the Gradio interface: +### Starting the Application ```bash +cd files python gradio_app.py ``` - -2. Open your browser to `http://localhost:7863` - -3. Follow these steps in the interface: - - **Document Processing Tab**: Upload and process your documents (PDF/XLSX) - see samples in sample_data folder - - **Vector Store Viewer Tab**: View and manage your document collections - - **Inference & Query Tab**: Enter queries and generate reports - see sample queries in sample_queries folder - -## Usage Guide - -### Document Processing - -1. Select an embedding model (e.g., cohere-embed-multilingual-v3.0) -2. Upload documents: - - **XLSX**: Financial data, ESG metrics, structured data - - **PDF**: Reports, policies, unstructured documents -3. Specify the entity name for each document, i.e. the bank or institition's name -4. Click "Process" to ingest into the vector store - -### Generating Reports - -1. In the **Inference & Query** tab: - - Enter your query (can be structured with numbered sections) - - Select LLM model (Grok-3 recommended for reports) - - Choose data sources (PDF/XLSX collections) - - Enable "Agentic Workflow" for comprehensive multi-agent reports - - Click "Run Query" - -2. Example structured query: -``` -Prepare a comprehensive ESG comparison report between Company A and Company B: - -1) Climate Impact & Emissions - - Net-zero commitments and targets - - Scope 1, 2, and 3 emissions - -2) Social & Governance - - Diversity targets - - Board oversight - -3) Financial Performance - - Revenue and profitability - - ESG investments +The application will launch at `http://localhost:7863` + +### Document Processing Workflow + +1. **Upload Documents** + - Navigate to the "DOCUMENT PROCESSING" tab + - Select your embedding model + - Upload PDF or Excel files + - Specify the entity/organization name + - Click "Process" to ingest documents + +2. **Query Your Documents** + - Go to the "INFERENCE & QUERY" tab + - Enter your query or question + - Select data sources (PDF/XLSX collections) + - Choose between standard or agentic workflow + - Click "Run Query" to generate response + +3. 
**Generate Reports** + - Enable "Use Agentic Workflow" for comprehensive reports + - Specify entities for comparison reports + - Download generated reports in Markdown format + +### Advanced Features + +**Vector Store Management:** +- View collection statistics +- Search across collections +- List and manage document chunks +- Delete collections when needed + +**Multi-Entity Comparison:** +```python +# Example: Compare ESG metrics between two companies +Query: "Compare sustainability initiatives" +Entity 1: "CompanyA" +Entity 2: "CompanyB" ``` -### Report Features - -Generated reports include: -- Executive summary addressing your specific query -- Hierarchically organized sections -- Data tables and visualizations -- Source citations [1], [2] for traceability -- References section with full source details -- Professional formatting (Times New Roman, black headings) - -## Project Structure - +## πŸ“ File Structure ``` -agentic_rag/ -β”œβ”€β”€ gradio_app.py # Main application interface -β”œβ”€β”€ local_rag_agent.py # Core RAG system logic -β”œβ”€β”€ vector_store.py # Vector database management -β”œβ”€β”€ oci_embedding_handler.py # OCI embedding services -β”œβ”€β”€ agents/ -β”‚ β”œβ”€β”€ agent_factory.py # Agent creation and management -β”‚ └── report_writer_agent.py # Report generation logic -β”œβ”€β”€ handlers/ -β”‚ β”œβ”€β”€ query_handler.py # Query processing -β”‚ β”œβ”€β”€ pdf_handler.py # PDF document processing -β”‚ β”œβ”€β”€ xlsx_handler.py # Excel document processing -β”‚ └── vector_handler.py # Vector store operations -β”œβ”€β”€ ingest_pdf.py # PDF ingestion pipeline -β”œβ”€β”€ ingest_xlsx.py # Excel ingestion pipeline -β”œβ”€β”€ sample_data/ # Sample documents for testing -β”œβ”€β”€ sample_queries/ # Example queries for reports -└── utils/ - └── demo_logger.py # Logging utilities +. 
+β”œβ”€β”€ files/ +β”‚ β”œβ”€β”€ gradio_app.py # Main application interface +β”‚ β”œβ”€β”€ local_rag_agent.py # RAG system core logic +β”‚ β”œβ”€β”€ vector_store.py # Vector storage management +β”‚ β”œβ”€β”€ oci_embedding_handler.py # OCI embedding integration +β”‚ β”œβ”€β”€ disable_telemetry.py # Telemetry management +β”‚ β”œβ”€β”€ agents/ # Multi-agent components +β”‚ β”‚ β”œβ”€β”€ agent_factory.py # Agent initialization +β”‚ β”‚ └── report_writer_agent.py # Report generation +β”‚ β”œβ”€β”€ handlers/ # Document processors +β”‚ β”‚ β”œβ”€β”€ pdf_handler.py # PDF processing +β”‚ β”‚ β”œβ”€β”€ xlsx_handler.py # Excel processing +β”‚ β”‚ └── query_handler.py # Query processing +β”‚ └── requirements.txt # Python dependencies +β”œβ”€β”€ README.md # Project documentation +└── LICENSE # License information ``` -## Advanced Configuration - -### Embedding Models +## Screenshots -Available embedding models: -- `cohere-embed-multilingual-v3.0` (1024 dimensions) -- `cohere-embed-v4.0` (1024 dimensions) -- `chromadb-default` (384 dimensions, local) +### Main Interface +[Screenshot: Document Processing Tab] -### LLM Models +### Query Interface +[Screenshot: Inference & Query Tab] -Supported OCI Generative AI models: -- **Grok-3**: Best for comprehensive reports (16K output tokens) -- **Grok-4**: Advanced reasoning (120K output tokens) -- **Llama 3.3**: Fast inference (4K output tokens) -- **Cohere Command**: Instruction following (4K output tokens) +### Vector Store Viewer +[Screenshot: Collection Management] -### Vector Store Management +### Generated Report Example +[Screenshot: Sample Report Output] -- Collections are automatically created per embedding model -- Switch between models without data loss -- Delete collections via the Vector Store Viewer tab +## πŸ”§ Configuration -## Troubleshooting - -### Common Issues +### Model Selection +Configure available models in your `.env` file: +```env +# LLM Models +OCI_GROK_3_MODEL_ID=ocid1.generativeaimodel.oc1... +OCI_GROK_4_MODEL_ID=ocid1.generativeaimodel.oc1... +OCI_LLAMA_3_3_MODEL_ID=ocid1.generativeaimodel.oc1... -1. **OCI Authentication Error** - - Verify ~/.oci/config is properly configured - - Check compartment ID in .env file - - Ensure your user has appropriate IAM policies +# Embedding Models +DEFAULT_EMBEDDING_MODEL=cohere-embed-multilingual-v3.0 -2. **Embedding Model Errors** - - Verify model IDs in .env file - - Check OCI service limits and quotas - - Ensure embedding service is enabled in your region +# Compartment Configuration +OCI_COMPARTMENT_ID=ocid1.compartment.oc1... +``` -3. **Memory Issues** - - For large documents, process in smaller batches - - Adjust chunk size in ingestion settings - - Consider using pagination for large result sets +### Performance Tuning +- Adjust chunk sizes in `ingest_pdf.py` and `ingest_xlsx.py` +- Configure parallel processing in `report_writer_agent.py` +- Modify token limits in model configurations -### Logs +## πŸ› Troubleshooting -Check `logs/app.log` for detailed debugging information. 
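+
+If authentication keeps failing, a quick sanity check is to load the profile with the OCI Python SDK directly (the config path and profile name below are assumptions; adjust them to your setup):
+
+```python
+import oci
+
+# Loads ~/.oci/config and validates the DEFAULT profile; raises if a
+# required key is missing or malformed
+config = oci.config.from_file("~/.oci/config", "DEFAULT")
+oci.config.validate_config(config)
+print(config["region"])  # confirms which region the SDK will target
+```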
+### Common Issues -## API Usage (Optional) +**Vector Store Dimension Mismatch:** +- Ensure consistent embedding model usage +- Clear existing collections when switching models +- Check collection metadata for dimension conflicts -For programmatic access: +**OCI Authentication Errors:** +- Verify `~/.oci/config` configuration +- Check API key permissions +- Ensure compartment access rights -```python -from local_rag_agent import RAGSystem -from vector_store import EnhancedVectorStore - -# Initialize system -vector_store = EnhancedVectorStore( - persist_directory="embed-cohere-embed-multilingual-v3.0", - embedding_model="cohere-embed-multilingual-v3.0" -) - -rag_system = RAGSystem( - vector_store=vector_store, - model_name="grok-3", - use_cot=True -) - -# Process query -response = rag_system.process_query("Your query here") -print(response["answer"]) -``` +**Memory Issues:** +- Reduce chunk sizes for large documents +- Consider using dedicated AI clusters for heavy workloads ## Contributing +This project welcomes contributions from the community. Before submitting a pull request, please: + 1. Fork the repository 2. Create a feature branch -3. Make your changes -4. Run tests: `python -m pytest tests/` -5. Submit a pull request +3. Commit your changes +4. Push to the branch +5. Open a pull request -## License +Please review our contribution guidelines for coding standards and best practices. -[Your License Here] +## πŸ”’ Security -## Support +Please consult the security guide for our responsible security vulnerability disclosure process. Report security issues to the maintainers privately. -For issues and questions: -- Check the logs in `logs/app.log` -- Review the troubleshooting section -- Open an issue on GitHub +## πŸ“„ License -## Acknowledgments +Copyright (c) 2024 Oracle and/or its affiliates. -- Oracle Cloud Infrastructure for Generative AI services -- Gradio for the web interface -- ChromaDB for vector storage -- The open-source community +Licensed under the Universal Permissive License (UPL), Version 1.0. -## License -Copyright (c) 2025 Oracle and/or its affiliates. +See LICENSE for more details. -Licensed under the Universal Permissive License (UPL), Version 1.0. +## ⚠️ Disclaimer -See [LICENSE](LICENSE.txt) for more details. +ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK. -ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. 
IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK. diff --git a/ai/generative-ai-service/complex-document-rag/files/agents/agent_factory.py b/ai/generative-ai-service/complex-document-rag/files/agents/agent_factory.py index 9987f0113..e4332cab9 100644 --- a/ai/generative-ai-service/complex-document-rag/files/agents/agent_factory.py +++ b/ai/generative-ai-service/complex-document-rag/files/agents/agent_factory.py @@ -445,38 +445,45 @@ def _process_batch(self, batch: List[Dict[str, Any]]) -> List[str]: prompt = "\n".join(prompt_parts) self.log_prompt(prompt, f"ChunkRewriter (Batch of {len(batch)})") - response = self.llm.invoke([DummyMessage(prompt)]) - - # Handle different LLM response styles - if hasattr(response, "content"): - text = response.content.strip() - elif isinstance(response, list) and isinstance(response[0], dict): - text = response[0].get("generated_text") or response[0].get("text") - if not text: - raise ValueError("⚠️ No valid 'generated_text' found in response.") - text = text.strip() - else: - raise TypeError(f"⚠️ Unexpected response type: {type(response)} β€” {response}") - - self.log_response(text, f"ChunkRewriter (Batch of {len(batch)})") - rewritten_chunks = self._parse_batch_response(text, len(batch)) - rewritten_chunks = [self._clean_chunk_text(chunk) for chunk in rewritten_chunks] - - # Enhanced logging with side-by-side comparison - paired = list(zip(batch, rewritten_chunks)) - for i, (original_chunk, rewritten_text) in enumerate(paired, 1): - # Get the actual raw chunk text, not the metadata - original_text = original_chunk.get("text", "") - metadata = original_chunk.get("metadata", {}) - - # Use demo logger for visual comparison if available - if DEMO_MODE and hasattr(logger, 'chunk_comparison'): - # Pass the actual chunk text, not metadata - logger.chunk_comparison(original_text, rewritten_text, metadata) + try: + response = self.llm.invoke([DummyMessage(prompt)]) + + # Handle different LLM response styles + if hasattr(response, "content"): + text = response.content.strip() + elif isinstance(response, list) and isinstance(response[0], dict): + text = response[0].get("generated_text") or response[0].get("text") + if not text: + raise ValueError("⚠️ No valid 'generated_text' found in response.") + text = text.strip() else: - logger.info(f"βš™ Rewritten Chunk {i}:\n{rewritten_text}\nMetadata: {json.dumps(metadata, indent=2)}\n") - - return rewritten_chunks + raise TypeError(f"⚠️ Unexpected response type: {type(response)} β€” {response}") + + self.log_response(text, f"ChunkRewriter (Batch of {len(batch)})") + rewritten_chunks = self._parse_batch_response(text, len(batch)) + rewritten_chunks = [self._clean_chunk_text(chunk) for chunk in rewritten_chunks] + + # Enhanced logging with side-by-side comparison + paired = list(zip(batch, rewritten_chunks)) + for i, (original_chunk, rewritten_text) in enumerate(paired, 1): + # Get the actual raw chunk text, not the metadata + original_text = original_chunk.get("text", "") + metadata = original_chunk.get("metadata", {}) + + # Use demo logger for visual comparison if available + if DEMO_MODE and hasattr(logger, 'chunk_comparison'): + # Pass the actual chunk text, not metadata + logger.chunk_comparison(original_text, rewritten_text, metadata) + else: + logger.info(f"βš™ Rewritten Chunk {i}:\n{rewritten_text}\nMetadata: {json.dumps(metadata, indent=2)}\n") + + return rewritten_chunks + + except Exception as e: 
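+            # Contract note: a failed batch is reported as one None per input
+            # chunk (never empty strings), so callers can retry or fall back
+            # to the original chunk text instead of persisting blank rewrites.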
+ # Handle timeout and other errors gracefully + logger.error(f"❌ Batch processing failed: {e}") + # Return None for each chunk to indicate failure (not empty strings!) + return [None] * len(batch) def _parse_batch_response(self, response_text: str, expected_chunks: int) -> List[str]: @@ -581,8 +588,7 @@ def _detect_comparison_query(self, query: str) -> bool: """Use LLM to detect whether the query involves a comparison.""" prompt = f""" Does the query below involve a **side-by-side comparison between two or more named entities such as companies, organizations, or products**? - -Exclude comparisons to frameworks (e.g., CSRD, ESRS), legal standards, or regulations β€” those do not count. +Include comparisons to frameworks (e.g., CSRD, ESRS), legal standards, or regulations. Query: "{query}" @@ -641,228 +647,206 @@ def extract_first_json_list(text): return re.findall(r'"([^"]+)"', text) def _extract_entities(self, query: str) -> List[str]: - """Use LLM to extract entity names, then normalize + dedupe.""" - prompt = f""" -Extract company/organization names mentioned in the query and return a CLEANED JSON list. + """Prefer exact vector-store tags typed by the user; LLM only as fallback.""" + import re + logger = getattr(self, "logger", None) or __import__("logging").getLogger(__name__) -CLEANING RULES (apply to each name before returning): -- Lowercase everything. -- Remove legal suffixes at the end: plc, ltd, inc, llc, lp, l.p., corp, corporation, co., co, s.a., s.a.s., ag, gmbh, bv, nv, oy, ab, sa, spa, pte, pvt, pty, srl, sro, k.k., kk, kabushiki kaisha. -- Remove punctuation except internal ampersands (&). Collapse multiple spaces. -- No duplicates. + # --- 0) known tag set from your vector store (lowercased) --- + # Populate this once at init: self.known_tags = {id.lower() for id in vector_store_ids()} + known = getattr(self, "known_tags", None) -CONSTRAINTS: -- Return ONLY a JSON list of strings, e.g. ["aelwyn","elinexa"] -- No prose, no keys, no explanations. -- Do not include standards, clause numbers, sectors, or generic words like "entity". -- If none are present, return []. + tagged = [] -Examples: -Query: "Compare Aelwyn vs Elinexa PLC policies" -Return: ["aelwyn","elinexa"] + # A) Existing FY/Q pattern (kept) + tagged += [m.group(0) for m in re.finditer( + r"\b[A-Za-z][A-Za-z0-9\-]*_(?:FY|Q[1-4])\d{2,4}\b", query, flags=re.I + )] -Query: "Barclays (UK) and JPMorgan Chase & Co." -Return: ["barclays","jpmorgan chase & co"] + # B) NEW: generic "_" e.g., "mof_2022", "mof_2024" + tagged += [m.group(0) for m in re.finditer( + r"\b[A-Za-z][A-Za-z0-9\-]*_\d{2,4}\b", query + )] -Query: "What are Microsoft’s 2030 targets?" -Return: ["microsoft"] + # C) (Optional but useful) quoted tokens like "mof_2022" + tagged += [m.group(1) for m in re.finditer( + r'"([A-Za-z0-9][A-Za-z0-9_\-]{1,80})"', query + )] -Query: "No company here" -Return: [] + # De-dup preserve order (case-insensitive) + seen = set() + tagged_unique: List[str] = [] + for t in tagged: + k = t.lower() + if k not in seen: + # If we know the store IDs, only keep those that exist + if not known or k in known: + seen.add(k) + tagged_unique.append(t) + + # --- Early return: if user typed valid tags, trust them verbatim --- + if tagged_unique: + if logger: + logger.info(f"[Entity Extractor] Exact tags: {tagged_unique}") + return tagged_unique + + # --- Fallback: your original LLM extraction (unchanged) --- + prompt = f""" + Extract company/organization names mentioned in the query and return a CLEANED JSON list. 
-Now process this query: + CLEANING RULES (apply to each name before returning): + - Lowercase everything. + - Remove legal suffixes at the end: plc, ltd, inc, llc, lp, l.p., corp, corporation, co., co, s.a., s.a.s., ag, gmbh, bv, nv, oy, ab, sa, spa, pte, pvt, pty, srl, sro, k.k., kk, kabushiki kaisha. + - Remove punctuation except internal ampersands (&). Collapse multiple spaces. + - No duplicates. -{query} -""" + CONSTRAINTS: + - Return ONLY a JSON list of strings, e.g. ["aelwyn","elinexa"] + - No prose, no keys, no explanations. + - Do not include standards, clause numbers, sectors, or generic words like "entity". + - If none are present, return []. + + Now process this query: + + {query} + """ try: raw = self.llm(prompt).strip() - print(raw) entities = self.extract_first_json_list(raw) - # Keep strings only and strip whitespace entities = [e.strip() for e in entities if isinstance(e, str) and e.strip()] - # Deduplicate while preserving order - seen = set() - cleaned: List[str] = [] + final: List[str] = [] + seen2 = set() + for e in entities: - if e.lower() not in seen: - seen.add(e.lower()) - cleaned.append(e) + k = e.lower() + if (not known or k in known) and k not in seen2: + seen2.add(k) + final.append(e) - if not cleaned: - logger.warning(f"[Entity Extractor] No plausible entities extracted from LLM output: {entities}") + if not final and logger: + logger.warning(f"[Entity Extractor] No plausible entities extracted. LLM: {entities} | tags: []") - logger.info(f"[Entity Extractor] Raw: {raw} | Cleaned: {cleaned}") - return cleaned + if logger: + logger.info(f"[Entity Extractor] Raw: {raw} | Tags: [] | Final: {final}") + return final except Exception as e: - logger.warning(f"⚠️ Failed to robustly extract entities via LLM: {e}") + if logger: + logger.warning(f"⚠️ Failed to robustly extract entities via LLM: {e}") return [] + def plan( - self, - query: str, - context: List[Dict[str, Any]] | None = None, - is_comparison_report: bool = False - ) -> tuple[list[Dict[str, Any]], list[str], bool]: - """ - Strategic planner that returns structured topics with steps. - Supports both comparison and single-entity analysis with consistent output format. + self, + query: str, + context: List[Dict[str, Any]] | None = None, + is_comparison_report: bool = False, + comparison_mode: str | None = None, # kept for compatibility, not used to hardcode content + provided_entities: Optional[List[str]] = None + ) -> tuple[list[Dict[str, Any]], list[str], bool]: """ - raw = None - is_comparison = self._detect_comparison_query(query) or is_comparison_report - entities = self._extract_entities(query) - logger.info(f"[Planner] Detected entities: {entities} | Comparison task: {is_comparison}") - - if is_comparison and len(entities) < 2: - logger.warning(f"⚠️ Comparison task detected but only {len(entities)} entity found: {entities}") - is_comparison = False # fallback to single-entity mode + PROMPT-DRIVEN PLANNER + - Derive section topics from the user's TASK PROMPT (not hardcoded). + - For each topic, emit one mirrored retrieval step per entity. + - Output shape: List[{"topic": str, "steps": List[str]}], plus (entities, is_comparison). - ctx = "\n".join(f"{i+1}. {c['content']}" for i, c in enumerate(context or [])) - - if is_comparison: - template = """ - You are a strategic planning agent generating grouped research steps for a comparative analysis report. 
+ Returns: + (plan, entities, is_comparison) + """ - TASK: {query} + # 1) Determine comparison intent and entities (keep your existing logic) + is_comparison = self._detect_comparison_query(query) or is_comparison_report - OBJECTIVE: - Break the task into high-level comparison **topics**. For each topic, generate **two steps** β€” one per entity. + if provided_entities: + entities = [e for e in provided_entities if isinstance(e, str) and e.strip()] + logger.info(f"[Planner] Using provided entities: {entities}") + else: + entities = self._extract_entities(query) + logger.info(f"[Planner] Detected entities: {entities} | Comparison task: {is_comparison}") - RULES: - - Keep topic titles focused and distinct (e.g., "Scope 1 Emissions") - - Use a consistent step format: "Find (something) for (Entity)" - - Use only these entities: {entities} + # If comparison requested but <2 entities, degrade gracefully to single-entity mode + if is_comparison and len(entities) < 2: + logger.warning(f"⚠️ Comparison requested but only {len(entities)} entity found: {entities}. Falling back to single-entity.") + is_comparison = False + # 2) Ask the LLM ONLY for topics (strings), not full objects β€” we’ll build steps ourselves + # This avoids fragile JSON with missing "topic" keys. + topic_prompt = f""" +Extract the main section topics from the TASK PROMPT. +Use the user's own headings/bullets/order when present. +If none are explicit, infer 5–10 concise, non-overlapping topics that reflect the user's request. - EXAMPLE: - [ - {{ - "topic": "Net-Zero Targets", - "steps": [ - "Find net-zero targets for Company-A", - "Find net-zero targets for Company-B" - ] - }} - ] +TASK PROMPT: +{query} - TASK: {query} +Return ONLY a JSON array of strings, e.g. ["Executive Summary","Revenue Analysis","Profitability"]. +No prose, no keys, no markdown. +""" + self.log_prompt(topic_prompt, "Planner: Topic Extraction") + raw_topics = None + topics: list[str] = [] + try: + raw_topics = self.llm(topic_prompt).strip() + json_str = UniversalJSONCleaner.clean_and_extract_json(raw_topics, expected_type="array") + parsed = UniversalJSONCleaner.parse_with_validation(json_str, expected_structure=None) + if isinstance(parsed, list): + # Keep only non-empty strings + topics = [str(t).strip() for t in parsed if isinstance(t, (str, int, float)) and str(t).strip()] + except Exception as e: + logger.error(f"❌ Topic extraction failed: {e}") + logger.debug(f"Raw topic response:\n{raw_topics}") + + # 2b) Hard fallback: if still empty, derive topics from obvious headings in the query + if not topics: + # Grab capitalized/bulleted lines as headings + lines = [ln.strip() for ln in (query or "").splitlines()] + bullets = [ln.lstrip("-*β€’ ").strip() for ln in lines if ln.strip().startswith(("-", "*", "β€’"))] + caps = [ln for ln in lines if ln and ln == ln.title() and len(ln.split()) <= 8] + candidates = bullets or caps + if candidates: + topics = [t for t in candidates if len(t) >= 3][:10] + + # 2c) Ultimate fallback: generic buckets (kept minimal, not domain-specific) + if not topics: + topics = [ + "Executive Summary", + "Key Metrics", + "Section 1", + "Section 2", + "Section 3", + "Risks & Considerations", + "Conclusion" + ] - ENTITIES: {entities} - Respond ONLY with valid JSON. - Use standard double quotes (") for all JSON keys and string values. - You MAY and SHOULD use single quotes (') *inside* string values for possessives (e.g., "CEO's"). - Do NOT use curly or smart quotes. - Do NOT write `"CEO"s"`, only `"CEO's"`. 
- """ - else: - if not entities: - logger.warning("⚠️ No entity found in query β€” using fallback") - entities = ["The Company"] - template = """ - You are a planning agent decomposing a task for a single entity into structured research topics. - -TASK: {query} - -OBJECTIVE: -Break this into 3–10 key topics. Under each topic, include 1–2 retrieval-friendly steps. - -RULES: -- Keep topics distinct and concrete (e.g., Carbon Disclosure) -- Use only these entities: {entities} -- Use a consistent step format: "Find (something) for (Entity)" - -EXAMPLE: -[ -{{ - "topic": "Carbon Disclosure for Company-A", - "steps": [ - "Find 2023 Scope 1 and 2 emissions for Company-A" - ] -}}, -{{ - "topic": "Company-A Diversity Strategy", - "steps": [ - "Analyze gender and ethnicity diversity at Company-A" - ] -}} -] -Respond ONLY with valid JSON. -Do NOT use possessive forms (e.g., do NOT write "Aelwyn's Impact"). Instead, write "Impact for Aelwyn" or "Impact of Aelwyn". -Use the format: "Find (something) for (Entity)" -Do NOT use curly or smart quotes. + # 3) Build plan objects and MIRROR steps across entities (no hardcoded content) + plan: list[dict] = [] + for t in topics: + t_clean = str(t).strip() + if not t_clean: + continue - """ + if is_comparison and len(entities) >= 2: + # One retrieval step per entity β€” mirrored wording + steps = [f"Find all items requested under '{t_clean}' for {entities[0]}", + f"Find all items requested under '{t_clean}' for {entities[1]}"] + else: + # Single entity (or unknown) + e0 = entities[0] if entities else "The Entity" + steps = [f"Find all items requested under '{t_clean}' for {e0}"] - messages = ChatPromptTemplate.from_template(template).format_messages( - query=query, - context=ctx, - entities=entities - ) - full_prompt = "\n".join(str(m.content) for m in messages) - self.log_prompt(full_prompt, "Planner") + plan.append({"topic": t_clean, "steps": steps}) + # 4) Log and return try: - raw = self.llm.invoke(messages).content.strip() - self.log_response(raw, "Planner") - cleaned = UniversalJSONCleaner.clean_and_extract_json(raw, expected_type="array") + self.log_response(json.dumps(plan, ensure_ascii=False, indent=2), "Planner: Plan (topicsβ†’steps)") + except Exception: + pass - plan = UniversalJSONCleaner.parse_with_validation( - cleaned, expected_structure="Array of objects with 'topic' and 'steps' keys" - ) + return plan, entities, is_comparison - if not isinstance(plan, list): - raise ValueError("Parsed plan is not a list") - - for section in plan: - if not isinstance(section, dict): - raise ValueError("Section is not a dict") - if "topic" not in section or "steps" not in section: - raise ValueError("Missing 'topic' or 'steps'") - if not isinstance(section["topic"], str): - raise ValueError("Topic must be a string") - if not isinstance(section["steps"], list): - raise ValueError("Steps must be a list") - if not all(isinstance(s, str) for s in section["steps"]): - raise ValueError("Each step must be a string") - - # Optional: Validate entity inclusion if this was a comparison task - if is_comparison and entities: - for section in plan: - step_text = " ".join(section["steps"]).lower() - for entity in entities: - if entity.lower() not in step_text: - logger.warning( - f"⚠️ Entity '{entity}' not found in steps for topic: '{section['topic']}'" - ) - - return plan, entities, is_comparison - except Exception as e: - logger.error(f"❌ Failed to parse planner output: {e}") - logger.error(f"Raw response:\n{raw}") - # Attempt a minimal prompt instead of hardcoded fallback - 
            try:
-                fallback_prompt = f"""
-                Return a JSON list of 5 objects like this:
-                [{{
-                    "topic": "X and Y",
-                    "steps": ["Find X for The Company", "Analyze Y for The Company"]
-                }}]
-                TASK: {query}
-                Respond with valid JSON
-                """
-                raw_fallback = self.llm(fallback_prompt).strip()
-                cleaned_fallback = UniversalJSONCleaner.clean_and_extract_json(raw_fallback)
-                fallback_plan = UniversalJSONCleaner.parse_with_validation(
-                    cleaned_fallback, expected_structure="Array of objects with 'topic' and 'steps' keys"
-                )
-                return fallback_plan, entities, is_comparison
-            except Exception as inner_e:
-                logger.error(f"πŸ›‘ Fallback planner also failed: {inner_e}")
-                raise RuntimeError("Both planner and fallback planner failed") from inner_e
 
 
 class ResearchAgent(Agent):
diff --git a/ai/generative-ai-service/complex-document-rag/files/agents/report_writer_agent.py b/ai/generative-ai-service/complex-document-rag/files/agents/report_writer_agent.py
index f11ddde9e..302f1577e 100644
--- a/ai/generative-ai-service/complex-document-rag/files/agents/report_writer_agent.py
+++ b/ai/generative-ai-service/complex-document-rag/files/agents/report_writer_agent.py
@@ -5,63 +5,226 @@
 import uuid
 import logging
 import datetime
 import matplotlib.pyplot as plt
-
 import math
+import re
+from docx.oxml.shared import OxmlElement
+from docx.text.run import Run
 
 logger = logging.getLogger(__name__)
 logging.basicConfig(level=logging.INFO)
 os.makedirs("charts", exist_ok=True)
 
 
+_MD_TOKEN_RE = re.compile(r'(\*\*.*?\*\*|__.*?__|\*.*?\*|_.*?_)')
+
+def add_inline_markdown_paragraph(doc, text: str):
+    """
+    Creates a paragraph and renders lightweight inline Markdown:
+    **bold** or __bold__ β†’ bold run
+    *italic* or _italic_ β†’ italic run
+    Everything else is plain text. No links/lists/code handling.
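+    Matching is non-greedy and markers do not nest: "**a *b* c**" becomes a
+    single bold run whose text keeps the inner single asterisks verbatim.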
+ """ + p = doc.add_paragraph() + i = 0 + for m in _MD_TOKEN_RE.finditer(text): + # leading text + if m.start() > i: + p.add_run(text[i:m.start()]) + token = m.group(0) + # strip the markers + if token.startswith('**') or token.startswith('__'): + content = token[2:-2] + run = p.add_run(content) + run.bold = True + else: + content = token[1:-1] + run = p.add_run(content) + run.italic = True + i = m.end() + # trailing text + if i < len(text): + p.add_run(text[i:]) + return p + def add_table(doc, table_data): - """Create a professionally styled Word table from list of dicts.""" + """Create a Word table from list of dicts or list of lists, robustly.""" if not table_data: return - + headers = [] - seen = set() - for row in table_data: - for k in row.keys(): - if k not in seen: - headers.append(k) - seen.add(k) - - # Create table with proper styling + rows_normalized = [] + + # Case 1: list of dicts + if isinstance(table_data[0], dict): + seen = set() + for row in table_data: + for k in row.keys(): + if k not in seen: + headers.append(k) + seen.add(k) + rows_normalized = table_data + + # Case 2: list of lists + elif isinstance(table_data[0], (list, tuple)): + max_len = max(len(row) for row in table_data) + headers = [f"Col {i+1}" for i in range(max_len)] + for row in table_data: + rows_normalized.append({headers[i]: row[i] if i < len(row) else "" + for i in range(max_len)}) + + else: + headers = ["Value"] + rows_normalized = [{"Value": str(row)} for row in table_data] + table = doc.add_table(rows=1, cols=len(headers)) table.style = 'Table Grid' - - # Style header row + header_row = table.rows[0] for i, h in enumerate(headers): cell = header_row.cells[i] cell.text = str(h) - # Make header bold for paragraph in cell.paragraphs: for run in paragraph.runs: run.bold = True - # Add data rows - for row in table_data: + for row in rows_normalized: row_cells = table.add_row().cells for i, h in enumerate(headers): row_cells[i].text = str(row.get(h, "")) +def _color_for_label(label: str, entities: list[str] | tuple[str, ...] | None, + base="#a9bbbc", e1="#437c94", e2="#c74634") -> str: + """Pick a bar color based on whether a label mentions one of the entities.""" + if not entities: + return base + lbl = label.lower() + ents = [e for e in entities if isinstance(e, str)] + if len(ents) >= 1 and ents[0].lower() in lbl: + return e1 + if len(ents) >= 2 and ents[1].lower() in lbl: + return e2 + return base + + +def detect_units(chart_data: dict, title: str = "") -> str: + """Detect units of measure from chart data and title.""" + # Common patterns for currency + currency_patterns = [ + (r'\$|USD|usd|dollar', 'USD'), + (r'€|EUR|eur|euro', 'EUR'), + (r'Β£|GBP|gbp|pound', 'GBP'), + (r'Β₯|JPY|jpy|yen', 'JPY'), + (r'β‚Ή|INR|inr|rupee', 'INR'), + ] + + # Common patterns for other units - order matters! 
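+    # Ordering contract: the first three entries are magnitude scales
+    # (Million/Billion/Thousand) consumed by the [:3] slice below; everything
+    # from '%' onward is a unit of measure. Within each group, first match wins.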
+    unit_patterns = [
+        (r'million|millions|mn|mln|\$m|\$M', 'Million'),
+        (r'billion|billions|bn|bln|\$b|\$B', 'Billion'),
+        (r'thousand|thousands|\bk\b|\$k', 'Thousand'),  # \bk\b avoids matching the k inside words like kwh
+        (r'percentage|percent|%', '%'),
+        (r'tonnes|tons|tonne|ton', 'Tonnes'),
+        (r'co2e|CO2e|co2|CO2', 'CO2e'),
+        (r'kwh|kWh|KWH', 'kWh'),
+        (r'mwh|MWh|MWH', 'MWh'),
+        (r'kg|kilogram|kilograms', 'kg'),
+        (r'employees|headcount|people', 'Employees'),
+        (r'days|day', 'Days'),
+        (r'hours|hour|hrs', 'Hours'),
+        (r'years|year|yrs', 'Years'),
+    ]
+
+    # Check title and keys for units - also check values if they're strings
+    combined_text = title.lower() + " " + " ".join(str(k).lower() for k in chart_data.keys())
+    # Also check string values which might contain unit info
+    for v in chart_data.values():
+        if isinstance(v, str):
+            combined_text += " " + v.lower()
+
+    detected_currency = None
+    detected_scale = None
+    detected_unit = None
+
+    # Check for currency
+    for pattern, unit in currency_patterns:
+        if re.search(pattern, combined_text, re.IGNORECASE):
+            detected_currency = unit
+            break
+
+    # Check for scale (million, billion, etc.)
+    for pattern, unit in unit_patterns[:3]:  # First 3 are scales
+        if re.search(pattern, combined_text, re.IGNORECASE):
+            detected_scale = unit
+            break
+
+    # Check for other units
+    for pattern, unit in unit_patterns[3:]:  # Rest, including '%', are units of measure
+        if re.search(pattern, combined_text, re.IGNORECASE):
+            detected_unit = unit
+            break
+
+    # Combine detected elements
+    if detected_currency and detected_scale:
+        return f"{detected_scale} {detected_currency}"
+    elif detected_currency:
+        # If we detect currency but no scale, look for financial context clues
+        if 'revenue' in combined_text or 'sales' in combined_text or 'income' in combined_text:
+            # Financial data without explicit scale often means millions
+            if 'fy' in combined_text or 'fiscal' in combined_text or 'quarterly' in combined_text:
+                return "Million USD"  # Corporate financials are typically in millions
+            return detected_currency
+        return detected_currency
+    elif detected_unit:
+        if detected_scale and detected_unit not in ['%', 'Employees', 'Days', 'Hours', 'Years']:
+            return f"{detected_scale} {detected_unit}"
+        return detected_unit
+    elif detected_scale:
+        # If we only have scale (like "Million") without currency, check for financial context
+        if any(term in combined_text for term in ['revenue', 'cost', 'profit', 'income', 'sales', 'expense', 'financial']):
+            return f"{detected_scale} USD"
+        return detected_scale
+
+    # For financial metrics without explicit units, default to "Million USD"
+    if any(term in combined_text for term in ['revenue', 'sales', 'profit', 'income', 'cost', 'expense', 'financial', 'fiscal', 'fy20']):
+        return "Million USD"
+
+    return "Value"  # Default fallback
+
+
+def format_value_with_units(value: float, units: str) -> str:
+    """Format a value with appropriate precision based on units."""
+    if '%' in units:
+        return f"{value:.1f}%"
+    elif 'Million' in units or 'Billion' in units:
+        return f"{value:,.1f}"
+    elif value >= 1000:
+        return f"{value:,.0f}"
+    else:
+        return f"{value:.1f}"
+
+
+def make_chart(chart_data: dict, title: str = "",
+               entities: list[str] | tuple[str, ...] | None = None,
+               units: str | None = None) -> str | None:
+    """Generate a chart with conditional formatting and fallback for list values.
+    If `entities` contains up to two names, bars whose labels include those names
+    are highlighted in two distinct colors. Otherwise a default color is used.
+    Units are detected automatically or can be passed explicitly.
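+    Returns the path of the saved PNG, or None when no plottable numeric
+    values remain after cleaning (hence the str | None signature).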
+ """ -def make_chart(chart_data: dict, title: str = "") -> str | None: - """Generate a chart with conditional formatting and fallback for list values.""" - import numpy as np import textwrap os.makedirs("charts", exist_ok=True) clean = {} for k, v in chart_data.items(): - # NEW: Reduce lists to latest entry if all elements are numeric + # Reduce lists to latest numeric entry if isinstance(v, list): if all(isinstance(i, (int, float)) for i in v): - v = v[-1] # use the latest value + v = v[-1] else: continue @@ -78,47 +241,56 @@ def make_chart(chart_data: dict, title: str = "") -> str | None: labels = list(clean.keys()) values = list(clean.values()) + + # Detect units if not provided + if not units: + units = detect_units(chart_data, title) + + # Update title to include units if not already present + if units and units != "Value" and units.lower() not in title.lower(): + title = f"{title} ({units})" - # Decide chart orientation based on label length and count - create more variety + # Decide orientation max_label_length = max(len(label) for label in labels) if labels else 0 - - # More nuanced decision for chart orientation - if len(clean) > 12: # Many items -> horizontal + if len(clean) > 12: horizontal = True - elif max_label_length > 40: # Very long labels -> horizontal + elif max_label_length > 40: horizontal = True - elif len(clean) <= 4 and max_label_length <= 20: # Few items, short labels -> vertical + elif len(clean) <= 4 and max_label_length <= 20: horizontal = False - elif len(clean) <= 6 and max_label_length <= 30: # Medium items, medium labels -> vertical + elif len(clean) <= 6 and max_label_length <= 30: horizontal = False - else: # Default to horizontal for edge cases + else: horizontal = True - fig, ax = plt.subplots(figsize=(12, 8)) # Increased figure size for better readability + fig, ax = plt.subplots(figsize=(12, 8)) if horizontal: - # Wrap long labels for horizontal charts wrapped_labels = ['\n'.join(textwrap.wrap(label, width=40)) for label in labels] - bars = ax.barh(wrapped_labels, values, color=["#2e7d32" if "aelwyn" in l.lower() else "#f9a825" if "elinexa" in l.lower() else "#4472C4" for l in labels]) - ax.set_xlabel("Value") + colors = [_color_for_label(l, entities) for l in labels] + bars = ax.barh(wrapped_labels, values, color=colors) + ax.set_xlabel(units) # Use detected units instead of "Value" ax.set_ylabel("Category") for bar in bars: width = bar.get_width() - ax.annotate(f"{width:.1f}", xy=(width, bar.get_y() + bar.get_height() / 2), xytext=(5, 0), - textcoords="offset points", ha='left', va='center', fontsize=8) + formatted_value = format_value_with_units(width, units) + ax.annotate(formatted_value, xy=(width, bar.get_y() + bar.get_height() / 2), + xytext=(5, 0), textcoords="offset points", + ha='left', va='center', fontsize=8) else: - # Wrap long labels for vertical charts wrapped_labels = ['\n'.join(textwrap.wrap(label, width=15)) for label in labels] - bars = ax.bar(range(len(labels)), values, color=["#2e7d32" if "aelwyn" in l.lower() else "#f9a825" if "elinexa" in l.lower() else "#4472C4" for l in labels]) - ax.set_ylabel("Value") + colors = [_color_for_label(l, entities) for l in labels] + bars = ax.bar(range(len(labels)), values, color=colors) + ax.set_ylabel(units) # Use detected units instead of "Value" ax.set_xlabel("Category") ax.set_xticks(range(len(labels))) ax.set_xticklabels(wrapped_labels, ha='center', va='top') - for bar in bars: height = bar.get_height() - ax.annotate(f"{height:.1f}", xy=(bar.get_x() + bar.get_width() / 2, height), 
xytext=(0, 5),
-                        textcoords="offset points", ha='center', va='bottom', fontsize=8)
+            formatted_value = format_value_with_units(height, units)
+            ax.annotate(formatted_value, xy=(bar.get_x() + bar.get_width() / 2, height),
+                        xytext=(0, 5), textcoords="offset points",
+                        ha='center', va='bottom', fontsize=8)
 
     ax.set_title(title[:100])
     ax.grid(axis="y" if not horizontal else "x", linestyle="--", alpha=0.6)
@@ -126,21 +298,18 @@ def make_chart(chart_data: dict, title: str = "") -> str | None:
 
     filename = f"chart_{uuid.uuid4().hex}.png"
     path = os.path.join("charts", filename)
-    fig.savefig(path, dpi=300, bbox_inches='tight')  # Higher DPI and tight bbox for better quality
+    fig.savefig(path, dpi=300, bbox_inches='tight')
     plt.close(fig)
 
     return path
 
-
-
 def append_to_doc(doc, section_data: dict, level: int = 2, citation_map: dict | None = None):
     """Append section to document with heading, paragraph, table, chart, and citations."""
     heading = section_data.get("heading", "Untitled Section")
-    # Use the level parameter to control heading hierarchy
     doc.add_heading(heading, level=level)
 
     text = section_data.get("text", "").strip()
-    
+
     # Add citations to the text if sources are available
     if text and citation_map and section_data.get("sources"):
         citation_numbers = []
@@ -148,14 +317,13 @@ def append_to_doc(doc, section_data: dict, level: int = 2, citation_map: dict |
             source_key = f"{source.get('file', 'Unknown')}_{source.get('sheet', '')}_{source.get('entity', '')}"
             if source_key in citation_map:
                 citation_numbers.append(citation_map[source_key])
-        
         if citation_numbers:
-            # Add unique citation numbers at the end of the text
             unique_citations = sorted(set(citation_numbers))
            citations_str = " " + "".join([f"[{num}]" for num in unique_citations])
             text = text + citations_str
-    
+
     if text:
+        add_inline_markdown_paragraph(doc, text)
-        doc.add_paragraph(text)
 
     table_data = section_data.get("table", [])
@@ -176,17 +344,23 @@ def append_to_doc(doc, section_data: dict, level: int = 2, citation_map: dict |
         else:
             flattened_chart_data[k] = v
 
-    chart_path = make_chart(flattened_chart_data, title=heading)
+    # Pass dynamic entities (if present) so colors match those names
+    entities = section_data.get("entities")
+    # Pass units if available in section data
+    units = section_data.get("units")
+    chart_path = make_chart(flattened_chart_data, title=heading, entities=entities, units=units)
     if chart_path:
         doc.add_picture(chart_path, width=Inches(6))
         last_paragraph = doc.paragraphs[-1]
         last_paragraph.alignment = 1  # center
 
+
 def save_doc(doc, filename: str = "_report.docx"):
     """Save the Word document."""
     doc.save(filename)
     logger.info(f"βœ… Report saved: {filename}")
 
+
 class SectionWriterAgent:
     def __init__(self, llm, tokenizer=None):
         self.llm = llm
@@ -197,34 +371,26 @@ def __init__(self, llm, tokenizer=None):
             print("⚠️ No tokenizer provided for SectionWriterAgent")
 
     def estimate_tokens(self, text: str) -> int:
-        # naive estimate: 1 token β‰ˆ 4 characters for English-like text
         return max(1, len(text) // 4)
 
     def log_token_count(self, text: str, tokenizer=None, label: str = "Prompt"):
         if not text:
             print(f"⚠️ Cannot log tokens: empty text for {label}")
             return
-
         if tokenizer:
             token_count = len(tokenizer.encode(text))
         else:
             token_count = self.estimate_tokens(text)
-
         print(f"{label} token count: {token_count}")
-
-
-
     def write_section(self, section_title: str, context_chunks: list[dict]) -> dict:
         from collections import defaultdict
-        # Group chunks by entity and preserve metadata
         grouped = defaultdict(list)
         grouped_metadata = defaultdict(list)
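+        # Text and metadata are appended per entity in parallel; the section
+        # writers later walk grouped_metadata to build the deduplicated
+        # file/sheet/entity source list used for citations.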
for chunk in context_chunks: entity = chunk.get("_search_entity", "Unknown") grouped[entity].append(chunk.get("content", "")) - # Preserve metadata for citations metadata = chunk.get("metadata", {}) grouped_metadata[entity].append(metadata) @@ -240,12 +406,15 @@ def write_section(self, section_title: str, context_chunks: list[dict]) -> dict: "text": f"Insufficient data for analysis. Entities: {entities}", "table": [], "chart_data": {}, - "sources": [] + "sources": [], + # propagate for downstream report logic + "is_comparison": False, + "entities": entities } def _write_single_entity_section(self, section_title: str, grouped_chunks: dict, entity: str, grouped_metadata: dict | None = None) -> dict: text = "\n\n".join(grouped_chunks[entity]) - + # Extract unique sources from metadata sources = [] if grouped_metadata and entity in grouped_metadata: @@ -260,7 +429,6 @@ def _write_single_entity_section(self, section_title: str, grouped_chunks: dict, }) seen_sources.add(source_key) - # OPTIMIZED: Shorter, more focused prompt for faster processing prompt = f"""Extract key data for {entity} on {section_title}. Return JSON: @@ -269,8 +437,12 @@ def _write_single_entity_section(self, section_title: str, grouped_chunks: dict, Data: {text[:2000]} -CRITICAL: Never use possessive forms (no apostrophes). Instead of "manager's approval" write "manager approval" or "approval from manager". Use "N/A" for missing data. Valid JSON only.""" - +CRITICAL RULES: +1. NEVER use possessive forms or apostrophes (no 's). + - Wrong: "Oracle's revenue", "company's growth" + - Right: "Oracle revenue", "company growth", "revenue of Oracle" +2. Use "N/A" for missing data. +3. Return valid JSON only - no apostrophes in text values.""" try: self.log_token_count(prompt, self.tokenizer, label=f"SingleEntity Prompt ({section_title})") @@ -288,14 +460,16 @@ def _write_single_entity_section(self, section_title: str, grouped_chunks: dict, chart_data = parsed.get("chart_data", {}) if isinstance(chart_data, str): try: - chart_data = ast.literal_eval(chart_data) + import ast as _ast + chart_data = _ast.literal_eval(chart_data) except Exception: chart_data = {} table = parsed.get("table", []) if isinstance(table, str): try: - table = ast.literal_eval(table) + import ast as _ast + table = _ast.literal_eval(table) except Exception: table = [] @@ -304,7 +478,10 @@ def _write_single_entity_section(self, section_title: str, grouped_chunks: dict, "text": parsed.get("text", ""), "table": table, "chart_data": chart_data, - "sources": sources + "sources": sources, + # NEW: carry entity info so charts/titles can highlight correctly + "is_comparison": False, + "entities": [entity] } except Exception as e: @@ -314,7 +491,9 @@ def _write_single_entity_section(self, section_title: str, grouped_chunks: dict, "text": f"Could not generate section due to error: {e}", "table": [], "chart_data": {}, - "sources": sources + "sources": sources, + "is_comparison": False, + "entities": [entity] } def _write_comparison_section(self, section_title: str, grouped_chunks: dict, entities: list[str], grouped_metadata: dict | None = None) -> dict: @@ -328,39 +507,43 @@ def _write_comparison_section(self, section_title: str, grouped_chunks: dict, en text_a = "\n\n".join(grouped_chunks[entity_a]) text_b = "\n\n".join(grouped_chunks[entity_b]) - # Construct prompt prompt = f""" - You are writing a structured section for a comparison report between {entity_a} and {entity_b}. +You are writing a structured section for a comparison report between {entity_a} and {entity_b}. 
- Topic: {section_title} +Topic: {section_title} - OBJECTIVE: - Summarize key data from the context and produce a clear, side-by-side comparison table. +OBJECTIVE: +Summarize key data from the context and produce a clear, side-by-side comparison table. - Always follow this exact structure in your JSON output: - - heading: A short, descriptive title for the section - - text: A 1–2 sentence overview comparing {entity_a} and {entity_b} - - table: List of dicts formatted as: Metric | {entity_a} | {entity_b} | Analysis - - chart_data: A dictionary of comparable numeric values to plot +Always follow this exact structure in your JSON output: +- heading: A short, descriptive title for the section +- text: A 1–2 sentence overview comparing {entity_a} and {entity_b} +- table: List of dicts formatted as: Metric | {entity_a} | {entity_b} | Analysis +- chart_data: A dictionary of comparable numeric values to plot - DATA: - === {entity_a} === - {text_a} +DATA: +=== {entity_a} === +{text_a} - === {entity_b} === - {text_b} +=== {entity_b} === +{text_b} - INSTRUCTIONS: - - Extract specific metrics (numbers, %, dates) from the data - - Use "N/A" if one entity is missing a value - - Use analysis terms like: "Higher", "Lower", "Similar", "{entity_a} Only", "{entity_b} Only" - - Do not echo file names or metadata - - Keep values human-readable (e.g., "18,500 tonnes CO2e") - - CRITICAL: Never use possessive forms (no apostrophes). Instead of "company's target" write "company target" or "target for company". +INSTRUCTIONS: +- Extract specific metrics (numbers, %, dates) from the data +- Use "N/A" if one entity is missing a value +- Use analysis terms like: "Higher", "Lower", "Similar", "{entity_a} only", "{entity_b} only" +- Do not echo file names or metadata +- Keep values human-readable (e.g., "18,500 tonnes CO2e") - Respond only in JSON format. - """ +CRITICAL RULES: +1. NEVER use possessive forms or apostrophes (no 's). + - Wrong: "Oracle's revenue", "company's performance" + - Right: "Oracle revenue", "company performance", "revenue of Oracle" +2. Ensure all JSON is valid - no apostrophes in text values. +3. Use proper escaping if quotes are needed in text. + +Respond only in valid JSON format. 
+""" try: if self.tokenizer: @@ -379,7 +562,6 @@ def _write_comparison_section(self, section_title: str, grouped_chunks: dict, en expected_structure="Object with 'heading', 'text', 'table', and 'chart_data' keys" ) - # Chart data cleanup chart_data = parsed.get("chart_data", {}) if isinstance(chart_data, str): try: @@ -390,7 +572,6 @@ def _write_comparison_section(self, section_title: str, grouped_chunks: dict, en if not isinstance(chart_data, dict): chart_data = {} - # Table cleanup table = parsed.get("table", []) if isinstance(table, str): try: @@ -415,7 +596,6 @@ def _write_comparison_section(self, section_title: str, grouped_chunks: dict, en if validated_row[entity_a] != "N/A" or validated_row[entity_b] != "N/A": validated.append(validated_row) - # Flatten chart_data if nested flat_chart_data = {} for k, v in chart_data.items(): if isinstance(v, dict): @@ -424,7 +604,7 @@ def _write_comparison_section(self, section_title: str, grouped_chunks: dict, en else: flat_chart_data[k] = v - # Extract unique sources from metadata + # Extract unique sources sources = [] if grouped_metadata: seen_sources = set() @@ -445,12 +625,14 @@ def _write_comparison_section(self, section_title: str, grouped_chunks: dict, en "text": parsed.get("text", ""), "table": validated, "chart_data": flat_chart_data, - "sources": sources + "sources": sources, + # NEW: signal comparison + entities for downstream styling and charts + "is_comparison": True, + "entities": [entity_a, entity_b] } except Exception as e: logger.error("⚠️ Failed to write comparison section: %s", e) - # Still try to extract sources sources = [] if grouped_metadata: seen_sources = set() @@ -465,39 +647,36 @@ def _write_comparison_section(self, section_title: str, grouped_chunks: dict, en "entity": entity }) seen_sources.add(source_key) - + return { "heading": section_title, "text": f"Could not generate summary due to error: {e}", "table": [], "chart_data": {}, - "sources": sources + "sources": sources, + "is_comparison": True, + "entities": entities } - class ReportWriterAgent: def __init__(self, doc=None, model_name: str = "unknown", llm=None): - # Don't store the document - create fresh one for each report self.model_name = model_name self.llm = llm # Store LLM for generating summaries def _generate_executive_summary(self, sections: list[dict], is_comparison: bool, entities: list[str], target_language: str = "english", query: str | None = None) -> str: - """Generate an executive summary based on actual section content and user query""" if not self.llm: return self._generate_intro_section(is_comparison, entities) - - # Extract key information from sections + section_summaries = [] for section in sections: heading = section.get("heading", "Unknown Section") text = section.get("text", "") if text: - section_summaries.append(f"**{heading}**: {text}") - + section_summaries.append(f"{heading}: {text}") + sections_text = "\n\n".join(section_summaries) - - # Add language instruction if not English + language_instruction = "" if target_language == "arabic": language_instruction = "\n\nIMPORTANT: Write the entire executive summary in Arabic (Ψ§Ω„ΨΉΨ±Ψ¨ΩŠΨ©). Use professional Arabic business terminology." @@ -505,12 +684,9 @@ def _generate_executive_summary(self, sections: list[dict], is_comparison: bool, language_instruction = "\n\nIMPORTANT: Write the entire executive summary in Spanish. Use professional Spanish business terminology." elif target_language == "french": language_instruction = "\n\nIMPORTANT: Write the entire executive summary in French. 
Use professional French business terminology." - - # Include user query context if available - query_context = "" - if query: - query_context = f"\nUser's Original Request:\n{query}\n" - + + query_context = f"\nUser's Original Request:\n{query}\n" if query else "" + if is_comparison: prompt = f""" You are writing an executive summary for a comparison report between {entities[0]} and {entities[1]}. @@ -523,6 +699,8 @@ def _generate_executive_summary(self, sections: list[dict], is_comparison: bool, Section Summaries: {sections_text} +CRITICAL: Never use possessive forms (no apostrophes). Write "Oracle revenue" not "Oracle's revenue", "company performance" not "company's performance". + Write in a professional, analytical tone. Focus on answering the user's specific request.{language_instruction} """ else: @@ -537,9 +715,11 @@ def _generate_executive_summary(self, sections: list[dict], is_comparison: bool, Section Summaries: {sections_text} +CRITICAL: Never use possessive forms (no apostrophes). Write "Oracle revenue" not "Oracle's revenue", "company performance" not "company's performance". + Write in a professional, analytical tone. Focus on answering the user's specific request.{language_instruction} """ - + try: response = self.llm.invoke([type("Msg", (object,), {"content": prompt})()]).content.strip() return response @@ -548,31 +728,27 @@ def _generate_executive_summary(self, sections: list[dict], is_comparison: bool, return self._generate_intro_section(is_comparison, entities) def _generate_conclusion(self, sections: list[dict], is_comparison: bool, entities: list[str], target_language: str = "english", query: str | None = None) -> str: - """Generate a conclusion based on actual section content and user query""" if not self.llm: return "This analysis provides insights based on available data from retrieved documents." - - # Extract key findings from sections + key_findings = [] for section in sections: heading = section.get("heading", "Unknown Section") text = section.get("text", "") table = section.get("table", []) - - # Extract key metrics from tables + if table and isinstance(table, list): - for row in table[:3]: # Top 3 rows + for row in table[:3]: if isinstance(row, dict): metric = row.get("Metric", "") if metric: key_findings.append(f"{heading}: {metric}") - + if text: key_findings.append(f"{heading}: {text}") - - findings_text = "\n".join(key_findings[:8]) # Limit to prevent token overflow - - # Add language instruction if not English + + findings_text = "\n".join(key_findings[:8]) + language_instruction = "" if target_language == "arabic": language_instruction = "\n\nIMPORTANT: Write the entire conclusion in Arabic (Ψ§Ω„ΨΉΨ±Ψ¨ΩŠΨ©). Use professional Arabic business terminology." @@ -580,12 +756,9 @@ def _generate_conclusion(self, sections: list[dict], is_comparison: bool, entiti language_instruction = "\n\nIMPORTANT: Write the entire conclusion in Spanish. Use professional Spanish business terminology." elif target_language == "french": language_instruction = "\n\nIMPORTANT: Write the entire conclusion in French. Use professional French business terminology." - - # Include user query context if available - query_context = "" - if query: - query_context = f"\nUser's Original Request:\n{query}\n" - + + query_context = f"\nUser's Original Request:\n{query}\n" if query else "" + if is_comparison: prompt = f""" Based on the analysis of {entities[0]} and {entities[1]}, write a conclusion that directly answers the user's request. 
@@ -599,6 +772,8 @@ def _generate_conclusion(self, sections: list[dict], is_comparison: bool, entiti - Provide actionable insights based on their specific needs - Include specific recommendations if appropriate +CRITICAL: Never use possessive forms (no apostrophes). Write "Oracle revenue" not "Oracle's revenue", "company growth" not "company's growth". + Focus on providing value for the user's specific use case.{language_instruction} """ else: @@ -614,97 +789,77 @@ def _generate_conclusion(self, sections: list[dict], is_comparison: bool, entiti - Provide actionable insights based on their specific needs - Include specific recommendations if appropriate +CRITICAL: Never use possessive forms (no apostrophes). Write "Oracle revenue" not "Oracle's revenue", "company growth" not "company's growth". + Focus on providing value for the user's specific use case.{language_instruction} """ - + try: response = self.llm.invoke([type("Msg", (object,), {"content": prompt})()]).content.strip() return response except Exception as e: logger.warning(f"Failed to generate conclusion: {e}") return "This analysis provides insights based on available data from retrieved documents." - + def _filter_failed_sections(self, sections: list[dict]) -> list[dict]: - """Filter out sections that contain error messages or failed processing""" filtered_sections = [] - + error_patterns = [ + "Could not generate", + "due to error:", + "Expecting ',' delimiter:", + "Failed to", + "Error:", + "Exception:", + "Traceback" + ] for section in sections: text = section.get("text", "") heading = section.get("heading", "") - - # Check for common error patterns - error_patterns = [ - "Could not generate", - "due to error:", - "Expecting ',' delimiter:", - "Failed to", - "Error:", - "Exception:", - "Traceback" - ] - - # Check if section contains error messages has_error = any(pattern in text for pattern in error_patterns) - if not has_error: filtered_sections.append(section) else: logger.info(f"🚫 Filtered out failed section: {heading}") - return filtered_sections - + def _apply_document_styling(self, doc): - """Apply professional styling to the document""" from docx.shared import Pt, RGBColor - from docx.enum.text import WD_ALIGN_PARAGRAPH - - # Set default font for the document style = doc.styles['Normal'] font = style.font font.name = 'Times New Roman' font.size = Pt(12) - - # Style headings heading1_style = doc.styles['Heading 1'] heading1_style.font.name = 'Times New Roman' heading1_style.font.size = Pt(18) heading1_style.font.bold = True - heading1_style.font.color.rgb = RGBColor(0x00, 0x00, 0x00) # Black - + heading1_style.font.color.rgb = RGBColor(0x00, 0x00, 0x00) heading2_style = doc.styles['Heading 2'] heading2_style.font.name = 'Times New Roman' heading2_style.font.size = Pt(14) heading2_style.font.bold = True - heading2_style.font.color.rgb = RGBColor(0x00, 0x00, 0x00) # Black - + heading2_style.font.color.rgb = RGBColor(0x00, 0x00, 0x00) + def _generate_report_title(self, is_comparison: bool, entities: list[str], query: str | None, sections: list[dict]) -> str: - """Generate a dynamic, informative report title based on user query""" if query and self.llm: - # Use LLM to generate a more specific title based on the query try: entity_context = f"{entities[0]} vs {entities[1]}" if is_comparison and len(entities) >= 2 else entities[0] if entities else "Organization" - prompt = f"""Generate a concise, professional report title (max 10 words) based on: User Query: {query} Entities: {entity_context} Type: {'Comparison' if 
is_comparison else 'Analysis'} Report +CRITICAL: Never use possessive forms (no apostrophes). Write "Oracle Performance" not "Oracle's Performance". + Return ONLY the title, no quotes or extra text.""" - title = self.llm.invoke([type("Msg", (object,), {"content": prompt})()]).content.strip() - # Clean up the title title = title.replace('"', '').replace("'", '').strip() - # Ensure it's not too long if len(title) > 100: title = title[:97] + "..." return title except Exception as e: logger.warning(f"Failed to generate dynamic title: {e}") - # Fall back to default title generation - - # Default title generation logic + if query: - # Extract key topics from the query query_lower = query.lower() if "esg" in query_lower or "sustainability" in query_lower: topic_type = "ESG & Sustainability" @@ -719,7 +874,6 @@ def _generate_report_title(self, is_comparison: bool, entities: list[str], query else: topic_type = "Business Analysis" else: - # Infer from section headings section_topics = [s.get("heading", "") for s in sections[:3]] if any("climate" in h.lower() or "carbon" in h.lower() for h in section_topics): topic_type = "Climate & Environmental" @@ -727,161 +881,123 @@ def _generate_report_title(self, is_comparison: bool, entities: list[str], query topic_type = "ESG & Sustainability" else: topic_type = "Business Analysis" - + if is_comparison and len(entities) >= 2: return f"{topic_type} Report: {entities[0]} vs {entities[1]}" elif entities: return f"{topic_type} Report: {entities[0]}" else: return f"{topic_type} Report" - + def _add_report_header(self, doc, report_title: str, is_comparison: bool, entities: list[str]): - """Add a professional report header with title, date, and metadata""" from docx.shared import Pt, RGBColor from docx.enum.text import WD_ALIGN_PARAGRAPH - - # Main title + title_paragraph = doc.add_heading(report_title, level=1) title_paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER - - # Add subtitle with entity information + if is_comparison and len(entities) >= 2: subtitle = f"Comparative Analysis: {entities[0]} and {entities[1]}" elif entities: subtitle = f"Analysis of {entities[0]}" else: subtitle = "Comprehensive Analysis Report" - + subtitle_paragraph = doc.add_paragraph(subtitle) subtitle_paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER subtitle_run = subtitle_paragraph.runs[0] subtitle_run.font.size = Pt(12) subtitle_run.italic = True - - # Add generation date and metadata + now = datetime.datetime.now() date_str = now.strftime("%B %d, %Y") time_str = now.strftime("%H:%M") - - doc.add_paragraph() # spacing - - # Create a professional metadata section + + doc.add_paragraph() metadata_paragraph = doc.add_paragraph() metadata_paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER - metadata_text = f"Generated on {date_str} at {time_str}\nPowered by OCI Generative AI" metadata_run = metadata_paragraph.add_run(metadata_text) metadata_run.font.size = Pt(10) - metadata_run.font.color.rgb = RGBColor(0x70, 0x70, 0x70) # Gray color - - # Add separator line + metadata_run.font.color.rgb = RGBColor(0x70, 0x70, 0x70) + doc.add_paragraph() separator = doc.add_paragraph("─" * 50) separator.alignment = WD_ALIGN_PARAGRAPH.CENTER separator_run = separator.runs[0] separator_run.font.color.rgb = RGBColor(0x70, 0x70, 0x70) - - doc.add_paragraph() # spacing after header - + doc.add_paragraph() + def _detect_target_language(self, query: str | None) -> str: - """Detect the target language from the query""" if not query: return "english" - - query_lower = query.lower() - - # Arabic language indicators + q 
= query.lower() arabic_indicators = [ - "Ψ¨Ψ§Ω„ΨΉΨ±Ψ¨ΩŠΨ©", "Ψ¨Ψ§Ω„Ω„ΨΊΨ© Ψ§Ω„ΨΉΨ±Ψ¨ΩŠΨ©", "in arabic", "arabic report", "ΨͺΩ‚Ψ±ΩŠΨ±", + "Ψ¨Ψ§Ω„ΨΉΨ±Ψ¨ΩŠΨ©", "Ψ¨Ψ§Ω„Ω„ΨΊΨ© Ψ§Ω„ΨΉΨ±Ψ¨ΩŠΨ©", "in arabic", "arabic report", "ΨͺΩ‚Ψ±ΩŠΨ±", "ΨͺΨ­Ω„ΩŠΩ„", "Ψ¨Ψ§Ω„Ω„ΨΊΨ© Ψ§Ω„ΨΉΨ±Ψ¨ΩŠΩ‡", "عربي", "arabic language" ] - - # Check for Arabic script arabic_chars = any('\u0600' <= char <= '\u06FF' for char in query) - - # Check for explicit language requests - if any(indicator in query_lower for indicator in arabic_indicators) or arabic_chars: + if any(ind in q for ind in arabic_indicators) or arabic_chars: return "arabic" - - # Add more languages as needed - if "en espaΓ±ol" in query_lower or "in spanish" in query_lower: + if "en espaΓ±ol" in q or "in spanish" in q: return "spanish" - - if "en franΓ§ais" in query_lower or "in french" in query_lower: + if "en franΓ§ais" in q or "in french" in q: return "french" - return "english" - + def _ensure_language_consistency(self, sections: list[dict], target_language: str, query: str | None) -> list[dict]: - """Ensure all sections are in the target language""" if not self.llm or target_language == "english": return sections - logger.info(f"πŸ”„ Ensuring language consistency for {target_language}") - corrected_sections = [] - for section in sections: corrected_section = section.copy() - - # Check and translate heading if needed heading = section.get("heading", "") + text = section.get("text", "") + table = section.get("table", []) + if heading and not self._is_in_target_language(heading, target_language): corrected_section["heading"] = self._translate_text(heading, target_language, "section heading") - - # Check and translate text if needed - text = section.get("text", "") if text and not self._is_in_target_language(text, target_language): corrected_section["text"] = self._translate_text(text, target_language, "section text") - - # Handle table translations - table = section.get("table", []) + if table and isinstance(table, list): corrected_table = [] for row in table: if isinstance(row, dict): corrected_row = {} for key, value in row.items(): - # Translate table headers and values - translated_key = self._translate_text(str(key), target_language, "table header") if not self._is_in_target_language(str(key), target_language) else str(key) - translated_value = self._translate_text(str(value), target_language, "table value") if not self._is_in_target_language(str(value), target_language) and not str(value).replace('.', '').replace(',', '').isdigit() else str(value) + k = str(key) + v = str(value) + translated_key = self._translate_text(k, target_language, "table header") if not self._is_in_target_language(k, target_language) else k + # keep numeric strings unchanged + if not self._is_in_target_language(v, target_language) and not v.replace('.', '').replace(',', '').isdigit(): + translated_value = self._translate_text(v, target_language, "table value") + else: + translated_value = v corrected_row[translated_key] = translated_value corrected_table.append(corrected_row) corrected_section["table"] = corrected_table - + corrected_sections.append(corrected_section) - return corrected_sections - + def _is_in_target_language(self, text: str, target_language: str) -> bool: - """Check if text is already in the target language""" if not text or target_language == "english": return True - if target_language == "arabic": - # Check if text contains Arabic characters arabic_chars = sum(1 for char in text if '\u0600' <= char <= '\u06FF') total_chars = sum(1 for char in text if 
char.isalpha()) if total_chars == 0: - return True # No alphabetic characters, assume it's fine - return arabic_chars / total_chars > 0.3 # At least 30% Arabic characters - - # Add more language detection logic as needed - return True # Default to assuming it's correct - + return True + return arabic_chars / total_chars > 0.3 + return True + def _translate_text(self, text: str, target_language: str, context: str = "") -> str: - """Translate text to target language using LLM""" if not text or not self.llm: return text - - language_names = { - "arabic": "Arabic", - "spanish": "Spanish", - "french": "French" - } - + language_names = {"arabic": "Arabic", "spanish": "Spanish", "french": "French"} target_lang_name = language_names.get(target_language, target_language.title()) - prompt = f"""Translate the following {context} to {target_lang_name}. Maintain the professional tone and technical accuracy. If it's already in {target_lang_name}, return it unchanged. @@ -889,7 +1005,6 @@ def _translate_text(self, text: str, target_language: str, context: str = "") -> Text to translate: {text} Translation:""" - try: response = self.llm.invoke([type("Msg", (object,), {"content": prompt})()]).content.strip() logger.info(f"Translated {context}: '{text[:50]}...' β†’ '{response[:50]}...'") @@ -897,33 +1012,25 @@ def _translate_text(self, text: str, target_language: str, context: str = "") -> except Exception as e: logger.warning(f"Failed to translate {context}: {e}") return text - + def _generate_intro_section(self, is_comparison: bool, entities: list[str]) -> str: - """Fallback intro section when LLM is not available""" if is_comparison: - comparison_note = ( - f"This report compares data between {entities[0]} and {entities[1]} across key topics." - ) + comparison_note = f"This report compares data between {entities[0]} and {entities[1]} across key topics." else: comparison_note = f"This report presents information for {entities[0]}." - return ( f"{comparison_note} All data is sourced from retrieved documents and structured using LLM-based analysis.\n\n" "The analysis includes tables and charts where possible. Missing data is noted explicitly." ) - + def _organize_sections_with_llm(self, sections: list[dict], query: str | None, entities: list[str]) -> list[dict]: - """Use LLM to intelligently organize sections into a hierarchical structure""" if not query or not self.llm or not sections: return sections - - # Create a list of section titles section_info = [] for i, section in enumerate(sections): section_info.append(f"{i+1}. {section.get('heading', 'Untitled Section')}") - sections_list = "\n".join(section_info) - + prompt = f"""You are organizing sections for a report about {', '.join(entities)}. 
User's Original Request: @@ -942,7 +1049,7 @@ def _organize_sections_with_llm(self, sections: list[dict], query: str | None, e {{ "title": "Main Category Title from User's Request", "level": 1, - "sections": [1, 3, 5] // section numbers that belong under this category + "sections": [1, 3, 5] }}, {{ "title": "Another Main Category", @@ -950,7 +1057,7 @@ def _organize_sections_with_llm(self, sections: list[dict], query: str | None, e "sections": [2, 4, 6] }} ], - "orphan_sections": [7, 8] // sections that don't fit under any main category + "orphan_sections": [7, 8] }} IMPORTANT: @@ -964,39 +1071,29 @@ def _organize_sections_with_llm(self, sections: list[dict], query: str | None, e try: response = self.llm.invoke([type("Msg", (object,), {"content": prompt})()]).content.strip() - - # Clean and parse JSON response - import json - import re - - # Extract JSON from response + import json, re json_match = re.search(r'\{.*\}', response, re.DOTALL) if json_match: json_str = json_match.group() structure = json.loads(json_str) - - # Build organized sections list + organized = [] used_sections = set() - + for category in structure.get("structure", []): - # Add main category as a header-only section organized.append({ "heading": category.get("title", "Category"), "level": 1, "is_category_header": True }) - - # Add sections under this category for section_num in category.get("sections", []): - idx = section_num - 1 # Convert to 0-based index + idx = section_num - 1 if 0 <= idx < len(sections) and idx not in used_sections: section_copy = sections[idx].copy() section_copy["level"] = 2 organized.append(section_copy) used_sections.add(idx) - - # Add orphan sections at the end + for section_num in structure.get("orphan_sections", []): idx = section_num - 1 if 0 <= idx < len(sections) and idx not in used_sections: @@ -1004,33 +1101,23 @@ def _organize_sections_with_llm(self, sections: list[dict], query: str | None, e section_copy["level"] = 2 organized.append(section_copy) used_sections.add(idx) - - # Add any sections not mentioned in the structure + for i, section in enumerate(sections): if i not in used_sections: section_copy = section.copy() section_copy["level"] = 2 organized.append(section_copy) - + return organized - except Exception as e: logger.warning(f"Failed to organize sections with LLM: {e}") - # Return original sections if organization fails - pass - - # Return original sections if LLM organization fails or isn't attempted + return sections - - - + def _build_references_section(self, sections: list[dict]) -> tuple[dict, str]: - """Build a references section from all sources in sections and return citation map""" all_sources = [] citation_map = {} citation_counter = 1 - - # Collect all unique sources seen_sources = set() for section in sections: sources = section.get("sources", []) @@ -1041,137 +1128,108 @@ def _build_references_section(self, sections: list[dict]) -> tuple[dict, str]: citation_map[source_key] = citation_counter citation_counter += 1 seen_sources.add(source_key) - - # Build references text + references_text = [] for i, source in enumerate(all_sources, 1): file_name = source.get("file", "Unknown") sheet = source.get("sheet", "") entity = source.get("entity", "") - if sheet: ref_text = f"[{i}] {file_name}, Sheet: {sheet}" else: ref_text = f"[{i}] {file_name}" - if entity: ref_text += f" ({entity})" - references_text.append(ref_text) - + return citation_map, "\n".join(references_text) - + def write_report(self, sections: list[dict], filter_failures: bool = True, query: str | None = 
None) -> str: if not isinstance(sections, list): raise TypeError("Expected list of sections") - - # Detect requested language from query + target_language = self._detect_target_language(query) logger.info(f"🌐 Detected target language: {target_language}") - - # Filter out failed sections if requested + if filter_failures: sections = self._filter_failed_sections(sections) logger.info(f"πŸ“Š After filtering failures: {len(sections)} sections remaining") - - # Validate and fix language consistency across all sections + if target_language != "english": sections = self._ensure_language_consistency(sections, target_language, query) - - # Create a fresh document for each report to prevent accumulation + doc = Document() - - # Apply professional document styling self._apply_document_styling(doc) - - # Create reports directory if it doesn't exist + reports_dir = "reports" os.makedirs(reports_dir, exist_ok=True) - - # Extract metadata from sections - is_comparison = sections[0].get("is_comparison", False) if sections else False - entities = sections[0].get("entities", []) if sections else [] - - # Generate dynamic report title + + # NEW: infer comparison/entity context from first valid section (or defaults) + is_comparison = False + entities: list[str] = [] + for s in sections: + if "entities" in s: + entities = list(s.get("entities") or []) + if "is_comparison" in s: + is_comparison = bool(s.get("is_comparison")) + if entities: + break + report_title = self._generate_report_title(is_comparison, entities, query, sections) - - # Add professional header self._add_report_header(doc, report_title, is_comparison, entities) - # PARALLEL GENERATION of executive summary and conclusion while processing sections - from concurrent.futures import ThreadPoolExecutor, as_completed - - summary_and_conclusion_futures = [] - - if self.llm: # Only if LLM is available for intelligent generation + from concurrent.futures import ThreadPoolExecutor + if self.llm: with ThreadPoolExecutor(max_workers=2) as summary_executor: - # Start executive summary generation in parallel summary_future = summary_executor.submit( self._generate_executive_summary, sections, is_comparison, entities, target_language, query ) - summary_and_conclusion_futures.append(("summary", summary_future)) - - # Start conclusion generation in parallel conclusion_future = summary_executor.submit( self._generate_conclusion, sections, is_comparison, entities, target_language, query ) - summary_and_conclusion_futures.append(("conclusion", conclusion_future)) - - # Add executive summary + doc.add_heading("Executive Summary", level=2) - executive_summary = summary_future.result() # Wait for completion + executive_summary = summary_future.result() + add_inline_markdown_paragraph(doc, executive_summary) doc.add_paragraph(executive_summary) - doc.add_paragraph() # spacing + doc.add_paragraph() - # Organize sections hierarchically using LLM organized_sections = self._organize_sections_with_llm(sections, query, entities) - - # Build citation map before adding sections citation_map, references_text = self._build_references_section(organized_sections) - - # Add organized sections with citations + for section in organized_sections: if section.get("is_category_header"): - # This is a main category header doc.add_heading(section.get("heading", "Category"), level=1) else: - # Regular section with appropriate level and citations level = section.get("level", 2) append_to_doc(doc, section, level=level, citation_map=citation_map) - doc.add_paragraph() # spacing between 
sections + doc.add_paragraph() - # Add conclusion doc.add_heading("Conclusion", level=2) - conclusion = conclusion_future.result() # Wait for completion + conclusion = conclusion_future.result() + add_inline_markdown_paragraph(doc, conclusion) doc.add_paragraph(conclusion) - - # Add References section (already built above) + if references_text: - doc.add_paragraph() # spacing + doc.add_paragraph() doc.add_heading("References", level=2) doc.add_paragraph(references_text) else: - # Fallback for when no LLM is available doc.add_heading("Executive Summary", level=2) executive_summary = self._generate_intro_section(is_comparison, entities) doc.add_paragraph(executive_summary) - doc.add_paragraph() # spacing + doc.add_paragraph() - # Build citation map citation_map, references_text = self._build_references_section(sections) - - # Add all sections with citations (no LLM available for organization) for section in sections: append_to_doc(doc, section, level=2, citation_map=citation_map) - doc.add_paragraph() # spacing between sections + doc.add_paragraph() doc.add_heading("Conclusion", level=2) conclusion = "This analysis provides insights based on available data from retrieved documents." doc.add_paragraph(conclusion) - - # Add References section (already built above) if references_text: - doc.add_paragraph() # spacing + doc.add_paragraph() doc.add_heading("References", level=2) doc.add_paragraph(references_text) @@ -1181,15 +1239,19 @@ def write_report(self, sections: list[dict], filter_failures: bool = True, query save_doc(doc, filepath) return filepath + # Example usage if __name__ == "__main__": doc = Document() sample_section = { "heading": "Climate Commitments", - "text": "Both Elinexa and Aelwyn have committed to net-zero targets...", - "table": [{"Bank": "Elinexa", "Target": "Net-zero 2050"}, - {"Bank": "Aelwyn", "Target": "Net-zero 2050"}], - "chart_data": {"Elinexa": 42, "Aelwyn": 36} + "text": "Both Acme Bank and Globex Bank have committed to net-zero targets...", + "table": [{"Bank": "Acme Bank", "Target": "Net-zero 2050"}, + {"Bank": "Globex Bank", "Target": "Net-zero 2050"}], + "chart_data": {"Acme Bank": 42, "Globex Bank": 36}, + # NEW: tell the pipeline which two entities are being compared + "entities": ["Acme Bank", "Globex Bank"], + "is_comparison": True } agent = ReportWriterAgent(doc) agent.write_report([sample_section]) diff --git a/ai/generative-ai-service/complex-document-rag/files/gradio.css b/ai/generative-ai-service/complex-document-rag/files/gradio.css index 9d847f9c6..3296d7199 100644 --- a/ai/generative-ai-service/complex-document-rag/files/gradio.css +++ b/ai/generative-ai-service/complex-document-rag/files/gradio.css @@ -1,12 +1,15 @@ /* ===== CLEAN LIGHT THEME ===== */ :root { - --primary-color: #ff6b35; - --secondary-color: #6c757d; - --background-color: #ffffff; - --surface-color: #ffffff; + --primary-color: #c74634; + --oracle-red: #c74634; + --secondary-color: #6f757e; + --background-color: #fffefe; + --surface-color: #fffefe; + --off-white: #fffefe; --border-color: #dee2e6; - --text-color: #212529; - --text-muted: #6c757d; + --text-color: #312d2a; + --text-muted: #6f7572; + --dark-grey: #404040; } /* ===== GLOBAL STYLING ===== */ @@ -36,7 +39,7 @@ /* ===== BUTTONS ===== */ .gr-button, button, .primary-button, .secondary-button { - background: white !important; + background: var(--off-white) !important; color: var(--primary-color) !important; border: 1px solid var(--primary-color) !important; padding: 10px 20px !important; @@ -46,43 +49,95 @@ letter-spacing: 
0.5px !important; cursor: pointer !important; font-size: 12px !important; - transition: color 0.2s ease !important; + transition: background-color 0.2s ease, color 0.2s ease !important; } .gr-button:hover, button:hover, .primary-button:hover, .secondary-button:hover { background: #f8f8f8 !important; color: var(--primary-color) !important; + padding: 10px 20px !important; /* Keep same padding to prevent jumpy behavior */ } .gr-button:active, button:active, .primary-button:active, .secondary-button:active { background: #f0f0f0 !important; color: var(--primary-color) !important; + padding: 10px 20px !important; /* Keep same padding to prevent jumpy behavior */ } /* ===== TABS ===== */ -.gr-tabs .gr-tab-nav button { - background: #6c757d !important; - color: white !important; +/* Target all possible tab button selectors for Gradio */ +.gr-tabs .tab-nav button, +.gr-tabs .gr-tab-nav button, +div[role="tablist"] button, +button[role="tab"], +.gradio-container .gr-tabs button[role="tab"], +.gradio-container button.tab-nav-button { + background: #c74634 !important; + background-color: #c74634 !important; + color: #fffefe !important; border: none !important; + border-bottom: 3px solid transparent !important; /* Remove orange underline */ padding: 12px 20px !important; font-weight: 500 !important; text-transform: uppercase !important; letter-spacing: 0.5px !important; border-radius: 4px 4px 0 0 !important; margin-right: 2px !important; -} - -.gr-tabs .gr-tab-nav button.selected { - background: #495057 !important; -} - -.gr-tabs .gr-tab-nav button:hover { - background: #5a6268 !important; + transition: background-color 0.3s ease, border-bottom 0.3s ease !important; + opacity: 0.8 !important; +} + +/* Selected/Active tab with black underline */ +.gr-tabs .tab-nav button.selected, +.gr-tabs .gr-tab-nav button.selected, +div[role="tablist"] button.selected, +button[role="tab"][aria-selected="true"], +button[role="tab"].selected, +.gradio-container .gr-tabs button[role="tab"].selected, +.gradio-container button.tab-nav-button.selected { + background: #c74634 !important; + background-color: #c74634 !important; + opacity: 1 !important; + color: #fffefe !important; + font-weight: 500 !important; /* Keep same weight as non-selected to prevent jumpy behavior */ + border-bottom: 3px solid #312d2a !important; /* Black underline for active tab */ + padding: 12px 20px !important; /* Keep same padding */ +} + +/* Hover state for non-selected tabs */ +.gr-tabs .tab-nav button:hover:not(.selected), +.gr-tabs .gr-tab-nav button:hover:not(.selected), +div[role="tablist"] button:hover:not(.selected), +button[role="tab"]:hover:not([aria-selected="true"]), +button[role="tab"]:hover:not(.selected), +.gradio-container .gr-tabs button[role="tab"]:hover:not(.selected), +.gradio-container button.tab-nav-button:hover:not(.selected) { + background: #404040 !important; + background-color: #404040 !important; + color: #fffefe !important; + opacity: 1 !important; + padding: 12px 20px !important; /* Keep same padding */ +} + +/* Additional override for any nested spans or text elements in tabs */ +.gr-tabs button span, +button[role="tab"] span, +.gr-tabs button *, +button[role="tab"] * { + color: inherit !important; +} + +/* Remove any orange borders/underlines that might appear */ +button[role="tab"]::after, +button[role="tab"]::before, +.gr-tabs button::after, +.gr-tabs button::before { + display: none !important; } /* ===== COMPACT UPLOAD SECTIONS ===== */ .upload-section { - background: white !important; + background: 
var(--off-white) !important; border: 1px solid var(--border-color) !important; border-radius: 8px !important; padding: 12px !important; @@ -113,13 +168,13 @@ margin: 8px 0 !important; display: block !important; padding: 12px !important; - background: white !important; + background: var(--off-white) !important; border: 1px solid var(--primary-color) !important; } /* ===== INFERENCE LAYOUT ===== */ .inference-left-column, .inference-right-column { - background: white !important; + background: var(--off-white) !important; padding: 20px !important; } @@ -127,30 +182,37 @@ margin-bottom: 16px !important; } -.model-controls, .collection-controls { - background: white !important; +/* Make control sections more compact */ +.model-controls, .collection-controls, .processing-controls { + background: var(--off-white) !important; border: 1px solid var(--border-color) !important; border-radius: 6px !important; - padding: 12px !important; - margin-bottom: 12px !important; + padding: 8px !important; /* Reduced padding for compactness */ + margin-bottom: 8px !important; /* Reduced margin */ } .processing-controls { - background: white !important; border: 1px solid var(--primary-color) !important; - border-radius: 6px !important; - padding: 12px !important; - margin-bottom: 12px !important; } -.compact-query textarea { - min-height: 120px !important; - max-height: 150px !important; +/* Compact headers in control sections */ +.model-controls h4, +.collection-controls h4, +.processing-controls h4 { + font-size: 12px !important; + margin-bottom: 4px !important; +} + +/* Make query textarea much larger */ +.compact-query textarea, +.query-section textarea { + min-height: 360px !important; /* 3x larger than before */ + max-height: 450px !important; } /* ===== INPUT FIELDS ===== */ .gr-textbox, .gr-textbox textarea, .gr-textbox input { - background: white !important; + background: var(--off-white) !important; border: 1px solid var(--border-color) !important; border-radius: 4px !important; color: var(--text-color) !important; @@ -159,15 +221,22 @@ /* ===== DROPDOWNS ===== */ .gr-dropdown, .gr-dropdown select { - background: white !important; + background: var(--off-white) !important; border: 1px solid var(--border-color) !important; border-radius: 4px !important; color: var(--text-color) !important; } +/* Make dropdowns more compact */ +.model-controls .gr-dropdown, +.collection-controls .gr-dropdown { + padding: 6px !important; + font-size: 13px !important; +} + /* ===== FILE UPLOAD ===== */ .gr-file { - background: white !important; + background: var(--off-white) !important; border: 2px dashed var(--primary-color) !important; border-radius: 8px !important; padding: 20px !important; @@ -212,17 +281,68 @@ display: none !important; } -/* ===== FORCE WHITE BACKGROUNDS ===== */ +/* ===== FORCE OFF-WHITE BACKGROUNDS ===== */ .gr-group, .gr-form, .gr-block { - background: white !important; + background: var(--off-white) !important; } /* ===== DELETE BUTTON ===== */ .gr-button[variant="stop"] { - background: #dc3545 !important; - color: white !important; + background: var(--oracle-red) !important; + color: var(--off-white) !important; + border: 1px solid var(--oracle-red) !important; } .gr-button[variant="stop"]:hover { - background: #c82333 !important; + background: #a13527 !important; + border: 1px solid #a13527 !important; +} + +/* ===== CHECKBOXES - MORE COMPACT ===== */ +.gr-checkbox-group { + display: flex !important; + gap: 12px !important; + flex-wrap: wrap !important; +} + +.gr-checkbox-group label { + 
font-size: 13px !important; + margin-bottom: 0 !important; +} + +/* ===== COMPACT SETTINGS SECTION ===== */ +.compact-settings { + background: var(--off-white) !important; + border: 1px solid var(--border-color) !important; + border-radius: 6px !important; + padding: 8px !important; + margin-top: 8px !important; +} + +.compact-settings .gr-row { + margin-bottom: 4px !important; +} + +.compact-settings .gr-dropdown { + margin-bottom: 4px !important; +} + +.compact-settings .gr-dropdown label { + font-size: 12px !important; + margin-bottom: 2px !important; +} + +.compact-settings .gr-checkbox { + margin: 0 !important; + padding: 4px !important; +} + +.compact-settings .gr-checkbox label { + font-size: 12px !important; + margin: 0 !important; +} + +/* Remove extra spacing in compact settings */ +.compact-settings > div { + gap: 4px !important; } diff --git a/ai/generative-ai-service/complex-document-rag/files/gradio_app.py b/ai/generative-ai-service/complex-document-rag/files/gradio_app.py index e41b98ebe..9e52af416 100644 --- a/ai/generative-ai-service/complex-document-rag/files/gradio_app.py +++ b/ai/generative-ai-service/complex-document-rag/files/gradio_app.py @@ -1,6 +1,9 @@ #!/usr/bin/env python3 """Oracle Enterprise RAG System Interface.""" +# Disable telemetry first to prevent startup errors +import disable_telemetry + import gradio as gr import logging import os @@ -82,7 +85,7 @@ def __init__(self) -> None: self._initialize_vector_store(self.current_embedding_model) self._initialize_rag_agent( self.current_llm_model, - collection="Multi-Collection", + collection="multi", embedding_model=self.current_embedding_model ) @@ -257,7 +260,7 @@ def _initialize_processors(self) -> Tuple[Optional[XLSXIngester], Optional[PDFIn def _initialize_rag_agent( self, llm_model: str, - collection: str = "Multi-Collection", + collection: str = "multi", embedding_model: Optional[str] = None ) -> bool: """ @@ -265,7 +268,7 @@ def _initialize_rag_agent( Args: llm_model: Name of the LLM model to use - collection: Name of the collection to use (default: "Multi-Collection") + collection: Name of the collection to use (default: "multi") embedding_model: Optional embedding model to switch to Returns: @@ -483,72 +486,30 @@ def create_oracle_interface(): placeholder="Deletion results will appear here..." ) - with gr.Tab("SEARCH COLLECTIONS", id="search"): - gr.Markdown("### Search through your vector store collections") - - # Add embedding model selector for search tab - with gr.Row(): - embedding_model_selector_search = gr.Dropdown( - choices=rag_system.available_embedding_models, - value=rag_system.current_embedding_model, - label="Embedding Model for Search", - info="Select the embedding model to use for searching" - ) - - with gr.Row(): - search_query = gr.Textbox( - label="Search Query", - placeholder="Enter search terms...", - scale=3 - ) - search_collection = gr.Dropdown( - choices=["PDF Documents", "XLSX Documents"], - value="XLSX Documents", - label="Collection", - scale=1 - ) - search_results_count = gr.Slider( - minimum=1, - maximum=20, - value=5, - step=1, - label="Results", - scale=1 - ) - - search_btn = gr.Button("Search", variant="secondary", elem_classes=["secondary-button"]) - - search_results = gr.Textbox( - elem_id="scientific-results-box", - label="Search Results", - lines=25, - max_lines=30, - placeholder="Search results will appear here..." 
- ) - with gr.Tab("INFERENCE & QUERY", id="inference"): with gr.Row(): - # Left Column - Input Controls + # Left Column - Query Input with gr.Column(scale=1, elem_classes=["inference-left-column"]): - # Query Section + # Large Query Section with gr.Group(elem_classes=["query-section"]): query_input = gr.Textbox( label="Query", - lines=4, - max_lines=6, + lines=15, # Much larger query area + max_lines=20, placeholder="Enter your query here...", elem_classes=["compact-query"] ) + query_btn = gr.Button( "Run Query", elem_classes=["primary-button"], - size="sm", + size="lg", elem_id="run-query-btn" ) - # Model Configuration - with gr.Group(elem_classes=["model-controls"]): - gr.HTML("
<h4>Model Configuration</h4>
") + # Compact Configuration Section - All in one group + with gr.Group(elem_classes=["compact-settings"]): + # Model Configuration in one row with gr.Row(): llm_model_selector = gr.Dropdown( choices=rag_system.available_llm_models, @@ -559,26 +520,16 @@ def create_oracle_interface(): embedding_model_selector_query = gr.Dropdown( choices=rag_system.available_embedding_models, value=rag_system.current_embedding_model, - label="Embedding Model", + label="Embeddings", interactive=True, scale=1 ) - - # Data Sources - with gr.Group(elem_classes=["collection-controls"]): - gr.HTML("
<h4>Data Sources</h4>
") + + # Data Sources and Processing Mode in one compact row with gr.Row(): - collection_pdf = gr.Checkbox(label="Include PDF Collection", value=False) - collection_xlsx = gr.Checkbox(label="Include XLSX Collection", value=False) - - # Processing Mode - with gr.Group(elem_classes=["processing-controls"]): - gr.HTML("
<h4>Processing Mode</h4>
") - agent_mode = gr.Checkbox( - label="Use Agentic Workflow", - value=False, - info="Enable advanced reasoning and multi-step processing" - ) + collection_pdf = gr.Checkbox(label="Include PDF", value=False, scale=1) + collection_xlsx = gr.Checkbox(label="Include XLSX", value=False, scale=1) + agent_mode = gr.Checkbox(label="Agentic Mode", value=False, scale=1) # Right Column - Results with gr.Column(scale=1, elem_classes=["inference-right-column"]): @@ -636,11 +587,11 @@ def process_pdf_and_clear(file, model, entity): outputs=[collection_documents] ) - search_btn.click( - fn=lambda q, coll, emb, n: search_chunks(q, coll, emb, rag_system, n), - inputs=[search_query, search_collection, embedding_model_selector_search, search_results_count], - outputs=[search_results] - ) + # search_btn.click( + # fn=lambda q, coll, emb, n: search_chunks(q, coll, emb, rag_system, n), + # inputs=[search_query, search_collection, embedding_model_selector_search, search_results_count], + # outputs=[search_results] + # ) list_chunks_btn.click( fn=lambda coll, emb: list_all_chunks(coll, emb, rag_system), @@ -697,8 +648,11 @@ def handle_query_with_download(query, llm_model, embedding_model, include_pdf, i gr.update(visible=False) ) - # Actually process the query - response, report_path = process_query(query, llm_model, embedding_model, include_pdf, include_xlsx, agentic, rag_system) + # Actually process the query with entity parameters + # Pass empty strings for entities to trigger automatic detection + entity1 = "" # Will be automatically detected by the LLM + entity2 = "" # Will be automatically detected by the LLM + response, report_path = process_query(query, llm_model, embedding_model, include_pdf, include_xlsx, agentic, rag_system, entity1, entity2) progress(1.0, desc="Complete!") @@ -720,6 +674,7 @@ def handle_query_with_download(query, llm_model, embedding_model, include_pdf, i query_btn.click( fn=handle_query_with_download, + # inputs=[query_input, llm_model_selector, embedding_model_selector_query, collection_pdf, collection_xlsx, agent_mode, entity1_input, entity2_input], inputs=[query_input, llm_model_selector, embedding_model_selector_query, collection_pdf, collection_xlsx, agent_mode], outputs=[status_box, response_box, download_file], show_progress="full" diff --git a/ai/generative-ai-service/complex-document-rag/files/handlers/pdf_handler.py b/ai/generative-ai-service/complex-document-rag/files/handlers/pdf_handler.py index 6fb4caeed..b4889de09 100644 --- a/ai/generative-ai-service/complex-document-rag/files/handlers/pdf_handler.py +++ b/ai/generative-ai-service/complex-document-rag/files/handlers/pdf_handler.py @@ -53,8 +53,8 @@ def progress(*args, **kwargs): return "❌ ERROR: Vector store not initialized", "" file_path = Path(file.name) - chunks, doc_id = rag_system.pdf_processor.process_pdf(file_path, entity=entity) - + chunks, doc_id, _ = rag_system.pdf_processor.ingest_pdf(file_path, entity=entity) + print("PDF processor type:", type(rag_system.pdf_processor)) progress(0.7, desc="Adding to vector store...") converted_chunks = [ diff --git a/ai/generative-ai-service/complex-document-rag/files/handlers/query_handler.py b/ai/generative-ai-service/complex-document-rag/files/handlers/query_handler.py index f16a174d8..b9b5f9abf 100644 --- a/ai/generative-ai-service/complex-document-rag/files/handlers/query_handler.py +++ b/ai/generative-ai-service/complex-document-rag/files/handlers/query_handler.py @@ -20,9 +20,11 @@ def process_query( include_xlsx: bool, agentic: bool, rag_system, + entity1: str = 
"", + entity2: str = "", progress=gr.Progress() ) -> Tuple[str, Optional[str]]: - """Process a query using the RAG system""" + """Process a query using the RAG system with optional entity specification""" if not query.strip(): return "ERROR: Please enter a query", None @@ -108,11 +110,24 @@ def _safe_query(collection_label: str, text: str, n: int = 5): progress(0.8, desc="Generating response...") + # Prepare provided entities if any + provided_entities = [] + if entity1 and entity1.strip(): + provided_entities.append(entity1.strip().lower()) + if entity2 and entity2.strip(): + provided_entities.append(entity2.strip().lower()) + + # Log entities being used + if provided_entities: + logger.info(f"Using provided entities: {provided_entities}") + if all_results: + # Pass provided entities to the RAG system result = rag_system.rag_agent.process_query_with_multi_collection_context( query, all_results, - collection_mode=active_collection + collection_mode=active_collection, + provided_entities=provided_entities if provided_entities else None ) # Ensure result is a dictionary if not isinstance(result, dict): @@ -140,12 +155,12 @@ def query_collection(collection_type): """Query a single collection in parallel""" try: if collection_type == "pdf": - # Increased to 20 chunks for non-agentic workflows - results = _safe_query("pdf", query, n=20) + # Optimized to 10 chunks for faster processing + results = _safe_query("pdf", query, n=10) return ("PDF", results if results else []) elif collection_type == "xlsx": - # Increased to 20 chunks for non-agentic workflows - results = _safe_query("xlsx", query, n=20) + # Optimized to 10 chunks for faster processing + results = _safe_query("xlsx", query, n=10) return ("XLSX", results if results else []) else: return (collection_type.upper(), []) @@ -178,8 +193,11 @@ def query_collection(collection_type): return "No relevant information found in selected collections.", None # Use more chunks for better context in non-agentic mode - # Take top 20 chunks total (or all if less than 20) - chunks_to_use = retrieved_chunks[:20] + # Optimize chunk usage based on model + if llm_model == "grok-4": + chunks_to_use = retrieved_chunks[:15] # Can handle more context + else: + chunks_to_use = retrieved_chunks[:10] # Optimized for speed context_str = "\n\n".join(chunk["content"] for chunk in chunks_to_use) prompt = f"""You are an expert assistant. 
diff --git a/ai/generative-ai-service/complex-document-rag/files/handlers/vector_handler.py b/ai/generative-ai-service/complex-document-rag/files/handlers/vector_handler.py index 782700a0a..f37cfada1 100644 --- a/ai/generative-ai-service/complex-document-rag/files/handlers/vector_handler.py +++ b/ai/generative-ai-service/complex-document-rag/files/handlers/vector_handler.py @@ -319,8 +319,13 @@ def delete_all_chunks_in_collection(collection_name: str, embedding_model: str, client = rag_system.vector_store.client all_colls = client.list_collections() - # Find all physical collections for this logical group (e.g., xlsx_documents_*) - targets = [c for c in all_colls if c.name.startswith(f"{base_prefix}_")] + # Find all physical collections for this logical group (e.g., xlsx_documents_* or pdf_documents_*) + targets = [] + for c in all_colls: + # Handle both collection objects and dict representations + coll_name = getattr(c, 'name', None) or (c.get('name') if isinstance(c, dict) else str(c)) + if coll_name and coll_name.startswith(f"{base_prefix}_"): + targets.append((coll_name, c)) if not targets: return f"Collection group '{collection_name}' has no collections to delete." @@ -328,21 +333,31 @@ def delete_all_chunks_in_collection(collection_name: str, embedding_model: str, # Delete them all total_deleted_chunks = 0 deleted_names = [] - for coll in targets: + for coll_name, coll_obj in targets: try: count = 0 try: - count = coll.count() + # Get the actual collection object if we only have the name + if isinstance(coll_obj, str): + actual_coll = client.get_collection(coll_name) + else: + actual_coll = coll_obj + count = actual_coll.count() except Exception: pass + total_deleted_chunks += count - client.delete_collection(coll.name) - deleted_names.append(coll.name) - # Also drop from in-memory map if present + client.delete_collection(coll_name) + deleted_names.append(coll_name) + + # Clean up all in-memory references if hasattr(rag_system.vector_store, "collections"): - rag_system.vector_store.collections.pop(coll.name, None) + rag_system.vector_store.collections.pop(coll_name, None) + if hasattr(rag_system.vector_store, "collection_map"): + rag_system.vector_store.collection_map.pop(coll_name, None) + except Exception as e: - logging.error(f"Failed to delete collection '{coll.name}': {e}") + logging.error(f"Failed to delete collection '{coll_name}': {e}") # Recreate the CURRENT model's empty collection so the app keeps a live handle # Build full name like: {base_prefix}_{model_name}_{dimensions} @@ -357,16 +372,28 @@ def delete_all_chunks_in_collection(collection_name: str, embedding_model: str, new_full_name = f"{base_prefix}_{model_name}_{dims}" new_collection = client.get_or_create_collection(name=new_full_name, metadata=metadata) - # Refresh vector_store references for this base prefix + # Refresh ALL vector_store references comprehensively if hasattr(rag_system.vector_store, "collections"): rag_system.vector_store.collections[new_full_name] = new_collection - # Also store under the base key for compatibility with older code paths - rag_system.vector_store.collections[base_prefix] = new_collection - + + if hasattr(rag_system.vector_store, "collection_map"): + rag_system.vector_store.collection_map[new_full_name] = new_collection + # Also ensure the collection_map is properly updated + rag_system.vector_store.collection_map = { + k: v for k, v in rag_system.vector_store.collection_map.items() + if not k.startswith(f"{base_prefix}_") or k == new_full_name + } + 
rag_system.vector_store.collection_map[new_full_name] = new_collection + + # Update the specific collection references if base_prefix == "xlsx_documents": rag_system.vector_store.xlsx_collection = new_collection + if hasattr(rag_system.vector_store, "current_xlsx_collection_name"): + rag_system.vector_store.current_xlsx_collection_name = new_full_name elif base_prefix == "pdf_documents": rag_system.vector_store.pdf_collection = new_collection + if hasattr(rag_system.vector_store, "current_pdf_collection_name"): + rag_system.vector_store.current_pdf_collection_name = new_full_name # Nice summary deleted_list = "\n".join(f" β€’ {name}" for name in deleted_names) if deleted_names else " β€’ (none)" @@ -374,7 +401,7 @@ def delete_all_chunks_in_collection(collection_name: str, embedding_model: str, "βœ… DELETION COMPLETED\n\n" f"Logical collection: {collection_name}\n" f"Collections removed: {len(deleted_names)}\n" - f"Total chunks deleted (best-effort): {total_deleted_chunks}\n" + f"Total chunks deleted: {total_deleted_chunks}\n" f"Deleted collections:\n{deleted_list}\n\n" "Recreated empty collection for current model:\n" f" β€’ {new_full_name}\n" diff --git a/ai/generative-ai-service/complex-document-rag/files/handlers/xlsx_handler.py b/ai/generative-ai-service/complex-document-rag/files/handlers/xlsx_handler.py index 3c4ca27c0..95a118aaf 100644 --- a/ai/generative-ai-service/complex-document-rag/files/handlers/xlsx_handler.py +++ b/ai/generative-ai-service/complex-document-rag/files/handlers/xlsx_handler.py @@ -56,7 +56,15 @@ def progress(*args, **kwargs): return "❌ ERROR: Vector store not initialized", "" file_path = Path(file.name) - chunks, doc_id = rag_system.xlsx_processor.ingest_xlsx(file_path, entity=entity) + # Now returns 3 values: chunks, doc_id, and chunks_to_delete + result = rag_system.xlsx_processor.ingest_xlsx(file_path, entity=entity) + + # Handle both old (2-tuple) and new (3-tuple) return formats + if len(result) == 3: + chunks, doc_id, chunks_to_delete = result + else: + chunks, doc_id = result + chunks_to_delete = [] progress(0.7, desc="Adding to vector store...") @@ -69,6 +77,17 @@ def progress(*args, **kwargs): for chunk in chunks ] + # Delete original chunks FIRST if they were rewritten + if chunks_to_delete and hasattr(rag_system.vector_store, 'delete_chunks'): + progress(0.7, desc="Removing original chunks that were rewritten...") + try: + rag_system.vector_store.delete_chunks('xlsx_documents', chunks_to_delete) + logger.info(f"Deleted {len(chunks_to_delete)} original chunks that were rewritten") + except Exception as e: + logger.warning(f"Could not delete original chunks: {e}") + + # THEN add the new rewritten chunks to vector store + progress(0.8, desc="Adding rewritten chunks to vector store...") rag_system.vector_store.add_xlsx_chunks(converted_chunks, doc_id) progress(1.0, desc="Complete!") @@ -78,6 +97,9 @@ def progress(*args, **kwargs): actual_collection_name = rag_system.vector_store.xlsx_collection.name collection_name = f"{actual_collection_name} ({embedding_model})" + + # Count rewritten chunks + rewritten_count = sum(1 for chunk in chunks if chunk.get('metadata', {}).get('rewritten', False)) summary = f""" βœ… **XLSX PROCESSING COMPLETE** @@ -86,6 +108,7 @@ def progress(*args, **kwargs): **Document ID:** {doc_id} **Entity:** {entity} **Chunks created:** {len(chunks)} +**Chunks with rewritten content:** {rewritten_count} **Embedding model:** {embedding_model} **Collection:** {collection_name} @@ -123,4 +146,3 @@ def progress(*args, **kwargs): error_msg = 
f"❌ ERROR: Processing XLSX file failed: {str(e)}" logger.error(f"{error_msg}\n{traceback.format_exc()}") return error_msg, traceback.format_exc() - diff --git a/ai/generative-ai-service/complex-document-rag/files/ingest_pdf.py b/ai/generative-ai-service/complex-document-rag/files/ingest_pdf.py index 83110ebd5..f2542c54c 100644 --- a/ai/generative-ai-service/complex-document-rag/files/ingest_pdf.py +++ b/ai/generative-ai-service/complex-document-rag/files/ingest_pdf.py @@ -1,159 +1,355 @@ +# pdf_ingester_v2.py +import logging, time, uuid, re, os from pathlib import Path from typing import List, Dict, Any, Optional, Tuple -import uuid -import time -import re import tiktoken +import pandas as pd + +# Hard deps you likely already have: import pdfplumber -import logging + +# Optional but recommended for tables +try: + import camelot + _HAS_CAMELOT = True +except Exception: + _HAS_CAMELOT = False + +# Optional for embedded files +try: + from pypdf import PdfReader + _HAS_PYPDF = True +except Exception: + _HAS_PYPDF = False logger = logging.getLogger(__name__) class PDFIngester: - def __init__(self, tokenizer: str = "BAAI/bge-small-en-v1.5", chunk_rewriter=None): + """ + PDF -> chunks with consistent semantics to XLSXIngester. + Strategy: + 1) Detect embedded spreadsheets -> delegate to XLSXIngester + 2) Try Camelot (lattice->stream) for vector tables + 3) Fallback to pdfplumber tables + 4) Extract remaining prose blocks + 5) Batch + select + batch-rewrite (same as XLSX flow) + """ + + def __init__(self, tokenizer: str = "BAAI/bge-small-en-v1.5", + chunk_rewriter=None, + batch_size: int = 16): + self.tokenizer_name = tokenizer self.chunk_rewriter = chunk_rewriter + self.batch_size = batch_size self.accurate_tokenizer = tiktoken.get_encoding("cl100k_base") - self.tokenizer_name = tokenizer self.stats = { 'total_chunks': 0, 'rewritten_chunks': 0, - 'processing_time': 0, - 'rewriting_time': 0 + 'high_value_chunks': 0, + 'processing_time': 0.0, + 'extraction_time': 0.0, + 'rewriting_time': 0.0, + 'selection_time': 0.0 } - logger.info("πŸ“„ PDF processor initialized") - + # ---------- Utility parity with XLSX ---------- def _count_tokens(self, text: str) -> int: if not text or not text.strip(): return 0 return len(self.accurate_tokenizer.encode(text)) - def _should_rewrite(self, text: str) -> bool: - if not text.strip() or self._count_tokens(text) < 120: - return False + def _is_high_value_chunk(self, text: str, metadata: Dict[str, Any]) -> int: + # Same heuristic as your XLSX version (copy/paste with tiny tweaks) + if len(text.strip()) < 100: + return 0 + score = 0 + if re.search(r'\d+\.?\d*\s*(%|MW|GW|tCO2|ktCO2|MtCO2|€|\$|Β£|million|billion)', + text, re.IGNORECASE): + score += 2 + key_terms = ['revenue','guidance','margin','cash flow','eps', + 'emission','target','reduction','scope','net-zero', + 'renewable','sustainability','biodiversity'] + score += min(2, sum(1 for term in key_terms if term in text.lower())) + if text.count('|') > 5: + score += 1 + skip_indicators = ['cover', 'disclaimer', 'notice', 'table of contents'] + if any(skip in text.lower()[:200] for skip in skip_indicators): + score = max(0, score - 2) + return min(5, score) - pipe_count = text.count('|') - number_ratio = sum(c.isdigit() for c in text) / len(text) if text else 0 - line_count = len(text.splitlines()) + def _batch_rows_by_token_count(self, rows: List[str], max_tokens: int = 400) -> List[List[str]]: + chunks, current, tok = [], [], 0.0 + for row in rows: + if not row or not row.strip(): + continue + est = 
len(row.split()) * 1.3 + if tok + est > max_tokens: + if current: chunks.append(current) + current, tok = [row], est + else: + current.append(row); tok += est + if current: chunks.append(current) + return chunks - is_tabular = (pipe_count > 10 or number_ratio > 0.3 or line_count > 20) - messy = 'nan' in text.lower() or 'null' in text.lower() - sentence_count = len([s for s in text.split('.') if s.strip()]) - is_prose = sentence_count > 3 and pipe_count < 5 + def _batch_rewrite_chunks(self, chunks_to_rewrite: List[Tuple[str, Dict[str, Any], int]]): + if not chunks_to_rewrite or not self.chunk_rewriter: + return chunks_to_rewrite + start = time.time() + results = [] - return (is_tabular or messy) and not is_prose + # Fast path if your rewriter supports batch + if hasattr(self.chunk_rewriter, 'rewrite_chunks_batch'): + BATCH_SIZE = min(self.batch_size, len(chunks_to_rewrite)) + batches = [chunks_to_rewrite[i:i+BATCH_SIZE] + for i in range(0, len(chunks_to_rewrite), BATCH_SIZE)] - def _rewrite_chunk(self, text: str, metadata: Dict[str, Any]) -> str: - if not self.chunk_rewriter: - return text + for bidx, batch in enumerate(batches, 1): + batch_input = [{'text': t, 'metadata': m} for (t, m, _) in batch] + try: + rewritten = self.chunk_rewriter.rewrite_chunks_batch(batch_input, batch_size=BATCH_SIZE) + except Exception as e: + logger.warning(f"⚠️ Batch {bidx} failed: {e}") + rewritten = [None]*len(batch) + for i, (orig_text, meta, idx) in enumerate(batch): + new_text = rewritten[i] if i < len(rewritten) else None + if new_text and new_text != orig_text: + meta = meta.copy() + meta['rewritten'] = True + self.stats['rewritten_chunks'] += 1 + results.append((new_text, meta, idx)) + else: + results.append((orig_text, meta, idx)) + else: + # Sequential fallback + for (t, m, idx) in chunks_to_rewrite: + try: + new_t = self.chunk_rewriter.rewrite_chunk(t, metadata=m).strip() + except Exception as e: + logger.warning(f"⚠️ Rewrite failed for chunk {idx}: {e}") + new_t = None + if new_t and new_t != t: + m = m.copy(); m['rewritten'] = True + self.stats['rewritten_chunks'] += 1 + results.append((new_t, m, idx)) + else: + results.append((t, m, idx)) + self.stats['rewriting_time'] += time.time() - start + return results + + # ---------- Ingestion helpers ---------- + def _find_embedded_spreadsheets(self, pdf_path: Path) -> List[Tuple[str, bytes]]: + if not _HAS_PYPDF: + return [] try: - rewritten = self.chunk_rewriter.rewrite_chunk(text, metadata=metadata).strip() - if rewritten: - self.stats['rewritten_chunks'] += 1 - return rewritten - except Exception as e: - logger.warning(f"⚠️ Rewrite failed: {e}") - return text - - def process_pdf( - self, - file_path: str | Path, - entity: Optional[str] = None, - max_rewrite_chunks: int = 100 - ) -> Tuple[List[Dict[str, Any]], str]: - start_time = time.time() - self.stats = { - 'total_chunks': 0, - 'rewritten_chunks': 0, - 'processing_time': 0, - 'rewriting_time': 0 - } - all_chunks = [] - rewrite_candidates = [] - document_id = str(uuid.uuid4()) + reader = PdfReader(str(pdf_path)) + names_tree = reader.trailer.get("/Root", {}).get("/Names", {}) + efiles = names_tree.get("/EmbeddedFiles", {}) + names = efiles.get("/Names", []) + pairs = list(zip(names[::2], names[1::2])) + out = [] + for fname, ref in pairs: + spec = ref.getObject() + if "/EF" in spec and "/F" in spec["/EF"]: + data = spec["/EF"]["/F"].getData() + if str(fname).lower().endswith((".xlsx", ".xls", ".csv")): + out.append((str(fname), data)) + return out + except Exception: + return [] - # -------- 1. 
Validate Inputs -------- + def _extract_tables_with_camelot(self, pdf_path: Path, pages="all") -> List[pd.DataFrame]: + if not _HAS_CAMELOT: + return [] + dfs: List[pd.DataFrame] = [] try: - file = Path(file_path) - if not file.exists() or not file.is_file(): - raise FileNotFoundError(f"File not found: {file_path}") - if not str(file).lower().endswith(('.pdf',)): - raise ValueError(f"File must be a PDF: {file_path}") + # 1) lattice first + tables = camelot.read_pdf(str(pdf_path), pages=pages, flavor="lattice", line_scale=40) + dfs.extend([t.df for t in tables] if tables else []) + # 2) stream fallback if sparse + if not dfs: + tables = camelot.read_pdf(str(pdf_path), pages=pages, flavor="stream", edge_tol=200) + dfs.extend([t.df for t in tables] if tables else []) except Exception as e: - logger.error(f"❌ Error opening file: {e}") - return [], document_id + logger.info(f"Camelot failed: {e}") + return dfs - if not entity or not isinstance(entity, str): - logger.error("❌ Entity name must be provided as a non-empty string when ingesting a PDF file.") - return [], document_id - entity = entity.strip().lower() - - logger.info(f"πŸ“„ Processing {file.name}") - - # -------- 2. Main Extraction -------- - try: - with pdfplumber.open(file) as pdf: - for page_num, page in enumerate(pdf.pages): - try: - text = page.extract_text() - except Exception as e: - logger.warning(f"⚠️ Failed to extract text from page {page_num+1}: {e}") + def _extract_tables_with_pdfplumber(self, pdf_path: Path) -> List[Tuple[pd.DataFrame, int]]: + out = [] + with pdfplumber.open(str(pdf_path)) as pdf: + for pno, page in enumerate(pdf.pages, 1): + try: + tables = page.extract_tables() or [] + except Exception: + tables = [] + for tbl in tables: + if not tbl or len(tbl) < 2: # need header + at least 1 row continue + df = pd.DataFrame(tbl[1:], columns=tbl[0]) + out.append((df, pno)) + return out - if not text or len(text.strip()) < 50: - logger.debug(f"Skipping short/empty page {page_num+1}") - continue + def _df_to_rows(self, df: pd.DataFrame) -> List[str]: + # Normalize like your XLSX rows + df = df.copy() + df = df.replace(r'\n', ' ', regex=True) + df.columns = [str(c).strip() for c in df.columns] + return [ " | ".join([str(v) for v in row if (pd.notna(v) and str(v).strip())]) + for _, row in df.iterrows() ] - metadata = { - "page": page_num + 1, - "source": str(file), - "filename": file.name, - "entity": entity, - "document_id": document_id, - "type": "pdf_page" - } + def _extract_prose_blocks(self, pdf_path: Path) -> List[Tuple[str, int]]: + blocks = [] + with pdfplumber.open(str(pdf_path)) as pdf: + for pno, page in enumerate(pdf.pages, 1): + try: + text = page.extract_text() or "" + except Exception: + text = "" + text = re.sub(r'[ \t]+\n', '\n', text) # unwrap ragged whitespace + text = re.sub(r'\n{3,}', '\n\n', text) + if len(text.strip()) >= 40: + blocks.append((text.strip(), pno)) + return blocks - self.stats['total_chunks'] += 1 + # ---------- Public API ---------- + def ingest_pdf(self, + file_path: str | Path, + entity: Optional[str] = None, + max_rewrite_chunks: int = 30, + min_chunk_score: int = 2, + delete_original_if_rewritten: bool = True, + prefer_tables_first: bool = True + ) -> Tuple[List[Dict[str, Any]], str, List[str]]: + """ + Returns (chunks, document_id, original_chunk_ids_to_delete) + """ + start = time.time() + self.stats = {k: 0.0 if 'time' in k else 0 for k in self.stats} + all_chunks: List[Dict[str, Any]] = [] + original_chunks_to_delete: List[str] = [] + doc_id = str(uuid.uuid4()) - if 
self._should_rewrite(text): - rewrite_candidates.append((text, metadata)) - else: - all_chunks.append({"content": text.strip(), "metadata": metadata}) - except Exception as e: - logger.error(f"❌ PDF read error: {e}") - return [], document_id + file = Path(file_path) + if not file.exists() or not file.is_file() or not file.suffix.lower() == ".pdf": + raise FileNotFoundError(f"Not a PDF: {file_path}") + if not entity or not isinstance(entity, str): + raise ValueError("Entity name must be provided") + entity = entity.strip().lower() - # -------- 3. Rewrite Candidates (if needed) -------- - rewritten_chunks = [] - try: - if self.chunk_rewriter and rewrite_candidates: - logger.info(f"🧠 Rewriting {min(len(rewrite_candidates), max_rewrite_chunks)} of {len(rewrite_candidates)} chunks") - rewrite_candidates = rewrite_candidates[:max_rewrite_chunks] - for text, metadata in rewrite_candidates: - rewritten = self._rewrite_chunk(text, metadata) - metadata = dict(metadata) # make a copy for safety - metadata["rewritten"] = True - rewritten_chunks.append({"content": rewritten, "metadata": metadata}) + # 0) Router: embedded spreadsheets? + embedded = self._find_embedded_spreadsheets(file) + if embedded: + # Save, then delegate to your XLSX flow for each + from your_xlsx_module import XLSXIngester # <-- import your class + xlsx_ingester = XLSXIngester(chunk_rewriter=self.chunk_rewriter) + for fname, data in embedded: + tmp = file.with_name(f"__embedded__{fname}") + with open(tmp, "wb") as f: f.write(data) + x_chunks, _, _ = xlsx_ingester.ingest_xlsx( + tmp, entity=entity, + max_rewrite_chunks=max_rewrite_chunks, + min_chunk_score=min_chunk_score, + delete_original_if_rewritten=delete_original_if_rewritten + ) + # Tag source and page unknown for embedded + for ch in x_chunks: + ch['metadata']['source_pdf'] = str(file) + ch['metadata']['embedded_file'] = fname + all_chunks.append(ch) + try: os.remove(tmp) + except Exception: pass + # Note: continue to extract PDF content as well (often desirable) + + # 1) Tables (Camelot β†’ pdfplumber) + extraction_start = time.time() + table_chunks: List[Dict[str, Any]] = [] + if prefer_tables_first: + dfs = self._extract_tables_with_camelot(file) + if not dfs: + for df, pno in self._extract_tables_with_pdfplumber(file): + rows = self._df_to_rows(df) + if not rows: continue + chunks = self._batch_rows_by_token_count(rows) + for cidx, rows_batch in enumerate(chunks): + content = f"Detected Table (pdfplumber)\n" + "\n".join(rows_batch) + meta = { + "page": pno, + "source": str(file), + "filename": file.name, + "entity": entity, + "document_id": doc_id, + "type": "pdf_table", + "extractor": "pdfplumber" + } + table_chunks.append({'id': f"{doc_id}_chunk_{len(all_chunks)+len(table_chunks)}", + 'content': content, 'metadata': meta}) else: - rewritten_chunks = [{"content": text, "metadata": metadata} for text, metadata in rewrite_candidates] - except Exception as e: - logger.warning(f"⚠️ Error rewriting chunks: {e}") - for text, metadata in rewrite_candidates: - rewritten_chunks.append({"content": text, "metadata": metadata}) + # Camelot doesn't preserve page numbers directly; we’ll mark unknown unless available on t.parsing_report + for t_idx, df in enumerate(dfs): + rows = self._df_to_rows(df) + if not rows: continue + chunks = self._batch_rows_by_token_count(rows) + for cidx, rows_batch in enumerate(chunks): + content = f"Detected Table (camelot)\n" + "\n".join(rows_batch) + meta = { + "page": None, # could be added by parsing report if needed + "source": str(file), + 
"filename": file.name, + "entity": entity, + "document_id": doc_id, + "type": "pdf_table", + "extractor": "camelot", + "table_index": t_idx + } + table_chunks.append({'id': f"{doc_id}_chunk_{len(all_chunks)+len(table_chunks)}", + 'content': content, 'metadata': meta}) - all_chunks.extend(rewritten_chunks) + # 2) Prose blocks + prose_chunks: List[Dict[str, Any]] = [] + for text, pno in self._extract_prose_blocks(file): + meta = { + "page": pno, "source": str(file), "filename": file.name, + "entity": entity, "document_id": doc_id, "type": "pdf_page_text" + } + prose_chunks.append({'id': f"{doc_id}_chunk_{len(all_chunks)+len(table_chunks)+len(prose_chunks)}", + 'content': text, 'metadata': meta}) - # -------- 4. Finalize IDs and Metadata -------- - try: - for i, chunk in enumerate(all_chunks): - chunk["id"] = f"{document_id}_chunk_{i}" - chunk.setdefault("metadata", {}) - chunk["metadata"]["document_id"] = document_id - except Exception as e: - logger.warning(f"⚠️ Error finalizing chunk IDs: {e}") + extracted = (table_chunks + prose_chunks) if prefer_tables_first else (prose_chunks + table_chunks) + all_chunks.extend(extracted) + self.stats['extraction_time'] = time.time() - extraction_start + self.stats['total_chunks'] = len(all_chunks) + + # 3) Smart selection + rewriting (same semantics as XLSX) + if self.chunk_rewriter and max_rewrite_chunks > 0 and all_chunks: + # score + selection_start = time.time() + scored = [] + for i, ch in enumerate(all_chunks): + s = self._is_high_value_chunk(ch['content'], ch['metadata']) + if s >= min_chunk_score: + scored.append((ch['content'], ch['metadata'], i, s)) + self.stats['high_value_chunks'] += 1 + scored.sort(key=lambda x: x[3], reverse=True) + to_rewrite = [(t, m, idx) for (t, m, idx, _) in scored[:max_rewrite_chunks]] + self.stats['selection_time'] = time.time() - selection_start + + # rewrite + rewritten = self._batch_rewrite_chunks(to_rewrite) + for new_text, new_meta, original_idx in rewritten: + if new_meta.get('rewritten'): + original_id = all_chunks[original_idx]['id'] + if delete_original_if_rewritten: + # replace in place but mark original id for vector-store deletion + original_chunks_to_delete.append(original_id) + new_id = f"{original_id}_rewritten" + all_chunks[original_idx]['id'] = new_id + all_chunks[original_idx]['content'] = new_text + all_chunks[original_idx]['metadata'] = {**all_chunks[original_idx]['metadata'], **new_meta, + "original_chunk_id": original_id} - self.stats['processing_time'] = time.time() - start_time - logger.info(f"βœ… PDF processing complete in {self.stats['processing_time']:.2f}s β€” Total: {len(all_chunks)}") + self.stats['processing_time'] = time.time() - start + logger.info(f"βœ… PDF processed: {file.name} β€” chunks: {len(all_chunks)}; " + f"extract {self.stats['extraction_time']:.2f}s; " + f"rewrite {self.stats['rewriting_time']:.2f}s") - return all_chunks, document_id + return all_chunks, doc_id, original_chunks_to_delete diff --git a/ai/generative-ai-service/complex-document-rag/files/ingest_xlsx.py b/ai/generative-ai-service/complex-document-rag/files/ingest_xlsx.py index a3729ba4d..22d3d355a 100644 --- a/ai/generative-ai-service/complex-document-rag/files/ingest_xlsx.py +++ b/ai/generative-ai-service/complex-document-rag/files/ingest_xlsx.py @@ -116,8 +116,8 @@ def _batch_rows_by_token_count(self, rows: List[str], max_tokens: int = 400) -> return chunks - def _batch_rewrite_chunks(self, chunks_to_rewrite: List[Tuple[str, Dict[str, Any]]]) -> List[Tuple[str, Dict[str, Any]]]: - """Fast parallel batch 
rewriting""" + def _batch_rewrite_chunks(self, chunks_to_rewrite: List[Tuple[str, Dict[str, Any], int]]) -> List[Tuple[str, Dict[str, Any], int]]: + """Fast parallel batch rewriting - now returns tuples with indices""" if not chunks_to_rewrite or not self.chunk_rewriter: return chunks_to_rewrite @@ -140,22 +140,29 @@ def _batch_rewrite_chunks(self, chunks_to_rewrite: List[Tuple[str, Dict[str, Any logger.info(f"πŸ“¦ Processing {len(batches)} batches of size {BATCH_SIZE}") - def process_batch(batch_idx: int, batch: List[Tuple[str, Dict[str, Any]]]): - batch_input = [{'text': text, 'metadata': metadata} for text, metadata in batch] + def process_batch(batch_idx: int, batch: List[Tuple[str, Dict[str, Any], int]]): + batch_input = [{'text': text, 'metadata': metadata} for text, metadata, _ in batch] try: logger.info(f" Processing batch {batch_idx + 1}/{len(batches)}") rewritten_texts = self.chunk_rewriter.rewrite_chunks_batch(batch_input, batch_size=BATCH_SIZE) batch_result = [] - for i, (original_text, metadata) in enumerate(batch): + for i, (original_text, metadata, chunk_idx) in enumerate(batch): rewritten_text = rewritten_texts[i] if i < len(rewritten_texts) else None - if rewritten_text and rewritten_text != original_text: + # Check for None (failure) or empty string (failure) explicitly + if rewritten_text is None or rewritten_text == "": + logger.warning(f" ⚠️ Chunk {chunk_idx} rewriting failed, keeping original") + batch_result.append((original_text, metadata, chunk_idx)) + elif rewritten_text != original_text: + # Successfully rewritten and different from original metadata = metadata.copy() metadata["rewritten"] = True + metadata["original_chunk_id"] = f"{metadata.get('document_id', '')}_chunk_{chunk_idx}" self.stats['rewritten_chunks'] += 1 - batch_result.append((rewritten_text, metadata)) + batch_result.append((rewritten_text, metadata, chunk_idx)) else: - batch_result.append((original_text, metadata)) + # Rewritten but same as original (no changes needed) + batch_result.append((original_text, metadata, chunk_idx)) logger.info(f" βœ… Batch {batch_idx + 1} complete") return batch_result @@ -178,19 +185,20 @@ def process_batch(batch_idx: int, batch: List[Tuple[str, Dict[str, Any]]]): else: # Fallback to sequential processing logger.info(f"πŸ”„ Sequential rewriting for {len(chunks_to_rewrite)} chunks") - for text, metadata in chunks_to_rewrite: + for text, metadata, chunk_idx in chunks_to_rewrite: try: rewritten = self.chunk_rewriter.rewrite_chunk(text, metadata=metadata).strip() if rewritten: metadata = metadata.copy() metadata["rewritten"] = True + metadata["original_chunk_id"] = f"{metadata.get('document_id', '')}_chunk_{chunk_idx}" self.stats['rewritten_chunks'] += 1 - results.append((rewritten, metadata)) + results.append((rewritten, metadata, chunk_idx)) else: - results.append((text, metadata)) + results.append((text, metadata, chunk_idx)) except Exception as e: logger.warning(f"Failed to rewrite chunk: {e}") - results.append((text, metadata)) + results.append((text, metadata, chunk_idx)) self.stats['rewriting_time'] = time.time() - start_time return results @@ -200,9 +208,14 @@ def ingest_xlsx( file_path: str | Path, entity: Optional[str] = None, max_rewrite_chunks: int = 30, # Reasonable default - min_chunk_score: int = 2 # Only rewrite chunks with score >= 2 - ) -> Tuple[List[Dict[str, Any]], str]: - """Fast XLSX processing with smart chunk selection""" + min_chunk_score: int = 2, # Only rewrite chunks with score >= 2 + delete_original_if_rewritten: bool = True # New parameter 
+ ) -> Tuple[List[Dict[str, Any]], str, List[str]]: + """Fast XLSX processing with smart chunk selection + + Returns: + Tuple of (chunks, document_id, original_chunk_ids_to_delete) + """ start_time = time.time() self.stats = { @@ -216,6 +229,7 @@ def ingest_xlsx( } all_chunks = [] document_id = str(uuid.uuid4()) + original_chunks_to_delete = [] # Validate inputs file = Path(file_path) @@ -297,38 +311,43 @@ def ingest_xlsx( # Smart chunk selection for rewriting selection_start = time.time() if self.chunk_rewriter and max_rewrite_chunks > 0: - # Score all chunks + # Score all chunks and include their indices scored_chunks = [] - for chunk in all_chunks: + for i, chunk in enumerate(all_chunks): score = self._is_high_value_chunk(chunk['content'], chunk['metadata']) if score >= min_chunk_score: - scored_chunks.append((chunk['content'], chunk['metadata'], score)) + scored_chunks.append((chunk['content'], chunk['metadata'], i, score)) self.stats['high_value_chunks'] += 1 # Sort by score and take top N - scored_chunks.sort(key=lambda x: x[2], reverse=True) - chunks_to_rewrite = [(text, meta) for text, meta, _ in scored_chunks[:max_rewrite_chunks]] + scored_chunks.sort(key=lambda x: x[3], reverse=True) + chunks_to_rewrite = [(text, meta, idx) for text, meta, idx, _ in scored_chunks[:max_rewrite_chunks]] self.stats['selection_time'] = time.time() - selection_start - logger.info(f"🎯 Selected {len(chunks_to_rewrite)} high-value chunks from {self.stats['high_value_chunks']} candidates in {self.stats['selection_time']:.2f}s") + logger.info(f"Selected {len(chunks_to_rewrite)} high-value chunks from {self.stats['high_value_chunks']} candidates in {self.stats['selection_time']:.2f}s") if chunks_to_rewrite: # Rewrite selected chunks rewritten = self._batch_rewrite_chunks(chunks_to_rewrite) - # Create mapping for quick lookup - rewritten_map = {} - for text, meta in rewritten: - if meta.get('rewritten'): - key = f"{meta['sheet']}_{meta.get('chunk_index', 0)}" - rewritten_map[key] = text - # Update original chunks with rewritten content - for chunk in all_chunks: - key = f"{chunk['metadata']['sheet']}_{chunk['metadata'].get('chunk_index', 0)}" - if key in rewritten_map: - chunk['content'] = rewritten_map[key] - chunk['metadata']['rewritten'] = True + for rewritten_text, rewritten_meta, original_idx in rewritten: + if rewritten_meta.get('rewritten'): + # Store the original chunk ID for deletion + original_chunk_id = all_chunks[original_idx]['id'] + if delete_original_if_rewritten: + original_chunks_to_delete.append(original_chunk_id) + + # Create NEW ID for rewritten chunk (append _rewritten) + new_chunk_id = f"{original_chunk_id}_rewritten" + + # Update the chunk with rewritten content and NEW ID + all_chunks[original_idx]['id'] = new_chunk_id + all_chunks[original_idx]['content'] = rewritten_text + all_chunks[original_idx]['metadata'] = rewritten_meta + all_chunks[original_idx]['metadata']['original_chunk_id'] = original_chunk_id + + logger.info(f"βœ… Replaced chunk {original_idx} with rewritten version (new ID: {new_chunk_id})") self.stats['processing_time'] = time.time() - start_time @@ -339,6 +358,8 @@ def ingest_xlsx( logger.info(f"πŸ“Š Total chunks: {len(all_chunks)}") logger.info(f"🎯 High-value chunks: {self.stats['high_value_chunks']}") logger.info(f"πŸ”₯ Rewritten chunks: {self.stats['rewritten_chunks']}") + if original_chunks_to_delete: + logger.info(f"πŸ—‘οΈ Original chunks to delete: {len(original_chunks_to_delete)}") logger.info(f"\n⏱️ TIMING BREAKDOWN:") logger.info(f" Extraction: 
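Worth spelling out: the new third return value only marks superseded chunk IDs; nothing is removed from the store until the caller purges them. A sketch of the intended hand-off, assuming the `delete_chunks` method added to vector_store.py later in this series (the file path is illustrative):

```python
from ingest_xlsx import XLSXIngester
from vector_store import EnhancedVectorStore

# chunk_rewriter=None keeps the example runnable; with a real rewriter attached
# and delete_original_if_rewritten=True, superseded IDs come back in to_delete.
ingester = XLSXIngester(chunk_rewriter=None)
chunks, doc_id, to_delete = ingester.ingest_xlsx(
    "sample_data/esg_metrics.xlsx",  # illustrative path
    entity="companyb",
    delete_original_if_rewritten=True,
)

store = EnhancedVectorStore()
store.add_xlsx_chunks(chunks, doc_id)
if to_delete:
    store.delete_chunks("xlsx_documents", to_delete)  # purge replaced originals
```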
diff --git a/ai/generative-ai-service/complex-document-rag/files/local_rag_agent.py b/ai/generative-ai-service/complex-document-rag/files/local_rag_agent.py
index b5bf1c580..5b8b29f55 100644
--- a/ai/generative-ai-service/complex-document-rag/files/local_rag_agent.py
+++ b/ai/generative-ai-service/complex-document-rag/files/local_rag_agent.py
@@ -74,7 +74,7 @@ class OCIModelHandler:
     "grok-4": {
         "model_id": os.getenv("OCI_GROK_4_MODEL_ID"),
         "request_type": "generic",
-        "max_output_tokens": 120000,
+        "max_output_tokens": 8000,  # Reduced from 120000 for faster response
         "default_params": {
             "temperature": 1,
             "top_p": 1
@@ -84,7 +84,7 @@
         "model_id": os.getenv("OCI_GROK_3_MODEL_ID",
                               os.getenv("GROK_MODEL_ID")),
         "request_type": "generic",
-        "max_output_tokens": 16000,
+        "max_output_tokens": 8000,  # Reduced from 16000 for consistency
         "default_params": {
             "temperature": 0.7,
             "top_p": 0.9
@@ -94,7 +94,7 @@
         "model_id": os.getenv("OCI_GROK_3_FAST_MODEL_ID",
                               os.getenv("GROK_MODEL_ID")),
         "request_type": "generic",
-        "max_output_tokens": 16000,
+        "max_output_tokens": 4000,  # Optimized for speed
         "default_params": {
             "temperature": 0.7,
             "top_p": 0.9
@@ -197,13 +197,29 @@ def __init__(self, model_name: str = "grok-3", config_profile: str = "DEFAULT",
         region = self.model_config.get("region", "us-chicago-1")
         self.endpoint = f"https://inference.generativeai.{region}.oci.oraclecloud.com"

-        # Initialize OCI client
+        # Initialize OCI client with better retry and timeout settings
         config = oci.config.from_file("~/.oci/config", config_profile)
+
+        # Create a custom retry strategy for chunk rewriting operations
+        retry_strategy = oci.retry.RetryStrategyBuilder(
+            max_attempts=3,
+            retry_max_wait_between_calls_seconds=10,
+            retry_base_sleep_time_seconds=2,
+            retry_exponential_growth_multiplier=2,
+            retry_eligible_service_errors=[429, 500, 502, 503, 504],
+            service_error_retry_config={
+                -1: []  # Retry on timeout errors
+            }
+        ).add_service_error_check(
+            service_error_retry_config={-1: []},
+            service_error_retry_on_any_5xx=True
+        ).get_retry_strategy()
+
         self.client = oci.generative_ai_inference.GenerativeAiInferenceClient(
             config=config,
             service_endpoint=self.endpoint,
-            retry_strategy=oci.retry.NoneRetryStrategy(),
-            timeout=(10, 240)
+            retry_strategy=retry_strategy,
+            timeout=(30, 120)  # Increased timeout: 30s connect, 120s read for chunk rewriting
         )

         print(f"✅ Initialized OCI handler for {model_name}")
@@ -359,7 +375,7 @@ def get_model_info(self) -> Dict[str, Any]:
 class RAGSystem:
     def __init__(self, vector_store: EnhancedVectorStore = None, model_name: str = None,
                  use_cot: bool = False, skip_analysis: bool = False,
-                 quantization: str = None, use_oracle_db: bool = True, collection: str = "Multi-Collection",
+                 quantization: str = None, use_oracle_db: bool = True, collection: str = "multi",
                  embedding_model: str = "cohere-embed-multilingual-v3.0"):
         """Initialize local RAG agent with vector store and local LLM

@@ -484,9 +500,54 @@ def __init__(self, vector_store: EnhancedVectorStore = None, model_name: str = N
             tokenizer=self.tokenizer
         )
         logger.info(f"Agents initialized: {list(self.agents.keys())}")
+        # --- known tag cache loaded from vector store - helps identify entities in the query ---
+        self.known_tags: set[str] = set()
+        try:
+            self.refresh_known_tags()
+        except Exception as e:
+            logger.warning(f"[RAG] Could not load known tags on init: {e}")

+    def _vector_store_all_ids(self) -> list[str]:
+        """
+        Return ALL canonical document/entity IDs (tags) from the vector store.
+        Tries a few common method names to avoid tight coupling.
+        """
+        vs = self.vector_store
+        # Try common APIs
+        for attr in ("list_ids", "get_all_ids", "get_all_document_ids", "all_ids"):
+            if hasattr(vs, attr) and callable(getattr(vs, attr)):
+                try:
+                    ids = getattr(vs, attr)()
+                    return [str(x) for x in ids]
+                except Exception as e:
+                    logger.debug(f"[RAG] {self._safe_name(vs)}.{attr} failed: {e}")
+        # Fallback: try listing collections and aggregating
+        try:
+            if hasattr(vs, "list_collections"):
+                coll_names = vs.list_collections()
+                ids = []
+                for c in coll_names:
+                    try:
+                        ids.extend(vs.list_ids(collection=c))
+                    except Exception:
+                        pass
+                return [str(x) for x in ids]
+        except Exception as e:
+            logger.debug(f"[RAG] Could not enumerate collections: {e}")
+        return []

+    def refresh_known_tags(self) -> None:
+        """
+        Populate self.known_tags (lowercased) from the vector store.
+        Call this after any ingest/update that changes IDs.
+        """
+        ids = self._vector_store_all_ids()
+        self.known_tags = {s.lower() for s in ids if isinstance(s, str)}
+        logger.info(f"[RAG] known_tags loaded: {len(self.known_tags)}")

+    @staticmethod
+    def _safe_name(obj) -> str:
+        return getattr(obj, "__class__", type(obj)).__name__

     def _initialize_sub_agents(self, llm_model: str) -> bool:
         """
         Initializes agents for agentic workflows (planner, researcher, etc.)
@@ -521,22 +582,28 @@ def _initialize_sub_agents(self, llm_model: str) -> bool:
     def process_query_with_multi_collection_context(self, query: str,
                                                     multi_collection_context: List[Dict[str, Any]],
                                                     is_comparison_report: bool = False,
-                                                    collection_mode: str = "multi") -> Dict[str, Any]:
-        """Process a query with pre-retrieved multi-collection context"""
+                                                    collection_mode: str = "multi",
+                                                    provided_entities: Optional[List[str]] = None) -> Dict[str, Any]:
+        """Process a query with pre-retrieved multi-collection context and optional provided entities"""
         logger.info(f"Processing query with {len(multi_collection_context)} multi-collection chunks")
+        if provided_entities:
+            logger.info(f"Using provided entities: {provided_entities}")

         if self.use_cot:
-            return self._process_query_with_report_agent(query, multi_collection_context, is_comparison_report, collection_mode=collection_mode)
+            return self._process_query_with_report_agent(query, multi_collection_context, is_comparison_report,
+                                                         collection_mode=collection_mode, provided_entities=provided_entities)
         else:
             # For non-CoT mode, use the context directly
             return self._generate_response(query, multi_collection_context)

+
     def _process_query_with_report_agent(
         self,
         query: str,
         multi_collection_context: Optional[List[Dict[str, Any]]] = None,
         is_comparison_report: bool = False,
-        collection_mode: str = "multi"
+        collection_mode: str = "multi",
+        provided_entities: Optional[List[str]] = None
     ) -> Dict[str, Any]:
         """
         Report agent pipeline:
@@ -558,8 +625,10 @@ def _process_query_with_report_agent(

         # STEP 1: Plan the report
         logger.info("Planning report sections...")
+        if provided_entities:
+            logger.info(f"Using provided entities for planning: {provided_entities}")
         try:
-            result = planner.plan(query, is_comparison_report=is_comparison_report)
+            result = planner.plan(query, is_comparison_report=is_comparison_report, provided_entities=provided_entities)
             if not isinstance(result, tuple) or len(result) != 3:
                 raise ValueError(f"Planner returned unexpected format: {type(result)} → {result}")
             plan, entities, is_comparison = result
@@ -799,7 +868,7 @@ def main():
     parser = argparse.ArgumentParser(description="Query documents using local LLM")
     parser.add_argument("--query", required=True, help="Query to search for")
    parser.add_argument("--embed", default="oracle", choices=["oracle", "chromadb"], help="embed backend to use")
-    parser.add_argument("--model", default="qwen2", help="Model to use (default: qwen2)")
+    parser.add_argument("--model", default="grok-3", help="Model to use (default: grok-3)")
     parser.add_argument("--collection", help="Collection to search (PDF, Repository, General Knowledge)")
     parser.add_argument("--use-cot", action="store_true", help="Use Chain of Thought reasoning")
     parser.add_argument("--store-path", default="embed", help="Path to ChromaDB store")
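The known-tag cache is populated here but consumed elsewhere; the patch does not show the matching step itself. A hedged sketch of the kind of lookup it enables, using a hypothetical helper that is not part of the patch:

```python
import re

def match_known_tags(query: str, known_tags: set[str]) -> list[str]:
    """Hypothetical helper: return vector-store tags mentioned in the query."""
    q = query.lower()
    # Word-boundary containment; known_tags is already lowercased by refresh_known_tags()
    return sorted(tag for tag in known_tags if re.search(rf"\b{re.escape(tag)}\b", q))

# Example with tags as loaded by RAGSystem.refresh_known_tags()
tags = {"companya", "companyb"}
print(match_known_tags("Compare CompanyA and CompanyB emissions", tags))
# -> ['companya', 'companyb']
```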
diff --git a/ai/generative-ai-service/complex-document-rag/files/oci_embedding_handler.py b/ai/generative-ai-service/complex-document-rag/files/oci_embedding_handler.py
index 2d7a7a19f..82404bb64 100644
--- a/ai/generative-ai-service/complex-document-rag/files/oci_embedding_handler.py
+++ b/ai/generative-ai-service/complex-document-rag/files/oci_embedding_handler.py
@@ -88,6 +88,9 @@ def __init__(self,
             config_profile: OCI config profile to use
             compartment_id: OCI compartment ID
         """
+        # Load environment variables from .env file if not already loaded
+        load_dotenv()
+
         self.model_name = model_name

         # Validate model name
@@ -100,6 +103,10 @@
         # Set compartment ID - check both OCI_COMPARTMENT_ID and COMPARTMENT_ID for compatibility
         self.compartment_id = compartment_id or os.getenv("OCI_COMPARTMENT_ID") or os.getenv("COMPARTMENT_ID")

+        # Log if compartment ID is missing
+        if not self.compartment_id:
+            logger.error("❌ No compartment ID found. Please set COMPARTMENT_ID or OCI_COMPARTMENT_ID in .env file")
+
         # Set endpoint region based on model configuration (supports multiple OCI regions)
         endpoint_region = self.model_config.get("endpoint", "us-chicago-1")
         self.endpoint = f"https://inference.generativeai.{endpoint_region}.oci.oraclecloud.com"
diff --git a/ai/generative-ai-service/complex-document-rag/files/requirements.txt b/ai/generative-ai-service/complex-document-rag/files/requirements.txt
index c1f17bffb..9d81f7410 100644
--- a/ai/generative-ai-service/complex-document-rag/files/requirements.txt
+++ b/ai/generative-ai-service/complex-document-rag/files/requirements.txt
@@ -13,7 +13,7 @@ pdfplumber==0.11.4
 python-docx==1.1.2

 # NLP and Embeddings
-transformers==4.53.0
+transformers==4.44.2
 tokenizers==0.19.1
 tiktoken==0.7.0
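The handler now loads `.env` itself and logs when no compartment OCID can be resolved. The same fail-fast check in isolation, raising instead of logging so a standalone script stops immediately (variable names follow the patch):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # pick up .env, as the handler now does on init

compartment_id = os.getenv("OCI_COMPARTMENT_ID") or os.getenv("COMPARTMENT_ID")
if not compartment_id:
    raise RuntimeError(
        "No compartment ID found. Set COMPARTMENT_ID or OCI_COMPARTMENT_ID in .env"
    )
```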
diff --git a/ai/generative-ai-service/complex-document-rag/files/vector_store.py b/ai/generative-ai-service/complex-document-rag/files/vector_store.py
index 04eecc5b4..f5f4ce16f 100644
--- a/ai/generative-ai-service/complex-document-rag/files/vector_store.py
+++ b/ai/generative-ai-service/complex-document-rag/files/vector_store.py
@@ -4,7 +4,7 @@
 Extends the existing VectorStore to support OCI Cohere embeddings alongside ChromaDB defaults
 """
 from oci_embedding_handler import OCIEmbeddingHandler, EmbeddingModelManager
-import logging
+import logging, numbers
 from typing import List, Dict, Any, Optional, Union, Tuple
 from pathlib import Path
 import chromadb
@@ -27,143 +27,137 @@ def __init__(self, *args, **kwargs):
             "VectorStore is an abstract base class. Use EnhancedVectorStore instead."
         )
-
-
 class EnhancedVectorStore(VectorStore):
     """Enhanced vector store with multi-embedding model support (SAFER VERSION)"""
-    def __init__(self, persist_directory: str = "embeddings", embedding_model: str = "cohere-embed-multilingual-v3.0", embedder=None):
+    def __init__(self, persist_directory: str = "embeddings",
+                 embedding_model: str = "cohere-embed-multilingual-v3.0",
+                 embedder=None):
         self.embedding_manager = EmbeddingModelManager()
-        self.embedding_model_name = embedding_model  # string (name)
-        self.embedder = embedder  # object (has .embed_query/.embed_documents)
+        self.embedding_model_name = embedding_model
+        self.embedder = embedder
         self.embedding_dimensions = getattr(embedder, "model_config", {}).get("dimensions", None) if embedder else None
-
-        # If embedder is provided, use it; otherwise fall back to embedding manager
-        if embedder:
-            self.embedding_model = embedder
-        else:
-            self.embedding_model = self.embedding_manager.get_model(embedding_model)
+        # Resolve embedding handler
+        self.embedding_model = embedder or self.embedding_manager.get_model(embedding_model)
+
+        # Chroma client (ensure Settings import: from chromadb.config import Settings)
         self.client = chromadb.PersistentClient(
             path=persist_directory,
-            settings=Settings(allow_reset=True)
+            settings=Settings(allow_reset=True, anonymized_telemetry=False)
         )
-        # Always get dimensions from the embedding manager or embedder
-        embedding_dim = None
-        if embedder:
-            # Use the provided embedder's dimensions
-            info = embedder.get_model_info()
-            if info and "dimensions" in info:
-                embedding_dim = info["dimensions"]
-            else:
-                raise ValueError(
-                    f"Cannot determine embedding dimensions from provided embedder."
-                )
-        elif isinstance(self.embedding_model, str):
-            # Try to get from embedding_manager
-            embedding_info = self.embedding_manager.get_model_info(self.embedding_model_name)
-            if embedding_info and "dimensions" in embedding_info:
-                embedding_dim = embedding_info["dimensions"]
-            else:
-                raise ValueError(
-                    f"Unknown embedding dimension for model '{self.embedding_model_name}'."
-                    " Please update your EmbeddingModelManager to include this info."
-                )
-        else:
-            # Should have a get_model_info() method
-            info = self.embedding_model.get_model_info()
-            if info and "dimensions" in info:
-                embedding_dim = info["dimensions"]
+        # Resolve dimensions once
+        self._embedding_dim = self._resolve_dimensions()
+
+        # Internal maps/handles
+        self.collections: dict[str, Any] = {}
+        self.collection_map = self.collections  # alias
+
+        # Create/bind base collections (pdf/xlsx) for current model+dim
+        self._ensure_base_collections(self._embedding_dim)
+
+        logger.info(f"✅ Enhanced vector store initialized with {self.embedding_model_name} ({self._embedding_dim}D)")
+
+    # --- Utility: sanitize metadata before sending to Chroma ---
+    def _safe_metadata(self, metadata: dict) -> dict:
+        """Ensure Chroma-compatible metadata (convert everything non-str → str)."""
+        safe = {}
+        for k, v in (metadata or {}).items():
+            key = str(k)
+            if isinstance(v, str):
+                safe[key] = v
+            elif isinstance(v, numbers.Number):  # catches numpy.int64, Decimal, etc.
+                safe[key] = str(v)
+            elif v is None:
+                continue
             else:
-                raise ValueError(
-                    f"Cannot determine embedding dimensions for non-string embedding model {self.embedding_model}."
-                )
+                safe[key] = str(v)
+        return safe
+
+    def _as_int(self, x):
+        try:
+            return int(x)
+        except Exception:
+            return None

+    def _resolve_dimensions(self) -> int:
+        if self.embedder:
+            info = self.embedder.get_model_info()
+            if info and "dimensions" in info:
+                return int(info["dimensions"])
+            raise ValueError("Cannot determine embedding dimensions from provided embedder.")
+        if isinstance(self.embedding_model, str):
+            info = self.embedding_manager.get_model_info(self.embedding_model_name)
+            if info and "dimensions" in info:
+                return int(info["dimensions"])
+            raise ValueError(f"Unknown embedding dimension for model '{self.embedding_model_name}'.")
+        # non-string handler
+        info = self.embedding_model.get_model_info()
+        if info and "dimensions" in info:
+            return int(info["dimensions"])
+        raise ValueError("Cannot determine embedding dimensions for non-string embedding model.")
+
+    def _ensure_base_collections(self, embedding_dim: int):
+        base_collection_names = ["pdf_documents", "xlsx_documents"]
         metadata = {
             "hnsw:space": "cosine",
-            "embedding_model": self.embedding_model_name,
-            "embedding_dimensions": embedding_dim
+            "embedding_model": self.embedding_model_name,
+            "embedding_dimensions": embedding_dim  # keep int in memory
         }
-        base_collection_names = [
-            "pdf_documents", "xlsx_documents"
-        ]
-
-        self.collections = {}
-
         for base_name in base_collection_names:
             full_name = f"{base_name}_{self.embedding_model_name}_{embedding_dim}"
             try:
-                # Check for exact match first
-                existing_collections = self.client.list_collections()
-                by_name = {c.name: c for c in existing_collections}
-                if full_name in by_name:
-                    coll = by_name[full_name]
-                    actual_dim = coll.metadata.get("embedding_dimensions", None)
-                    if actual_dim != embedding_dim:
-                        # This should never happen unless DB is corrupt
-                        logger.error(
-                            f"❌ Dimension mismatch for collection '{full_name}'. Expected {embedding_dim}, found {actual_dim}."
-                        )
-                        raise ValueError(
-                            f"Collection '{full_name}' has dim {actual_dim}, but expected {embedding_dim}."
-                        )
-                    collection = coll
-                    logger.info(f"🎯 Using existing collection '{full_name}' ({embedding_dim}D, {coll.count()} chunks)")
-                else:
-                    # Safe: only ever create the *fully qualified* name
-                    collection = self.client.get_or_create_collection(
-                        name=full_name,
-                        metadata=metadata
-                    )
-                    logger.info(f"🗂️ Created new collection '{full_name}' with dimension {embedding_dim}")
+                # Prefer fast path: get_or_create with safe metadata
+                coll = self.client.get_or_create_collection(
+                    name=full_name,
+                    metadata=self._safe_metadata(metadata)  # ← sanitize only here
+                )

-                self.collections[full_name] = collection
+                # Defensive dim check (cast back to int if Chroma stored as str)
+                actual_dim = self._as_int((coll.metadata or {}).get("embedding_dimensions"))
+                if actual_dim and actual_dim != embedding_dim:
+                    logger.error(f"❌ Dimension mismatch for '{full_name}'. Expected {embedding_dim}, found {actual_dim}.")
+                    raise ValueError(f"Collection '{full_name}' has dim {actual_dim}, expected {embedding_dim}.")

-                # For direct access: always the selected model/dim
+                self.collections[full_name] = coll
                 if base_name == "pdf_documents":
-                    self.pdf_collection = collection
-                elif base_name == "xlsx_documents":
-                    self.xlsx_collection = collection
+                    self.pdf_collection = coll
+                    self.current_pdf_collection_name = full_name
+                else:
+                    self.xlsx_collection = coll
+                    self.current_xlsx_collection_name = full_name

+                logger.info(f"🗂️ Ready collection '{full_name}' ({embedding_dim}D, {coll.count()} chunks)")
             except Exception as e:
                 logger.error(f"❌ Failed to create or get collection '{full_name}': {e}")
                 raise

-        # Only include full names in the map; never ambiguous short names
-        self.collection_map = self.collections
-
-        logger.info(f"✅ Enhanced vector store initialized with {embedding_model} ({embedding_dim}D)")
-
     def get_collection_key(self, base_name: str) -> str:
-        # Build the correct key for a base collection name
-        embedding_dim = (
-            self.get_embedding_info()["dimensions"]
-            if hasattr(self, "get_embedding_info")
-            else 1024
-        )
-        return f"{base_name}_{self.embedding_model_name}_{embedding_dim}"
+        return f"{base_name}_{self.embedding_model_name}_{self._embedding_dim}"

     def _find_collection_variants(self, base_name: str):
         """
-        Yield (name, collection) for all collections in the DB that start with base_name + "_",
-        across ANY embedding model/dimension (not just the ones cached at init).
+        Yield (name, collection) for all collections that start with base_name+"_".
+        Never create here—only fetch existing collections.
         """
         for c in self.client.list_collections():
             try:
-                name = c.name
-            except Exception:
-                # Some clients return plain dicts
                 name = getattr(c, "name", None) or (c.get("name") if isinstance(c, dict) else None)
+            except Exception:
+                name = None
             if not name:
                 continue
-            if name.startswith(base_name + "_"):
-                # get_or_create is fine; if it exists it just returns it
-                yield name, self.client.get_or_create_collection(name=name)
+            if not name.startswith(base_name + "_"):
+                continue
+            try:
+                coll = self.client.get_collection(name=name)  # ← get (NOT get_or_create)
+                yield name, coll
+            except Exception as e:
+                logger.warning(f"Skip collection {name}: {e}")
+

     def list_documents(self, collection_name: str) -> List[Dict[str, Any]]:
         """
@@ -524,145 +518,249 @@ def _add_cite(self, meta: Union[Dict[str, Any], "Metadata"]) -> Dict[str, Any]:

         return meta

+    def delete_chunks(self, collection_name: str, chunk_ids: List[str]):
+        """Delete specific chunks from a collection by their IDs
+
+        Args:
+            collection_name: Name of the collection (e.g., 'xlsx_documents', 'pdf_documents')
+            chunk_ids: List of chunk IDs to delete
+        """
+        if not chunk_ids:
+            return
+
+        try:
+            # Get the appropriate collection
+            if collection_name == "xlsx_documents":
+                collection = self.xlsx_collection
+            elif collection_name == "pdf_documents":
+                collection = self.pdf_collection
+            else:
+                # Try to get from collection map
+                collection = self.collection_map.get(collection_name)
+                if not collection:
+                    # Try with current model/dimension suffix
+                    full_name = self.get_collection_key(collection_name)
+                    collection = self.collection_map.get(full_name)
+
+            if not collection:
+                logger.error(f"Collection {collection_name} not found")
+                return
+
+            # Delete the chunks
+            collection.delete(ids=chunk_ids)
+            logger.info(f"✅ Deleted {len(chunk_ids)} chunks from {collection_name}")
+
+        except Exception as e:
+            logger.error(f"❌ Failed to delete chunks: {e}")
+            raise
+
     def add_xlsx_chunks(self, chunks: List[Dict[str, Any]], document_id: str):
         """Add XLSX chunks to the vector store with proper embedding handling"""
         if not chunks:
             return
-
+
         # Extract texts and metadata
-        texts = [chunk["content"] for chunk in chunks]
-        metadatas = [chunk["metadata"] for chunk in chunks]
-        ids = [chunk["id"] for chunk in chunks]
-
-        # Check collection metadata to see what dimensions are expected
+        texts = [c["content"] for c in chunks]
+        metadatas = [self._add_cite(c.get("metadata", {})) for c in chunks]  # add cite & normalize
+        ids = [c["id"] for c in chunks]
+
+        # Normalize expected dimensions/model from collection metadata
         collection_metadata = self.xlsx_collection.metadata or {}
-        expected_dimensions = collection_metadata.get('embedding_dimensions')
-        expected_model = collection_metadata.get('embedding_model')
-
-        # Handle embeddings based on model type
+        expected_dimensions = self._as_int(collection_metadata.get("embedding_dimensions"))
+        expected_model = collection_metadata.get("embedding_model")
+
+        # Path A: chroma-default (Chroma embeds on add)
         if isinstance(self.embedding_model, str):
-            # ChromaDB default - let ChromaDB handle embeddings
+            # If the collection expects non-384, error early (current policy)
             if expected_dimensions and expected_dimensions != 384:
                 logger.error(f"❌ Collection expects {expected_dimensions}D but using ChromaDB default (384D)")
-                raise ValueError(f"Dimension mismatch: collection expects {expected_dimensions}D, ChromaDB default is 384D")
-
-            self.xlsx_collection.add(
-                documents=texts,
-                metadatas=metadatas,
-                ids=ids
-            )
-        else:
-            # Use OCI embeddings
+                raise ValueError(
+                    f"Dimension mismatch: collection expects {expected_dimensions}D, ChromaDB default is 384D"
+                )
+
+            # Optional: warn if the collection was created without an embedding function bound (older Chroma)
             try:
-                embeddings = self.embedding_model.embed_documents(texts)
-                actual_dimensions = len(embeddings[0]) if embeddings and embeddings[0] else 0
-
-                if expected_dimensions and actual_dimensions != expected_dimensions:
-                    # Try to find or create the correct collection
-                    correct_collection_name = f"xlsx_documents_{self.embedding_model_name}_{actual_dimensions}"
-                    logger.warning(f"⚠️ Dimension mismatch: collection '{self.xlsx_collection.name}' expects {expected_dimensions}D, embedder produces {actual_dimensions}D")
-                    logger.info(f"🔍 Looking for correct collection: {correct_collection_name}")
-
-                    try:
-                        # Try to get the correct collection
-                        correct_collection = self.client.get_collection(correct_collection_name)
-                        logger.info(f"✅ Found correct collection: {correct_collection_name}")
-                    except:
-                        # Create new collection with correct dimensions
-                        metadata = {
-                            "hnsw:space": "cosine",
-                            "embedding_model": self.embedding_model_name,
-                            "embedding_dimensions": actual_dimensions
-                        }
-                        correct_collection = self.client.create_collection(
-                            name=correct_collection_name,
-                            metadata=metadata
-                        )
-                        logger.info(f"✅ Created new collection: {correct_collection_name}")
-
-                    # Add to the correct collection
-                    correct_collection.add(
-                        documents=texts,
-                        metadatas=metadatas,
-                        ids=ids,
-                        embeddings=embeddings
-                    )
-
-                    # Update the reference for future use
-                    self.xlsx_collection = correct_collection
-                    self.collections[correct_collection_name] = correct_collection
-
-                    logger.info(f"✅ Added {len(chunks)} XLSX chunks to {correct_collection_name}")
-                else:
-                    # Dimensions match, proceed normally
-                    self.xlsx_collection.add(
-                        documents=texts,
-                        metadatas=metadatas,
-                        ids=ids,
-                        embeddings=embeddings
-                    )
-                    logger.info(f"✅ Added {len(chunks)} XLSX chunks to {self.embedding_model_name}")
-
+                self.xlsx_collection.add(documents=["probe"], metadatas=[{}], ids=["__probe__tmp__"])
+                self.xlsx_collection.delete(ids=["__probe__tmp__"])
             except Exception as e:
-                logger.error(f"❌ Failed to add chunks with OCI embeddings: {e}")
-                raise  # Don't silently fall back - this causes dimension mismatches
+                logger.warning(f"⚠️ Chroma default embedding may not be bound; add() failed probe: {e}")
+
+            # Add documents directly (Chroma will embed)
+            # Consider batching if many chunks
+            self.xlsx_collection.add(documents=texts, metadatas=metadatas, ids=ids)
+            logger.info(f"✅ Added {len(chunks)} XLSX chunks to {self.embedding_model_name} (chroma-default)")
+            return
+
+        # Path B: OCI (explicit embeddings provided by the handler)
+        try:
+            embeddings = self.embedding_model.embed_documents(texts)
+            if not embeddings or not embeddings[0] or not hasattr(embeddings[0], "__len__"):
+                raise RuntimeError("Embedder returned empty/invalid embeddings")
+
+            actual_dimensions = len(embeddings[0])
+
+            if expected_dimensions and actual_dimensions != expected_dimensions:
+                # Try to find or create the correct collection
+                correct_collection_name = f"xlsx_documents_{self.embedding_model_name}_{actual_dimensions}"
+                logger.warning(
+                    f"⚠️ Dimension mismatch: collection '{self.xlsx_collection.name}' "
+                    f"expects {expected_dimensions}D, embedder produces {actual_dimensions}D"
+                )
+                logger.info(f"🔍 Looking for correct collection: {correct_collection_name}")
+
+                try:
+                    correct_collection = self.client.get_collection(correct_collection_name)
+                    logger.info(f"✅ Found correct collection: {correct_collection_name}")
+                except Exception:
+                    # Create new collection with correct dimensions (sanitize metadata for Chroma)
+                    metadata = {
+                        "hnsw:space": "cosine",
+                        "embedding_model": self.embedding_model_name,
+                        "embedding_dimensions": actual_dimensions,  # keep as int internally
+                    }
+                    correct_collection = self.client.create_collection(
+                        name=correct_collection_name,
+                        metadata=self._safe_metadata(metadata)  # ← sanitize only here
+                    )
+                    logger.info(f"✅ Created new collection: {correct_collection_name}")
+
+                # Add to the correct collection (explicit vectors)
+                # Consider batching if many chunks
+                correct_collection.add(
+                    documents=texts,
+                    metadatas=metadatas,
+                    ids=ids,
+                    embeddings=embeddings
+                )
+
+                # Update the reference for future use
+                self.xlsx_collection = correct_collection
+                self.collections[correct_collection_name] = correct_collection
+
+                logger.info(f"✅ Added {len(chunks)} XLSX chunks to {correct_collection_name}")
+            else:
+                # Dimensions match, proceed normally
+                self.xlsx_collection.add(
+                    documents=texts,
+                    metadatas=metadatas,
+                    ids=ids,
+                    embeddings=embeddings
+                )
+                logger.info(f"✅ Added {len(chunks)} XLSX chunks to {self.embedding_model_name}")
+
+        except Exception as e:
+            logger.error(f"❌ Failed to add chunks with OCI embeddings: {e}")
+            raise  # Keep explicit; prevents silent dimension drift
+
     def add_pdf_chunks(self, chunks: List[Dict[str, Any]], document_id: str):
-        """Add PDF chunks to the vector store with proper embedding handling"""
+        """Add PDF chunks to the vector store with proper embedding handling."""
         if not chunks:
             return
-
-        # Extract texts and metadata
-        texts = [chunk["content"] for chunk in chunks]
-        metadatas = [chunk["metadata"] for chunk in chunks]
-        ids = [chunk["id"] for chunk in chunks]
-
-        # Check collection metadata to see what dimensions are expected
-        collection_metadata = self.pdf_collection.metadata or {}
-        expected_dimensions = collection_metadata.get('embedding_dimensions')
-        expected_model = collection_metadata.get('embedding_model')
-
-        # Handle embeddings based on model type and expected dimensions
+
+        # Extract texts and metadata; add cite + normalize metadata
+        texts = [c["content"] for c in chunks]
+        metadatas = [self._add_cite(c.get("metadata", {})) for c in chunks]
+        ids = [c["id"] for c in chunks]
+
+        # Collection expectations (cast back to int to avoid string/int mismatches)
+        coll_meta = self.pdf_collection.metadata or {}
+        expected_dimensions = self._as_int(coll_meta.get("embedding_dimensions"))
+        expected_model = coll_meta.get("embedding_model")
+
+        # A) chroma-default path (Chroma embeds on add)
         if isinstance(self.embedding_model, str):
-            # String identifier - check if it matches expected model
             if expected_model and self.embedding_model_name != expected_model:
-                logger.warning(f"⚠️ Model mismatch: collection expects '{expected_model}', got '{self.embedding_model_name}'")
-
-            if expected_dimensions == 384 or self.embedding_model_name == "chromadb-default":
-                # ChromaDB default - let ChromaDB handle embeddings
-                logger.info(f"📝 Using ChromaDB default embeddings ({expected_dimensions or 384}D)")
-                self.pdf_collection.add(
+                logger.warning(
+                    f"⚠️ Model mismatch: collection expects '{expected_model}', got '{self.embedding_model_name}'"
+                )
+
+            # Policy: chroma-default is 384D only
+            if expected_dimensions and expected_dimensions != 384:
+                raise ValueError(
+                    f"Dimension mismatch: collection expects {expected_dimensions}D, "
+                    f"but chroma-default produces 384D. Recreate the collection with chroma-default "
+                    f"or switch to the correct OCI embedder."
+                )
+
+            # Optional: probe add for older Chroma builds without an embedding_function bound
+            try:
+                self.pdf_collection.add(documents=["__probe__"], metadatas=[{}], ids=["__probe__"])
+                self.pdf_collection.delete(ids=["__probe__"])
+            except Exception as e:
+                logger.warning(f"⚠️ Chroma default embedder may not be bound; add() probe failed: {e}")
+
+            # Add (consider batching if very large)
+            self.pdf_collection.add(documents=texts, metadatas=metadatas, ids=ids)
+            logger.info(f"✅ Added {len(chunks)} PDF chunks via chroma-default (384D)")
+            return
+
+        # B) OCI path (explicit embeddings)
+        try:
+            embeddings = self.embedding_model.embed_documents(texts)
+            if not embeddings or not embeddings[0] or not hasattr(embeddings[0], "__len__"):
+                raise RuntimeError("Embedder returned empty/invalid embeddings")
+
+            actual_dimensions = len(embeddings[0])
+
+            # If the target collection's dim doesn't match, route/create the correct one
+            if expected_dimensions and actual_dimensions != expected_dimensions:
+                logger.warning(
+                    f"⚠️ Dimension mismatch: collection '{self.pdf_collection.name}' expects "
+                    f"{expected_dimensions}D, embedder produced {actual_dimensions}D"
+                )
+                correct_name = f"pdf_documents_{self.embedding_model_name}_{actual_dimensions}"
+                try:
+                    correct_collection = self.client.get_collection(correct_name)
+                    # Sanity check: if it already reports a different dim (shouldn't happen), bail
+                    probe_meta = correct_collection.metadata or {}
+                    probe_dim = self._as_int(probe_meta.get("embedding_dimensions"))
+                    if probe_dim and probe_dim != actual_dimensions:
+                        raise RuntimeError(
+                            f"Existing collection '{correct_name}' is {probe_dim}D, expected {actual_dimensions}D"
+                        )
+                    logger.info(f"✅ Found correct PDF collection: {correct_name}")
+                except Exception:
+                    # Create with sanitized metadata (only at the API boundary)
+                    md = {
+                        "hnsw:space": "cosine",
+                        "embedding_model": self.embedding_model_name,
+                        "embedding_dimensions": actual_dimensions,  # keep int internally
+                    }
+                    correct_collection = self.client.get_or_create_collection(
+                        name=correct_name,
+                        metadata=self._safe_metadata(md)  # sanitize here
+                    )
+                    logger.info(f"🆕 Created PDF collection: {correct_name}")
+
+                # Add to the correct collection
+                correct_collection.add(
                     documents=texts,
                     metadatas=metadatas,
-                    ids=ids
+                    ids=ids,
+                    embeddings=embeddings
                 )
+
+                # Re-point handles
+                self.pdf_collection = correct_collection
+                self.collections[correct_name] = correct_collection
+                self.current_pdf_collection_name = correct_name
+
+                logger.info(f"✅ Added {len(chunks)} PDF chunks to {correct_name}")
             else:
-                # Expected OCI model but got string - this is a configuration error
-                logger.error(f"❌ Configuration error: Expected {expected_model} ({expected_dimensions}D) but OCI embedding handler failed to initialize")
-                logger.error(f"💡 Falling back to ChromaDB default, but this will cause dimension mismatch!")
-                raise ValueError(f"Cannot add {expected_dimensions}D embeddings using ChromaDB default (384D). Please fix OCI configuration or recreate collection with chromadb-default.")
-        else:
-            # Use OCI embeddings
-            try:
-                embeddings = self.embedding_model.embed_documents(texts)
-                actual_dimensions = len(embeddings[0]) if embeddings and embeddings[0] else 0
-
-                if expected_dimensions and actual_dimensions != expected_dimensions:
-                    logger.error(f"❌ Dimension mismatch: collection expects {expected_dimensions}D, embedder produces {actual_dimensions}D")
-                    raise ValueError(f"Dimension mismatch: collection expects {expected_dimensions}D, got {actual_dimensions}D")
-
-                logger.info(f"📝 Using OCI embeddings ({actual_dimensions}D)")
+                # Dimensions match; add directly
                 self.pdf_collection.add(
                     documents=texts,
                     metadatas=metadatas,
                     ids=ids,
                     embeddings=embeddings
                 )
-            except Exception as e:
-                logger.error(f"❌ Failed to add PDF chunks with OCI embeddings: {e}")
-                raise  # Don't fall back silently - this causes dimension mismatches
-
-        logger.info(f"✅ Added {len(chunks)} PDF chunks to {self.embedding_model_name}")
+                logger.info(f"✅ Added {len(chunks)} PDF chunks ({actual_dimensions}D)")
+
+        except Exception as e:
+            logger.error(f"❌ Failed to add PDF chunks with OCI embeddings: {e}")
+            raise  # keep explicit; prevents silent dimension drift
+
@@ -875,7 +973,7 @@ def query_pdf_collection(
                 }
                 self.pdf_collection = self.client.get_or_create_collection(
                     name=correct_collection_name,
-                    metadata=metadata
+                    metadata=self._safe_metadata(metadata)
                 )
                 logger.info(f"✅ Created new PDF collection: {correct_collection_name}")
                 actual_dim = handler_dim
@@ -923,75 +1021,6 @@ def query_pdf_collection(

         return []

-    def OLD_query_pdf_collection(self, query: str, n_results: int = 3, entity: Optional[str] = None, add_cite: bool = False) -> List[Dict[str, Any]]:
-        """Query PDF collection with embedding support and optional citation markup."""
-        try:
-            # Build filter
-            where_filter = {"entity": entity.lower()} if entity else None
-
-            # ✅ Minimal guard – blow up early if dims mismatch
-            if (self.pdf_collection.metadata or {}).get("embedding_dimensions") != (self.get_embedding_info() or {}).get("dimensions"):
-                raise ValueError(
-                    f"EMBEDDING_DIMENSION_MISMATCH: collection expects "
-                    f"{(self.pdf_collection.metadata or {}).get('embedding_dimensions')}D, "
-                    f"current handler has {(self.get_embedding_info() or {}).get('dimensions')}D"
-                )
-
-            # Query by embedding or text, depending on backend
-            if isinstance(self.embedding_model, str):
-                # ChromaDB default
-                results = self.pdf_collection.query(
-                    query_texts=[query],
-                    n_results=n_results,
-                    where=where_filter,
-                    include=["documents", "metadatas", "distances"]
-                )
-            else:
-                try:
-                    query_embedding = self.embedding_model.embed_query(query)
-                    results = self.pdf_collection.query(
-                        query_embeddings=[query_embedding],
-                        n_results=n_results,
-                        where=where_filter,
-                        include=["documents", "metadatas", "distances"]
-                    )
-                except Exception as e:
-                    logger.error(f"❌ OCI query embedding failed: {e}")
-                    # Fallback to text query
-                    results = self.pdf_collection.query(
-                        query_texts=[query],
-                        n_results=n_results,
-                        where=where_filter,
-                        include=["documents", "metadatas", "distances"]
-                    )
-
-            # Format results with optional citation
-            formatted_results = []
-            docs = results.get("documents", [[]])[0]
-            metas = results.get("metadatas", [[]])[0]
-            dists = results.get("distances", [[]])[0] if "distances" in results else [0.0] * len(docs)
-
-            for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists)):
-                out = {
-                    "content": doc,
-                    "metadata": meta if meta else {},
-                    "distance": dist
-                }
-                if add_cite and hasattr(self, "_add_cite"):
-                    meta_with_cite = self._add_cite(meta)
-                    out["metadata"] = meta_with_cite
-                    out["content"] = f"{doc} {meta_with_cite['cite']}"
-                formatted_results.append(out)
-
-            return formatted_results
-
-        except Exception as e:
-            logger.error(f"❌ Error querying PDF collection: {e}")
-            return []
-
-
-
     def inspect_xlsx_chunk_metadata(self, limit: int = 10):
         """
         Print stored metadata from the XLSX vector store for debugging.
@@ -1070,14 +1099,21 @@ def bind_collections_for_model(self, embedding_model: str) -> None:
             "embedding_model": self.embedding_model_name,
             "embedding_dimensions": embedding_dim
         }
-
+        logger.info(
+            "Create/get collections: PDF=%r, XLSX=%r | meta=%r (dim_field=%s:%s)",
+            pdf_name,
+            xlsx_name,
+            metadata,
+            "embedding_dimensions",
+            type(metadata.get("embedding_dimensions")).__name__,
+        )
         self.pdf_collection = self.client.get_or_create_collection(
             name=pdf_name,
-            metadata=metadata
+            metadata=self._safe_metadata(metadata)
         )
         self.xlsx_collection = self.client.get_or_create_collection(
             name=xlsx_name,
-            metadata=metadata
+            metadata=self._safe_metadata(metadata)
         )

         # Cache for debugging
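Two conventions in this refactor are easy to misread in diff form: collection names carry a model-plus-dimension suffix, and metadata keeps native ints in memory while `_safe_metadata` stringifies values only at the Chroma boundary. A small illustration with example values:

```python
# Name layout: <base>_<embedding_model_name>_<dimensions>
base, model, dims = "pdf_documents", "cohere-embed-multilingual-v3.0", 1024
full_name = f"{base}_{model}_{dims}"
# -> "pdf_documents_cohere-embed-multilingual-v3.0_1024"

# _safe_metadata keeps ints in memory but ships strings to Chroma:
meta = {"hnsw:space": "cosine", "embedding_model": model, "embedding_dimensions": dims}
safe = {k: v if isinstance(v, str) else str(v) for k, v in meta.items() if v is not None}
# -> {'hnsw:space': 'cosine', 'embedding_model': '...', 'embedding_dimensions': '1024'}
```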
Query by embedding or text, depending on backend - if isinstance(self.embedding_model, str): - # ChromaDB default - results = self.pdf_collection.query( - query_texts=[query], - n_results=n_results, - where=where_filter, - include=["documents", "metadatas", "distances"] - ) - else: - try: - query_embedding = self.embedding_model.embed_query(query) - results = self.pdf_collection.query( - query_embeddings=[query_embedding], - n_results=n_results, - where=where_filter, - include=["documents", "metadatas", "distances"] - ) - except Exception as e: - logger.error(f"❌ OCI query embedding failed: {e}") - # Fallback to text query - results = self.pdf_collection.query( - query_texts=[query], - n_results=n_results, - where=where_filter, - include=["documents", "metadatas", "distances"] - ) - - # Format results with optional citation - formatted_results = [] - docs = results.get("documents", [[]])[0] - metas = results.get("metadatas", [[]])[0] - dists = results.get("distances", [[]])[0] if "distances" in results else [0.0] * len(docs) - - for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists)): - out = { - "content": doc, - "metadata": meta if meta else {}, - "distance": dist - } - if add_cite and hasattr(self, "_add_cite"): - meta_with_cite = self._add_cite(meta) - out["metadata"] = meta_with_cite - out["content"] = f"{doc} {meta_with_cite['cite']}" - formatted_results.append(out) - - return formatted_results - - except Exception as e: - logger.error(f"❌ Error querying PDF collection: {e}") - return [] - - - - def inspect_xlsx_chunk_metadata(self, limit: int = 10): """ Print stored metadata from the XLSX vector store for debugging. @@ -1070,14 +1099,21 @@ def bind_collections_for_model(self, embedding_model: str) -> None: "embedding_model": self.embedding_model_name, "embedding_dimensions": embedding_dim } - + logger.info( + "Create/get collections: PDF=%r, XLSX=%r | meta=%r (dim_field=%s:%s)", + pdf_name, + xlsx_name, + metadata, + "embedding_dimensions", + type(metadata.get("embedding_dimensions")).__name__, + ) self.pdf_collection = self.client.get_or_create_collection( name=pdf_name, - metadata=metadata + metadata=self._safe_metadata(metadata) ) self.xlsx_collection = self.client.get_or_create_collection( name=xlsx_name, - metadata=metadata + metadata=self._safe_metadata(metadata) ) # Cache for debugging From 0f2b1734342efe82daea8bb5393f84d4c352a691 Mon Sep 17 00:00:00 2001 From: Brona Nilsson Date: Mon, 3 Nov 2025 13:55:14 +0100 Subject: [PATCH 3/5] Adopt upstream README as base and integrate local updates --- .../complex-document-rag/README.md | 379 +++++++++--------- 1 file changed, 200 insertions(+), 179 deletions(-) diff --git a/ai/generative-ai-service/complex-document-rag/README.md b/ai/generative-ai-service/complex-document-rag/README.md index 71659174a..dc0a5c3d7 100644 --- a/ai/generative-ai-service/complex-document-rag/README.md +++ b/ai/generative-ai-service/complex-document-rag/README.md @@ -1,240 +1,261 @@ -# Enterprise RAG Report Generator with Oracle OCI Gen AI +# RAG Report Generator -A sophisticated Retrieval-Augmented Generation (RAG) system built with Oracle OCI Generative AI, designed for enterprise document analysis and automated report generation. This application processes complex documents (PDFs, Excel files) and generates comprehensive analytical reports using multi-agent workflows. 
+An enterprise-grade Retrieval-Augmented Generation (RAG) system for generating comprehensive business reports from multiple document sources using Oracle Cloud Infrastructure (OCI) Generative AI services. -**Reviewed: 19.09.2025** +Reviewed date: 22.09.2025 -## Features +## Features -### Document Processing -- **Multi-format Support**: Process PDF documents and Excel spreadsheets (.xlsx, .xls) -- **Entity-aware Ingestion**: Automatically detect and tag entities within documents -- **Smart Chunking**: Intelligent document segmentation with context preservation -- **Multi-language Support**: Powered by Cohere's multilingual embedding models - -### Advanced RAG Capabilities -- **Multi-Collection Search**: Query across different document collections simultaneously -- **Hybrid Search**: Combine vector similarity and keyword matching for optimal results -- **Entity Filtering**: Filter search results by specific organizations or entities -- **Dimension-aware Storage**: Automatic handling of different embedding model dimensions - -### Intelligent Report Generation -- **Multi-Agent Architecture**: Specialized agents for planning, research, and writing -- **Comparison Reports**: Generate side-by-side comparisons of multiple entities -- **Structured Output**: Automated section generation with tables and charts -- **Chain-of-Thought Reasoning**: Advanced reasoning capabilities for complex queries - -### Model Flexibility -- **Multiple LLM Support**: - - Grok-3 and Grok-4 - - Llama 3.3 - - Cohere Command - - Dedicated AI Clusters (DAC) -- **Embedding Model Options**: - - Cohere Multilingual (1024D) - - ChromaDB Default (384D) - - Custom OCI embeddings - -### User Interface -- **Gradio Web Interface**: Clean, intuitive UI for document processing and querying -- **Vector Store Viewer**: Explore and manage your document collections -- **Real-time Progress Tracking**: Monitor processing and generation status -- **Report Downloads**: Export generated reports in Markdown format +- **Multi-Document Processing**: Ingest and process PDF and XLSX documents +- **Multiple Embedding Models**: Support for Cohere multilingual and v4.0 embeddings +- **Advanced LLM Support**: Integration with OCI models (Grok-3, Grok-4, Llama 3.3, Cohere Command) +- **Agentic Workflows**: Multi-agent system for intelligent report generation +- **Hierarchical Report Structure**: Automatically organizes content based on user queries +- **Citation Tracking**: Source attribution with references +- **Multi-Language Support**: Generate reports in English, Arabic, Spanish, and French +- **Visual Analytics**: Automatic chart and table generation from data ## Prerequisites -### Oracle OCI Configuration -- Set up your Oracle Cloud Infrastructure (OCI) account -- Obtain the following: - - Compartment OCID - - Generative AI Service Endpoint - - Model IDs for your chosen LLMs - - API keys and authentication credentials -- Configure your `~/.oci/config` file with your profile details - -### Python Environment -- Python 3.8 or later -- Virtual environment recommended -- Sufficient disk space for vector storage +- Python 3.11+ +- OCI Account with Generative AI service access +- OCI CLI configured with appropriate credentials -## Installation +## Installation -1. **Clone the repository:** +1. Clone the repository: +```bash +git clone +cd agentic_rag +``` -2. **Create and activate a virtual environment:** +2. Create a virtual environment: ```bash python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate ``` -3. 
+3. Install dependencies:
 ```bash
-pip install -r files/requirements.txt
+pip install -r requirements.txt
 ```

-4. **Configure environment variables:**
+4. Configure OCI credentials:
 ```bash
-cp files/.env.example files/.env
-# Edit .env with your OCI credentials and model IDs
+# Create OCI config directory if it doesn't exist
+mkdir -p ~/.oci
+
+# Add your OCI configuration to ~/.oci/config
+# See: https://docs.oracle.com/en-us/iaas/Content/API/Concepts/sdkconfig.htm
 ```

-5. **Set up OCI configuration:**
-Ensure your `~/.oci/config` file contains:
-```ini
-[DEFAULT]
-user=ocid1.user.oc1..xxxxx
-fingerprint=xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx
-tenancy=ocid1.tenancy.oc1..xxxxx
-region=us-chicago-1
-key_file=~/.oci/oci_api_key.pem
+5. Set up environment variables:
+```bash
+# Create .env file with your configuration
+cat > .env << EOF
+# OCI Configuration
+OCI_COMPARTMENT_ID=your-compartment-id
+COMPARTMENT_ID_DAC=your-dac-compartment-id  # If using dedicated cluster
+
+# Model IDs (get from OCI Console)
+OCI_GROK_3_MODEL_ID=your-grok3-model-id
+OCI_GROK_4_MODEL_ID=your-grok4-model-id
+OCI_LLAMA_3_3_MODEL_ID=your-llama-model-id
+OCI_COHERE_COMMAND_A_MODEL_ID=your-cohere-model-id
+
+# Default Models (optional)
+DEFAULT_EMBEDDING_MODEL=cohere-embed-multilingual-v3.0
+DEFAULT_LLM_MODEL=grok-3
+EOF
 ```

-## Usage
+## Quick Start

-### Starting the Application
+1. Launch the Gradio interface:
 ```bash
-cd files
 python gradio_app.py
 ```

-The application will launch at `http://localhost:7863`
-
-### Document Processing Workflow
-
-1. **Upload Documents**
-   - Navigate to the "DOCUMENT PROCESSING" tab
-   - Select your embedding model
-   - Upload PDF or Excel files
-   - Specify the entity/organization name
-   - Click "Process" to ingest documents
-
-2. **Query Your Documents**
-   - Go to the "INFERENCE & QUERY" tab
-   - Enter your query or question
-   - Select data sources (PDF/XLSX collections)
-   - Choose between standard or agentic workflow
-   - Click "Run Query" to generate response
-
-3. **Generate Reports**
-   - Enable "Use Agentic Workflow" for comprehensive reports
-   - Specify entities for comparison reports
-   - Download generated reports in Markdown format
-
-### Advanced Features
-
-**Vector Store Management:**
-- View collection statistics
-- Search across collections
-- List and manage document chunks
-- Delete collections when needed
-
-**Multi-Entity Comparison:**
-```python
-# Example: Compare ESG metrics between two companies
-Query: "Compare sustainability initiatives"
-Entity 1: "CompanyA"
-Entity 2: "CompanyB"
-```
-## 📁 File Structure
+2. Open your browser to `http://localhost:7863`
+
+3. Follow these steps in the interface:
+   - **Document Processing Tab**: Upload and process your documents (PDF/XLSX) - see samples in sample_data folder
+   - **Vector Store Viewer Tab**: View and manage your document collections
+   - **Inference & Query Tab**: Enter queries and generate reports - see sample queries in sample_queries folder
+
+## Usage Guide
+
+### Document Processing
+
+1. Select an embedding model (e.g., cohere-embed-multilingual-v3.0)
+2. Upload documents:
+   - **XLSX**: Financial data, ESG metrics, structured data
+   - **PDF**: Reports, policies, unstructured documents
+3. Specify the entity name for each document, i.e. the bank or institution's name
+4. Click "Process" to ingest into the vector store
+
+### Generating Reports
+
+1. In the **Inference & Query** tab:
+   - Enter your query (can be structured with numbered sections)
+   - Select LLM model (Grok-3 recommended for reports)
+   - Choose data sources (PDF/XLSX collections)
+   - Enable "Agentic Workflow" for comprehensive multi-agent reports
+   - Click "Run Query"
+
+2. Example structured query:
 ```
-.
-├── files/
-│   ├── gradio_app.py              # Main application interface
-│   ├── local_rag_agent.py         # RAG system core logic
-│   ├── vector_store.py            # Vector storage management
-│   ├── oci_embedding_handler.py   # OCI embedding integration
-│   ├── disable_telemetry.py       # Telemetry management
-│   ├── agents/                    # Multi-agent components
-│   │   ├── agent_factory.py       # Agent initialization
-│   │   └── report_writer_agent.py # Report generation
-│   ├── handlers/                  # Document processors
-│   │   ├── pdf_handler.py         # PDF processing
-│   │   ├── xlsx_handler.py        # Excel processing
-│   │   └── query_handler.py       # Query processing
-│   └── requirements.txt           # Python dependencies
-├── README.md                      # Project documentation
-└── LICENSE                        # License information
+Prepare a comprehensive ESG comparison report between Company A and Company B:
+
+1) Climate Impact & Emissions
+   - Net-zero commitments and targets
+   - Scope 1, 2, and 3 emissions
+
+2) Social & Governance
+   - Diversity targets
+   - Board oversight
+
+3) Financial Performance
+   - Revenue and profitability
+   - ESG investments
 ```

-## Screenshots
+### Report Features

-### Main Interface
-[Screenshot: Document Processing Tab]
+Generated reports include:
+- Executive summary addressing your specific query
+- Hierarchically organized sections
+- Data tables and visualizations
+- Source citations [1], [2] for traceability
+- References section with full source details
+- Professional formatting (Times New Roman, black headings)

-### Query Interface
-[Screenshot: Inference & Query Tab]
+## Project Structure

-### Vector Store Viewer
-[Screenshot: Collection Management]
+```
+agentic_rag/
+├── gradio_app.py              # Main application interface
+├── local_rag_agent.py         # Core RAG system logic
+├── vector_store.py            # Vector database management
+├── oci_embedding_handler.py   # OCI embedding services
+├── agents/
+│   ├── agent_factory.py       # Agent creation and management
+│   └── report_writer_agent.py # Report generation logic
+├── handlers/
+│   ├── query_handler.py       # Query processing
+│   ├── pdf_handler.py         # PDF document processing
+│   ├── xlsx_handler.py        # Excel document processing
+│   └── vector_handler.py      # Vector store operations
+├── ingest_pdf.py              # PDF ingestion pipeline
+├── ingest_xlsx.py             # Excel ingestion pipeline
+├── sample_data/               # Sample documents for testing
+├── sample_queries/            # Example queries for reports
+└── utils/
+    └── demo_logger.py         # Logging utilities
+```

-### Generated Report Example
-[Screenshot: Sample Report Output]
+## Advanced Configuration

-## 🔧 Configuration
+### Embedding Models

-### Model Selection
-Configure available models in your `.env` file:
-```env
-# LLM Models
-OCI_GROK_3_MODEL_ID=ocid1.generativeaimodel.oc1...
-OCI_GROK_4_MODEL_ID=ocid1.generativeaimodel.oc1...
-OCI_LLAMA_3_3_MODEL_ID=ocid1.generativeaimodel.oc1...
+Available embedding models:
+- `cohere-embed-multilingual-v3.0` (1024 dimensions)
+- `cohere-embed-v4.0` (1024 dimensions)
+- `chromadb-default` (384 dimensions, local)

-# Embedding Models
-DEFAULT_EMBEDDING_MODEL=cohere-embed-multilingual-v3.0
+### LLM Models

-# Compartment Configuration
-OCI_COMPARTMENT_ID=ocid1.compartment.oc1...
-```
+Supported OCI Generative AI models:
+- **Grok-3**: Best for comprehensive reports (16K output tokens)
+- **Grok-4**: Advanced reasoning (120K output tokens)
+- **Llama 3.3**: Fast inference (4K output tokens)
+- **Cohere Command**: Instruction following (4K output tokens)
+
+### Vector Store Management

-### Performance Tuning
-- Adjust chunk sizes in `ingest_pdf.py` and `ingest_xlsx.py`
-- Configure parallel processing in `report_writer_agent.py`
-- Modify token limits in model configurations
+- Collections are automatically created per embedding model
+- Switch between models without data loss
+- Delete collections via the Vector Store Viewer tab

-## 🐛 Troubleshooting
+## Troubleshooting

 ### Common Issues

-**Vector Store Dimension Mismatch:**
-- Ensure consistent embedding model usage
-- Clear existing collections when switching models
-- Check collection metadata for dimension conflicts
+1. **OCI Authentication Error**
+   - Verify ~/.oci/config is properly configured
+   - Check compartment ID in .env file
+   - Ensure your user has appropriate IAM policies

-**OCI Authentication Errors:**
-- Verify `~/.oci/config` configuration
-- Check API key permissions
-- Ensure compartment access rights
+2. **Embedding Model Errors**
+   - Verify model IDs in .env file
+   - Check OCI service limits and quotas
+   - Ensure embedding service is enabled in your region

-**Memory Issues:**
-- Reduce chunk sizes for large documents
-- Consider using dedicated AI clusters for heavy workloads
+3. **Memory Issues**
+   - For large documents, process in smaller batches
+   - Adjust chunk size in ingestion settings
+   - Consider using pagination for large result sets

-## Contributing
+### Logs
+
+Check `logs/app.log` for detailed debugging information.

-This project welcomes contributions from the community. Before submitting a pull request, please:
+## API Usage (Optional)
+
+For programmatic access:
+
+```python
+from local_rag_agent import RAGSystem
+from vector_store import EnhancedVectorStore
+
+# Initialize system
+vector_store = EnhancedVectorStore(
+    persist_directory="embed-cohere-embed-multilingual-v3.0",
+    embedding_model="cohere-embed-multilingual-v3.0"
+)
+
+rag_system = RAGSystem(
+    vector_store=vector_store,
+    model_name="grok-3",
+    use_cot=True
+)
+
+# Process query
+response = rag_system.process_query("Your query here")
+print(response["answer"])
+```
+
+## Contributing

 1. Fork the repository
 2. Create a feature branch
-3. Commit your changes
-4. Push to the branch
-5. Open a pull request
+3. Make your changes
+4. Run tests: `python -m pytest tests/`
+5. Submit a pull request

-Please review our contribution guidelines for coding standards and best practices.
+## License

-## 🔒 Security
+[Your License Here]

-Please consult the security guide for our responsible security vulnerability disclosure process. Report security issues to the maintainers privately.
+## Support

-## 📄 License
+For issues and questions:
+- Check the logs in `logs/app.log`
+- Review the troubleshooting section
+- Open an issue on GitHub

-Copyright (c) 2024 Oracle and/or its affiliates.
+## Acknowledgments

-Licensed under the Universal Permissive License (UPL), Version 1.0.
+- Oracle Cloud Infrastructure for Generative AI services
+- Gradio for the web interface
+- ChromaDB for vector storage
+- The open-source community

-See LICENSE for more details.
+## License

+Copyright (c) 2025 Oracle and/or its affiliates.

-## ⚠️ Disclaimer
+Licensed under the Universal Permissive License (UPL), Version 1.0.

-ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.
+See [LICENSE](LICENSE.txt) for more details.
+ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.
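Note on the `vector_store.py` hunk in PATCH 2/5: the PDF query path embeds the query with the bound OCI handler when one is attached, and falls back to a plain-text query (ChromaDB's own embedding function) if that call fails. Below is a minimal standalone sketch of the same pattern, assuming ChromaDB's 0.4+ client API and an embedding handler that exposes `embed_query(text) -> list[float]`; the function and collection names here are illustrative, not the shipped implementation.

```python
# Sketch of the embed-or-fall-back query pattern from PATCH 2/5.
# embed_query() on the handler is an assumption; the ChromaDB calls
# (PersistentClient, get_or_create_collection, query) are the real API.
import logging

import chromadb

logger = logging.getLogger(__name__)


def query_pdf_collection(collection, embedding_model, query, n_results=5, where=None):
    """Query by embedding when a handler is bound, else by raw text."""
    include = ["documents", "metadatas", "distances"]
    if isinstance(embedding_model, str):
        # ChromaDB default: let the collection embed the query text itself.
        return collection.query(query_texts=[query], n_results=n_results,
                                where=where, include=include)
    try:
        embedding = embedding_model.embed_query(query)  # assumed handler method
        return collection.query(query_embeddings=[embedding], n_results=n_results,
                                where=where, include=include)
    except Exception as exc:
        logger.error("Query embedding failed, falling back to text query: %s", exc)
        return collection.query(query_texts=[query], n_results=n_results,
                                where=where, include=include)


if __name__ == "__main__":
    client = chromadb.PersistentClient(path="embeddings")
    collection = client.get_or_create_collection(name="pdf_documents")
    print(query_pdf_collection(collection, "chromadb-default", "net-zero commitments"))
```

The fallback keeps retrieval available when the embedding service errors out, at the cost of that one query being embedded by the collection's default model rather than the OCI one, so distances may not be comparable across the two paths.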
From 91ccb056cd42ebd090fafde930203b3adb0fced9 Mon Sep 17 00:00:00 2001
From: Brona Nilsson
Date: Mon, 3 Nov 2025 14:54:02 +0100
Subject: [PATCH 4/5] Add screenshot reference to README

---
 .../files/images/screenshot1.png              | Bin 0 -> 90635 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 ai/generative-ai-service/complex-document-rag/files/images/screenshot1.png

diff --git a/ai/generative-ai-service/complex-document-rag/files/images/screenshot1.png b/ai/generative-ai-service/complex-document-rag/files/images/screenshot1.png
new file mode 100644
index 0000000000000000000000000000000000000000..b93876012bdca63649e17bc5a957ace8fe043a7a
GIT binary patch
literal 90635

[90,635 bytes of base85-encoded PNG data omitted]
zLPj`wz7HO0j?OW2RyQ~=f4x12urE~QB$}%={m|QU{-w9ogjRy0z!1E87(d4}N z@h>rWO7|Aim+pA0jX(o5UkMLxI{S`Ph5A{G{^T1rUx!%xd)xi?oo<_u`lamj>7c0_ zOkSL_x63LT?^f~jjn&4SdL-^Mu9*?xbw0@k=BH@7S|Vmqw>ryyf{^NqHVvTBG5yp9 zVPwfw|`@Y{xrP0_OyAA`Pld4L(|zX zVq?{cOkv8|NRcV=!!P53yE~By5X%0u(=%NrYVYIIm0MsRvNQB*QInaObxn3~OhTBl zGt|hYCdL!;j3{ukRR|gVfjaT9=-GR)sC0|3*T+}~o@8)9L14d}k;=_^03qinLCL4_ zOT_}uU6xnvlQ8vA)zW@VSG*QZ>ALu2`EXykNul)MkEdn61Uz3D0v!L)jX%;0;`f~U zdVmj8Oasb$8mQuvxd>(ja&s+cy= z;SM9xaunEmZE~}#tpwQUS=e&R-qgx(IN0N<~w#C z7f9VEQ;ak_%ESIGI98zssVCXvNpFvo6)oUrTuom6v@wo>>&Hd(NCA7J8Pp`uZB5IiAH@*mQ+&y9l)5|l`Hy>oNyLr#j z!QmSy(B95v3xjMeuc~zTbM=eH&t-KE@hyp(>1NAMMuYm0 zxZD_HFzYRE=WUK?QXF{gy7<|R`0K5djFPeBPRD1XD%>-Z+D>`zsGmCPd(D3(DJh6a zSB-~#8`X|BlL<=Y8Xhjqx?6t5RI1H$w!<}9Xer6z%Vn~eA$&KtNERj3w#FoU(dUDL zF|lBTst@j*f>R0e5qJTs^>t5!ov_JU-3)?AeV^uJd`$y1N2ZuByc_QT?weEZduVTz zb<=-;t-xd6@Vn|tokw2ul~UP5C2mOH;bV|wu~zZ~kwN`i6bU(&IoEG5({LwMo}8+G zh4XQ6N4sGxyIh+l-dI?jJNXHhUlHc9pd-hxWN)Wfrv`LSkz8+KF}1aNq(cn^0^ugK znN%%s7kiYQmKsLrzO>Bl@pl}cHz>HaNdnROQ-^+9tZ2w#Uw(SYGianu?O8_@YOL3_ z=+W8Y)+nLd+sS@R271n>FS9N)fv@I`#=7nRKBCUj6f>GAjRLlh2!pAKW3~iVpB37R zandfk=H=y)U*@=ohE?FwC=WLxR2(mA^`UTORdG13T^edy`3EX=wqPcuWyeqR76E0R zbo&PD?J3B;<)DV?*ri-f$6*}yx!zyxnFzs!Cw6vTOSn)6sW7s=nBw=G&9fVq#kx!j zLp3u=gYn|25dZ?Zc&OLScL27A5QuduQuV6$!e@FQCep&f=qeq9&gPJqiqY8Q=jYh$ zm!7R5PDS(rR@=im_as(%-d#VVLx>M#Q5A7I;MgV63*Y8!5RH4{WvCzXD>V67d6JdK<>qIBP6CQo{53ur zXEyWoF9EFb-*Ul8B*z~z!F5i0CG>Cq{<$S&BIS0;S#|#ZC5QaaEBH=KyHtOQ=LCTN z`7|wd;KcvCh2(o)ex$8kibCyo1Tj>dc+fck>ikaQ68`Qp`a7DlzaBHF@Rc&??+D`$ z7F?ir0@2AYQk5hWF*t5TbJp)eU$30BsD|_0Hyvj{x?`*^s;8)nE_ONL7Vi90vSL4xpr#bPk%@UDMW;Gr znW=%9nLh#Hn%`>0;{;$$yGreSli-BXsx#sTyX%bYVUzz`3c1dMisry*+202p)`1YJgWg?BIf#pDop zaZ>uLAb^0sX>q?F0L^0JG5kMdmH#k?GpfJ|eyfWQ{-r4v0@{o1^~Rw8xcJXu91C#b z|7Ib4(s-^a3GA8F{Jc^cJ*B6YqpZqN*GTI{w}$?-j|O!fO_`YIC>Xj%UtDVC>^zdI z-N^fBWbAqPG|(zrjMUA>J3H?{_;tf)yFRH;he;VRdh;Vl;<#BEv?UhqKnXt|rNC`) zBBrWTv*ZpT2?)yJb8A@4{t#t7(*#Y2g-2hVZza46qcMz~9>gwD5+}jQGa~G#7 zg^~0Z9`)eTfF0t*@4f;w*v}z!^pCCcM~x|;bN?Di&xM`G0Q|SY7kA4$*!PD{pX0M# zp@o#FOyqFNx#A&B}sKBP4~C6;S$Gm|3DG2+Q%82T_N&fj{D!Qr3EzOR@~nZcVeu;4iVYvU(RnOD5ls`)!Dk$`0GDgCjv|@>U zl{Y^xcYav11DFPhf87F)_WcsN&Hv znVYz?5oLmbr8BWB4rx?RjmR{RE{cEG{ZbA+-ZT;oI5I03E*&wHl7g=JZ5FQC@*=6O zX_{4vPwb>^?n#(!1GC?@n^=jIS>=IZWlC*+%k~r8grCe|JaTYg(QU0c7Jbi`GJ!23Ter28|hK-q<|sqvzsZgJ*g z`CTqq`{c{;X?dO2c4+2yvGP%e&(U_0Pax4`QNig4lYH=(Xxw~R@iIHh12(S1(G>CmsrMB!;pqNU`nytf z8FInStMo}A>C&R6Zrc^A)yZa1{7^G-8D<{Tf5vgGqoO=?5V2RP32pvfj5H#XeSV@^ zVxU9g3b(7I9QHb*7?F>60;Oqn(mvIvS2iGyS1*pfNpoL>CK~A5aq+;rOh)|zZls-0 zgs*52wbj2dz?wsU+o{`D*dT*n{&C^4zt;(o(I~*6RrpgWq@qJ3Smb zd>U4SF;*$O2Uy%U$y?Z?U!!x};~*$Z|cZ@nwKNbXCb=?yQ5p6Y%nju zJBIC_-{EdelO}K-JwDjp-$>6dh^`)qDHbiYcheugKd;Lbuq+H6s81T2Aho)#Mg@~o zvkwrB!ArgKGao9X5&J78yD^tFRykzlr~oDTTBvP`%zkZ?s%IziU7Y}NbIS-Zmt3%P zf6%Suac-JB;Q$?qNyEW=tS~ET#a0TIpyK0g5Of(gVjD!yTzUpM?4ZVG;ev)Ag9G*! 
zb_Sc1llA8(^SI*-RTWF*6#_Pl6@EG-h*8>@5U`1{>JEMx&yLNPd73)-p%iOWy~zmq zaAt(;^qIhpNQ#$4NoMqj7sC)}teCGT!YJ z-_H*B516@DI-@HmA8^6viImcJqdy&wUB9nX6Yid?KG4G4S2P2sV4RpG>TtN(T49*6wpXVa4?1T?ovGPpgHbcvADKv5nT!c}c&H1|PPcXW3 zBu*(GyzYAFQ5&H=qagC2Y!iv|NGotjA$Ls`Ps!lNOt#j!mYYx&s#hEX^(aFaU=K*% za;C9>fsVa8Wnw-W=i>a5?!b&FbwM+AV0Z4NnRUPw=?Lb*)K#S`?=LE!gt1=Ap-O{;Za&+dR_`+xVLx7*Vp{v3>V1y6 zj!ko_R&wgtLo_gP@_3UW+kNvURh|*cyZy(>blja)%UdC{T~Xo)0&Kr>XRF)B{vLt` z>%J1(GT6v;r0WiC(^bRZ3p)zJMeC zK9g`=51=Z7+*A&MA660!Grm0Fvvn<%@Nchu*)E{f%ISpZ-Ir%Bp<^rgiMzKLUGMJu zbVPPWjj(&LgI!BKPE;+@tO}&00rL~5#M2w)1EJ~%BRhXv{1j$ z7US~cj>XS39k1{d^6+NTcgp+=SNb|~e`(@+b}WxA&FT*`n zH6j_s`Hs}Zw8EET9lW@R|BHJJ+n%n3wN0^)Nnt@r+=K3xI>e=;oqRsT7CXwnwBU{& zaeK5JwTh|3E4in0!#&OB@YT402vC!05T$iH-_sBOuF!$s{KF)dvTYs*M~C++TjHz( zq9U(synT6G zp}Q)A;MK&Rxo|xR_C*O%Y#81BYlOpvXZYQ>CKZl)wy0fK`f88p;1<)O zwNZ6BH|%eAP z`ch4S#DR^;!qUV2*^*sv^ZtV!KNZNz0c!qQ^8wlywdlKROOZ3PT&7WYt^105%SVcC z;f8jgJm}m|vYnnHp&@YX+V>ARTW+#GQiYyrDL)ujF%pOM95p!KMPT(dwXvTy-^LC? za!Aud>bU)v%kFeba5g&~slxq-yTi(6M~_X@J0vD=C8^i;AnXTp%9)(uB^ujC~cBIp=nIxo5ZvFCyxGmKq3a_;}73qd(C} zU}rrF#3Xh!O6aH~EIrJ$`V~9xk%l6_AFeWj2IO`Ez9&I*zrga~Mn7(Gw8y`I(q|Jh zAs z#)aUgZk{a@c+uir30uh!x)eF=|2TR`;y6j-lY<+Hn(YlE`T^VcJmd3~rv8pc;)K(* z?{C!oT!B9NDBc5=uhbNG${O=cIKJ_uFF6>~SOQ8sC>u_+KfDsWA#|Ll$hi=)Ft$K-ryJvlJGHk%X z`=3aGi>$%AQrD8TJLG&5(^Ut^jmMjNn(ZH>PJ%*IXo!RBx#0_!nP?q=bztUg&!#x> zuO>|D1-2xPJx@s+z@W8lM62jl71s-yCkbf1UMk%RX27v*`_)4f_(kw*L5t)~n%bJ{+9s}?Eg{r|Zc%P=T)qa(ppPDHl%pbdT3@=LOH{ce9eX>A4`Xb0f z3{@En1m)(3>YZM6X`f3_*uo}R{ovmmVo4WxaSC{@N)NzKkCIKbRr$IksiN&f**|W# zh*3shfpuY!;!OzJ3b%I=&O3H7z~>-@H!|*47Hl^Rx#tx*dh3_R;4EyrJElTtsb*3g zG2O5jc(-djF9Gc{bYyx(X0Pw6yxoxQ)O94nw74xH`Aa?~@zmS=9@X_eFF<}`#>lUPagR<}=D9YPVV@Ye46H8%J-po2ut($QBcE1?!u|7J7 z(BfS0!0kv0qKeHn>tciI(0|3kkLgF`Z_hr6_>*<1P^-;Y+Px;A_LqR!HZR8C?JX?<@76qyGuuYeg>_> z^O0*Il>#gf3I`-;#A3p>UDRYGr8m@5F}5ZTC9kj|umG`y@v`-mT-W~QlaV9oft_W2 zQ*rdJ0=we(YAWk@+ipW+fXX=rFHx$hOQKE2kW7OO1YKNYn~oG+dy(?F9_K4h(u?Ea zZaT~-U+hlNax&atc={vV@2*gD-}(EZ!UQ0NnZ8!TIW`^e{o}vJQPly=99}!l@{b&u zHa-?eSvN374`~0H{5}O#l3zCN4FCMUSrZUk_C`gn|7+mYZR%rYc~r_}{~J{qr+}Zf My6*krzih((3oZpqNB{r; literal 0 HcmV?d00001 From 6fd99aee7b36c1eb9f6c826aae19ee4caa88d6c2 Mon Sep 17 00:00:00 2001 From: Brona Nilsson Date: Mon, 3 Nov 2025 15:17:35 +0100 Subject: [PATCH 5/5] Add image to README --- ai/generative-ai-service/complex-document-rag/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ai/generative-ai-service/complex-document-rag/README.md b/ai/generative-ai-service/complex-document-rag/README.md index dc0a5c3d7..4120deea5 100644 --- a/ai/generative-ai-service/complex-document-rag/README.md +++ b/ai/generative-ai-service/complex-document-rag/README.md @@ -2,7 +2,7 @@ An enterprise-grade Retrieval-Augmented Generation (RAG) system for generating comprehensive business reports from multiple document sources using Oracle Cloud Infrastructure (OCI) Generative AI services. -Reviewed date: 22.09.2025 +Reviewed date: 03.11.2025 ## Features @@ -14,7 +14,7 @@ Reviewed date: 22.09.2025 - **Citation Tracking**: Source attribution with references - **Multi-Language Support**: Generate reports in English, Arabic, Spanish, and French - **Visual Analytics**: Automatic chart and table generation from data - +![Application screenshot](files/images/screenshot1.png) ## Prerequisites - Python 3.11+