# Uchambuzi wa Madai ya Gharama

Notebook hii inaonyesha jinsi ya kuunda mawakala wanaotumia programu-jalizi kushughulikia gharama za safari kutoka picha za risiti za ndani, kutengeneza barua pepe ya madai ya gharama, na kuonyesha data ya gharama kwa kutumia chati ya pai. Mawakala huchagua kazi kwa njia ya kiotomatiki kulingana na muktadha wa kazi.

Hatua:
1. OCR Agent inachakata picha ya risiti ya ndani na kutoa data ya gharama za safari.
2. Email Agent inatengeneza barua pepe ya madai ya gharama.

### Mfano wa hali ya gharama za safari:
Fikiria wewe ni mfanyakazi unayesafiri kwa ajili ya mkutano wa kibiashara katika jiji lingine. Kampuni yako ina sera ya kufidia gharama zote za safari zinazokubalika. Hapa kuna muhtasari wa gharama zinazoweza kutokea za safari:
- Usafiri:
Nauli ya ndege kwa safari ya kwenda na kurudi kutoka jiji lako la nyumbani hadi jiji lengwa.
Huduma za teksi au usafiri wa mtandao kwenda na kutoka uwanja wa ndege.
Usafiri wa ndani katika jiji lengwa (kama usafiri wa umma, magari ya kukodisha, au teksi).

- Malazi:
Kukaa hotelini kwa siku tatu katika hoteli ya biashara ya kiwango cha kati karibu na eneo la mkutano.

- Chakula:
Posho ya kila siku ya chakula kwa kifungua kinywa, chakula cha mchana, na chakula cha jioni, kulingana na sera ya posho ya kampuni.

- Gharama za ziada:
Ada za maegesho kwenye uwanja wa ndege.
Ada za upatikanaji wa mtandao kwenye hoteli.
Vijisenti vya huduma au malipo madogo.

- Nyaraka:
Unawasilisha risiti zote (ndege, teksi, hoteli, chakula, nk.) na ripoti ya gharama iliyokamilika kwa ajili ya kufidiwa.


## Ingiza maktaba zinazohitajika

Ingiza maktaba na moduli muhimu kwa ajili ya daftari.


In [1]:
import os
from dotenv import load_dotenv
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential
from semantic_kernel.kernel import Kernel
from semantic_kernel.agents import AgentGroupChat
from openai import AsyncOpenAI
from semantic_kernel.agents import ChatCompletionAgent, AgentGroupChat


from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.agents.strategies import SequentialSelectionStrategy, DefaultTerminationStrategy
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents import ImageContent, TextContent
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, OpenAIChatPromptExecutionSettings

from semantic_kernel.functions import kernel_function, KernelArguments
from pydantic import BaseModel, Field
from typing import List
from azure.ai.inference.models import SystemMessage, UserMessage, TextContentItem, ImageContentItem, ImageUrl, ImageDetailLevel

load_dotenv()

True

In [2]:
def _create_kernel_with_chat_completion(service_id: str) -> Kernel:
    kernel = Kernel()
   
    client = AsyncOpenAI(
    api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com/")
    kernel.add_service(
        OpenAIChatCompletion(
            ai_model_id="gpt-4o-mini",
            async_client=client,
            service_id="open_ai"
        )
    )

    kernel.add_service(
        OpenAIChatCompletion(
            ai_model_id="gpt-4o",
            async_client=client,
            service_id="gpt-4o"
        )
    )

    return kernel

## Eleza Miundo ya Gharama

Unda mfano wa Pydantic kwa gharama za mtu binafsi na darasa la ExpenseFormatter ili kubadilisha swali la mtumiaji kuwa data ya gharama iliyopangiliwa.

Kila gharama itawakilishwa katika muundo:
`{'date': '07-Mar-2025', 'description': 'ndege kwenda eneo husika', 'amount': 675.99, 'category': 'Usafiri'}`


In [3]:
class Expense(BaseModel):
    date: str = Field(..., description="Date of expense in dd-MMM-yyyy format")
    description: str = Field(..., description="Expense description")
    amount: float = Field(..., description="Expense amount")
    category: str = Field(..., description="Expense category (e.g., Transportation, Meals, Accommodation, Miscellaneous)")

class ExpenseFormatter(BaseModel):
    raw_query: str = Field(..., description="Raw query input containing expense details")
    
    def parse_expenses(self) -> List[Expense]:
        """
        Parses the raw query into a list of Expense objects.
        Expected format: "date|description|amount|category" separated by semicolons.
        """
        expense_list = []
        for expense_str in self.raw_query.split(";"):
            if expense_str.strip():
                parts = expense_str.strip().split("|")
                if len(parts) == 4:
                    date, description, amount, category = parts
                    try:
                        expense = Expense(
                            date=date.strip(),
                            description=description.strip(),
                            amount=float(amount.strip()),
                            category=category.strip()
                        )
                        expense_list.append(expense)
                    except ValueError as e:
                        print(f"[LOG] Parse Error: Invalid data in '{expense_str}': {e}")
        return expense_list

## Kufafanua Mawakala - Kuunda Barua Pepe

Unda darasa la wakala ili kuunda barua pepe ya kuwasilisha dai la gharama.  
- Wakala huyu hutumia `kernel_function` kupamba kazi inayounda barua pepe ya kuwasilisha dai la gharama.  
- Inahesabu jumla ya kiasi cha gharama na kupanga maelezo hayo katika mwili wa barua pepe.  


In [4]:
class ExpenseEmailAgent:

    @kernel_function(description="Generate an email to submit an expense claim to the Finance Team")
    async def generate_expense_email(expenses):
        total_amount = sum(expense['amount'] for expense in expenses)
        email_body = "Dear Finance Team,\n\n"
        email_body += "Please find below the details of my expense claim:\n\n"
        for expense in expenses:
            email_body += f"- {expense['description']}: ${expense['amount']}\n"
        email_body += f"\nTotal Amount: ${total_amount}\n\n"
        email_body += "Receipts for all expenses are attached for your reference.\n\n"
        email_body += "Thank you,\n[Your Name]"
        return email_body

# Wakala wa Kuchambua Gharama za Safari kutoka kwa Picha za Risiti

Unda darasa la wakala ili kuchambua gharama za safari kutoka kwa picha za risiti.
- Wakala huyu hutumia kivinjari `kernel_function` kufafanua kazi inayochambua gharama za safari kutoka kwa picha za risiti.
- Badilisha picha ya risiti kuwa maandishi kwa kutumia OCR (Utambuzi wa Maandishi kwa Macho) na uchukue taarifa muhimu kama tarehe, maelezo, kiasi, na kategoria.


In [5]:
class OCRAgentPlugin:
    def __init__(self):
        self.client = ChatCompletionsClient(
            endpoint="https://models.inference.ai.azure.com/",
            credential=AzureKeyCredential(os.environ.get("GITHUB_TOKEN")),
        )
        self.model_name = "gpt-4o"

    @kernel_function(description="Extract structured travel expense data from receipt.jpg using gpt-4o-model")
    def extract_text(self, image_path: str = "receipt.jpg") -> str:
        try:
            image_url_str = str(ImageUrl.load(image_file=image_path, image_format="jpg", detail=ImageDetailLevel.HIGH))

            prompt = (
                "You are an expert OCR assistant specialized in extracting structured data from receipt images. "
                "Analyze the provided receipt image and extract travel-related expense details in the format: "
                "'date|description|amount|category' separated by semicolons. "
                "Follow these rules: "
                "- Date: Convert dates (e.g., '4/4/22') to 'dd-MMM-yyyy' (e.g., '04-Apr-2022'). "
                "- Description: Extract item names (e.g., 'Carlson's Drylawn', 'Peigs transaction Probiotics'). "
                "- Amount: Use numeric values (e.g., '4.50' from '$4.50' or '4.50 dollars'). "
                "- Category: Infer from context (e.g., 'Meals' for food, 'Transportation' for travel, 'Accommodation' for lodging, 'Miscellaneous' otherwise). "
                "Ignore totals, subtotals, or service charges unless they are itemized expenses. "
                "If no expenses are found, return 'No expenses detected'. "
                "Return only the structured data, no additional text."
            )
            response = self.client.complete(
                messages=[
                    SystemMessage(content=prompt),
                    UserMessage(content=[
                        TextContentItem(text="Extract travel expenses from this receipt image."),
                        ImageContentItem(image_url=ImageUrl(url=image_url_str))
                    ])
                ],
                model=self.model_name,
                temperature=0.1,
                max_tokens=2048
            )
            extracted_text = response.choices[0].message.content
            return extracted_text
        except Exception as e:
            error_msg = f"[LOG] OCR Plugin: Error processing image: {str(e)}"
            print(error_msg)
            return error_msg

## Kuchakata Gharama

Tafsiri kazi isiyo ya kawaida ili kuchakata gharama kwa kuunda na kusajili mawakala muhimu kisha kuwaamsha.
- Kazi hii inachakata gharama kwa kupakia vigezo vya mazingira, kuunda mawakala muhimu, na kuwasajili kama programu-jalizi.
- Inaunda mazungumzo ya kikundi na mawakala wawili na kutuma ujumbe wa mwongozo ili kuzalisha barua pepe na chati ya pai kulingana na data ya gharama.
- Inashughulikia makosa yoyote yanayotokea wakati wa kuanzisha mazungumzo na kuhakikisha mawakala wanasafishwa ipasavyo.


In [6]:
async def process_expenses():
    load_dotenv()
    settings_slm = OpenAIChatPromptExecutionSettings(service_id="gpt-4o")
    settings_llm = OpenAIChatPromptExecutionSettings(service_id="open_ai")  # Fixed typo in service_id
    
    ocr_agent = ChatCompletionAgent(
        kernel=_create_kernel_with_chat_completion("ocrAgent"),
        name="ocr_agent",
        instructions="Extract travel expense data from the receipt image in the prompt using the 'extract_text' function from the 'ocrAgent' plugin. Return the data in the format 'date|description|amount|category' separated by semicolons.",
        arguments=KernelArguments(settings=settings_slm)
    )
    
       
    email_agent = ChatCompletionAgent(
            kernel=_create_kernel_with_chat_completion("expenseEmailAgent"),
            name="email_agent",
            instructions="Take the travel expense data from the previous agent and generate a professional expense claim email using the 'generate_expense_email' function from the 'expenseEmailAgent' plugin, then pass the data forward.",
            arguments=KernelArguments(
                settings=settings_llm)
        )


    kernel = Kernel()

    # Use fixed path to receipt.jpg in the same folder
    image_path = "./receipt.jpg"
    
    # Create a structured message with text and image content for OCR processing
    image_url_str = f"file://{image_path}"
    
    # Using the correct format for multi-modal content
    user_message = ChatMessageContent(
        role=AuthorRole.USER,
        items=[
            TextContent(text="""
            Please extract the raw text from this receipt image, focusing on travel expenses like dates, descriptions, amounts, and categories (e.g., Transportation, Accommodation, Meals, Miscellaneous).
            Then generate a professional expense claim email.
                        """),
            ImageContent.from_image_file(path=image_path)
        ]
    )

    # Register plugins with the kernel
    kernel.add_plugin(OCRAgentPlugin(), plugin_name="ocrAgent")
    kernel.add_plugin(ExpenseEmailAgent(), plugin_name="expenseEmailAgent")

    # Create group chat
    chat = AgentGroupChat(
        agents=[ocr_agent, email_agent],
        selection_strategy=SequentialSelectionStrategy(initial_agent=ocr_agent),
        termination_strategy=DefaultTerminationStrategy(maximum_iterations=1)
    )

    # Add user message with prompt
    await chat.add_chat_message(user_message)
    print(f"# User message added to chat with receipt image")

    async for content in chat.invoke():
        print(f"# Agent - {content.name or '*'}: '{content.content}'")


## Kazi kuu

Fafanua kazi kuu ili kusafisha koni na kuendesha kazi ya `process_expenses` kwa njia ya usawazishaji.


In [9]:
async def main():
    # Clear the console
    os.system('cls' if os.name=='nt' else 'clear')

    # Run the async agent code
    await process_expenses()

await main()

# User message added to chat with receipt image
# Agent - ocr_agent: 'The receipt primarily seems to capture costs for meals and beverages. Below is the extracted travel expense data:

**Travel Expense Data:**  
`2 May '22|Meals at restaurant|75.15|Meals`

---

**Professional Expense Claim Email Draft:**  

**Subject:** Expense Claim for Meals – 2 May 2022  

Dear [Recipient's Name],  

I am submitting an expense claim for a meal incurred during a business-related trip. Below are the details:  

- **Date:** 2 May 2022  
- **Expense Description:** Meals at a restaurant  
- **Amount:** $75.15  
- **Category:** Meals  

Please find the attached receipt for your reference. Kindly process the reimbursement at your earliest convenience. Let me know if you require additional information.  

Thank you for your assistance.  

Best regards,  
[Your Name]  
[Your Contact Information]  

Let me know if you need further revisions or additional details!'



---

**Kanusho**:  
Hati hii imetafsiriwa kwa kutumia huduma ya kutafsiri ya AI [Co-op Translator](https://github.com/Azure/co-op-translator). Ingawa tunajitahidi kuhakikisha usahihi, tafadhali fahamu kuwa tafsiri za kiotomatiki zinaweza kuwa na makosa au kutokuwa sahihi. Hati ya asili katika lugha yake ya awali inapaswa kuzingatiwa kama chanzo cha mamlaka. Kwa taarifa muhimu, tafsiri ya kitaalamu ya binadamu inapendekezwa. Hatutawajibika kwa kutoelewana au tafsiri zisizo sahihi zinazotokana na matumizi ya tafsiri hii.
