## Flow

In [1]:
# goal
goal = "Try fabric prompts to analyse a page"

# tasks
task_1 = "extract a website's text"
task_2 = "analyse it using a fabric prompt"

## Source

In [24]:
# urls

url_1 = "https://a16z.com/generative-ai-in-accounting/"
url_2 = "https://www.answer.ai/posts/2024-06-11-os-ai.html"


## Setup

In [3]:
# imports

import enum
import instructor
import json
import os
import re
import uuid
from abc import ABC, abstractmethod
from bs4 import BeautifulSoup
from datetime import datetime
from dotenv import load_dotenv
from exa_py import Exa
from googleapiclient.discovery import build
from IPython.display import display
from openai import OpenAI
import pandas as pd
from pathlib import Path
from pprint import pprint as pp
from pydantic import BaseModel, Field, StringConstraints, UUID4, conlist, constr, field_validator
import requests
import tiktoken
import time
from typing import Any, Callable, ClassVar, Dict, Iterable, List, Optional, Type, Union
from typing_extensions import Annotated, Literal
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import JSONFormatter, TextFormatter

In [25]:
# load API key

dotenv_path = Path(r"C:\Storage\python_projects\ashvin\.env")
load_dotenv(dotenv_path=dotenv_path)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
YOUTUBE_API_KEY = os.getenv("YOUTUBE_API_KEY")
EXA_API_KEY = os.getenv("EXA_API_KEY")

# main constants

GPT_MODEL = "gpt-4o"  # alias that resolves to the current gpt-4o snapshot
GPT_35_MODEL = "gpt-3.5-turbo"
URL = url_2

# instantiate clients
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.TOOLS)
audio_client = OpenAI()

## Utilities

In [6]:
# cost decorator

class CostDetails(BaseModel):
    input_cost: float
    output_cost: float
    total_cost: float

    def formatted_input_cost(self):
        return f"${self.input_cost:.6f}"

    def formatted_output_cost(self):
        return f"${self.output_cost:.6f}"

    def formatted_total_cost(self):
        return f"${self.total_cost:.6f}"

def cost(function: Callable) -> Callable:
    """
    Decorator to calculate and add the cost of token usage based on predefined model pricing.
    
    This decorator enriches the output of the decorated function by calculating the cost
    based on the number of prompt and completion tokens used. The costs are computed
    according to a hardcoded pricing table for supported models.

    Args:
        function (Callable): The function to be decorated, expected to return an instance
                             of a model with token counts included.

    Returns:
        Callable: A decorator that enhances the function's output with cost calculations.
    """

    # Define the pricing table within the decorator
    pricing = {
        'gpt-4o': {
            'input': 5.00 / 1000000,  # $5.00 per 1M tokens
            'output': 15.00 / 1000000  # $15.00 per 1M tokens
        }
    }

    def decorated_function(*args, **kwargs) -> Any:
        # Call the original function and capture its output
        result = function(*args, **kwargs)
        
        # Extract token counts using dot notation
        prompt_tokens = result.token_counts.prompt_tokens
        completion_tokens = result.token_counts.completion_tokens

        # Determine the model used; default to 'gpt-4o' for now
        model = 'gpt-4o'  # This could be dynamically determined based on args/kwargs if needed

        # Calculate costs based on the price table for the specific model
        input_cost = prompt_tokens * pricing[model]['input']
        output_cost = completion_tokens * pricing[model]['output']
        total_cost = input_cost + output_cost
        
        # Assign cost details using the CostDetails model
        result.cost_details = CostDetails(
            input_cost=input_cost,
            output_cost=output_cost,
            total_cost=total_cost
        )

        # Optionally print formatted cost details for transparency
        print(f"Cost Details: Input: {result.cost_details.formatted_input_cost()}, Output: {result.cost_details.formatted_output_cost()}, Total: {result.cost_details.formatted_total_cost()}")
        return result

    return decorated_function
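
The cost arithmetic inside the decorator can be checked on its own, without an API call. The following is a minimal sketch, not part of the notebook: `FakeUsage`, `price_call` and the token counts are hypothetical, and the prices are copied from the pricing table above.

```python
from dataclasses import dataclass

# Prices copied from the pricing table above (USD per token).
PRICING = {"gpt-4o": {"input": 5.00 / 1_000_000, "output": 15.00 / 1_000_000}}


@dataclass
class FakeUsage:
    """Hypothetical stand-in for the token counts a real call would return."""
    prompt_tokens: int
    completion_tokens: int


def price_call(usage: FakeUsage, model: str = "gpt-4o") -> dict:
    """Return input, output and total cost in USD for one call."""
    input_cost = usage.prompt_tokens * PRICING[model]["input"]
    output_cost = usage.completion_tokens * PRICING[model]["output"]
    return {"input": input_cost, "output": output_cost, "total": input_cost + output_cost}


# Made-up token counts, chosen only to make the arithmetic easy to follow.
costs = price_call(FakeUsage(prompt_tokens=1_000, completion_tokens=200))
print({name: f"${value:.6f}" for name, value in costs.items()})
```

Keeping the pricing in a dictionary keyed by model name would make it straightforward to add further models later, which the decorator's hardcoded `model = 'gpt-4o'` currently rules out.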

In [7]:
# wrapper

@cost
def wrapper(
    system_prompt: str | None = None, 
    user_prompt: Union[str, List[str]] | None = None, 
    response_model: Type[BaseModel] | None = None,
    max_retries: int = 3, 
    additional_messages: Union[str, List[str]] | None = None
) -> 'WrapperOutput':
    
    """
    Generates LLM completions using provided parameters and collects token usage information.
    
    This function dynamically constructs a message array for the LLM based on input parameters,
    handles the completion process using either standard or model-based completions depending on 
    the presence of a response model, and returns structured outputs including both the completion 
    response and token usage statistics.

    Args:
        system_prompt (str, optional): System-level initial prompt or instruction.
        user_prompt (Union[str, List[str]], optional): User-provided content or context as a single string or list of strings.
        response_model (BaseModel, optional): Pydantic model to structure the response when using model-specific completions.
        max_retries (int): Maximum number of retries for the LLM request.
        additional_messages (Union[str, List[str]], optional): Additional messages to precede the user prompt.

    Returns:
        WrapperOutput: A Pydantic model containing the LLM response and detailed token counts.

    Classes Defined Inside:
        TokenCounts: A Pydantic model detailing the counts of different types of tokens.
        WrapperOutput: A Pydantic model encapsulating the response and TokenCounts model.
    """

    class TokenCounts(BaseModel):
        completion_tokens: int
        prompt_tokens: int
        total_tokens: int

    class WrapperOutput(BaseModel):
        response: Union[str, BaseModel]
        token_counts: TokenCounts
        cost_details: Optional[CostDetails] = None  # filled in by the @cost decorator

    messages = []

    # Construct the messages list based on provided inputs
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})

    if additional_messages:
        # Can handle both list of messages or a single string
        if isinstance(additional_messages, list):
            messages.extend([{"role": "user", "content": message} for message in additional_messages])
        else:
            messages.append({"role": "user", "content": additional_messages})

    if user_prompt:
        # Similarly, handles both single and multiple user prompts
        if isinstance(user_prompt, list):
            messages.extend([{"role": "user", "content": context} for context in user_prompt])
        else:
            messages.append({"role": "user", "content": user_prompt})

    # Generate the completion and extract token counts based on the presence of a response model
    if response_model is None:
        # Standard completion process without a structured model
        completion = client.chat.completions.create(
            model=GPT_MODEL,
            response_model=None,
            max_retries=max_retries,
            messages=messages
        )
        response_content = completion.choices[0].message.content.strip()
        token_counts = TokenCounts(
            completion_tokens=completion.usage.completion_tokens,
            prompt_tokens=completion.usage.prompt_tokens,
            total_tokens=completion.usage.total_tokens
        )
    else:
        # Model-based completion that structures the response as per the specified BaseModel
        structured_response, raw_completion = client.chat.completions.create_with_completion(
            model=GPT_MODEL,
            response_model=response_model,
            max_retries=max_retries,
            messages=messages
        )
        response_content = structured_response
        token_counts = TokenCounts(
            completion_tokens=raw_completion.usage.completion_tokens,
            prompt_tokens=raw_completion.usage.prompt_tokens,
            total_tokens=raw_completion.usage.total_tokens
        )

    return WrapperOutput(response=response_content, token_counts=token_counts)
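
The message ordering used by the wrapper can be sketched as a standalone function, which makes it easy to check without calling the API. `build_messages` is a hypothetical helper, not part of the notebook. It mirrors the wrapper's logic: the system prompt first, then any additional messages, then the user prompt(s), with every non-system entry sent under the `user` role.

```python
from typing import Dict, List, Optional, Union

Prompt = Union[str, List[str]]


def build_messages(
    system_prompt: Optional[str] = None,
    user_prompt: Optional[Prompt] = None,
    additional_messages: Optional[Prompt] = None,
) -> List[Dict[str, str]]:
    """Assemble chat messages in the same order the wrapper uses."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    # Additional messages come before the user prompt; both use the "user" role.
    for prompt in (additional_messages, user_prompt):
        if not prompt:
            continue
        items = prompt if isinstance(prompt, list) else [prompt]
        messages.extend({"role": "user", "content": item} for item in items)
    return messages


messages = build_messages("You are terse.", ["part one", "part two"], "background")
for message in messages:
    print(message["role"], "->", message["content"])
```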


In [8]:
# predict tokens

def count_tokens(text: str, print_length: bool = True, token_type: str = 'input') -> int:
    """
    Count the tokens in a text string with tiktoken, optionally print the count, and
    print the estimated cost of those tokens based on a pricing table.

    Parameters:
        text (str): The text string to tokenize and count.
        print_length (bool): If True, prints the token count. Default is True.
        token_type (str): Specifies whether to use 'input' or 'output' token pricing. Default is 'input'.

    Returns:
        int: The number of tokens in the text.
    """
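    # Encode the text to count tokens. Note: cl100k_base is the GPT-4 / GPT-3.5 encoding;
    # gpt-4o uses o200k_base, so this count is only an estimate for GPT_MODEL.
    encoding = tiktoken.get_encoding("cl100k_base")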
    tokens = encoding.encode(text)
    token_count = len(tokens)

    # Print the token length if required
    if print_length:
        print(f"Token count: {token_count}")

    # Pricing table
    pricing = {
        'input': 5 / 1_000_000,  # $5 per 1 million tokens
        'output': 15 / 1_000_000  # $15 per 1 million tokens
    }

    # Calculate and print cost
    cost = pricing[token_type] * token_count
    print(f"Cost for {token_type} tokens: ${cost:.6f}")

    return token_count

## Tools

In [10]:
# web text extraction tool

class WebTextExtractor(BaseModel):
    """
    This tool extracts the text content from a given URL using the requests module.
    
    The extracted text can include the raw HTML content or just the text content of the web page.
    """

    def run(self, url: str, return_html: bool = False) -> Optional[str]:
        """
        Extract the text content from a given URL.

        Parameters:
            url (str): The URL from which to extract the text content.
            return_html (bool): If True, return the full HTML content. If False, return plain text. Default is False.

        Returns:
            Optional[str]: The text content of the web page if the request is successful,
                           otherwise None.
        """
        try:
            response = requests.get(url, timeout=30)  # avoid hanging on an unresponsive server
            response.raise_for_status()  # Raise an HTTPError for bad responses (4xx and 5xx)
            
            if return_html:
                return response.text
            else:
                soup = BeautifulSoup(response.text, 'html.parser')
                page_text = soup.get_text()
                return page_text.strip()
                
        except requests.RequestException as e:
            print(f"Error fetching the web page: {e}")
            return None
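
`soup.get_text()` often leaves runs of blank lines and indentation from the page layout, which can inflate the token count without adding content. The following is a minimal sketch of a clean-up step that could run on the extracted text before counting tokens; `collapse_whitespace` is a hypothetical helper, not part of the notebook.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Strip each line, drop empty lines, and squeeze repeated spaces and tabs."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


# Made-up sample resembling text extracted from a page with layout whitespace.
raw = "  Title \n\n\n\t First   paragraph.  \n \n Second paragraph.\n"
cleaned = collapse_whitespace(raw)
print(cleaned)
```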



## Run

In [26]:
web_tool = WebTextExtractor()
web_text = web_tool.run(URL)
_ = count_tokens(web_text)

Token count: 7733
Cost for input tokens: $0.038665


In [14]:
extract_wisdom = """
# IDENTITY and PURPOSE

You are a wisdom extraction service for text content. You are interested in wisdom related to the purpose and meaning of life, the role of technology in the future of humanity, artificial intelligence, memes, learning, reading, books, continuous improvement, and similar topics.

Take a step back and think step by step about how to achieve the best result possible as defined in the steps below. You have a lot of freedom to make this work well.

## OUTPUT SECTIONS

1. You extract a summary of the content in 50 words or less, including who is presenting and the content being discussed into a section called SUMMARY.

2. You extract the top 50 ideas from the input in a section called IDEAS:. If there are less than 50 then collect all of them.

3. You extract the 15-30 most insightful and interesting quotes from the input into a section called QUOTES:. Use the exact quote text from the input.

4. You extract 15-30 personal habits of the speakers, or mentioned by the speakers, in the content into a section called HABITS. Examples include but aren't limited to: sleep schedule, reading habits, things the

5. You extract the 15-30 most insightful and interesting valid facts about the greater world that were mentioned in the content into a section called FACTS:.

6. You extract all mentions of writing, art, and other sources of inspiration mentioned by the speakers into a section called REFERENCES. This should include any and all references to something that the speake

7. You extract the 15-30 most insightful and interesting overall (not content recommendations from EXPLORE) recommendations that can be collected from the content into a section called RECOMMENDATIONS.

## OUTPUT INSTRUCTIONS

1. You only output Markdown.
2. Do not give warnings or notes; only output the requested sections.
3. You use numbered lists, not bullets.
4. Do not repeat ideas, quotes, facts, or resources.
5. Do not start items with the same opening words.
"""

In [15]:
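# Note: this passes the URL string itself, not the page text. The model cannot
# fetch the page, so the output below is not grounded in the article.
response = wrapper(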
    system_prompt = extract_wisdom,
    user_prompt = URL
)

Cost Details: Input: $0.002215, Output: $0.019905, Total: $0.022120


In [17]:
print(response.response)

# SUMMARY
Anders Brownworth and Martin Casado from Andreessen Horowitz discuss the application of generative AI in accounting, exploring its potential to automate repetitive tasks, enhance decision-making, and increase efficiency. They also address challenges such as data privacy and reliability.

# IDEAS
1. Generative AI can automate repetitive accounting tasks.
2. AI can enhance decision-making in accounting.
3. Increased efficiency in accounting processes through AI.
4. Data privacy is crucial when using generative AI in accounting.
5. Reliability of AI outputs is a concern.
6. The adoption of AI in accounting is still in its early stages.
7. Generative AI can help detect fraud.
8. AI can improve the accuracy of financial forecasting.
9. Accountants need to adapt to new technologies.
10. Ethical considerations are important in AI implementation.
11. AI can personalize financial advice.
12. Training AI models requires significant data.
13. Generative AI can assist with data analysis.

In [21]:
revised_response = wrapper(
    system_prompt = extract_wisdom,
    user_prompt = web_text
)

Cost Details: Input: $0.020565, Output: $0.014925, Total: $0.035490
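
The two cost lines make the difference between the calls visible: dividing the input cost by the per-token input price recovers the prompt token count. The following is a minimal sketch, assuming the $5 per 1M input-token price from the pricing table; `tokens_from_cost` is a hypothetical helper, not part of the notebook.

```python
INPUT_PRICE = 5 / 1_000_000  # USD per input token, from the pricing table


def tokens_from_cost(cost_usd: float, price_per_token: float = INPUT_PRICE) -> int:
    """Recover a token count from a reported cost (rounded to absorb float error)."""
    return round(cost_usd / price_per_token)


url_only_tokens = tokens_from_cost(0.002215)   # call that received only the URL string
full_text_tokens = tokens_from_cost(0.020565)  # call that received the extracted page text
print(url_only_tokens, full_text_tokens)
```

The first call sent only a few hundred prompt tokens (the system prompt plus the URL), while the second sent several thousand, which is consistent with the page text reaching the model only in the second call.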


In [22]:
print(revised_response.response)

# SUMMARY
Marc Andrusko and Seema Amble from Andreessen Horowitz discuss the impact of generative AI on accounting, emphasizing its potential to streamline data collection, research, report generation, and client services, while noting current limitations and future developments.

# IDEAS
1. AI should be added to the certainties of life alongside death and taxes.
2. Generative AI can significantly enhance efficiency and time savings in accounting.
3. Big investments are being made in AI by firms like PWC and Reuters.
4. Fewer professionals are entering the accounting field, creating a demand gap.
5. Generative AI is effective at summarizing research and answering questions.
6. Certain aspects of accounting still require complex calculations and quantitative analysis.
7. Data reconciliation in accounting can benefit from LLM-powered data extraction.
8. LLMs can centralize data relevant to finance teams within enterprises.
9. AI copilots can resolve accounting discrepancies faster by acc

In [18]:
analyse_prose = """
# IDENTITY and PURPOSE

You are an expert writer and editor and you excel at evaluating the quality of writing and other content and providing various ratings and recommendations about how to improve it from a novelty, clarity, and overall messaging standpoint.

Take a step back and think step-by-step about how to achieve the best outcomes by following the STEPS below.

# STEPS

1. Fully digest and understand the content and the likely intent of the writer, i.e., what they wanted to convey to the reader, viewer, listener.

2. Identify each discrete idea within the input and evaluate it from a novelty standpoint, i.e., how surprising, fresh, or novel are the ideas in the content? Content should be considered novel if it's combining ideas in an interesting way, proposing anything new, or describing a vision of the future or application to human problems that has not been talked about in this way before.

3. Evaluate the combined NOVELTY of the ideas in the writing as defined in STEP 2 and provide a rating on the following scale:

"A - Novel" -- Does one or more of the following: Includes new ideas, proposes a new model for doing something, makes clear recommendations for action based on a new proposed model, creatively links existing ideas in a useful way, proposes new explanations for known phenomenon, or lays out a significant vision of what's to come that's well supported. Imagine a novelty score above 90% for this tier.

Common examples that meet this criteria:

- Introduction of new ideas.
- Introduction of a new framework that's well-structured and supported by argument/ideas/concepts.
- Introduction of new models for understanding the world.
- Makes a clear prediction that's backed by strong concepts and/or data.
- Introduction of a new vision of the future.
- Introduction of a new way of thinking about reality.
- Recommendations for a way to behave based on the new proposed way of thinking.

"B - Fresh" -- Proposes new ideas, but doesn't do any of the things mentioned in the "A" tier. Imagine a novelty score between 80% and 90% for this tier.

Common examples that meet this criteria:

- Minor expansion on existing ideas, but in a way that's useful.

"C - Incremental" -- Useful expansion or improvement of existing ideas, or a useful description of the past, but no expansion or creation of new ideas. Imagine a novelty score between 50% and 80% for this tier.

Common examples that meet this criteria:

- Valuable collections of resources
- Descriptions of the past with offered observations and takeaways

"D - Derivative" -- Largely derivative of well-known ideas. Imagine a novelty score in the 20% to 50% range for this tier.

Common examples that meet this criteria:

- Contains ideas or facts, but they're not new in any way.

"F - Stale" -- No new ideas whatsoever. Imagine a novelty score below 20% for this tier.

Common examples that meet this criteria:

- Random ramblings that say nothing new.

4. Evaluate the CLARITY of the writing on the following scale.

"A - Crystal" -- The argument is very clear and concise, and stays in a flow that doesn't lose the main problem and solution.
"B - Clean" -- The argument is quite clear and concise, and only needs minor optimizations.
"C - Kludgy" -- Has good ideas, but could be more concise and more clear about the problems and solutions being proposed.
"D - Confusing" -- The writing is quite confusing, and it's not clear how the pieces connect.
"F - Chaotic" -- It's not even clear what's being attempted.

5. Evaluate the PROSE in the writing on the following scale.

"A - Inspired" -- Clear, fresh, distinctive prose that's free of cliche.
"B - Distinctive" -- Strong writing that lacks significant use of cliche.
"C - Standard" -- Decent prose, but lacks distinctive style and/or uses too much cliche or standard phrases.
"D - Stale" -- Significant use of cliche and/or weak language.
"F - Weak" -- Overwhelming language weakness and/or use of cliche.

6. Create a bulleted list of recommendations on how to improve each rating, each consisting of no more than 15 words.

7. Give an overall rating that's the lowest rating of 3, 4, and 5. So if they were B, C, and A, the overall-rating would be "C".

# OUTPUT INSTRUCTIONS

- You output in Markdown, using each section header followed by the content for that section.
- Don't use bold or italic formatting in the Markdown.
- Liberally evaluate the criteria for NOVELTY, meaning if the content proposes a new model for doing something, makes clear recommendations for action based on a new proposed model, creatively links existing ideas in a useful way, proposes new explanations for known phenomenon, or lays out a significant vision of what's to come that's well supported, it should be rated as "A - Novel".
- The overall-rating cannot be higher than the lowest rating given.
- The overall-rating only has the letter grade, not any additional information.

# INPUT:

INPUT:
"""
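
Step 7 of the prompt defines the overall rating as the lowest of the NOVELTY, CLARITY and PROSE ratings. As a sketch of that rule, which could be used to sanity-check the model's output, assuming the letter order A > B > C > D > F used in the prompt's scales; `overall_rating` is a hypothetical helper, not part of the notebook.

```python
GRADE_ORDER = "ABCDF"  # best to worst, as in the prompt's rating scales


def overall_rating(novelty: str, clarity: str, prose: str) -> str:
    """Return the lowest (worst) of the three letter grades."""
    # A later position in GRADE_ORDER means a worse grade, so max() picks the lowest rating.
    return max((novelty, clarity, prose), key=GRADE_ORDER.index)


# The example from step 7: ratings of B, C and A give an overall rating of C.
print(overall_rating("B", "C", "A"))
```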

In [19]:
new_response = wrapper(
    system_prompt = analyse_prose,
    user_prompt = web_text
)

Cost Details: Input: $0.023730, Output: $0.008280, Total: $0.032010


In [20]:
print(new_response.response)

## Digest and Understand the Content and Intent
The article "Death, Taxes, and AI: How Generative AI Will Change Accounting" primarily aims to explore the impact of generative AI on the accounting industry, laying out various practical applications of AI in this field. It discusses how AI can enhance efficiency, data ingestion, research, report generation, and client service in accounting workflows. Additionally, it delves into challenges and considerations for adoption, including talent development and incentive alignment.

## Discrete Ideas and Novelty Evaluation
1. Generative AI's potential to revolutionize accounting practices.
2. Applications of AI in data collection, ingestion, and reconciliation.
3. LLM-powered data extraction from unstructured formats.
4. AI-powered research and specific use cases for R&D tax credits.
5. Automation in report generation and filing.
6. Transformation of client service through high-quality insights.
7. Challenges in replacing human judgment and sa

In [23]:
analyse_claims = """
# IDENTITY and PURPOSE

You are an objectively minded and centrist-oriented analyzer of truth claims and arguments.

You specialize in analyzing and rating the truth claims made in the input provided and providing both evidence in support of those claims, as well as counter-arguments and counter-evidence that are relevant to those claims.

You also provide a rating for each truth claim made.

The purpose is to provide a concise and balanced view of the claims made in a given piece of input so that one can see the whole picture.

Take a step back and think step by step about how to achieve the best possible output given the goals above.

# Steps

- Deeply analyze the truth claims and arguments being made in the input.
- Separate the truth claims from the arguments in your mind.

# OUTPUT INSTRUCTIONS

- Provide a summary of the argument being made in less than 30 words in a section called ARGUMENT SUMMARY:.

- In a section called TRUTH CLAIMS:, perform the following steps for each:

1. List the claim being made in less than 15 words in a subsection called CLAIM:.
2. Provide solid, verifiable evidence that this claim is true using valid, verified, and easily corroborated facts, data, and/or statistics. Provide references for each, and DO NOT make any of those up. They must be 100% real and externally verifiable. Put each of these in a subsection called CLAIM SUPPORT EVIDENCE:.

3. Provide solid, verifiable evidence that this claim is false using valid, verified, and easily corroborated facts, data, and/or statistics. Provide references for each, and DO NOT make any of those up. They must be 100% real and externally verifiable. Put each of these in a subsection called CLAIM REFUTATION EVIDENCE:.

4. Provide a list of logical fallacies this argument is committing, and give short quoted snippets as examples, in a section called LOGICAL FALLACIES:.

5. Provide a CLAIM QUALITY score in a section called CLAIM RATING:, that has the following tiers:
   A (Definitely True)
   B (High)
   C (Medium)
   D (Low)
   F (Definitely False)

6. Provide a list of characterization labels for the claim, e.g., specious, extreme-right, weak, baseless, personal attack, emotional, defensive, progressive, woke, conservative, pandering, fallacious, etc., in a section called LABELS:.

- In a section called OVERALL SCORE:, give a final grade for the input using the same scale as above. Provide three scores:

LOWEST CLAIM SCORE:
HIGHEST CLAIM SCORE:
AVERAGE CLAIM SCORE:

- In a section called OVERALL ANALYSIS:, give a 30-word summary of the quality of the argument(s) made in the input, its weaknesses, its strengths, and a recommendation for how to possibly update one's understanding of the world based on the arguments provided.

# INPUT:

INPUT:
"""
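
The OVERALL SCORE section asks for the lowest, highest and average claim score. The prompt does not say how letter grades should be averaged, so the sketch below assumes a GPA-style mapping (A=4 down to F=0) and rounds the mean to the nearest grade. `summarise_scores`, the mapping and the sample grades are all hypothetical.

```python
from statistics import mean

# Assumed numeric mapping (GPA-style); the prompt does not define one.
POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}
LETTERS = {points: letter for letter, points in POINTS.items()}


def summarise_scores(grades: list[str]) -> dict[str, str]:
    """Return the lowest, highest and (rounded) average letter grade."""
    points = [POINTS[grade] for grade in grades]
    return {
        "lowest": LETTERS[min(points)],
        "highest": LETTERS[max(points)],
        "average": LETTERS[round(mean(points))],
    }


# Made-up claim ratings for illustration.
summary = summarise_scores(["A", "B", "B", "D"])
print(summary)
```

Note that Python's `round` resolves exact halves to the nearest even number, so a mean of 2.5 would map to C rather than B under this mapping.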

In [27]:
claims = wrapper(
    system_prompt = analyse_claims,
    user_prompt = web_text
)

Cost Details: Input: $0.041625, Output: $0.016560, Total: $0.058185


In [28]:
print(claims.response)

ARGUMENT SUMMARY:
Jeremy Howard argues that policymakers need to understand AI technology to create effective legislation like SB 1047, emphasizing the difference between deploying and releasing AI models.

TRUTH CLAIMS:

1. CLAIM:
Policy makers need to understand how AI works.
   - CLAIM SUPPORT EVIDENCE:
     - Many policymakers lack a background in AI, making it difficult for them to understand legislative implications (Nature, 2021).
     - Technical understanding is crucial for effective AI policies (Brookings, 2020).
   - CLAIM REFUTATION EVIDENCE:
     - While specific technical knowledge may be lacking, policymakers often use expert consultations (Government Technology, 2022).
     - Some legislation has been successfully enacted despite complex technological underpinnings (Forbes, 2021).
   - LOGICAL FALLACIES:
     - Appeal to Authority: "Few, if any, of the policy makers working on this kind of legislation have a background in AI."
   - CLAIM RATING:
     B (High)
   - LABEL

In [29]:
new_web_text = web_tool.run("https://blog.langchain.dev/what-is-an-agent/")
_ = count_tokens(new_web_text)

Token count: 1226
Cost for input tokens: $0.006130


In [30]:
response1 = wrapper(
    analyse_claims,
    new_web_text
)
print(response1.response)

Cost Details: Input: $0.009155, Output: $0.012705, Total: $0.021860
# ARGUMENT SUMMARY:
An agent is a system using an LLM to decide application control flow, with varying degrees of autonomy.

# TRUTH CLAIMS:

## CLAIM 1:
CLAIM: "An agent is a system that uses an LLM to decide the control flow of an application."

### CLAIM SUPPORT EVIDENCE:
- An agent, as defined in numerous AI literature, generally refers to a system capable of making decisions and executing tasks based on predefined rules or learned experiences (Russell & Norvig, "Artificial Intelligence: A Modern Approach").
- Specifically, using LLMs (Language Model Models) to guide the decision-making process aligns with the emergence of AI systems designed to handle more complex and context-aware tasks (OpenAI's GPT documentation).

### CLAIM REFUTATION EVIDENCE:
- The perception of what constitutes an agent varies widely. Some definitions include simpler rule-based systems that do not require LLMs (Wooldridge & Jennings, "Intel

In [31]:
response2 = wrapper(
    analyse_prose,
    new_web_text
)
print(response2.response)

Cost Details: Input: $0.011420, Output: $0.008175, Total: $0.019595
# Assessment of Content

## Digest and Understand
The content discusses the concept of an "agent" in the context of LLM (Large Language Model) applications, aiming to clarify what constitutes an agent, the varying degrees of agentic capabilities, and the importance of these capabilities in designing and managing such systems. The writer's intent appears to be to elucidate the technical nuances of agents while advocating for a broader understanding and better tooling for more agentic applications.

## Identify Discrete Ideas and Novelty
1. Definition of an agent.
2. Variability in people's understanding of what an agent is.
3. Concept of agentic behavior and its spectrum.
4. Categorization of LLM systems based on their level of agentic behavior.
5. Importance of understanding the degree of agentic behavior in developing LLM applications.
6. Justification for developing new tools and infrastructure, such as LangGraph and

In [32]:
response3 = wrapper(
    extract_wisdom,
    new_web_text
)

Cost Details: Input: $0.008255, Output: $0.013305, Total: $0.021560


In [34]:
print(response3.response)

# SUMMARY:
Harrison Chase of LangChain discusses the definition and nature of AI agents, their varying degrees of autonomy, and the importance of specialized tools and infrastructure to develop and monitor these systems.

# IDEAS:
1. An agent uses an LLM to decide the control flow of an application.
2. Agents can be seen as systems that function on a spectrum of autonomy.
3. Andrew Ng's viewpoint suggests categorizing agents by degrees of agentic capabilities.
4. Levels of autonomy in LLM systems range from simple routers to complex autonomous agents.
5. More agentic systems require specialized orchestration frameworks.
6. Complex agentic systems need durable execution for error management.
7. Interaction with agentic systems requires robust monitoring and evaluation frameworks.
8. The agentic nature of a system determines tooling and infrastructure needs.
9. Tools like LangGraph and LangSmith are essential for building and monitoring agentic systems.
10. Evaluation frameworks for agen