
[Enhancement]: observability integration without using agenta hosted apps #1567

Conversation

@aybruhm aybruhm commented Apr 26, 2024

Description

This PR proposes integrating observability into LLM applications with Agenta, even if the applications are not hosted on Agenta.

Related Issue

Closes cloud_#338
Related: commons_#43

Additional Information

A pre-alpha version of the SDK has been published to PyPI for testing. You can check the obs-app on cloud.beta and review the observability traces created with the example in this PR.


@aybruhm aybruhm marked this pull request as ready for review April 26, 2024 16:50
@aybruhm aybruhm requested a review from mmabrouk April 26, 2024 16:51
@mmabrouk (Member) left a comment

Thanks for the PR, @aybruhm. I did not finish the review; I'm a bit too tired for that :) But I thought I'd leave some of the comments I made.

Inline review comments (resolved):
  • examples/app_with_observability/app_async.py
  • agenta-cli/agenta/sdk/agenta_decorator.py
@mmabrouk (Member) left a comment

@aybruhm I thought a bit about the problem. Unfortunately I have to leave early today (in a bit), so I am just going to write down my main thoughts. I hope these make sense; I'd love to get your feedback on this:

There are three workflows for instrumentation in agenta:

  1. Using an application hosted in agenta, where we take care of adding the code snippets
  2. Using an application in the shell
  3. Using an application hosted somewhere else

We designed the SDK's instrumentation for the first case. To instrument things in that case, we use the following:

  • ag.init → The only reason for this is to set the api_key and app_id variables and to initialize the config.
  • llm_tracing() → Initializes the tracing object
  • @ag.entrypoint → Calls llm_tracing() to initialize the tracing singleton, then starts the parent_span. In addition, it does a million other things for running the app: adding the arguments to FastAPI, the bash entrypoints…
  • @ag.span → Saves the span in the current tracing singleton (see the sketch below)
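To make the mechanics concrete, here is a minimal, purely illustrative sketch of how a span decorator along these lines could record spans in a lazily created singleton (Tracing, start_span, and end_span are toy stand-ins here, not the SDK's actual API):

import functools

class Tracing:
    _instance = None

    @classmethod
    def get(cls):
        if cls._instance is None:  # lazily create the singleton
            cls._instance = cls()
        return cls._instance

    def __init__(self):
        self.spans = []

    def start_span(self, name, kind, inputs):
        span = {"name": name, "kind": kind, "inputs": inputs}
        self.spans.append(span)
        return span

    def end_span(self, span, outputs):
        span["outputs"] = outputs

def span(type="default"):
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            tracing = Tracing.get()
            s = tracing.start_span(func.__name__, kind=type, inputs=kwargs)
            result = await func(*args, **kwargs)
            tracing.end_span(s, outputs=result)
            return result
        return wrapper
    return decorator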

The questions I am asking are:

  • How can we simplify instrumentation to the max and remove anything that is not required?
    • ag.init()
      • Not really required for the instrumentation per se. We can instead use the environment variables (or the input variables) to initialize the tracing object in case the agenta singleton does not exist
    • llm_tracing() → We can keep this as a way to use tracing with ag.init(), but for the pure observability use case we can simply use the Tracing constructor and make sure that we use the environment variables if nothing is provided
    • @ag.entrypoint → I think we should have a different decorator that only initializes the tracing object and calls start_parent_span. Basically it has only one responsibility. (ag.trace)
    • @ag.span → Nothing changes there

So the final implementation would look like this:

import os

import agenta as ag
from openai import AsyncOpenAI

client = AsyncOpenAI()

# set the env vars
os.environ["AGENTA_API_KEY"] = "xxx"
os.environ["AGENTA_APP_ID"] = "xxx"

@ag.span(type="LLM")
async def llm_call(prompt):
    chat_completion = await client.chat.completions.create(
        model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
    )
    tracing.set_span_attribute(  # `tracing` is the tracing singleton
        "model_config", {"model": "gpt-3.5-turbo", "temperature": ag.config.temperature}
    )  # translates to {"model_config": {"model": "gpt-3.5-turbo", "temperature": 0.2}}
    tokens_usage = chat_completion.usage.dict()
    return {
        "cost": ag.calculate_token_usage("gpt-3.5-turbo", tokens_usage),
        "message": chat_completion.choices[0].message.content,
        "usage": tokens_usage,
    }

@ag.span
async def generate(country: str, gender: str):
    """
    Generate a baby name based on the given country and gender.

    Args:
        country (str): The country to generate the name from.
        gender (str): The gender of the baby.
    """

    prompt = ag.config.prompt_template.format(country=country, gender=gender)
    response = await llm_call(prompt=prompt)
    return {
        "message": response["message"],
        "usage": response["usage"],
        "cost": response["cost"],
    }

In case someone has two spans in the same code using two different apps

# set the env vars
os.environ["AGENTA_API_KEY"] = "xxx"

@ag.span(type="LLM")
async def llm_call(prompt):
    chat_completion = await client.chat.completions.create(
        model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
    )
    tracing.set_span_attribute(
        "model_config", {"model": "gpt-3.5-turbo", "temperature": ag.config.temperature}
    )  # translate to {"model_config": {"model": "gpt-3.5-turbo", "temperature": 0.2}}
    tokens_usage = chat_completion.usage.dict()
    return {
        "cost": ag.calculate_token_usage("gpt-3.5-turbo", tokens_usage),
        "message": chat_completion.choices[0].message.content,
        "usage": tokens_usage,
    }

@ag.trace(app_id="")
async def generate(country: str, gender: str):
    """
    Generate a baby name based on the given country and gender.

    Args:
        country (str): The country to generate the name from.
        gender (str): The gender of the baby.
    """

    prompt = ag.config.prompt_template.format(country=country, gender=gender)
    response = await llm_call(prompt=prompt)
    return {
        "message": response["message"],
        "usage": response["usage"],
        "cost": response["cost"],
    }

@ag.trace(app_id="")
async def somepotherlogic(country: str, gender: str):
    prompt = ag.config.prompt_template.format(country=country, gender=gender)
    response = await llm_call(prompt=prompt)
    return {
        "message": response["message"],
        "usage": response["usage"],
        "cost": response["cost"],
    }

The issue in the second example is that we currently have a single singleton that we are using. So maybe this use case is not feasible right now?

I'd love to hear your thoughts.

@aybruhm (Member, Author) commented May 1, 2024

The questions I am asking are:

  • How can we simplify instrumentation to the max and remove anything that is not required?
    • ag.init()
      • Not really required for the instrumentation per se. We can instead use the environment variables (or the input variables) to initialize the tracing object in case the agenta singleton does not exist
    • llm_tracing() → We can keep this as a way to use tracing with ag.init(), but for the pure observability use case we can simply use the Tracing constructor and make sure that we use the environment variables if nothing is provided
    • @ag.entrypoint → I think we should have a different decorator that only initializes the tracing object and calls start_parent_span. Basically it has only one responsibility. (ag.trace)
    • @ag.span → Nothing changes there

Thanks for the review, @mmabrouk. I thought about this last night and created a POC to test how the implementation would work, and it did.

Observability would still work if we:

  • remove ag.init and use env vars instead
  • construct the Tracing class directly and make use of the env vars (a rough sketch follows)
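A minimal sketch of what that env-var fallback could look like (the parameter names and the error message are illustrative, not the SDK's actual signature):

import os

class Tracing:
    def __init__(self, api_key=None, app_id=None):
        # explicit arguments win; otherwise fall back to the environment
        self.api_key = api_key or os.environ.get("AGENTA_API_KEY")
        self.app_id = app_id or os.environ.get("AGENTA_APP_ID")
        if not self.api_key:
            raise ValueError("No api_key provided and AGENTA_API_KEY is not set")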

Regarding having ag.trace, here's how it would look:

@ag.trace # ag.entrypoint will be added after ag.trace in the case where the user wants to deploy their LLM app to agenta
async def generate(country: str, gender: str):
    prompt = ag.config.prompt_template.format(country=country, gender=gender)
    response = await llm_call(prompt=prompt)
    return {
        "message": response["message"],
        "usage": response["usage"],
        "cost": response["cost"],
    }
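Internally, ag.trace would then be a thin wrapper with a single responsibility: initialize the tracing object and start the parent span. A rough sketch (start_parent_span follows the naming in this thread; the toy Tracing stand-in, end_parent_span, and the error handling are illustrative):

import functools

class Tracing:  # toy stand-in; see the constructor sketch above
    def start_parent_span(self, name, inputs):
        return {"name": name, "inputs": inputs}

    def end_parent_span(self, span, outputs=None, error=None):
        span.update({"outputs": outputs, "error": error})

def trace(func):
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        tracing = Tracing()  # would fall back to env vars, as proposed
        parent_span = tracing.start_parent_span(func.__name__, inputs=kwargs)
        try:
            result = await func(*args, **kwargs)
            tracing.end_parent_span(parent_span, outputs=result)
            return result
        except Exception as exc:
            tracing.end_parent_span(parent_span, error=str(exc))
            raise
    return wrapper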

@aybruhm (Member, Author) commented May 1, 2024

In case someone has two spans in the same code using two different apps

# set the env vars
os.environ["AGENTA_API_KEY"] = "xxx"

@ag.span(type="LLM")
async def llm_call(prompt):
    chat_completion = await client.chat.completions.create(
        model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
    )
    tracing.set_span_attribute(
        "model_config", {"model": "gpt-3.5-turbo", "temperature": ag.config.temperature}
    )  # translate to {"model_config": {"model": "gpt-3.5-turbo", "temperature": 0.2}}
    tokens_usage = chat_completion.usage.dict()
    return {
        "cost": ag.calculate_token_usage("gpt-3.5-turbo", tokens_usage),
        "message": chat_completion.choices[0].message.content,
        "usage": tokens_usage,
    }

@ag.trace(app_id="")
async def generate(country: str, gender: str):
    """
    Generate a baby name based on the given country and gender.

    Args:
        country (str): The country to generate the name from.
        gender (str): The gender of the baby.
    """

    prompt = ag.config.prompt_template.format(country=country, gender=gender)
    response = await llm_call(prompt=prompt)
    return {
        "message": response["message"],
        "usage": response["usage"],
        "cost": response["cost"],
    }

@ag.trace(app_id="")
async def somepotherlogic(country: str, gender: str):
    prompt = ag.config.prompt_template.format(country=country, gender=gender)
    response = await llm_call(prompt=prompt)
    return {
        "message": response["message"],
        "usage": response["usage"],
        "cost": response["cost"],
    }

The issue in the second example is that we currently have a single singleton that we are using. So maybe this use case is not feasible right now?

I'd love to hear your thoughts.

The use case is feasible. We just need to revise our implementation of Tracing to support multiple singletons if we want to have different spans for different applications.

The behaviour of Tracing would not change. The only thing we would need to do is modify the Tracing constructor to support multiple singletons, and introduce a mechanism that switches between singletons based on the provided app_id.
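A minimal, purely illustrative sketch of such a per-app_id registry (not the actual SDK code):

import os

class Tracing:
    _instances = {}  # app_id -> Tracing instance

    def __new__(cls, app_id=None):
        key = app_id or os.environ.get("AGENTA_APP_ID", "default")
        if key not in cls._instances:
            instance = super().__new__(cls)
            instance.app_id = key
            instance.spans = []
            cls._instances[key] = instance
        return cls._instances[key]

Tracing(app_id="app-a") and Tracing(app_id="app-b") would then return two independent instances, so a decorator like @ag.trace(app_id=...) could route spans to the right app.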

@mmabrouk mmabrouk merged commit b200bb7 into main May 30, 2024
10 checks passed
@mmabrouk mmabrouk deleted the 1560-age-118-observability-integration-without-using-agenta-hosted-apps branch May 30, 2024 20:05