Refactor OpenAI Error Tracing #987

umaannamalai · 2023-11-21T08:36:25Z

This PR refactors error tracing for OpenAI with the following changes:

Only pass error attrs to notice_error()
Attach completion_id/embedding_id to the error trace to distinguish AI errors
Send chat completion and embedding events on error as well
Add error: True to chat completion summary/embedding events
Add is_response attr to chat completion messages that are LLM responses
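
The changes listed above can be sketched roughly as follows. This is an illustration, not the agent's hook code: `FakeTransaction` and `record_chat_completion_error` are hypothetical, the `notice_error` signature is simplified, and the `id` key on the summary event is an assumption. Only the event name and the `request.model`, `vendor`, `ingest_source`, and `error` attributes are taken from this PR.

```python
import uuid


class FakeTransaction:
    """Hypothetical stand-in for the agent transaction; it only records calls."""

    def __init__(self):
        self.errors = []
        self.events = []

    def notice_error(self, error=None, attributes=None):
        self.errors.append((error, dict(attributes or {})))

    def record_custom_event(self, event_type, params):
        self.events.append((event_type, dict(params)))


def record_chat_completion_error(transaction, exc, request_args):
    """Report a failed chat completion as an error trace and a summary event."""
    completion_id = str(uuid.uuid4())

    # Only error-related attributes go on the error trace. The completion ID
    # lets the trace be matched with the summary event recorded below.
    transaction.notice_error(error=exc, attributes={"completion_id": completion_id})

    # The summary event is still sent on failure, flagged with error=True.
    transaction.record_custom_event(
        "LlmChatCompletionSummary",
        {
            "id": completion_id,  # assumed key name
            "request.model": request_args.get("model") or request_args.get("engine") or "",
            "vendor": "openAI",
            "ingest_source": "Python",
            "error": True,
        },
    )
    return completion_id
```

Sharing one generated ID between the error trace and the event is what makes it possible to tell an AI-related error apart from any other error on the same transaction.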

github-actions · 2023-11-21T08:40:42Z

🦙 MegaLinter status: ✅ SUCCESS

Descriptor	Linter	Files	Fixed	Elapsed time
✅ PYTHON	bandit	1		4.95s
✅ PYTHON	black	4	0	1.1s
✅ PYTHON	flake8	4		0.59s
✅ PYTHON	isort	4	0	0.2s
✅ PYTHON	pylint	4		3.1s

See detailed report in MegaLinter reports
Set VALIDATE_ALL_CODEBASE: true in mega-linter.yml to validate all sources, not only the diff

codecov-commenter · 2023-11-21T17:12:07Z

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (develop-ai-limited-preview@7d4828e).

Additional details and impacted files

@@                      Coverage Diff                      @@
##             develop-ai-limited-preview     #987   +/-   ##
=============================================================
  Coverage                              ?   82.06%           
=============================================================
  Files                                 ?      191           
  Lines                                 ?    20041           
  Branches                              ?     3474           
=============================================================
  Hits                                  ?    16446           
  Misses                                ?     2597           
  Partials                              ?      998

newrelic/hooks/mlmodel_openai.py

hmstepanek

This is a good first whack at this. I noted a couple of areas in the expected values for the exceptions that need to be changed; there are a lot more along the same lines that I didn't specifically call out. I'd prefer to see less drastic changes to the hook code (as I noted, I'm not sure whether this will work out in practice, but maybe give it a try and see if it ends up being simpler).
There are quite a few linter errors that we might want to take a look at as well.

tests/mlmodel_openai/test_chat_completion.py

tests/mlmodel_openai/test_chat_completion_error.py

hmstepanek · 2023-11-22T18:55:33Z

tests/mlmodel_openai/test_chat_completion_error.py

-    exact_agents={
-        "error.message": "Must provide an 'engine' or 'model' parameter to create a <class 'openai.api_resources.chat_completion.ChatCompletion'>",
-    }
+   exact_agents={


It looks like your editor is changing the spacing on a lot of these to be 3 instead of 4 spaces. I'm not sure how black didn't fix that either but we shouldn't be dropping the 4th space on these.

tests/mlmodel_openai/test_embeddings_error.py

newrelic/hooks/mlmodel_openai.py

hmstepanek · 2023-11-22T19:10:10Z

newrelic/hooks/mlmodel_openai.py

+        "embedding_id": embedding_id,
+        "appName": settings.app_name,
+        "api_key_last_four_digits": api_key_last_four_digits,
+        "span_id": span_id,


Is there a bug in this code right now where we are capturing the wrong span id? Shouldn't we be capturing the span id inside the function trace? I just looked at the SDK code and they attach the function trace's span id.

I'm not sure how this would work in practice, but what I was imagining for this design, and what I would propose, is that this code wouldn't change much from the previous version and that our .get(attr_name, "") fallback logic would handle most of the cases where things don't exist because an error occurred. I'd propose approaching the implementation from that perspective to see if that works out better.
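
To make the fallback idea concrete, here is a small sketch. `build_embedding_event`, the dictionary-shaped `response`, and the `id` and `response.model` keys are assumptions made for illustration; the other attribute names appear in this PR. When the request fails, `response` is simply empty and every lookup falls back to an empty string, so the success and error cases share one code path.

```python
def build_embedding_event(embedding_id, request_args, response=None, error=False):
    """Build an embedding event; missing response fields default to ""."""
    response = response or {}  # on error there is no response to read from
    event = {
        "id": embedding_id,  # assumed key name
        "request.model": request_args.get("model") or request_args.get("engine") or "",
        "response.model": response.get("model", ""),  # assumed key name
        "response.organization": response.get("organization", ""),
        "vendor": "openAI",
        "ingest_source": "Python",
    }
    if error:
        event["error"] = True
    return event


# Success path: values come from the response.
ok_event = build_embedding_event(
    "emb-1",
    {"model": "text-embedding-ada-002"},
    response={"model": "text-embedding-ada-002-v2", "organization": "acme"},
)

# Error path: same function, no response, fields fall back to "".
failed_event = build_embedding_event("emb-2", {}, error=True)
```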

hmstepanek

I had a couple of small fixups to simplify the logic. I also think we shouldn't be trying to handle feedback in the case that an error occurs, but I've put a message in the Slack channel asking everyone what they think as well.

hmstepanek · 2023-11-28T23:52:38Z

newrelic/hooks/mlmodel_openai.py

+                "request.model": request_args.get("model") or request_args.get("engine") or "",
+                "vendor": "openAI",
+                "ingest_source": "Python",
+                "response.organization": "" if exc_organization is None else exc_organization,


It looks like this already defaults to empty string above so you can simplify the logic here.

Suggested change

-                "response.organization": "" if exc_organization is None else exc_organization,
+                "response.organization": exc_organization,

I added this because the organization attribute is found on the exception but it is None (at least for our test cases). Since no AttributeError is raised, the default of "" isn't returned, so this expression coerces None to the empty string.

oooh ok that makes sense.
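
The behaviour discussed in this thread can be reproduced in isolation. `FakeOpenAIError` is a hypothetical exception class, and the use of `getattr` with a default is an assumption about how `exc_organization` is read; the point is only that a default is ignored when the attribute exists with the value None.

```python
class FakeOpenAIError(Exception):
    """Hypothetical exception whose `organization` attribute exists but is None."""

    organization = None


exc = FakeOpenAIError("request failed")

# The attribute exists, so the "" default is never used and None comes back.
exc_organization = getattr(exc, "organization", "")

# The explicit check turns that None into the empty string.
coerced = "" if exc_organization is None else exc_organization

# The default only applies when the attribute is missing altogether.
missing = getattr(exc, "no_such_attribute", "")
```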

hmstepanek

Thank you for sticking it out through this review process. It's been a lot of back and forth! There's only one minor issue here so I'm just gonna pre-approve this and assume you'll make that change before merging.

hmstepanek · 2023-11-30T18:57:28Z

newrelic/hooks/mlmodel_openai.py

+                "request.model": request_args.get("model") or request_args.get("engine") or "",
+                "vendor": "openAI",
+                "ingest_source": "Python",
+                "response.organization": "" if exc_organization is None else exc_organization,


oooh ok that makes sense.

hmstepanek · 2023-12-01T00:22:28Z

newrelic/hooks/mlmodel_openai.py

+                span_id,
+                trace_id,
+                "",
+                error_response_id,


We already have code inside create_message to default this correctly if it doesn't exist so just let that code run, otherwise we end up with uuid-0 instead of just uuid.

Suggested change

-                error_response_id,
+                None,
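
A rough sketch of why passing a freshly generated UUID is a problem. `make_message_id` is a hypothetical helper, and the `<response id>-<index>` format is inferred from the "uuid-0" remark above rather than taken from `create_message` itself.

```python
import uuid


def make_message_id(response_id, index):
    """Hypothetical helper: derive a message ID, defaulting when no response ID exists."""
    if response_id:
        # Normal case: suffix the response ID with the message index.
        return "%s-%d" % (response_id, index)
    # No response ID (for example, the request failed): use a plain UUID.
    return str(uuid.uuid4())


# Passing a generated UUID as the response ID yields "<uuid>-0".
suffixed = make_message_id(str(uuid.uuid4()), 0)

# Passing None lets the default run and yields a plain UUID.
plain = make_message_id(None, 0)
```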

hmstepanek · 2023-12-01T00:22:35Z

newrelic/hooks/mlmodel_openai.py

+            }
+            transaction.record_custom_event("LlmChatCompletionSummary", error_chat_completion_dict)
+
+            error_response_id = str(uuid.uuid4())


Suggested change

-            error_response_id = str(uuid.uuid4())

hmstepanek · 2023-12-01T00:23:19Z

newrelic/hooks/mlmodel_openai.py

+                span_id,
+                trace_id,
+                "",
+                error_response_id,


Suggested change

-                error_response_id,
+                None,

hmstepanek · 2023-12-01T00:23:26Z

newrelic/hooks/mlmodel_openai.py

+            }
+            transaction.record_custom_event("LlmChatCompletionSummary", error_chat_completion_dict)
+
+            error_response_id = str(uuid.uuid4())


Suggested change

-            error_response_id = str(uuid.uuid4())

Error refactoring and is_response.

f018b8e

mergify bot added the merge-conflicts label Nov 21, 2023

Merge branch 'develop-ai-limited-preview' into refactor-error-tracing

97d2fb6

mergify bot removed the merge-conflicts label Nov 21, 2023

mergify bot added the tests-failing label Nov 21, 2023

Fix merge conflicts.

063bed0

umaannamalai added 2 commits November 21, 2023 09:22

Update dictionary merging syntax.

77ff432

Remove breakpoint.

809d8e1

mergify bot removed the tests-failing label Nov 21, 2023

umaannamalai commented Nov 21, 2023

umaannamalai marked this pull request as ready for review November 21, 2023 20:46

umaannamalai requested a review from a team as a code owner November 21, 2023 20:46

hmstepanek requested changes Nov 22, 2023

umaannamalai added 3 commits November 27, 2023 12:37

Address review feedback.

04edb28

Formatting.

efdac1c

Formatting tests.

e89d500

mergify bot added the tests-failing label Nov 27, 2023

umaannamalai and others added 7 commits November 27, 2023 19:28

Address linting errors.

ee138d8

[Mega-Linter] Apply linters fixes

96bebc4

Trigger tests

6e6c877

Add test fixes.

953ed0a

Merge conflicts.

9d7342f

[Mega-Linter] Apply linters fixes

c918998

Trigger tests

8e5f1c3

hmstepanek requested changes Nov 29, 2023

hmstepanek reviewed Nov 29, 2023

umaannamalai added 2 commits November 29, 2023 15:38

Separate message input and output lists.

58d831e

Merge branch 'refactor-error-tracing' of github.com:newrelic/newrelic-python-agent into refactor-error-tracing

4adf28f

mergify bot removed the tests-failing label Nov 29, 2023

Uncomment tests.

269c716

hmstepanek approved these changes Dec 1, 2023

Remove error_response_id.

ed975d2

hmstepanek merged commit 091a815 into develop-ai-limited-preview Dec 4, 2023
47 checks passed

hmstepanek deleted the refactor-error-tracing branch December 4, 2023 23:09

umaannamalai added this to the v9.8.0 milestone Mar 25, 2024
