Refactor OpenAI Error Tracing #987
🦙 MegaLinter status: ✅ SUCCESS
See detailed report in MegaLinter reports
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## develop-ai-limited-preview #987 +/- ##
=============================================================
Coverage ? 82.06%
=============================================================
Files ? 191
Lines ? 20041
Branches ? 3474
=============================================================
Hits ? 16446
Misses ? 2597
Partials ? 998
This is a good first whack at this. I noted a couple of areas in the expected values for the exceptions that need to be changed; there are a lot more along the same lines that I didn't specifically call out. I'd prefer to see less drastic changes to the hook code (as I noted, I'm not sure if this will work out in practice, but maybe give it a try and see if it ends up being simpler).
There are quite a few linter errors that we might want to take a look at as well.
exact_agents={
"error.message": "Must provide an 'engine' or 'model' parameter to create a <class 'openai.api_resources.chat_completion.ChatCompletion'>",
}
exact_agents={
It looks like your editor is changing the spacing on a lot of these to be 3 instead of 4 spaces. I'm not sure how black didn't fix that either but we shouldn't be dropping the 4th space on these.
newrelic/hooks/mlmodel_openai.py
Outdated
"embedding_id": embedding_id,
"appName": settings.app_name,
"api_key_last_four_digits": api_key_last_four_digits,
"span_id": span_id,
Is there a bug in this code right now where we are capturing the wrong span id? Shouldn't we be capturing the span id inside the function trace? I just looked at the SDK code and they attach the function trace's span id.
I'm not sure how this would work in practice, but what I was imagining for this design, and what I would propose, is that this code wouldn't change much from the previous version and that our .get(attr_name, "") fallback logic would handle most of the cases where things don't exist because an error occurred. I'd propose approaching the implementation from that perspective to see if that works out better.
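To illustrate the fallback idea, here is a minimal sketch, not the actual hook code. The helper name `collect_response_attrs` and the header names are assumptions made up for this example; the only point is that `.get(attr_name, "")` yields an empty string when the response data never arrived because the call raised.

```python
def collect_response_attrs(response_headers, attr_names):
    """Hypothetical helper: read each attribute, falling back to ""."""
    # response_headers may be an empty dict when the request errored out.
    return {name: response_headers.get(name, "") for name in attr_names}


# Assumed header names, for illustration only.
wanted = ["openai-organization", "openai-version"]

# Successful call: values are present and copied through.
ok_attrs = collect_response_attrs({"openai-organization": "acme"}, wanted)

# Failed call: nothing was returned, so every attribute falls back to "".
error_attrs = collect_response_attrs({}, wanted)
```

With this shape, the success and error paths can share the same attribute-building code.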
I had a couple of small fixups to simplify the logic. I also think we shouldn't be trying to handle feedback in the case that an error occurs, but I've put a message in the Slack channel asking everyone what they think as well.
"request.model": request_args.get("model") or request_args.get("engine") or "",
"vendor": "openAI",
"ingest_source": "Python",
"response.organization": "" if exc_organization is None else exc_organization,
It looks like this already defaults to empty string above so you can simplify the logic here.
Suggested change:
- "response.organization": "" if exc_organization is None else exc_organization,
+ "response.organization": exc_organization,
The reason I added this was because the `organization` attribute is found on the exception but it is `None` (at least for our test cases). The default of "" isn't returned since there was no attribute error, so this is attempting to coerce that behavior.
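The behavior described here can be reproduced in a few lines. This is a sketch with a made-up exception class (`FakeOpenAIError` is an assumption, not the real OpenAI error type): `getattr` only uses its default when the attribute is missing, not when it exists and holds `None`.

```python
class FakeOpenAIError(Exception):
    """Stand-in exception (hypothetical) whose attribute exists but is None."""

    organization = None


exc = FakeOpenAIError("request failed")

# The attribute exists, so the "" default is NOT used; the result is None.
raw_organization = getattr(exc, "organization", "")

# Explicit coercion, as in the line under review.
organization = "" if raw_organization is None else raw_organization

# The default only applies when the attribute is truly absent.
missing = getattr(exc, "not_an_attribute", "")
```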
oooh ok that makes sense.
…-python-agent into refactor-error-tracing
Thank you for sticking it out through this review process. It's been a lot of back and forth! There's only one minor issue here so I'm just gonna pre-approve this and assume you'll make that change before merging.
newrelic/hooks/mlmodel_openai.py
Outdated
span_id,
trace_id,
"",
error_response_id,
We already have code inside create_message to default this correctly if it doesn't exist so just let that code run, otherwise we end up with uuid-0 instead of just uuid.
Suggested change:
- error_response_id,
+ None,
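A rough sketch of the defaulting behavior this comment describes. The body of `create_message` is not shown in this conversation, so the function below (`message_id`, a hypothetical name) is only an assumption about its shape: an id derived from a supplied response id gets an index suffix, while a missing response id falls back to a bare UUID.

```python
import uuid


def message_id(response_id, index):
    """Hypothetical sketch of the id defaulting; not the real create_message."""
    if response_id:
        # A supplied response id is combined with the message index.
        return "%s-%d" % (response_id, index)
    # No response id (for example, the call errored): use a plain UUID.
    return str(uuid.uuid4())


# Pre-generating a UUID and passing it in produces the "uuid-0" form.
suffixed = message_id(str(uuid.uuid4()), 0)

# Passing None lets the default run and produces a bare UUID.
bare = message_id(None, 0)
```

Under that assumption, passing `None` rather than a pre-generated UUID is what avoids the extra suffix.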
newrelic/hooks/mlmodel_openai.py
Outdated
}
transaction.record_custom_event("LlmChatCompletionSummary", error_chat_completion_dict)

error_response_id = str(uuid.uuid4())
Suggested change:
- error_response_id = str(uuid.uuid4())
This PR refactors error tracing for OpenAI with the following changes:
- Only pass error attrs to `notice_error()`
- Attach `completion_id`/`embedding_id` to error trace to distinguish AI errors
- Send chat completion and embedding events on error as well
- Add `error: True` to chat completion summary/embedding events
- Add `is_response` attr to chat completion messages that are LLM responses
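The flow listed above could look roughly like the sketch below. It is not the code from this PR: `FakeRecorder` is a stand-in that collects calls in memory instead of talking to the agent, and `record_chat_completion_error` is a hypothetical helper. Only the event name and attribute keys are taken from the diff context in this conversation.

```python
import uuid


class FakeRecorder:
    """In-memory stand-in (assumption) for the trace/transaction objects."""

    def __init__(self):
        self.errors = []
        self.events = []

    def notice_error(self, attributes=None):
        self.errors.append(attributes or {})

    def record_custom_event(self, event_type, params):
        self.events.append((event_type, params))


def record_chat_completion_error(recorder, request_args):
    """Hypothetical helper: report an error and still send a summary event."""
    completion_id = str(uuid.uuid4())

    # Only error-related attributes go on the error trace, plus the id
    # that ties the error to the AI event.
    recorder.notice_error(attributes={"completion_id": completion_id})

    # The summary event is still sent on error, flagged with error=True.
    summary = {
        "id": completion_id,
        "request.model": request_args.get("model") or request_args.get("engine") or "",
        "vendor": "openAI",
        "ingest_source": "Python",
        "error": True,
    }
    recorder.record_custom_event("LlmChatCompletionSummary", summary)
    return completion_id


recorder = FakeRecorder()
completion_id = record_chat_completion_error(recorder, {"engine": "ada"})
```

Sharing one `completion_id` between the error attributes and the event is what lets the two be correlated later.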