feat: Make Mistral ai instrumentation read context attributes #420

Merged · 13 commits into main on May 4, 2024

Conversation

@fjcasti1 (Contributor) commented on May 3, 2024:

No description provided.

fjcasti1 marked this pull request as ready for review on May 3, 2024.
fjcasti1 requested a review from axiomofjoy on May 3, 2024.
@axiomofjoy (Contributor) left a review comment:

Rather than changing every test case to use using_attributes, can we add one new test case and name it appropriately?

@fjcasti1 (Contributor, Author) commented on May 3, 2024:

> Rather than changing every test case to use using_attributes, can we add one new test case and name it appropriately?

I want to test all the scenarios: sync, async, with tools, without tools, etc.
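(For reference, the context manager under discussion comes from the openinference-instrumentation package; here is a minimal sketch of how it is used, where the span name and the placeholder call are illustrative rather than actual test code:)

```python
from opentelemetry import trace

from openinference.instrumentation import using_attributes

tracer = trace.get_tracer(__name__)

# Values passed here are stored in OTel context; OpenInference instrumentors
# read them and attach them to any span created inside the block, e.g. a
# span wrapping a Mistral chat completion.
with using_attributes(
    session_id="my-session-id",
    user_id="my-user-id",
    metadata={"env": "test"},
    tags=["tag-1", "tag-2"],
):
    with tracer.start_as_current_span("mistral.chat"):
        ...  # an instrumented Mistral call would run here
```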

Comment on lines +940 to +957
def _check_context_attributes(
    attributes: Dict[str, Any],
    session_id: str,
    user_id: str,
    metadata: Dict[str, Any],
    tags: List[str],
) -> None:
    # Shared helper: verifies that the context attributes recorded on a span
    # match the values supplied via using_attributes.
    assert attributes.pop(SESSION_ID, None) == session_id
    assert attributes.pop(USER_ID, None) == user_id
    attr_metadata = attributes.pop(METADATA, None)
    assert attr_metadata is not None
    assert isinstance(attr_metadata, str)  # must be a JSON string
    metadata_dict = json.loads(attr_metadata)
    assert metadata_dict == metadata
    attr_tags = attributes.pop(TAG_TAGS, None)
    assert attr_tags is not None
    assert len(attr_tags) == len(tags)
    assert list(attr_tags) == tags
@axiomofjoy (Contributor):
It's a best practice I try to follow to avoid abstraction as much as possible in test cases. Duplication, especially of something like an assert statement, is very acceptable and even encouraged: test cases need to point you to issues quickly, be stupidly easy to understand, and serve as documentation for other developers. I like the way they put it here: DRY is for code, DAMP is for tests.
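(For illustration, a sketch of what the inlined, DAMP-style assertions could look like in a single test; the test name, span fixture, and expected values are hypothetical, and SESSION_ID, USER_ID, METADATA, and TAG_TAGS are the attribute-key constants already used in this module:)

```python
import json


def test_chat_records_context_attributes(span):  # hypothetical test and fixture
    attributes = dict(span.attributes)
    # The assertions are repeated inline in every test instead of being
    # factored into a helper, so a red test points straight at the
    # expectation that broke.
    assert attributes.pop(SESSION_ID, None) == "my-session-id"
    assert attributes.pop(USER_ID, None) == "my-user-id"
    metadata = attributes.pop(METADATA, None)
    assert isinstance(metadata, str)  # metadata is serialized as a JSON string
    assert json.loads(metadata) == {"key": "value"}
    assert list(attributes.pop(TAG_TAGS, [])) == ["tag-1", "tag-2"]
```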

@fjcasti1 (Contributor, Author):

I agree in general, but in this case we're testing, in each test, that the attributes match the fixtures. I don't see the benefit of expanding this until we want to write more complex combined tests. However, if it's a must, I can inline the helper function in each test.

@axiomofjoy (Contributor):

Definitely agree we should test sync and async, but I think those should be separate test cases. I'm not convinced this feature has any interaction with something like tool calls; they are orthogonal features, aren't they? I'd also point out that by adding using_attributes to every test case, we are no longer testing that the Mistral instrumentation works without the context manager, which is probably the most common situation.

Thinking of it another way: when a test fails, it ideally points you directly to the root cause of the failure. If every test case uses using_attributes and using_attributes breaks for some reason, I would have many red test cases. If I instead have a test case called test_using_attributes_sync or something like that, and only that one test case fails, I immediately know what is wrong, and the name of the test accurately reflects the nature of the problem.
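(A sketch of such a dedicated test case, assuming an in-memory span exporter fixture like the ones the existing tests use; the fixture names and the exact client call are illustrative:)

```python
from openinference.instrumentation import using_attributes


def test_using_attributes_sync(mistral_sync_client, in_memory_span_exporter):
    # Only this test exercises the context manager, so if it alone turns
    # red, the failure is localized to context-attribute propagation.
    with using_attributes(session_id="my-session-id"):
        mistral_sync_client.chat(
            model="mistral-small-latest",
            messages=[{"role": "user", "content": "Hello"}],
        )
    spans = in_memory_span_exporter.get_finished_spans()
    assert len(spans) == 1
    # "session.id" is the OpenInference semantic-convention key for the session.
    assert spans[0].attributes["session.id"] == "my-session-id"
```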

…ation (merge of main into the PR branch)

* main:
  feat: Make langchain instrumentation read context attributes (#419)
  refactor: use dict() to pass the context attributes to span (#422)
@axiomofjoy (Contributor):

Thinking about this some more, I'm not even sure we necessarily need to explicitly test using_attributes in this module. Is the plan to have two tests, one async and one sync, for each of the five context managers (using_attributes, using_metadata, using_tags, using_session_id, and using_user_id) across each of the six instrumentation libraries? That's 60 tests and will grow when we add more libraries, and they are sort of integration tests between openinference-instrumentation and openinference-instrumentation-mistralai that result in coupling, rather than true unit tests. Imagine if we have to make a breaking change to the signature of the context managers. We'd have to change tests across many libraries. These kinds of tests are valuable to test the integration between openinference-instrumentation and other libraries, but we probably can't maintain that many of them.

What if instead we just add one test to openinference-instrumentation-mistralai testing that the instrumentation is able to pull session, user, metadata, and tags from OTel context? If we also have tests in openinference-instrumentation verifying that each context manager places the appropriate attributes into OTel context, I think that would be enough to be confident that things are working correctly. An integration test between using_attributes and one of the client libraries would be helpful, too.
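(And a sketch of the openinference-instrumentation side of that split: one test per context manager verifying that it places the expected attributes into OTel context. This assumes get_attributes_from_context, the helper the instrumentors use to read the attributes back, is importable from the package:)

```python
from openinference.instrumentation import (
    get_attributes_from_context,
    using_attributes,
)


def test_using_attributes_sets_otel_context():
    with using_attributes(session_id="my-session-id", tags=["tag-1"]):
        # Read back exactly what an instrumentor would see in OTel context.
        attributes = dict(get_attributes_from_context())
    assert attributes["session.id"] == "my-session-id"
    assert list(attributes["tag.tags"]) == ["tag-1"]
```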

@axiomofjoy (Contributor) approved these changes:
Approving based on the agreed-upon changes we discussed. @fjcasti1, nice job 👍

dosubot added the lgtm label (this PR has been approved by a maintainer) on May 3, 2024.
fjcasti1 merged commit e4a61c5 into main on May 4, 2024 (3 checks passed).
fjcasti1 deleted the kiko/context-attributes-mistralai-instrumentation branch on May 4, 2024.