Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/add boto3 sqs instrumentation #1081

Merged

Conversation

oxeye-nikolay
Copy link
Member

Description

This PR instruments the SQS client as part of python's boto3 package. Unlike the instrumentation of boto3, this instrumentation propagates the context and baggage over the sent messages, following the messaging systems spec.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • I have created two services using the same queue. The client was sending both batch and single messages and the producer was polling on the queue and iterating over the messages.

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@oxeye-nikolay oxeye-nikolay requested a review from a team as a code owner May 3, 2022 14:47
@oxeye-nikolay oxeye-nikolay self-assigned this May 3, 2022
@github-actions github-actions bot requested a review from nikosokolik May 3, 2022 14:47
@oxeye-nikolay oxeye-nikolay removed the request for review from nikosokolik May 3, 2022 14:47
@github-actions github-actions bot requested a review from nikosokolik May 3, 2022 15:04
@oxeye-nikolay oxeye-nikolay removed the request for review from nikosokolik May 3, 2022 16:46
@owais
Copy link
Contributor

owais commented May 9, 2022

Can we not extend existing sqs instrumentation? We shouldn't be publishing two instrumentation packages for the same library. This will cause a lot of confusion. We'll be putting additional burden on users to figure out which one to use. What happens if both instrumentations are installed? Will both work? Will one cancel the other? Will the cancellation be deterministic? Even if the answer is a No today, further chances to one lib could end up breaking the other.

Personally if we think this implementation is far superior and it'd be harder to add these features to existing one, I'd rather completely replace the existing one with this instead of creating a new competing package.

@owais
Copy link
Contributor

owais commented May 9, 2022

I see this is for boto3 while as the old one is for boto. Makes sense.

@owais
Copy link
Contributor

owais commented May 9, 2022

I'll review the code in more detail shortly but at first glance, any reason we don't follow same pattern as boto instrumentation where we have a single boto/botocore package with extensions for things like sqs and s3?

@NathanielRN
Copy link
Contributor

I'll review the code in more detail shortly but at first glance, any reason we don't follow same pattern as boto instrumentation where we have a single boto/botocore package with extensions for things like sqs and s3?

+1 to this. I personally liked how OTel JS does it where they have a single instrumentation package (they actually have one pkg both v3 and v2, but it's fine that we how one for boto3 spearately) where they have a package for v3 that includes several extensions for the multiple services.

They have a ServiceExtensions.ts file where they register the different services

constructor() {
    this.services.set('SQS', new SqsServiceExtension());
    this.services.set('SNS', new SnsServiceExtension());
    this.services.set('DynamoDB', new DynamodbServiceExtension());
    this.services.set('Lambda', new LambdaServiceExtension());
  }

and their file organization by service looks very clean too.

It would be nice for users to have opentelemetry-instrumentation-boto3 and find immediate tracing for all the services that they do end up using without having to install more packages 🙂. If size were an issue (like it is in Lambda) then we can always mini-fy the code for services we don't use by removing directories.

As another point, what about calling this opentelemetry-instrumentaiton-boto3-sqs since there is no boto3sqs package? I don't know how this will break our scripts that count on that divider 😅

@oxeye-nikolay
Copy link
Member Author

oxeye-nikolay commented May 10, 2022

I'll review the code in more detail shortly but at first glance, any reason we don't follow same pattern as boto instrumentation where we have a single boto/botocore package with extensions for things like sqs and s3?

I would really love for it to have been like this. Unfortunatly, as I state in the documentation, the SQS classes are loaded in runtime from a template via the boto3.client function. This means I cannot instrument it until created, and that's why I wrap the boto3.client function.

I'll review the code in more detail shortly but at first glance, any reason we don't follow same pattern as boto instrumentation where we have a single boto/botocore package with extensions for things like sqs and s3?

+1 to this. I personally liked how OTel JS does it where they have a single instrumentation package (they actually have one pkg both v3 and v2, but it's fine that we how one for boto3 spearately) where they have a package for v3 that includes several extensions for the multiple services.

They have a ServiceExtensions.ts file where they register the different services

constructor() {
    this.services.set('SQS', new SqsServiceExtension());
    this.services.set('SNS', new SnsServiceExtension());
    this.services.set('DynamoDB', new DynamodbServiceExtension());
    this.services.set('Lambda', new LambdaServiceExtension());
  }

and their file organization by service looks very clean too.

It would be nice for users to have opentelemetry-instrumentation-boto3 and find immediate tracing for all the services that they do end up using without having to install more packages 🙂. If size were an issue (like it is in Lambda) then we can always mini-fy the code for services we don't use by removing directories.

As another point, what about calling this opentelemetry-instrumentaiton-boto3-sqs since there is no boto3sqs package? I don't know how this will break our scripts that count on that divider 😅

We can create an opentelemetry-instrumentation-boto3 package which will depend on both opentelemetry-instrumentation-botocore and opentelemetry-instrumentation-boto3sqs, and just use their instrumentations. This way users will just install opentelemetry-instrumentation-boto3 but we will be able to keep separation for ease of use. @NathanielRN WDYT?

  • Edit: I tried changing to opentelemetry-instrumentaiton-boto3-sqs but it didn't go so well. I wasn't able to import the package afterwards.

@owais
Copy link
Contributor

owais commented May 10, 2022

It sounds like something is wrong with our packaging system. Ideally it shouldn't matter whether a something is a sub-module or an entirely different package. Could you share more details about the issues you ran into when trying to organize this as a single package?

@owais
Copy link
Contributor

owais commented May 10, 2022

By single package, I mean opentelemetry-instrumentation-boto3. I don't think we should have a single package for both boto and boto3. They are technically different libraries so we should have different instrumentation packages.

@@ -0,0 +1,63 @@
# Copyright The OpenTelemetry Authors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not require a separate file. you can add the same test class in test_boto3sqs_instrumentation.py only

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,44 @@
# Copyright The OpenTelemetry Authors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as test_getter.py

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

).__getitem__(*args, **kwargs)
if not isinstance(retval, dict):
return retval
if not (receipt_handle := retval.get("ReceiptHandle", None)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could create a problem in python < 3.8 as walrus operator (':=') is available only from python 3.8. Its better to make it more generic by separating assignment and checking of receipt_handle

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my bad. For some reason, I remembered it was a python 3.6 feature. Changed


@staticmethod
def _safe_end_processing_span(receipt_handle: str) -> None:
if started_span := Boto3SQSInstrumentor.received_messages_spans.pop(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as before for walrus operator (':=')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

links=links,
kind=SpanKind.CONSUMER,
)
with trace.use_span(span):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can combine start_span and use_span in one API: start_as_current_span() with with statement

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Done

attributes = kwargs.pop("MessageAttributes", {})
propagate.inject(attributes, setter=boto3sqs_setter)
retval = wrapped(*args, MessageAttributes=attributes, **kwargs)
if message_id := retval.get("MessageId"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can pass message_id in enrich_span() so that you dont have to repeat the line 242 and 243 again as its already present in enrich_span()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to add it later, because if the call of retval = wrapped(*args, MessageAttributes=attributes, **kwargs) will raise (by a wrong queue for example), the span will still have the available information - helping the user to trace and debug. @sanketmehta28 WDYT?

message_identifier = successful_messages["Id"]
if message_span := ids_to_spans.get(message_identifier):
if message_span.is_recording():
message_span.set_attribute(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as before for message_id

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to this case I add the information after because if I set all the information after, a failure will create a span with lacking information that can be provided.

@ocelotl ocelotl merged commit 8a7a3f1 into open-telemetry:main May 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants