Add net.peer.ip in requests & urllib3 instrumentations. #661

Merged
merged 16 commits into from Sep 27, 2021

Oberon00 (Member) commented Sep 6, 2021

Description

Capture net.peer.ip in the requests and urllib3 instrumentations using a new shared package, opentelemetry-instrumentation-http-base, which provides an instrumentor for the standard library's http.client. The same instrumentor could also be used to easily add net.peer.ip to other instrumentations built on http.client, but this PR starts with only the two mentioned above.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Unit tests included. Due to how sharing of test code is currently set up, this requires a change to the main repo to add the test base class.

Does This PR Require a Core Repo Change?

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@Oberon00
Member Author

Oberon00 commented Sep 6, 2021

What's up with these pylint errors? instrumentation/opentelemetry-instrumentation-botocore/tests/test_botocore_instrumentation.py:133:16: E1101: Instance of 'TestBotocoreInstrumentor' has no 'get_finished_spans' member (no-member)

@Oberon00 Oberon00 marked this pull request as ready for review September 6, 2021 14:15
@Oberon00 Oberon00 requested a review from a team as a code owner September 6, 2021 14:15
@lzchen
Contributor

lzchen commented Sep 7, 2021

Any chance we can add this functionality to http utils? We already have that package to contain common http functionality.

@Oberon00
Member Author

Oberon00 commented Sep 7, 2021

I think that should not be a technical problem. It will add a dependency on asgiref though and so far util-http only instruments server-side libs. If you prefer I can merge the new http-base package into util-http.

@Oberon00
Member Author

@lzchen I'll go ahead and change the package.

@Oberon00
Member Author

@lzchen Done. Instead of adding a new package, I now extend the util-http package, like you suggested.

.github/workflows/test.yml (outdated, resolved)
CHANGELOG.md (outdated, resolved)
logger = logging.getLogger(__name__)


class HttpClientInstrumentor(BaseInstrumentor):
Contributor

I'm not too sure how this is supposed to be used by the user. Is this Instrumentor instantiated as well along side RequestsInstrumentor? Or is RequestsInstrumentor supposed to extend this class?

Member Author

Oh, actually now that you say it, this was the reason why it was in a separate package...

This is not supposed to be used by the user at all. Instead, it should just be loaded by the instrumentation infrastructure and then do its thing behind the scenes. The requests and urllib3 instrumentations are the actual users of this code: they register their wish to have the IP set on a span, and this instrumentation fulfills it once the standard library's http.client opens a socket.

Member Author

I think I have to add an entrypoint to the setup.cfg of util-http as well as the other instrumentation boilerplate. Do you think it would be better to keep it as separate package?

Contributor

@lzchen lzchen Sep 14, 2021

I don't mind having an entrypoint in http libraries for this, unless you feel strongly about having another package. I personally don't like having many packages :).

Do all client libraries have a "send" method? If not, maybe we can just have trysetip and set_ip_on_next_http_connection as utility functions, and decorate the request functions directly? How come we need to create a new instrumentor for this feature?

Member Author

@Oberon00 Oberon00 Sep 15, 2021

> How come we need to create a new instrumentor for this feature?

Because this instruments the standard library's http.client. I could put it in the urllib3 instrumentation, but if we ever want to instrument another library based on http.client (e.g. there is a urllib instrumentation here that could also be enhanced with this), where should we put it if not in a separate, independent instrumentor?

> Do all client libraries have a "send" method?

No, just those based on http.client, which applies to requests and urllib3 at least.

Member Author

@Oberon00 Oberon00 Sep 20, 2021

> Even the tests have:
>
> HttpClientInstrumentor().instrument()
> URLLib3Instrumentor().instrument()

Yes. The tests need to call HttpClientInstrumentor().instrument() manually since no distribution (is it called that?) is loaded that auto-enables the instrumentation. URLLib3Instrumentor().instrument() cannot call HttpClientInstrumentor().instrument() IIUC, as that would lead to double instrumentation. Also, I wanted the feature to be optional. If users disable the HttpClientInstrumentor (not sure if this is possible today), they won't get net.peer.ip or the potential overhead and bugs associated with it.

Contributor

Just confirming here. As a user, if I manually instrument code and load the RequestsInstrumentor, I will also need to manually load the HttpClientInstrumentor myself, correct?

Would it be possible to call trysetip from the requests/urllib3 code and skip the additional instrumentor? Or is the problem that those libraries don't have access to the underlying socket to pull IP information from it?

Member Author

Yes, the main reason was that it's not easy to get the underlying socket. I did not check every library separately; in some it may be possible to use undocumented properties or instrument some constructors to get to the socket. Instrumenting http.client, although also awkward, seemed like the cleanest way.

Contributor

Got it, that makes sense. Thanks!

Member Author

@codeboten To follow up here: Is manually using the instrumentors common enough that it would make sense to add a hint to use HttpClientInstrumentor to the READMEs of the requests and urllib3 instrumentations?

@lzchen
Contributor

lzchen commented Sep 14, 2021

@Oberon00

> It will add a dependency on asgiref though

Sorry I might be missing something. Where is the dependency for asgiref in this library?

@Oberon00
Member Author

@lzchen

> Sorry I might be missing something. Where is the dependency for asgiref in this library?

The code in this PR does not depend on asgiref, but the util-http package does.

@lzchen
Contributor

lzchen commented Sep 14, 2021

@Oberon00

> The code in this PR does not depend on asgiref, but the util-http package does.

Am I looking at the wrong place? I don't think the util-http package has any dependencies.

@Oberon00
Member Author

@lzchen You are right, seems I somehow looked in the wrong place 😄

@@ -40,3 +40,7 @@ packages=find_namespace:

[options.packages.find]
where = src

[options.entry_points]
opentelemetry_instrumentor =
Member Author

Compared to the original PR version with a separate package for this:
On the minus side, this now means that everyone depending on util-http will get the http.client instrumentation, even if they don't use it. The overhead should be small, though. On the plus side, there is no new package.

Contributor

I feel the benefit of not having a new package outweighs the small overhead :)
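For readers weighing this trade-off: the opentelemetry_instrumentor group in the diff context above is an ordinary Python entry point group, so what an environment has registered can be inspected with the standard library alone. The sketch below is only an illustration; the helper name is hypothetical, and it lists entry points without loading or enabling any of them.

```python
from importlib.metadata import entry_points


def find_instrumentor_entry_points(group="opentelemetry_instrumentor"):
    """Return the entry points registered under `group` (hypothetical helper)."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        return list(eps.select(group=group))
    # Python 3.8/3.9 return a dict mapping group names to entry points.
    return list(eps.get(group, []))


if __name__ == "__main__":
    for ep in find_instrumentor_entry_points():
        # ep.value is the "module:attribute" target named in setup.cfg.
        print(ep.name, "->", ep.value)
```

With the entry point living in util-http, any environment that installs util-http would show an additional entry in this listing, whether or not requests or urllib3 are instrumented.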

@@ -40,3 +40,7 @@ packages=find_namespace:

[options.packages.find]
where = src

[options.entry_points]
opentelemetry_instrumentor =
Member Author

BTW, is there some integration test that loads the urllib3 or requests instrumentation via the entrypoint, that I could extend to assert on the net.peer.ip attribute being present, so I could verify the wiring is not messed up here?

Contributor

Just to clarify, you want to test whether the attribute is present when auto-instrumentation is used? If so, I don't believe we have integration tests for that. We currently only have unit tests for auto-instrumentation.

@Oberon00
Member Author

Is there anything I can do to move this forward?

Contributor

@codeboten codeboten left a comment

The code looks good, just one clarifying question regarding the flow of how the instrumentation is used.


Contributor

@codeboten codeboten left a comment

👍


@lzchen lzchen merged commit efaa257 into open-telemetry:main Sep 27, 2021
@Oberon00 Oberon00 deleted the http-client-peer-ip branch October 5, 2021 11:00