Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make span attributes inaccessible except when needed [WIP] #1501

Closed

Conversation

shovnik
Copy link
Contributor

@shovnik shovnik commented Dec 23, 2020

Description

Attempting to discourage accessing Span Attributes outside the SDK using the method suggested by @Oberon00.

Fixes #1001

How Has This Been Tested?

This is currently failing tests and is a work in progress. The pull request has only been opened as a draft for feedback on the method.

@shovnik shovnik requested a review from a team as a code owner December 23, 2020 18:51
@shovnik shovnik requested review from codeboten and toumorokoshi and removed request for a team December 23, 2020 18:51
@shovnik shovnik marked this pull request as draft December 23, 2020 18:51
@shovnik
Copy link
Contributor Author

shovnik commented Dec 23, 2020

@Oberon00 I attempted to apply your suggestion in issue to store the current ReadWrite Span inside the _data property of a new Span class and only give access to attributes when needed after collection has ended in an export cycle in the PR linked above. This solution works great to allows the span_processor and exporters to access attributes without exposing them to the public, but there are very large amount of tests that create spans using the start_span method whether directly or indirectly which does not give access to the span properties as the tests require. Is there a better solution in that case than accessing span._data. and disabling linting rule, because that felt excessive. If the start_span method created a ReadWrite span then the WriteOnly span would never be used and defeat the purpose of making the Span attributes inaccessible since Spans are already discouraged to be instantiated directly as per issue. Do you have any idea to work around this?

@Oberon00
Copy link
Member

Is there a better solution in that case than accessing span._data. and disabling linting rule, because that felt excessive.

I think there is no way around touching all tests here, as preventing easy access is the whole point of the PR.
But you don't need to disable the linting rule everywhere. You can add a single test utility function like get_sdk_span_data (I think there is some test utility module somewhere here) and disable the linting rule only for that function.

@@ -411,6 +411,104 @@ def __new__(cls, *args, **kwargs):
raise TypeError("Span must be instantiated via a tracer.")
return super().__new__(cls)

def __init__(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe construct the _ReadWriteSpan at the call site and only pass that one in to avoid duplicating the long parameter list.

@toumorokoshi
Copy link
Member

I left a comment in the other issue, but if the discussion is happening here:

This solution works great to allows the span_processor and exporters to access attributes without exposing them to the public, but there are very large amount of tests that create spans using the start_span method whether directly or indirectly which does not give access to the span properties as the tests require.

There really isn't a great solution to only exposing interface attributes to some consumers (exporters) but not others (application consumers). In addition, I think that distinction doesn't really matter practically speaking, as owners of opentelemetry exporters or other components that interoperate with the SDK need some sort of API to retrieve values for non-application related operations.

So I'd suggest just exposing getters. Really I'd go a step further and remove the ReadWrite span class altogether, and just modify sdk.Span to be the ReadWrite span (add getter methods, make variables protected).

@shovnik
Copy link
Contributor Author

shovnik commented Dec 29, 2020

Wouldn't adding getters to the existing Span defeat the purpose of limiting access to span attributes in normal use cases? Getters is essentially signalling that span attributes are meant to be accessed. The current implementation without any of my change already allows users to access read/write 'span.attributes' as they please which seems to be the issue highlighted.

@toumorokoshi
Copy link
Member

Wouldn't adding getters to the existing Span defeat the purpose of limiting access to span attributes in normal use cases?

Yes. Although I would argue that no matter what, you're giving code that is outside of the core package access to span attributes. And not having that access method match idiomatic python would be an issue.

I understand the spec is effectively trying to say that such getters should only be available to those who want to extend the instrumentation (via exporters, for example), and not to the "application" author. So I believe there are three cases:

  • a: attributes that are private to the package (opentelemetry-python SDK)
  • b: attributes that are exposed to SDK extenders
  • c: attributes that are exposed publicly

The current PR IIUC does:

  • a: _data
  • b: _data
  • c: external span

So how does one determine attributes on _data that are private to the package?

In addition, the current mechanism of accessing such attributes (using a protected (underscore) variable name) is unidiomatic, and it can lead to SDK extenders thinking that any such attribute on the span is usable by SDK extenders in this fashion. I worry that setting the precedent means that further difficulties in modifying the SDK, since they will start to rely on other protected variables in the span.

@Oberon00
Copy link
Member

Without looking at the PR in full: My idea was that the access to _data is only done within the SDK itself.

So exporters/span processors are handed the value of _data which is the ReadWriteSpan. On the other hand, the Span you get when starting a new span does have the attributes hidden behind this _data.

@Oberon00
Copy link
Member

Oberon00 commented Dec 30, 2020

The exception to this would be unit tests (unless we want to go a more complex route and switch them to use an in-memory exporter or capture the ReadWriteSpan in on_start in a SpanProcessor). That's why I'd suggest providing a underscore-less method in some test package that gets the _data out of a span.

@shovnik
Copy link
Contributor Author

shovnik commented Dec 30, 2020

Would it be better if I leave the issue as is for now for further discussion or if I manually update the tests? My internship will be ending tomorrow, and I did not want to leave the remaining issue I had assigned hanging.

@shovnik
Copy link
Contributor Author

shovnik commented Dec 30, 2020

I do agree with yusuke that having a span with different access writes come out start_span and end_span and accessing _data internally is not very intuitive especially for people extending the SDK, but that might just be because I have never seen an approach like this before. However, adding a getter defeats the purpose of limiting access to attributes. This kind of leaves the Span back to where it started though.

@shovnik shovnik closed this Dec 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SDK SHOULD NOT provide access to a Span's attributes besides its SpanContext
3 participants