Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle "not modified" errors properly #120

Merged
merged 3 commits into from
Jul 6, 2023

Conversation

jhump
Copy link
Member

@jhump jhump commented Jul 5, 2023

A "not modified" error really should not be counted as an error with "Unknown" error code. So this addresses that.

That does beg the question: how should it be handled? I see three options:

  1. Skip the status code attribute entirely, since "not modified" doesn't cleanly map to an RPC code.
  2. Treat it as "okay"/zero, since it's not really an error condition.
  3. Don't use an RPC status code and instead use an HTTP status code attribute.

This PR currently uses the last option. But I'm happy to change it, based on what a larger group of minds thinks is the best way to label this state.

@akshayjshah
Copy link
Member

Being compatible with gRPC's oddball status codes is one of the most painful parts of this project 😬 I wish we could add a status code specifically for 304s, but c'est la vie.

Effectively, I think that the approach proposed here is doing both 2 & 3 - it's not reporting an RPC error/status, and it's instead setting an HTTP status code. Overall, that seems right to me!

attributes.go Outdated
// it would be misleading to label it as an unknown error since it's not really
// an error, but rather a sentinel to trigger a "304 Not Modified" HTTP status.
codeKey := attribute.Key("http.response.status_code")
return codeKey.Int(http.StatusNotModified), true
Copy link
Member

@akshayjshah akshayjshah Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems right directionally. A few thoughts:

  • Suggest using semconv.HTTPStatusCode instead of hand-writing a string for the attr key. OTel is really larded up with standards, and I think the blessed key for this data is http.status_code - there's enough of this nonsense that we should just use the giant package of constants where possible.
  • Can we move this into the Connect-specific portion of the switch statement below? It makes sense to me that we treat this as an actual unknown error for the gRPC protocols, since that's how clients will interpret it. For the Connect protocol, it makes sense to omit the RPC error code and include an HTTP status code instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that is weird that this constant uses http.status_code. According to the docs I found, it is http.response.status_code: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md#common-attributes

Anyhow, I'll change it to use the constant, and let otel fix either their docs or that constant if one of them is wrong 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, OTel.... 😭

@jhump jhump merged commit 5403124 into main Jul 6, 2023
4 checks passed
@jhump jhump deleted the jh/handle-not-modified-errors-properly branch July 6, 2023 16:08
@akshayjshah
Copy link
Member

@jhump Please hold off on a release here; we also need to update spanStatus.

@jhump
Copy link
Member Author

jhump commented Jul 6, 2023

Oh, good catch. Sorry I missed that in this PR. I clearly don't know this repo very well.

@akshayjshah
Copy link
Member

No worries - it's surprisingly complicated. I opened a PR to fix that issue, but I think we ought to add a test or two to make sure that the final telemetry output looks correct.

akshayjshah added a commit that referenced this pull request Jul 7, 2023
If the RPC is using the Connect protocol and returns a not modified
error, we shouldn't set the span status to error. Follow-up to #120.

---------

Co-authored-by: Josh Humphries <2035234+jhump@users.noreply.github.com>
renovate bot referenced this pull request in open-feature/flagd Jul 8, 2023
… to v0.4.0 (#739)

[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/bufbuild/connect-opentelemetry-go](https://togithub.com/bufbuild/connect-opentelemetry-go)
| require | minor | `v0.3.0` -> `v0.4.0` |

---

### Release Notes

<details>
<summary>bufbuild/connect-opentelemetry-go
(github.com/bufbuild/connect-opentelemetry-go)</summary>

###
[`v0.4.0`](https://togithub.com/bufbuild/connect-opentelemetry-go/releases/tag/v0.4.0)

[Compare
Source](https://togithub.com/bufbuild/connect-opentelemetry-go/compare/v0.3.0...v0.4.0)

<!-- Release notes generated using configuration in .github/release.yml
at main -->

#### What's Changed

##### Enhancements

- Add option to omit trace events for unary RPCs by
[@&#8203;jipperinbham](https://togithub.com/jipperinbham) in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/98](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/98)
- Make option to omit trace events apply to streaming RPCs by
[@&#8203;joshcarp](https://togithub.com/joshcarp) in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/123](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/123)

##### Bugfixes

- Set status codes properly for HTTP 304s by
[@&#8203;jhump](https://togithub.com/jhump) in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/120](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/120)
- Set span status properly for HTTP 304s by
[@&#8203;akshayjshah](https://togithub.com/akshayjshah) in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/121](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/121)
- Update studio URL in README by
[@&#8203;pkwarren](https://togithub.com/pkwarren) in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/124](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/124)

##### Other changes

- Tighten internal linting for import aliases by
[@&#8203;zchee](https://togithub.com/zchee) in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/116](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/116)

#### New Contributors

- [@&#8203;jhump](https://togithub.com/jhump) made their first
contribution in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/120](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/120)
- [@&#8203;jipperinbham](https://togithub.com/jipperinbham) made their
first contribution in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/98](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/98)
- [@&#8203;pkwarren](https://togithub.com/pkwarren) made their first
contribution in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/124](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/124)
- [@&#8203;zchee](https://togithub.com/zchee) made their first
contribution in
[https://github.com/bufbuild/connect-opentelemetry-go/pull/116](https://togithub.com/bufbuild/connect-opentelemetry-go/pull/116)

**Full Changelog**:
bufbuild/connect-opentelemetry-go@v0.3.0...v0.4.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/open-feature/flagd).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNS4xNTkuNyIsInVwZGF0ZWRJblZlciI6IjM1LjE1OS43IiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
akshayjshah pushed a commit that referenced this pull request Jul 26, 2023
For Connect RPCs that use GET and a "304 Not Modified" status
is returned, the metric will use the `http.status_code` attribute
with a value of 304, instead of an `rpc.connect.error_code`
attribute. (It would previously label the metric with an error
code of "unknown", which is misleading.)
akshayjshah added a commit that referenced this pull request Jul 26, 2023
If the RPC is using the Connect protocol and returns a not modified
error, we shouldn't set the span status to error. Follow-up to #120.

---------

Co-authored-by: Josh Humphries <2035234+jhump@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants