Skip to content

Clean up detection sentry events + tests#5833

Merged
RobertJoonas merged 29 commits intomasterfrom
cleanup-detection-sentry
Oct 27, 2025
Merged

Clean up detection sentry events + tests#5833
RobertJoonas merged 29 commits intomasterfrom
cleanup-detection-sentry

Conversation

@RobertJoonas
Copy link
Copy Markdown
Contributor

@RobertJoonas RobertJoonas commented Oct 27, 2025

Changes

This PR cleans up Sentry reporting for detection. Doing so, we're also changing the way we observe detection failures. Instead of "unhandled vs handled" (which makes sense in verification), a detection result is either a success or a failure. The telemetry events are also renamed accordingly in this PR.

A detection failure can be two things - an issue with the customer website, or an issue on our side. The issues on our side are further split into Browserless issues vs our own. Hence the telemetry bit is branched out quite a lot, but I think it should help us prioritize any future issues better.

Tests

  • Automated tests have been added

Changelog

  • This PR does not make a user-facing change

Documentation

  • This change does not need a documentation update

Dark mode

  • This PR does not change the UI

RobertJoonas and others added 27 commits October 23, 2025 09:35
Otherwise, it can sometimes remain unclear in the diagnostics, whether
it was InstallationV2 or InstallationV2CacheBust that timed out.
The current production logs show two types of verification timeouts:

* service_error: "Unhandled Browserless response status: 408" (vast
  majority of cases)
* service_error: :timeout (only a few cases)

The latter happens when we hit the Req receive_timeout
(endpoint_timeout + 2s). I've seen Browserless not respect the timeout
param from time to time, so it's better to keep the timeout logic
"in-house" only.
...but still consider them "unhandled" for telemetry, also notifying Sentry
and logging the warning.
Also rename current liveview modules and routes, removing the v2 suffix
Also fix dockerignore and elixir.yml referencing a wrong priv path
@RobertJoonas RobertJoonas force-pushed the cleanup-detection-sentry branch from 1931b19 to c2163df Compare October 27, 2025 09:39
Base automatically changed from cleanup-scriptv2-flag to master October 27, 2025 09:53
Comment on lines +77 to +78
String.contains?(extra, "net::") -> failure(:client_issue)
String.contains?(String.downcase(extra), "execution context") -> failure(:client_issue)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, non-blocking: This remap from :browserless_client_error to :client_issue and :unknown_issue loses some information in the process. At this point, extra is known, but once we map it to :unknown_issue, it gets thrown away and in the checks module we go to the case

        _unknown_failure ->
          {true, true, "Unknown failure"}

It seems there's actually two codes that we want to emit from the Detection check, :browserless_client_error and :browserless_client_error_silenced

Copy link
Copy Markdown
Contributor Author

@RobertJoonas RobertJoonas Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The information doesn't get lost. These error-grouping atoms are just for determining which kind of telemetry we want. In any case, if it's an interesting case, the whole diagnostics struct (e.g. %{..., service_error: %{code: :browserless_client_error, extra: "whatever message"}}, ...) is logged and captured by Sentry as well.


def interpret(%__MODULE__{service_error: %{code: code}}, _url)
when code in [:domain_not_found, :invalid_url] do
failure(:client_issue)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, non-blocking: this does not seem like a client issue (browser driven by Browserless.io) issue, it's invalid input issue: we don't even start the browser in these scenarios

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, thanks! I've renamed it to :customer_website_issue which should make it more clear. 04942e6

@RobertJoonas RobertJoonas added this pull request to the merge queue Oct 27, 2025
Merged via the queue into master with commit 7540511 Oct 27, 2025
16 checks passed
@RobertJoonas RobertJoonas deleted the cleanup-detection-sentry branch October 27, 2025 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants