Skip to content

Conversation

@lcian
Copy link
Member

@lcian lcian commented Nov 25, 2025

The previous setup with maybe_rewrite_objectstore_url didn't work if you were using a local Symbolicator.
This happens, for example, in Symbolicator's CI, where both Sentry and Symbolicator are ran locally.

We introduce a new utility to get the URL to an object from Symbolicator's point of view, replacing maybe_rewrite_objectstore_url.
This checks whether a Symbolicator container is detected, therefore it doesn't require any extra env vars/settings from the user.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 25, 2025
@lcian lcian marked this pull request as ready for review November 25, 2025 12:39
@lcian lcian requested a review from a team as a code owner November 25, 2025 12:39
Copy link
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand that correctly you need to have 2 different URLs to access objectstore. One which is used by Sentry and one Sentry hands out to other services?

Why not just have 2 separate variables/configurations and a central point which hands out the correct URL based on use-case?

@codecov
Copy link

codecov bot commented Nov 25, 2025

Codecov Report

❌ Patch coverage is 35.29412% with 11 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/objectstore/__init__.py 18.18% 9 Missing ⚠️
src/sentry/lang/native/symbolicator.py 33.33% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #103964      +/-   ##
===========================================
- Coverage   80.62%    80.61%   -0.01%     
===========================================
  Files        9311      9311              
  Lines      397326    397369      +43     
  Branches    25338     25338              
===========================================
+ Hits       320338    320342       +4     
- Misses      76547     76586      +39     
  Partials      441       441              

@lcian
Copy link
Member Author

lcian commented Nov 25, 2025

@Dav1dde If we were to have other services which we need to pass the URL to, then yes, we would need similar rewriting logic.
We can make it an option and put this function somewhere else.
However you will still have to remember to call it, and set the option correctly depending on whether the other service is running locally or in a container.

@jan-auer
Copy link
Member

you need to have 2 different URLs to access objectstore
There's a point to this, though. Technically, nothing is preventing us from configuring objectstore with different hostnames for each of our services that call out to it. So maybe we should really start configuring separate URLs for every service we pass a token to (and then rewrite URLs properly).

That would mean you'd have to reconfigure Sentry in the symbolicator CI setup, though.

@lcian lcian requested a review from a team as a code owner November 25, 2025 14:35
@lcian
Copy link
Member Author

lcian commented Nov 25, 2025

That would mean you'd have to reconfigure Sentry in the symbolicator CI setup, though.

Yeah in practice we would have to do this with any of these solutions.
Note also that all of this doesn't matter in prod, it's just a matter of local development/testing.

@lcian lcian changed the title feat(symbolicator): Consider USE_LOCAL_SYMBOLICATOR for URL rewriting feat(objectstore): Introduce host rewriting utility and option Nov 25, 2025
@Dav1dde
Copy link
Member

Dav1dde commented Nov 26, 2025

What I was proposing was that session.object_url already returns the correct URL without having to parse it afterwards again and then to try to string replace things. This seems unnecessarily complicated and error prone (like we noticed).

Possibly get_attachments_session (and equivalent functions) can just return a session that works for Symbolicator and other services (Vroom?) which need an externally reachable URL.

I prefer tackling problems at the root, instead of trying to workaround them later.

@lcian
Copy link
Member Author

lcian commented Nov 26, 2025

@Dav1dde I also agree that it's better to tackle problems at the root, in fact I hate the current solution, however I struggle to see an approach that works cleanly and is not error prone here.
I have another approach in mind, let me implement that and then we can discuss it.

@Dav1dde
Copy link
Member

Dav1dde commented Nov 26, 2025

@lcian I haven't put too much thought into this, I was thinking of (very pseudo):

            # Need to bikeshed on the names
            session = get_attachments_session(self.project.organization_id, self.project.id, usage='external')
            storage_url = session.object_url(report.stored_id)
            json: dict[str, Any] = {

Internally get_attachments_session can dispatch between two clients:

       return _ATTACHMENTS_CLIENT_EXTERNAL if usage == 'external' else _ATTACHMENTS_CLIENT_INTERNAL

@jan-auer
Copy link
Member

jan-auer commented Nov 26, 2025

We have two dirty problems in general:

  1. A client and then a session need to be instantiated just to render a URL. This creates a connection pool and may in future even try to do some initial auth to fail early. This wouldn't work anymore if we use a hostname that's not reachable from the place where the URL is generated.
  2. The string replacement. +1 on @Dav1dde's solution to have a higher-level function that has some switch for where to access it from.

We will need something similar in the Sentry endpoints that are supposed to create public pre-signed URLs. Note that these pre-signed URLs don't necessarily are an object URL - they could also be the POST/batch endpoint. I think we should expose functions independent of an instantiated client from objectstore_client that do either:

  • render just the path including the signature / token (once that's implemented)
  • take a base URL (incl prefix path) and then render the full URL

The signature might be a bit unwieldy, but what I'm thinking of is:

# most generic version
format_url(base, usecase, scopes[, path])

# convenience helpers, use `format_url` internally
format_post_url(base, usecase, scopes)         # path is empty or "/"
format_object_url(base, usecase, scopes, key)  # path is the key

@jan-auer
Copy link
Member

There's another problem: Technically, the hostname under which objectstore is reachable can be different for every service - both in production and in development environments. I think we should embrace this and separate:

  1. Creating a Session, which is meant to make requests to Objectstore. In Sentry codebase, this session must have the correct hostname that Sentry can reach.
  2. Formatting URLs. Which kind of URL is specific to the use case and how it's exposed. For example, for symbolicator we need a symbolicator_attachment_url() which then uses a specific hostname for symbolicator and the JWT. In contrast, the public_attachment_url() will refer to the proxy endpoint and renders a pre-signed URL.

@lcian lcian changed the title feat(objectstore): Introduce host rewriting utility and option feat(objectstore): Improve host rewriting Nov 26, 2025
@loewenheim
Copy link
Contributor

I can confirm that checking docker ps makes it work locally for me, but could it be a problem to do that every time we want to send a URL to Symbolicator?

@lcian
Copy link
Member Author

lcian commented Nov 26, 2025

I can confirm that checking docker ps makes it work locally for me, but could it be a problem to do that every time we want to send a URL to Symbolicator?

Thanks for testing. I'm now caching the result so we only run the command once.
In general I believe this shouldn't be a big problem as we're only running the subprocess when doing local development/testing.

@loewenheim
Copy link
Contributor

In general I believe this shouldn't be a big problem as we're only running the subprocess when doing local development/testing.

Oh you're right, totally missed that.

@lcian
Copy link
Member Author

lcian commented Nov 26, 2025

@jan-auer I agree that we will need something similar for our proxy. While the problem is similar as it involves URLs, I also think that it's a different one, so I would focus on trying to solve this one for now.
I've implemented a get_symbolicator_url function as you suggested. When we'll have the JWT we might be able to adapt this function to account for it.
In dev/test it checks docker to see where Symbolicator is running, so we can automatically tell if the host needs rewriting or not without the need for new env variables.

@Dav1dde I understand that ideally you would want session.object_url to do this automatically, however that's part of the objectstore_client, which I think shouldn't be concerned with rewriting URLs.
Moreover, a single Client might in theory be shared between multiple services, with part of them running locally and the rest running in containerized form, so it wouldn't be right to rewrite the URL for all or none of them.
You would have to introduce an additional parameter on the session to govern the rewriting behavior, which seems even worse than having to remember to call get_symbolicator_url instead of session.object_url.

@Dav1dde
Copy link
Member

Dav1dde commented Nov 26, 2025

@lcian That is not what I tried to show with my comment/psuedo code.

Just gonna defer to @jan-auer's comments, he put more thought into it and explained it much better than me.

Copy link
Member

@jan-auer jan-auer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move forward with this implementation, provided @loewenheim can confirm they are usually not running their local symbolicators on a different port in addition to the one in devservices.

Please just address the comment by seer, this is valid.

We should follow up on this once we're implementing auth.

@loewenheim
Copy link
Contributor

Let's move forward with this implementation, provided @loewenheim can confirm they are usually not running their local symbolicators on a different port in addition to the one in devservices.

That's correct, we either run the one from devservices or disable it and run the local one instead.

@lcian lcian enabled auto-merge (squash) November 27, 2025 12:49
@lcian lcian merged commit f425934 into master Nov 27, 2025
66 checks passed
@lcian lcian deleted the lcian/symbolicator-rewrite-url branch November 27, 2025 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants