Allow setting multiple index prefixes and choose from them in the UI #2726

Open
yoave23 opened this issue Jan 13, 2021 · 7 comments

Comments

@yoave23
Contributor

yoave23 commented Jan 13, 2021

Requirement - what kind of business use case are you trying to solve?

This is a slimmed-down version of #2509.
We'd like to let users query different (pre-configured) data sources.

Problem - what in Jaeger blocks you from solving the requirement?

Currently, Jaeger has no way to query different data sources (ES indices in our case) within the same deployment.

Proposal - what do you suggest to solve the problem or improve the existing situation?

We'd like to add a configuration flag, similar to the --es.index-prefix flag, that takes a collection of prefixes and lets the user select one of them before querying, via a dropdown in the UI (the dropdown would be displayed only if this configuration exists).
A real-life example: an organization that wants to ship its traces based on the current environment (production / staging).
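
A minimal sketch of the proposal above, in Go. The flag name `--es.index-prefixes` and the helper names are hypothetical (they are not existing Jaeger flags or functions); the sketch only shows how a comma-separated list of prefixes could be parsed and how the value picked in the dropdown could be validated against it.

```go
package main

import (
	"fmt"
	"strings"
)

// parseIndexPrefixes splits the value of a hypothetical --es.index-prefixes
// flag (an assumption, not an existing Jaeger flag) into an ordered list,
// dropping empty entries and duplicates.
func parseIndexPrefixes(raw string) []string {
	seen := make(map[string]bool)
	var prefixes []string
	for _, p := range strings.Split(raw, ",") {
		p = strings.TrimSpace(p)
		if p == "" || seen[p] {
			continue
		}
		seen[p] = true
		prefixes = append(prefixes, p)
	}
	return prefixes
}

// showDropdown reports whether the UI should render the selector:
// only when the multi-prefix configuration exists.
func showDropdown(configured []string) bool {
	return len(configured) > 0
}

// selectPrefix checks that the prefix chosen in the UI is one of the
// pre-configured ones before it is used for a query.
func selectPrefix(configured []string, chosen string) (string, error) {
	for _, p := range configured {
		if p == chosen {
			return p, nil
		}
	}
	return "", fmt.Errorf("index prefix %q is not configured", chosen)
}
```

For the production / staging example, `parseIndexPrefixes("production,staging")` would feed the dropdown, and `selectPrefix` would reject any value that was not pre-configured.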

@yurishkuro
Member

I have conceptual problems with this approach:

  • ES indices are an implementation detail of the storage backend, not a concept that UI users need to be aware of
  • ES indices are just one implementation, whereas "tenancy" could just as well apply to the Cassandra backend, e.g. by using different namespaces

I don't think we can achieve the desired result via this shortcut. The UI needs a proper notion of tenancy, expressed in terms that are natural to the end users. The notion of "tenant" could be multi-dimensional, e.g. in your example it was prod vs. staging (one dimension), but some users may have other dimensions, e.g. environment + department. A specific tenant (by concretizing all possible dimensions) can map to a certain configuration of the backend, such as ES index prefix.
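
To make the mapping from a multi-dimensional tenant to a backend configuration concrete, here is a rough sketch in Go. All type and field names (`Tenant`, `BackendConfig`, `Registry`) are assumptions for illustration and do not exist in Jaeger; the only idea taken from the comment above is that a fully specified tenant resolves to a backend setting such as an ES index prefix.

```go
package main

import (
	"sort"
	"strings"
)

// Tenant maps a dimension name to its value,
// e.g. {"environment": "prod", "department": "payments"}.
type Tenant map[string]string

// Key builds a deterministic lookup key by sorting the dimension names,
// so the order in which dimensions were supplied does not matter.
func (t Tenant) Key() string {
	names := make([]string, 0, len(t))
	for name := range t {
		names = append(names, name)
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, name+"="+t[name])
	}
	return strings.Join(parts, ",")
}

// BackendConfig is a hypothetical per-tenant storage configuration.
type BackendConfig struct {
	IndexPrefix string
}

// Registry maps a tenant key to its backend configuration.
type Registry map[string]BackendConfig

// Resolve returns the backend configuration for a fully specified tenant.
func (r Registry) Resolve(t Tenant) (BackendConfig, bool) {
	cfg, ok := r[t.Key()]
	return cfg, ok
}
```

The UI would then deal only in dimensions that are natural to users (environment, department), while the index prefix stays an implementation detail of the backend.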

I think this more generic approach is not that much more difficult to implement, but it is significantly more flexible.

Caveat: there are some implementation details of the query service that may still be tricky regardless of how the frontend tenancy is handled. For example, the Uber deployment implemented a self-refreshing cache of service names, because loading 3k entries from storage on every UI load took too long. I don't think this was implemented generically in the OSS version (we still have an open ticket, #1743), but it is an example of functionality that will need to be aware of tenancy.
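
As an illustration of what "aware of tenancy" could mean for such a cache, here is a small Go sketch that keys cached service names by tenant. It is not the Uber implementation and not existing Jaeger code: the names are hypothetical, and it refreshes lazily on access after a TTL rather than refreshing itself in the background.

```go
package main

import (
	"sync"
	"time"
)

// ServiceLoader fetches service names for one tenant from storage.
// It is a hypothetical hook, not an existing Jaeger interface.
type ServiceLoader func(tenant string) ([]string, error)

type cacheEntry struct {
	services []string
	loadedAt time.Time
}

// ServiceCache keeps one list of service names per tenant.
type ServiceCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	load    ServiceLoader
	entries map[string]cacheEntry
}

// NewServiceCache creates a cache whose entries expire after ttl.
func NewServiceCache(ttl time.Duration, load ServiceLoader) *ServiceCache {
	return &ServiceCache{
		ttl:     ttl,
		load:    load,
		entries: make(map[string]cacheEntry),
	}
}

// Services returns the cached names for the tenant, reloading them from
// storage when the entry is missing or older than the TTL.
func (c *ServiceCache) Services(tenant string) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[tenant]; ok && time.Since(e.loadedAt) < c.ttl {
		return e.services, nil
	}
	services, err := c.load(tenant)
	if err != nil {
		return nil, err
	}
	c.entries[tenant] = cacheEntry{services: services, loadedAt: time.Now()}
	return services, nil
}
```

The point of the sketch is only the tenant key: a cache shared across tenants would leak service names from one tenant into another's UI.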

@jkowall
Contributor

jkowall commented Jan 14, 2021

This can get really complex, especially without a database to manage configs, which I'm sure we wanted to avoid. Also, having dynamic names for the dimensions becomes difficult from a UI perspective. I do think having tenants and environments would give a lot of flexibility and likely meet 95% of the requirements that come up. We could put an environment selector in the search, which would provide the necessary division of "services".

@yurishkuro
Member

I think in the first implementation the tenancy should be described in a config file, not a database.

@jkowall
Contributor

jkowall commented Jan 14, 2021

That just wouldn't work for us: we have thousands or tens of thousands of tenants, and we'd have to change the config files dynamically on the fly, which is not ideal. I'll let @albertteoh chime in when he's back at the keys.

@yurishkuro
Member

So then you do need a database for tenants :-) Not sure how to interpret your previous #2726 (comment)

I think either way, we'd need an API on the query service for accessing this data, which can be backed by a set of configs or by a database.
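
One way to picture such an API is an interface on the query service with interchangeable backings. The Go sketch below is an assumption throughout: `TenantStore`, `ConfigTenantStore`, and `TenantsHandler` are hypothetical names, not part of Jaeger. A database-backed store would implement the same interface.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// TenantStore is a hypothetical abstraction over where tenants come from.
type TenantStore interface {
	ListTenants() ([]string, error)
}

// ConfigTenantStore serves a fixed list, e.g. one read from a config file.
type ConfigTenantStore struct {
	Tenants []string
}

// ListTenants returns the statically configured tenants.
func (s ConfigTenantStore) ListTenants() ([]string, error) {
	return s.Tenants, nil
}

// tenantsJSON encodes the tenant list from any store as a JSON array.
func tenantsJSON(store TenantStore) ([]byte, error) {
	tenants, err := store.ListTenants()
	if err != nil {
		return nil, err
	}
	return json.Marshal(tenants)
}

// TenantsHandler exposes the list over HTTP so the UI can populate a selector.
func TenantsHandler(store TenantStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := tenantsJSON(store)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write(body)
	}
}
```

With this shape, a first implementation could ship the config-backed store, and deployments with very many tenants could swap in a database-backed one without changing the UI contract.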

@jpkrohling
Contributor

jpkrohling commented Jan 15, 2021

we have thousands or tens of thousands of tenants

If they are all configured following a pattern, the multi-tenancy proposal would cover this by applying a generic configuration that uses the tenant value from the bearer token. Like this:

tenants: [] # no special rules for individual tenants
default: # all tenants follow the same pattern
  storageType: elasticsearch
  es:
    server-urls: "big-cluster.es.acme.example.com"
    index-prefix: "jaeger-%s" # %s is replaced by the tenant name

The token might also include a membership list that can act as a list of tenants that this user has access to. The UI then can use the information from the token to allow the user to select which tenant to use for the current query.
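
A rough Go sketch of how the `%s` pattern and the token's membership list might combine. The `Claims` type and function names are hypothetical, and decoding and verifying the bearer token is assumed to have happened elsewhere.

```go
package main

import "fmt"

// Claims is a hypothetical view of an already verified bearer token:
// the list of tenants the user is a member of.
type Claims struct {
	Tenants []string
}

// resolveIndexPrefix checks that the tenant selected in the UI appears in
// the token's membership list, then substitutes it into the index-prefix
// pattern (for example "jaeger-%s").
func resolveIndexPrefix(pattern string, claims Claims, selected string) (string, error) {
	for _, tenant := range claims.Tenants {
		if tenant == selected {
			return fmt.Sprintf(pattern, selected), nil
		}
	}
	return "", fmt.Errorf("token does not grant access to tenant %q", selected)
}
```

Because the prefix is derived from the tenant name, no per-tenant entry is needed in the config, which is what lets the pattern scale to a large number of tenants.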

@jurgenweber

However you tackle this, the idea is great and sorely needed. :)

5 participants