[Meta] Validate Plugins Usage of Opensearch Core System and Hidden Indices #9239

Open
8 of 17 tasks
Rishikesh1159 opened this issue Aug 10, 2023 · 11 comments
Labels
distributed framework enhancement Enhancement or improvement to existing feature or request v2.11.0 Issues and PRs related to version 2.11.0

Comments

@Rishikesh1159
Member

Rishikesh1159 commented Aug 10, 2023

Description/Concept of System Indices and Hidden Indices on Opensearch core:

System Index - For an index to be called a system index, the owning plugin must extend the system index plugin.
Example: Security Plugin correctly extends and uses system Indices, more info here

Hidden index - For an index to be called hidden, the setting SETTING_INDEX_HIDDEN = "index.hidden" must be set in its index settings. It doesn't matter whether the name starts with "." or not.
Example: Asynchronous-Search plugin correctly sets the index setting value here

Misconception:

Many plugins still misunderstand the actual definition of system and hidden indices. The usual misconception is that any index whose name starts with ".", like .indexName, is a system or hidden index, but this is incorrect. Any user can create an index named .indexName that is neither a system nor a hidden index; nothing stops them from doing so.

To avoid this misconception, all plugins should adopt the concept of system and hidden indices defined in OpenSearch core.
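
The distinction can be sketched as a small toy model. This is only an illustration of the definitions above, not OpenSearch code: the `classify` helper and the `.my-plugin-config*` pattern are hypothetical, and the list of patterns stands in for whatever plugins declare through the system index plugin interface.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the index patterns that plugins declare as
# system indices; the pattern below is invented for illustration.
DECLARED_SYSTEM_PATTERNS = [".my-plugin-config*"]


def classify(name, settings, system_patterns=DECLARED_SYSTEM_PATTERNS):
    """Classify an index by how it was declared, never by a leading dot."""
    if any(fnmatchcase(name, pattern) for pattern in system_patterns):
        return "system"
    # "index.hidden" is the setting named in the issue description.
    if str(settings.get("index.hidden", "false")).lower() == "true":
        return "hidden"
    return "regular"


print(classify(".my-plugin-config", {}))                    # declared by a plugin
print(classify(".indexName", {}))                           # dot prefix only
print(classify("audit-results", {"index.hidden": "true"}))  # hidden, no dot
```

In this model, `.indexName` comes out as a regular index because nothing declared it, while `audit-results` is hidden even though its name has no leading dot.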

Goal:

The main ask of this issue is that all plugins using system and hidden indices adopt the concept defined in OpenSearch core.

Any plugin that already follows the OpenSearch core concept of system and hidden indices can ignore this issue and close it as completed on the plugin repo.

Additional info:

The info below is not required for system/hidden indices, but might be useful for plugins using system indices:

If your system indices need the additional security features provided by the security plugin, follow the steps provided here. Note that these are additional features provided by the security plugin and are completely decoupled from the concept of system indices. It is up to the plugin owners to decide if they need these additional security benefits.

Opensearch Plugins:

@Rishikesh1159 Rishikesh1159 added enhancement Enhancement or improvement to existing feature or request untriaged distributed framework v2.10.0 and removed untriaged labels Aug 10, 2023
This was referenced Aug 10, 2023
@peternied
Member

@Rishikesh1159 Since you are looking into system indices: this topic recently came up and might be another source of data.

@cwperks
Member

cwperks commented Aug 14, 2023

Hi @Rishikesh1159 , X-Posting from the security plugin's issue because plugins must also register with the security plugin.

I agree that system indices are confusing. SystemIndexPlugin.getSystemIndexDescriptors is the way to officially declare a system index/system index pattern from a plugin, but from Security's POV there is still one more area to register the index if a plugin wishes to get system index protection from the security plugin. The index would also need to be added in this list

# For example:

plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [".plugins-ml-config", ".plugins-ml-connector", ".plugins-ml-model-group", ".plugins-ml-model", ".plugins-ml-task", ".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opensearch-notifications-*", ".opensearch-notebooks", ".opensearch-observability", ".ql-datasources", ".opendistro-asynchronous-search-response*", ".replication-metadata-store", ".opensearch-knn-models", ".geospatial-ip2geo-data*"]

System index protection means that not even an admin can meddle with the index. The only users permitted to meddle with a system index are a user connecting with the admin certificate, or a plugin after it has stashed the thread context and operates in a trusted local mode.
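
As a rough way to check whether a plugin's index name would be covered by the entries in that list, here is a minimal sketch. It assumes the entries behave like simple `*` wildcards; the security plugin's real matching logic may differ, and the `is_listed_for_protection` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# A few entries copied from the plugins.security.system_indices.indices
# example above; the full list is longer.
SECURITY_SYSTEM_INDICES = [
    ".plugins-ml-config",
    ".opendistro-alerting-alert*",
    ".opensearch-notifications-*",
]


def is_listed_for_protection(index_name, patterns=SECURITY_SYSTEM_INDICES):
    """Return True if index_name matches any entry, treating '*' as a wildcard."""
    return any(fnmatchcase(index_name, pattern) for pattern in patterns)


print(is_listed_for_protection(".opendistro-alerting-alert-history"))
print(is_listed_for_protection(".my-new-plugin-index"))
```

An index that is declared as a system index in core but does not match any entry here would, per the comment above, not get the security plugin's system index protection.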

Not to add more confusion to system indices, but the security plugin also has a notion of protected indices which are indices that are given special protections, but are not system indices.

Do you have any documentation on system indices and what core does specific to system indices? From what I understand OpenSearch autocreates a system index if it has not already been created and a document is indexed. The system indices may also get precedence for queries, but is this documented somewhere outside of code?

@Rishikesh1159
Member Author

Rishikesh1159 commented Aug 24, 2023

Sorry for the late response. Thanks @cwperks and @peternied for chiming in. Yes @cwperks, you are right.

Not to add more confusion, but to reiterate what @cwperks said: there is a distinction between system indices and the security plugin's protected indices.

For an index to be a system index, you don't need to add your index to this list or use protected indices. A system index is completely decoupled from the security plugin and works even without registering with the security plugin.

What @cwperks said above about adding your index to this list or using protected indices is an additional feature provided by the security plugin. It is not mandatory but good to have, and it is recommended for all plugins because it adds a security benefit. It is up to the plugin owners to decide if a plugin needs these additional security benefits.

@cwperks, to answer your question about what core does specifically for system indices: I don't see any documentation outside of code. I can take this up as an action item and add some documentation. Usually system indices hold some metadata about the plugin; all core does is provide an interface for plugins to extend and run some validation checks.

@Rishikesh1159 Rishikesh1159 added the v2.11.0 Issues and PRs related to version 2.11.0 label Sep 25, 2023
@ankitkala
Member

@Rishikesh1159 I think we create "."-prefixed indices in core OpenSearch as well. Are we planning to fix those too?

I know for sure that the task management API (/_tasks) stores its results in the .tasks index. I'm not sure whether there are more such cases, though.

@dhrubo-os

Do we have any documentation now about system indices from core? Did you take the action item you mentioned earlier? I have a few questions about system indices:

  1. What's the benefit of creating a system index? Can any user (including admin) of the cluster see the content of this system index?
  2. What is the benefit of creating a hidden index? If I make a system index hidden, can any user (including admin) of the cluster see the content of this system index?
  3. You said "It is not mandatory but a good to have." Can you please explain the reason?
  4. We have a use case where we want to store some information in an index that can only be accessed by code, not by any user. Currently we use the security plugin so that users can't see the content of this index on a security-enabled cluster. The limitation is that on a cluster with security disabled, this information is open to everybody. Can a hidden system index solve this problem?

@Arpit-Bandejiya
Contributor

Hi @dhrubo-os, these are the benefits of using system/hidden indices that I'm aware of:

  • Dedicated read and write pool.
  • Prioritized recovery for system indices.
  • Enable automated upgrade of indices.
  • There are some dot indices that do not necessarily fit the mold of a normal system index; instead they store data that the system produces with the intent that users can also query against this data. This is mingled with the system index definition today, and should rather move to hidden indices.
  • System index writes can be forced regardless of whether the current indexing pressure is high or not.

@Arpit-Bandejiya
Contributor

Arpit-Bandejiya commented Dec 13, 2023

We have a use case where we want to store some information in an index that can only be accessed by code, not by any user. Currently we use the security plugin so that users can't see the content of this index on a security-enabled cluster. The limitation is that on a cluster with security disabled, this information is open to everybody. Can a hidden system index solve this problem?

If the index isn’t supposed to be queried directly by the user (because that might not make sense or might expose internal implementation details) and should only ever be used by the plugin for its functioning or bookkeeping, such indices should be system indices by design.

Update: However, core does not block writes/searches on system indices. For the system indices defined in core, core is responsible for providing a dedicated thread pool, bypassing the backpressure checks, etc. For access-related functionality, we need to use the system indices defined in the security plugin.

@Arpit-Bandejiya
Contributor

I think we create "."-prefixed indices in core OpenSearch as well. Are we planning to fix those too?

Yes

@ylwu-amzn
Contributor

ylwu-amzn commented Dec 13, 2023

If the index isn’t supposed to be queried directly by the user (because that might not make sense or might expose internal implementation details) and should only ever be used by the plugin for its functioning or bookkeeping, such indices should be system indices by design.

@Arpit-Bandejiya, "such indices should be system indices by design": do you mean the system index protected by the security plugin or the default OS system index (not protected by the security plugin)? I don't think the default OS system index should be used for this case; otherwise a user who has permission on such an index can do anything to it. For example, an admin user may delete the index by mistake, and then the plugin can't work correctly.

@Arpit-Bandejiya
Contributor

Arpit-Bandejiya commented Dec 13, 2023

What I meant here is that indices created by plugins should be kept as system indices for the following reasons:

  1. The backpressure/circuit breakers do not apply to them. We are also going to extend it to create blocks.
  2. The index gets a dedicated read and write pool.

Currently the system indices defined in core basically exist to give extra privileges and to distinguish a normal user index from a plugin index.
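
The two privileges listed above can be pictured with a toy model. This is a sketch under stated assumptions, not how core implements it: the pool names, the numeric pressure values, and both helper functions are hypothetical.

```python
def choose_pool(is_system_index, operation):
    """Pick a thread pool name; operation is "read" or "write".

    The pool names are illustrative placeholders, not OpenSearch identifiers.
    """
    if operation not in ("read", "write"):
        raise ValueError(f"unknown operation: {operation}")
    prefix = "system" if is_system_index else "general"
    return f"{prefix}_{operation}"


def admit_write(is_system_index, current_pressure, limit):
    """System index writes are admitted even when pressure exceeds the limit."""
    return is_system_index or current_pressure <= limit


print(choose_pool(True, "write"))
print(admit_write(True, current_pressure=0.95, limit=0.80))
print(admit_write(False, current_pressure=0.95, limit=0.80))
```

Note that nothing in this model restricts who may read or write the index, which matches the point that access control is a separate concern.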

The security aspect of an index is covered by the security indices. The security indices have their own list, which denies access by the user, and that is a totally different area.

In this issue, we want to make sure that indices created by plugins are marked as system/hidden. If they are hidden, we want to understand why System Indices are not useful for them and what specific behavior plugin teams are looking for.

@Rishikesh1159
Member Author

Rishikesh1159 commented Dec 13, 2023

@dhrubo-os Sorry, I wasn't able to put out any documentation about system indices from core. I will do this soon. To answer your questions:

System index: An index containing configurations and other data used internally by OpenSearch. System indices are not intended for direct access or modification.

Hidden index: A regular index that's "hidden" from wildcard (*) patterns in API requests. The purpose of hidden indices is to store data that the system produces with the intent that users can access and query this data.

What's the benefit of creating a system index? Can any user (including admin) of the cluster see the content of this system index?

  1. Dedicated read and write pool.
  2. Prioritized recovery for system indices.
  3. Enable automated upgrade of indices.
  4. There are some dot indices that do not necessarily fit the mold of a normal system index; instead they store data that the system produces with the intent that users can also query against this data. This is mingled with the system index definition today, and should rather move to hidden indices.
  5. Ease of security configuration for system indices.

Only admins and users with the necessary access permissions can see the content of a system index.

What is the benefit of creating a hidden index? If I make a system index hidden, can any user (including admin) of the cluster see the content of this system index?

Hidden indices store data that the system produces with the intent that users can access and query this data. Hidden indices are excluded from wildcard (*) patterns in API requests. A system index is different from a hidden index, and any user with the right permissions can access hidden indices.
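
A simplified model of that wildcard behavior is sketched below. It is an illustration only: the `resolve` helper and the index names are hypothetical, and it ignores any request options that change how wildcards expand.

```python
from fnmatch import fnmatchcase


def resolve(expression, indices):
    """Resolve an index expression against a dict of name -> settings.

    Exact names always resolve if the index exists; wildcard expressions
    skip indices whose "index.hidden" setting is true.
    """
    if "*" not in expression:
        return [expression] if expression in indices else []
    return sorted(
        name
        for name, settings in indices.items()
        if fnmatchcase(name, expression) and not settings.get("index.hidden", False)
    )


indices = {
    "logs-2023": {},
    "logs-results": {"index.hidden": True},
}
print(resolve("logs-*", indices))        # wildcard skips the hidden index
print(resolve("logs-results", indices))  # explicit name still resolves
```

The hidden index stays reachable by its exact name, so hiding it affects discoverability through wildcards and does not act as an access control.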

You said "It is not mandatory but a good to have." Can you please explain the reason?

As I said previously, the concept of system indices is different from the security plugin's protected indices. The security plugin's protected indices provide additional security benefits to system indices. By default a system index doesn't have these security benefits. It is up to the plugin owner to decide if they want these security benefits for their system index.

We have a use case where we want to store some information in an index that can only be accessed by code, not by any user. Currently we use the security plugin so that users can't see the content of this index on a security-enabled cluster. The limitation is that on a cluster with security disabled, this information is open to everybody. Can a hidden system index solve this problem?

I don't think system indices will be able to solve this problem: on a cluster with security disabled, an admin will still have access to system indices.


7 participants