Implement Protection Flags for Tables and Clusters #1423

Open
donhardman opened this issue Sep 5, 2023 · 11 comments

@donhardman
Contributor

donhardman commented Sep 5, 2023

We need to add new flags to the Manticore Search daemon to safeguard our sharded tables and internal clusters from unauthorized alterations. This will ensure that these crucial components are modified only through Buddy, improving the integrity and security of our system.

The specific tasks are as follows:

  1. Add a flag to the tables' meta-information (including distributed tables) that marks the table as protected from external modifications. The request origin can be checked using the User-Agent header. Once the flag is set on a table, the table cannot be altered or dropped by a query that lacks the Buddy User-Agent header.
  2. Introduce another flag to the tables' meta-information, making them invisible to requests outside of Buddy, similar to how system tables currently function.
  3. Create a flag for specified clusters that prevents them from being deleted and/or modified outside Buddy. This will prevent accidental or malicious deletion of entire clusters.
  4. Establish a flag for clusters that makes them invisible to requests outside of Buddy.
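
To make the four flags above concrete, here is a minimal sketch of how a "protected" and a "hidden" flag could be combined with a User-Agent check. The names `ObjectFlags`, `is_buddy`, `can_modify`, `is_visible`, and the `BUDDY_UA_PREFIX` value are all hypothetical; the thread does not specify the real Buddy User-Agent string or the daemon's data structures.

```python
from dataclasses import dataclass

# Hypothetical marker: the actual Buddy User-Agent string is not given in this thread.
BUDDY_UA_PREFIX = "Manticore Buddy"


@dataclass(frozen=True)
class ObjectFlags:
    """Flags kept in the meta-information of a table or a cluster."""
    protected: bool = False  # tasks 1 and 3: no alter/drop outside Buddy
    hidden: bool = False     # tasks 2 and 4: invisible outside Buddy


def is_buddy(user_agent):
    """True when the request's User-Agent header identifies Buddy."""
    return bool(user_agent) and user_agent.startswith(BUDDY_UA_PREFIX)


def can_modify(flags, user_agent):
    """Protected objects may only be altered or dropped by Buddy."""
    return is_buddy(user_agent) or not flags.protected


def is_visible(flags, user_agent):
    """Hidden objects are only listed for Buddy."""
    return is_buddy(user_agent) or not flags.hidden
```

The same two predicates would apply to tables and clusters alike, since both carry the same pair of flags in this sketch.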

We must also design a method for passing these flags when creating a new cluster/table.

UPDATE from Dmitrii Feb 8 2024

We need the ability to create a distributed table that is protected yet still accessible to the user because the user needs to interact with it: selecting, inserting, and deleting data, but we should prevent altering or dropping the table.

The suggestion is to include a special flag (option) when creating the table. This mechanism will enable the daemon to identify if the table can be modified exclusively through Buddy (Buddy sends its header, and we recognize the request is coming from Buddy) and cannot be altered directly.
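
A rough sketch of the "protected yet accessible" behaviour described in this update: data statements pass through for any client, while schema statements on a protected table are rejected unless the request carries Buddy's header. The first-keyword classification, the `ProtectedTableError` type, and the `BUDDY_UA_PREFIX` value are illustrative assumptions, not how the daemon's parsers work.

```python
# Hypothetical marker: the actual Buddy User-Agent string is not given in this thread.
BUDDY_UA_PREFIX = "Manticore Buddy"

# Statements that change or remove the table itself.
SCHEMA_STATEMENTS = {"ALTER", "DROP"}


class ProtectedTableError(Exception):
    """Raised when a non-Buddy request tries to alter or drop a protected table."""


def check_statement(sql, table_protected, user_agent=None):
    """Return the leading keyword, or raise if the statement is not allowed."""
    words = sql.split()
    keyword = words[0].upper() if words else ""
    from_buddy = bool(user_agent) and user_agent.startswith(BUDDY_UA_PREFIX)
    if table_protected and keyword in SCHEMA_STATEMENTS and not from_buddy:
        raise ProtectedTableError(f"{keyword} is not allowed on a protected table")
    return keyword
```

With this shape, `SELECT`, `INSERT`, and `DELETE` against a protected distributed table keep working for users, and only `ALTER` and `DROP` depend on the request origin.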

@sanikolaev
Collaborator

In short, for now we need to be able to hide/prevent from altering/removing tables and clusters, namely:

  • system tables and clusters related to autosharding
  • final shards themselves
  • sharded distributed tables

Since this is under control of Buddy, it should be impossible to modify these things without it.

@sanikolaev
Collaborator

As discussed in today's dev call, there might be an issue with simply marking a table as invisible. This is because not only will show tables fail to detect it, but distributed tables will also be affected.

We have considered several options:

  1. create table ... hidden=1: This creates table files but doesn't add them to the list of tables.
  2. create table ... flags/hidden/etc: This approach adds flags to the table, or adds the table to a new special list in manticore.json; these are then respected by show tables and data/schema modification queries.
  3. A special database for shards, requiring users to execute use ... before they can view the shard tables/clusters.
  4. Modifying show tables to show tables ... smth, which causes show tables to fail. This would then redirect the request to Buddy to exclude shard tables from the output.
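
The difference between options 1 and 2 can be illustrated with a toy registry. It assumes, purely for illustration, that a distributed table resolves its local shards by name through the same table list that show tables reads; `TableRegistry` and `resolve_distributed` are hypothetical names, not daemon code.

```python
class TableRegistry:
    """Toy stand-in for the daemon's list of tables."""

    def __init__(self):
        self._tables = {}  # table name -> hidden flag

    def register(self, name, hidden=False):
        self._tables[name] = hidden

    def show_tables(self):
        """List only tables that are not flagged as hidden."""
        return sorted(name for name, hidden in self._tables.items() if not hidden)

    def resolve(self, name):
        """Lookup used when a distributed table fans out to a local shard."""
        if name not in self._tables:
            raise KeyError(f"unknown local table: {name}")
        return name


def resolve_distributed(registry, shard_names):
    """Resolve every local shard that a distributed table refers to."""
    return [registry.resolve(name) for name in shard_names]
```

Under option 1 the shard is never registered, so `resolve_distributed` fails for it. Under option 2 the shard is registered with `hidden=True`, so it is absent from `show_tables()` but still resolvable.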

@sanikolaev sanikolaev assigned tomatolog and unassigned klirichek Jan 17, 2024
@sanikolaev sanikolaev removed the rel::upcoming Upcoming release label Jan 17, 2024
@sanikolaev
Collaborator

@tomatolog this task will be a blocker for completing the autosharding and the kafka tasks.
@tomatolog please think about it and let me know what you think. To me

> create table ... flags/hidden/etc: This approach adds flags to the table, or adds the table to a new special list in manticore.json; these are then respected by show tables and data/schema modification queries.

seems optimal. We can then do create table ... hidden='1' in Buddy, which will put the flag hidden=1 on the table in manticore.json and make it impossible to use the table in show tables, alter table, etc. But I may be missing something.

@tomatolog
Contributor

> Since this is under control of Buddy, it should be impossible to modify these things without it.

But when something goes wrong and Buddy cannot handle it, how do we fix everything without manual intervention?

@tomatolog
Contributor

tomatolog commented Jan 17, 2024

> We have considered several options:

I would also consider:

  • use @@system.table_name, since we already dump system info as such tables, and add a filter to all other functions to skip such tables when iterating over tables
  • add support for databases: put system tables into a dedicated database and allow using that database and querying a table as database.table_name or system.table_name, similar to point 3

@tomatolog
Contributor

For the flag approach, instead of create table ... hidden=1 I would use create table ... system_table=1. The option would be stored in manticore.json if it is not replicated across the cluster, or in index_name.settings at the table path if it is replicated across the cluster.
The daemon would then load the option when loading the index and set it in the internal ServedDesc_t structure.

Then all request-handling code should be refactored to skip tables with system_table set for regular requests.
For example, show tables would be an alias for show tables option tables = 'user' and skip tables with system_table set. There would also be the forms show tables option tables = 'system' to show only system tables and show tables option tables = 'all' to show all tables.
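
The three proposed forms of show tables could behave roughly like the sketch below. It models the table list as a plain mapping from table name to its system_table flag; the `show_tables` function and its `scope` parameter are hypothetical stand-ins for the proposed option tables = '...' syntax.

```python
def show_tables(tables, scope="user"):
    """List table names for one of the proposed scopes.

    tables maps a table name to its system_table flag.
    scope is 'user' (the default form), 'system', or 'all'.
    """
    if scope not in ("user", "system", "all"):
        raise ValueError(f"unknown scope: {scope!r}")
    return sorted(
        name
        for name, is_system in tables.items()
        # 'all' keeps everything; otherwise the flag must match the scope.
        if scope == "all" or is_system == (scope == "system")
    )
```

Making 'user' the default keeps the plain show tables output unchanged for existing clients while system tables stay reachable on request.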

@sanikolaev
Collaborator

> but when something goes wrong and buddy can not handle it - how to fix all back without manual intervention?

By emulating Buddy, namely its User-Agent.
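
As an illustration of that recovery path, an operator could build an HTTP request that carries Buddy's User-Agent by hand. The sketch below only constructs the request and does not send it. The `BUDDY_USER_AGENT` value is hypothetical, and the default URL, the /sql?mode=raw path, and the query=... body format are assumptions about a default local setup rather than something stated in this thread.

```python
import urllib.parse
import urllib.request

# Hypothetical value: the real Buddy User-Agent string is not given in this thread.
BUDDY_USER_AGENT = "Manticore Buddy/0.0.0"


def build_buddy_request(sql, endpoint="http://127.0.0.1:9308/sql?mode=raw"):
    """Build (but do not send) a POST request that identifies itself as Buddy.

    The default endpoint is an assumption about a local daemon's HTTP interface.
    """
    body = urllib.parse.urlencode({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"User-Agent": BUDDY_USER_AGENT},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Since any client can set this header, the check guards against accidental changes rather than acting as authentication.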

> add support for database and put systems tables into that database and allow to use database and query table as database.table_name or system.table_name similar to point 3

Would we be able to replicate some of the system tables?

@sanikolaev
Collaborator

As discussed, we can implement it as follows:

  • show tables from system
  • system.table support in sql/http queries
  • show tables shouldn't show system tables
  • no need to implement use, the above should be enough for now
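
A small sketch of the agreed behaviour: a system.table reference is split into a database part and a table part, and show tables lists system tables only when asked with from system. The helper names and the choice to reject any prefix other than system are assumptions made for illustration.

```python
SYSTEM_DB = "system"


def split_table_name(name):
    """Split 'system.shards' into ('system', 'shards'); bare names get database None."""
    database, sep, table = name.partition(".")
    if not sep:
        return None, name
    if database.lower() != SYSTEM_DB or not table:
        raise ValueError(f"unsupported table reference: {name!r}")
    return SYSTEM_DB, table


def show_tables(user_tables, system_tables, from_db=None):
    """Plain 'show tables' lists user tables; 'show tables from system' lists system ones."""
    if from_db is None:
        return sorted(user_tables)
    if from_db.lower() == SYSTEM_DB:
        return sorted(system_tables)
    raise ValueError(f"unknown database: {from_db!r}")
```

Because only the system prefix is recognised, there is no need for a use statement or any notion of a current database in this sketch.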

@donhardman
Contributor Author

We need the ability to create a distributed table that is protected yet still accessible to the user because the user needs to interact with it: selecting, inserting, and deleting data, but we should prevent altering or dropping the table.

The suggestion is to include a special flag (option) when creating the table. This mechanism will enable the daemon to identify if the table can be modified exclusively through Buddy (Buddy sends its header, and we recognize the request is coming from Buddy) and cannot be altered directly.

@tomatolog
Contributor

The latest suggestion is to move the table name rules into their own bison and flex parser and include that in all the main parsers, which will fix the issue that the current code breaks JSON in the select list.
However, that will conflict with the JOIN patch that lives in the branch, so it could be better to continue this work after JOIN is merged into master. #1673

@sanikolaev
Collaborator

sanikolaev commented May 16, 2024

JOIN is now in master. Unblocked.
