Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configuring storage as SETTINGS disk='<disk_name>' (instead of storage_policy) and explicit disk creation SETTINGS disk=disk(<disk_configuration>) #41976

Merged

Conversation

kssenii
Copy link
Member

@kssenii kssenii commented Sep 30, 2022

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Allow configuring storage as SETTINGS disk='<disk_name>' (instead of storage_policy) and with explicit disk creation SETTINGS disk=disk(type=s3, ...)

@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from 9855545 to 51d4181 Compare September 30, 2022 14:37
@robot-ch-test-poll robot-ch-test-poll added the pr-improvement Pull request with some product improvements label Sep 30, 2022
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch 2 times, most recently from 3f3d745 to a08267a Compare September 30, 2022 15:05
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from a08267a to d538b5f Compare September 30, 2022 15:20
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from bc877fe to 7fbc77e Compare September 30, 2022 15:49
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from 9cfe514 to ef0a3df Compare October 3, 2022 13:35
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from a059db9 to 301c7c9 Compare October 21, 2022 22:03
@vitlibar vitlibar self-assigned this Oct 26, 2022
src/Interpreters/Context.h Outdated Show resolved Hide resolved
src/Common/SettingsChanges.h Outdated Show resolved Hide resolved
@@ -109,7 +114,15 @@ bool ParserSetQuery::parseNameValuePair(SettingChange & change, IParser::Pos & p
value = std::make_shared<ASTLiteral>(Field(static_cast<UInt64>(1)));
else if (ParserKeyword("FALSE").ignore(pos, expected))
value = std::make_shared<ASTLiteral>(Field(static_cast<UInt64>(0)));
else if (!value_p.parse(pos, value, expected))
/// for SETTINGS disk=disk(type='s3', path='', ...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be dangerous. I mean what if somebody will write SETTINGS disk=disk(type='local', path='<some_system_path>')

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should accept only disk name, and path can be extracted from ATTACH TABLE FROM <path> (now it looks at user_files) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be dangerous. I mean what if somebody will write SETTINGS disk=disk(type='local', path='<some_system_path>')

and this is allowed, there are even tests for that, BUT it is allowed only if there is an explicit setting in server config that tells which base directory can be used for such out-of-config disks and in code all non-remote disks must lay in this directory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should accept only disk name, and path can be extracted from ATTACH TABLE FROM (now it looks at user_files) ?

hm, this does not seem convenient for me

@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from 7f9ae9c to f0dab99 Compare November 7, 2022 22:28
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from f0dab99 to d876a39 Compare November 7, 2022 23:47
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch 2 times, most recently from a04a938 to 2e69dc6 Compare February 6, 2023 16:25
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from 2e69dc6 to 6cea67d Compare February 6, 2023 17:11
@kssenii kssenii requested a review from vitlibar February 8, 2023 15:14
src/Core/Field.h Outdated Show resolved Hide resolved
src/Disks/DiskSelector.h Outdated Show resolved Hide resolved
@kssenii kssenii force-pushed the allow-single-disk-instead-of-storage-policy branch from f79c398 to 8458c78 Compare February 10, 2023 11:06
@kssenii kssenii merged commit 3be36c7 into ClickHouse:master Feb 10, 2023
@UnamedRus
Copy link
Contributor

Does it support named collections in disk definition (it make sense for s3)?

@kssenii
Copy link
Member Author

kssenii commented Feb 28, 2023

Does it support named collections in disk definition (it make sense for s3)?

I do not see much sense for this, you can use disk = 'disk_name', in this case in my opinion. Or you mean it makes sense because now we can create named collections from sql? Then probably we should just allow to create disks from sql in a similar way instead.
Also I made disk configuration hidden in #46670.

@UnamedRus
Copy link
Contributor

UnamedRus commented Feb 28, 2023

Or you mean it makes sense because now we can create named collections from sql?

It's more for:
Having per table s3 disk path. (so do per table disk)
But reusing the same s3 creds.

Then probably we should just allow to create disks from sql in a similar way instead.

Yeah, it would be nicely.
#29110

@kssenii
Copy link
Member Author

kssenii commented Feb 28, 2023

Or you mean it makes sense because now we can create named collections from sql?
It's more for:
Having per table s3 disk path. (so do per table disk)
But reusing the same s3 creds.

ok, I see, could you please create an issue for this?

@azat
Copy link
Collaborator

azat commented Feb 28, 2023

It's more for:
Having per table s3 disk path. (so do per table disk)
But reusing the same s3 creds.

I thought about this too, but then decided that this could be a security breach, since you can override path/endpoint and change something that was not allowed.

@UnamedRus
Copy link
Contributor

I thought about this too, but then decided that this could be a security breach, since you can override path/endpoint and change something that was not allowed.

But, arent it's already possible via feature like that:

<clickhouse>
        <s3>
                <datalake>
                        <endpoint>https://<BUCKET_NAME>.s3.eu-central-1.amazonaws.com</endpoint>
                        <use_environment_credentials>true</use_environment_credentials>
                       .......
                </datalake>
        </s3>
</clickhouse>

When we define creds per endpoint prefix.

@azat
Copy link
Collaborator

azat commented Feb 28, 2023

But, arent it's already possible via feature like that:

You mean if the endpoint does not limit to some path (i.e. use root bucket without using some prefix) ?
Then I would say that this is a user problem, or maybe it is not even a problem, maybe they have bucket per instance/load/env/...

I was worrying more about when endpoint=https://<BUCKET_NAME>.s3.eu-central-1.amazonaws.com/clickhouse, and this bucket for instance also has https://<BUCKET_NAME>.s3.eu-central-1.amazonaws.com/user_data and if you will allow overriding endpoint then you could allow to override anything in this bucket, including user_data.

@UnamedRus
Copy link
Contributor

I see,

But it the same problem which will happen for PostgreSQL/MySQL creds already?

If you can override table for example

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-improvement Pull request with some product improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants