Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make wildcarded feature work on more patterns #250

Merged
merged 5 commits into from Oct 21, 2022

Conversation

mga-chka
Copy link
Collaborator

@mga-chka mga-chka commented Oct 16, 2022

Description

cf #249
the aim of this PR is to make the wildcarded feature work on usernames that matche one of the 3 following patterns

  • [prefix][wildcard]
  • [wildcard][suffix]
  • [willdcard]

instead of having the feature working only for usernames that matches the pattern [prefix]_[wildcard]

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Checklist

  • Linter passes correctly
  • Add tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary

Does this introduce a breaking change?

  • Yes
  • No

Further comments

@render
Copy link

render bot commented Oct 16, 2022

proxy.go Outdated
found = false
case rp.hasWildcarded:
// checking if we have wildcarded users and if username matches one 3 possibles patterns
for _, user := range rp.users {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we have the following wildcarded users:
business_*
*_analyst
*

And somebody queries with business_analyst? It seems it now depends on how rp.users is loaded for what user settings are used.

Copy link
Collaborator Author

@mga-chka mga-chka Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I will add in the doc this behavior when the admin defines overlapping patterns

proxy_test.go Outdated
expStatusCode: http.StatusUnauthorized,
f: func(p *reverseProxy) *http.Response {
req := httptest.NewRequest("POST", fakeServer.URL, nil)
req.Header.Set("X-ClickHouse-User", "default")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nitpick: We should extract the header names to const to avoid typo's in the tests.

cu.name = name
cu.password = password
case name == "" || name == defaultUser:
// default user can't work with the wildcarded feature for security reasons
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate in the comment what are the security scenarios we're talking about? I'm going to forget it soon as I know myself 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the clickhouse server is ill-configured and the default user is usable it's not a big deal currently since "external" users have to use chproxy which doesn't allow the default user unless the chproxy admin created a default user with no password.
But, with the wildcarded feature and more specifically the * case, an external user/hacker could try to connect to clickhouse using the usr/pwd "default"/"" and access data is not supposed to access.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add that to the doc or comment?

proxy.go Outdated
Comment on lines 659 to 676
s := strings.Split(user.name, "*")
// cf a validation in config.go, the names must contains either a prefix, a suffix or a wildcard
switch {
case s[0] == "" && s[1] == "":
// the wildcarded user is "*"
return getUserInformations(rp, user, name, password)
case s[0] == "":
// the wildcarded user is "*[suffix]"
suffix := s[1]
if strings.HasSuffix(name, suffix) {
return getUserInformations(rp, user, name, password)
}
case s[1] == "":
// the wildcarded user is "[prefix]*"
prefix := s[0]
if strings.HasPrefix(name, prefix) {
return getUserInformations(rp, user, name, password)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we extract it out to dedicated function? This one gets biggy


The current implementation of wildcarded doesn't work with some limits on `out-users` (like the max_concurrent_queries) but works well with all the limits on `in-users`. The limits for `in_users` and `out-users` works as if all the users matching the pattern were the same user.

If the wildcarded users are overlapping, the real users will be attached randomly to one of the wildcarded users. For example, let's say:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it remain consistent for the same user? Meaning if analyst_john-UK is routed to analyst1 out user in first place, will it remain the same for the second query launched by him?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, while chproxy is up then yes it will be consistent.
If chproxy is restarted, it should be consistent (but I'm not sure).
If we change something and the admin updates chproxy then the behavior is unknown.
This is why I decided to put "randomly" and don't dive into details so that chproxy admins are warned that they should be extra careful when the wildcarded users are overlapping

proxy.go Outdated
return found, u, c, cu
}

func getUserInformations(rp *reverseProxy, user *user, name string, password string) (found bool, u *user, c *cluster, cu *clusterUser) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to better name it but initial naming hides the fact that it relates to wildcarded users. Maybe we could completely extract wildcarded users logic away to a dedicated file with possibly dedicated struct if needed?

Suggested change
func getUserInformations(rp *reverseProxy, user *user, name string, password string) (found bool, u *user, c *cluster, cu *clusterUser) {
func (rp *reverseProxy) overrideWildcardedUser(user *user, name string, password string) (found bool, u *user, c *cluster, cu *clusterUser) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a more explictit (but longer) name: generateWildcardedUserInformation
Regarding putting the wildcarded logic in a specfic file, I looked at the implementation and currently there are:

  • 2 lines on config.go
  • 37 lines of code in 2 specific function (I created a specific function like you ask) so I'm not sure it's worth creating a dedicated file. It would make sens if all the code was like that (for example if we were using a middleware pattern)

@@ -579,6 +579,23 @@ type clusterUser struct {
isWildcarded bool
}

func deepCopy(cu *clusterUser) *clusterUser {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why is it needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the current implementation doesn't do a deepcopy and only modify the name & pwd of the cu user.
So, if analyst_john-UK & analyst_jane-UK are doing a query at the same time we can end-up in this situation:

  • cu user name = john & password = pwd1
  • cu user name = jane & password = pwd2
  • the query of john is executed in clickhouse with the credentials jane/pwd2
  • the query of jane is executed in clickhouse with the credentials jane/pwd2

proxy.go Outdated
return found, u, c, cu
}

func findWildcardedUserInformation(rp *reverseProxy, name string, password string) (found bool, u *user, c *cluster, cu *clusterUser) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still make it a method receiver to reverseProxy, instead of a function accepting reverseProxy as parameter, but up to you the final choice

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@mga-chka mga-chka merged commit 99db223 into master Oct 21, 2022
@mga-chka mga-chka deleted the wildcarded_user_improvment branch October 21, 2022 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet

3 participants