Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: support TLS and authentication for Thanos Ruler queries #1939

Merged

Conversation

simonpasquier
Copy link
Contributor

Closes #1778

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Similar to #1838, this PR adds TLS and authentication support to Thanos Ruler for the query API endpoints. The YAML configuration format is very similar to the one used for configuring Alertmanager.

Verification

I have adapted the end-to-end tests to use the new parameters. The tests already exercised static addresses and file SD.

@simonpasquier simonpasquier force-pushed the tls-and-auth-for-query-rules branch 2 times, most recently from 97b6f73 to 51cb94e Compare January 6, 2020 12:40
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Some comments, but generally 👍 👍 👍

Thanks!

}
c.Transport = tracing.HTTPTripperware(logger, c.Transport)
queryClient, err := http_util.NewFanoutClient(logger, cfg.EndpointsConfig, c, queryProvider.Clone())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fanout? I think the logic for rule evaluation might be different. It's fanout = broadcast approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, the logic is ok, just the Name of client might be confusing. It's not Fanout for Querier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

// Each Alertmanager client has a different list of targets thus each needs its own DNS provider.
am, err := alert.NewAlertmanager(logger, cfg, amProvider.Clone())
amClient, err := http_util.NewFanoutClient(logger, cfg.EndpointsConfig, c, amProvider.Clone())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add tracing Tripperware as well I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Discover and resolve query addresses.
{
for _, c := range queryClients {
addToGroup(g, c, dnsSDInterval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not consistent with alertmanager addToGroup placement. Let's decide on one approach if we can (:

Reason: I spent some time trying to understand if we missed addToGroup for query or not (:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the call to addToGroup to the loop creating the query clients.

@@ -99,6 +97,8 @@ func registerRule(m map[string]setupFunc, app *kingpin.Application) {
queries := cmd.Flag("query", "Addresses of statically configured query API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect query API servers through respective DNS lookups.").
PlaceHolder("<query>").Strings()

queryConfig := extflag.RegisterPathOrContent(cmd, "query.config", "YAML file that contains query API servers configuration. See format details: https://thanos.io/components/rule.md/#configuration. If defined, it takes precedence over the '--query' and '--query.sd-files' flags.", false)

fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contain addresses of query peers. The path can be a glob pattern (repeatable).").
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contain addresses of query peers. The path can be a glob pattern (repeatable).").
fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contains addresses of query API servers. The path can be a glob pattern (repeatable).").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// TODO(bwplotka): Propagate those to UI, probably requires changing rule manager code ):
level.Warn(logger).Log("warnings", strings.Join(warns, ", "), "query", q)
if err != nil {
level.Error(logger).Log("err", err, "query", q)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's do continue instead of else - bit more readable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
return v, nil
}
}
return nil, errors.Errorf("no query peer reachable")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, errors.Errorf("no query peer reachable")
return nil, errors.Errorf("no query API server reachable")

The peer name is obsolete: Came from gossip times. (:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
return nil, errors.Errorf("no query peer reachable")
}
}

func addToGroup(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not only add, it adds for DNS & file discovery, so maybe:

Suggested change
func addToGroup(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {
func addDiscoveryGroups(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -738,36 +719,49 @@ func queryFunc(
}

return func(ctx context.Context, q string, t time.Time) (promql.Vector, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope we can proper timeout set on caller top this (:

pkg/http/http.go Outdated
}

// FanoutClient represents a client that can send requests to a cluster of HTTP-based endpoints.
type FanoutClient struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, fanout might be bad wording here, why not just Client?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm bad at naming 😄 I think I dismissedClient because it would be confusing with http.Client but it 's indeed better than the incorrect FanoutClient.

span.Finish()
for _, i := range rand.Perm(len(queriers)) {
querier := queriers[i]
c := promclient.NewClient(logger, querier)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not creating this at the start of ruler? (:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@simonpasquier simonpasquier force-pushed the tls-and-auth-for-query-rules branch 2 times, most recently from ee16698 to 8c177a6 Compare January 8, 2020 15:37
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this and addressing all comments! Super tiny nits and we can merge IMO 👍

pkg/http/http.go Outdated Show resolved Hide resolved
test/e2e/rule_test.go Outdated Show resolved Hide resolved
@bwplotka
Copy link
Member

bwplotka commented Jan 9, 2020

Rebase needed ):

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
@bwplotka bwplotka merged commit 1a419c2 into thanos-io:master Jan 9, 2020
@simonpasquier simonpasquier deleted the tls-and-auth-for-query-rules branch January 10, 2020 08:13
@eswdd eswdd mentioned this pull request Mar 10, 2020
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ruler: Add authentication options query
2 participants