Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PushOnlyIngester gRPC service; Disable label name validation #3736

Merged

Conversation

trevorwhitney
Copy link
Contributor

What this PR does:
This PR enables users who want to store non-conventional prometheus data in cortex by exposing the ability to disable label name validation via a request parameter. This option is currently available to in the distributor config, but that requires you to run a whole separate distributor to handle requests for this type. By exposing this option through the WriteRequest, a single distributor can handle both payload types.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@jtlisi jtlisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, modulo linting fix

@@ -4,6 +4,7 @@ import (
"bytes"
"context"
"fmt"
"google.golang.org/grpc/status"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import needs to be separate from the standard library imports. We run rather strict linter settings for the project and this is triggering the CI failure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, good to know, thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a bit more detail about the linting: https://cortexmetrics.io/docs/contributing/#formatting

@roidelapluie
Copy link
Contributor

What is the usecase here? I expect promql not to work without proper metric names.

@roidelapluie
Copy link
Contributor

I also quote @pstibrany

Now the question is, do we want to be a generic TSDB? I don't know, I want it to be.

I don't think Cortex should try to be a generic TSDB. There are already projects that do that better, and Cortex' strength is in making Prometheus horizontally-scalable (wherever that eventually leads). If there is low-hanging fruit to make some use-cases easier, I'm all for it, but generic TSDB is completely different beast altogether (starting with query language, underlying storage, architecture choices).

#3708

@pstibrany
Copy link
Contributor

pstibrany commented Jan 25, 2021

What is the usecase here? I expect promql not to work without proper metric names.

As far as I understand it, the use case here is to add a support for storing a non-Prometheus monitoring data into Cortex. This would be achieved by 1) having a proxy for incoming data that would sit in front of distributor, and call Push on distributor via gRPC with this new flag set to true (skip validation). 2) On query time, there would be a special querier that doesn't use PromQL at all, but use Cortex' store-gateway to fetch required data.

Additional support in Cortex needed to make this work is pretty small: disable label validation (this PR), and expose distributor push via gRPC. I think this falls well within "low-hanging fruit to make some use-cases easier".

@@ -33,6 +33,7 @@ message WriteRequest {
}
SourceEnum Source = 2;
repeated MetricMetadata metadata = 3 [(gogoproto.nullable) = true];
bool skip_label_name_validation = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should use google.protobuf.Any hints approach here. Strictly speaking this isn't supported by Ingester service, so it doesn't belong here. But Distributor implements the same method. (When we expose Distributor via gRPC, I expect it will reuse the same Push method, with cortex.WriteRequest and cortex.WriteResponse)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep WriteRequest compatible with Prometheus one. If we use the field number 4, it's very likely it will clash next time Prometheus remote write adds a new field. To fix this, we could use a very high field number (eg. 1000).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pstibrany is the google.protobuf.Any hints approach like the options field in the rules proto (

repeated google.protobuf.Any options = 9;
), making this field extensible in the case of future config options? Or am I missing the boat here?

@@ -366,9 +366,9 @@ func (d *Distributor) checkSample(ctx context.Context, userID, cluster, replica
// Validates a single series from a write request. Will remove labels if
// any are configured to be dropped for the user ID.
// Returns the validated series with it's labels/samples, and any error.
func (d *Distributor) validateSeries(ts ingester_client.PreallocTimeseries, userID string) (client.PreallocTimeseries, error) {
func (d *Distributor) validateSeries(ts ingester_client.PreallocTimeseries, userID string, skipLabelNameValidation bool) (ingester_client.PreallocTimeseries, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! I've sent separate PR #3738 to fix double-import of ingester_client package in this file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've just merged #3738. Could you rebase master, please?

@@ -33,6 +33,7 @@ message WriteRequest {
}
SourceEnum Source = 2;
repeated MetricMetadata metadata = 3 [(gogoproto.nullable) = true];
bool skip_label_name_validation = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep WriteRequest compatible with Prometheus one. If we use the field number 4, it's very likely it will clash next time Prometheus remote write adds a new field. To fix this, we could use a very high field number (eg. 1000).

@@ -366,9 +366,9 @@ func (d *Distributor) checkSample(ctx context.Context, userID, cluster, replica
// Validates a single series from a write request. Will remove labels if
// any are configured to be dropped for the user ID.
// Returns the validated series with it's labels/samples, and any error.
func (d *Distributor) validateSeries(ts ingester_client.PreallocTimeseries, userID string) (client.PreallocTimeseries, error) {
func (d *Distributor) validateSeries(ts ingester_client.PreallocTimeseries, userID string, skipLabelNameValidation bool) (ingester_client.PreallocTimeseries, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've just merged #3738. Could you rebase master, please?

@@ -495,7 +495,8 @@ func (d *Distributor) Push(ctx context.Context, req *client.WriteRequest) (*clie
return nil, err
}

validatedSeries, err := d.validateSeries(ts, userID)
skipLabelNameValidation := d.cfg.SkipLabelNameValidation || req.GetSkipLabelNameValidation()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@replay SkipLabelNameValidation was added to the config in #2835, based on needs of a downstream project. Do we still need it in the config or can we remote it favour of specifying it directly in the request? If it's needed in the config too, then OK, otherwise I would prefer to keep code cleaner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we won't need the SkipLabelNameValidation in the config struct anymore.

Would it be fine for you if we remove that flag from the Distributor config in a separate PR after this one has been merged? That would help because otherwise merging this PR is going to break the down-stream CI tests of projects relying on the Distributor config flag, if we remove it in a separate PR we can first switch the down-stream users to rely on this new per-request flag and only after that we remove the Distributor config flag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine if you will do it in a separate PR.

Signed-off-by: Trevor Whitney <trevorjwhitney@gmail.com>
* distributor needs to be registered with grpc server so
  that other services have this new validation bypass option
  when talking to distributor directly
* util used to translate http -> grpc always sets
  SkipLabelNameValidation to false to force validation on all
  http requests as only gRPC requests should be allowed to bypass
  this validation

Signed-off-by: Trevor Whitney <trevorjwhitney@gmail.com>
@trevorwhitney
Copy link
Contributor Author

Thanks everyone, just pushed up a commit that hopefully resolves above mentioned issues.

  • created a new grpc service with just the Push method to expose the distributor via grpc
  • changed the grpc field position in WriteRequest to 1000 to guard against future changes to upstream WriteRequest in prometheus
  • updated the util/push/push.go to always set the SkipLabelNameValidation property to false so that all http request will continue to be validated.

Copy link
Contributor

@replay replay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

(trevor, note that I'm not a maintainer, so my approval doesn't count towards the requirement of 2 to merge)

@trevorwhitney
Copy link
Contributor Author

haha, @replay appreciate the support nonetheless!

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

(Nit: it seems little strange that "cortex.Distributor" gRPC service is defined in ingester package)

@trevorwhitney
Copy link
Contributor Author

@pstibrany how would you feel about renaming the new service to PushIngestor instead of Distrubutor, since that's really what it is? That way it would probably make more sense to keep in the ingester package.

@pstibrany
Copy link
Contributor

@pstibrany how would you feel about renaming the new service to PushIngestor instead of Distrubutor, since that's really what it is? That way it would probably make more sense to keep in the ingester package.

I don’t have a strong preference, and also I’m fine with keeping it as is.

Signed-off-by: Trevor Whitney <trevorjwhitney@gmail.com>
@trevorwhitney
Copy link
Contributor Author

I went ahead with the rename of Distributor service to PushOnlyIngester as I think (as @pstibrany pointed out) that makes more sense for something in the ingester package. I think this is ready for a merge, but I don't have write permission to do so. Thanks!

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Please remind to open a PR to remove the "skip label name validation" from config, since you mentioned you don't need to anymore.

@pracucci pracucci merged commit e414714 into cortexproject:master Jan 27, 2021
@trevorwhitney trevorwhitney deleted the disable-label-name-validation branch January 27, 2021 17:14
@bboreham bboreham changed the title Disable label name validation Add PushOnlyIngester gRPC service; Disable label name validation Feb 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants