Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement(http source): URL path configuration flexibility, path also exposed as a new field #6626

Merged
merged 7 commits into from
Mar 10, 2021

Conversation

prognant
Copy link
Contributor

@prognant prognant commented Mar 4, 2021

Closes #5112

Add three options to the http source:

  • url_path: the path where log event should be posted (default: /)
  • path_key: the name of the key used to export the path for each log even (default: path)
  • strict_path: whether strict path should be enforce, if true only event send to the exact url_path will be accepted, if false log sent to a path that start with the value of url_path will be accepted

For example:

url_path: "/events"
strict_path: false

Will accept events posted to "/events/1234" and "/events/5678".

Outside the http source there is no behaviour change.

Open question: should we accept all URLs for the heroku_logs source as it would allow a user to send logs from multiple app to the same source.

@prognant prognant requested review from a team and bruceg and removed request for a team March 4, 2021 17:29
@binarylogic binarylogic requested review from jszwedko and removed request for a team March 4, 2021 18:25
Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @prognant !

I left one comment about configuration, but otherwise I think this looks good. I appreciate the comprehensive tests as well as support for url_path and strict_path config options.

In my opinion, we should expand this support to heroku_logs and any other HTTP-based sources for consistency. I think this would include: prometheus_remote_write, splunk_hec, aws_kinesis_firehose. We can split that off into a separate issue though.

@@ -20,41 +17,47 @@ use tokio_util::codec::Decoder;
use warp::http::{HeaderMap, HeaderValue, StatusCode};

#[derive(Deserialize, Serialize, Debug, Clone)]
#[serde(default)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this (plus the Default trait implementation below) will have the undesired effect of not requiring the user to specify the address field in their config. We currently require this field.

On the other hand, I could see trying to choose a sensible default for it like you did here. I might suggest something other than :80 as that is likely to be blocked.

For now, I'd recommend just replacing the #[serde(default)]s below and removing it from the struct. You can provide a default value for the field with something like:

https://github.com/timberio/vector/blob/c79545c74bb981c66e1082c12d635240c777aefd/src/sources/syslog.rs#L40-L41

https://github.com/timberio/vector/blob/c79545c74bb981c66e1082c12d635240c777aefd/src/sources/syslog.rs#L64-L66

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not changing the default behaviour should be favoured when possible, I followed your suggestion to keep the current mandatory behaviour for the address setting and rely on function to provided other default values.

@@ -107,8 +107,38 @@ components: sources: http: {
}
}
}
url_path: {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think of just calling this path? I think, in the context it appears, it'll be pretty clear it refers to an HTTP URL path.

@prognant prognant force-pushed the prognant/5112-path-handling-for-http-sources branch 2 times, most recently from e3c0538 to 7527840 Compare March 5, 2021 12:27
@prognant
Copy link
Contributor Author

prognant commented Mar 5, 2021

Thanks for the review, I addressed your comments and created a follow-up issue to leverage this change in other relevant sources.

Nice work @prognant !

I left one comment about configuration, but otherwise I think this looks good. I appreciate the comprehensive tests as well as support for url_path and strict_path config options.

In my opinion, we should expand this support to heroku_logs and any other HTTP-based sources for consistency. I think this would include: prometheus_remote_write, splunk_hec, aws_kinesis_firehose. We can split that off into a separate issue though.

Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @prognant ! I noticed one more thing, but otherwise this looks good to me.

@@ -30,6 +30,12 @@ pub struct SimpleHttpConfig {
query_parameters: Vec<String>,
tls: Option<TlsConfig>,
auth: Option<HttpSourceAuthConfig>,
#[serde(default = "crate::serde::default_true")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL we have these helpers

.await;

assert_eq!(
405,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that 405, "Method not allowed", is a bit of an odd error for this condition. I would have expected a 404.

This looks like a bug in warp: seanmonstar/warp#77

I think you could work around this by, instead of relying on warp's error handling with path::end, just accepting all requests and then, in the handler itself, when strict_match is true, check that the path matches and return a 404 manually if not.

Copy link
Member

@bruceg bruceg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good, except for the HTTP response code issue identified by @jszwedko. Otherwise, I just have a few wording suggestions for the docs.

docs/reference/components/sources/http.cue Outdated Show resolved Hide resolved
docs/reference/components/sources/http.cue Outdated Show resolved Hide resolved
docs/reference/components/sources/http.cue Outdated Show resolved Hide resolved
docs/reference/components/sources/http.cue Outdated Show resolved Hide resolved
docs/reference/components/sources/http.cue Outdated Show resolved Hide resolved
for event in events.iter_mut() {
event
.as_mut_log()
.insert(key, Value::from(path.to_string()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW this should be equivalent to:

Suggested change
.insert(key, Value::from(path.to_string()));
.insert(key, path.to_string().into());

(Not really a suggestion to change it, just a note)

samgiles and others added 6 commits March 9, 2021 12:23
…source

Signed-off-by: Sam Giles <sam.giles@citymapper.com>
Allow the HTTP source to use a customised path (instead of / previously).
The URL path is also extracted into the "path" key (can be changed by
configuration).

Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Co-authored-by: Bruce Guenter <bruce@timber.io>
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
@prognant prognant force-pushed the prognant/5112-path-handling-for-http-sources branch from a9acf69 to 9a8556e Compare March 9, 2021 11:27
Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

debug!(message = "Path rejected.");
Err(warp::reject::custom(
ErrorMessage::new(StatusCode::NOT_FOUND,
StatusCode::NOT_FOUND.canonical_reason().unwrap().to_string())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use unwrap_or_default() here? I realize that this particular code always has a canonical reason, but unwrap()s tend to jump out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds safer. In this case I think we may directly go for "Not found".to_string() WDYT ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also fine with that 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there is also a warp::reject::not_found() you could use instead of warp::reject::custom():

https://docs.rs/warp/0.2.5/warp/reject/fn.not_found.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to used warp::reject::not_found() but it seems we hit the same issue you shared earlier (seanmonstar/warp#77) that why I moved forward reusing the ErrorMessage struct that already has proper handling code to propagate a http return code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, gotcha, thanks for the note 👍

Signed-off-by: prognant <pierre.rognant@datadoghq.com>
@prognant prognant requested a review from bruceg March 9, 2021 16:06
@prognant prognant merged commit 76ef136 into master Mar 10, 2021
@prognant prognant deleted the prognant/5112-path-handling-for-http-sources branch March 10, 2021 13:34
@binarylogic
Copy link
Contributor

Nice work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add path event key to http source
5 participants