Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve URL parsing in 'hosts' to take a full URL #363

Closed
wants to merge 1 commit into from

Conversation

@andrewvc
Copy link
Contributor

andrewvc commented Feb 10, 2016

This will also warn on more error cases, like localhost:9200/ /cc @pmusa

@elasticsearch-release

This comment has been minimized.

Copy link

elasticsearch-release commented Feb 10, 2016

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run; then say 'jenkins, test it'.

# Enable SSL/TLS secured communication to Elasticsearch cluster. Leaving this unspecified will use whatever scheme
# is specified in the URLs listed in 'hosts'. If no explicit protocol is specified plain HTTP will be used.
# If SSL is explicitly disabled here the plugin will refuse to start if an HTTPS URL is given in 'hosts'
config :ssl, :validate => :boolean

This comment has been minimized.

Copy link
@suyograo

suyograo Feb 10, 2016

Contributor

If ssl is true and url does not have protocol, do we default to https?

If SSL is explicitly disabled here the plugin will refuse to start if an HTTPS URL is given in 'hosts'
good

This comment has been minimized.

Copy link
@suyograo

suyograo Feb 10, 2016

Contributor

Ok, checked the code. Does the right thing with ssl=>true

@@ -24,6 +24,10 @@
eso.close
end

describe "setting the buffer timeout" do

This comment has been minimized.

Copy link
@suyograo

suyograo Feb 10, 2016

Contributor

remove this?

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

Oops! Good catch.

}.to raise_error(LogStash::ConfigurationError)
end

it "should parse full URLs correctly" do

This comment has been minimized.

Copy link
@suyograo

suyograo Feb 10, 2016

Contributor

Can we add one test for trailing / ? http://#{hostname_port}/

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

For sure

@@ -1,3 +1,6 @@
## 2.5.0
- Host settings now are more robust to bad input and can take (almost) full URLs

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

almost?

Also, I would mention the URL part first since that's the major change. Possibly make this two entries; one about URLs acceptance, and one about the input validation

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

can do!

describe "Host/URL Parsing" do
subject { described_class.new(base_options) }

let(:hostname) { "my.hostname" }

This comment has been minimized.

Copy link
@suyograo

suyograo Feb 10, 2016

Contributor

Also, one more test for hostname with -. frontend-logs.abc.com

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

If you don't mind I'll just put the dash in the hostname for completeness.

@@ -74,6 +74,8 @@ def self.included(mod)
# Remember the `http` protocol uses the http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-http.html#modules-http[http] address (eg. 9200, not 9300).
# `"127.0.0.1"`
# `["127.0.0.1:9200","127.0.0.2:9200"]`
# `["https://127.0.0.1:9200"]`

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

Let's have an example w/ a url without a port.

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

Can do!

@@ -136,6 +128,54 @@ def build_client(options)
Elasticsearch::Client.new(client_options)
end

HOSTNAME_PORT_REGEX=/\A(?<hostname>([A-Za-z0-9\.\-]+))(:(?<port>\d+))?\Z/
# Parse a configuration host to a normalized URL
def host_to_url(host, ssl=nil, path=nil)

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

I don't see any instance variables in use here. Is that right? If so, this can probably go in its own module (outside a class instance)

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

Is this a general principle? In most of the LS codebase I see people weigh single responsibility principle against too many classes.

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

For this specific case, I don't have strong feelings, so as a way to not add noise to a review, I retract my comment :)

This comment has been minimized.

Copy link
@andrewvc
@@ -136,6 +128,54 @@ def build_client(options)
Elasticsearch::Client.new(client_options)
end

HOSTNAME_PORT_REGEX=/\A(?<hostname>([A-Za-z0-9\.\-]+))(:(?<port>\d+))?\Z/

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

I'm not sure it's important, but this will fail to match IPv6 address:port combos which usually show up as [ipv6]:port (with the brackets, like localhost:9200 ipv6 would be [::1]:9200

We can just make a note about this and move on. I'm not sure many folks will need that anyway when DNS is usually simpler to work with ;P

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

That is a good point! I think I should patch this to work.

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

What do you think of this regex?
\A(?<hostname>([A-Za-z0-9\.\-]+)|\[[0-9\:]+\])(:(?<port>\d+))?\Z

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

I should say
\A(?<hostname>([A-Za-z0-9\.\-]+)|\[[0-9A-Fa-f\:]+\])(:(?<port>\d+))?\Z

HOSTNAME_PORT_REGEX=/\A(?<hostname>([A-Za-z0-9\.\-]+))(:(?<port>\d+))?\Z/
# Parse a configuration host to a normalized URL
def host_to_url(host, ssl=nil, path=nil)
explicit_scheme = if ssl.nil?

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

feels like this would be better as a case statement instead of a if/else+ternary

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

will fix

end

url = nil
if host =~ /\A#{URI::regexp(['http', 'https'])}\z/

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

Can you make this regex a constant?

url = nil
if host =~ /\A#{URI::regexp(['http', 'https'])}\z/
url = URI.parse(host)
if url.scheme == "http" && ssl == true

This comment has been minimized.

Copy link
@jordansissel

jordansissel Feb 10, 2016

Contributor

style: instead of ssl == true and later ssl == false -- thoughts on just using ssl as the value?

like: url.scheme == "http" && ssl

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

Ah, but we're handling the nil case differently!

This comment has been minimized.

Copy link
@andrewvc

andrewvc Feb 10, 2016

Author Contributor

I should probably leave a comment about that.

@andrewvc andrewvc force-pushed the andrewvc:fix_url_parsing branch from 3a283f9 to e0e7806 Feb 10, 2016
@jordansissel

This comment has been minimized.

Copy link
Contributor

jordansissel commented Feb 10, 2016

Read over the specs and they look good to me. Conditional LGTM if travis passes and the minor comments from myself and @suyograo are addressed.

@jordansissel

This comment has been minimized.

Copy link
Contributor

jordansissel commented Feb 10, 2016

Travis failed:

....F.................[2016-02-10 22:06:42,053][INFO ][rest.suppressed          ] /some-path//_template/logstash Params: {id=logstash, index=some-path, type=_template}
[some-path] IndexNotFoundException[no such index]

https://travis-ci.org/logstash-plugins/logstash-output-elasticsearch/builds/108396566#L257

@andrewvc andrewvc force-pushed the andrewvc:fix_url_parsing branch from e0e7806 to cc96e2b Feb 10, 2016
@andrewvc andrewvc force-pushed the andrewvc:fix_url_parsing branch from cc96e2b to 6e55efb Feb 10, 2016
@andrewvc

This comment has been minimized.

Copy link
Contributor Author

andrewvc commented Feb 10, 2016

Just addressed the minor comments with a force push + fixed the failing test (the expectation was bad).

@suyograo

This comment has been minimized.

Copy link
Contributor

suyograo commented Feb 10, 2016

LGTM

@elasticsearch-bot

This comment has been minimized.

Copy link

elasticsearch-bot commented Feb 10, 2016

Andrew Cholakian merged this into the following branches!

Branch Commits
master 1e58b96
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.