Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change structure of URL #7

Merged
merged 4 commits into from
Jun 6, 2018
Merged

Change structure of URL #7

merged 4 commits into from
Jun 6, 2018

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented May 30, 2018

So far the url structure was heavily inspired by whatwg/url#337. I initially only wanted to make some tweaks to it to improve querying but I realised I never fully felt comfortable with the field names used here. So I started to look at the url parser of different languages like Go, Ruby, Python and the output they provide are surprisingly similar but not consistent with whatwg. The change made here brings the field names closer to what most url parsers output.

README.md Outdated
| <a name="url.port"></a>`url.port` | The port of the request, e.g. 443. | long | | `443` |
| <a name="url.path"></a>`url.path` | The path of the request, e.g. "/search". | text | | |
| <a name="url.path.raw"></a>`url.path.raw` | The url path. This is a non-analyzed field that is useful for aggregations. | keyword | 1 | |
| <a name="url.query"></a>`url.query` | The search describes the query string of the request, e.g. "q=elasticsearch".<br/>The `?` is excluded from the query string. In case an URL contains no `?` it is expected that the query field is left out. In case there is a `?` but no query, the query field is expected to exist with an empty string. Like this the `exists` query can be used to differentiate between the two cases. | text | | |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“The search” still references the old field name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

README.md Outdated
| <a name="url.hash"></a>`url.hash` | The hash of the request URL, e.g. "top". | keyword | | |
| <a name="url.scheme"></a>`url.scheme` | The scheme of the request, e.g. "https".<br/>Note: The `:` is not part of the scheme. | keyword | | `https` |
| <a name="url.hostname"></a>`url.hostname` | The hostname of the request, e.g. "example.com".<br/>For correlation the this field can be copied into the `host.name` field. | keyword | | `elastic.co` |
| <a name="url.port"></a>`url.port` | The port of the request, e.g. 443. | long | | `443` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integer type should be large enough, is there any reason for going with long instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ES it does not really matter from a compression perspective if it is set as integer or long so we default to long. Agree that it is not needed here but it also does not make a difference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still makes sense IMO to go with integer to signal to anyone implementing ECS the appropriate type to use in their implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to integer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mpdreamz The template example Elasticsearch template still has long inside because I reuse the code from Beats and there it automatically converts all integer to long for the template generation. Please ignore that for now.

schemas/url.yml Outdated
- name: port
type: keyword
type: long
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

integer since port numbers are unsigned 16-bit integers? Nitpicking :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And again already covered in a previous comment, I'll see myself out :)

So far the url structure was heavily inspired by whatwg/url#337. I initially only wanted to make some tweaks to it to improve querying but I realised I never fully felt comfortable with the field names used here. So I started to look at the url parser of different languages like Go, Ruby, Python and the output they provide are surprisingly similar but not consistent with whatwg. The change made here brings the field names closer to what most url parsers output.
@andrewkroh andrewkroh merged commit 4388c69 into elastic:master Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants