[DOCS] Add new topic about data deduplication #15171

Merged
7 commits merged into elastic:master on Feb 5, 2020

Conversation

dedemorton
Contributor

@dedemorton dedemorton commented Dec 18, 2019

Closes #13739

@dedemorton dedemorton added the docs, in progress, needs_backport, and Team:Beats labels Dec 18, 2019
@dedemorton
Contributor Author

dedemorton commented Jan 16, 2020

@ycombinator I can't get the add_id or fingerprint processors to work in my config (using Metricbeat 7.6 BC1). I see the following error:

2020-01-16T15:36:33.232-0800	ERROR	instance/beat.go:933	Exiting: error initializing 
processors: the processor action add_id does not exist. Valid actions: add_labels, copy_fields, 
decompress_gzip_field, rename, community_id, registered_domain, add_observer_metadata, 
decode_base64_field, drop_event, drop_fields, add_cloud_metadata, add_host_metadata, 
add_locale, add_kubernetes_metadata, add_fields, add_tags, decode_json_fields, truncate_fields, 
dissect, extract_array, include_fields, add_process_metadata, convert, dns, add_docker_metadata
Exiting: error initializing processors: the processor action add_id does not exist. Valid actions: 
add_labels, copy_fields, decompress_gzip_field, rename, community_id, registered_domain, 
add_observer_metadata, decode_base64_field, drop_event, drop_fields, add_cloud_metadata, 
add_host_metadata, add_locale, add_kubernetes_metadata, add_fields, add_tags, 
decode_json_fields, truncate_fields, dissect, extract_array, include_fields, add_process_metadata, 
convert, dns, add_docker_metadata
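
For context, the error above is what the Beat reports when the config names a processor that is missing from its list of valid actions. The comment does not include the config itself, so the fragment below is only a sketch of the kind of `processors` section that would trigger it. The `~` (null) options value is an assumption about how the processor was configured.

```yaml
# metricbeat.yml (fragment): illustrative only, not the config from the comment above
processors:
  # Ask the Beat to attach a generated ID to every event.
  # On a build where add_id is not registered, startup fails with
  # "the processor action add_id does not exist".
  - add_id: ~
```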

@ycombinator
Contributor

@dedemorton Thanks for bringing this up. I am able to reproduce this issue locally and am working on a fix now. There's a chance that the fix may not go into 7.6.0 (but we might be able to get it into 7.6.1).

@ycombinator
Contributor

@dedemorton The PR to fix the issue is up: #15624.

@dedemorton dedemorton removed the in progress label Jan 25, 2020
@dedemorton
Contributor Author

@ycombinator This is ready for review. I wish I had more time to add realistic examples, but this at least gets the basic content in place with simple examples. It would be really nice to add an extended example that shows all the bits (log file, filebeat configs, LS configs) and tells a story. Since the processors were broken until recently, I didn't have time to come up with something better. Maybe we can add the blog later?

@dedemorton dedemorton changed the title from "[WIP][DOCS] Add new topic about data deduplication" to "[DOCS] Add new topic about data deduplication" Jan 25, 2020
@urso

urso commented Jan 27, 2020

@dedemorton Elasticsearch input also defaults to @metadata._id. We are switching to @metadata._id and adding the missing support for the document_id setting in the decode_json_fields processor: #15859

@dedemorton
Contributor Author

@ycombinator I've addressed your feedback and updated the field name to _id.

I think I should also add an example of decode_json_fields now that you can use it to set the id. WDYT?
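
A sketch of what the `_id` field name looks like in a config, assuming the `fingerprint` processor's `fields` and `target_field` options. `field1` and `field2` are placeholder field names, not fields taken from this PR.

```yaml
processors:
  # Hash the listed fields into a deterministic value and store it in
  # @metadata._id, so that re-sent copies of an event carry the same ID.
  - fingerprint:
      fields: ["field1", "field2"]   # placeholder field names
      target_field: "@metadata._id"
```

Because the ID is derived from the event's content, a retried event should overwrite the existing document rather than create a duplicate, provided the output uses `@metadata._id` as the document ID.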

@ycombinator
Contributor

> I think I should also add an example of decode_json_fields now that you can use it to set the id. WDYT?

Agreed.

Otherwise, LGTM.

@dedemorton
Contributor Author

@ycombinator Please take a look at the latest changes. I've added a section for the decode_json_fields processor, but could not test it yet because the document_id setting does not seem to be valid in the latest build candidate (BC4).
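
For readers following along, a minimal sketch of such a config, assuming the `document_id` setting behaves as described in #15859. `myid` is a hypothetical JSON key and `message` an assumed source field. As noted above, BC4 did not accept this setting, so the fragment is untested.

```yaml
processors:
  - decode_json_fields:
      fields: ["message"]   # field holding the JSON string (assumed)
      document_id: "myid"   # hypothetical key whose value becomes the event ID
      max_depth: 1
      target: ""            # merge decoded keys into the event root
```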

Contributor

@ycombinator ycombinator left a comment

LGTM.

@dedemorton dedemorton merged commit a00ae65 into elastic:master Feb 5, 2020
@dedemorton dedemorton deleted the issue#13739 branch February 5, 2020 20:36
@zube zube bot added [zube]: Done and removed [zube]: Inbox labels Feb 5, 2020
dedemorton added a commit to dedemorton/beats that referenced this pull request Feb 5, 2020
dedemorton added a commit to dedemorton/beats that referenced this pull request Feb 5, 2020
@dedemorton dedemorton removed the needs_backport label Feb 5, 2020
@andresrc andresrc added the Team:Integrations label Mar 6, 2020
Labels
docs, Team:Integrations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update documentation with document_id
4 participants