Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ingest: processor stats #34724

Merged
merged 3 commits into from
Oct 23, 2018

Conversation

jakelandis
Copy link
Contributor

@jakelandis jakelandis commented Oct 22, 2018

Note - this change was originally merge with PR #34202, but was reverted due to a failing test

27c7077 has been introduced to fix the failing test. All other code has been reviewed (cherry-picked from before reverted)


This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.

Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.

Conditional processors will only included in the stats if the condition evaluates
to true.

This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.

Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.

Conditional processors will only included in the stats if the condition evaluates
to true.
@jakelandis jakelandis added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v7.0.0 v6.5.0 labels Oct 22, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but the bwc behavior still needs a slight change, and as discussed offline, I think a unit test for the bwc behavior of IngestStats would be good (a simple round trip test with pre 6.5 version).

String pipelineId = in.readString();
Stats pipelineStat = new Stats(in);
this.pipelineStats.add(new PipelineStat(pipelineId, pipelineStat));
if (in.getVersion().onOrAfter(Version.V_7_0_0_alpha1)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the version we are going back to, so 6.5. In this PR, disable bwc tests in the root build.gradle file. Then re-enable in the backport, and do a followup to master to re-enable there as well. This minimizes the changes necessary to followup to master (instead of needing to remember and change all the places that have this version constant).

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much @jakelandis. Given the code was previously reviewed, I reviewed the changes need to address the BWC failures. I am comfortable with those changes. LGTM.

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jakelandis. LGTM.

jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Oct 23, 2018
This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.

Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.

Conditional processors will only included in the stats if the condition evaluates
to true.
@jakelandis jakelandis merged commit ad94e79 into elastic:master Oct 23, 2018
@jakelandis
Copy link
Contributor Author

jakelandis commented Oct 23, 2018

jakelandis added a commit that referenced this pull request Oct 23, 2018
This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.

Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.

Conditional processors will only included in the stats if the condition evaluates
to true.
@jakelandis jakelandis mentioned this pull request Oct 23, 2018
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Oct 23, 2018
* master: (24 commits)
  ingest: better support for conditionals with simulate?verbose (elastic#34155)
  [Rollup] Job deletion should be invoked on the allocated task (elastic#34574)
  [DOCS] .Security index is never auto created (elastic#34589)
  CCR: Requires soft-deletes on the follower (elastic#34725)
  re-enable bwc tests (elastic#34743)
  Empty GetAliases authorization fix (elastic#34444)
  INGEST: Document Processor Conditional (elastic#33388)
  [CCR] Add total fetch time leader stat (elastic#34577)
  SQL: Support pattern against compatible indices (elastic#34718)
  [CCR] Auto follow pattern APIs adjustments (elastic#34518)
  [Test] Remove dead code from ExceptionSerializationTests (elastic#34713)
  A small typo in migration-assistance doc (elastic#34704)
  ingest: processor stats (elastic#34724)
  SQL: Implement IN(value1, value2, ...) expression. (elastic#34581)
  Tests: Add checks to GeoDistanceQueryBuilderTests (elastic#34273)
  INGEST: Rename Pipeline Processor Param. (elastic#34733)
  Core: Move IndexNameExpressionResolver to java time (elastic#34507)
  [DOCS] Force Merge: clarify execution and storage requirements (elastic#33882)
  TESTING.asciidoc fix examples using forbidden annotation (elastic#34515)
  SQL: Implement `CONVERT`, an alternative to `CAST` (elastic#34660)
  ...
kcm pushed a commit that referenced this pull request Oct 30, 2018
This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.

Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.

Conditional processors will only included in the stats if the condition evaluates
to true.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement v6.5.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants