[ML] If stopping _all data frames, then querying _all stats is not correct #43203

Closed
sophiec20 opened this issue Jun 13, 2019 · 1 comment · Fixed by #43206

Comments

@sophiec20
commented Jun 13, 2019

Found in 7.2.0-BC5

  1. Create and start 100 data frame transforms (the same transform with an incrementing id; this can be scripted and can run on small source indices), then give them time to complete
  2. Check in the UI: all transforms are visible in the list and show 100% progress
  3. Run POST _data_frame/transforms/_all/_stop
  4. Check in the UI: the first 10 transforms show 100%, the rest show 0%

For example, this system contains transforms from farequote-airline-100 to farequote-airline-205.
All transforms completed 100%.
All data frames contain 19 documents.
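All were stopped using POST _data_frame/transforms/_all/_stop.
However, when querying using GET _data_frame/transforms/_all/_stats, the first 10 transforms (up to farequote-airline-109) are reported correctly, while transforms after them, such as farequote-airline-110, show checkpoint 0, all-zero stats, and no progress object.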

...
    {
      "id" : "farequote-airline-109",
      "state" : {
        "task_state" : "stopped",
        "indexer_state" : "stopped",
        "checkpoint" : 1,
        "progress" : {
          "total_docs" : 86274,
          "docs_remaining" : 0,
          "percent_complete" : 100.0
        }
      },
      "stats" : {
        "pages_processed" : 2,
        "documents_processed" : 86274,
        "documents_indexed" : 19,
        "trigger_count" : 1,
        "index_time_in_ms" : 7,
        "index_total" : 1,
        "index_failures" : 0,
        "search_time_in_ms" : 4,
        "search_total" : 2,
        "search_failures" : 0
      },
      "checkpointing" : {
        "operations_behind" : 0
      }
    },
    {
      "id" : "farequote-airline-110",
      "state" : {
        "task_state" : "stopped",
        "indexer_state" : "stopped",
        "checkpoint" : 0
      },
      "stats" : {
        "pages_processed" : 0,
        "documents_processed" : 0,
        "documents_indexed" : 0,
        "trigger_count" : 0,
        "index_time_in_ms" : 0,
        "index_total" : 0,
        "index_failures" : 0,
        "search_time_in_ms" : 0,
        "search_total" : 0,
        "search_failures" : 0
      },
      "checkpointing" : {
        "operations_behind" : 0
      }
    },
...
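
A quick way to spot affected entries in an _all/_stats response is to look for stopped transforms that report checkpoint 0 and have no progress object. The following is a minimal Python sketch, assuming the response has already been parsed into a dict with a top-level "transforms" list. The function name find_suspect_transforms is hypothetical, and a transform that genuinely never ran would also match this heuristic.

```python
def find_suspect_transforms(stats_response):
    """Return ids of stopped transforms that report checkpoint 0 and no progress."""
    suspects = []
    for entry in stats_response.get("transforms", []):
        state = entry.get("state", {})
        stopped = state.get("task_state") == "stopped"
        if stopped and state.get("checkpoint", 0) == 0 and "progress" not in state:
            suspects.append(entry["id"])
    return suspects


# Trimmed-down copies of the two entries from the excerpt above.
sample = {
    "transforms": [
        {
            "id": "farequote-airline-109",
            "state": {
                "task_state": "stopped",
                "indexer_state": "stopped",
                "checkpoint": 1,
                "progress": {
                    "total_docs": 86274,
                    "docs_remaining": 0,
                    "percent_complete": 100.0,
                },
            },
        },
        {
            "id": "farequote-airline-110",
            "state": {
                "task_state": "stopped",
                "indexer_state": "stopped",
                "checkpoint": 0,
            },
        },
    ]
}

print(find_suspect_transforms(sample))  # ['farequote-airline-110']
```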

Querying the same transform directly with GET _data_frame/transforms/farequote-airline-110/_stats returns the correct progress.

{
  "count" : 1,
  "transforms" : [
    {
      "id" : "farequote-airline-110",
      "state" : {
        "task_state" : "stopped",
        "indexer_state" : "stopped",
        "checkpoint" : 1,
        "progress" : {
          "total_docs" : 86274,
          "docs_remaining" : 0,
          "percent_complete" : 100.0
        }
      },
      "stats" : {
        "pages_processed" : 2,
        "documents_processed" : 86274,
        "documents_indexed" : 19,
        "trigger_count" : 1,
        "index_time_in_ms" : 16,
        "index_total" : 1,
        "index_failures" : 0,
        "search_time_in_ms" : 14,
        "search_total" : 2,
        "search_failures" : 0
      },
      "checkpointing" : {
        "operations_behind" : 0
      }
    }
  ]
}
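
The discrepancy between the two endpoints could be checked mechanically by comparing each entry of the _all/_stats response with the per-transform response. Below is a rough Python sketch of that idea. Here fetch_one is a hypothetical callable that takes a transform id and returns the parsed JSON of the single-transform _stats call, so the choice of HTTP client is left open. Comparing only checkpoint, progress, and documents_processed is an arbitrary choice for illustration.

```python
def _summary(entry):
    """Pick the fields that differ in the report: checkpoint, progress, doc count."""
    state = entry.get("state", {})
    return {
        "checkpoint": state.get("checkpoint"),
        "progress": state.get("progress"),
        "documents_processed": entry.get("stats", {}).get("documents_processed"),
    }


def compare_stats(all_stats, fetch_one):
    """Return {id: (summary_from_all, summary_from_single)} for entries that disagree."""
    mismatches = {}
    for entry in all_stats.get("transforms", []):
        # The single-transform response wraps its one entry in a "transforms" list.
        single = fetch_one(entry["id"])["transforms"][0]
        from_all = _summary(entry)
        from_single = _summary(single)
        if from_all != from_single:
            mismatches[entry["id"]] = (from_all, from_single)
    return mismatches


# Stand-in data mirroring the two responses shown in this report.
all_stats = {
    "transforms": [
        {
            "id": "farequote-airline-110",
            "state": {"task_state": "stopped", "checkpoint": 0},
            "stats": {"documents_processed": 0},
        }
    ]
}


def fake_fetch_one(transform_id):
    # Hypothetical replacement for a real HTTP call, returning canned data.
    return {
        "count": 1,
        "transforms": [
            {
                "id": transform_id,
                "state": {
                    "task_state": "stopped",
                    "checkpoint": 1,
                    "progress": {
                        "total_docs": 86274,
                        "docs_remaining": 0,
                        "percent_complete": 100.0,
                    },
                },
                "stats": {"documents_processed": 86274},
            }
        ],
    }


mismatches = compare_stats(all_stats, fake_fetch_one)
for transform_id, (from_all, from_single) in mismatches.items():
    print(transform_id, from_all["checkpoint"], from_single["checkpoint"])
```

An empty result would mean both endpoints agree on the compared fields for every transform listed by _all/_stats.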

Note that if I repeat the test but stop each transform by explicitly naming its transform id, then _all/_stats returns the correct 100% responses.

Note also that if I query _all/_stats before the transforms have been stopped, then _all/_stats returns the correct 100% responses.

Weird ... but managed to repeat this.

@elasticmachine (Collaborator) commented Jun 13, 2019

This comment has been minimized.
