New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup and simplify timings in QueryStateMachine #11874

Open
wants to merge 6 commits into
base: master
from

Conversation

Projects
None yet
3 participants
@dain
Contributor

dain commented Nov 8, 2018

No description provided.

@raghavsethi

I love Simplify and cleanup QueryStateMachine

@@ -709,7 +710,7 @@ public boolean transitionToPlanning()
resourceWaitingStartNanos.compareAndSet(null, tickerNanos());
resourceWaitingTime.compareAndSet(null, nanosSince(resourceWaitingStartNanos.get()).convertToMostSuccinctTimeUnit());
totalPlanningStartNanos.compareAndSet(null, tickerNanos());
return queryState.setIf(PLANNING, currentState -> currentState == QUEUED || currentState == WAITING_FOR_RESOURCES);
return queryState.setIf(PLANNING, currentState -> currentState.ordinal() < PLANNING.ordinal());

This comment has been minimized.

@raghavsethi

raghavsethi Nov 8, 2018

Contributor

This pattern looks like a good candidate for a helper method.

@raghavsethi

Move QueryStateMachine timings to new QueryStateTimer looks good with nits (but is likely the commit causing test failure).

@raghavsethi

Remove unused testing QueryStats constructor LGTM

@@ -427,6 +427,20 @@ private QueryStats getQueryStats(Optional<StageInfo> rootStage)
elapsedTime = nanosSince(createNanos);
}
Duration executionTime;

This comment has been minimized.

@raghavsethi

raghavsethi Nov 9, 2018

Contributor

Query execution begins when the query is transitioned to planning.

This is a bit of an editorial decision.. it's not super clear to me that execution begins after planning vs after resource-waiting. Intuitively, I think execution should begin when we start actually processing data. I don't feel super strongly about this, but I'm curious to hear your thoughts.

This comment has been minimized.

@dain

dain Nov 13, 2018

Contributor

In the UIs, when we were presenting execution time, we were showing elapsed - queued, which until resource waiting was introduced, meant that execution begins when planning starts.

In my mind there are current 3 top level phases: queued, resource acquisition, and execution. Then I would divide execution into: planning, startup (until full scheduled), fully running, finishing (all tasks finished to query finished), and possibly committing (not sure that can be separated). To me, planning is part of execution, just like analysis, distributed planning, optimizing, and so on.

@raghavsethi

Improve QueryStateMachine test stats coverage LGTM

@dain

This comment has been minimized.

Contributor

dain commented Nov 9, 2018

@raghavsethi the test failure was caused by "Produce all query timings for query regardless of state" because one of the product test had a hardcoded capture of the status object which was missing the now required stats.

dain added some commits Nov 4, 2018

Simplify and cleanup QueryStateMachine
Use enum ordinals for state change condiitons to reduce risk of adding new states
Simplify computation of queuedTime and execution time
Fix query execution time calculation
Query execution begins when the query is transitioned to planning.  This was expected
by the computation that computed excution time by subtracting queued time from elapsed
time, but that is not longer correct with the addition of the resource waiting state.

The should not have a significant impact since resource waiting is a new feature that is
disabled by default.
Produce all query timings for query regardless of state
Instead of delaying production of state timings until each state is complete,
produce final duration for completed states, partial duration for current state,
and zero duration for future states.

This simplifies the code, and always gives up to date data for the current
state. This does not have a large impact as these stats are use for informational
purposes and for final events (which would have final data).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment