Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make wait() method wait for query execution #31

Merged
merged 2 commits into from
May 7, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file modified ThriftSQL.phar
Binary file not shown.
7 changes: 5 additions & 2 deletions src/ThriftSQL/HiveQuery.php
Original file line number Diff line number Diff line change
Expand Up @@ -15,12 +15,15 @@ public function __construct( $response, $client ) {
}

public function wait() {
if ( $this->_ready ) {
return $this;
}

// Wait for results
$sleeper = new \ThriftSQL\Utils\Sleeper();
$sleeper->reset();
do {
$slept = $sleeper->sleep()->getSleptSecs();
if ( $slept > 18000 ) { // 5 Hours
if ( $sleeper->sleep()->getSleptSecs() > 18000 ) { // 5 Hours
// TODO: Actually kill the query then throw exception.
throw new \ThriftSQL\Exception( 'Hive Query Killed!' );
}
Expand Down
15 changes: 7 additions & 8 deletions src/ThriftSQL/ImpalaQuery.php
Original file line number Diff line number Diff line change
Expand Up @@ -19,15 +19,15 @@ public function __construct( $queryStr, $client ) {
}

public function wait() {
$sleeper = new \ThriftSQL\Utils\Sleeper();
if ( $this->_ready ) {
return $this;
}

// Wait for results
// Wait for query to be ready
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/query/results/ ?

Copy link
Contributor Author

@xyu xyu May 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the problem, sometimes the thrift query handle gets the query state marked as FINISHED before the query is ready to return results. Perhaps it gets marked as finished when fragments finish executing but before the gateway aggregates the results?

This can be seen if we run something like SELECT sleep(5000) which will return almost immediately even with wait() called on it. Then the initial fetch() call will take ~5s before it returns results.

I'm not sure if this change is being too clever with trying to wait for the initial fetch as well which might be just an edge case with the UDF.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this change is being too clever with trying to wait for the initial fetch as well which might be just an edge case with the UDF.

Yeah my bad, I should have read the whole thing and that there are actually 2 distinct waits ( which your comments explained quite well in fact, I should stop commenting as I go when reviewing and do a first full pass :D ).

I'd keep it as-is though: should we maybe try it with the cupcakes query that triggered this investigation?

$sleeper = new \ThriftSQL\Utils\Sleeper();
$sleeper->reset();
do {

$slept = $sleeper->sleep()->getSleptSecs();

if ( $slept > 18000 ) { // 5 Hours
if ( $sleeper->sleep()->getSleptSecs() > 18000 ) { // 5 Hours
// TODO: Actually kill the query then throw exception.
throw new \ThriftSQL\Exception( 'Impala Query Killed!' );
}
Expand Down Expand Up @@ -68,9 +68,8 @@ public function fetch( $maxRows ) {
if ( $response->ready ) {
break;
}
$slept = $sleeper->sleep()->getSleptSecs();

if ( $slept > 60 ) { // 1 minute
if ( $sleeper->sleep()->getSleptSecs() > 60 ) { // 1 Minute
throw new \ThriftSQL\Exception( 'Impala Query took too long to fetch!' );
}

Expand Down