Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"with totals" and rows_before_limit_at_least implementation #10

Closed
sleptor opened this issue May 2, 2017 · 10 comments
Closed

"with totals" and rows_before_limit_at_least implementation #10

sleptor opened this issue May 2, 2017 · 10 comments

Comments

@sleptor
Copy link

sleptor commented May 2, 2017

Let's imagine you need to show some stats in a table. The last row of the table must contain Total values. The table must have pagination as well.

Clickhouse allows to fetch all needed data in one query. for example

SELECT
    idsite,
    sum(impressions) AS impressions
FROM stat
GROUP BY idsite
WITH TOTALS
LIMIT 10
FORMAT JSON

as result you will get this json:
...
"totals":
{
"idsite": 0,
"impressions": "73886972"
},
"rows": 10,
"rows_before_limit_at_least": 226,
...

Consider to implement these very useful features.

@sanchezzzhak
Copy link
Owner

sanchezzzhak commented May 3, 2017

done;

$sql = 'SELECT
    idsite,
    sum(impressions) AS impressions
FROM stat
GROUP BY idsite
WITH TOTALS
LIMIT 10'; 	
    $command = $clickhouse->createCommand($sql);  	
    $result  = $command->queryAll();
   
    var_dump($command->meta());  	// columns meta info array (columnName, dataType)
    var_dump($command->totals());    	// result WITH TOTALS
    var_dump($command->data());  	// get rows data
    var_dump($command->rows());  	// rows count current result
    var_dump($command->countAll());    // rows count before limit at least, easy pagination
    var_dump($command->extremes());  // extremes values	
    var_dump($command->statistics());  // stat query 

or

//...
 $command = $clickhouse->createCommand($sql);  
 $result = $command->data();
 $total  = $command->countAll();

@sleptor
Copy link
Author

sleptor commented May 3, 2017

great job men!

  1. some points about design. you've mixed "getter" and "execute" functionality. it's not the best practice in my opinion. a weak point of your approach is the getters may execute query implicitly but should only return values related to a query that was executed earlier.
    for example, meta() method could be rewritten as:
public function rows()
{
     $this->ensureQueryExecuted();
     return $this->_rows;
}
public function ensureQueryExecuted()
{
      if(true !== $this->_is_result) {
         throw new DbException('Query was not executed yet');
      }
}

it's just my opinion we can discuss it:)

  1. QueryBuilder could be improved to allow this syntax:
    (new Query)->from('stat')->withTotals()->all()

because currently, we cannot use query builder with "with totals" syntax at all. we must write raw sql :(

@sanchezzzhak
Copy link
Owner

sanchezzzhak commented May 3, 2017

$q = (new \kak\clickhouse\Query())->from('stat')
    ->withTotals()
    ->where(['event_date' => '2017-05-01' , 'user_id' => 5 ])
    ->offset(2)
    ->limit(1);

$command = $q->createCommand();
$result  = $command->queryAll();
$total   = $command->countAll();

var_dump($result);     // result data
var_dump($total);      // result WITH TOTALS

// -----
// all array result data meta totlas etc...
$result = (new \kak\clickhouse\Query())
    ->from('stat')
    ->withTotals()
    ->all();
var_dump($result);

@sleptor
Copy link
Author

sleptor commented May 3, 2017

checked latest "fix; re" commit. you've refactored the code and I guess had broken Query::all() standard interface. This functions must return an array of rows. All developers expect this behaviour.
but now, if I add withTotals then Query::all() and one() will return array of arrays with data, totals & etc. the code will be incompatible with others.

one of the ways to fix it is add an additional parameter to withTotal($returnMetaData=false)
by default it's false, it means all() and one() will have standard behaviour.

but a better way is to keep withTotal() as it is. but create additional function withMetaData()

(new Query())
->withTotal()
->withMetaData()
->all()
in this case data will be included to results array

(new Query())
->withTotal()
->all()

in this case data will NOT be included to results array

@sanchezzzhak
Copy link
Owner

sanchezzzhak commented May 3, 2017

(new Query())
->withTotal()
->withMetaData()
->all()

result all data current verson

Array
(
    [meta] => Array
        (
            [0] => Array
                (
                    [name] => user_id
                    [type] => Int32
                )

        )

    [data] => Array
        (
            [0] => Array
                (
                    [user_id] => 2
                )

            [1] => Array
                (
                    [user_id] => 2
                )

            [2] => Array
                (
                    [user_id] => 2
                )

            [3] => Array
                (
                    [user_id] => 2
                )

            [4] => Array
                (
                    [user_id] => 2
                )

        )

    [rows] => 5
    [countAll] => 39
    [totals] => Array
        (
            [user_id] => 0
        )

    [statistics] => Array
        (
            [elapsed] => 8.1887E-5
            [rows_read] => 39
            [bytes_read] => 156
        )

    [extremes] => 
)

is use only withTotal

(new Query())
->withTotal(false)
->all()

result

[totals] => Array  (
    [user_id] => 0
 )

So?

@sleptor
Copy link
Author

sleptor commented May 3, 2017

image

I guess the vars

public $withTotals = false;
public $withMetaData = false;

must be private.
UPD
tried to change a visibility of those vars to private or protected.. something went wrong :( got error

@sleptor
Copy link
Author

sleptor commented May 3, 2017

About ->withTotal() and ->withMetaData() again. Some misunderstanding here, probably I elaborated my idea not enough. The current behavior is still incorrect in my opinion.

adding ->withTotal() to a query must NOT change a result format. Currently, you must execute 2 queries: 1st to fetch rows, 2nd to get totals but only one query is really needed.

for example.

$db = Yii::$app->clickhouse;
$cmd = (new Query())
            ->select(['iddevice', new Expression('sum(impressions) as impressions')])
            ->from('stat')
            ->groupBy('iddevice');
$res = $cmd->all($db);

it returns rows

array(9) {
  [0]=>
  array(2) {
    ["iddevice"]=>
    int(0)
    ["impressions"]=>
    string(7) "4220435"
  }
  [1]=>
  array(2) {
    ["iddevice"]=>
    int(4)
    ["impressions"]=>
    string(1) "5"
  }

query with totals

$cmd = (new Query())
            ->select(['iddevice', new Expression('sum(impressions) as impressions')])
            ->from('stat')
            ->groupBy('iddevice')
            ->withTotals();
$res = $cmd->all($db);

it must return the same results

array(9) {
  [0]=>
  array(2) {
    ["iddevice"]=>
    int(0)
    ["impressions"]=>
    string(7) "4220435"
  }
  [1]=>
  array(2) {
    ["iddevice"]=>
    int(4)
    ["impressions"]=>
    string(1) "5"
  }  
...

the difference is only in generated SQL. in the 1st case sql-query was generated without "with totals" in the 2nd case sql contains "with totals".

After 1st query $cmd->totals() should return null, but after 2nd query $cmd->totals() should return:

array(2) {
 ["iddevice"]=>
 int(0)
 ["impressions"]=>
 string(8) "59938864"
}

about "->withMetaData()"... maybe remove it at all. or rename somehow.
anyway all data are available via special getters meta(), totals() and etc.
it's syntax sugar.

@sanchezzzhak
Copy link
Owner

fixed

     $cmd = (new Query())
            ->select(['iddevice, sum(impressions) as impressions'])
                ->from('stat')
                ->withTotals()
                ->groupBy('iddevice');

        $result = $cmd->all();

        echo "=== result\n";
        var_dump($result);
        echo "=== meta\n";
        var_dump($cmd->meta());
        echo "=== extremes\n";
        var_dump($cmd->extremes());
        echo "=== totals\n";
        var_dump($cmd->totals());
        echo "=== rows\n";
        var_dump($cmd->rows());
        echo "=== countAll\n";
        var_dump($cmd->countAll());
       

sanchezzzhak added a commit that referenced this issue May 3, 2017
@sleptor
Copy link
Author

sleptor commented May 4, 2017

almost perfect :)

I guess we should rename functions to getMeta(), getTotals() & etc
because they are definitely getters. yii2 magic will be applied. we will be able to call it either as a function $this->getMeta() nor as a property $this->meta

@sanchezzzhak
Copy link
Owner

I agree it is necessary to rename..
done, rename special methods

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants