New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support for fetching Query results in batches #2409

Closed
wants to merge 2 commits into
from

Conversation

Projects
None yet
6 participants
@nineinchnick
Contributor

nineinchnick commented Feb 13, 2014

Resolves #2387 by adding the ability to process Query results in batches.

This PR introduces a next() method in db\Query and db\ActiveQuery, which is a wrapper over db\DataProvider. DataProvider can't be used directly, because in ActiveQuery private methods are used to perform lazy loading.

Executing following code, when there are 4 users:

$q = app\models\User::find()->with('userProfilePictures');
while(!empty($users=$q->next(2))) {
    // process
}

results in having the following queries:

SELECT * FROM `users`;
SELECT * FROM `user_profile_pictures` WHERE `user_id` IN (1, 2);
SELECT * FROM `user_profile_pictures` WHERE `user_id` IN (3, 4);

Should other Query implementations also introduce such method? Probably not if there's no cursor available thus there's no gain in batch processing.

@cebe

This comment has been minimized.

Show comment
Hide comment
@cebe

cebe Feb 13, 2014

Member

This implementation does not allow re-using the Query object for different queries but I really like the idea! :) Needs some adjustments for reusing Query, have not checked how, yet.

Member

cebe commented Feb 13, 2014

This implementation does not allow re-using the Query object for different queries but I really like the idea! :) Needs some adjustments for reusing Query, have not checked how, yet.

Show outdated Hide outdated framework/db/Query.php
$this->_dataReader = $this->createCommand($db)->query();
}
$count = 0;
$results = [];

This comment has been minimized.

@cebe

cebe Feb 13, 2014

Member

$results -> $result

@cebe

cebe Feb 13, 2014

Member

$results -> $result

Show outdated Hide outdated framework/db/Query.php
$result[$key] = $row;
}
}
if (empty($results)) {

This comment has been minimized.

@cebe

cebe Feb 13, 2014

Member

$results -> $result

@cebe

cebe Feb 13, 2014

Member

$results -> $result

@nineinchnick

This comment has been minimized.

Show comment
Hide comment
@nineinchnick

nineinchnick Feb 13, 2014

Contributor

@cebe you mean the scenario, when not all results are read from the cursor, the query is altered and another call to next will return unexpected results?

I don't like tracking the dataReader as a protected var but I want to keep it simple. What about passing it as an argument instead the $db? Then the developer would be responsible for tracking the dataReader and closing the cursor manually if not all results are read.

Contributor

nineinchnick commented Feb 13, 2014

@cebe you mean the scenario, when not all results are read from the cursor, the query is altered and another call to next will return unexpected results?

I don't like tracking the dataReader as a protected var but I want to keep it simple. What about passing it as an argument instead the $db? Then the developer would be responsible for tracking the dataReader and closing the cursor manually if not all results are read.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 13, 2014

Member

It looks great to me. Yes, I have the same concern about having dataReader within a query object. The query classes are designed so that they only contain information needed to build and execute a query.

How about the following syntax?

foreach (Post::find()->with('author')->batch(10) as $posts) {
    // $posts is an array with maximum length 10
}

The method batch($limit, $db) returns an iterator object which contains the reference to the query object as well as a DataReader instance. The object can have a method to close the cursor.

Member

qiangxue commented Feb 13, 2014

It looks great to me. Yes, I have the same concern about having dataReader within a query object. The query classes are designed so that they only contain information needed to build and execute a query.

How about the following syntax?

foreach (Post::find()->with('author')->batch(10) as $posts) {
    // $posts is an array with maximum length 10
}

The method batch($limit, $db) returns an iterator object which contains the reference to the query object as well as a DataReader instance. The object can have a method to close the cursor.

@Ragazzo

This comment has been minimized.

Show comment
Hide comment
@Ragazzo

Ragazzo Feb 13, 2014

Contributor

such feature is in L4, can look how it is done there, but they make simple wrapper.

Contributor

Ragazzo commented Feb 13, 2014

such feature is in L4, can look how it is done there, but they make simple wrapper.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 13, 2014

Member

Can't find it. What is its syntax?

Member

qiangxue commented Feb 13, 2014

Can't find it. What is its syntax?

@Ragazzo

This comment has been minimized.

Show comment
Hide comment
@Ragazzo

Ragazzo Feb 13, 2014

Contributor

Chunking Results - http://laravel.com/docs/eloquent. I also like there collections but not sure if it will be implemented in Yii2.

Contributor

Ragazzo commented Feb 13, 2014

Chunking Results - http://laravel.com/docs/eloquent. I also like there collections but not sure if it will be implemented in Yii2.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 13, 2014

Member

It seems to be based on limit/offset, which we want to avoid as discussed in #2387.

Member

qiangxue commented Feb 13, 2014

It seems to be based on limit/offset, which we want to avoid as discussed in #2387.

@nineinchnick

This comment has been minimized.

Show comment
Hide comment
@nineinchnick

nineinchnick Feb 13, 2014

Contributor

@qiangxue I like the iterator idea but in case of ActiveQuery the prepareResults method (that I extracted from all) would have to be public. Is that ok?
Should the iterator class be named QueryReader or QueryIterator? I'm for the former one.

@Ragazzo I also like the collections idea. I remember this forum post and extension. An issue should be created for this.

Contributor

nineinchnick commented Feb 13, 2014

@qiangxue I like the iterator idea but in case of ActiveQuery the prepareResults method (that I extracted from all) would have to be public. Is that ok?
Should the iterator class be named QueryReader or QueryIterator? I'm for the former one.

@Ragazzo I also like the collections idea. I remember this forum post and extension. An issue should be created for this.

@Ragazzo

This comment has been minimized.

Show comment
Hide comment
@Ragazzo

Ragazzo Feb 13, 2014

Contributor

I also like the collections idea. I remember this forum post and extension. An issue should be created for this.

ok, i think this can be an extension yii2-ar-collection. @qiangxue whats your thoughts about this one? of course it is out of scope 2.0GA.

It seems to be based on limit/offset, which we want to avoid as discussed in #2387.

yeah, can you point to sentence about not using limit/offset? cant find it.

Contributor

Ragazzo commented Feb 13, 2014

I also like the collections idea. I remember this forum post and extension. An issue should be created for this.

ok, i think this can be an extension yii2-ar-collection. @qiangxue whats your thoughts about this one? of course it is out of scope 2.0GA.

It seems to be based on limit/offset, which we want to avoid as discussed in #2387.

yeah, can you point to sentence about not using limit/offset? cant find it.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 13, 2014

Member

@nineinchnick Yes, we need to turn some private method into public in ActiveQuery. Would you like to work on this?

Regarding the collection idea, we have discussed about it at the very beginning when we were designing Yii2 AR. Our conclusion was to keep things simple, at least for 2.0. You may go ahead and create a ticket about this. In the ticket, we would like to see every possible use case example of the collection so that we have better understanding of the requirements.

Member

qiangxue commented Feb 13, 2014

@nineinchnick Yes, we need to turn some private method into public in ActiveQuery. Would you like to work on this?

Regarding the collection idea, we have discussed about it at the very beginning when we were designing Yii2 AR. Our conclusion was to keep things simple, at least for 2.0. You may go ahead and create a ticket about this. In the ticket, we would like to see every possible use case example of the collection so that we have better understanding of the requirements.

@Ragazzo

This comment has been minimized.

Show comment
Hide comment
@Ragazzo

Ragazzo Feb 14, 2014

Contributor

every possible use case example

to be true, i dont think there will be a lot of use-cases, main of them is to not consume a lot of memory, for example when exproting to pdf/excel. So there will not be a lot of use-cases. Worth creating ticket?

Contributor

Ragazzo commented Feb 14, 2014

every possible use case example

to be true, i dont think there will be a lot of use-cases, main of them is to not consume a lot of memory, for example when exproting to pdf/excel. So there will not be a lot of use-cases. Worth creating ticket?

@samdark

This comment has been minimized.

Show comment
Hide comment
@samdark

samdark Feb 14, 2014

Member

@Ragazzo yes. It's requested quite often but use cases aren't clear each time so it definitely needs a good separate discussion.

Member

samdark commented Feb 14, 2014

@Ragazzo yes. It's requested quite often but use cases aren't clear each time so it definitely needs a good separate discussion.

@Ragazzo

This comment has been minimized.

Show comment
Hide comment
@Ragazzo

Ragazzo Feb 14, 2014

Contributor

Done - #2423. Feel free to add your use-cases and delete here comments to not to mess them with main discussion here.

Contributor

Ragazzo commented Feb 14, 2014

Done - #2423. Feel free to add your use-cases and delete here comments to not to mess them with main discussion here.

@qiangxue qiangxue referenced this pull request Feb 14, 2014

Closed

Data sets collections #2423

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 14, 2014

Member

Just FYI, I'm working on this currently. will finish soon.

Member

qiangxue commented Feb 14, 2014

Just FYI, I'm working on this currently. will finish soon.

@qiangxue qiangxue closed this Feb 14, 2014

@nineinchnick

This comment has been minimized.

Show comment
Hide comment
@nineinchnick

nineinchnick Feb 14, 2014

Contributor

Is it finished? I haven't seen a relevant commit. I've been working on this too.

Contributor

nineinchnick commented Feb 14, 2014

Is it finished? I haven't seen a relevant commit. I've been working on this too.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 14, 2014

Member

Not committed yet. Almost finished. Writing documentation now.

Member

qiangxue commented Feb 14, 2014

Not committed yet. Almost finished. Writing documentation now.

@nineinchnick nineinchnick deleted the nineinchnick:2387-batch-query branch Feb 14, 2014

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 14, 2014

Member

Will let you know. Please help review my change. Sorry, I should let you know earlier.

Member

qiangxue commented Feb 14, 2014

Will let you know. Please help review my change. Sorry, I should let you know earlier.

@Ragazzo

This comment has been minimized.

Show comment
Hide comment
@Ragazzo

Ragazzo Feb 14, 2014

Contributor

Great)

Contributor

Ragazzo commented Feb 14, 2014

Great)

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 14, 2014

Member

Done (api doc to be finished). For the usage, please see:
https://github.com/yiisoft/yii2/blob/master/docs/guide/query-builder.md#batch-query
https://github.com/yiisoft/yii2/blob/master/docs/guide/active-record.md#querying-data-from-the-database

Please help review the code. Your feedback is very welcome. Thanks.

Member

qiangxue commented Feb 14, 2014

Done (api doc to be finished). For the usage, please see:
https://github.com/yiisoft/yii2/blob/master/docs/guide/query-builder.md#batch-query
https://github.com/yiisoft/yii2/blob/master/docs/guide/active-record.md#querying-data-from-the-database

Please help review the code. Your feedback is very welcome. Thanks.

@Ragazzo

This comment has been minimized.

Show comment
Hide comment
@Ragazzo

Ragazzo Feb 14, 2014

Contributor

Looks fine to me, docs are good, have not tried it yet though. Also want to note that for implementing such feature there was added not so much code, that is because of great Yii2 AR design.
Thanks)

Contributor

Ragazzo commented Feb 14, 2014

Looks fine to me, docs are good, have not tried it yet though. Also want to note that for implementing such feature there was added not so much code, that is because of great Yii2 AR design.
Thanks)

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 14, 2014

Member

Thank you for your comments and review. All work are done.
I kept the properties of BatchQueryResult public with the intention that they might be useful outside of the class. The API doc describes when the properties can be modified.

Member

qiangxue commented Feb 14, 2014

Thank you for your comments and review. All work are done.
I kept the properties of BatchQueryResult public with the intention that they might be useful outside of the class. The API doc describes when the properties can be modified.

@cebe

This comment has been minimized.

Show comment
Hide comment
@cebe

cebe Feb 15, 2014

Member

First I have to say I am impressed how great the AR design works and how extensible it is with different features :-)
But I thought of the implementation/usage in a bit different way:

usage:

foreach($query->batch(10) as $user) {
    // $user will be one single instance
}

This way the BatchQueryResult will only hold the datareader and will load next batch when I iterate over the size of the batch. This would be more like what I would expect from this method to do. Otherwise I would need nested foreach to handle all users in DB:

foreach($query->batch(10) as $users) {
    foreach($users as $user) {
        // $user will be one single instance here
    }
}

This approach would also make it simpler to switch from:

foreach($query->all() as $user) { //...

to:

foreach($query->batch(10) as $user) { //...

This would make the iterator implementation a bit more complicated but in general more easy to use interface.

Member

cebe commented Feb 15, 2014

First I have to say I am impressed how great the AR design works and how extensible it is with different features :-)
But I thought of the implementation/usage in a bit different way:

usage:

foreach($query->batch(10) as $user) {
    // $user will be one single instance
}

This way the BatchQueryResult will only hold the datareader and will load next batch when I iterate over the size of the batch. This would be more like what I would expect from this method to do. Otherwise I would need nested foreach to handle all users in DB:

foreach($query->batch(10) as $users) {
    foreach($users as $user) {
        // $user will be one single instance here
    }
}

This approach would also make it simpler to switch from:

foreach($query->all() as $user) { //...

to:

foreach($query->batch(10) as $user) { //...

This would make the iterator implementation a bit more complicated but in general more easy to use interface.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 15, 2014

Member

@cebe: the reason for this design is to support some functions or handlers that can handle a batch of data each time. For example,

foreach ($query->batch(10) as $users) {
    $handler->handle($users);
}

Otherwise, batch size 10 and batch size 1 really don't differ much.

If you want to get one user a time, you can use each():

foreach ($query->each() as $user) {
}
Member

qiangxue commented Feb 15, 2014

@cebe: the reason for this design is to support some functions or handlers that can handle a batch of data each time. For example,

foreach ($query->batch(10) as $users) {
    $handler->handle($users);
}

Otherwise, batch size 10 and batch size 1 really don't differ much.

If you want to get one user a time, you can use each():

foreach ($query->each() as $user) {
}
@cebe

This comment has been minimized.

Show comment
Hide comment
@cebe

cebe Feb 15, 2014

Member

okay, makes sense.

Member

cebe commented Feb 15, 2014

okay, makes sense.

@cebe

This comment has been minimized.

Show comment
Hide comment
@cebe

cebe Feb 15, 2014

Member

Well... thinking a bit more about it this behavior is still bad for AR because of relational queries. each() should still be able to fetch a batch internally and populate relations for a number of records.
We could override each() method in ActiveQuery to add this feature.

Member

cebe commented Feb 15, 2014

Well... thinking a bit more about it this behavior is still bad for AR because of relational queries. each() should still be able to fetch a batch internally and populate relations for a number of records.
We could override each() method in ActiveQuery to add this feature.

@nineinchnick

This comment has been minimized.

Show comment
Hide comment
@nineinchnick

nineinchnick Feb 15, 2014

Contributor

I agree.

Contributor

nineinchnick commented Feb 15, 2014

I agree.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Feb 15, 2014

Member

You are right. 'Each' also needs batch size. Will fix it.

On Saturday, February 15, 2014, Jan Waś notifications@github.com wrote:

I agree but how to pass the batch size? Rename to eachInBatch(int)?


Reply to this email directly or view it on GitHubhttps://github.com/yiisoft/yii2/pull/2409#issuecomment-35151119
.

Member

qiangxue commented Feb 15, 2014

You are right. 'Each' also needs batch size. Will fix it.

On Saturday, February 15, 2014, Jan Waś notifications@github.com wrote:

I agree but how to pass the batch size? Rename to eachInBatch(int)?


Reply to this email directly or view it on GitHubhttps://github.com/yiisoft/yii2/pull/2409#issuecomment-35151119
.

@qiangxue

This comment has been minimized.

Show comment
Hide comment
Member

qiangxue commented Feb 15, 2014

Done: 9a068f5

@cebe cebe added this to the 2.0 Beta milestone Feb 15, 2014

@Koudy

This comment has been minimized.

Show comment
Hide comment
@Koudy

Koudy Mar 5, 2014

Contributor

I have a problem when using this:
foreach (Customer::find()->with('orders')->each() as $customer) {
I get false in $customer.

It started to work after I added this:
public function next()
{
if ($this->_batch === null || !$this->each || $this->each && next($this->_batch) === false) {
$this->_batch = $this->fetchData();
reset($this->_batch);
}

There must be some mistake or I do something wrong?

Contributor

Koudy commented Mar 5, 2014

I have a problem when using this:
foreach (Customer::find()->with('orders')->each() as $customer) {
I get false in $customer.

It started to work after I added this:
public function next()
{
if ($this->_batch === null || !$this->each || $this->each && next($this->_batch) === false) {
$this->_batch = $this->fetchData();
reset($this->_batch);
}

There must be some mistake or I do something wrong?

@qiangxue

This comment has been minimized.

Show comment
Hide comment
@qiangxue

qiangxue Mar 5, 2014

Member

Fixed. Thanks!

Member

qiangxue commented Mar 5, 2014

Fixed. Thanks!

qiansen1386 pushed a commit to qiansen1386/yii2 that referenced this pull request Mar 9, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment