Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative scoll/scan implementation #51

Merged
merged 7 commits into from Feb 26, 2016
Merged

Alternative scoll/scan implementation #51

merged 7 commits into from Feb 26, 2016

Conversation

beowulfenator
Copy link
Collaborator

Hi!

I've implemented my own version of batch query for elasticsearch. It is consistent with other db classes. In other words, batch query processing is done with batch() and each() functions. Also, I've written some tests for this functionality.

I would love to see this commit merged into trunk, because then I shall be able to fix issues #48 and #19.

@cebe
Copy link
Member

cebe commented Jan 20, 2016

This looks awesome, thanks! Will make sure this goes into the next release.

@cebe cebe added this to the 2.0.4 milestone Jan 20, 2016
@cebe cebe self-assigned this Jan 20, 2016
@beowulfenator
Copy link
Collaborator Author

Should I fix updateAll and deleteAll in this branch as well?

@cebe
Copy link
Member

cebe commented Jan 20, 2016

not sure how this is related? you mean to use the scroll api to iterate over the models to update/delete?

@beowulfenator
Copy link
Collaborator Author

Exactly! The problem with updateAll/deleteAll is that it first needs to get all document ids, then perform the action. As far as I know, search API doesn't even let you get all results (you can set size to max_int or some really high value, but that's a bad solution). Even if it did, it would be terribly inefficient for anything over a few hundred records. So I suggest using scroll api in scan mode to get all ids first, and then carry on as usual.

@cebe
Copy link
Member

cebe commented Jan 21, 2016

Good idea, afair, setting size to max_int will result in ES to allocate that amount of memory regardsless of the number of results or so. May not be the case anymore but using scroll api sounds like a good idea.

@beowulfenator
Copy link
Collaborator Author

My last commit fixes issues #48 and #39.

/**
* @var string|integer the key for the current iteration
*/
private $_key = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parent method key() and current() will not access the private members in this class but the ones of the parent class.
I'd prefer to duplicate the BatchQueryResult from framework here, we may later extract a proper base class from these if it makes sense.

@cebe
Copy link
Member

cebe commented Feb 25, 2016

looks good so far except for the comments added. Also needs a CHANGELOG entry and @since 2.0.4 annotations.

beowulfenator added a commit that referenced this pull request Feb 26, 2016
- Enh #62: Added support for scroll API in `batch()` and `each()`
- Bug #48: `UpdateAll` now updates all entries, not first 10
- Bug #19: `DeleteAll` now deletes all entries, not first 10
- Enh: Unified model creation from result set in `Query` and `ActiveQuery` with `populate()`
@beowulfenator beowulfenator merged commit 6c1fede into yiisoft:master Feb 26, 2016
@cebe
Copy link
Member

cebe commented Feb 26, 2016

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants