Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datastore: enable to set query batch size #90

Merged
merged 6 commits into from
May 16, 2018

Conversation

delphinus
Copy link
Contributor

This adds q.BatchSize() method to avoid the error that claims too many
datastore.next() calls due to the lack of specifying the count
property (= batch size).

Fixes #88

This adds q.BatchSize() method to avoid the error that claims too many
datastore.next() calls due to the lack of specifying the `count`
property (= batch size).

Fixes golang#88
@googlebot
Copy link
Collaborator

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot.
  • In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

@delphinus
Copy link
Contributor Author

I signed it!

@googlebot
Copy link
Collaborator

CLAs look good, thanks!

@delphinus
Copy link
Contributor Author

Any response? Can anyone review this??

@s-mang
Copy link
Contributor

s-mang commented Oct 26, 2017

Hey, thanks for the PR. Can you add some tests?

@delphinus
Copy link
Contributor Author

@adams-sarah Thanks for reviewing.

I have added one test case for checking the created proto to include Count with a valid value if specified the BatchSize method in the Query. Should I add more tests?

@s-mang
Copy link
Contributor

s-mang commented Oct 27, 2017

Oh my apologies I missed that.

@s-mang
Copy link
Contributor

s-mang commented Nov 3, 2017

hey sorry it's taking me so long to review this. want to make sure we don't mess this up.
i'll try to get to this soon!

// at once. This value should be equal to or less than the Limit.
func (q *Query) BatchSize(size int) *Query {
q = q.clone()
if size < math.MinInt32 || size > math.MaxInt32 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the check be if size < 0 || size > math.MaxInt32?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late reply. Yes, it seems the negative values has no means. I rewrite this.

@@ -536,9 +553,13 @@ func (q *Query) Run(c context.Context) *Iterator {
return t
}
offset := q.offset - t.res.GetSkippedResults()
count := t.limit
if t.count > 0 && (count == 0 || t.count < count) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the intent of this as, "If count is set, and either the query is unlimited or the count is under the limit, use the count." Correct me if that's wrong.

Based on the above, the condition should be t.count > 0 && (count < 0 || t.count < count), right? The limit can be less than 0 (comment for limit field in Iterator says A negative value means unlimited., and NewQuery defaults the query's limit to -1). If count is positive and limit is negative we want to use the count.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assigning t.limit to local var count and then checking for it is a bit confusing here.

Would not it be better to simply use t.limit in if?

	var count int
	if t.count > 0 && (t.limit < 0 || t.count < t.limit) {
		count = t.count
	} else {
		count = t.limit
	} 

@mtraver will appreciate if you can point out what is purpose allowing to pass 0 as count. It may result in unnecessary datastore request that returns nothing. Wouldn't it be more logical to to treat 0s as unlimited in which case to not pass it over wire in *pb.Query and if a negative number passed to Limit() or Count() either return error or assign 0 internally? It's not obviously clear from current implementation what happens if you pass count=0 to internal.Call(c, "datastore_v3", "RunQuery", &req, &t.res) - would it return 0 results or be treated as unlimited?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree re: the local var. Your suggestion is more readable. Could also be

count := t.limit
if t.count > 0 && (t.limit < 0 || t.count < t.limit) {
	count = t.count
}

Yes, a count of 0 doesn't make sense, at least when passed in by the user. If a value <= 0 is passed to BatchSize then an error should be returned.

A limit of 0 is ok though, and it's actually used internally. See Query's Count method, which includes this comment:

We also set the limit to zero, as we don't want any actual entity data,
just the number of skipped results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtraver thanks, -1 and negative limit in query now makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review and rewrite to deal with negative values for t.limit.

@@ -566,6 +587,8 @@ type Iterator struct {
// limit is the limit on the number of results this iterator should return.
// A negative value means unlimited.
limit int32
// count is the number of results this iterator should fetch at once.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document behavior for negative values and zero.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.count can take equal to or greater than zero. Negative values are rejected when BatchSize() sets q.count, and t.count is set as to be equal to q.count when constructing &Iterator{}. So I add note that it can take zero & positive values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure 0 makes sense for count. In fact, the if block we've been discussing checks that it's greater than 0. I feel like BatchSize should reject count values <= 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, that's correct. I fix BatchSize.

@@ -605,7 +628,11 @@ func (t *Iterator) next() (*Key, *pb.EntityProto, error) {
return nil, nil, t.err
}
t.prevCC = t.res.CompiledCursor
if err := callNext(t.c, &t.res, 0, t.limit); err != nil {
count := t.limit
if t.count > 0 && (count == 0 || t.count < count) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto comment above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic should be changed as the same if block in Run()?

var count int32
if t.count > 0 && (t.limit < 0 || t.count < t.limit) {
  count = t.count
} else {
  count = t.limit
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@mtraver mtraver requested a review from broady April 16, 2018 20:50
@delphinus
Copy link
Contributor Author

Fixed all points. 👍

@mtraver mtraver requested a review from zombiezen April 27, 2018 18:08
@zombiezen
Copy link
Contributor

I'm not really the right person to review anymore. If you have specific questions, happy to answer, but if it looks good to you, looks good to me. :)

@mtraver
Copy link
Contributor

mtraver commented Apr 27, 2018

@zombiezen Ok sounds good. I was just looking for a second LGTM.

@broady, could you please review?

@mtraver mtraver removed the request for review from zombiezen April 27, 2018 18:16
@lgrote
Copy link

lgrote commented May 15, 2018

Hi, I'm waiting for this PR and I'm very happy to see that it is almost there. Any idea when this will be merged?
Thx to all who worked on this!

@sbuss sbuss merged commit b9aad5d into golang:master May 16, 2018
@sbuss
Copy link
Contributor

sbuss commented May 16, 2018

For some reason travis is now broken. I don't think this PR was the culprit but I am reverting it just in case.

sbuss added a commit that referenced this pull request May 16, 2018
@delphinus delphinus deleted the feature/enable-to-set-query-batch-size branch May 17, 2018 01:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants