Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for Partition API #1320

Merged
merged 6 commits into from
Oct 13, 2020
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
191 changes: 191 additions & 0 deletions dev/src/collection-group.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,191 @@
/*
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import * as firestore from '@google-cloud/firestore';
import * as protos from '../protos/firestore_v1_proto_api';

import {QueryPartition} from './query-partition';
import {requestTag} from './util';
import {logger} from './logger';
import {Query, QueryOptions} from './reference';
import {FieldPath} from './path';
import {Firestore} from './index';

import api = protos.google.firestore.v1;

/**
* A `CollectionGroup` refers to all documents that are contained in a
* collection or subcollection with a specific collection ID.
*/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add @class CollectionGroup.

(Otherwise, the constructor will still appear in the docs even with @hideconstructor)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added @class to match the other types. Thanks for catching.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need add the CollectionGroup after @class, or else the constructor will still appear. Not sure why, but I regenerated the docs and still see the constructor with your current PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice. I didn't know that. Added here and in a bunch of other places.

export class CollectionGroup<T = firestore.DocumentData>
extends Query<T>
implements firestore.CollectionGroup<T> {
/** @hideconstructor */
constructor(
firestore: Firestore,
collectionId: string,
converter: firestore.FirestoreDataConverter<T> | undefined
) {
super(
firestore,
QueryOptions.forCollectionGroupQuery(collectionId, converter)
);
}

/**
* Partitions a query by returning partition cursors that can be used to run
* the query in parallel. The returned cursors are split points that can be
* used as starting and end points for individual query invocations.
*
* @param {number} desiredPartitionCount The desired maximum number of
* partition points. The number must be strictly positive. The actual number
* of partitions returned may be fewer.
* @return {AsyncIterable<QueryPartition>} An AsyncIterable of
* `QueryPartition`s.
*/
async *getPartitionsAsync(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getPartitionsAsync()'s name was initially confusing to me, since I thought once the non-async one would be a function that had a : void return signature. Both getPartitionsAsync() and getPartitions() are essentially async methods, which added to my confusion.

How about getPartititionsIterator and getPartitionsPromise, or something more along those lines (granted, those 2 names are the best)? Is there any other established pattern for methods that return an iterator vs. a promise?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This follows the precedent in the generated Gapic library. I am kind of leaning towards removing the "non-async" method (which is also async) altogether. Streaming iterators are very easy to use, so maybe there is no benefit of exposing both methods. I will ask for opinions before merging.

desiredPartitionCount: number
): AsyncIterable<QueryPartition<T>> {
const tag = requestTag();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider calling validateInteger() on desiredPartitionCount, since it currently accepts negative inputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, for any value that is simply passed through to the backend, we should just let the backend perform the validation. That means that if the backend decides to change the range of acceptable values, all clients will benefit immediately.
There are some cases where the backend error is hard to understand where we have deviated from this rule. The error message here is "3 INVALID_ARGUMENT: Partition count must be greater than zero." and seems adequate.

Changed my mind: Since we pass desiredPartitions-1, the error message from the backend doesn't make sense for us. Added validation.

await this.firestore.initializeIfNeeded(tag);

// Partition queries require explicit ordering by __name__.
const queryWithDefaultOrder = this.orderBy(FieldPath.documentId());
const request: api.IPartitionQueryRequest = queryWithDefaultOrder.toProto();

// Since we are always returning an extra partition (with en empty endBefore

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/en/an

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why do we need to return an extra empty endBefore cursor? Is it to signal that the iterator is done yielding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last query should be query.startAt(lastPartition) and return all elements until the end of the result set, which in the Firestore API means that we must omit the last cursor.

// cursor), we reduce the desired partition count by one.
request.partitionCount = desiredPartitionCount - 1;

const stream = await this.firestore.requestStream(
'partitionQueryStream',
request,
tag
);
stream.resume();

let lastValues: api.IValue[] | undefined = undefined;
let partitionCount = 0;

for await (const currentCursor of stream) {
++partitionCount;
const currentValues = currentCursor.values ?? [];
yield new QueryPartition(
this._firestore,
this._queryOptions.collectionId,
this._queryOptions.converter,
lastValues,
currentValues
);
lastValues = currentValues;
}

logger(
'Firestore.getPartitionsAsync',
tag,
'Received %d partitions',
partitionCount
);

// Return the extra partition with the empty cursor.
yield new QueryPartition(
this._firestore,
this._queryOptions.collectionId,
this._queryOptions.converter,
lastValues,
undefined
);
}

/**
* Partitions a query by returning partition cursors that can be used to run
* the query in parallel. The returned cursors are split points that can be
* used as starting and end points for individual query invocations.
*
* @param {number} desiredPartitionCount The desired maximum number of
* partition points. The number must be strictly positive. The actual number
* of partitions returned may be fewer.
* @return {Promise<QueryPartition[]>} A Promise with the `QueryPartition`s

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: A Promise with an array of QueryPartitions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed for now.

* returned as an array.
*/
async getPartitions(
desiredPartitionCount: number
): Promise<QueryPartition<T>[]> {
const result: QueryPartition<T>[] = [];
for await (const partition of this.getPartitionsAsync(
desiredPartitionCount
)) {
result.push(partition);
}
return result;
}

/**
* Applies a custom data converter to this `CollectionGroup`, allowing you
* to use your own custom model objects with Firestore. When you call get()
* on the returned `CollectionGroup`, the provided converter will convert
* between Firestore data and your custom type U.
*
* Using the converter allows you to specify generic type arguments when
* storing and retrieving objects from Firestore.
*
* @example
* class Post {
* constructor(readonly title: string, readonly author: string) {}
*
* toString(): string {
* return this.title + ', by ' + this.author;
* }
* }
*
* const postConverter = {
* toFirestore(post: Post): FirebaseFirestore.DocumentData {
* return {title: post.title, author: post.author};
* },
* fromFirestore(
* snapshot: FirebaseFirestore.QueryDocumentSnapshot
* ): Post {
* const data = snapshot.data();
* return new Post(data.title, data.author);
* }
* };
*
* const postSnap = await Firestore()
* .collectionGroup('posts')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You can't call doc() on a CollectionGroup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very well then. Fixed.

* .withConverter(postConverter)
* .doc().get();
* const post = postSnap.data();
* if (post !== undefined) {
* post.title; // string
* post.toString(); // Should be defined
* post.someNonExistentProperty; // TS error
* }
*
* @param {FirestoreDataConverter} converter Converts objects to and from
* Firestore.
* @return {CollectionGroup} A `CollectionGroup<U>` that uses the provided
* converter.
*/
withConverter<U>(
converter: firestore.FirestoreDataConverter<U>
): CollectionGroup<U> {
return new CollectionGroup<U>(
this.firestore,
this._queryOptions.collectionId,
converter
);
}
}
11 changes: 7 additions & 4 deletions dev/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ import {
validateResourcePath,
} from './path';
import {ClientPool} from './pool';
import {CollectionReference, Query, QueryOptions} from './reference';
import {CollectionReference} from './reference';
import {DocumentReference} from './reference';
import {Serializer} from './serializer';
import {Timestamp} from './timestamp';
Expand Down Expand Up @@ -75,6 +75,7 @@ import {interfaces} from './v1/firestore_client_config.json';
const serviceConfig = interfaces['google.firestore.v1.Firestore'];

import api = google.firestore.v1;
import {CollectionGroup} from './collection-group';

export {
CollectionReference,
Expand All @@ -91,6 +92,8 @@ export {Timestamp} from './timestamp';
export {DocumentChange} from './document-change';
export {FieldPath} from './path';
export {GeoPoint} from './geo-point';
export {CollectionGroup};
export {QueryPartition} from './query-partition';
export {setLogFunction} from './logger';
export {Status as GrpcStatus} from 'google-gax';

Expand Down Expand Up @@ -632,7 +635,7 @@ export class Firestore implements firestore.Firestore {
* @param {string} collectionId Identifies the collections to query over.
* Every collection or subcollection with this ID as the last segment of its
* path will be included. Cannot contain a slash.
* @returns {Query} The created Query.
* @returns {CollectionGroup} The created CollectionGroup.
*
* @example
* let docA = firestore.doc('mygroup/docA').set({foo: 'bar'});
Expand All @@ -646,14 +649,14 @@ export class Firestore implements firestore.Firestore {
* });
* });
*/
collectionGroup(collectionId: string): Query {
collectionGroup(collectionId: string): CollectionGroup {
if (collectionId.indexOf('/') !== -1) {
throw new Error(
`Invalid collectionId '${collectionId}'. Collection IDs must not contain '/'.`
);
}

return new Query(this, QueryOptions.forCollectionGroupQuery(collectionId));
return new CollectionGroup(this, collectionId, /* converter= */ undefined);
}

/**
Expand Down
116 changes: 116 additions & 0 deletions dev/src/query-partition.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
/*
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import * as firestore from '@google-cloud/firestore';
import * as protos from '../protos/firestore_v1_proto_api';

import {FieldOrder, Query, QueryOptions} from './reference';
import {FieldPath} from './path';
import {Serializer} from './serializer';
import {Firestore} from './index';

import api = protos.google.firestore.v1;

/**
* A split point that can be used in a query as a starting and/or end point for
* the query results. The cursors returned by {@link #startAt} and {@link
* #endBefore} can only be used in a query that matches the constraint of query
* that produced this partition.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add @class QueryPartition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added @class

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*/
export class QueryPartition<T = firestore.DocumentData>
implements firestore.QueryPartition<T> {
private readonly _serializer: Serializer;
private _memoizedStartAt: unknown[] | undefined;
private _memoizedEndBefore: unknown[] | undefined;

constructor(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add hideconstructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

private readonly _firestore: Firestore,
private readonly _collectionId: string,
private readonly _converter:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this converter have to be undefined here? QueryPartition's converter comes from QueryOptions, which always has FirestoreDataConverter defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Simplified.

| firestore.FirestoreDataConverter<T>
| undefined,
private readonly _startAt: api.IValue[] | undefined,
private readonly _endBefore: api.IValue[] | undefined
) {
this._serializer = new Serializer(_firestore);
}

/**
* The cursor that defines the first result for this partition or `undefined`
* if this is the first partition.
*
* @return a cursor value that can be used with {@link Query#startAt} or

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding other annotations (@return, @type, etc.) like in other getters. As it stands, the docs file looks pretty bare: https://screenshot.googleplex.com/mNvFsQh4X2E2msJ.png.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. We should always do this for JSDoc.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uber-nit: s/a cursor/A cursor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* `undefined` if this is the first partition. The returned array must be
* destructured when passed to `startAt`.
*/
get startAt(): unknown[] | undefined {
if (this._startAt && !this._memoizedStartAt) {
this._memoizedStartAt = this._startAt.map(v =>
this._serializer.decodeValue(v)
);
}

return this._memoizedStartAt;
}

/**
* The cursor that defines the first result after this partition or
* `undefined` if this is the last partition.
*
* @return a cursor value that can be used with {@link Query#endBefore} or
* `undefined` if this is the last partition. The returned array must be
* destructured when passed to `endBefore`.
*/
get endBefore(): unknown[] | undefined {
if (this._endBefore && !this._memoizedEndBefore) {
this._memoizedEndBefore = this._endBefore.map(v =>
this._serializer.decodeValue(v)
);
}

return this._memoizedEndBefore;
}

/**
* Returns a query that only returns the documents for this partition.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider: Returns a query that only contains documents for/inside this partition.

Technically, a query doesn't "return" the documents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used encapsulates. Let me know what you think.

*
* @return a query partitioned by a {@link Query#startAt} and {@link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding return type: {Query<T>}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* Query#endBefore} cursor.
*/
toQuery(): Query<T> {
// Since the api.Value to JavaScript type conversion can be lossy (unless
// `useBigInt` is used), we pass the original protobuf representaion to the
// created query.
let queryOptions = QueryOptions.forCollectionGroupQuery(
this._collectionId,
this._converter
);
queryOptions = queryOptions.with({
fieldOrders: [new FieldOrder(FieldPath.documentId())],
});
if (this._startAt !== undefined) {
queryOptions = queryOptions.with({
startAt: {before: true, values: this._startAt},
});
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how important it is to test L105-108, but codecov is complaining about missing tests here. Flagging this in-case you missed it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test that should cover this.

if (this._endBefore !== undefined) {
queryOptions = queryOptions.with({
endAt: {before: true, values: this._endBefore},
});
}
return new Query(this._firestore, queryOptions);
}
}
Loading