-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add recursiveDelete() #1427
Changes from 10 commits
3d1fa79
4f40d8c
aa3c1cc
e3aaea4
8952a37
ec6be1c
a0c1162
a8d1250
baee6fd
474e1c6
e24c591
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,14 +16,14 @@ | |
|
||
import * as firestore from '@google-cloud/firestore'; | ||
|
||
import {CallOptions, grpc, RetryOptions} from 'google-gax'; | ||
import {CallOptions, GoogleError, grpc, RetryOptions, Status} from 'google-gax'; | ||
import {Duplex, PassThrough, Transform} from 'stream'; | ||
|
||
import {URL} from 'url'; | ||
|
||
import {google} from '../protos/firestore_v1_proto_api'; | ||
import {ExponentialBackoff, ExponentialBackoffSetting} from './backoff'; | ||
import {BulkWriter} from './bulk-writer'; | ||
import {BulkWriter, BulkWriterError} from './bulk-writer'; | ||
import {BundleBuilder} from './bundle'; | ||
import {fieldsFromJson, timestampFromJson} from './convert'; | ||
import { | ||
|
@@ -141,7 +141,7 @@ const CLOUD_RESOURCE_HEADER = 'google-cloud-resource-prefix'; | |
/*! | ||
* The maximum number of times to retry idempotent requests. | ||
*/ | ||
const MAX_REQUEST_RETRIES = 5; | ||
export const MAX_REQUEST_RETRIES = 5; | ||
|
||
/*! | ||
* The default number of idle GRPC channel to keep. | ||
|
@@ -166,7 +166,7 @@ const MAX_CONCURRENT_REQUESTS_PER_CLIENT = 100; | |
* | ||
* @private | ||
*/ | ||
const REFERENCE_NAME_MIN_ID = '__id-9223372036854775808__'; | ||
export const REFERENCE_NAME_MIN_ID = '__id-9223372036854775808__'; | ||
|
||
/** | ||
* Document data (e.g. for use with | ||
|
@@ -1198,27 +1198,127 @@ export class Firestore implements firestore.Firestore { | |
this.bulkWritersCount -= 1; | ||
} | ||
|
||
/** | ||
* Recursively deletes all documents and subcollections at and under the | ||
* specified level. | ||
* | ||
* If any delete fails, the promise is rejected with an error message | ||
* containing the number of failed deletes and the stack trace of the last | ||
* failed delete. The provided reference is deleted regardless of whether | ||
* all deletes succeeded. | ||
* | ||
* `recursiveDelete()` uses a BulkWriter instance with default settings to | ||
* perform the deletes. To customize throttling rates or add success/error | ||
* callbacks, pass in a custom BulkWriter instance. | ||
* | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You should add a code snippet here that shows how to collect the references for the failed delete. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. I included the default error handler to make it easier to paste. Do you think including it would confuse developers? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looks good. |
||
* @param ref The reference of a document or collection to delete. | ||
* @param bulkWriter Custom BulkWriter instance used to perform the deletes. | ||
* @return A promise that resolves when all deletes have been performed. | ||
* The promise is rejected if any of the deletes fail. | ||
* | ||
* @example | ||
* // Recursively delete a reference and log the references of failures. | ||
* const bulkWriter = firestore.bulkWriter(); | ||
* bulkWriter | ||
* .onWriteError((error) => { | ||
* if ( | ||
* error.failedAttempts < MAX_RETRY_ATTEMPTS | ||
* ) { | ||
* return true; | ||
* } else { | ||
* console.log('Failed write at document: ', error.documentRef.path); | ||
* return false; | ||
* } | ||
* }); | ||
* await firestore.recursiveDelete(docRef, bulkWriter); | ||
*/ | ||
recursiveDelete( | ||
ref: | ||
| firestore.CollectionReference<unknown> | ||
| firestore.DocumentReference<unknown>, | ||
bulkWriter?: BulkWriter | ||
): Promise<void> { | ||
// Capture the error stack to preserve stack tracing across async calls. | ||
const stack = Error().stack!; | ||
|
||
const writer = bulkWriter ?? BulkWriter._getInstance(this); | ||
writer._verifyNotClosed(); | ||
const docStream = this._getAllDescendants( | ||
ref instanceof CollectionReference | ||
? (ref as CollectionReference<unknown>) | ||
: (ref as DocumentReference<unknown>) | ||
); | ||
const resultDeferred = new Deferred<void>(); | ||
let errorCount = 0; | ||
let lastError: Error | undefined; | ||
const incrementErrorCount = (err: Error): void => { | ||
errorCount++; | ||
lastError = err; | ||
}; | ||
const onStreamEnd = (): void => { | ||
if (ref instanceof DocumentReference) { | ||
writer.delete(ref).catch(err => incrementErrorCount(err)); | ||
} | ||
writer.flush().then(async () => { | ||
if (lastError === undefined) { | ||
resultDeferred.resolve(); | ||
} else { | ||
let error = new GoogleError( | ||
`${errorCount} ` + | ||
`${errorCount !== 1 ? 'deletes' : 'delete'} ` + | ||
'failed. The last delete failed with: ' | ||
); | ||
if (lastError instanceof BulkWriterError) { | ||
error.code = (lastError.code as number) as Status; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure why these lines aren't covered by Codecov, but I'm pretty sure they're being called. Maybe it's b/c I put it in a lambda function to access |
||
error = wrapError(error, stack); | ||
|
||
// Wrap the BulkWriter error last to provide the full stack trace. | ||
resultDeferred.reject( | ||
lastError.stack ? wrapError(error, lastError.stack ?? '') : error | ||
); | ||
} | ||
}); | ||
}; | ||
|
||
docStream | ||
.on('error', err => { | ||
err.code = Status.UNAVAILABLE; | ||
err.message = 'Failed to fetch children documents.'; | ||
lastError = err; | ||
onStreamEnd(); | ||
}) | ||
.on('data', (snap: QueryDocumentSnapshot) => { | ||
writer.delete(snap.ref).catch(err => incrementErrorCount(err)); | ||
}) | ||
.on('end', () => onStreamEnd()); | ||
|
||
return resultDeferred.promise; | ||
} | ||
|
||
/** | ||
* Retrieves all descendant documents nested under the provided reference. | ||
* | ||
* @private | ||
* @return {Stream<QueryDocumentSnapshot>} Stream of descendant documents. | ||
*/ | ||
// TODO(chenbrian): Make this a private method after adding recursive delete. | ||
_getAllDescendants( | ||
ref: CollectionReference | DocumentReference | ||
private _getAllDescendants( | ||
ref: CollectionReference<unknown> | DocumentReference<unknown> | ||
): NodeJS.ReadableStream { | ||
// The parent is the closest ancestor document to the location we're | ||
// deleting. If we are deleting a document, the parent is the path of that | ||
// document. If we are deleting a collection, the parent is the path of the | ||
// document containing that collection (or the database root, if it is a | ||
// root collection). | ||
|
||
let parentPath = ref._resourcePath; | ||
if (ref instanceof CollectionReference) { | ||
parentPath = parentPath.popLast(); | ||
} | ||
const collectionId = | ||
ref instanceof CollectionReference ? ref.id : ref.parent.id; | ||
ref instanceof CollectionReference | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. CollectionReference means There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The TS compiler was not happy with that suggestion: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the TS compiler is right, since the generics support of TypeScript is pretty much like Java and "erased at compile time". I was able to get the code to transpiler even without the FWIW, an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added the |
||
? ref.id | ||
: (ref as DocumentReference).parent.id; | ||
|
||
let query: Query = new Query( | ||
this, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2629,69 +2629,109 @@ describe('BulkWriter class', () => { | |
return firestore.terminate(); | ||
}); | ||
|
||
// TODO(chenbrian): This is a temporary test used to validate that the | ||
// StructuredQuery calls work properly. Remove these tests after adding | ||
// recursive delete tests. | ||
it('finds nested documents and collection', async () => { | ||
// ROOT-DB | ||
// └── randomCol | ||
// ├── anna | ||
// └── bob | ||
// └── parentsCol | ||
// ├── charlie | ||
// └── daniel | ||
// └── childCol | ||
// ├── ernie | ||
// └── francis | ||
const batch = firestore.batch(); | ||
batch.set(randomCol.doc('anna'), {name: 'anna'}); | ||
batch.set(randomCol.doc('bob'), {name: 'bob'}); | ||
batch.set(randomCol.doc('bob/parentsCol/charlie'), {name: 'charlie'}); | ||
batch.set(randomCol.doc('bob/parentsCol/daniel'), {name: 'daniel'}); | ||
batch.set(randomCol.doc('bob/parentsCol/daniel/childCol/ernie'), { | ||
name: 'ernie', | ||
}); | ||
batch.set(randomCol.doc('bob/parentsCol/daniel/childCol/francis'), { | ||
name: 'francis', | ||
}); | ||
await batch.commit(); | ||
describe('recursiveDelete()', () => { | ||
async function countDocumentChildren( | ||
ref: DocumentReference | ||
): Promise<number> { | ||
let count = 0; | ||
const collections = await ref.listCollections(); | ||
for (const collection of collections) { | ||
count += await countCollectionChildren(collection); | ||
} | ||
return count; | ||
} | ||
|
||
const numStreamItems = async ( | ||
stream: NodeJS.ReadableStream | ||
): Promise<number> => { | ||
async function countCollectionChildren( | ||
ref: CollectionReference | ||
): Promise<number> { | ||
let count = 0; | ||
// eslint-disable-next-line @typescript-eslint/no-unused-vars | ||
for await (const _ of stream) { | ||
++count; | ||
const docs = await ref.listDocuments(); | ||
for (const doc of docs) { | ||
count += (await countDocumentChildren(doc)) + 1; | ||
} | ||
return count; | ||
}; | ||
} | ||
|
||
// Query all descendants of collections. | ||
let descendantsStream = await firestore._getAllDescendants(randomCol); | ||
expect(await numStreamItems(descendantsStream)).to.equal(6); | ||
descendantsStream = await firestore._getAllDescendants( | ||
randomCol.doc('bob').collection('parentsCol') | ||
); | ||
expect(await numStreamItems(descendantsStream)).to.equal(4); | ||
descendantsStream = await firestore._getAllDescendants( | ||
randomCol.doc('bob').collection('parentsCol/daniel/childCol') | ||
); | ||
expect(await numStreamItems(descendantsStream)).to.equal(2); | ||
beforeEach(async () => { | ||
// ROOT-DB | ||
// └── randomCol | ||
// ├── anna | ||
// └── bob | ||
// └── parentsCol | ||
// ├── charlie | ||
// └── daniel | ||
// └── childCol | ||
// ├── ernie | ||
// └── francis | ||
const batch = firestore.batch(); | ||
batch.set(randomCol.doc('anna'), {name: 'anna'}); | ||
batch.set(randomCol.doc('bob'), {name: 'bob'}); | ||
batch.set(randomCol.doc('bob/parentsCol/charlie'), {name: 'charlie'}); | ||
batch.set(randomCol.doc('bob/parentsCol/daniel'), {name: 'daniel'}); | ||
batch.set(randomCol.doc('bob/parentsCol/daniel/childCol/ernie'), { | ||
name: 'ernie', | ||
}); | ||
batch.set(randomCol.doc('bob/parentsCol/daniel/childCol/francis'), { | ||
name: 'francis', | ||
}); | ||
await batch.commit(); | ||
}); | ||
|
||
// Query all descendants of documents. | ||
descendantsStream = await firestore._getAllDescendants( | ||
randomCol.doc('bob') | ||
); | ||
expect(await numStreamItems(descendantsStream)).to.equal(4); | ||
descendantsStream = await firestore._getAllDescendants( | ||
randomCol.doc('bob/parentsCol/daniel') | ||
); | ||
expect(await numStreamItems(descendantsStream)).to.equal(2); | ||
descendantsStream = await firestore._getAllDescendants( | ||
randomCol.doc('anna') | ||
); | ||
expect(await numStreamItems(descendantsStream)).to.equal(0); | ||
it('on top-level collection', async () => { | ||
await firestore.recursiveDelete(randomCol); | ||
expect(await countCollectionChildren(randomCol)).to.equal(0); | ||
}); | ||
|
||
it('on nested collection', async () => { | ||
const coll = randomCol.doc('bob').collection('parentsCol'); | ||
await firestore.recursiveDelete(coll); | ||
|
||
expect(await countCollectionChildren(coll)).to.equal(0); | ||
expect(await countCollectionChildren(randomCol)).to.equal(2); | ||
}); | ||
|
||
it('on nested document', async () => { | ||
const doc = randomCol.doc('bob/parentsCol/daniel'); | ||
await firestore.recursiveDelete(doc); | ||
|
||
const docSnap = await doc.get(); | ||
expect(docSnap.exists).to.be.false; | ||
expect(await countDocumentChildren(randomCol.doc('bob'))).to.equal(1); | ||
expect(await countCollectionChildren(randomCol)).to.equal(3); | ||
}); | ||
|
||
it('on leaf document', async () => { | ||
const doc = randomCol.doc('bob/parentsCol/daniel/childCol/ernie'); | ||
await firestore.recursiveDelete(doc); | ||
|
||
const docSnap = await doc.get(); | ||
expect(docSnap.exists).to.be.false; | ||
expect(await countCollectionChildren(randomCol)).to.equal(5); | ||
}); | ||
|
||
it('does not affect other collections', async () => { | ||
const collA = randomCol.doc('bob').collection('parentsCol'); | ||
|
||
// Add other nested collection under 'bob' that shouldn't be deleted. | ||
const collB = randomCol.doc('bob').collection('pets'); | ||
await collB.doc('doggo').set({name: 'goodboi'}); | ||
|
||
await firestore.recursiveDelete(collA); | ||
|
||
expect(await countCollectionChildren(collA)).to.equal(0); | ||
expect(await countCollectionChildren(randomCol)).to.equal(3); | ||
expect(await countCollectionChildren(collB)).to.equal(1); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is now a similar test to what you have above (something akin to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for clarifying what you meant. Added a test for that. |
||
}); | ||
|
||
it('with custom BulkWriter instance', async () => { | ||
const bulkWriter = firestore.bulkWriter(); | ||
let callbackCount = 0; | ||
bulkWriter.onWriteResult(() => { | ||
callbackCount++; | ||
}); | ||
await firestore.recursiveDelete(randomCol, bulkWriter); | ||
expect(callbackCount).to.equal(6); | ||
}); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we also get a test that has "randomCollA/docA" and "randomCollB/docB" and only deletes randomCollA? There might be some nuances in how to delete parent collection (in other ports) and I want to make sure that all of our clients handle this case correctly. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. good point, done. |
||
}); | ||
|
||
it('can retry failed writes with a provided callback', async () => { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't think about the fact that this would break if multiple Firestore instances are used together. I am sightly inclined to ask you to undo this change since it would simplify the instance creation and its lifetime management (at the cost of further increasing the size of Firestore).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see arguments either way, but at this point, I'm more in favor of leaving it on the Firestore instance, since trying to track the different firestore instances to pass into BulkWriter is more confusing than just having the BulkWriter instance available on the Firestore instance.
Reverted back to before.