fix(firestore-bigquery-export): update snapshot script #1289

Merged
dackers86 merged 17 commits into next from @invertase/fix-resource-error
Nov 15, 2022

@cabljac cabljac commented Nov 8, 2022

Reopened #703 (rebased on next).

  • added relevant stress tests

  • added tests for the case raised in this comment: "Will this cover if there are null timestamps or the latest entry for a document has two entries with the same timestamp?"

  • added script to generate bq fixtures
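
For reference, a small sketch of why those two cases matter for a latest-row-per-document query that joins on MAX(timestamp), the approach the commits in this PR describe. It uses Python's built-in sqlite3 in place of BigQuery, and the table name, column names and rows are hypothetical, simplified stand-ins for the real changelog schema; it is not the extension's actual query.

```python
import sqlite3

# Hypothetical, simplified changelog table (not the extension's real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changelog (document_id TEXT, timestamp INTEGER, data TEXT)"
)
conn.executemany(
    "INSERT INTO changelog VALUES (?, ?, ?)",
    [
        ("tied", 5, "a"),
        ("tied", 5, "b"),          # two entries share the latest timestamp
        ("null_only", None, "c"),  # document with no usable timestamp
        ("mixed", None, "d"),      # NULL timestamp alongside a real one
        ("mixed", 1, "e"),
    ],
)

# Naive "latest row per document": join each row to its document's MAX(timestamp).
NAIVE_SQL = """
SELECT c.document_id, c.timestamp, c.data
FROM changelog AS c
JOIN (
    SELECT document_id, MAX(timestamp) AS latest_timestamp
    FROM changelog
    GROUP BY document_id
) AS latest
  ON c.document_id = latest.document_id
 AND c.timestamp = latest.latest_timestamp
ORDER BY c.document_id, c.data
"""

naive_rows = conn.execute(NAIVE_SQL).fetchall()
for row in naive_rows:
    print(row)
```

With this naive join, the "tied" document comes back twice and the "null_only" document disappears, because MAX ignores NULLs and a NULL timestamp never satisfies the equality in the join condition.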

fixes #757

@cabljac cabljac changed the title from "@invertase/fix resource error" to "fix(firestore-bigquery-export): update snapshot script" Nov 8, 2022
@cabljac cabljac marked this pull request as ready for review November 8, 2022 17:30
@cabljac cabljac requested a review from a team as a code owner November 8, 2022 17:30
@cabljac cabljac force-pushed the @invertase/fix-resource-error branch from fac48f5 to 7aa49a3 Compare November 9, 2022 09:20

cabljac commented Nov 9, 2022

Update

  • needs to handle null timestamps, and test needs to be updated to reflect this (partially solved)

sophie4869 and others added 9 commits November 10, 2022 13:50
The old script uses FIRST_VALUE and OVER, which sorts the entire changelog to find the first record for each document. This can cause an out-of-memory error when a BigQuery query reads from the latest snapshot. (Resources exceeded during query execution: The query could not be executed in the allotted memory. Peak usage: 110% of limit. Top memory consumer(s): sort operations used for analytic OVER() clauses: 96%)

The updated script selects the maximum timestamp for each document_id and joins that result back to the table on the latest timestamp instead.
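
A minimal sketch of the max-timestamp-and-join pattern described above, run against Python's built-in sqlite3 rather than BigQuery. The table name, column names and sample rows are assumptions made for illustration; the real snapshot script in the extension may differ in schema and detail.

```python
import sqlite3

# Hypothetical, simplified changelog: one row per document change.
rows = [
    ("doc1", 1, '{"v": 1}'),
    ("doc1", 3, '{"v": 3}'),
    ("doc2", 2, '{"v": 2}'),
    ("doc1", 2, '{"v": 2}'),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changelog (document_id TEXT, timestamp INTEGER, data TEXT)"
)
conn.executemany("INSERT INTO changelog VALUES (?, ?, ?)", rows)

# Step 1 (subquery): aggregate to the maximum timestamp per document_id.
# Step 2 (join): pull the full row that carries that timestamp.
# No window function is used, so nothing sorts the whole changelog.
LATEST_SQL = """
SELECT c.document_id, c.timestamp, c.data
FROM changelog AS c
JOIN (
    SELECT document_id, MAX(timestamp) AS latest_timestamp
    FROM changelog
    GROUP BY document_id
) AS latest
  ON c.document_id = latest.document_id
 AND c.timestamp = latest.latest_timestamp
ORDER BY c.document_id
"""

latest_rows = conn.execute(LATEST_SQL).fetchall()
for row in latest_rows:
    print(row)
```

The grouped subquery only has to track one running maximum per document, which is the property the commit message relies on to avoid the sort-heavy OVER() clause.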
…chema view. There's no need to find the latest value again.
@cabljac cabljac force-pushed the @invertase/fix-resource-error branch from d753143 to fc59a53 Compare November 10, 2022 13:55
@cabljac cabljac requested a review from dackers86 November 10, 2022 13:58
@cabljac cabljac force-pushed the @invertase/fix-resource-error branch from 4ffd7e3 to dfb68ec Compare November 14, 2022 16:53
@dackers86 dackers86 merged commit 6d5228a into next Nov 15, 2022
@dackers86 dackers86 deleted the @invertase/fix-resource-error branch November 15, 2022 09:05

Development

Successfully merging this pull request may close these issues.

[firestore-bigquery-export] Resources exceeded during query execution error

3 participants