YACHT-1295: documented not supported use cases #136
Pull Request Test Coverage Report for Build 1100
💛 - Coveralls |
Pull Request Test Coverage Report for Build 1184
💛 - Coveralls |
NOT_SUPPORTED_USE_CASES.md
Outdated
@@ -0,0 +1,42 @@
# Rare cases in which backups are not supported by BBQ |
Known (rare) cases where BBQ backups may not work as expected
NOT_SUPPORTED_USE_CASES.md
Outdated
#### Prerequisites
* Part of data needs to be deleted
* Deleted part is whole BigQuery internal chunk of data
* There is no further changes to the partition |
There are no further data changes in that partition
NOT_SUPPORTED_USE_CASES.md
Outdated

## Data stored in [__UNPARTITIONED__ partition](https://cloud.google.com/bigquery/docs/querying-partitioned-tables#ingestion-time_partitioned_tables_unpartitioned_partition) for a long time

When data is slowly streamed to BigQuery partitioned table, then that data might be moved from `__UNPARTITIONED__` partition into correct one after several hours. `lastModifiedTime` is set to time of streaming (which might be several hours ago), not when data was moved between partitions. This results in not backing up new part of data, because BBQ looks at time of last backup. |
Replace:
This results in not backing up new part of data, because BBQ looks at time of last backup.
with:
BBQ might not discover this "late-streamed" data and skip the backup.
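For readers following this thread, here is a minimal sketch of why such late-streamed data can be skipped. It assumes, as the quoted paragraph suggests, that the backup decision compares the partition's `lastModifiedTime` with the time of the last backup. The function name `needs_backup` and the timestamps are hypothetical and are not taken from the BBQ code base.

```python
from datetime import datetime


def needs_backup(last_modified: datetime, last_backup: datetime) -> bool:
    """Hypothetical check: back up only if the data changed after the last backup."""
    return last_modified > last_backup


# Illustrative timeline (made-up values):
streamed_at = datetime(2018, 1, 1, 10, 0)   # rows land in __UNPARTITIONED__
last_backup = datetime(2018, 1, 1, 12, 0)   # backup runs before the rows are moved
moved_at = datetime(2018, 1, 1, 16, 0)      # rows are moved into the target partition

# Per the quoted paragraph, lastModifiedTime reflects the streaming time,
# not the time the rows were moved between partitions.
partition_last_modified = streamed_at

# The moved rows look older than the last backup, so the partition is skipped.
skipped = not needs_backup(partition_last_modified, last_backup)
```

If `lastModifiedTime` were instead set to `moved_at`, the same check would schedule a backup, which is why this case only shows up when the partition sees no further changes.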
NOT_SUPPORTED_USE_CASES.md
Outdated
* Ingestion-time partitioned table
* Data is streamed to the BigQuery without specifying `partitionId`
* Ingestion is very slow so that `__UNPARTITIONED__` partition store data for more than 24 hours
* There is no further changes to the partition |
There are no further changes to the partition
NOT_SUPPORTED_USE_CASES.md
Outdated

## Backing up empty partition

Due to asynchronous nature of scheduling backup for table/partition it is possible that: |
... it can happen that:
NOT_SUPPORTED_USE_CASES.md
Outdated

Due to asynchronous nature of scheduling backup for table/partition it is possible that:
1. Source data is modified and backup is scheduled for given table/partition
1. Between scheduling copy-job task and this task execution, the data is deleted manually or by partition expiration |
Data is deleted manually (or by partition expiration) after copy-job is scheduled but before task execution
NOT_SUPPORTED_USE_CASES.md
Outdated

#### Prerequisites
* Backup is scheduled for given table/partition, i.e. `lastModifiedTime` is modified
* Between scheduling copy-job task and this task execution, the data is deleted manually or by partition expiration |
Data is deleted manually (or by partition expiration) after copy-job is scheduled but before task execution
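The ordering described in this comment can be illustrated with a small, self-contained simulation. This is only a sketch: the in-memory `partitions` dict, the `task_queue`, and the helper names are hypothetical stand-ins for the source table, the task queue, and the copy job, and do not reflect how BBQ is actually implemented.

```python
from collections import deque

# Hypothetical in-memory stand-ins for source partitions and their backups.
partitions = {"20180101": ["row1", "row2"]}
backups = {}
task_queue = deque()


def schedule_backup(partition_id):
    """Enqueue a copy task; only the partition reference is stored, not the data."""
    task_queue.append(partition_id)


def run_next_task():
    """Execute the oldest copy task against whatever the partition holds right now."""
    partition_id = task_queue.popleft()
    backups[partition_id] = list(partitions.get(partition_id, []))


schedule_backup("20180101")    # 1. data was modified, copy task is scheduled
partitions["20180101"] = []    # 2. data is deleted (manually or by partition expiration)
run_next_task()                # 3. the task finally runs and copies an empty partition
```

Because the task carries only a reference to the partition, the backup reflects the state at execution time, which in this sketch is an empty partition.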