This repository has been archived by the owner on Aug 25, 2023. It is now read-only.

Async copy job is unable to copy table with id longer than 500 chars #24

Merged
przemyslaw-jasinski merged 3 commits into master from YACHT_948 on Aug 3, 2018

Conversation

przemyslaw-jasinski (Contributor)

No description provided.

coveralls commented Aug 2, 2018

Pull Request Test Coverage Report for Build 381

  • 2 of 2 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.007%) to 82.257%

Totals (Coverage Status):
  • Change from base Build 372: +0.007%
  • Covered Lines: 2012
  • Relevant Lines: 2446

💛 - Coveralls

# but still valid.
#
# The con is - if task_name is None,
# that means it's not unique and the same task may be invoked more than once.
Member:

and the same backup may be created more than once
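
A minimal sketch of the approach being discussed in this thread, assuming a hypothetical `create_copy_job_task_name` helper and the 500-character task name limit from the PR title; this is not the actual BBQ code:

```python
# Minimal sketch, not the actual BBQ implementation; the helper name and the
# 500-character limit constant are assumptions based on this PR.
MAX_TASK_NAME_LENGTH = 500


def create_copy_job_task_name(source_table_id, task_name_suffix):
    """Return a unique task name, or None if the calculated name is too long.

    Returning None keeps the copy job schedulable (the task queue assigns a
    name itself), at the cost of losing name-based deduplication, so the same
    task - and therefore the same backup - may be created more than once.
    """
    task_name = "{}_{}".format(source_table_id, task_name_suffix)
    if len(task_name) > MAX_TASK_NAME_LENGTH:
        return None
    return task_name
```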

# which protects us from failures where the source table name is very long,
# but still valid.
#
# The con is - if task_name is None,
Member:

con -> disadvantage

def test_return_none_if_calculated_name_is_too_long(self):
    # given
    task_name_suffix = ""
Member:

maybe a more Pythonic way:
task_name_suffix = "x" * 501

przemyslaw-jasinski (Author):

LOL, thanks :D
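
With the suggestion applied, the test could look roughly like this; the class name and the inline helper below are assumptions, restating the sketch from the earlier thread rather than BBQ's actual test code:

```python
import unittest


# Hypothetical helper restated from the sketch earlier in this conversation.
def create_copy_job_task_name(source_table_id, task_name_suffix, limit=500):
    task_name = "{}_{}".format(source_table_id, task_name_suffix)
    return None if len(task_name) > limit else task_name


class TestCopyJobTaskName(unittest.TestCase):

    def test_return_none_if_calculated_name_is_too_long(self):
        # given - a suffix long enough to push the name over the 500-char limit
        task_name_suffix = "x" * 501

        # when
        task_name = create_copy_job_task_name("project:dataset.table_id",
                                              task_name_suffix)

        # then
        self.assertIsNone(task_name)


if __name__ == "__main__":
    unittest.main()
```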

README.md Outdated
* Modification of table metadata (including the table description) qualifies the table to be backed up in the next cycle. This can be a problem for partitioned tables, where such a change updates the last modified time in every partition. BBQ will then back up all partitions again, even though there was no actual change in the partition data,
* There's a 10,000 [copy jobs per project per day limit](https://cloud.google.com/bigquery/quotas#copy_jobs), which you may hit on the first day. This limit can be increased by Google Support,
* Data in the table streaming buffer will be backed up on the next run, once the buffer is flushed. BBQ uses [copy-job](https://cloud.google.com/bigquery/docs/managing-tables#copy-table) for creating backups and *"Records in the streaming buffer are not considered when a copy or extract job runs"* (check [Life of a BigQuery streaming insert](https://cloud.google.com/blog/big-data/2017/06/life-of-a-bigquery-streaming-insert) for more details),
* BBQ may back up tables whose names are longer than 400 characters more than once.
Member:

When a table name is longer than 400 characters, in rare cases BBQ may back up the table more than once. Such backup duplicates are automatically removed by the retention process.

przemyslaw-jasinski merged commit 408f67a into master on Aug 3, 2018
przemyslaw-jasinski deleted the YACHT_948 branch on August 3, 2018 11:26