Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bigtable-backup: Tool for creating and managing periodic tables stored in Bigtable #729

Merged
merged 3 commits into from
Aug 2, 2019

Conversation

sandeepsukhani
Copy link
Contributor

Relies on this tool: https://github.com/grafana/bigtable-backup

It has 2 commands:

  1. backup-active-periodic-table: For backing up the active periodic table
  2. ensure-backups: It helps ensure right tables are backed up and retain only single backup for non-active periodic tables. It also ensures we have a backup created after table becomes non-active i.e after the last possible timestamp in the period for which table was active.

Usage:

python3 bigtable-backup.py --help
usage: bigtable-backup.py [-h] --bigtable-project-id BIGTABLE_PROJECT_ID
                          --bigtable-instance-id BIGTABLE_INSTANCE_ID
                          --bigtable-table-id-prefix BIGTABLE_TABLE_ID_PREFIX
                          --destination-path DESTINATION_PATH --temp-prefix
                          TEMP_PREFIX --periodic-table-duration
                          PERIODIC_TABLE_DURATION
                          {backup-active-periodic-table,ensure-backups} ...

positional arguments:
  {backup-active-periodic-table,ensure-backups}
                        commands
    backup-active-periodic-table
                        Backup active periodic table
    ensure-backups      Ensure backups of right tables exist

optional arguments:
  -h, --help            show this help message and exit
  --bigtable-project-id BIGTABLE_PROJECT_ID
                        The ID of the GCP project of the Cloud Bigtable
                        instance
  --bigtable-instance-id BIGTABLE_INSTANCE_ID
                        The ID of the Cloud Bigtable instance that contains
                        the tables
  --bigtable-table-id-prefix BIGTABLE_TABLE_ID_PREFIX
                        Prefix to build IDs of the tables using periodic-
                        table-duration
  --destination-path DESTINATION_PATH
                        GCS path where data should be written. For example,
                        gs://mybucket/somefolder/
  --temp-prefix TEMP_PREFIX
                        Path and filename prefix for writing temporary files.
                        ex: gs://MyBucket/tmp
  --periodic-table-duration PERIODIC_TABLE_DURATION
                        Periodic config set for loki tables in days

Makefile Outdated Show resolved Hide resolved
Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM to me but I don't have enough knowledge to approve. I think @gouthamve should have a look.

Copy link
Member

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I think this is super brittle. We need some checks and monitoring here. When will this exit unsuccessfully?


while True:
line = popen.stdout.readline()
if i == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be a bool. But this seems hacky, can we add a json output to the tool and we parse json?

bigtable-backup .... -ojson? And we parse the json here.


backup_active_periodic_table_parser = subparser.add_parser('ensure-backups',
help="Ensure backups of right tables exist")
backup_active_periodic_table_parser.add_argument('--period-from', required=True, type=valid_date)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have a time-range, like 90 days? --period-for? And make sure only one is set.

if last_timestamp_from_table_number > timestamps[-1]:
# Checking whether backup is created after last timestamp of tables period.
create_backup(table_id, args)
elif len(timestamps) > 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need of an elif here. We need both checks always.

@tomwilkie tomwilkie removed their request for review July 30, 2019 13:08
@sandeepsukhani
Copy link
Contributor Author

@gouthamve Thanks for the review! All the requested changes are done.

Copy link
Member

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits, other than that good to go!

registry = CollectorRegistry()
bigtable_backup_job_last_run_unixtime = Gauge('bigtable_backup_job_last_run_unixtime', 'Last time a bigtable backup job ran at', registry=registry)
bigtable_backup_job_last_success_unixtime = Gauge('bigtable_backup_job_last_success_unixtime', 'Last time a bigtable backup job successfully finished', registry=registry)
bigtable_backup_job_runtime_seconds = Gauge('bigtable_backup_job_runtime_seconds', 'Runtime of last successfully finished bigtable backup job', registry=registry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add 2 more metrics: backups_created and backups_existing?

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
bigtable_backup_job_last_run_unixtime = Gauge('bigtable_backup_job_last_run_unixtime', 'Last time a bigtable backup job ran at', registry=registry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be _seconds and not _unixtime.

It has 2 commands:
1. backup-active-periodic-table: For backing up active periodic table
2. ensure-backups: It helps ensure right tables are backed up and retain only single backup for non active periodic tables
@sandeepsukhani sandeepsukhani merged commit 9930f70 into grafana:master Aug 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants