
Unable to clean remote broken (can't stat metadata.json) #924

Closed
zurikus opened this issue May 15, 2024 · 6 comments

@zurikus

zurikus commented May 15, 2024

Hi, I'm using the latest version of the clickhouse-backup tool. It's really great and powerful and makes life easier, but at the moment I have a minor issue: I'm unable to clean up some broken remote metadata.

The backup itself no longer exists.

[root@dba01 tmp]# LOG_LEVEL=debug clickhouse-backup list remote
2024/05/15 17:01:47.595248  info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/15 17:01:47.597664  info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/05/15 17:01:47.597691  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 17:01:47.603832  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 17:01:47.606388  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 17:01:47.612704  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 17:01:47.740470 debug /tmp/.clickhouse-backup-metadata.cache.S3 load 0 elements logger=s3
2024/05/15 17:01:47.801144 debug /tmp/.clickhouse-backup-metadata.cache.S3 save 0 elements logger=s3
   ???   20/03/2024 10:36:21   remote      broken (can't stat metadata.json)
2024/05/15 17:01:47.801570  info clickhouse connection closed logger=clickhouse

Executing clean_remote_broken:

[root@dba01 tmp]# clickhouse-backup clean_remote_broken
2024/05/15 16:58:22.798382  info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/15 16:58:22.800333  info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/05/15 16:58:22.800404  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 16:58:22.804907  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 16:58:22.806342  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 16:58:22.812055  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 16:58:22.979528  info clickhouse connection closed logger=clickhouse
2024/05/15 16:58:22.979612  info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/15 16:58:22.981613  info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/05/15 16:58:22.981679  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 16:58:22.987974  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 16:58:22.990067  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 16:58:22.994326  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 16:58:23.126664  info SELECT value FROM system.build_options WHERE name='VERSION_INTEGER' logger=clickhouse
2024/05/15 16:58:23.129801  info SELECT countIf(name='type') AS is_disk_type_present, countIf(name='object_storage_type') AS is_object_storage_type_present, countIf(name='free_space') AS is_free_space_present, countIf(name='disks') AS is_storage_policy_present FROM system.columns WHERE database='system' AND table IN ('disks','storage_policies') logger=clickhouse
2024/05/15 16:58:23.139090  info SELECT d.path, any(d.name) AS name, any(d.type) AS type, min(d.free_space) AS free_space, groupUniqArray(s.policy_name) AS storage_policies FROM system.disks AS d LEFT JOIN (SELECT policy_name, arrayJoin(disks) AS disk FROM system.storage_policies) AS s ON s.disk = d.name GROUP BY d.path logger=clickhouse
2024/05/15 16:58:23.223063  info done backup= duration=243ms location=remote logger=RemoveBackupRemote operation=delete
2024/05/15 16:58:23.223254  info clickhouse connection closed logger=clickhouse

But it doesn't work; the broken entry is still listed.

Deleting /tmp/.clickhouse-backup-metadata.cache.S3 doesn't help either; the file reappears on the next backup list run.

Please suggest how to fix this issue.

@Slach
Collaborator

Slach commented May 15, 2024

Could you share the output of:
aws s3 ls s3://s3-bucket-from-config/s3-path-from-config/

 ???   20/03/2024 10:36:21   remote      broken (can't stat metadata.json) 

This means you have a key with the same name as the s3->path from your config.
You need to delete it manually, with something like:

aws s3 rm s3://s3-bucket-from-config/s3-path-from-config

and then check clickhouse-backup list remote again.
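To see why such a key shows up as a nameless broken entry, here is a minimal sketch (not the tool's actual code; the key names and the prefix-stripping logic are illustrative assumptions): when the listing of objects under the configured prefix contains the prefix itself as a zero-byte object, stripping the prefix leaves an empty backup name, and an empty name has no metadata.json to stat.

```shell
# Simulated S3 key listing under the configured prefix "clickhouse/".
# The first key IS the prefix itself -- a zero-byte "folder" marker.
keys="clickhouse/
clickhouse/test_backup/metadata.json"

# Strip the prefix and take the first path segment as the backup name.
for key in $keys; do
  name="${key#clickhouse/}"   # remove the configured prefix
  name="${name%%/*}"          # keep only the first path segment
  echo "backup name: '$name'"
done
# The marker key yields an empty backup name -- the "???" broken entry.
```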

@zurikus
Author

zurikus commented May 15, 2024

[root@dba01 tmp]# cat /etc/clickhouse-backup/config.yml
general:
 remote_storage: s3
 max_file_size: 1073741824
 backups_to_keep_local: 2 # keep 2 last backups locally.
 backups_to_keep_remote: 2 # s3 is responsible for cleanup.
 log_level: info
 download_concurrency: 1
 upload_concurrency: 1
clickhouse:
 username: default
 password: SecurePassHere
 host: localhost
 port: 9000
 disk_mapping: {}
 skip_tables:
 - system.*
 - INFORMATION_SCHEMA.*
 - information_schema.*
s3:
  access_key: SecureKeyHere
  secret_key: SecureSecretHere
  bucket: backups
  region: eu-central-1
  path: "/clickhouse/"

This is what the config looks like.

[root@dba01 tmp]# /usr/local/bin/aws s3 ls s3://backups/clickhouse/
2024-03-20 13:36:21          0

The command just dropped the root folder for the ClickHouse backups:

[root@dba01 tmp]# /usr/local/bin/aws s3 rm s3://backups/clickhouse/
delete: s3://backups/clickhouse/

The backup list now looks OK:

[root@dba01 tmp]# clickhouse-backup list remote
2024/05/15 17:27:10.414012  info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/15 17:27:10.417324  info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/05/15 17:27:10.417393  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 17:27:10.424550  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 17:27:10.426835  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 17:27:10.433860  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 17:27:10.601145  info clickhouse connection closed logger=clickhouse

But the "clickhouse" directory got created on the AWS S3 side again, and the broken entry is back:

[root@dba01 tmp]# clickhouse-backup list remote
2024/05/15 17:29:15.757570  info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/15 17:29:15.760031  info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/05/15 17:29:15.760056  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 17:29:15.765824  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 17:29:15.767730  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 17:29:15.782615  info SELECT macro, substitution FROM system.macros logger=clickhouse
   ???   15/05/2024 14:29:13   remote      broken (can't stat metadata.json)
2024/05/15 17:29:15.965758  info clickhouse connection closed logger=clickhouse

@Slach
Collaborator

Slach commented May 15, 2024

clickhouse-backup list remote can't create anything on S3; it is a read-only operation.

Are you sure you just ran clickhouse-backup list remote twice and didn't do anything else?

@zurikus
Author

zurikus commented May 15, 2024

I did create the "clickhouse" root directory on the AWS S3 side manually, myself.

Once the directory is back, the broken entry is back as well.

@Slach
Collaborator

Slach commented May 15, 2024

Why did you create it, and how?

S3 doesn't contain "directories".
It is a KEY->VALUE storage, which contains only keys whose names use prefixes with separators ("/").

try to remove again aws s3 rm s3://backups/clickhouse/
and change backup config

s3:
 path: clickhouse

instead of path: "/clickhouse/"
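For background, the "directory" a GUI tool creates is just a zero-byte object whose key ends in "/". A hedged sketch (the listing below is simulated plain text, not a real aws s3api call, and the 812-byte size is made up) of how such a folder marker can be spotted in a key listing:

```shell
# Simulated listing: "<key> <size>" per line. A GUI-created "folder"
# appears as a zero-byte object whose key ends with "/".
printf 'clickhouse/ 0\nclickhouse/test_backup/metadata.json 812\n' |
while read -r key size; do
  case "$key" in
    */) echo "folder marker: $key (size $size) -- delete with: aws s3 rm s3://backups/$key" ;;
  esac
done
```

With a real bucket, the equivalent check is the aws s3 ls output shown earlier in this thread: the zero-byte entry at the prefix is the marker to remove.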

@Slach Slach closed this as completed May 15, 2024
@zurikus
Author

zurikus commented May 15, 2024

OK, I've got it. It seems I misunderstood the concept of the "path" parameter from the documentation.
I thought it was the path to the directory where backups are located.

I created that "clickhouse" directory using the S3 Browser tool. I believe it could be done on the AWS side using the GUI as well.
I've made the changes in the config as you suggested and created a test backup.

Everything looks good now. Thank you for the fast response and support.

[root@dba01 clickhouse-backup]# clickhouse-backup list remote
2024/05/15 18:18:31.355430  info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/05/15 18:18:31.357843  info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/05/15 18:18:31.357882  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 18:18:31.364131  info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/05/15 18:18:31.365791  info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros'  SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/05/15 18:18:31.369204  info SELECT macro, substitution FROM system.macros logger=clickhouse
test_backup   24.45KiB   15/05/2024 15:17:02   remote   tar, regular
2024/05/15 18:18:31.496676  info clickhouse connection closed logger=clickhouse
