[Help] telegraf and monitor fail after upgrading v3.11.3 => v3.11.9 #21957

@chenjacken

Description

After upgrading from v3.11.3 to v3.11.9, both telegraf and monitor fail to start.

telegraf logs:

[root@master1 ocboot]# kubectl -n onecloud get onecloudclusters default -o=jsonpath='{.spec.version}'
v3.11.9


[root@master1 ocboot]# kubectl logs default-telegraf-t777f -n onecloud
[info 250112 01:36:04 all.init.0(all.go:222)] init onecloud executor client, socket path: /hostfs/run/onecloud/exec.sock
2025-01-11T17:36:04Z I! Starting Telegraf 
2025-01-11T17:36:04Z E! [telegraf] Error running agent: Error: no outputs found, did you provide a valid config file?
[root@master1 ocboot]# 
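
The "no outputs found" message generally means telegraf loaded a config that contains no `[[outputs.*]]` table. A minimal sketch for checking that, assuming the generated config has been copied out of the pod to a local file (the helper name `has_telegraf_outputs` and the file path are hypothetical, not part of onecloud or telegraf):

```shell
# Hypothetical helper: succeeds if the given telegraf config file
# declares at least one output plugin table such as [[outputs.influxdb]].
has_telegraf_outputs() {
    grep -Eq '^[[:space:]]*\[\[outputs\.[A-Za-z0-9_]+\]\]' "$1"
}

# Usage (path is an assumption):
#   has_telegraf_outputs ./telegraf.conf && echo "outputs present" || echo "no outputs"
```

If the check fails, the config was probably rendered without a TSDB target, which would be consistent with the monitor error below.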

monitor logs:

[info 2025-01-11 17:43:12 cloudcommon.InitDB(database.go:122)] using inmemory lockman
[info 2025-01-11 17:43:12 db.CheckSync(models.go:116)] Start check database schema: autoSync(true), enableChecksumTables(false), skipInitChecksum(false)
[warning 2025-01-11 17:43:12 db.CheckSync(models.go:155)] table __default__-alerts_tbl-enabled-created_at-updated_at-update_version-deleted_at-deleted-id-description-is_emulated-name-status-progress-domain_id-tenant_id-frequency-settings-level-message-used_by-execution_error-for-eval_data-state-no_data_state-execution_error_state-last_state_change-state_changes-customize_config-res_type has been synced!
[warning 2025-01-11 17:43:12 db.CheckSync(models.go:155)] table __default__-alerts_tbl-enabled-created_at-updated_at-update_version-deleted_at-deleted-id-description-is_emulated-name-status-progress-domain_id-tenant_id-frequency-settings-level-message-used_by-execution_error-for-eval_data-state-no_data_state-execution_error_state-last_state_change-state_changes-customize_config-res_type has been synced!
[warning 2025-01-11 17:43:12 db.CheckSync(models.go:155)] table __default__-alerts_tbl-enabled-created_at-updated_at-update_version-deleted_at-deleted-id-description-is_emulated-name-status-progress-domain_id-tenant_id-frequency-settings-level-message-used_by-execution_error-for-eval_data-state-no_data_state-execution_error_state-last_state_change-state_changes-customize_config-res_type has been synced!
[info 2025-01-11 17:43:12 informer.(*EtcdBackendForClient).StartClientWatch(etcd_client.go:84)] /onecloud/informer watched
[info 2025-01-11 17:43:12 informer.NewWatchManagerBySessionBg.func1(watcher.go:51)] callback with watchMan success.
[info 2025-01-11 17:43:12 db.setDbConnection(database.go:60)] Total 27 db workers, set db connection max
[info 2025-01-11 17:43:12 service.startServices(service.go:113)] Initializing dataSourceManager
goroutine 170 [running]:
runtime/debug.Stack()
        /usr/lib/go/src/runtime/debug/stack.go:24 +0x5e
runtime/debug.PrintStack()
        /usr/lib/go/src/runtime/debug/stack.go:16 +0x13
yunion.io/x/log.Fatalf({0x235bad3, 0x1a}, {0xc00125bfa8, 0x2, 0x2})
        /root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/log/log.go:138 +0x2c
yunion.io/x/onecloud/pkg/monitor/service.startServices()
        /root/go/src/yunion.io/x/onecloud/pkg/monitor/service/service.go:115 +0x19f
created by yunion.io/x/onecloud/pkg/monitor/service.StartService in goroutine 1
        /root/go/src/yunion.io/x/onecloud/pkg/monitor/service/service.go:77 +0x134
[info 2025-01-11 17:43:12 worker.(*Worker).Start(worker.go:66)] start to get api Resource
[fatal 2025-01-11 17:43:12 service.startServices(service.go:115)] Service dataSourceManager init failed: get default TSDB source: [get internal service type "influxdb": catalog.GetServiceURLs: No such service influxdb: NotFoundError, get internal service type "victoria-metrics": catalog.GetServiceURLs: No such service victoria-metrics: NotFoundError]
[root@master1 ocboot]# 

victoria-metrics pod:

[root@master1 ~]# kubectl get pods -n onecloud |grep victoria-metrics
default-victoria-metrics-5d6b86fc9d-snjvs            1/1     Running            0          121m
[root@master1 ~]# 

[root@master1 ~]# kubectl logs -n onecloud $(kubectl get pods -n onecloud | grep monitor | awk '{print $1}') | grep 'TSDB data source'
[root@master1 ~]# 


[root@master1 ~]# climc endpoint-list --search victoria-metrics --details
+----------------------------------+-----------+----------------------------------+------------------+------------------+----------------------------------------+-----------+---------+
|                ID                | Region_ID |            Service_ID            |   Service_Name   |   Service_Type   |                  URL                   | Interface | Enabled |
+----------------------------------+-----------+----------------------------------+------------------+------------------+----------------------------------------+-----------+---------+
| d950494f9db547998937167798e4306f | region0   | 4374e8a091d5448b8c1c44d44cb4644d | victoria-metrics | victoria-metrics | https://172.16.1.200:30428             | public    | true    |
| af956cdacf0547168a558bbddc3f23ea | region0   | 4374e8a091d5448b8c1c44d44cb4644d | victoria-metrics | victoria-metrics | https://default-victoria-metrics:30428 | internal  | true    |
+----------------------------------+-----------+----------------------------------+------------------+------------------+----------------------------------------+-----------+---------+
***  Total: 2 Pages: 1 Limit: 20 Offset: 0 Page: 1  ***
[root@master1 ~]# 
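
Since monitor looks up the service through its internal endpoint, it can help to pull just the internal rows out of the table above. A rough sketch, assuming the pipe-delimited column layout printed by `climc endpoint-list --details` as shown (the function name `internal_endpoints` is hypothetical):

```shell
# Hypothetical helper: read a climc endpoint table on stdin and print
# "<Service_Type> <URL>" for every row whose Interface column is "internal".
internal_endpoints() {
    awk -F'|' 'NF >= 9 {
        # Trim the padding around each cell before comparing.
        for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        if ($8 == "internal") print $6, $7
    }'
}

# Usage:
#   climc endpoint-list --search victoria-metrics --details | internal_endpoints
```

Border lines and the pagination footer contain no `|` separators, so they are skipped by the field-count guard.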

Could you please take a look and advise how to resolve this? Thanks!
