xapi or state.db broken after XCP-NG 7.6 Upgrade from 7.5 #94

Closed
cocoon opened this issue Nov 3, 2018 · 36 comments

@cocoon cocoon commented Nov 3, 2018

As discussed here:
https://xcp-ng.org/forum/topic/575/vmware-xcp-ng-7-6-iso-upgrade-from-7-5-network-gone

After upgrading 2 VMware VMs, xapi is not working anymore because it can't read the database state.db.
The upgrade was tested with both ISO and YUM, with the same problem.

Problems that are caused by this:

No network adapter is shown in the console menu, and the management interface cannot be set up with "Configure Management Interface" because it offers no interface to select, only "".
On the command line, however, the interfaces and bridges are still there, and ping/ssh from/to the IP still works.

xe commands are not working:

xe network-list
Error: Connection refused (calling connect )

On the console the following message appears:

xapi-nbd[6121]: main: Failed to log in via xapi's Unix domain socket in 300.000000 seconds
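
Both symptoms above point at xapi not listening on its socket. A minimal sketch for checking that directly, assuming the socket path `/var/lib/xcp/xapi` that shows up in the xensource.log entries further down (the helper name `unix_socket_accepts` is made up for this example):

```python
import socket

# Path as it appears in the "UNIX /var/lib/xcp/xapi" log entries; this is an
# assumption and may differ on other installations.
XAPI_SOCKET = "/var/lib/xcp/xapi"


def unix_socket_accepts(path, timeout=2.0):
    """Return True if something accepts connections on the Unix socket at `path`."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers a refused connection, a missing socket file, and timeouts.
        return False
    finally:
        sock.close()


state = "accepting connections" if unix_socket_accepts(XAPI_SOCKET) else "not reachable"
print(f"{XAPI_SOCKET}: {state}")
```

This only tells whether a listener exists; it says nothing about why xapi failed to start, which is what the log excerpt below is for.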

The real problem:

xapi error in /var/log/xensource.log:

Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace] server_init D:7b76fe698182 failed with exception Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace] Raised Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace] 1/1 xapi @ xen-01 Raised at file (Thread 0 has no backtrace table. Was with_backtraces called?, line 0
Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace]
Nov  1 09:25:52 xen-01 xapi: [debug|xen-01|0 ||xapi] xapi top-level caught exception: INTERNAL_ERROR: [ missing column; Cluster; network ]
Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace] Raised Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace] 1/1 xapi @ xen-01 Raised at file (Thread 0 has no backtrace table. Was with_backtraces called?, line 0
Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace]
Nov  1 09:25:53 xen-01 xapi: [ warn|xen-01|0 ||xapi] Duplicate configuration keys in Xcp_service.configure: disable-logging-for in [ use-switch; switch-path; search-path; pidfile; log; daemon; disable-logging-for; loglevel; inventory; config; config-dir; master_connection_reset_timeout; master_connection_retry_timeout; master_connection_default_timeout; qemu_dm_ready_timeout; hotplug_timeout; pif_reconfigure_ip_timeout; pool_db_sync_interval; pool_data_sync_interval; domain_shutdown_total_timeout; emergency_reboot_delay_base; emergency_reboot_delay_extra; ha_xapi_healthcheck_interval; ha_xapi_healthcheck_timeout; ha_xapi_restart_attempts; ha_xapi_restart_timeout; logrotate_check_interval; rrd_backup_interval; session_revalidation_interval; update_all_subjects_interval; wait_memory_target_timeout; snapshot_with_quiesce_timeout; host_heartbeat_interval; host_assumed_dead_interval; fuse_time; db_restore_fuse_time; inactive_session_timeout; pending_task_timeout; completed_task_timeout; minimum_time_between_bounces; minimum_time_between_reboot_with_no_added_delay; ha_monitor_interval; ha_monitor_plan_interval; ha_monitor_startup_timeout; ha_default_timeout_base; guest_liveness_timeout; permanent_master_failure_retry_interval; redo_log_max_block_time_empty; redo_log_max_block_time_read; redo_log_max_block_time_writedelta; redo_log_max_block_time_writedb; redo_log_max_startup_time; redo_log_connect_delay; default-vbd3-polling-duration; default-vbd3-polling-idle-threshold; vm_call_plugin_interval; xapi_clusterd_port; sm-plugins; hotfix-fingerprint; logconfig; writereadyfile; writeinitcomplete; nowatchdog; log-getter; onsystemboot; relax-xsm-sr-check; disable-logging-for; disable-dbsync-for; xenopsd-queues; xenopsd-default; nvidia-whitelist; igd-passthru-vendor-whitelist; gvt-g-whitelist; mxgpu-whitelist; pass-through-pif-carrier; cluster-stack-default; ciphersuites-good-outbound; ciphersuites-legacy-outbound; gpumon_stop_timeout; reboot_required_hfxs; xen_livepatch_list; 
kpatch_list; modprobe_path; db_idempotent_map; post-install-scripts-dir; gpg-homedir; xen-cmdline; cluster-stack-root; web-dir; tools-sr-dir; sm-dir; udhcpd-skel; db-config-file; pool_config_file; fcoe-driver; xen-cmdline-script; static-vdis; xsh; xe-toolstack-restart; xe; host-restore; host-backup; upload-wrapper; update-mh-info; logs-download; xe-syslog-reconfigure; set-hostname; host-bugreport-upload; fence; vhd-tool; sparse_dd; redo-log-block-device-io; pbis-force-domain-leave-script; busybox; xapissl; startup-script-hook; rolling-upgrade-script-hook; xapi-message-script; non-managed-pifs; update-issue; killall; nbd-firewall-config; firewall-port-config; nbd_client_manager; pool_secret_path; udhcpd-conf; remote-db-conf-file; logconfig; cpu-info-file; server-cert-path; iscsi_initiatorname; master-scripts-dir; packs-dir; xapi-hooks-root; xapi-plugins-root; xapi-extensions-root; static-vdis-root ]
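
Since the same exception is repeated many times in the log, it may help to condense it. The following is a rough sketch, not an official tool: it pulls every `Db_exn.DBCache_NotFound(...)` triple out of log lines and counts them. The function name `summarize_db_errors` is invented here, and the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Matches e.g. Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
DB_NOT_FOUND = re.compile(
    r'Db_exn\.DBCache_NotFound\("([^"]*)",\s*"([^"]*)",\s*"([^"]*)"\)'
)


def summarize_db_errors(lines):
    """Count each (kind, table, key) triple found in the given log lines."""
    counts = Counter()
    for line in lines:
        for match in DB_NOT_FOUND.finditer(line):
            counts[match.groups()] += 1
    return counts


sample = [
    'Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace] server_init D:7b76fe698182 '
    'failed with exception Db_exn.DBCache_NotFound("missing column", "Cluster", "network")',
    'Nov  1 09:25:52 xen-01 xapi: [error|xen-01|0 ||backtrace] '
    'Raised Db_exn.DBCache_NotFound("missing column", "Cluster", "network")',
    'Nov  1 09:25:52 xen-01 xapi: [debug|xen-01|0 ||xapi] xapi top-level caught exception: '
    'INTERNAL_ERROR: [ missing column; Cluster; network ]',
]

for (kind, table, key), n in summarize_db_errors(sample).most_common():
    print(f"{n:5d}  {kind}: {table}.{key}")
```

To run it against a real log, pass an open file object (for example `/var/log/xensource.log`) instead of `sample`.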

Details about the environment:

  • VMware ESXi v6.0.0 6921384
  • 3 network adapters, 2 adapters on one network and the one in the middle on another network
  • VM nic tested: Intel E1000E and also changing to E1000
  • I think the VM was initially installed with Citrix Xen Server 7.4 and then upgraded to XCP-NG 7.4, or fresh installed with XCP-NG 7.4 directly (can't remember); it was then upgraded to 7.5 and now to 7.6
  • It was a cluster with a pool, was also used with CloudStack, and had multiple VLANs
  • OVS backend network enabled

The same problem was reported on the forum for a cluster of Dell M600 servers.

@cocoon cocoon (Author) commented Nov 3, 2018

New Test: Upgrade XCP-NG 7.5 to Citrix Xen Server 7.6
Outcome: same problem here

@tltow tltow commented Nov 4, 2018

I can add a cluster of 3 ProLiant DL320e G8 to this list. Upgrading the master from XCP-NG 7.4 to XCP-NG 7.5, and later to 7.6, leads to the same issue described here. Let me know if I can do something to help find the source of the problem. I upgraded via the ISO. Restoring 7.4 solved the issue both times.

@olivierlambert olivierlambert (Member) commented Nov 4, 2018

@cocoon so it could be an XS issue in the first place? Can you try going from XS 7.5 to XS 7.6, so that we can "rule out" an XCP-ng-specific issue?

@cocoon cocoon (Author) commented Nov 4, 2018

@olivierlambert I was thinking the same. I could try to upgrade XCP-NG 7.5 to Xen Server 7.5 and then to Xen Server 7.6 (to stay close to a supported upgrade path).

Or upgrade XCP-NG 7.5 to XS 7.5 (or do a fresh install) and restore from state.db; that should work, since I expect it to be the same version?

@tltow tltow commented Nov 4, 2018

My cluster was also XS-based before I upgraded it to XCP-NG 7.4. I have another cluster of 2 similar ProLiant servers, which went the same way from XS to XCP-NG, but that cluster did not show this behaviour. I was able to upgrade it to 7.5 and today to 7.6?!

@olivierlambert olivierlambert (Member) commented Nov 4, 2018

@cocoon For bug-chasing purposes, the easiest diagnostic would be to install a fresh XS 7.5 and upgrade it to XS 7.6.

@cocoon cocoon (Author) commented Nov 5, 2018

XS 7.5 fresh installation, only setting the management IP (no cluster, no OVS ...), then upgrading to XS 7.6: no problem

XCP-NG 7.5 --> xe pool-dump-database
install fresh XS 7.5 --> restore dump = OK
upgrade to XS 7.6 = problem is back

I have also seen this:

Nov  5 08:33:10 xen-01 xcp-networkd: [ warn|xen-01|2 ||network_monitor_thread] Error in IP watcher: Unix.Unix_error(Unix.ECONNREFUSED, "connect", "")#012Raised by primitive operation at file "lib/rpc_client.ml", line 31, characters 6-25#012Re-raised at file "lib/rpc_client.ml", line 35, characters 6-13#012Called from file "lib/rpc_client.ml", line 188, characters 6-40#012Called from file "ocaml/xapi-client/client.ml", line 13, characters 4-28#012Called from file "ocaml/xapi-client/client.ml", line 3422, characters 6-78#012Called from file "networkd/network_monitor_thread.ml", line 264, characters 15-93#012Called from file "networkd/network_monitor_thread.ml", line 299, characters 3-30#012Called from file "networkd/network_monitor_thread.ml", line 306, characters 3-10
Nov  5 08:33:10 xen-01 xcp-networkd: [ info|xen-01|2 ||network_monitor_thread] (Re)started IP watcher thread
Nov  5 08:34:31 xen-01 xapi: [ warn|xen-01|0 ||xapi] Duplicate configuration keys in Xcp_service.configure: disable-logging-for in [ use-switch; switch-path; search-path; pidfile; log; daemon; disable-logging-for; loglevel; inventory; config; config-dir; master_connection_reset_timeout; master_connection_retry_t

and:

Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |Setup DB configuration D:90a5218ac4b1|xapi] parsing db config file
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |Setup DB configuration D:90a5218ac4b1|xapi] [/var/lib/xcp/state.db]
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |Setup DB configuration D:90a5218ac4b1|xapi] mode:no_limit
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |Setup DB configuration D:90a5218ac4b1|xapi] format:xml
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |Setup DB configuration D:90a5218ac4b1|xapi] available_this_boot:true
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |Setup DB configuration D:90a5218ac4b1|xapi]
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |server_init D:d49c51c13603|startup] task [starting up database engine]
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |server_init D:d49c51c13603|dummytaskhelper] task starting up database engine D:d22ee5cf1ee2 created by task D:d49c51c13603
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|xapi] Attempting to populate database from one of these locations: [/var/lib/xcp/restore_db.db]
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|xapi] Dbconf contains: /var/lib/xcp/restore_db.db (generation 0)
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|xapi] Most recent db is /var/lib/xcp/restore_db.db (generation 0)
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|sql] attempting to restore database from /var/lib/xcp/restore_db.db
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|sql] database unmarshalled, schema version = 5.142
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|xapi] About to flush database: /var/lib/xcp/state.db
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|32 dbflush [/var/lib/xcp/state.db]||sql] In memory DB flushing thread created [/var/lib/xcp/state.db].
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|sql] XML backend [/var/lib/xcp/state.db] -- Write buffer flushed. Time: 0.025880
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:d22ee5cf1ee2|xapi] Performing initial DB GC
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] Connector PBD (OpaqueRef:fdc2545f-a9df-4d95-a7ee-bfb6d03b9209) has invalid refs [ref_1: INVALID; ref_2: valid]. Attempting to GC...
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] Connector PBD (OpaqueRef:aefdcf1f-ea89-46f0-beb6-a355b994d68a) has invalid refs [ref_1: INVALID; ref_2: valid]. Attempting to GC...
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] Connector PBD (OpaqueRef:9e36753f-0524-4d12-9c74-e5e1ef30c9b3) has invalid refs [ref_1: INVALID; ref_2: valid]. Attempting to GC...
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] Connector PBD (OpaqueRef:95c286f6-3cdd-4cf1-ba16-6f64ff4ddc1c) has invalid refs [ref_1: INVALID; ref_2: valid]. Attempting to GC...
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] Connector PBD (OpaqueRef:6243fe12-4a35-40c5-a5fd-e6c7e8c4b159) has invalid refs [ref_1: INVALID; ref_2: valid]. Attempting to GC...
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] Connector PBD (OpaqueRef:14bc8e35-764e-42e1-b4b9-2e2307157dee) has invalid refs [ref_1: INVALID; ref_2: valid]. Attempting to GC...
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] GCed PGPU OpaqueRef:7242c718-2f54-429e-92b7-e16228d0ea36
Nov  5 08:34:35 xen-01 xapi: [debug|xen-01|0 |DB GC D:3e2faeeb71c4|db_gc_util] session_log: active_sessions=0 (0 pool, 0 anon, 0 named - 0 groups)
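
The log above says state.db is stored as XML, so a "missing column" can in principle be looked for in the file itself. Here is a small sketch under a loud assumption: that the file is laid out as `<database>` containing `<table name="...">` elements whose `<row>` children carry the columns as attributes. The sample document, the `cluster_stack` attribute, and the helper name `rows_missing_column` are hypothetical and only there to make the example runnable.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of a state.db file. The table/row/attribute layout is
# an assumption about the XML format, not something confirmed in this thread.
SAMPLE_DB = """<database>
  <table name="Cluster">
    <row ref="OpaqueRef:example" cluster_stack="corosync"/>
  </table>
  <table name="network">
    <row ref="OpaqueRef:example-net" bridge="xenbr0"/>
  </table>
</database>"""


def rows_missing_column(root, table, column):
    """Return the refs of rows in `table` that lack the attribute `column`."""
    missing = []
    for tbl in root.iter("table"):
        if tbl.get("name") != table:
            continue
        for row in tbl.iter("row"):
            if column not in row.attrib:
                missing.append(row.get("ref"))
    return missing


root = ET.fromstring(SAMPLE_DB)
print(rows_missing_column(root, "Cluster", "network"))
```

For a real check, work on a copy of the file and load it with `ET.parse(path_to_copy).getroot()`; if the layout assumption does not hold, the function simply returns an empty list.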

and

Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|7 |dbsync (update_env) R:c25a98cc5535|network_utils] /sbin/ip link show dev eth0
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|7 |dbsync (update_env) R:c25a98cc5535|network_utils] Looking for link/ether in [2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT qlen 1000#012    link/ether 00:50:56:84:dc:c2 brd ff:ff:ff:ff:ff:ff#012]
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|7 |dbsync (update_env) R:c25a98cc5535|network_utils] Found at [ 15 ]
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|9 |dbsync (update_env) R:c25a98cc5535|network_utils] /sbin/ip link show dev xenbr0
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|9 |dbsync (update_env) R:c25a98cc5535|network_utils] Looking for mtu in [8: xenbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1#012    link/ether 00:50:56:84:dc:c2 brd ff:ff:ff:ff:ff:ff#012]
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|9 |dbsync (update_env) R:c25a98cc5535|network_utils] Found at [ 3 ]
Nov  5 08:34:36 xen-01 xcp-networkd: [error|xen-01|10 |dbsync (update_env) R:c25a98cc5535|network_utils] Error in read one line of file: /sys/class/net/eth0/device/sriov_totalvfs, exception (Sys_error#012  "/sys/class/net/eth0/device/sriov_totalvfs: No such file or directory")#012Raised by primitive operation at file "pervasives.ml", line 366, characters 28-54#012Called from file "pervasives.ml" (inlined), line 371, characters 2-45#012Called from file "lib/network_utils.ml", line 132, characters 16-28
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|10 |dbsync (update_env) R:c25a98cc5535|network_utils] /opt/xensource/libexec/fcoe_driver --xapi eth0 capable
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|11 |dbsync (update_env) R:c25a98cc5535|network_utils] /sbin/ip link show dev eth1
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|11 |dbsync (update_env) R:c25a98cc5535|network_utils] Looking for link/ether in [3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT qlen 1000#012    link/ether 00:50:56:84:83:ea brd ff:ff:ff:ff:ff:ff#012]
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|11 |dbsync (update_env) R:c25a98cc5535|network_utils] Found at [ 15 ]
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|13 |dbsync (update_env) R:c25a98cc5535|network_utils] /sbin/ip link show dev xenbr1
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|13 |dbsync (update_env) R:c25a98cc5535|network_utils] Looking for mtu in [6: xenbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1#012    link/ether 00:50:56:84:83:ea brd ff:ff:ff:ff:ff:ff#012]
Nov  5 08:34:36 xen-01 xcp-networkd: [ info|xen-01|13 |dbsync (update_env) R:c25a98cc5535|network_utils] Found at [ 3 ]
Nov  5 08:34:36 xen-01 xapi: [debug|xen-01|0 |dbsync (update_env) R:c25a98cc5535|xapi] PIF OpaqueRef:c4ca370a-308a-4e9e-a397-107900994a2a MTU <- 1500
Nov  5 08:34:36 xen-01 xcp-networkd: [error|xen-01|14 |dbsync (update_env) R:c25a98cc5535|network_utils] Error in read one line of file: /sys/class/net/eth1/device/sriov_totalvfs, exception (Sys_erBroadcast message from systemd-journald@xen-01 (Mon 2018-11-05 09:15:09 CET):y")#012Raised by primitive operation at file "pervasives.ml", line 366, characters 28-54#012Called from file "pervasives.ml" (inlined), line 371, characters 2-45#012Called from file "lib/network_utils.ml", line 132, characters 16-28
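
The `#012` sequences in these lines are escaped newlines (octal 012), which is how syslog typically writes control characters. A short sketch to make such backtraces readable again; the function name `decode_syslog_escapes` is made up, and only escapes in the control-character range are touched:

```python
import re

# '#' followed by three octal digits in the range 000-037 (control characters).
OCTAL_ESCAPE = re.compile(r"#(0[0-3][0-7])")


def decode_syslog_escapes(text):
    """Replace #NNN octal escapes (e.g. #012 for newline) with the real character."""
    return OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), text)


line = (
    'Error in IP watcher: Unix.Unix_error(Unix.ECONNREFUSED, "connect", "")'
    '#012Raised by primitive operation at file "lib/rpc_client.ml", line 31, characters 6-25'
    '#012Re-raised at file "lib/rpc_client.ml", line 35, characters 6-13'
)
print(decode_syslog_escapes(line))
```

Each `#012` becomes a real line break, so the OCaml backtrace prints one frame per line.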

and:

Nov  5 08:34:36 xen-01 xapi: [debug|xen-01|87 UNIX /var/lib/xcp/xapi|VM.get_allowed_VBD_devices D:1c6d7217cb0d|audit] VM.get_allowed_VBD_devices: VM = '0cb69029-73ba-4cd1-a6f1-379856465d1a (Control domain on host: xen-01)'
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] SR.get_uuid D:f64d067d4057 failed with exception Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] Raised Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 1/10 xapi @ xen-01 Raised at file ocaml/database/db_cache_impl.ml, line 54
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 2/10 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 58
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 3/10 xapi @ xen-01 Called from file ocaml/xapi/db_actions.ml, line 15056
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 4/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 227
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 5/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 236
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 6/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 73
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 7/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 91
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 8/10 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|90 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ba0cfb8fedfc|backtrace] 9/10 xapi @ xen-01 Called from file map.ml, line 122
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] SR.get_uuid D:735323fb42ec failed with exception Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] Raised Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 1/10 xapi @ xen-01 Raised at file ocaml/database/db_cache_impl.ml, line 54
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 2/10 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 58
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 3/10 xapi @ xen-01 Called from file ocaml/xapi/db_actions.ml, line 15056
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 4/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 227
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 5/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 236
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 6/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 73
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 7/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 91
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 8/10 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|91 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:ae0dbf0255ae|backtrace] 9/10 xapi @ xen-01 Called from file map.ml, line 122
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] SR.get_uuid D:313a0f70e5c6 failed with exception Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] Raised Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 1/10 xapi @ xen-01 Raised at file ocaml/database/db_cache_impl.ml, line 54
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 2/10 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 58
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 3/10 xapi @ xen-01 Called from file ocaml/xapi/db_actions.ml, line 15056
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 4/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 227
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 5/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 236
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 6/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 73
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 7/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 91
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 8/10 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 9/10 xapi @ xen-01 Called from file map.ml, line 122
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace] 10/10 xapi @ xen-01 Called from file src0/sexp_conv.ml, line 150
Nov  5 08:34:36 xen-01 xapi: [error|xen-01|92 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:adefc039f20b|backtrace]
Nov  5 08:34:37 xen-01 xapi: [debug|xen-01|136 UNIX /var/lib/xcp/xapi|VM.get_allowed_VBD_devices D:11a95c3c996e|audit] VM.get_allowed_VBD_devices: VM = '0cb69029-73ba-4cd1-a6f1-379856465d1a (Control domain on host: xen-01)'
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] SR.get_uuid D:22f47d5333f0 failed with exception Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] Raised Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 1/10 xapi @ xen-01 Raised at file ocaml/database/db_cache_impl.ml, line 54
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 2/10 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 58
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 3/10 xapi @ xen-01 Called from file ocaml/xapi/db_actions.ml, line 15056
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 4/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 227
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 5/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 236
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 6/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 73
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 7/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 91
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 8/10 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace] 9/10 xapi @ xen-01 Called from file map.ml, line 122
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|139 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:85b0a316e357|backtrace]
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] SR.get_uuid D:083e44a6a916 failed with exception Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] Raised Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 1/10 xapi @ xen-01 Raised at file ocaml/database/db_cache_impl.ml, line 54
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 2/10 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 58
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 3/10 xapi @ xen-01 Called from file ocaml/xapi/db_actions.ml, line 15056
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 4/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 227
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 5/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 236
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 6/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 73
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 7/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 91
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 8/10 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 9/10 xapi @ xen-01 Called from file map.ml, line 122
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace] 10/10 xapi @ xen-01 Called from file src0/sexp_conv.ml, line 150
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|140 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cf0024366c2e|backtrace]
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] SR.get_uuid D:2d6177b60343 failed with exception Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] Raised Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 1/10 xapi @ xen-01 Raised at file ocaml/database/db_cache_impl.ml, line 54
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 2/10 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 58
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 3/10 xapi @ xen-01 Called from file ocaml/xapi/db_actions.ml, line 15056
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 4/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 227
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 5/10 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 236
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 6/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 73
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 7/10 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 91
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 8/10 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 9/10 xapi @ xen-01 Called from file map.ml, line 122
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace] 10/10 xapi @ xen-01 Called from file src0/sexp_conv.ml, line 150
Nov  5 08:34:37 xen-01 xapi: [error|xen-01|141 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:e124a3e26e2c|backtrace]
Nov  5 08:34:38 xen-01 xapi: [debug|xen-01|0 |dbsync (update_env) R:c25a98cc5535|dbsync] Sync: sync_local_vdi_activations
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:fd2b8be2-234b-44a0-98e7-a1fcbfa6beec (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:f9840f6b-89e0-4fe9-a1c4-9cc991a180cc (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:efb94cf0-cd3d-4a91-b7d7-b1295a1c3a13 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:e712ac7c-8acc-4dc2-bb32-aeca84ed49f6 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:e5fd32e0-98bf-4fcb-ba50-bd194cf88185 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:d0e39672-0e31-426b-a998-be4017d7f748 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:c7d6ebbe-aeca-426a-9c5d-1b10400fc646 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:c557c1c4-e6fc-499f-9b9b-1de10a0c5082 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:c2ae00cb-519a-498e-bb58-4439e9ca9178 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:bca56647-d746-46e3-a585-983d52c430d6 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:aa3df68f-a063-4fc6-87e2-b3b3ad0cfd40 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:a848c9c3-23f3-46a6-a6e6-3dbcd9419372 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:9dd01b6f-bee3-4038-abd3-b56f56c5bbf4 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:902638d0-5129-4282-8662-5da77209c9e8 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:85c6f485-a7b4-4bb1-9d15-5c3fd1fa86a5 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:7c13cabd-6637-42c5-a176-0d6295354008 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:56075d48-75dc-49ec-ad15-de0682764744 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:51eb72fa-cf6f-49e5-a489-4bec98ca6bcd (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:4c35d4e2-2739-449c-b472-7a4cb7773423 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:4bc91ec5-9cf1-44de-9456-4534b5cd9e2b (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:3bec1302-74cf-4bb1-a739-298e25fe0f21 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:387b4350-d0c7-4209-842a-87f4d5d9b87a (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:3668a2d1-026e-4bd7-8e1f-25905c6b893b (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:2cb86cd9-c622-42b0-9a39-993c78258b97 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:28414707-a611-499a-8aed-ff36f95a119f (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:2558d4fc-a824-449d-b62f-b0e51f103506 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:23c5102f-7062-46ef-8d24-a5487000a28c (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:23117ce9-ca9e-4d5f-ac95-d646d771a092 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:209ccfb9-7de5-468b-ba55-faa6840d7f4e (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:134805b2-d7bf-4de0-9b96-7e4df2d1ebdf (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:0e9f66b5-c436-41ac-a84f-7614b729c5d9 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:06ee4f16-9362-4571-877c-d83d20ad73b7 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:02c42cd6-8f96-4bdf-916f-8f35b71ad647 (because it was leaked (pool join?))
Nov  5 08:34:38 xen-01 xapi: [ info|xen-01|0 |dbsync (update_env) R:c25a98cc5535|storage_access] Unlocking VDI OpaqueRef:018c06e9-5c62-4819-9779-08d8733e8352 (because it was leaked (pool join?))

leading to:

Nov  5 09:21:16 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:09cf7695eab3|xapi] Dbconf contains: /var/lib/xcp/state.db (generation 0)
Nov  5 09:21:16 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:09cf7695eab3|xapi] Most recent db is /var/lib/xcp/state.db (generation 0)
Nov  5 09:21:16 xen-01 xapi: [debug|xen-01|0 |starting up database engine D:09cf7695eab3|sql] attempting to restore database from /var/lib/xcp/state.db
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] starting up database engine D:09cf7695eab3 failed with exception Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] Raised Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 1/19 xapi @ xen-01 Raised at file ocaml/database/schema.ml, line 89
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 2/19 xapi @ xen-01 Called from file ocaml/database/db_xml.ml, line 127
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 3/19 xapi @ xen-01 Called from file list.ml, line 111
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 4/19 xapi @ xen-01 Called from file ocaml/database/db_xml.ml, line 124
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 5/19 xapi @ xen-01 Called from file ocaml/database/db_xml.ml, line 156
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 6/19 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 7/19 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 8/19 xapi @ xen-01 Called from file ocaml/database/backend_xml.ml, line 43
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 9/19 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 276
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 10/19 xapi @ xen-01 Called from file ocaml/database/db_cache_impl.ml, line 369
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 11/19 xapi @ xen-01 Called from file ocaml/xapi/xapi.ml, line 79
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 12/19 xapi @ xen-01 Called from file ocaml/xapi/xapi.ml, line 93
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 13/19 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 14/19 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 15/19 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 80
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 16/19 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 99
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 17/19 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 18/19 xapi @ xen-01 Called from file hashtbl.ml, line 194
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace] 19/19 xapi @ xen-01 Called from file lib/debug.ml, line 92
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 |server_init D:c5717aeed231|backtrace]
Nov  5 09:21:16 xen-01 xapi: [ warn|xen-01|0 |server_init D:c5717aeed231|startup] task [starting up database engine] exception: Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace] server_init D:c5717aeed231 failed with exception Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace] Raised Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace] 1/1 xapi @ xen-01 Raised at file (Thread 0 has no backtrace table. Was with_backtraces called?, line 0
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace]
Nov  5 09:21:16 xen-01 xapi: [debug|xen-01|0 ||xapi] xapi top-level caught exception: INTERNAL_ERROR: [ missing column; Cluster; network ]
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace] Raised Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace] 1/1 xapi @ xen-01 Raised at file (Thread 0 has no backtrace table. Was with_backtraces called?, line 0
Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace]
Nov  5 09:21:17 xen-01 xapi: [ warn|xen-01|0 ||xapi] Duplicate configuration keys in Xcp_service.configure: disable-logging-for in [ use-switch; switch-path; search-path; pidfile; log; daemon; disable-logging-for; loglevel; inventory; config; config-dir; master_connection_reset_timeout; master_connection_retry_timeout; master_connection_default_timeout; qemu_dm_ready_timeout; hotplug_timeout; pif_reconfigure_ip_timeout; pool_db_sync_interval; pool_data_sync_interval; domain_shutdown_total_timeout; emergency_reboot_delay_base; emergency_reboot_delay_extra; ha_xapi_healthcheck_interval; ha_xapi_healthcheck_timeout; ha_xapi_restart_attempts; ha_xapi_restart_timeout; logrotate_check_interval; rrd_backup_interval; session_revalidation
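
Since the same exception is repeated many times in the log, it can help to reduce the log to the distinct database errors. The following is only a minimal sketch: the regular expression is derived from the log lines quoted above, and the function name `find_db_errors` is hypothetical, not part of xapi.

```python
import re

# Pattern built from the exception text as it appears in the log lines above.
NOT_FOUND = re.compile(
    r'Db_exn\.DBCache_NotFound\("([^"]*)",\s*"([^"]*)",\s*"([^"]*)"\)'
)

def find_db_errors(lines):
    """Return the distinct (kind, table, key) triples in first-seen order."""
    seen = []
    for line in lines:
        match = NOT_FOUND.search(line)
        if match and match.groups() not in seen:
            seen.append(match.groups())
    return seen

# Two lines copied from the log above; the second one has no exception in it.
sample = [
    'Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace] Raised '
    'Db_exn.DBCache_NotFound("missing column", "Cluster", "network")',
    'Nov  5 09:21:16 xen-01 xapi: [error|xen-01|0 ||backtrace] 1/1 xapi @ xen-01 '
    'Raised at file (Thread 0 has no backtrace table. Was with_backtraces called?, line 0',
]

for kind, table, key in find_db_errors(sample):
    print(f"{kind}: table={table} key={key}")
```

On a real host the lines would come from /var/log/xensource.log instead of the `sample` list.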

Member

@olivierlambert olivierlambert commented Nov 5, 2018

I'm not sure I understand. So if you install XS 7.5 the same way you installed XCP-ng 7.5 (same choices, same IP/networks), then upgrade to XS 7.6, it works?

Author

@cocoon cocoon commented Nov 5, 2018

No, my first test was a deliberately simple one: a freshly installed XenServer 7.5 without any complex settings, not even SRs.

I don't know how to get the configuration back exactly as it was before because, as I said, I had a more complex configuration from CloudStack: multiple networks, VLANs, SRs, NFS, Cluster, OVS backend ...

I will try to configure it manually as closely as possible, build a cluster, and try the upgrade again.

Member

@olivierlambert olivierlambert commented Nov 5, 2018

Thank you, this could help to see if it comes from your "complex" setup in the meantime 👍
Do you have the same issue with a simple fresh XCP-ng 7.5 with a simple config, then upgraded to XCP-ng 7.6?

Author

@cocoon cocoon commented Nov 5, 2018

@olivierlambert I will do the check with a simple XCP-NG config when I find the time.

While manually trying to re-add one NFS datastore that looked like this in the config:

device_config="(('options'%.'')%.('server'%.'10.x.x.x')%.('serverpath'%.'/mnt/pool1/x/xx/store')%.('nfsversion'%.'4'))"

It failed, and the interesting thing in the XenCenter GUI is that if I scan the server, it finds the existing SR with its ID, but it automatically changes from NFS v4 to NFS v3 and doesn't allow me to select v4.

I don't know if this causes a problem, since state.db has nfsversion 4, which I think should mean NFS v4?
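
To read such a device_config value more easily, the encoded map can be split into key/value pairs. This is a rough sketch that assumes the encoding is exactly what the quoted line shows, `('key'%.'value')` pairs joined by `%.`, and that no value contains a quote or other escaped character. The name `decode_map` is made up for the example.

```python
import re

# One ('key'%.'value') pair, as seen in the device_config line quoted above.
PAIR = re.compile(r"\('([^']*)'%\.'([^']*)'\)")

def decode_map(encoded):
    """Turn the encoded map into a plain dict (no escape handling)."""
    return dict(PAIR.findall(encoded))

device_config = (
    "(('options'%.'')%.('server'%.'10.x.x.x')"
    "%.('serverpath'%.'/mnt/pool1/x/xx/store')%.('nfsversion'%.'4'))"
)

cfg = decode_map(device_config)
for key, value in cfg.items():
    print(f"{key} = {value!r}")
```

With the value quoted above, `cfg["nfsversion"]` is the string `'4'`, which is the setting the GUI silently replaced with v3.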

I have now added a new SR with a new ID on that NFS share, and it worked.

An error in the logs (01e0b78e-c674-e837-0728-f28b8a951291 is the old folder on the NFS share):

Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] SR.get_by_uuid D:18f64a847844 failed with exception Db_exn.Read_missing_uuid("SR", "", "01e0b78e-c674-e837-0728-f28b8a951291")
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] Raised Db_exn.Read_missing_uuid("SR", "", "01e0b78e-c674-e837-0728-f28b8a951291")
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 1/9 xapi @ xen-01 Raised at file ocaml/database/db_cache_impl.ml, line 178
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 2/9 xapi @ xen-01 Called from file ocaml/xapi/db_actions.ml, line 15008
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 3/9 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 227
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 4/9 xapi @ xen-01 Called from file ocaml/xapi/rbac.ml, line 236
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 5/9 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 73
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 6/9 xapi @ xen-01 Called from file ocaml/xapi/server_helpers.ml, line 91
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 7/9 xapi @ xen-01 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 8/9 xapi @ xen-01 Called from file map.ml, line 122
Nov  5 12:31:21 xen-01 xapi: [error|xen-01|3471 INET :::80|dispatch:SR.get_by_uuid D:f5655d5cbc0b|backtrace] 9/9 xapi @ xen-01 Called from file src0/sexp_conv.ml, line 150
Nov  5 12:31:21 xen-01 xapi: [ info|xen-01|3487 |Async.SR.introduce R:ec19cc9dfdd2|dispatcher] spawning a new thread to handle the current task (trackid=6d56af1131b252c5e7a44fe45238603c)
Nov  5 12:31:21 xen-01 xapi: [debug|xen-01|3487 |Async.SR.introduce R:ec19cc9dfdd2|audit] SR.introduce: uuid = '01e0b78e-c674-e837-0728-f28b8a951291'; name label = 'store'

Author

@cocoon cocoon commented Nov 6, 2018

Can someone confirm that in the state.db in the pool entry there should be a valid/existing SR-uuid for crash_dump_SR, default_SR and suspend_image_SR? Or how can I check this uuid?

<table name="pool">
<row ... crash_dump_SR="OpaqueRef:3665de72-9228-4085-8b46-a3091ada353e" ... default_SR="OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087" ... suspend_image_SR="OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087" uuid="9cc45668-9bb3-927e-e983-12aff0e3512d" .../>
</table>

because I see this error:

Nov  6 11:50:10 xen-01 xapi: [error|xen-01|82 UNIX /var/lib/xcp/xapi|dispatch:SR.get_uuid D:cdca078c7488|backtrace] SR.get_uuid D:aa1805cba36f failed with exception Db_exn.DBCache_NotFound("missing row", "SR", "OpaqueRef:9ad35328-7324-4d4e-827e-16810a56d087")

I don't have an SR with such an ID, but if I check the parameters manually (or do the same from the console of each host), only valid/existing SR uuids are reported:

xe sr-list
uuid ( RO)                : e2b1d887-2481-a8eb-89e5-605944242cf5
          name-label ( RW): XCP-ng Tools
    name-description ( RW): XCP-ng Tools ISOs
                host ( RO): <shared>
                type ( RO): iso
        content-type ( RO): iso


uuid ( RO)                : a700d28c-9aa0-a01c-36fc-c4123af83e66
          name-label ( RW): DVD drives
    name-description ( RW): Physical DVD drives
                host ( RO): xen-02
                type ( RO): udev
        content-type ( RO): iso


uuid ( RO)                : 1c5d9b81-4c31-1127-ee17-2ad550d05ad9
          name-label ( RW): Removable storage
    name-description ( RW):
                host ( RO): xen-02
                type ( RO): udev
        content-type ( RO): disk


uuid ( RO)                : c5b59b85-3425-9a28-974c-1c4844070a03
          name-label ( RW): Removable storage
    name-description ( RW):
                host ( RO): xen-01
                type ( RO): udev
        content-type ( RO): disk


uuid ( RO)                : a948da58-5bef-94d4-ad02-fc23120bc68c
          name-label ( RW): a948da58-5bef-94d4-ad02-fc23120bc68c
    name-description ( RW): Cloud Stack Local EXT Storage Pool for 6325c5ea-27f4-4526-b0a7-1db1ba9b7fd7
                host ( RO): xen-02
                type ( RO): ext
        content-type ( RO): user


uuid ( RO)                : 2bd72944-6d47-fc04-0726-ec3fce2f2e33
          name-label ( RW): 2bd72944-6d47-fc04-0726-ec3fce2f2e33
    name-description ( RW): Cloud Stack Local EXT Storage Pool for 7056acef-ad16-41c7-a2c3-506aa834d9b6
                host ( RO): xen-01
                type ( RO): ext
        content-type ( RO): user


uuid ( RO)                : 5ad5cbd6-f71e-49ea-feac-65ae7d6e627e
          name-label ( RW): DVD drives
    name-description ( RW): Physical DVD drives
                host ( RO): xen-01
                type ( RO): udev
        content-type ( RO): iso


[root@xen-01 ~]# xe pool-param-get uuid=9cc45668-9bb3-927e-e983-12aff0e3512d param-name=crash-dump-SR
2bd72944-6d47-fc04-0726-ec3fce2f2e33
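
The values stored in the pool row are OpaqueRefs, not uuids, so they never show up in `xe sr-list`. One way to check them is to look the reference up in the SR table of state.db. The sketch below assumes that every `<row>` carries a `ref` attribute holding its OpaqueRef next to its `uuid`; the sample database, its refs and its uuid are invented placeholders, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented miniature database: refs and uuid are placeholders, not real values.
SAMPLE_DB = """<database>
  <table name="SR">
    <row ref="OpaqueRef:sr-1" uuid="00000000-0000-0000-0000-000000000001"/>
  </table>
  <table name="pool">
    <row ref="OpaqueRef:pool-1" crash_dump_SR="OpaqueRef:sr-1" default_SR="OpaqueRef:sr-gone"/>
  </table>
</database>"""

def sr_refs_to_uuids(root):
    """Map each SR row's OpaqueRef to its uuid."""
    return {
        row.get("ref"): row.get("uuid")
        for table in root.iter("table")
        if table.get("name") == "SR"
        for row in table.iter("row")
    }

def check_pool_srs(db_xml, fields=("crash_dump_SR", "default_SR", "suspend_image_SR")):
    """Return {field: uuid or None}; None marks a reference with no SR row."""
    root = ET.fromstring(db_xml)
    known = sr_refs_to_uuids(root)
    result = {}
    for table in root.iter("table"):
        if table.get("name") != "pool":
            continue
        for row in table.iter("row"):
            for field in fields:
                ref = row.get(field)
                if ref is not None:
                    result[field] = known.get(ref)
    return result

for field, uuid in check_pool_srs(SAMPLE_DB).items():
    print(field, "->", uuid if uuid else "dangling reference")
```

A field reported as dangling would correspond to the "missing row" error above: the pool still points at an SR that no longer has a row in the database.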

Author

@cocoon cocoon commented Nov 6, 2018

I think I found the "one" checkbox that caused this trouble:
You should not enable "Clustering" on the pool properties ...

Member

@olivierlambert olivierlambert commented Nov 6, 2018

"Clustering"? what's that thing, and where you configure it?

Author

@cocoon cocoon commented Nov 6, 2018

pool-clustering

Author

@cocoon cocoon commented Nov 6, 2018

And the very best thing about clustering 2 servers is that there is no quorum and the second server will restart if the first one reboots ^^

Not that I would have needed that option ... it was a "test"

Member

@olivierlambert olivierlambert commented Nov 6, 2018

GFS2 is NOT supported on XCP-ng, this option shouldn't be available (it's not exposed in XO). @borzel we need to disable it or it will potentially have side effects like this.

Author

@cocoon cocoon commented Nov 6, 2018

I created a separate issue for it: xcp-ng/xenadmin#115

Member

@olivierlambert olivierlambert commented Nov 6, 2018

Great, thanks! 👍


@Ehnix Ehnix commented Nov 6, 2018

Thank you very much cocoon

I had the same problem when updating xcp 7.4 to xcp 7.5; xapi was not running:

Nov  6 14:49:59 localhost xapi: [debug|localhost|0 |starting up database engine D:307281139104|sql] attempting to restore database from /var/lib/xcp/state.db
Nov  6 14:49:59 localhost xapi: [error|localhost|0 |server_init D:93452b5a8d6a|backtrace] starting up database engine D:307281139104 failed with exception Db_exn.DBCache_NotFound("missing column", "Cluster", "network")
Nov  6 14:49:59 localhost xapi: [error|localhost|0 |server_init D:93452b5a8d6a|backtrace] Raised Db_exn.DBCache_NotFound("missing column", "Cluster", "network")

I disabled the cluster option in XCP-ng Center but did not restart the toolstack, so after the upgrade I removed the cluster settings from the state.db file:

<table name="Cluster">..</table><table name="Cluster_host">..</table>

Now everything works fine.

Thanks again.
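
The manual edit described in this comment could also be scripted against a copy of the file. What follows is only a sketch under assumptions: it treats state.db as one XML document whose `<table>` elements are direct children of the root, and re-serialising with ElementTree may change whitespace or attribute quoting compared to what xapi wrote. The helper name `drop_cluster_tables` and the sample content are made up; keep a backup and stop xapi before touching the real file.

```python
import xml.etree.ElementTree as ET

def drop_cluster_tables(db_xml, names=("Cluster", "Cluster_host")):
    """Return the database XML with the named tables removed entirely."""
    root = ET.fromstring(db_xml)
    for table in list(root.findall("table")):
        if table.get("name") in names:
            root.remove(table)
    return ET.tostring(root, encoding="unicode")

# Invented sample; real tables hold many more attributes per row.
SAMPLE_DB = (
    '<database>'
    '<table name="Cluster"><row ref="OpaqueRef:c-1"/></table>'
    '<table name="Cluster_host"><row ref="OpaqueRef:ch-1"/></table>'
    '<table name="VM"><row ref="OpaqueRef:vm-1"/></table>'
    '</database>'
)

cleaned = drop_cluster_tables(SAMPLE_DB)
remaining = [t.get("name") for t in ET.fromstring(cleaned).findall("table")]
print(remaining)
```

Only the two cluster tables are dropped; every other table is passed through unchanged.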

Author

@cocoon cocoon commented Nov 7, 2018

@olivierlambert It turned out to be fixed for v7.6 here:
#70
with "report restrict_corosync as true"

But the problem still exists for v7.5: 7.5 still reports that it supports Clustering/GFS2.

I thought it might be good to fix the v7.5 ISO (https://updates.xcp-ng.org/isos/7.5/) and the yum updates,
and maybe add a note to the upgrade guide telling people to check that the Clustering option is disabled and to update to the latest 7.5 before upgrading to 7.6:
https://xcp-ng.org/blog/2018/10/31/xcp-ng-7-6-upgrade/

What do you think?

Member

@olivierlambert olivierlambert commented Nov 7, 2018

That's one of the reasons why having a fork of XenCenter without doing very careful checks on side effects can be dangerous. I'd recommend using xe or Xen Orchestra to avoid these issues :/

I'll see what we can do with @stormi about this on our side (XCP-ng side)

Member

@stormi stormi commented Nov 7, 2018

What would happen if 7.5 started reporting that it does not support clustering? Would it break for the same reason as it breaks after an upgrade to 7.6?

Author

@cocoon cocoon commented Nov 7, 2018

But it would also happen if someone came from Citrix XenServer and had it enabled.

Author

@cocoon cocoon commented Nov 7, 2018

@stormi: I can try it if you give me an updated daemon for 7.5.

Member

@stormi stormi commented Nov 7, 2018

@cocoon ok I'll build one

@stormi stormi added this to In Progress in Team board Nov 7, 2018
Member

@stormi stormi commented Nov 7, 2018

@cocoon here's an updated xcp-featured package built for 7.5: https://updates.xcp-ng.org/tmp/xcp-featured-1.1.1-1.el7.centos.x86_64.rpm

Member

@stormi stormi commented Nov 7, 2018

@olivierlambert It turned out to be fixed for v7.6 here:
#70
with "report restrict_corosync as true"

I believe this is what caused the issue after the upgrade actually. It looks like once the option is enabled it really doesn't like that support for it is removed from what the license daemon says... And instead of failing gracefully with an error message about that, it just breaks.

If this is true, a quick fix might be to re-enable the feature in the daemon, but:

  • some packages it depends on are not included anymore, so this could yield other errors
  • this would make it offered again in XCP-ng Center
  • this is just delaying the issue

We could also try to automatically change the setting upon upgrade, if there's a clean way to do it.

Or we can leave it as is, document the issue, how to solve it (I added a warning to the upgrade howto already), and quickly respond to anyone who gets stuck because of this enabled "clustering" option.

My preference currently balances between the second and the third solution, mostly depending on the number of affected users.

Member

@olivierlambert olivierlambert commented Nov 7, 2018

Believe it or not, I had a bad feeling when we removed this feature from the license daemon 😆

Author

@cocoon cocoon commented Nov 7, 2018

XCP-NG 7.5 does not seem very impressed by the changed daemon.
I can no longer disable clustering (there is no option for it in XCP-NG Center anymore). It did not break xapi, but clustering still seems to be enabled.

[root@xen-01 ~]# rpm -ivh xcp-featured-1.1.1-1.el7.centos.x86_64.rpm --force
Preparing...                          ################################# [100%]
Updating / installing...
   1:xcp-featured-1.1.1-1.el7.centos  ################################# [100%]


[root@xen-01 ~]# xe-toolstack-restart
Executing xe-toolstack-restart
Warning: v6d.service changed on disk. Run 'systemctl daemon-reload' to reload units.
^C
[root@xen-01 ~]# systemctl daemon-reload
[root@xen-01 ~]# xe-toolstack-restart
Executing xe-toolstack-restart
done.

## Reboot ##

grep "Cluster" xensource.log
Nov  7 15:48:59 xen-01 xapi: [debug|xen-01|0 |wait management interface to come up D:305b8f285526|xapi_cluster_host] Cluster_host.enable was successful for cluster_host: OpaqueRef:62d691c5-ac39-4e64-aa9b-6e1e378f1b51

Member

@stormi stormi commented Nov 9, 2018

I've deleted my previous comments because they were wrong. Keeping the previous version of the license daemon did not prevent the breakage.

All I can say for sure is that after upgrading XCP-ng 7.5 with clustering enabled to XCP-ng 7.6, the xapi database is not upgraded (in state.db, schema_minor_vsn stays at 142), probably because xapi fails while reading it from the file, before it can perform the database upgrade. Removing the <table name="Cluster"> and <table name="Cluster_host"> elements and waiting a bit results in an upgraded database (schema_minor_vsn = 203).
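
To tell whether a given state.db has been upgraded, the schema version can be read from the file. This sketch assumes the version is stored as a `<pair key="schema_minor_vsn" value="..."/>` element inside a manifest section at the top of the file; that layout and the sample strings are assumptions for illustration, and only the values 142 and 203 come from the observation above.

```python
import xml.etree.ElementTree as ET

def schema_minor_vsn(db_xml):
    """Return schema_minor_vsn as an int, or None if no such pair exists."""
    root = ET.fromstring(db_xml)
    for pair in root.iter("pair"):
        if pair.get("key") == "schema_minor_vsn":
            return int(pair.get("value"))
    return None

# Invented minimal documents; only the two version numbers are from this thread.
BEFORE = '<database><manifest><pair key="schema_minor_vsn" value="142"/></manifest></database>'
AFTER = '<database><manifest><pair key="schema_minor_vsn" value="203"/></manifest></database>'

for label, doc in (("before", BEFORE), ("after", AFTER)):
    print(label, schema_minor_vsn(doc))
```

A value that stays at 142 after the upgrade to 7.6 would indicate that xapi never got as far as the database upgrade.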

The error is raised by https://github.com/xapi-project/xen-api/blob/v1.110.0/ocaml/database/schema.ml#L89. Previous item in the backtrace is https://github.com/xapi-project/xen-api/blob/v1.110.0/ocaml/database/db_xml.ml#L127.

Here are the changes in the model regarding the Cluster table between 7.5 and 7.6:
xapi-project/xen-api@v1.90.5...v1.110.0#diff-c486a5fc675f36df8d81213187cb3278

Member

@stormi stormi commented Nov 12, 2018

If you get hit by this issue:

  • If you actually relied on clustering through GFS2, there is no solution to keep using it in XCP-ng: the xapi-clusterd and xapi-storage-plugins RPMs from Citrix are not open source.
  • If you did not rely on it, you probably accidentally enabled it in XCP-ng Center (which won't offer it anymore starting with XCP-ng 7.6, to avoid this). Two solutions:
    • 1/ Cautiously edit /var/xapi/state.db (it's XAPI's database, it contains all your configuration!) and remove this from the file: <table name="Cluster">[...]</table><table name="Cluster_host">[...]</table>. Reboot. Warning: it's been reported to work and I tested it personally, but I wouldn't go so far as to guarantee that the state of the database will remain consistent. All I can say is that in my tests, it seems to automatically set clustering back to false and thus very much looks like the database you would have if you had never enabled the clustering feature.
    • 2/ Reinstall your server. Safer if you can afford it.

If you did activate clustering, do not actually rely on it, and are about to upgrade: disable it first, restart the toolstack (xe-toolstack-restart) and check that the <table name="Cluster"/> tag in /var/xapi/state.db is now empty. Then upgrade.
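
The "is the Cluster table empty" check can be scripted before an upgrade. The sketch below counts the rows in the Cluster and Cluster_host tables of a database given as a string; it assumes that records are stored as `<row>` children of each `<table>`, and the function names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

def cluster_row_counts(db_xml):
    """Return {table name: number of rows} for the cluster tables present."""
    root = ET.fromstring(db_xml)
    return {
        table.get("name"): len(table.findall("row"))
        for table in root.iter("table")
        if table.get("name") in ("Cluster", "Cluster_host")
    }

def clustering_is_clear(db_xml):
    """True when no cluster table holds any row (or the tables are absent)."""
    return all(count == 0 for count in cluster_row_counts(db_xml).values())

# Invented samples: one with clustering disabled, one with leftover rows.
DISABLED = '<database><table name="Cluster"/><table name="Cluster_host"/></database>'
ENABLED = (
    '<database><table name="Cluster"><row ref="OpaqueRef:c-1"/></table>'
    '<table name="Cluster_host"><row ref="OpaqueRef:ch-1"/></table></database>'
)

print(clustering_is_clear(DISABLED), cluster_row_counts(ENABLED))
```

On a host, the string would be the content of /var/xapi/state.db read after the toolstack restart.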

I have added a warning in our upgrade howto https://github.com/xcp-ng/xcp/wiki/Upgrade-Howto to help people avoid this issue.

I'm closing this issue. Many thanks to @cocoon for reporting and debugging it and to @Ehnix for the workaround.

@stormi stormi closed this as completed Nov 12, 2018
Team board automation moved this from In Progress to Done Nov 12, 2018
Member

@borzel borzel commented Nov 12, 2018

Maybe we can create a check in XCP-ng and disable clustering on server start?

Member

@stormi stormi commented Nov 12, 2018

I thought of it, but for the time being I think I prefer that people know about the issue rather than mask it automatically (and what if they did rely on clustering and we disabled it automatically?).

Member

@borzel borzel commented Nov 12, 2018

hmm... yes this sounds ok. Was just thinking around :-)

Member

@olivierlambert olivierlambert commented Jul 29, 2019

#94 (comment) saved my life today 😆 (well, a customer's life in fact 😄)
