
bosh cck / hm resurrector broken with multi-cpi and different stemcells (e.g. vsphere and openstack) #2287

Closed
poblin-orange opened this issue Nov 9, 2020 · 12 comments · Fixed by #2317

Comments

@poblin-orange

poblin-orange commented Nov 9, 2020

Describe the bug
When leveraging the bosh multi-CPI feature, targeting 2 different IaaS types (vsphere and openstack):

  • bosh deploy is OK
  • bosh cck is KO even when all VMs are OK
  • bosh cck is KO when one VM is KO - the director mismatches the stemcell type against the cpi-config

To Reproduce
Steps to reproduce the behavior:

  1. Deploy a bosh director on vsphere, with an appropriate cpi-config (multiple IaaS backends) and cloud-config (mapping AZs to CPIs)
  2. Upload a vsphere stemcell
    2b. Upload an openstack stemcell
  3. Deploy a manifest leveraging multiple AZs, using the different CPIs and thus different IaaS types
  4. Delete a VM in one of the IaaS
  5. Run bosh cck to try to repair

Expected behavior
As bosh deploy works with multi-CPI across 2 target IaaS types, I expect bosh cck to be usable in that context as well (and also the bosh hm resurrector).

Logs

see https://github.com/orange-cloudfoundry/paas-templates/issues/840

Task 12905289 | 14:22:32 | Scanning 1 VMs: Checking VM states (00:00:16)
Task 12905289 | 14:22:48 | Scanning 1 VMs: 0 OK, 0 unresponsive, 1 missing, 0 unbound (00:00:00)
Task 12905289 | 14:22:49 | Applying problem resolutions: VM for 'mysql/b9af1d12-d120-4dc9-9b8c-b92062a60587 (0)' missing. (missing_vm 3451): Recreate VM and wait for processes to start (00:00:02)
                     L Error: Required stemcell {"name"=>"bosh-openstack-kvm-ubuntu-xenial-go_agent", "version"=>"621.69"} not found for cpi region-2, please upload again
Task 12905289 | 14:22:51 | Error: Error resolving problem '19166': Required stemcell {"name"=>"bosh-openstack-kvm-ubuntu-xenial-go_agent", "version"=>"621.69"} not found for cpi region-2, please upload again

Task 12905289 Started  Mon Nov  2 14:22:32 UTC 2020
Task 12905289 Finished Mon Nov  2 14:22:51 UTC 2020
Task 12905289 Duration 00:00:19
Task 12905289 error

==> mismatch, as region-2 is configured with the vsphere IaaS type

Versions (please complete the following information):

  • Infrastructure: vsphere and openstack
  • BOSH version 271.2.0
  • BOSH CLI version 6.4.0-7e5d8860-2020-08-31T17:09:13Z
  • Stemcell version 621.84 (vsphere and openstack)

Deployment info:

Plain bosh deployment, using bosh ops files, with multi-CPI AZs
Deployment:

Name                  Release(s)                  Stemcell(s)                                       Config(s)            Team(s)  
01-cf-mysql-extended  bosh-dns/1.17.0             bosh-openstack-kvm-ubuntu-xenial-go_agent/621.84  122 cloud/default    -  
                      bosh-dns-aliases/0.0.3      bosh-vsphere-esxi-ubuntu-xenial-go_agent/621.84   121 runtime/default    
                      bpm/1.1.8                   bosh-vsphere-esxi-ubuntu-xenial-go_agent/621.84                          
                      cf-mysql/37.1.0                                                                                      
                      generic-scripting/3                                                                                  
                      minio/2020-06-18T02-23-35Z                                                                           
                      node-exporter/5.0.0                                                                                  
                      os-conf/22.0.0                                                                                       
                      prometheus/26.2.0                                                                                    
                      routing/0.206.0                                                                                      
                      shield/8.7.2                                                                                         
                      syslog/11.6.0                                                                                        

Here is the cpi-config.yml :

cpis:
- name: region-1
  properties:
    datacenters:
      xxxx
  type: vsphere
- name: region-2
  properties:
    datacenters:
      xxxx
  type: vsphere
- name: region-3
  properties:
    xxxx
  type: openstack

Here is the cloud-config.yml:

azs:
- cpi: region-1
  name: z1
- cpi: region-1
  name: z2
- cpi: region-1
  name: z3
- cpi: region-2
  name: r2-z1
- cloud_properties:
    availability_zone: eu-west-0b
  cpi: region-3
  name: r3-z1
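
For reference, the AZ-to-CPI mapping this cloud-config expresses boils down to a small lookup. A toy Ruby illustration (not director code), just restating the azs section above:

```ruby
# Toy illustration (not director code): which CPI each AZ from the cloud-config
# above maps to. cck / the resurrector would need this mapping to pick the
# stemcell record uploaded for the instance's CPI.
az_to_cpi = {
  'z1'    => 'region-1', # vsphere
  'z2'    => 'region-1', # vsphere
  'z3'    => 'region-1', # vsphere
  'r2-z1' => 'region-2', # vsphere
  'r3-z1' => 'region-3', # openstack
}
puts az_to_cpi.fetch('r3-z1') # => "region-3"
```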


@cf-gitbot

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/175639525

The labels on this github issue will be updated when the story is started.

@poblin-orange
Author

Any insight / update on this issue?

@gberche-orange

gberche-orange commented Jun 14, 2021

Summary of the reverse-engineering performed with @o-orand. The problem also reproduces with a bosh deployment on a single AZ, as long as the director has stemcells of multiple types available (vsphere and openstack).

With a manifest file structure as below, and the cpi-config.yml / cloud-config.yml described in #2287 (comment):


addons:
- include:
    stemcell:
    - os: ubuntu-trusty
    - os: ubuntu-xenial
    - os: ubuntu-bionic

instance_groups:
- name: mysql
  networks:
  - name: tf-net-osb-data-plane-dedicated-priv
  persistent_disk_type: large
  stemcell: default
  instances: 1
  azs:
  - z1
  - z2
  - z3

releases:
- exported_from:
  - os: ubuntu-bionic
    version: "1.1"
  name: cf-mysql
  stemcell:
    os: ubuntu-bionic
    version: "1.1"

stemcells:
- alias: default
  os: ubuntu-bionic
  version: "1.1"

When the cloud-check command fails, the following stack trace is displayed in the bosh debug task output:

E, [2021-06-02T06:41:15.523131 #2760] [task:24935638] ERROR -- DirectorJobRunner: Error resolving problem '62674': 
Required stemcell {"name"=>"bosh-openstack-kvm-ubuntu-xenial-go_agent", "version"=>"621.107"}
 not found for cpi region-1, please upload again
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/cloud_check/apply_resolutions.rb:39:in `block in perform'
[...]/bosh-director-0.0.0/lib/bosh/director/lock_helper.rb:7:in `block in with_deployment_lock'
[...]/bosh-director-0.0.0/lib/bosh/director/lock.rb:79:in `lock'
[...]/bosh-director-0.0.0/lib/bosh/director/lock_helper.rb:7:in `with_deployment_lock'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/cloud_check/apply_resolutions.rb:35:in `perform'
[...]/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:99:in `perform_job'
[...]/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:34:in `block in run'
[...]/bosh_common-0.0.0/lib/common/thread_formatter.rb:50:in `with_thread_name'
[...]/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:34:in `run'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/base_job.rb:9:in `perform'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/db_job.rb:32:in `block in perform'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/db_job.rb:98:in `block (3 levels) in run'
[...]/eventmachine-1.2.7/lib/eventmachine.rb:1077:in `block in spawn_threadpool'
[...]/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'

We're observing that the instances.spec_json database field documented within the director_schema has the incorrect stemcell value bosh-openstack-kvm-ubuntu-bionic-go_agent (instead of bosh-vsphere-esxi-ubuntu-bionic-go_agent)
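
To make that check concrete, a minimal sketch of reading the stemcell back out of such a spec dump (the dump file name is hypothetical, and the top-level 'stemcell' key with name/version fields is an assumption about the spec layout):

```ruby
require 'json'

# Minimal sketch: parse a dump of the instances.spec_json column and print the
# stemcell the director recorded for this instance. The file name and the JSON
# layout ('stemcell' => {'name', 'version'}) are assumptions for illustration.
spec = JSON.parse(File.read('instance_spec.json'))
stemcell = spec.fetch('stemcell', {})
puts "#{stemcell['name']}/#{stemcell['version']}"
# observed: bosh-openstack-kvm-ubuntu-bionic-go_agent/... (expected the vsphere one)
```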

We suspect that this instances.spec_json database field is assigned the incorrect value during a bosh deploy, at the following stack trace, within the Bosh::Director::DeploymentPlan::InstancePlan and its nested Bosh::Director::DeploymentPlan::Instance.stemcell field:

[...]/bosh-director-0.0.0/lib/bosh/director/deployment_plan/stemcell.rb:46:in `bind_model'
[...]/bosh-director-0.0.0/lib/bosh/director/deployment_plan/assembler.rb:205:in `block in bind_stemcells'
[...]/bosh-director-0.0.0/lib/bosh/director/deployment_plan/assembler.rb:204:in `each'
[...]/bosh-director-0.0.0/lib/bosh/director/deployment_plan/assembler.rb:204:in `bind_stemcells'
[...]/bosh-director-0.0.0/lib/bosh/director/deployment_plan/assembler.rb:34:in `bind_models'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/update_deployment.rb:94:in `block in prepare_deployment'
[...]/bosh-director-0.0.0/lib/bosh/director/event_log.rb:105:in `advance_and_track'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/update_deployment.rb:87:in `prepare_deployment'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/update_deployment.rb:42:in `block in perform'
[...]/bosh-director-0.0.0/lib/bosh/director/lock_helper.rb:7:in `block in with_deployment_lock'
[...]/bosh-director-0.0.0/lib/bosh/director/lock.rb:79:in `lock'
[...]/bosh-director-0.0.0/lib/bosh/director/lock_helper.rb:7:in `with_deployment_lock'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/update_deployment.rb:34:in `perform'
[...]/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:99:in `perform_job'
[...]/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:34:in `block in run'
[...]/bosh_common-0.0.0/lib/common/thread_formatter.rb:50:in `with_thread_name'
[...]/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:34:in `run'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/base_job.rb:9:in `perform'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/db_job.rb:32:in `block in perform'
[...]/bosh-director-0.0.0/lib/bosh/director/jobs/db_job.rb:98:in `block (3 levels) in run'
[...]/eventmachine-1.2.7/lib/eventmachine.rb:1077:in `block in spawn_threadpool'
[...]/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'

We suspect that, unlike bosh deploy, bosh cloud-check and bosh recreate do not yet select the instance stemcell based on the instance AZ, but directly use the stemcell from the deployment plan:

# @return [DeploymentPlan::Stemcell]
attr_reader :stemcell

and

def parse_stemcell(name)
  stemcell_name = safe_property(@instance_group_spec, 'stemcell', class: String)
  stemcell = @deployment.stemcell(stemcell_name)
  if stemcell.nil?
    raise InstanceGroupUnknownStemcell,
          "Instance group '#{name}' references an unknown stemcell '#{stemcell_name}'"
  end
  stemcell
end

This reverse engineering is based on source-code analysis as well as on adding debugging traces to the local source to dump models and stack traces.

We will now see whether the bosh team could help us find software architecture documentation, or hint at a direction for contributing a fix.

@cunnie
Member

cunnie commented Jun 14, 2021

This is a problem that I'd like to fix, but it's not easy.

I'll try to work on it one day per week. I've attached my notes on the problem so far (traveling through the class hierarchy).
multi-cpi-notes

@gberche-orange

Thanks a lot @cunnie for your work. We're eager to help test any fix you might be able to develop in our 3-AZ lab (2 vsphere and 1 openstack).

As a stop-gap workaround, we're currently patching our director instances with model = @models.last. In our 3-AZ environments (2 vsphere and 1 openstack), this should select the vsphere stemcell, which is present in more AZs than the openstack one, making it possible, after a bosh deploy --recreate, to use bosh cloud-check on the vsphere instances of a multi-CPI deployment.
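
For completeness, an illustrative sketch of that stop-gap patch (assuming, from the stack trace above, it sits where DeploymentPlan::Stemcell#bind_model picks the stemcell record to bind; the exact surrounding code may differ):

```ruby
# Illustrative sketch of the stop-gap workaround, not the upstream code.
# Upstream effectively binds the first matching stemcell record:
#   model = @models.first
# The workaround binds the last one instead, which in a 2-vsphere + 1-openstack
# environment should be the vsphere record:
model = @models.last
```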

@bosh-admin-bot

This issue was marked as Stale because it has been open for 21 days without any activity. If no activity takes place in the coming 7 days it will automatically be closed. To prevent this from happening remove the Stale label or comment below.

@o-orand

o-orand commented Jul 27, 2021

Hello @cunnie, did you have an opportunity to investigate this issue?
Thanks!

@bosh-admin-bot

This issue was marked as Stale because it has been open for 21 days without any activity. If no activity takes place in the coming 7 days it will automatically be closed. To prevent this from happening remove the Stale label or comment below.

@cunnie
Member

cunnie commented Aug 19, 2021

@o-orand I'll try to look at it tomorrow.

@cunnie
Member

cunnie commented Aug 25, 2021

I think I may have a fix. Am testing it now.

@o-orand

o-orand commented Aug 26, 2021

It seems promising, great !

cunnie added a commit that referenced this issue Aug 26, 2021
Prior to this commit, `bosh cck` would sometimes fail on multi-CPI
installations with a "Required stemcell ... not found for cpi" message.

This commit fixes that failure by selecting the stemcell appropriate for
the particular CPI (rather than merely grabbing the first stemcell,
which was the prior behavior).

An obvious question is, "why could `bosh deploy` find the correct
stemcell, but `bosh cck` couldn't?" The answer is that `bosh deploy`
follows a different codepath (deploy uses `CreateVmStep`).

Fixes, during `bosh cck` on a multi-CPI Director:
```
Task 41558 | 21:16:40 | Applying problem resolutions: VM for 'dummy-vsphere/812e9491-a615-465a-b975-5f0c044f7739 (0)' with cloud ID 'dummy-vsphere_sslipio_f41a348db816' is not responding. (unresponsive_agent 1945): Recreate VM without waiting for processes to start (00:00:21)
                     L Error: Required stemcell {"name"=>"bosh-aws-xen-hvm-ubuntu-bionic-go_agent", "version"=>"1.25"} not found for cpi vsphere, please upload again
```

[fixes #2287]

Signed-off-by: Maria Shaldybin <mariash@vmware.com>
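
To make the selection described above concrete, a hypothetical sketch (not the actual patch; the method and attribute names are assumptions) of picking the stemcell record whose CPI matches the instance's CPI instead of grabbing the first one:

```ruby
# Hypothetical sketch, not the actual patch: select the stemcell record whose
# CPI matches the CPI derived from the instance's AZ, and fail with the same
# kind of message the director reports when no such record was uploaded.
def find_stemcell_model_for_cpi(models, instance_cpi, stemcell_desc)
  model = models.find { |m| m.cpi == instance_cpi }
  if model.nil?
    raise "Required stemcell #{stemcell_desc} not found for cpi #{instance_cpi}, please upload again"
  end
  model
end
```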
cunnie added a commit that referenced this issue Aug 26, 2021

cunnie added a commit that referenced this issue Aug 26, 2021

cunnie added a commit that referenced this issue Aug 27, 2021

cunnie added a commit that referenced this issue Aug 27, 2021
@cunnie
Member

cunnie commented Aug 27, 2021

Successful Multi-CPI cck test

We hot-patched our BOSH Director with this commit.

Deployed a 4-CPI (AWS, Azure, GCP, vSphere) deployment:

bosh -d multi-cpi-cck is
Using environment 'bosh-vsphere.nono.io' as user 'admin'

Task 41705. Done

Deployment 'multi-cpi-cck'

Instance                                      Process State  AZ       IPs
aws/65d12bae-150e-4dfd-a292-f550625e8a39      running        aws      10.0.0.7
                                                                      18.210.45.12
azure/d04c8cdb-4236-4346-af54-aac3a7ac0250    running        azure    10.0.0.5
                                                                      20.198.146.138
google/bd396e3c-323a-4073-83d2-b292df6227c9   running        google   10.128.0.4
                                                                      35.223.96.158
vsphere/f90ca385-75df-4140-9b84-2337230995d4  running        vsphere  10.2.0.203

Manually terminated each instance, then checked to make sure they were really down:

bosh -d multi-cpi-cck is
Using environment 'bosh-vsphere.nono.io' as user 'admin'

Task 41707. Done

Deployment 'multi-cpi-cck'

Instance                                      Process State       AZ       IPs
aws/65d12bae-150e-4dfd-a292-f550625e8a39      unresponsive agent  aws      10.0.0.7
                                                                           18.210.45.12
azure/d04c8cdb-4236-4346-af54-aac3a7ac0250    unresponsive agent  azure    10.0.0.5
                                                                           20.198.146.138
google/bd396e3c-323a-4073-83d2-b292df6227c9   unresponsive agent  google   10.128.0.4
                                                                           35.223.96.158
vsphere/f90ca385-75df-4140-9b84-2337230995d4  unresponsive agent  vsphere  10.2.0.203

4 instances

Succeeded

Successfully ran cck:

bosh -d multi-cpi-cck cck --resolution=recreate_vm
Using environment 'bosh-vsphere.nono.io' as user 'admin'

Using deployment 'multi-cpi-cck'

Task 41708

Task 41708 | 15:57:53 | Scanning 4 VMs: Checking VM states (00:00:21)
Task 41708 | 15:58:14 | Scanning 4 VMs: 0 OK, 4 unresponsive, 0 missing, 0 unbound (00:00:00)
Task 41708 | 15:58:14 | Scanning 0 persistent disks: Looking for inactive disks (00:00:00)
Task 41708 | 15:58:14 | Scanning 0 persistent disks: 0 OK, 0 missing, 0 inactive, 0 mount-info mismatch (00:00:00)

Task 41708 Started  Fri Aug 27 15:57:53 UTC 2021
Task 41708 Finished Fri Aug 27 15:58:14 UTC 2021
Task 41708 Duration 00:00:21
Task 41708 done

#     Type                Description
5507  unresponsive_agent  VM for 'google/bd396e3c-323a-4073-83d2-b292df6227c9 (0)' with cloud ID 'vm-5ae750b8-6df5-492f-707c-61abbe57653f' is not responding.
5508  unresponsive_agent  VM for 'aws/65d12bae-150e-4dfd-a292-f550625e8a39 (0)' with cloud ID 'i-05b36e881eb9f60df' is not responding.
5509  unresponsive_agent  VM for 'vsphere/f90ca385-75df-4140-9b84-2337230995d4 (0)' with cloud ID 'vsphere_multi-cpi-cck_9313ce32f92f' is not responding.
5510  unresponsive_agent  VM for 'azure/d04c8cdb-4236-4346-af54-aac3a7ac0250 (0)' with cloud ID 'agent_id:96737130-4d04-46a1-8f83-8ec309eaf7fd;resource_group_name:bosh-res-group' is not responding.

4 problems

Continue? [yN]: y


Task 41709

Task 41709 | 15:58:22 | Applying problem resolutions: VM for 'azure/d04c8cdb-4236-4346-af54-aac3a7ac0250 (0)' with cloud ID 'agent_id:96737130-4d04-46a1-8f83-8ec309eaf7fd;resource_group_name:bosh-res-group' is not responding. (unresponsive_agent 1947): Recreate VM and wait for processes to start
Task 41709 | 15:58:22 | Applying problem resolutions: VM for 'aws/65d12bae-150e-4dfd-a292-f550625e8a39 (0)' with cloud ID 'i-05b36e881eb9f60df' is not responding. (unresponsive_agent 1946): Recreate VM and wait for processes to start
Task 41709 | 15:58:22 | Applying problem resolutions: VM for 'google/bd396e3c-323a-4073-83d2-b292df6227c9 (0)' with cloud ID 'vm-5ae750b8-6df5-492f-707c-61abbe57653f' is not responding. (unresponsive_agent 1948): Recreate VM and wait for processes to start
Task 41709 | 15:58:22 | Applying problem resolutions: VM for 'vsphere/f90ca385-75df-4140-9b84-2337230995d4 (0)' with cloud ID 'vsphere_multi-cpi-cck_9313ce32f92f' is not responding. (unresponsive_agent 1949): Recreate VM and wait for processes to start
Task 41709 | 15:58:22 | Applying problem resolutions: VM for 'vsphere/f90ca385-75df-4140-9b84-2337230995d4 (0)' with cloud ID 'vsphere_multi-cpi-cck_9313ce32f92f' is not responding. (unresponsive_agent 1949): Recreate VM and wait for processes to start (00:01:19)
Task 41709 | 15:59:52 | Applying problem resolutions: VM for 'google/bd396e3c-323a-4073-83d2-b292df6227c9 (0)' with cloud ID 'vm-5ae750b8-6df5-492f-707c-61abbe57653f' is not responding. (unresponsive_agent 1948): Recreate VM and wait for processes to start (00:01:30)
Task 41709 | 16:00:11 | Applying problem resolutions: VM for 'aws/65d12bae-150e-4dfd-a292-f550625e8a39 (0)' with cloud ID 'i-05b36e881eb9f60df' is not responding. (unresponsive_agent 1946): Recreate VM and wait for processes to start (00:01:49)
Task 41709 | 16:01:52 | Applying problem resolutions: VM for 'azure/d04c8cdb-4236-4346-af54-aac3a7ac0250 (0)' with cloud ID 'agent_id:96737130-4d04-46a1-8f83-8ec309eaf7fd;resource_group_name:bosh-res-group' is not responding. (unresponsive_agent 1947): Recreate VM and wait for processes to start (00:03:30)

Task 41709 Started  Fri Aug 27 15:58:22 UTC 2021
Task 41709 Finished Fri Aug 27 16:01:52 UTC 2021
Task 41709 Duration 00:03:30
Task 41709 done

Succeeded
bosh -d multi-cpi-cck is
Using environment 'bosh-vsphere.nono.io' as user 'admin'

Task 41712. Done

Deployment 'multi-cpi-cck'

Instance                                      Process State  AZ       IPs
aws/65d12bae-150e-4dfd-a292-f550625e8a39      running        aws      10.0.0.7
                                                                      18.210.45.12
azure/d04c8cdb-4236-4346-af54-aac3a7ac0250    running        azure    10.0.0.5
                                                                      20.198.146.138
google/bd396e3c-323a-4073-83d2-b292df6227c9   running        google   10.128.0.4
                                                                      35.223.96.158
vsphere/f90ca385-75df-4140-9b84-2337230995d4  running        vsphere  10.2.0.203

4 instances

Succeeded

Caveats

Ran unit tests but not integration tests/BATS. Submitting PR now.
