Add part1 manual test case from v1.3.0 validated tickets. #1217

Open
wants to merge 5 commits into base: main
Conversation

* Deployment

## Verification Steps
1. Use ipex-example to provision Harvester
Contributor:

typo, should be ipxe.

Contributor Author:

Thanks for the check.
Corrected the typo in line 12.

```
write_files:
- encoding: b64
content: {harvester's kube config, the cluster namespace should be same as the pool you created (base64 enconded)}
```
Contributor:

enconded -> encoded

Contributor Author:

Thanks for the check.
Corrected the typo in line 21.
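For reference, a minimal sketch of producing the base64 value to paste into the `content` field above; the kubeconfig file path is an assumption:

```
import base64

# Read the Harvester kubeconfig (path is an assumption) and print the base64
# string expected by the write_files "content" field of the cloud config.
with open("harvester-kubeconfig.yaml", "rb") as f:
    print(base64.b64encode(f.read()).decode("ascii"))
```

The same value can also be produced on a shell with `base64 -w0 <kubeconfig file>`.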

1. Copy the base64 encoded kubeconfig to the cloud config write file sections above
{{< image "images/rancher/4678-base64-kubeconfig.png" >}}
1. Provision the RKE2 guest cluster
1. After pools are created, we remove the harvester-cloud-provider in Apps > Installed Apps (kube-system namesapce).
Contributor:

namesapce -> namespace

Contributor Author:

Thanks for the check.
Corrected the typo in line 42.

1. Create the first pool, specif the `control plane` role only
1. Create the second pool, specify the `etcd` role only
1. Create the third pool, specify the `worker` role only
1. Repeat the steps 13 - 17 to create load blancer service
Contributor:

blancer -> balancer

Contributor Author:

Thanks for the check.
Corrected the typo in line 58.


## Expected Results
* Control-plane, ETCD and worker in same pool:
- Can sucessfully create LoadBalancer service on RKE2 guest cluster
Contributor:

sucessfully -> successfully

Contributor Author:

Thanks for the check.
Corrected the typo in line 67.
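As a reference for the LoadBalancer expectation above, a minimal sketch using the official Kubernetes Python client against the RKE2 guest cluster; the kubeconfig path, service name, selector, and port are assumptions:

```
from kubernetes import client, config

# Point at the RKE2 guest cluster (kubeconfig path is an assumption).
config.load_kube_config(config_file="rke2-guest-kubeconfig.yaml")
v1 = client.CoreV1Api()

# Create a simple LoadBalancer service (name, selector, and port are assumptions).
svc = client.V1Service(
    metadata=client.V1ObjectMeta(name="lb-test"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "nginx"},
        ports=[client.V1ServicePort(port=80, target_port=80)],
    ),
)
v1.create_namespaced_service(namespace="default", body=svc)

# The Harvester cloud provider should eventually populate an external address.
status = v1.read_namespaced_service("lb-test", "default").status
print(status.load_balancer.ingress)
```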

1. Each VM launched will have an additional label added.
1. The anti-affinity rule on the VM will leverage this label to ensure VM's are spread across in the cluster

1. Corden one of the node on Harvester host page
Contributor:

Corden -> Cordon

Contributor Author:

Thanks for the check.
Corrected the typo in line 27.

* The anti-affinity rule set on the label of vm in the yaml content
{{< image "images/virtual-machines/4588-yaml-anti-affinity.png" >}}

* If Node 2 cordened, none of the VM would be scheduled on that node
Contributor:

cordoned

Contributor Author:

Thanks for the check.
Corrected the typo in line 38.
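A minimal sketch of checking both points programmatically (cordoning a host and inspecting a VM's labels and anti-affinity rule) with the Kubernetes Python client; the node name, namespace, and VMI name are assumptions:

```
from kubernetes import client, config

# Harvester cluster kubeconfig from the default location.
config.load_kube_config()
core = client.CoreV1Api()
crd = client.CustomObjectsApi()

# Cordon one host, equivalent to the Harvester host page action (node name is an assumption).
core.patch_node("harvester-node2", {"spec": {"unschedulable": True}})

# Inspect one of the launched VMs (namespace and VMI name are assumptions) and print
# its labels plus the anti-affinity rule that is expected to reference the added label.
vmi = crd.get_namespaced_custom_object(
    group="kubevirt.io", version="v1", namespace="default",
    plural="virtualmachineinstances", name="rke2-pool1-worker-0")
print(vmi["metadata"]["labels"])
print(vmi["spec"].get("affinity", {}).get("podAntiAffinity"))
```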

```
node1:~ # kubectl get volume -A
NAMESPACE NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
longhorn-system pvc-5a861225-920d-4059-b501-f02b2fd0ff27 detached unknown 10737418240 19m
```
Contributor:

Decrease an indent.

Current

    node1:~ # kubec
    NAMESPACE
    ...

Should be

node1:~ # kubec
NAMESPACE

Contributor Author:

Thanks for the check.
Decrease the indent to align the format.

```
NAMESPACE NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
longhorn-system pvc-d1226d97-ab90-4d40-92f9-960b668093c2 detached unknown 10737418240 5m12s
```
Contributor:

same

Contributor Author:

Thanks for the check.
Decrease the indent to align the format.
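For the detached-volume checks above, a minimal sketch that mirrors `kubectl get volume -A` through the Kubernetes Python client; the Longhorn API version is an assumption (older releases expose `v1beta1`):

```
from kubernetes import client, config

config.load_kube_config()  # Harvester cluster kubeconfig
crd = client.CustomObjectsApi()

# List Longhorn volumes and print their state/robustness so the "detached"
# expectation can be asserted instead of read off the CLI output.
vols = crd.list_namespaced_custom_object(
    group="longhorn.io", version="v1beta2",
    namespace="longhorn-system", plural="volumes")
for vol in vols["items"]:
    status = vol.get("status", {})
    print(vol["metadata"]["name"], status.get("state"), status.get("robustness"))
```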

1. Create a snapshot for the vm1
1. Restore the backup of vm1 to create a new VM
1. Check can restore vm correctly
1. Shutdown vm1,
Contributor:

should not end with comma.

Contributor Author:

Thanks for the check.
Remove the comma in line 19.

Collaborator @khushboo-rancher left a comment:

Thanks @TachunLin for documenting all the test cases. Can you take a look at the questions I have?

Comment on lines 12 to 15
1. Create a VM.
1. After the VM is ready, stop the VM.
1. Check VM volumes are detached.
1. Take a snapshot on the VM. The snapshot can be ready.
Collaborator:

Do we have this in an automated test already?

Contributor Author (@TachunLin, Apr 12, 2024):

I went through our existing backend integration test scenarios:

  1. test_1_volumes.py
  2. test_4_vm_backup_restore.py
  3. test_4_vm_snapshot.py

I found that this test plan may already be covered by test_create_vm_snapshot_while_pvc_detached in test_4_vm_snapshot.py:

    def test_create_vm_snapshot_while_pvc_detached(self, api_client,
                                                   vm_snapshot_2_name, source_vm, wait_timeout):
        """
        Test that a VM snapshot can be created when the source
        PVC is detached.
        Prerequisites:
        The original VM (`source-vm`) exists and is stopped (so that
        the PVC is detached.)
        """
        name, _ = source_vm
        stop_vm(name, api_client, wait_timeout)
        code, _ = api_client.vm_snapshots.create(name, vm_snapshot_2_name)
        assert 201 == code
        deadline = datetime.now() + timedelta(seconds=wait_timeout)
        while deadline > datetime.now():
            code, data = api_client.vm_snapshots.get(vm_snapshot_2_name)
            if data.get("status", {}).get("readyToUse"):
                break
            print(f"waiting for {vm_snapshot_2_name} to be ready")
            sleep(3)
        else:
            raise AssertionError(f"timed out waiting for {vm_snapshot_2_name} to be ready")
        code, data = api_client.vm_snapshots.get(vm_snapshot_2_name)
        assert 200 == code
        assert data.get("status", {}).get("readyToUse") is True

Contributor Author:

Updated the test plan: removed the lines that are already covered by the automated e2e tests, and replaced them with a note indicating which e2e test covers them.

1. Restore the snapshot to a new VM. The new VM can be ready.
1. Restore the snapshot to replace the old VM. The old VM can be ready.

Case 3: backup can work on a stopped VM
Collaborator:

Same as above; can you check if we have this in an automated test?

Contributor Author:

For Case 2:
I found that this test plan may already be covered by test_restore_from_vm_snapshot_while_pvc_detached_from_source in the test_4_vm_snapshot.py integration tests:

    def test_restore_from_vm_snapshot_while_pvc_detached_from_source(self,
                                                                     api_client,
                                                                     restored_vm_2,
                                                                     host_shell,
                                                                     vm_shell,
                                                                     ssh_keypair,
                                                                     wait_timeout):
        """
        Test that a new virtual machine can be created from a
        VM snapshot created from a source PersistentVolumeClaim
        that is now detached.
        Prerequisites:
        The original VM (`source-vm`) exists and is stopped (so that
        the PVC is detached.)
        The original snapshot (`vm-snapshot`) exists.
        """
        name, ssh_user = restored_vm_2

        def actassert(sh):
            out, _ = sh.exec_command("cat test.txt")
            assert "123" in out

        vm_shell_do(name, api_client,
                    host_shell, vm_shell,
                    ssh_user, ssh_keypair,
                    actassert, wait_timeout)

Contributor Author:

Updated the test plan: removed the lines that are already covered by the automated e2e tests, and replaced them with a note indicating which e2e test covers them.

Comment on lines +11 to +22
## Verification Steps
1. Prepare a 3 nodes Harvester cluster
1. Create a vm named vm1
1. Setup nfs backup target
1. Create a backup for the vm1
1. Create a snapshot for the vm1
1. Restore the backup of vm1 to create a new VM
1. Check can restore vm correctly
1. Shutdown vm1
1. Restore the backup of vm to replace the existing vm
1. Select Retain volume
1. Check can restore vm correctly
Collaborator:

Could you check if this is covered in an automated test already?

Contributor Author:

I checked our existing VM backup and snapshot integration test scenarios in:

  1. test_4_vm_backup_restore.py
  2. test_4_vm_snapshot.py

I did not find a suitable automated test that covers the case where a VM already has both a snapshot and a backup on it.

Among these tests

test_4_vm_backup_restore.py

  1. test_connection
  2. tests_backup_vm
  3. test_update_backup_by_yaml
  4. test_restore_with_new_vm
  5. test_restore_replace_with_delete_vols
  6. test_restore_replace_vm_not_stop
  7. test_restore_with_invalid_name
  8. test_backup_migrated_vmtest_restore_replace_migrated_vm
  9. test_backup_multiple
  10. test_delete_last_backup
  11. test_delete_middle_backup

test_4_vm_snapshot.py

  1. test_vm_snapshot_create
  2. test_restore_into_new_vm_from_vm_snapshot
  3. test_replace_is_rejected_when_deletepolicy_is_retain
  4. test_replace_vm_with_vm_snapshot
  5. test_restore_from_vm_snapshot_while_pvc_detached_from_source
  6. test_create_vm_snapshot_while_pvc_detached
  7. test_vm_snapshots_are_cleaned_up_after_source_vm_deleted
  8. test_volume_snapshots_are_cleaned_up_after_source_volume_deleted

Comment on lines +60 to +62
1. Prepare three nodes Harvester and import with Rancher
1. Provision a RKE2 guest cluster in Rancher
1. Create the only one pool, specif the control plane, etcd and workers roles
Collaborator:

This is already covered in an automated test.

Contributor Author:

After checking and confirming with the team: our existing backend e2e Rancher integration test in test_9_rancher_integration.py provides the ability to create a 3-node guest cluster:

    pytest.param(3, marks=pytest.mark.skip(reason="Skip for low I/O env."))])

In actual testing (with Albin's help), the 3-node downstream cluster is created in the same pool and all nodes are set to the ALL role.

Thus we can assume this manual test plan, which uses different roles in different pools, is not covered by the e2e test.

1. After pools are created, we remove the harvester-cloud-provider in Apps > Installed Apps (kube-system namespace).
1. Add new charts in Apps > Repositories, use https://charts.harvesterhci.io to install and select 0.2.3.
{{< image "images/rancher/4678-add-harvester-repo.png" >}}
1. Install Harvester cloud provider 0.2.3 from market
Collaborator:

Install Harvester cloud provider 0.2.3 from Apps & marketplace

Contributor Author:

Thanks for the check.
Update to the correct term in this line.

Comment on lines 42 to 43
1. After pools are created, we remove the harvester-cloud-provider in Apps > Installed Apps (kube-system namespace).
1. Add new charts in Apps > Repositories, use https://charts.harvesterhci.io to install and select 0.2.3.
Collaborator:

To check a particular version of chart, removing the default and installing from Apps is fine but in general, this should not be practiced. As the bundled chart might get rolled back to the manifest shipped with guest cluster. So, I think adding this as generalized test case is not required.

Contributor Author (@TachunLin, Apr 12, 2024):

Thanks for the suggestion.

That's true: the tester should confirm that the cloud provider bundled with the RKE2 guest cluster is at least version 0.2.3, which includes this change.

These steps are not necessary, since they were only needed while the bundled chart was not ready during the issue verification stage.

We can remove these lines and the related static screenshot, then add a check step to ensure the desired cloud provider version, as sketched below.
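A minimal sketch of such a version check against the guest cluster; the kubeconfig path, deployment name, and namespace are assumptions:

```
from kubernetes import client, config

# RKE2 guest cluster kubeconfig (path is an assumption).
config.load_kube_config(config_file="rke2-guest-kubeconfig.yaml")
apps = client.AppsV1Api()

# Read the cloud provider deployment (name and namespace are assumptions) and print
# its image so the tester can confirm it corresponds to chart version 0.2.3 or newer.
dep = apps.read_namespaced_deployment("harvester-cloud-provider", "kube-system")
print(dep.spec.template.spec.containers[0].image)
```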

Comment on lines +15 to +16
1. Create the first pool, specify the `etcd` and `control plane` roles
1. Create the second pool, specify the `work` role only
Collaborator:

@albin, do we have a test case with 2 pools created while testing the guest cluster?

Contributor Author:

After checking (with Albin's help): currently the backend e2e test in test_9_rancher_integration.py creates 3 nodes with the ALL role in the same pool.

Thus the manual test differs somewhat from the backend e2e Rancher integration test.

@@ -0,0 +1,22 @@
---
title: Upgrade with VM shutdown in the operating system
Collaborator:

This is being taken care of in the upgrade automation test.

Contributor Author:

In my understanding, our existing upgrade backend e2e test shuts down the VMs before starting the upgrade.
This manual test differs slightly from the e2e test: here we shut down the VMs from inside the operating system, while the e2e test shuts them down through the VM-related API calls.

1. Login to the Windows VM and shut down the machine from the menu
1. Login to the openSUSE VM and use a command to shut down the machine

Thus I am thinking we can keep this test plan, since the operations are somewhat different. A sketch of the in-guest shutdown is below.
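For the openSUSE case, a minimal sketch of the in-guest shutdown over SSH; the guest address, user, and key path are assumptions (the Windows case goes through the desktop menu instead):

```
import os

import paramiko

# Connect to the openSUSE guest (address, user, and key path are assumptions)
# and shut it down from inside the operating system rather than via the VM API.
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect("192.168.0.101", username="opensuse",
            key_filename=os.path.expanduser("~/.ssh/id_rsa"))
ssh.exec_command("sudo shutdown -h now")
ssh.close()
```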

Labels: area/manual-test, documentation