Error associating wssdagent #18

Closed
ericvruder opened this issue Apr 15, 2021 · 5 comments

ericvruder commented Apr 15, 2021

This error usually occurs when I try to remove the VM and redeploy it. I can't reproduce it consistently, but time seems to be a factor. I install Eflow and use it as normal (I run Remove-EflowVm and Deploy-Eflow quite a few times), and then after some days this error starts coming up. The only fix I have found is to reinstall Eflow.

Funnily enough, it doesn't seem to affect anything. If I ignore it, the VM can still register with the IoT hub and even deploy correctly. There are no error logs from either edgeAgent or edgeHub; everything comes back OK.

This Friday I will install Eflow and see how long it takes before it starts failing again.

[04/15/2021 08:51:33] Step 3: Configuring directories and virtual machine


[04/15/2021 08:51:33] Configuring directories


[04/15/2021 08:51:33] Associating wssdagent service with nodectl

 - C:\iotedge\nodectl.exe  security login --loginpath c:\programdata\wssdagent\nodelogin.yaml --identity failed to execute [Error: rpc error: code = Unauthenticated desc = Valid Token Required.]
False
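
For anyone triaging a similar transcript, here is a minimal Python sketch of how the failing step could be picked out of the deployment output. The function name `find_failures` and both patterns are assumptions derived from the line format shown above; they are not part of Eflow or nodectl.

```python
import re

# Assumed format of a failed command line in the deployment output:
#  - <command> failed to execute [Error: <message>]
FAILURE = re.compile(
    r"^\s*-\s*(?P<command>.+?)\s+failed to execute \[Error: (?P<error>.+)\]\s*$"
)
# Assumed format of a timestamped step heading: [MM/DD/YYYY HH:MM:SS] <step>
STEP = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s+(?P<step>.+?)\s*$")


def find_failures(log_text):
    """Return (step, command, error) for each failed command in the log."""
    failures = []
    current_step = None
    for line in log_text.splitlines():
        step = STEP.match(line)
        if step:
            # Remember the most recent heading so a failure can be attributed to it.
            current_step = step.group("step")
            continue
        failed = FAILURE.match(line)
        if failed:
            failures.append(
                (current_step, failed.group("command"), failed.group("error"))
            )
    return failures


SAMPLE = r"""
[04/15/2021 08:51:33] Configuring directories

[04/15/2021 08:51:33] Associating wssdagent service with nodectl

 - C:\iotedge\nodectl.exe  security login --loginpath c:\programdata\wssdagent\nodelogin.yaml --identity failed to execute [Error: rpc error: code = Unauthenticated desc = Valid Token Required.]
False
"""

for step, command, error in find_failures(SAMPLE):
    print(f"{step}: {error}")
```

Pairing each error with the preceding timestamped heading makes it easier to see which deployment step failed when the output is long.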

To Reproduce
Steps to reproduce the behavior:

  • install eflow
  • remove-eflowvm and deploy-eflow a couple of times
  • after a few days, it starts failing.

Expected behavior
I don't expect it to fail.

Windows Host OS (please complete the following information):

  • OS Name: Microsoft Windows 10 Enterprise
  • OS Version: 10.0.19042 N/A Build 19042
  • Virtual Machine: Local VM
  • Eflow version 1.0.1.0
@fcabrera23
Contributor

Hi @ericvruder,

Thank you for the detailed question and background. We are aware of a similar issue with the wssdagent and networking problems.

If you have this problem again, open Hyper-V Manager, select the Networking tab, and check whether the VM has an assigned IP address. If not, please try resetting the VM and check again.

Let me know if that fixes the problem. We will be addressing it in our next update.

@ericvruder
Author

ericvruder commented Apr 23, 2021

Hi @fcabrera23
It fails during creation of the VM, so it has nothing to do with any currently deployed VM. I start by removing the guest and then redeploying.

[04/23/2021 08:15:11] Deleting virtual machine

 - Removing vnic (name: PCName-EFLOWInterface)
 - Removing storage vhd (file: AzureIoTEdgeForLinux-v1-EFLOW)

[04/23/2021 08:15:16] Virtual machine removed successfully

True

[04/23/2021 08:15:16] Deploying Azure IoT Edge for Linux on Windows


[04/23/2021 08:15:16] Enabling Microsoft Update. This will allow Azure IoT Edge for Linux on Windows to receive updates.

 - Microsoft Update is enabled.

[04/23/2021 08:15:16] Step 1: Preparing host for Azure IoT Edge for Linux on Windows


[04/23/2021 08:15:16] Checking host for required features

 - Checking the status of 'Microsoft-Hyper-V'
 - Checking the status of 'Microsoft-Hyper-V-Management-PowerShell'
 - Checking the status of 'Microsoft-Hyper-V-Hypervisor'
 - Checking the status of 'OpenSSH.Client*'

[04/23/2021 08:15:18] Checking for virtual switch with name 'Default Switch'

 - The virtual switch 'Default Switch' of type 'Internal' is already present

[04/23/2021 08:15:18] Step 2: Verifying host limits


[04/23/2021 08:15:18] Verifying required storage, RAM and number of cores are available

 - Drive 'C:' has 169 GB free
 - A minimum of 10 GB disk space is required on drive 'C:'
 - Host has 5 GB free memory
 - A minimum of 2 GB memory is required
 - Host has 6 CPU cores
 - A minimum of 4 CPU cores is required

[04/23/2021 08:15:18] Step 3: Configuring directories and virtual machine

[04/23/2021 08:15:18] Configuring directories

It then fails here, while associating the wssdagent:

[04/23/2021 08:15:18] Associating wssdagent service with nodectl

 - C:\iotedge\nodectl.exe  security login --loginpath c:\programdata\wssdagent\nodelogin.yaml --identity failed to execute [Error: rpc error: code = Unauthenticated desc = Valid Token Required.]
False

After that, it continues as normal.

[04/23/2021 08:15:18] Verifying installation

 - Testing for expected binaries
 - Testing for expected images
 - Testing for ssh key
 - Testing for wssdagent service
 - Testing if wssdagent is running
 - Testing for Hyper-V Host Compute Service
 - Verifying whether Hyper-V is active
 - Hyper-V is active
 - Testing if container resource is provisioned
 - Testing if vnet resource 'Default Switch' is provisioned

[04/23/2021 08:15:18] Step 4: Runtime install complete. Creating virtual machine


[04/23/2021 08:15:19] Creating virtual machine (username: iotedge-user)


[04/23/2021 08:15:19] Verifying required storage, RAM and number of cores are available

 - Drive 'C:' has 169 GB free
 - A minimum of 16 GB disk space is required on drive 'C:'
 - Host has 5 GB free memory
 - A minimum of 2 GB memory is required
 - Host has 6 CPU cores
 - A minimum of 4 CPU cores is required

[04/23/2021 08:15:21] Setting dynamically expanding virtual hard disk maximum size to 16 GB

 - Creating storage vhd (file: AzureIoTEdgeForLinux-v1-EFLOW)
 - Creating vnic (name: PCName-EFLOWInterface)
 - Instantiating virtual machine (name: PCName-EFLOW)
 - Virtual machine instantiated, hostname is: PCName-EFLOW-3f097630

[04/23/2021 08:15:31] Virtual machine created successfully.


[04/23/2021 08:15:31] Successfully created virtual machine


[04/23/2021 08:15:31] Querying IP and MAC addresses from virtual machine (PCName-EFLOW)

 - Virtual machine MAC: 00:00:00:00:00:00
 - Virtual machine IP : 172.18.156.139

[04/23/2021 08:15:40] Done.


[04/23/2021 08:15:40] Virtual machine hostname: PCName-EFLOW-3f097630


[04/23/2021 08:15:40] Virtual machine IP address: 172.18.156.139


[04/23/2021 08:15:40] Virtual machine MAC address: 00:00:00:00:00:00


[04/23/2021 08:15:40] Testing SSH connection...


[04/23/2021 08:15:41] ...successfully connected to the Linux VM


[04/23/2021 08:15:41] Retrieving vTPM EK pub hash and registration ID for automated provisioning with DPS

 - TPM provisioning information retrieved!

[04/23/2021 08:15:42] vTPM Endorsement Key: XXX


[04/23/2021 08:15:42] Registration ID: XXX


[04/23/2021 08:15:42] Step 5: Installing and verifying virtual machine software

 - Installing and verifying required virtual machine features (username: iotedge-user)
 - Successfully installed/verified moby-engine package
 - Successfully installed/verified azure-iotedge package

[04/23/2021 08:16:04] Provisioning information not specified, provisioning skipped.


[04/23/2021 08:16:04] Deployment successful

OK
Configuring Eflow VM...

[04/23/2021 08:16:04] Querying IP and MAC addresses from virtual machine (PCName-EFLOW)

 - Virtual machine MAC: 00:15:5d:26:43:0b
 - Virtual machine IP : 172.18.156.139

[04/23/2021 08:16:04] Done.
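
The address lines in the output above can also be checked programmatically. This is a minimal Python sketch, assuming the ` - Virtual machine MAC:` and ` - Virtual machine IP :` line format shown in this log; `read_addresses` and `has_usable_ip` are hypothetical helper names, not Eflow functions.

```python
import ipaddress
import re

# Assumed format:  - Virtual machine MAC: <value>   or   - Virtual machine IP : <value>
ADDRESS_LINE = re.compile(
    r"^\s*-\s*Virtual machine (?P<kind>MAC|IP)\s*:\s*(?P<value>\S+)\s*$"
)


def read_addresses(log_text):
    """Return the last MAC and IP reported in the log, or None where absent."""
    found = {"MAC": None, "IP": None}
    for line in log_text.splitlines():
        match = ADDRESS_LINE.match(line)
        if match:
            found[match.group("kind")] = match.group("value")
    return found


def has_usable_ip(addresses):
    """True when the IP field parses as an address and is not 0.0.0.0 or ::."""
    ip = addresses["IP"]
    if ip is None:
        return False
    try:
        return not ipaddress.ip_address(ip).is_unspecified
    except ValueError:
        return False


SAMPLE = """
 - Virtual machine MAC: 00:15:5d:26:43:0b
 - Virtual machine IP : 172.18.156.139
"""

addresses = read_addresses(SAMPLE)
print(addresses, has_usable_ip(addresses))
```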

As can be seen, the VM was assigned an IP address. This is confirmed further on, when I deploy some files to it through SCP:

Warning: Permanently added '172.18.156.139' (ECDSA) to the list of known hosts.
cert.der                                                                              100% 1358   807.7KB/s   00:00
cert.pem                                                                              100% 1730     1.6MB/s   00:00
daemon.json                                                                           100%  115   112.3KB/s   00:00
nodeConfiguration.json                                                                100% 3749     3.6MB/s   00:00
opcPublisher.json                                                                     100% 1209     1.2KB/s   00:00
setup.sh                                                                              100%  686   670.0KB/s   00:00

From now on, this will ALWAYS happen when I delete and redeploy the VM. As mentioned, the only way to solve this is to reinstall Eflow completely. But I am not sure what the consequences of the error are. There don't seem to be any.

@ms-mahuber

Hi @ericvruder, thanks for trying this out. In the next update of EFLOW, the publicly exposed functions "New-EflowVm" and "Remove-EflowVm" will be removed. Any call into these functions leads to unexpected behavior.

The lifecycle should be: Install the EFLOW MSI, run Deploy-Eflow exactly once, Uninstall EFLOW. After that, you can re-install EFLOW.
The removal of the EFLOW VM and subsequent re-creations such as by calling New-EflowVm or Deploy-Eflow are not supported scenarios.
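
The supported lifecycle described above can be sketched as a small state machine. This is only an illustration in Python: the state names and action labels (`install-msi`, `deploy-eflow`, `uninstall`) are hypothetical and do not invoke any EFLOW function, and only the transitions stated in this comment are modeled.

```python
from enum import Enum


class State(Enum):
    NOT_INSTALLED = "not installed"
    INSTALLED = "installed"  # MSI installed, Deploy-Eflow not yet run
    DEPLOYED = "deployed"    # Deploy-Eflow has been run once


# Only the transitions named in the comment above; anything else is rejected.
SUPPORTED = {
    (State.NOT_INSTALLED, "install-msi"): State.INSTALLED,
    (State.INSTALLED, "deploy-eflow"): State.DEPLOYED,
    (State.DEPLOYED, "uninstall"): State.NOT_INSTALLED,
}


def apply(state, action):
    """Return the next state, or raise if the action is outside the lifecycle."""
    try:
        return SUPPORTED[(state, action)]
    except KeyError:
        raise ValueError(
            f"'{action}' is not supported while EFLOW is {state.value}"
        ) from None


def run(actions, state=State.NOT_INSTALLED):
    """Apply a sequence of actions and return the final state."""
    for action in actions:
        state = apply(state, action)
    return state


# Supported: install, deploy once, uninstall, then install again.
print(run(["install-msi", "deploy-eflow", "uninstall", "install-msi"]).value)

# Not supported: removing the VM and deploying a second time.
try:
    run(["install-msi", "deploy-eflow", "remove-eflowvm", "deploy-eflow"])
except ValueError as err:
    print(err)
```

In this model, the sequence reported in this issue (remove the VM, then deploy again) is rejected at the first step after deployment, which matches the statement that re-creation is not a supported scenario.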

Apologies, this is a deficiency in our documentation. Please let us know if you come across the described problem in the scenario where you run Deploy-Eflow only once after MSI installation. In that case we will need to investigate.

Thanks,

Manuel

@ericvruder
Author

ericvruder commented Apr 26, 2021

No worries! But I do have a few questions:

"The lifecycle should be: Install the EFLOW MSI, run Deploy-Eflow exactly once, Uninstall EFLOW. After that, you can re-install EFLOW."

Why should I uninstall, then re-install?

What is the imagined flow if I need to re-provision the edge gateway? Right now, I reinstall everything to set it into a fresh condition. Will I be able to just call Provision-Eflow with some new parameters?

Is the goal here to remove the need for me to interact with the Linux container host? By removing the ability to remove and redeploy the VM, you are making it more difficult for me to test setting up the VM with the correct configuration. But most of that setup is "default", in the sense that I followed the production guide as closely as possible, and that suited my needs. I only change the configuration for the modules I am trying to deploy. Will you be simplifying that process?

Finally, is there a release date for the next version? :)

@fcabrera23
Contributor

@ericvruder - If you want a fresh installation of EFLOW, you need to uninstall EFLOW before you can re-install it. That's the flow: we do not support reinstallation without cleaning the machine first.

Regarding re-provisioning, in our next release you can just call Provision-Eflow with the provisioning method (TPM, manualx509, manual string, DPS x509) and the corresponding parameters.

For the setting up, you can use parameters to set up the hardware requirements. Also, we plan to add port configuration, file sharing, and certificate configuration in future releases.

Finally, we are working hard to get our next version public, probably soon, although we don't have a defined date set.
