fix stopping module in 1.1.6 #5552

Merged (2 commits) on Sep 24, 2021

Contributor

@huguesBouvier huguesBouvier commented Sep 21, 2021

In 1.1.5 we fixed an issue with the Windows build: Windows would not create a container if the workload socket did not exist.
However, the socket was created in the "start" stage, which comes after the containers' "create" stage, so 1.1.5 would not work properly on Windows if the socket did not already exist.

The fix was to move socket creation into the "create" stage, but this introduces another issue.
The workload API server starts in the "create" stage and is stopped when the container is stopped.
However, a stopped container is still "created", so it can be restarted without going through the "create" stage again.
So stopping a module kills the workload API server, and restarting the module gets the container running again but does not start the workload server, because the server is only started in "create".
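
The sequence described above can be sketched as a toy state model. This is only an illustration of the reasoning, not the daemon's code: the `Module` class, its attributes, and `stop_start_cycle` are hypothetical names invented for this sketch.

```python
class Module:
    """Toy model (hypothetical) of a module whose workload listener starts in create()."""

    def __init__(self, name):
        self.name = name
        self.created = False
        self.running = False
        self.listener_up = False

    def create(self):
        # Behaviour described above: the workload socket/server is set up in "create".
        self.created = True
        self.listener_up = True

    def start(self):
        # A stopped container is still "created", so start() never revisits create().
        if not self.created:
            raise RuntimeError("module must be created first")
        self.running = True

    def stop(self):
        # The container stays "created", but the listener is torn down with it.
        self.running = False
        self.listener_up = False


def stop_start_cycle():
    """Run create -> start -> stop -> start and report (running, listener_up)."""
    module = Module("registry")
    module.create()
    module.start()
    module.stop()
    module.start()
    return module.running, module.listener_up
```

In this model the module ends up running while its listener is down, which is the failure the description reports.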

The following was tested for the 1.1.5 fix (382cf77):

Stopped a module and checked that the listener was stopped.
Removed a module, both with and without stopping it first.

Stopping a module and restarting it (without removing it) was not tried, so the issue slipped through the net.

Any of the following works around the issue:

  • restart iotedge (this re-creates the server)
  • kill the container so that it goes through the "create" stage again
  • put a dummy variable in the portal (this forces a restart).

Test:
Linux
Create a module "registry" and check that it responds to curl --unix-socket /var/lib/iotedge/mnt/registry.sock http://hello
Stop the module "registry" and check that it still responds to curl --unix-socket /var/lib/iotedge/mnt/registry.sock http://hello
Start the module "registry" and check that it still responds to curl --unix-socket /var/lib/iotedge/mnt/registry.sock http://hello
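
Where curl is not available, a rough equivalent of the check above could look like the following sketch. It only verifies that something accepts connections on the Unix socket and does not send an HTTP request the way the curl commands do. The function name `socket_accepts_connections` is hypothetical; the socket path in the usage note is the one from the test steps.

```python
import socket


def socket_accepts_connections(path, timeout=2.0):
    """Return True if a listener accepts a connection on the Unix socket at `path`."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        try:
            client.connect(path)
        except OSError:
            # Covers a missing socket file, a refused connection, and a timeout.
            return False
        return True
```

For the steps above, `socket_accepts_connections("/var/lib/iotedge/mnt/registry.sock")` would be expected to stay true after creating, stopping, and starting the module.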

Tested on both Linux and Windows (no curl on Windows):
Killed a container and checked that it restarts correctly; checked in the logs that the listener is created.
Put a dummy variable in edgeAgent to force a restart.
Simple iotedge restart.
Start/stop/start a container.

@nyanzebra
Contributor

So the solution is to now not send the Stop?

@huguesBouvier
Contributor Author

> So the solution is to now not send the Stop?

Exactly. We leave the server running even if the container is stopped.
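
As a rough before/after comparison of that answer, the sketch below replays lifecycle events with and without the stop tearing down the listener. The function, the event names, and the flag `stop_kills_listener` are hypothetical, and the assumption that removing a module still shuts the listener down is not confirmed in this thread.

```python
def listener_state(events, stop_kills_listener):
    """Replay lifecycle events and return whether the workload listener is up."""
    listener_up = False
    for event in events:
        if event == "create":
            # The listener is started in the "create" stage.
            listener_up = True
        elif event == "stop" and stop_kills_listener:
            # Old behaviour: stopping the container also stopped the server.
            listener_up = False
        elif event == "remove":
            # Assumption: removing the module still tears the listener down.
            listener_up = False
        # "start" never touches the listener in either variant.
    return listener_up


cycle = ["create", "start", "stop", "start"]
before_fix = listener_state(cycle, stop_kills_listener=True)
after_fix = listener_state(cycle, stop_kills_listener=False)
```

Under these assumptions the listener is down after the cycle with the old behaviour and still up once the stop is no longer sent.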

varunpuranik previously approved these changes Sep 22, 2021
nyanzebra previously approved these changes Sep 22, 2021
@huguesBouvier huguesBouvier merged commit 2e6b208 into Azure:release/1.1 Sep 24, 2021
@huguesBouvier huguesBouvier deleted the fix_1.1.6 branch September 24, 2021 15:00
4 participants