Octopus takes a long time to connect to deployment targets #5641

Closed
TomPeters opened this issue Jun 24, 2019 · 2 comments · Fixed by OctopusDeploy/Halibut#88

Comments

@TomPeters

commented Jun 24, 2019

Prerequisites

  • I have verified the problem exists in the latest version
  • I have searched open and closed issues to make sure it isn't already reported
  • I have written a descriptive issue title
  • I have linked the original source of this report
  • I have tagged the issue appropriately (area/*, kind/bug, tag/regression?)

The bug

When performing a task against a set of machines (e.g. deployments, health checks), Octopus Server needs to establish a connection to each of those machines. Those connections should be established in parallel where required.

Instead, these connections happen sequentially. For example, if you have three machines A, B and C, then Octopus server might connect to A first. It will wait until a connection has been successfully established to A before attempting to connect to B, and then wait for an established connection to B before attempting to connect to C.

In many cases this does not significantly affect the overall performance of the system, but in some cases it can cause significant delays. One such case is when some of the machines are offline or inaccessible: Octopus tries to connect, times out, and retries multiple times, and during that period no other connections can be established. If there are multiple such machines, significant delays can be seen across the whole Octopus instance.
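To make the cost concrete, here is a rough back-of-the-envelope model of the worst case. The timeout and retry figures are hypothetical (the report does not state them), and the functions are illustrative only, not part of Octopus or Halibut.

```python
def sequential_wait(offline_count, timeout_s, attempts):
    """Worst-case seconds spent blocked when connections are made one at a time."""
    return offline_count * timeout_s * attempts


def parallel_wait(offline_count, timeout_s, attempts):
    """Worst-case seconds when each machine is connected to independently."""
    return timeout_s * attempts if offline_count else 0


# Hypothetical figures: 20 offline machines, 60 s timeout, 3 attempts each.
print(sequential_wait(20, 60, 3))  # 3600
print(parallel_wait(20, 60, 3))    # 180
```

Under these assumed numbers, sequential connection attempts could hold up other work for an hour, while independent attempts would cost at most one machine's worth of timeouts.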

What I expected to happen

Octopus should be able to obtain connections to different machines in parallel. Connections to different machines should not affect each other in any way.
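The difference between the observed and expected behaviour can be sketched with a small simulation. This is not how Octopus Server or Halibut is implemented; `connect` is a hypothetical stand-in in which an offline machine simply blocks for a short, made-up timeout.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def connect(machine, offline, timeout_s=0.2):
    """Hypothetical connection attempt: offline machines block until the timeout."""
    if machine in offline:
        time.sleep(timeout_s)
        return machine, False
    return machine, True


def connect_sequentially(machines, offline):
    # Observed behaviour: each attempt waits for the previous one to finish.
    return [connect(m, offline) for m in machines]


def connect_in_parallel(machines, offline):
    # Expected behaviour: attempts do not block one another.
    with ThreadPoolExecutor(max_workers=len(machines)) as pool:
        return list(pool.map(lambda m: connect(m, offline), machines))


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


machines = ["A", "B", "C", "D"]
offline = {"A", "B"}
seq_result, seq_elapsed = timed(connect_sequentially, machines, offline)
par_result, par_elapsed = timed(connect_in_parallel, machines, offline)
print(f"sequential: {seq_elapsed:.2f}s, parallel: {par_elapsed:.2f}s")
```

Both approaches return the same results; only the elapsed time differs, because in the parallel version the two offline machines time out concurrently instead of one after the other.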

Steps to reproduce

  1. Add a large number of machines to Octopus (say 100 machines)
  2. Make sure that a subset of those machines are offline (say 20)
  3. Perform a health check against all machines

For some of the machines, the timestamp of the first message, "Performing health check on machine", will be minutes earlier than the timestamp of the very next message. This time difference should not occur.

Log excerpt

Observe the time difference between the first two timestamped lines of this log excerpt (06:59:18 and 07:02:28).

                    |   == Warning: Check deployment target: MyMachine ==
06:59:18   Verbose  |     Performing health check on machine
07:02:28   Verbose  |     Starting C:\Windows\system32\WindowsPowershell\v1.0\PowerShell.exe in working directory 'C:\Octopus\Work\xxx'...
07:02:28   Info     |     Host Name: MyMachine
07:02:28   Info     |     Running As: PC\SYSTEM (Local Administrator: True)
07:02:28   Info     |     Running Tentacle version 3.22.0
07:02:28   Info     |     Tentacle communication uses a 'sha1RSA' certificate
07:02:28   Warning  |     Not running latest version of Calamari. Expected: 4.19.3
07:02:28   Info     |     Drive C: has 28 GB available
07:02:28   Verbose  |     Process C:\Windows\system32\WindowsPowershell\v1.0\PowerShell.exe in C:\Octopus\Work\xxx exited with code 0
07:04:08   Verbose  |     Exit code: 0
07:04:08   Verbose  |     Checking if Calamari should be installed or updated
07:04:08   Info     |     Calamari is missing or out of date and will be installed when the machine is the target of a deployment.
07:04:08   Verbose  |     Recording health check results
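When checking many task logs for this symptom, a small script can flag suspicious gaps between consecutive lines. The following is a minimal sketch that assumes the `HH:MM:SS   Level  |  message` layout shown above. The `find_gaps` helper and the 60-second threshold are hypothetical choices, and the sketch does not handle logs that cross midnight.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as "06:59:18   Verbose  |     Performing health check on machine".
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(\w+)\s+\|\s+(.*)$")


def find_gaps(log_text, threshold=timedelta(seconds=60)):
    """Return (previous_message, next_message, delta) for each gap >= threshold."""
    gaps = []
    previous = None
    for line in log_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip headers and lines without a timestamp
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        message = match.group(3)
        if previous is not None:
            delta = stamp - previous[0]
            if delta >= threshold:
                gaps.append((previous[1], message, delta))
        previous = (stamp, message)
    return gaps


# Lines copied from the excerpt above (raw string keeps the backslashes intact).
sample = r"""
06:59:18   Verbose  |     Performing health check on machine
07:02:28   Verbose  |     Starting C:\Windows\system32\WindowsPowershell\v1.0\PowerShell.exe in working directory 'C:\Octopus\Work\xxx'...
07:02:28   Verbose  |     Process C:\Windows\system32\WindowsPowershell\v1.0\PowerShell.exe in C:\Octopus\Work\xxx exited with code 0
07:04:08   Verbose  |     Exit code: 0
"""

gaps = find_gaps(sample)
for before, after, delta in gaps:
    print(f"{delta} between {before!r} and {after[:40]!r}")
```

The first reported gap is the one this issue describes: the wait between "Performing health check on machine" and the next message.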

Affected versions

The bug is present in Octopus Server, so the installed version of Tentacle on any of the machines is irrelevant.

Octopus Server:
2019.5.2 - 2019.5.11 (inclusive)

Workarounds

  • Roll back to an unaffected version of Octopus Server.
  • Take action to ensure that connections to all machines can be obtained quickly. This may mean deleting any inaccessible machine resources.

Links

Internal support tickets

Public support tickets

Fixed in

Halibut 4.3.14 - OctopusDeploy/Halibut#88
OctopusShared 4.9.32 - OctopusDeploy/OctopusShared@4ceb71b
OctopusDeploy PR#3993 - OctopusDeploy/OctopusDeploy#3993

@TomPeters

commented Jun 24, 2019

Release Note: Fixed a bug where Octopus can take a long time to connect to deployment targets

@jburger jburger closed this Jun 25, 2019
@lock

commented Sep 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. If you think you've found a related issue, please contact our support team so we can triage your issue, and make sure it's handled appropriately.

@lock lock bot locked as resolved and limited conversation to collaborators Sep 24, 2019