Bugfix/brbdcf 1172 cannot get lock hosts pool deployment #15
Pull Request description
Description of the change
Implemented a workaround for a Consul issue regarding the time to wait for a lock (issue hashicorp/consul#4003).
Added a unit test where 50 threads attempt to allocate one host each in a Pool of 50 free hosts (before the workaround, the test failed on an error getting the lock).
Increased the max timeout used in Yorc to manage more than 150 simultaneous requests.
For now the lock timeout cannot be changed by the user; making it configurable may be needed depending on the size of deployments we expect to support with Hosts Pools.
Misc change: while testing with a big hosts pool, I noticed a limitation in Consul regarding the number of operations that can be performed in a single transaction (64). The Consul implementation now splits the apply transaction into several transactions when it contains more operations than this limit.
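The split can be sketched as follows. This is a simplified, standalone version: the real implementation operates on Consul `api.KVTxnOp` slices, and `txnOp`/`splitTxn` here are illustrative names.

```go
package main

import "fmt"

// maxTxnOps is Consul's limit on the number of operations in a single transaction.
const maxTxnOps = 64

// txnOp stands in for a Consul KV transaction operation (api.KVTxnOp in the real code).
type txnOp struct {
	verb, key string
}

// splitTxn splits a list of operations into chunks that each fit within
// Consul's per-transaction limit, so each chunk can be applied as its own
// transaction.
func splitTxn(ops []txnOp) [][]txnOp {
	var chunks [][]txnOp
	for len(ops) > maxTxnOps {
		chunks = append(chunks, ops[:maxTxnOps])
		ops = ops[maxTxnOps:]
	}
	if len(ops) > 0 {
		chunks = append(chunks, ops)
	}
	return chunks
}

func main() {
	// 150 operations exceed the limit and must be applied in 3 transactions.
	chunks := splitTxn(make([]txnOp, 150))
	fmt.Println(len(chunks), len(chunks[0]), len(chunks[2]))
}
```

Note that splitting loses the atomicity of the single transaction; in this workaround each chunk succeeds or fails independently.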
What I did
The issue hashicorp/consul#4003 created on Consul is that when a lock is taken with the property `LockTryOnce=true`, a call to `lock.Lock()` is expected to attempt to get the lock for a duration of `LockWaitTime` before returning without having succeeded. Actually, with a `LockWaitTime` of 45 seconds, the unit test with 50 threads failed in around 5 seconds.

Modified our code to no longer use `LockTryOnce=true`. Instead, as a workaround, a timer is armed to close a channel `stopChannel` after a `LockWaitTime` duration. This channel `stopChannel` is passed as an argument to `lock.Lock()`, which blocks until it detects that `stopChannel`
has been closed.

How to verify it
Initially, a deployment of elk-basic with a pool of 4 hosts was reported to fail. I tried again a 4-host deployment in a development environment.
As the yorc user, generate a private/public key pair in a directory, check the private key is readable only by your yorc user, and add the public key content to an authorized_keys file.
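For example (a sketch; the directory, key type, and file names are illustrative choices):

```shell
# Generate a key pair for the yorc user (no passphrase here for brevity)
mkdir -p "$HOME/yorc-keys"
ssh-keygen -t rsa -b 2048 -f "$HOME/yorc-keys/yorc" -N "" -q
# Make the private key readable only by the yorc user
chmod 400 "$HOME/yorc-keys/yorc"
# Add the public key content to an authorized_keys file
cat "$HOME/yorc-keys/yorc.pub" >> "$HOME/yorc-keys/authorized_keys"
```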
Create a devenv.sh file containing your proxy settings:
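Something along these lines (proxy hosts and ports are placeholders to adapt to your environment):

```shell
# devenv.sh - proxy settings sourced by the development environment
export HTTP_PROXY="http://proxy.example.com:8080"
export HTTPS_PROXY="http://proxy.example.com:8080"
export NO_PROXY="localhost,127.0.0.1"
# Some tools only read the lowercase variants
export http_proxy="$HTTP_PROXY"
export https_proxy="$HTTPS_PROXY"
export no_proxy="$NO_PROXY"
```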
Then create 4 SSH servers:
Verify that, as the yorc user, you can connect to each host using the private key you generated:
Create a Hosts Pool made of these 4 hosts, using a YAML file like https://github.com/laurentganne/ystia-devenv/blob/master/testlab/pool4hosts.yaml and running:
The following steps assume a setup where the Yorc plugin was added to Alien4Cloud, Yorc was configured as an orchestrator, and a Hosts Pool location was configured.
From Alien4Cloud, menu Administration > Orchestrators, select Yorc. Then in the left-hand side menu, select Location, then On demand resources, and add the following resource: `yorc.nodes.hostspool.Compute`. No need to set credentials or any other property.

From Alien4Cloud, menu Catalog > Manage Archives > Git import, add the following Git location:
From the menu Applications, create a new application, initializing the topology from the topology template `welcome_basic`. Select the Environment, then Topology, and click on Edit to edit the topology. Remove the component Network. Then add 3 new components of type Compute, and within each of these 3 new Compute components add a component Welcome. So finally we have 4 compute nodes, with a Welcome application deployed on each.
Click on Save to save the changes. Then go back to Environment. At the Location step, select the Hosts Pool. Then deploy the application.

When the deployment has started on Yorc, run the command `yorc hp list` to check that the 4 hosts in the pool were allocated for this application.
Wait until the deployment ends successfully. Then undeploy the application and wait until it is undeployed successfully.

Run the command `yorc hp list` again to check that the 4 hosts are now free in the Hosts Pool.