v2.10.1: Merge develop into release-2.10 #2333
The new parameter allows specifying a custom node URL for the createami test. This is needed when a version bump has been done and the node package is not yet present on PyPI. Signed-off-by: Luca Carrogu <carrogu@amazon.com>
Signed-off-by: Enrico Usai <usai@amazon.com>
* Refactor src and tests structure according to https://docs.pytest.org/en/stable/goodpractices.html
* Differentiated between tests with coverage and tests without coverage. Tests without coverage run on all supported Python versions and test against the installed version of the CLI (installed from the sdist package). Tests with coverage run only on Python 3.8 and are executed against the package installed in development mode.
* Added Python 3.9 to tests
* Grouped all Travis tasks in a single stage so that the run is faster
* Updated setup.py file to reflect the new structure and to add missing project information
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Tests shared with develop are put in common.yaml and included through Jinja. Signed-off-by: Luca Carrogu <carrogu@amazon.com>
…ws#2234)
1. Test that `additional_sg` in the config file is added to head and compute nodes
2. Test that `ssh_from` in the config file applies to the pcluster security group of the head node
3. Test that `vpc_security_group_id` in the config file overwrites the security group of head and compute nodes, FSx, and EFS
Signed-off-by: Hanwen <hanwenli@amazon.com>
* This test verifies that the FSx file system launched by pcluster has the deployment type set by the user in the pcluster config file.
* FSx file systems have three deployment types in commercial regions: SCRATCH_1 (default), SCRATCH_2, and PERSISTENT_1
Signed-off-by: Yulei Wang <yuleiwan@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Previously we continued polling only when the state was CREATING or TRANSFERRING. According to the boto3 docs, we should also handle the PENDING state. Signed-off-by: Tim Lane <tilne@amazon.com>
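The polling change described above can be sketched as follows. This is a minimal illustration, not the project's code: `wait_until_done`, `get_state`, and the `AVAILABLE` terminal state used in the usage note are hypothetical, and only the three in-progress state names come from the commit message.

```python
import time

# States during which polling should continue. PENDING is the state added by
# this commit; CREATING and TRANSFERRING were already handled.
IN_PROGRESS_STATES = {"PENDING", "CREATING", "TRANSFERRING"}


def wait_until_done(get_state, poll_interval=0.0, max_attempts=100):
    """Poll get_state() until it returns a state that is not in progress.

    get_state is a hypothetical zero-argument callable standing in for the
    boto3 describe call used by the real code.
    """
    for _ in range(max_attempts):
        state = get_state()
        if state not in IN_PROGRESS_STATES:
            return state
        time.sleep(poll_interval)
    raise TimeoutError("still in progress after {0} attempts".format(max_attempts))
```

For example, with a fake state sequence `PENDING`, `CREATING`, `AVAILABLE`, the function keeps polling through the first two states and returns `AVAILABLE`. Without `PENDING` in the set, it would have returned on the first call.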
Add a 1 second sleep to give sqswatcher time to reconfigure the master with np = max_nodes * node_slots. This operation is performed right after sqswatcher removes the compute nodes from the scheduler. Signed-off-by: Luca Carrogu <carrogu@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
This removes the need to call the CloudFormation API at every docker container launch. To do so, the AWS Batch substack now depends on the head node substack. This makes cluster creation around 40% slower when awsbatch is selected as the scheduler. Signed-off-by: Francesco De Martino <fdm@amazon.com>
If `custom_node` was not specified, the `env` variable was referenced before assignment. Signed-off-by: Enrico Usai <usai@amazon.com>
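A minimal sketch of this class of bug and its fix, assuming `env` is a dictionary of environment variables passed to a subprocess. The function name `build_env` and the `CUSTOM_NODE_URL` key are hypothetical and are not taken from the test code.

```python
import os


def build_env(custom_node=None):
    """Return the environment for a subprocess, optionally with a custom node URL."""
    # Assign env unconditionally so that it exists on every code path.
    # The buggy pattern assigns it only inside the `if custom_node:` branch,
    # which fails with UnboundLocalError when custom_node is not given.
    env = dict(os.environ)
    if custom_node:
        # Hypothetical variable name, used only for illustration.
        env["CUSTOM_NODE_URL"] = custom_node
    return env
```

Calling `build_env()` without arguments now returns a plain copy of the current environment instead of raising.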
This makes sure we always download the latest for Amazon Linux 2 Signed-off-by: Francesco De Martino <fdm@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
This test uses `troposphere` to create CloudFormation stacks for `efs`, `mount target`, and an instance that writes an empty file with a random name into the efs. The test then verifies that, when the existing `efs` is provided through `efs_fs_id` in the `pcluster` config file, the created cluster can read the randomly named file and share files between head node and compute node. Signed-off-by: Hanwen <hanwenli@amazon.com>
Add support for the io2 volume type in the EBS section and the Raid section, and add an integration test for the different volume types. Signed-off-by: chenwany <chenwany@amazon.com>
Signed-off-by: chenwany <chenwany@amazon.com>
Signed-off-by: Enrico Usai <usai@amazon.com>
Signed-off-by: Enrico Usai <usai@amazon.com>
Signed-off-by: Enrico Usai <usai@amazon.com>
Signed-off-by: Enrico Usai <usai@amazon.com>
Signed-off-by: Enrico Usai <usai@amazon.com>
When running pcluster in a region with free tier, default instance type is set to the free tier instance type. When running pcluster in the China (BJS) region or AWS GovCloud (US) regions, default instance type is t3.micro. Free tier is not available in the China (BJS) region and AWS GovCloud (US) regions. For more information about free tier, please see https://aws.amazon.com/free/free-tier-faqs/ Signed-off-by: Hanwen <hanwenli@amazon.com>
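The selection rule can be sketched as below. This is an illustration under assumptions: the commit does not say how the free tier instance type is discovered, so here it is simply passed in, with `None` meaning that free tier is not available in the region. The function name is hypothetical.

```python
# Fallback named in the commit message for regions without free tier.
FALLBACK_INSTANCE_TYPE = "t3.micro"


def default_instance_type(free_tier_instance_type=None):
    """Return the free tier instance type if the region has one, else t3.micro."""
    return free_tier_instance_type or FALLBACK_INSTANCE_TYPE
```

A region whose free tier type is `t2.micro` would get `t2.micro` as default, while a region with no free tier gets `t3.micro`.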
Move half of the p4d tests to PDX (us-west-2) Signed-off-by: Luca Carrogu <carrogu@amazon.com>
) Compute instance type parameter is not rendered if scheduler is Slurm. This caused the error `Parameters: [ComputeInstanceType] must have values` in CloudFormation because a value was still expected. With this commit we set "NONE" as default to prevent this value being silently used as if set by the user. Signed-off-by: ddeidda <ddeidda@amazon.com>
The reason for this change is that not all regions support c4.xlarge, while support for the C5 family is broader. What does this change solve? It allows the test to run in regions where C4 is not available. Signed-off-by: Luca Carrogu <carrogu@amazon.com>
What does the change solve? It allows the IAM policies test to run in regions where AWS Batch is not available. Signed-off-by: Luca Carrogu <carrogu@amazon.com>
This config will be used as a test bed for a new region. Signed-off-by: Luca Carrogu <carrogu@amazon.com>
The `network_interfaces_count` parameter depends on `compute_instance_type`, hence it could fail if this parameter is not specified in the config file. Since the default instance type will always have 1 network interface we can safely return 1 when compute_instance_type is not specified. Signed-off-by: ddeidda <ddeidda@amazon.com>
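The fallback can be sketched as follows. The lookup table is a stand-in for whatever instance type metadata the CLI actually queries, and both the table and the function signature are assumptions made for illustration.

```python
# Illustrative lookup table, not the CLI's data source.
NETWORK_INTERFACES_BY_TYPE = {"p4d.24xlarge": 4}


def network_interfaces_count(compute_instance_type=None):
    """Return the number of network interfaces for the compute instance type."""
    if compute_instance_type is None:
        # No instance type in the config file: the default instance type
        # always has a single network interface, so returning 1 is safe.
        return 1
    return NETWORK_INTERFACES_BY_TYPE.get(compute_instance_type, 1)
```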
The test using p4d.24xlarge with the slurm scheduler is already performed by the test_hit_efa test. Change test_sit_efa to use sge and move it to us-west-2. Remove the warning when using p4d.24xlarge with scheduler != slurm. Signed-off-by: Luca Carrogu <carrogu@amazon.com>
Signed-off-by: Yulei Wang <yuleiwan@amazon.com>
The new EFA installer provides the EFA kmod for all supported OSs except for Centos8. This commit adds a validator to prevent EFA from being enabled on ARM architectures with Centos8. Signed-off-by: ddeidda <ddeidda@amazon.com>
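A validator of this kind might look like the sketch below. The function name, the parameter names, the `arm64` architecture string, and the error text are assumptions. Only the rule itself (no EFA on ARM with Centos8) comes from the commit message.

```python
def validate_efa_os_arch(enable_efa, base_os, architecture):
    """Return a list of error messages; an empty list means the config is valid."""
    errors = []
    if enable_efa and base_os == "centos8" and architecture == "arm64":
        errors.append(
            "EFA cannot be enabled on {0} with the {1} architecture".format(base_os, architecture)
        )
    return errors
```

Returning a list rather than raising lets a caller collect the results of several validators before reporting them together.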
* Remove the ban of using p4d as head node Signed-off-by: Hanwen <hanwenli@amazon.com> * Update CHANGELOG.md Co-authored-by: Francesco De Martino <demartinof@icloud.com>
* Modify hit_scaling tests to test logic when clustermgtd is down
* Computemgtd should terminate any instance in DOWN or POWER_SAVE state, or if slurmctld is down
* ResumeProgram should not launch any instance if clustermgtd is down
Signed-off-by: Rex <shuningc@amazon.com>
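The expected behavior that these tests check can be restated as two small predicates. This is only a reading of the commit message, not the daemons' implementation: the function names and the plain-string node states are assumptions.

```python
# Node states that should lead computemgtd to terminate the instance
# while clustermgtd is down, as listed in the commit message.
TERMINATE_STATES = {"DOWN", "POWER_SAVE"}


def computemgtd_should_terminate(node_state, slurmctld_reachable):
    """Decide whether computemgtd terminates the instance when clustermgtd is down."""
    return (not slurmctld_reachable) or node_state in TERMINATE_STATES


def resume_program_may_launch(clustermgtd_running):
    """ResumeProgram launches instances only while clustermgtd is running."""
    return clustermgtd_running
```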
Signed-off-by: Yulei Wang <yuleiwan@amazon.com>
aws#2304)
1. Add `iam_lambda_role` parameter to the config file. If specified, this role will be attached to all Lambda function resources created by CloudFormation templates.
2. If both `ec2_iam_role` and `iam_lambda_role` are provided, and the scheduler is `sge`, `torque`, or `slurm`, no roles will be created by `pcluster` commands. Note that if `awsbatch` is the scheduler, IAM role creation will still occur during `pcluster create`.
3. Integration tests: Extract some functions (role creation, policy creation) from `storage.kms_key_factory` to `conftest`. The code in `kms_key_factory` is kept untouched to limit the scale of this commit.
Signed-off-by: Hanwen <hanwenli@amazon.com>
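The rule in point 2 can be summarized with a small predicate. This is a sketch of the described behavior only. The function name is hypothetical and the real decision is made in the CloudFormation templates.

```python
def pcluster_creates_iam_roles(scheduler, ec2_iam_role=None, iam_lambda_role=None):
    """Return True if pcluster still creates IAM roles for this configuration."""
    if scheduler == "awsbatch":
        # With awsbatch, roles are created even when both roles are provided.
        return True
    # For sge, torque, and slurm, nothing is created once both roles are provided.
    return not (ec2_iam_role and iam_lambda_role)
```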
Signed-off-by: chenwany <chenwany@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
The final number returned from `lspci -n` can be different from 0. Signed-off-by: ddeidda <ddeidda@amazon.com>
P4d is now also supported as head node. Signed-off-by: ddeidda <ddeidda@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
GPUs from manufacturers other than NVIDIA (e.g. AMD) are currently not supported in ParallelCluster. With this patch we introduce a warning message that is printed when GPUs from a manufacturer other than NVIDIA are detected, and we prevent them from being set in compute resources. Signed-off-by: ddeidda <ddeidda@amazon.com>
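The check might be sketched as follows. The function name, its arguments, and the warning text are assumptions for illustration, and the patch's own detection logic is not shown in this conversation.

```python
import warnings

SUPPORTED_GPU_MANUFACTURER = "NVIDIA"


def gpus_for_compute_resource(manufacturer, gpu_count):
    """Return the GPU count to set in a compute resource, or 0 if unsupported."""
    if manufacturer.upper() != SUPPORTED_GPU_MANUFACTURER:
        warnings.warn(
            "GPUs from manufacturer {0} are not supported and will be ignored".format(manufacturer)
        )
        # Unsupported GPUs are not set in the compute resource.
        return 0
    return gpu_count
```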
Signed-off-by: Rex <shuningc@amazon.com>
When P4d instances are used as head node, the parameter use_public_ips must be set to true in order for the public IP to be assigned to the instance. Signed-off-by: ddeidda <ddeidda@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Signed-off-by: Francesco De Martino <fdm@amazon.com>
Signed-off-by: Tim Lane <tilne@amazon.com>
Modify the iops and size range to unblock users from creating io2 Block Express volumes Signed-off-by: chenwany <chenwany@amazon.com>
Changelog
```
- EFA configuration: ``efa-config-1.7`` (from efa-config-1.5)
- EFA profile: ``efa-profile-1.3`` (from efa-profile-1.1)
- EFA kernel module: ``efa-1.10.2`` (no change)
- RDMA core: ``rdma-core-31.2amzn`` (from rdma-core-31.amzn0)
- Libfabric: ``libfabric-1.11.1amzn1.0`` (from libfabric-1.11.1amzn1.1)
- Open MPI: ``openmpi40-aws-4.1.0`` (from openmpi40-aws-4.0.5)
```
Signed-off-by: Luca Carrogu <carrogu@amazon.com>
Build Number 597
aws-parallelcluster-cookbook Git hash: d5378bb60f7810bb2f467e5ada9589cc8607ee2e
aws-parallelcluster-node Git hash: ae7c4b123d18399361b85e31473ad9ee53b21e45
Signed-off-by: ParallelCluster AMI bot <ec2-ds9-dev@amazon.com>
Codecov Report
@@ Coverage Diff @@
## release-2.10 #2333 +/- ##
================================================
+ Coverage 61.81% 61.83% +0.01%
================================================
Files 39 40 +1
Lines 6060 6186 +126
================================================
+ Hits 3746 3825 +79
- Misses 2314 2361 +47
Since it appears to be stuck, I'm going to disable the Travis checks on this branch. The CFN linter failure is expected. Merging.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.