An unexpected error occurred: cannot import name util in jobwatcher and sqswatcher #1236

Closed
porcaro33 opened this issue Jul 29, 2019 · 9 comments

Comments

@porcaro33

Environment:

  • AWS ParallelCluster 2.4.0
  • OS: CentOS7
  • Scheduler: SGE
  • Master instance type: m5.2xlarge
  • Compute instance type: c5.18xlarge

Bug description and how to reproduce:
I installed GNOME Desktop and VNC Server on the master node, then rebooted it.
Since then, when I submit jobs to SGE, the cluster never scales out. I found "An unexpected error occurred: cannot import name util" in both the jobwatcher and sqswatcher logs.

How can I fix this?

@porcaro33
Author

I have attached both watcher log files:

jobwatcher.txt
sqswatcher.txt

@tgjohnst

Issue #1142 might contain relevant discussion. Do GNOME Desktop or VNC Server have the paramiko package as a dependency or sub-dependency?

If this is indeed another Python package conflict, the release notes suggest that updating to pcluster 2.4.1 might solve your issue, because that release switches the node daemons to isolated virtualenvs.

@demartinofra
Contributor

Hi,

Indeed, that's exactly what happened. The custom packages broke the system Python dependencies used by ParallelCluster, so the scaling daemons are failing. Upgrading to 2.4.1 will solve the issue, since the pcluster daemons will then run in isolated virtualenvs.
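
To check whether a system package was shadowed or replaced, something like the following might help. It is only a diagnostic sketch: `check_import` is a hypothetical helper, not part of ParallelCluster, and the thread does not confirm which module the failing `util` import belongs to (paramiko is only a guess based on #1142).

```python
import importlib


def check_import(module_name, attr_name):
    """Report where a module is loaded from and whether
    `from module_name import attr_name` would succeed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # The module itself is missing or broken.
        return {"module": module_name, "file": None, "ok": False, "error": str(exc)}

    # `from m import x` succeeds if x is an attribute of m or a submodule of m.
    ok = hasattr(module, attr_name)
    if not ok:
        try:
            importlib.import_module(module_name + "." + attr_name)
            ok = True
        except ImportError:
            ok = False

    return {
        "module": module_name,
        # The path shows which installation of the package was picked up.
        "file": getattr(module, "__file__", None),
        "ok": ok,
        "error": None if ok else "cannot import name " + attr_name,
    }
```

Running `check_import("paramiko", "util")` with the same interpreter the daemons use would show which copy of the package is being loaded, which can point to the installation that introduced the conflict.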

@porcaro33
Author

porcaro33 commented Jul 30, 2019

Hi, I upgraded ParallelCluster to 2.4.1 and ran "pcluster create" with the same config file, but I got "Unexpected error of type ValueError: too many values to unpack", as shown below.
I am checking the latest config syntax now...

$ pcluster create lustre-dev -c lustre-dev
Beginning cluster creation for cluster: lustre-dev
Unexpected error of type ValueError: too many values to unpack

@demartinofra
Contributor

Could you share the config with us?

@porcaro33
Author

Here is my config file:

test-cluster.txt

@porcaro33
Author

I also tried the suggestion in #1241 (comment), and here is the traceback:

Traceback (most recent call last):
  File "/home/centos/venv/pcluster-2.4.1/bin/pcluster", line 10, in <module>
    sys.exit(main())
  File "/home/centos/venv/pcluster-2.4.1/lib/python2.7/site-packages/pcluster/cli.py", line 371, in main
    args.func(args)
  File "/home/centos/venv/pcluster-2.4.1/lib/python2.7/site-packages/pcluster/cli.py", line 29, in create
    pcluster.create(args)
  File "/home/centos/venv/pcluster-2.4.1/lib/python2.7/site-packages/pcluster/pcluster.py", line 77, in create
    config = cfnconfig.ParallelClusterConfig(args)
  File "/home/centos/venv/pcluster-2.4.1/lib/python2.7/site-packages/pcluster/cfnconfig.py", line 77, in __init__
    self.__init_cluster_parameters()
  File "/home/centos/venv/pcluster-2.4.1/lib/python2.7/site-packages/pcluster/cfnconfig.py", line 526, in __init_cluster_parameters
    self.__validate_resource(cluster_options.get(key)[1], temp)
  File "/home/centos/venv/pcluster-2.4.1/lib/python2.7/site-packages/pcluster/cfnconfig.py", line 289, in __validate_resource
    self.__resource_validator.validate(resource_type, resource_value)
  File "/home/centos/venv/pcluster-2.4.1/lib/python2.7/site-packages/pcluster/config_sanity.py", line 354, in validate
    for actions, resource_arn in iam_policy:
ValueError: too many values to unpack
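
For readers unfamiliar with this error, here is a minimal sketch of how a loop shaped like the last frame can raise it. The data is made up: the real structure of `iam_policy` in `config_sanity.py` is not shown in this thread, so `collect_policy_pairs` and both sample lists are hypothetical.

```python
def collect_policy_pairs(iam_policy):
    """Mimic the loop in the traceback: each item must unpack into exactly two values."""
    pairs = []
    for actions, resource_arn in iam_policy:
        pairs.append((actions, resource_arn))
    return pairs


# Hypothetical data: each entry is an (actions, resource) pair, so unpacking works.
good = [(["ec2:DescribeInstances"], "*")]

# Hypothetical data: a third element per entry breaks the two-name unpacking.
bad = [(["ec2:DescribeInstances"], "*", "extra")]

try:
    collect_policy_pairs(bad)
except ValueError as exc:
    # Python 2.7 reports "too many values to unpack"; Python 3 appends "(expected 2)".
    message = str(exc)
```

Any entry with more than two elements triggers the error, which is consistent with a validator that assumes a fixed shape for each policy item.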

@demartinofra
Contributor

Thanks for sharing the additional debugging info. The issue you are facing is the same one reported in #1241. The fix is merged but will only be available in the next version of ParallelCluster. In the meantime, you can disable sanity_check as a workaround.
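
As a rough sketch of that workaround, the snippet below flips `sanity_check` to `false` in a config given as text. It assumes the option lives in the `[global]` section, as in ParallelCluster 2.x configs; please verify this against the documentation for your version. `disable_sanity_check` and the sample config are hypothetical.

```python
import configparser
import io


def disable_sanity_check(config_text):
    """Return the config text with sanity_check set to false in [global]."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumption: sanity_check is read from the [global] section.
    if not parser.has_section("global"):
        parser.add_section("global")
    parser.set("global", "sanity_check", "false")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical minimal config, for illustration only.
example = """[global]
cluster_template = default
sanity_check = true

[cluster default]
scheduler = sge
"""
updated = disable_sanity_check(example)
```

Note that `configparser` drops comments when it rewrites a file, so for a real config it may be simpler to edit the single line by hand.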

@porcaro33
Author

I disabled sanity_check and added "autoScaling:SetInstanceHealth" to the InstanceRole, and that fixed my issue.

Thank you all!
