Trouble getting client on SGE cluster using MongoDB backend #8657

Closed
jakirkham opened this issue Jul 24, 2015 · 14 comments

Comments

@jakirkham
Contributor

If I specify the MongoDB backend in ipcluster_config.py, every call to Client(profile=<PROFILE_NAME>) fails with an IOError.

Eventually, the cluster disbands of its own accord, suggesting that it failed to connect. However, I don't see this with the default database backend (i.e. NoDB) or the sqlite backend (i.e. SQLiteDB).

I am using IPython ( 3.2.1 ), MongoDB ( 2.4.6 ), and PyMongo ( 3.0.3 ).

@minrk minrk added this to the not ipython milestone Sep 9, 2015
@minrk
Member

minrk commented Sep 9, 2015

This is likely a failure to connect to MongoDB. @jakirkham can you show the contents of the ipcluster log file?

@jakirkham
Contributor Author

Sure. It doesn't say much, but this is what is in the ipcluster log.

2015-09-09 14:59:25.126 [IPClusterStart] Removing pid file: /root/.ipython/profile_sge/pid/ipcluster.pid                            
2015-09-09 14:59:25.127 [IPClusterStart] Starting ipcluster with [daemon=True]                                                      
2015-09-09 14:59:25.129 [IPClusterStart] Creating pid file: /root/.ipython/profile_sge/pid/ipcluster.pid                            
2015-09-09 14:59:25.129 [IPClusterStart] Starting Controller with SGE                                                               
2015-09-09 14:59:25.153 [IPClusterStart] Job submitted with job id: u'79'                                                           
2015-09-09 14:59:26.155 [IPClusterStart] Starting 7 Engines with SGE                                                                
2015-09-09 14:59:26.167 [IPClusterStart] Job submitted with job id: u'80'                                                           
2015-09-09 14:59:56.168 [IPClusterStart] Engines appear to have started successfully                                                
2015-09-09 15:01:51.682 [IPClusterStart] ERROR | IPython cluster: stopping                                                          
2015-09-09 15:01:51.682 [IPClusterStart] Stopping Engines...                                                                        
2015-09-09 15:01:54.694 [IPClusterStart] Removing pid file: /root/.ipython/profile_sge/pid/ipcluster.pid

Also, this is the content of stderr for the controller.

2015-09-09 14:59:37.011 [IPControllerApp] Using existing profile dir: u'/root/.ipython/profile_sge'

Finally, this is the stderr content for an engine; the others look basically the same.

2015-09-09 14:59:36.822 [IPEngineApp] Using existing profile dir: u'/root/.ipython/profile_sge'                                     
2015-09-09 14:59:36.826 [IPEngineApp] WARNING | url_file u'/root/.ipython/profile_sge/security/ipcontroller-engine.json' not found  
2015-09-09 14:59:36.826 [IPEngineApp] WARNING | Waiting up to 60.0 seconds for it to arrive.                                        
2015-09-09 15:00:36.896 [IPEngineApp] CRITICAL | Fatal: url file never arrived: /root/.ipython/profile_sge/security/ipcontroller-engine.json

There was no stdout for either of these processes.

@minrk
Member

minrk commented Sep 9, 2015

There might be log files in ~/.ipython/profile_default/log with more info. But from the timestamps it looks like the problem may be the engines starting too early, before the controller is ready for them.

@jakirkham
Contributor Author

When creating these settings, I made a new profile called sge instead of using the default profile. I checked in ~/.ipython/profile_default/log, but didn't see anything. There was another log file (other than the cluster one) in ~/.ipython/profile_sge/log/, which I had missed. It was a controller log file. I think it holds the key. Its contents are below.

2015-09-09 17:48:48.955 [IPControllerApp] Hub listening on tcp://*:33894 for registration.
2015-09-09 17:48:48.956 [IPControllerApp] Hub using DB backend: 'MongoDB'
2015-09-09 17:48:48.957 [IPControllerApp] ERROR | Couldn't construct the Controller
Traceback (most recent call last):
  File "/opt/conda/lib/python2.7/site-packages/IPython/parallel/apps/ipcontrollerapp.py", line 328, in init_hub
    self.factory.init_hub()
  File "/opt/conda/lib/python2.7/site-packages/IPython/parallel/controller/hub.py", line 329, in init_hub
    self.db = import_item(str(db_class))(session=self.session.session,
  File "/opt/conda/lib/python2.7/site-packages/IPython/utils/importstring.py", line 42, in import_item
    module = __import__(package, fromlist=[obj])
  File "/opt/conda/lib/python2.7/site-packages/IPython/parallel/controller/mongodb.py", line 14, in <module>
    from pymongo import Connection
ImportError: cannot import name Connection
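
Since the decisive error only showed up in a log file that was easy to miss, here is a minimal sketch for scanning every file in a profile's log directory for ERROR or CRITICAL lines and traceback headers. It assumes Python 3 and the profile_sge location used in this thread; find_problems is a hypothetical helper written for illustration, not part of IPython.

```python
import os
import re

# Severity markers seen in the logs above, plus the header line of a Python traceback.
PROBLEM = re.compile(r"\b(ERROR|CRITICAL)\b|^Traceback \(most recent call last\):")


def find_problems(log_dir):
    """Return (filename, line_number, text) for each suspicious log line."""
    hits = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if PROBLEM.search(line):
                    hits.append((name, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Assumed location: the "sge" profile from this thread; adjust for other profiles.
    log_dir = os.path.expanduser("~/.ipython/profile_sge/log")
    if os.path.isdir(log_dir):
        for name, number, text in find_problems(log_dir):
            print("%s:%d: %s" % (name, number, text))
```

Run against the controller log above, this would flag the "Couldn't construct the Controller" line and the traceback header that follows it.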

@jakirkham
Contributor Author

This PyMongo API (the Connection class) was deprecated in favor of MongoClient and then removed in the switch to 3.x. More info here ( http://api.mongodb.org/python/2.8.1/api/pymongo/connection.html ).
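
For illustration, one way to stay compatible with both old and new PyMongo is to prefer MongoClient and fall back to Connection only when MongoClient is missing. This is a sketch under that assumption, not necessarily what the submitted fix does; pick_client_class is a hypothetical helper name.

```python
def pick_client_class(pymongo_module):
    """Prefer MongoClient; fall back to the legacy Connection class."""
    client = getattr(pymongo_module, "MongoClient", None)
    if client is None:
        # Only old PyMongo releases lack MongoClient; they still ship Connection.
        client = getattr(pymongo_module, "Connection")
    return client


# Usage, assuming PyMongo is installed:
#     import pymongo
#     Client = pick_client_class(pymongo)
```

Taking the module as an argument keeps the helper importable even where PyMongo is absent, and makes the fallback easy to exercise with a stand-in object.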

@jakirkham
Contributor Author

@minrk, I submitted a fix for this bug in the PR above.

@jakirkham
Contributor Author

I also submitted a fix to ipyparallel.

@minrk
Member

minrk commented Sep 9, 2015

@jakirkham thanks!

@jakirkham
Contributor Author

Thanks for pointing me in the right direction, @minrk. I added a fix for 2.x above. Is it worth having this ported back any further?

@minrk
Member

minrk commented Sep 9, 2015

It's probably not worth backporting to 2.x. There will not be another 2.x release.

@jakirkham
Contributor Author

Ah ok, sorry. It has been removed.

@jakirkham
Contributor Author

@minrk, is there any plan as to when the next 3.x release might be?

@minrk
Member

minrk commented Sep 11, 2015

No plan. Probably soon, since we have a few small fixes we want to push.

@jakirkham
Contributor Author

Thanks for the recent IPython 3.2.2 release, @minrk.

I am going to go ahead and close this, as the fix has been applied and released in IPython 3.x and will soon be released in ipyparallel 4.x.
