No handlers could be found for logger "pymesos.process" #5

Closed
tjsongzw opened this issue May 23, 2016 · 9 comments

Comments

@tjsongzw

Hi there,

Based on demo.py (tensorflow/tensorflow#1996), I got:

Tensorflow cluster registered
[WARNING] [tfmesos.scheduler] Task failed:
<Task
  mesos_task_id=1
  addr=None
>
No handlers could be found for logger "pymesos.process"

It seems addr wasn't parsed correctly. Any idea?

@windreamer
Contributor

Hi @tjsongzw,
you can try enabling logging with:

import logging
logging.basicConfig(level=logging.INFO)

Then we will know what happened to the failed tasks.
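
For context: under Python 2, the message `No handlers could be found for logger "pymesos.process"` is printed when a library logger emits a record and no handler has been configured anywhere. A slightly fuller variant of the snippet above might look like the sketch below. The format string and the per-logger DEBUG level are illustrative choices, not something tfmesos or pymesos requires.

```python
import logging

# Attach a handler to the root logger so that records from library
# loggers such as "pymesos.process" are printed instead of dropped.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s:%(name)s:%(message)s",
)

# Optional (an assumption, not required): make one logger more verbose
# than the rest while debugging failed tasks.
logging.getLogger("pymesos.process").setLevel(logging.DEBUG)
```

This needs to run in the submitting script before the scheduler starts, so that the first error is already captured.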

@tjsongzw
Author

@windreamer thanks, here is the output:

ERROR:pymesos.process:error while call <function handle at 0x7ffa118ea938> (tried 0 times)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pymesos/process.py", line 69, in run_jobs
    func(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/pymesos/process.py", line 142, in handle
    f(*args)
  File "/usr/local/lib/python2.7/dist-packages/pymesos/scheduler.py", line 123, in onStatusUpdateMessage
    self.sched.statusUpdate(self, update.status)
  File "/usr/local/lib/python2.7/dist-packages/tfmesos/scheduler.py", line 264, in statusUpdate
    task.connection.close()
AttributeError: 'NoneType' object has no attribute 'close'

and

MESOS_MASTER=zk://host1:2181,host2:2181,host3:2181/mesos
...
python /tmp/demo.py zk://host1:2181,host2:2181,host3:2181/mesos

The settings above in run.sh should be correct, right?

What I did: I pulled the tfmesos Docker image, then executed the script on the Mesos cluster.

@windreamer
Contributor

Oops, there is a small bug; it is fixed in 18acc72.
I also think something else is not working. Does your Mesos cluster support Docker?
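
The traceback above shows `task.connection.close()` being called while `task.connection` is still `None`, which happens when a task fails before its worker ever connects back. The general pattern for guarding against that is sketched below. This is not the code from commit 18acc72; `Task` here is a hypothetical stand-in and `close_connection` is an illustrative helper name.

```python
class Task(object):
    """Hypothetical stand-in for the scheduler's per-task record."""

    def __init__(self, mesos_task_id, connection=None):
        self.mesos_task_id = mesos_task_id
        # Stays None until the worker has connected back to the submitter.
        self.connection = connection


def close_connection(task):
    """Close the task's connection if one exists.

    Returns True if a connection was closed, False if there was none.
    """
    conn = task.connection
    if conn is None:
        # The task failed before connecting; nothing to clean up.
        return False
    conn.close()
    task.connection = None
    return True
```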

@tjsongzw
Author

Yes, most components run as Docker containers, and every node has the tfmesos image, so this should be fine? It's actually a DCOS cluster (Marathon, etc.).

I'll try again now, thank you!

@tjsongzw
Author

Hi, the tasks still fail.
All 4 tasks fail with something like the following, with the same addr problem:

WARNING:tfmesos.scheduler:Task failed:
<Task
  mesos_task_id=2
  addr=None
>, Docker container run error: Container exited on error: exited with status 1

docker logs

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/tfmesos/server.py", line 57, in <module>
    sys.exit(main(sys.argv))
  File "/usr/local/lib/python2.7/dist-packages/tfmesos/server.py", line 24, in main
    c.connect(maddr)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.gaierror: [Errno -2] Name or service not known

@windreamer
Contributor

At present, workers in Docker connect back to the submitter via its hostname.
It seems the machine running tfmesos cannot be reached by the Mesos slaves?
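
The `socket.gaierror: [Errno -2] Name or service not known` in the Docker logs is a name-resolution failure. One way to check for it from a slave (or from inside the container) is a small lookup test like the sketch below. `can_resolve` is an illustrative helper, not part of tfmesos, and the `resolver` parameter exists only so the function can be exercised without real DNS.

```python
import socket


def can_resolve(hostname, port=0, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address.

    A failed lookup raises socket.gaierror, the same error shown in the
    Docker logs above.
    """
    try:
        resolver(hostname, port)
    except socket.gaierror:
        return False
    return True
```

Running `can_resolve("<submitter hostname>")` on each slave should return True before the workers can connect back; the placeholder stands for whatever hostname the submitting machine reports.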

@tjsongzw
Author

Hi, tfmesos works now!
The problem was that I had to configure /etc/hosts on all the Mesos cluster machines; otherwise the hostname could not be resolved.

P.S. Instead of using the machine's hostname, is there another way, such as using an IP or something else that lets the Mesos slaves and master recognize each other?

@windreamer
Contributor

Hmm, not sure. One machine can have multiple IPs, and it is hard to decide which one to use.
For now, I prefer the hostname approach.
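
For the multiple-IP problem, one common heuristic is to ask the operating system which local address it would use to reach a known peer, such as the Mesos master. The sketch below does that with a UDP socket; connecting a UDP socket only selects a route and sends no packets. This is not what tfmesos does, `local_ip_towards` is an illustrative name, and it only handles IPv4.

```python
import socket


def local_ip_towards(remote_host, remote_port):
    """Return the local IPv4 address the OS would route through to
    reach (remote_host, remote_port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # For UDP, connect() only fixes the peer and picks the outgoing
        # interface; nothing is transmitted.
        s.connect((remote_host, remote_port))
        return s.getsockname()[0]
    finally:
        s.close()
```

For example, `local_ip_towards("host1", 5050)` would give the address facing a master at `host1`, assuming the master listens on port 5050. The result can still be wrong behind NAT or inside a container with its own network namespace, which is part of why the hostname approach remains a reasonable default.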

@tjsongzw
Author

Yeah, multiple interfaces make it not that easy.
Thank you!

You can close this issue.
