Unable to connect to skein driver #165
However, the environment I am trying to get dask-yarn/skein working on is a 3 Node EMR v.5.21.0 cluster.
Edit: copied from stackoverflow below:
I am trying to get dask-yarn running. I figured out that the skein module is used for submitting applications to the YARN cluster, and it is failing to do so.
This code fails with:
Starting the driver from CLI with
Stopping the driver or trying to get an empty list of running applications writes
Some time ago dask-yarn was based on knit, so I tried that for application submission. In knit I successfully submitted a hello world app, but only after manually configuring the name node/resource manager in the client; autodetection would not work. Maybe it is a similar problem? The skein driver is written in Java, and I can't find a way to configure it the same way as knit.
Thanks for the issue report. Dask-Yarn and Skein have worked on EMR for me and others, so I'm not sure what's going on here. Could you run the following script and report back with the full output?
import skein
print(skein.__version__)
client = skein.Client(log_level='debug')
# From your above description the error should happen before
# you get here, but just in case lets try an operation
client.get_applications()
Knit tried to find and parse the Hadoop configuration files in Python, which was error prone. Skein just relies on the Hadoop Java libraries for this, obviating the need for configuration. The issue you're running into is that the background Java process that Skein starts is failing to communicate with the Python process that started it.
Failing to create a client, so no operations on that one.
That's odd. We start the Java driver process with a pipe from Python connected to stdin of that process. The Java process blocks on reading from this pipe, and shuts down when it closes. This lets the Java process know when the Python process has exited so that it can shut down as well. For some reason the pipe is being closed unexpectedly, resulting in the Java process exiting early. I've never seen this error before.
When the driver is run as a background process, a different exit method is used. Let's see if that works.
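For readers unfamiliar with this pattern, here is a minimal sketch of a child process whose lifetime is tied to a stdin pipe. It is not Skein's actual code: the child is a small Python one-liner standing in for the Java driver, and the names `CHILD_CODE`, `start_child`, and `close_pipe_and_wait` are made up for illustration.

```python
import subprocess
import sys

# Stand-in for the driver: block on stdin until the parent closes the
# pipe (EOF), then exit cleanly. (Hypothetical; the real driver is Java.)
CHILD_CODE = "import sys; sys.stdin.read(); sys.exit(0)"


def start_child():
    """Start a child process with a pipe from the parent connected to its stdin."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
    )


def close_pipe_and_wait(proc, timeout=10):
    """Close the parent's end of the pipe and return the child's exit code."""
    proc.stdin.close()  # the child's blocking read now sees EOF
    return proc.wait(timeout=timeout)


if __name__ == "__main__":
    child = start_child()
    print("child still running:", child.poll() is None)
    print("exit code after closing the pipe:", close_pipe_and_wait(child))
```

If the parent exits or crashes, the operating system closes its end of the pipe, so the child sees the same EOF without any explicit shutdown message. That is also why a pipe that closes unexpectedly looks to the child like the parent going away.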
import skein
skein.Client.stop_global_driver(force=True)
skein.Client.start_global_driver(log_level='debug', log='driver.log')
client = skein.Client.from_global_driver()
client.get_applications()
If the Java process starts, you'll want to shut it down later:
$ skein driver stop --force
Can you report back here with the results of that script and the
Your code snippet fails in the same way and the
Java on the machine is:
The driver doesn't report itself as shutting down like it did above - is it still running after doing this (check using
import skein
client = skein.Client(address='127.0.0.1:45165')  # assuming same port as in the logs above
referenced this issue
Apr 20, 2019
Hmmm, I'm not sure. It doesn't look like localhost should ever resolve to the IPv6 address, but I'm not a networking expert.
I've pushed #166 to force use of 127.0.0.1 instead of localhost. If you could try installing that version to see if things are fixed that'd be appreciated.
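One way to check how `localhost` resolves on a given machine is to ask the system resolver directly. The sketch below uses only the Python standard library; the helper name `resolve_host` is hypothetical and is not part of Skein. It assumes that an IPv6 result such as `::1` for `localhost` could explain a mismatch with a server listening only on `127.0.0.1`, which this thread has not confirmed.

```python
import socket


def resolve_host(host, port=0):
    """Return sorted, de-duplicated (family, address) pairs for host.

    Returns an empty list if the name cannot be resolved.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    pairs = set()
    for family, _type, _proto, _canonname, sockaddr in infos:
        # family is normally an enum such as AF_INET or AF_INET6
        family_name = getattr(family, "name", str(family))
        pairs.add((family_name, sockaddr[0]))
    return sorted(pairs)


if __name__ == "__main__":
    for name in ("localhost", "127.0.0.1"):
        print(name, "->", resolve_host(name))
```

If the output for `localhost` includes an `AF_INET6` entry, connecting by name may pick the IPv6 address first on some systems, whereas the literal `127.0.0.1` always maps to IPv4.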