Treon stops running with multiple threads #15

Closed
asteppke opened this issue Aug 13, 2019 · 5 comments

@asteppke

When running treon on Windows 10 with multiple threads it sometimes stops running because of issues with the underlying jupyter client.

To some extent this is an issue with the jupyter client and/or nbconvert, but treon triggers it by calling nbconvert from multiple threads.

The error message and discussion of the jupyter client is at this issue: jupyter/jupyter_client#466

For treon, though, a workaround would be to use multiple processes instead of threads. IPython does not seem to be thread-safe at the moment, but this is being worked on (jupyter/nbconvert#936).

@amit1rrr
Member

@asteppke Thanks for reporting. From this comment it looks like the issue is resolved in nbconvert 5.6.0. Shouldn't our problem be solved by upgrading to that version?

Also, what's the exact failure behaviour you are seeing? Is it reproducible? If yes, please share the steps so we can try it out.

@asteppke
Author

asteppke commented Sep 2, 2019

I start the current version of treon on a Windows 10 system with the following recent Jupyter environment:

jupyter core     : 4.5.0
jupyter-notebook : 6.0.0
qtconsole        : 4.5.4
ipython          : 7.8.0
ipykernel        : 5.1.2
jupyter client   : 5.3.1
jupyter lab      : 1.0.2
nbconvert        : 5.5.0
ipywidgets       : 7.5.1
nbformat         : 4.4.0
traitlets        : 4.3.2

I then receive the following error message:

ERROR in testing F:\notebooks\test1.ipynb
Traceback (most recent call last):
  File "c:\users\alexander\anaconda3\lib\site-packages\treon\task.py", line 23, in run_tests
    self.is_successful, console_output = execute_notebook(self.file_path)
  File "c:\users\alexander\anaconda3\lib\site-packages\treon\test_execution.py", line 11, in execute_notebook
    ep.preprocess(notebook, {'metadata': {'path': '.'}})
  File "c:\users\alexander\anaconda3\lib\site-packages\nbconvert\preprocessors\execute.py", line 379, in preprocess
    with self.setup_preprocessor(nb, resources, km=km):
  File "c:\users\alexander\anaconda3\lib\contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "c:\users\alexander\anaconda3\lib\site-packages\nbconvert\preprocessors\execute.py", line 324, in setup_preprocessor
    self.km, self.kc = self.start_new_kernel(cwd=path)
  File "c:\users\alexander\anaconda3\lib\site-packages\nbconvert\preprocessors\execute.py", line 271, in start_new_kernel
    km.start_kernel(extra_arguments=self.extra_arguments, **kwargs)
  File "c:\users\alexander\anaconda3\lib\site-packages\jupyter_client\manager.py", line 236, in start_kernel
    "Currently valid addresses are: %s" % (self.ip, local_ips())
RuntimeError: Can only launch a kernel on a local interface. This one is not: 127.0.0.1.Make sure that the '*_address' attributes are configured properly. Currently valid addresses are: ['192.168.56.1', '192.168.1.108', '0.0.0.0', '']

The directory where treon is run contains several notebooks. Under the same conditions with the --threads=1 option the error message disappears.

This is related to the jupyter client; it looks like a race condition in the module that determines the local IP addresses.

It is reproducible on several computers. I assume the key ingredients are a Windows system, at least one network interface besides localhost, and several notebooks which treon wants to run in parallel.

@amit1rrr
Member

amit1rrr commented Jan 16, 2020

RuntimeError: Can only launch a kernel on a local interface. This one is not: 127.0.0.1.Make sure that the '*_address' attributes are configured properly. Currently valid addresses are: ['192.168.56.1', '192.168.1.108', '0.0.0.0', '']

What are your local_hostnames and allow_remote_access config values? Can you try setting allow_remote_access to False and adding 127.0.0.1 to the local_hostnames list?

Reference: https://jupyter-notebook.readthedocs.io/en/stable/config.html

@asteppke
Author

Generating a configuration, setting allow_remote_access to False, and adding 127.0.0.1 to local_hostnames does not change the outcome. The error message remains the same:

  File "c:\miniconda3\envs\vti\lib\site-packages\jupyter_client\manager.py", line 236, in start_kernel
    "Currently valid addresses are: %s" % (self.ip, local_ips())

RuntimeError: Can only launch a kernel on a local interface. This one is not: 127.0.0.1. 
Make sure that the '*_address' attributes are configured properly. 
Currently valid addresses are: ['192.168.0.110', '192.168.183.225', '0.0.0.0', '']

It seems that the nbconvert mechanism ignores this configuration variable.

The underlying issue seems to be the following call chain: treon uses the nbconvert library, which uses manager.py from jupyter_client, which in turn uses localinterfaces.py. On Windows, localinterfaces.py runs the ipconfig command to fill a singleton-like data structure, and this last step is not thread-safe. As a result, the first treon thread receives the correct output, including the 127.0.0.1 address, while the lookup fails in the second thread and produces the error message above.
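
To make the failure mode concrete, here is a minimal sketch of how a lazily filled, shared address list can be made safe to read from several threads. This is not the jupyter_client code: `LockedAddressCache` and the `lookup` callable are hypothetical names, with `lookup` standing in for whatever expensive query (such as parsing ipconfig output) fills the structure.

```python
import threading


class LockedAddressCache:
    """Hypothetical populate-once cache; not part of jupyter_client."""

    def __init__(self, lookup):
        self._lookup = lookup          # assumed: callable returning a list of addresses
        self._lock = threading.Lock()
        self._addresses = None         # stays None until the list is fully built

    def get(self):
        # Fast path: the list has already been published in full.
        if self._addresses is None:
            with self._lock:
                # Re-check under the lock so only one thread runs the lookup.
                if self._addresses is None:
                    # Build the complete list first, then publish it in one
                    # assignment, so no thread can observe a half-filled list.
                    self._addresses = list(self._lookup())
        return self._addresses
```

The point of the design is that the shared attribute is only ever assigned a finished list, and the lock ensures the lookup runs once even if several threads arrive at the same time.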

So I think a workaround would be to call nbconvert in a separate process, or to get an upstream patch into jupyter_client that makes the local IP lookup thread-safe.
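
The separate-process idea could look roughly like the sketch below. It is only an illustration, not how treon is implemented: `nbconvert_command` and `run_isolated` are hypothetical helpers, and the default command assumes the `jupyter nbconvert --to notebook --execute` CLI is available on PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def nbconvert_command(notebook_path):
    """Build a command that executes one notebook via the nbconvert CLI (assumed on PATH)."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", notebook_path]


def run_isolated(paths, command_for=nbconvert_command, workers=4):
    """Run one child process per path and return {path: succeeded}."""

    def run_one(path):
        # Each child is a fresh interpreter, so module-level state such as a
        # cached address list is never shared between notebook runs.
        proc = subprocess.run(command_for(path), capture_output=True, text=True)
        return path, proc.returncode == 0

    # The threads here only wait on child processes; no notebook code runs in them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_one, paths))
```

Calling `run_isolated(["test1.ipynb", "test2.ipynb"])` would then execute both notebooks in parallel while keeping each kernel start in its own process. The `command_for` parameter is there so a different runner can be swapped in.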

@amit1rrr
Member

amit1rrr commented Oct 8, 2020

This PR in jupyter_client fixes this issue.

@amit1rrr amit1rrr closed this as completed Oct 8, 2020