Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_ip: handle getting 0.0.0.0 (#2554) #8712

Merged
merged 1 commit into from
Jun 27, 2024

Conversation

AdamWill
Copy link
Contributor

@AdamWill AdamWill commented Jun 20, 2024

In Fedora's package build environment, we do some stuff to ensure no network access is possible (this is to ensure the security, policy compliance and reproducibility of builds, and also just so builds don't fail or go slow or run up big networking bills).

In this environment it seems like get_ip() returns "0.0.0.0", which is obviously not a concrete host, but ensure_concrete_host goes ahead and returns what it gets from get_ip() without checking. This ultimately breaks several tests because a server's host is considered to be "0.0.0.0".

So, let's have get_ip() use the fallback to getaddrinfo if the result of the socket path is "0.0.0.0", but avoid the warning, because pytest seems to consider warnings during test collection as errors and we hit this code during test collection in the affected scenario.

Closes #2554

  • Tests added / passed
  • Passes pre-commit run --all-files

@AdamWill AdamWill requested a review from fjetter as a code owner June 20, 2024 20:49
@AdamWill
Copy link
Contributor Author

I honestly kinda preferred my other thought here:

diff --git a/distributed/utils.py b/distributed/utils.py
index 6fbbf0f2..46c53e20 100644
--- a/distributed/utils.py
+++ b/distributed/utils.py
@@ -176,6 +176,8 @@ def _get_ip(host, port, family):
     try:
         sock.connect((host, port))
         ip = sock.getsockname()[0]
+        if ip == "0.0.0.0":
+            raise OSError("Not a concrete IP, go to fallback")
         return ip
     except OSError as e:
         warnings.warn(

which sends get_ip() down its socket.getaddrinfo()-based fallback path. But the problem with that is that path generates a warning:

    warnings.warn(
        "Couldn't detect a suitable IP address for "
        "reaching %r, defaulting to hostname: %s" % (host, e),
        RuntimeWarning,
    )
    addr_info = socket.getaddrinfo(
        socket.gethostname(), port, family, socket.SOCK_DGRAM, socket.IPPROTO_UDP
    )[0]
    return addr_info[4][0]

and it seems like pytest treats warnings during test collection as errors, so when we run the tests in the build environment with that change, they blow up during collection:

__________________ ERROR collecting comm/tests/test_comms.py ___________________
/usr/lib/python3.13/site-packages/toolz/functoolz.py:457: in memof
    return cache[k]
E   KeyError: ('8.8.8.8', 80)

During handling of the above exception, another exception occurred:
../../BUILDROOT/usr/lib/python3.13/site-packages/distributed/utils.py:180: in _get_ip
    raise OSError("Not a concrete IP, go to fallback")
E   OSError: Not a concrete IP, go to fallback

During handling of the above exception, another exception occurred:
/usr/lib/python3.13/site-packages/_pytest/runner.py:341: in from_call
    result: Optional[TResult] = func()
/usr/lib/python3.13/site-packages/_pytest/runner.py:372: in <lambda>
    call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
/usr/lib/python3.13/site-packages/_pytest/python.py:531: in collect
    self._inject_setup_module_fixture()
/usr/lib/python3.13/site-packages/_pytest/python.py:545: in _inject_setup_module_fixture
    self.obj, ("setUpModule", "setup_module")
/usr/lib/python3.13/site-packages/_pytest/python.py:310: in obj
    self._obj = obj = self._getobj()
/usr/lib/python3.13/site-packages/_pytest/python.py:528: in _getobj
    return self._importtestmodule()
/usr/lib/python3.13/site-packages/_pytest/python.py:617: in _importtestmodule
    mod = import_path(self.path, mode=importmode, root=self.config.rootpath)
/usr/lib/python3.13/site-packages/_pytest/pathlib.py:567: in import_path
    importlib.import_module(module_name)
/usr/lib64/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1387: in _gcd_import
    ???
<frozen importlib._bootstrap>:1360: in _find_and_load
    ???
<frozen importlib._bootstrap>:1331: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:935: in _load_unlocked
    ???
/usr/lib/python3.13/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
    exec(co, module.__dict__)
../../BUILDROOT/usr/lib/python3.13/site-packages/distributed/comm/tests/test_comms.py:43: in <module>
    EXTERNAL_IP4 = get_ip()
../../BUILDROOT/usr/lib/python3.13/site-packages/distributed/utils.py:203: in get_ip
    return _get_ip(host, port, family=socket.AF_INET)
/usr/lib/python3.13/site-packages/toolz/functoolz.py:461: in memof
    cache[k] = result = func(*args, **kwargs)
../../BUILDROOT/usr/lib/python3.13/site-packages/distributed/utils.py:183: in _get_ip
    warnings.warn(
E   RuntimeWarning: Couldn't detect a suitable IP address for reaching '8.8.8.8', defaulting to hostname: Not a concrete IP, go to fallback

and I can't figure out how to tell pytest not to do that :/ So that has problems. OTOH now I'm running the entire test suite with this PR I think it might be causing other failures. sigh.

@AdamWill
Copy link
Contributor Author

argh, yeah, this breaks because test_comms.py has EXTERNAL_IP4 = get_ip(), so it expects 0.0.0.0 in various places we now return 127.0.0.1 - the opposite of the problem we're trying to fix here. Will have to think of something else.

Copy link
Contributor

github-actions bot commented Jun 20, 2024

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    29 files  ±0      29 suites  ±0   10h 55m 18s ⏱️ - 7m 38s
 4 061 tests ±0   3 956 ✅ ±0     97 💤 ±0   8 ❌ ±0 
55 939 runs  ±0  53 766 ✅ +3  2 163 💤 +1  10 ❌  - 4 

For more details on these failures, see this check.

Results for commit 1becd4a. ± Comparison against base commit adcb045.

♻️ This comment has been updated with latest results.

@AdamWill
Copy link
Contributor Author

OK, there's a different version that targets get_ip() but avoids the warning. Note I decided not to touch get_ipv6() because, in my testing, that doesn't actually behave the same (the socket path raises OSError in the IPv6 case, forcing us down the fallback path).

@AdamWill AdamWill changed the title ensure_concrete_host: handle getting 0.0.0.0 or :: (#2554) get_ip: handle getting 0.0.0.0 (#2554) Jun 20, 2024
In Fedora's package build environment, we do some stuff to ensure
no network access is possible (this is to ensure the security,
policy compliance and reproducibility of builds, and also just
so builds don't fail or go slow or run up big networking bills).

In this environment it seems like `get_ip()` returns `"0.0.0.0"`,
which is obviously not a concrete host, but `ensure_concrete_host`
goes ahead and returns what it gets from `get_ip()` without
checking. This ultimately breaks several tests because a server's
host is considered to be `"0.0.0.0"`.

So, let's have `get_ip()` use the fallback to `getaddrinfo` if
the result of the socket path is `"0.0.0.0"`, but avoid the
warning, because pytest seems to consider warnings during test
collection as errors and we hit this code during test collection
in the affected scenario.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
@AdamWill
Copy link
Contributor Author

The test failures all look like flakes to me, not obviously ones that would be caused by this change.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me, I agree that the test failures are unrelated.

Thanks for iterating here @AdamWill!

@jacobtomlinson jacobtomlinson merged commit a27a7b5 into dask:main Jun 27, 2024
26 of 34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remote peers on 0.0.0.0 instead of 127.0.0.1
2 participants