Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix google oauth desktop #1807

Merged
merged 7 commits into from Dec 8, 2022
Merged

Conversation

RunOrVeith
Copy link
Contributor

Use local loopback to log into OAuth with a desktop client

Description

Fixes #1806 by introducing sign in through local loopback instead of out of band.

Status

Done, ready for review

Checklist (tick everything that applies)

  • Code linting (required, can be done after the PR checks)
  • Documentation
  • Tests
  • ICLA (required for bigger changes)

@RunOrVeith
Copy link
Contributor Author

The checks that are failing seem not to have been caused by my changes

@Kami
Copy link
Member

Kami commented Dec 5, 2022

Good catch, thanks.

def _redirect_uri_with_port(self):
return self.redirect_uri + ":" + str(self.redirect_uri_port)

def _receive_code_through_local_loopback(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if you could add a test case for this method. I know it's a bit more involved to test the whole method, but IIRC we already have existing patterns with mock http server and threads in some places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test case, I had to move one of the mocks in GoogleTestCase out of the parent class and into the child classes that actually require it, because that mocked the behavior I actually want to test here.
The test is more or less end to end, I am not mocking the HTTP server, just the user interaction of signing in is replaced with a fake response from google

@RunOrVeith
Copy link
Contributor Author

I might need some help to make the pipeline pass because I am not sure what is happening here.
The test file that I modified works fine on my machine in less than a second (I can not run all tests locally since I don't have the secrets file).

@Kami
Copy link
Member

Kami commented Dec 8, 2022

Thanks for adding tests. I've fixed test failure related issues in trunk when upgrading various dependencies, etc.

Will fix lint issue and merge it into trunk. Thanks again for your contribution!

asfgit pushed a commit that referenced this pull request Dec 8, 2022
@asfgit asfgit merged commit 693b597 into apache:trunk Dec 8, 2022
@Kami
Copy link
Member

Kami commented Dec 8, 2022

Re tests hanging - I noticed that as well, but it only happens some times. I assume it's some cross test pollution / race condition which only happens sometimes (because it likely depends on how tests are distributed across worker proceesses by pytest-xdist plugin which runs tests in multiple processes).

@Kami
Copy link
Member

Kami commented Dec 8, 2022

I did some digging in and added timeout for hanging tests and looks like the problematic test is the one which was added in this PR (It's also easy to reproduce it by just running libcloud/test/common/test_google.py test file directly):

======================================================================================== FAILURES =========================================================================================
__________________________________________________ GoogleInstalledAppAuthConnectionFirstLoginTest.test_it_aborts_if_state_is_suspicious ___________________________________________________
[gw8] linux -- Python 3.8.0 /home/kami/w/libcloud/libcloud/.tox/py3.8/bin/python

self = <libcloud.test.common.test_google.GoogleInstalledAppAuthConnectionFirstLoginTest testMethod=test_it_aborts_if_state_is_suspicious>

    def test_it_aborts_if_state_is_suspicious(self):
>       received_code = self._do_first_sign_in(
            expected_code="1234ABC", state=self.conn._state + "very suspicious"
        )

libcloud/test/common/test_google.py:292: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
libcloud/test/common/test_google.py:324: in _do_first_sign_in
    fake_sign_in_thread.join()
../../../.pythonz/pythons/CPython-3.8.0/lib/python3.8/threading.py:1011: in join
    self._wait_for_tstate_lock()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <Thread(Thread-1, started 140236942829248)>, block = True, timeout = -1

    def _wait_for_tstate_lock(self, block=True, timeout=-1):
        # Issue #18808: wait for the thread state to be gone.
        # At the end of the thread's life, after all knowledge of the thread
        # is removed from C data structures, C code releases our _tstate_lock.
        # This method passes its arguments to _tstate_lock.acquire().
        # If the lock is acquired, the C code is done, and self._stop() is
        # called.  That sets ._is_stopped to True, and ._tstate_lock to None.
        lock = self._tstate_lock
        if lock is None:  # already determined that the C code is done
            assert self._is_stopped
>       elif lock.acquire(block, timeout):
E       Failed: Timeout >15.0s

../../../.pythonz/pythons/CPython-3.8.0/lib/python3.8/threading.py:1027: Failed
---------------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------------

Please Go to the following URL and sign in:
https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=email%40developer.gserviceaccount.com&redirect_uri=http%3A%2F%2F127.0.0.1%3A8087&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Ffoo&state=Libcloud+Request
---------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Stack of Thread-1 (140236942829248) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/home/kami/.pythonz/pythons/CPython-3.8.0/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/home/kami/.pythonz/pythons/CPython-3.8.0/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/kami/.pythonz/pythons/CPython-3.8.0/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/kami/w/libcloud/libcloud/libcloud/test/common/test_google.py", line 310, in _get_code
    received_code = self.conn.get_code()
  File "/home/kami/w/libcloud/libcloud/libcloud/common/google.py", line 439, in get_code
    code = self._receive_code_through_local_loopback()
  File "/home/kami/w/libcloud/libcloud/libcloud/common/google.py", line 535, in _receive_code_through_local_loopback
    server.handle_request()
  File "/home/kami/.pythonz/pythons/CPython-3.8.0/lib/python3.8/socketserver.py", line 294, in handle_request
    ready = selector.select(timeout)
  File "/home/kami/.pythonz/pythons/CPython-3.8.0/lib/python3.8/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Stack of <unknown> (140237122565824) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/home/kami/w/libcloud/libcloud/.tox/py3.8/lib/python3.8/site-packages/execnet/gateway_base.py", line 285, in _perform_spawn
    reply.run()
  File "/home/kami/w/libcloud/libcloud/.tox/py3.8/lib/python3.8/site-packages/execnet/gateway_base.py", line 220, in run
    self._result = func(*args, **kwargs)
  File "/home/kami/w/libcloud/libcloud/.tox/py3.8/lib/python3.8/site-packages/execnet/gateway_base.py", line 967, in _thread_receiver
    msg = Message.from_io(io)
  File "/home/kami/w/libcloud/libcloud/.tox/py3.8/lib/python3.8/site-packages/execnet/gateway_base.py", line 432, in from_io
    header = io.read(9)  # type 1, channel 4, payload 4
  File "/home/kami/w/libcloud/libcloud/.tox/py3.8/lib/python3.8/site-packages/execnet/gateway_base.py", line 400, in read
    data = self._read(numbytes - len(buf))

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
______________________________________ GoogleInstalledAppAuthConnectionFirstLoginTest.test_it_receives_the_code_that_google_sends_via_local_loopback ______________________________________
[gw8] linux -- Python 3.8.0 /home/kami/w/libcloud/libcloud/.tox/py3.8/bin/python

self = <libcloud.test.common.test_google.GoogleInstalledAppAuthConnectionFirstLoginTest testMethod=test_it_receives_the_code_that_google_sends_via_local_loopback>

    def test_it_receives_the_code_that_google_sends_via_local_loopback(self):
        expected_code = "1234ABC"
        received_code = self._do_first_sign_in(expected_code=expected_code, state=self.conn._state)
>       self.assertEqual(received_code, expected_code)
E       AssertionError: None != '1234ABC'

libcloud/test/common/test_google.py:289: AssertionError

I will dig in more later this week and try to fix it.

@Kami
Copy link
Member

Kami commented Dec 8, 2022

Race condition is indeed related to this new test - we need to ensure "send_code" thread is started before "get_thread" to avoid race condition / dead lock (get code trying to retrieve code before server is running).

I've added a temporary fix / workaround in - 97ba9b5 . Proper fix would not rely on wait / sleep, but on a state change.

@Kami
Copy link
Member

Kami commented Dec 8, 2022

And there is yet another issue related to re-using the same port each time which causes issues when running multiple tests in parallel using multiple processes - https://github.com/apache/libcloud/actions/runs/3652800376/jobs/6171593038.

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/opt/hostedtoolcache/PyPy/3.7.13/x64/lib-python/3/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/hostedtoolcache/PyPy/3.7.13/x64/lib-python/3/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/runner/work/libcloud/libcloud/libcloud/test/common/test_google.py", line 311, in _get_code
    received_code = self.conn.get_code()
  File "/home/runner/work/libcloud/libcloud/libcloud/common/google.py", line 439, in get_code
    code = self._receive_code_through_local_loopback()
  File "/home/runner/work/libcloud/libcloud/libcloud/common/google.py", line 533, in _receive_code_through_local_loopback
    server = HTTPServer(server_address=server_address, RequestHandlerClass=AccessCodeReceiver)
  File "/opt/hostedtoolcache/PyPy/3.7.13/x64/lib-python/3/socketserver.py", line 452, in __init__
    self.server_bind()
  File "/opt/hostedtoolcache/PyPy/3.7.13/x64/lib-python/3/http/server.py", line 137, in server_bind
    socketserver.TCPServer.server_bind(self)
  File "/opt/hostedtoolcache/PyPy/3.7.13/x64/lib-python/3/socketserver.py", line 466, in server_bind
    self.socket.bind(self.server_address)
OSError: [Errno 98] Address already in use

Pushed a quick fix in ea0bf7c, but it may require more work (will need to dig in to see if any other tests exercise that code as well, since those would be also affected).

@RunOrVeith
Copy link
Contributor Author

RunOrVeith commented Dec 9, 2022

Ok, I see the problem. I only ran the tests locally without any parallel execution so I didn't notice this. Thanks for the fixes!
I think the proper fix for the second problem would be to use Event, so basically (untested):

        ...
        sign_in_started = threading.Event()
        received_code = None

        def _get_code():
            nonlocal received_code
            sign_in_started.set()
            received_code = self.conn.get_code()

        def _send_code():
            target_url = self.conn._redirect_uri_with_port
            params = {"state": state, "code": expected_code}
            params = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
            requests.get(url=target_url, params=params)

        fake_sign_in_thread = threading.Thread(target=_get_code)
        fake_google_response = threading.Thread(target=_send_code)

        fake_sign_in_thread.start()
        sign_in_started.wait()
        fake_google_response.start()
        ...

@Kami
Copy link
Member

Kami commented Dec 9, 2022

@RunOrVeith Yeah, I agree that Event (+timeout) is likely indeed the most robust approach.

@Kami
Copy link
Member

Kami commented Dec 9, 2022

I quickly tried to get it working using Event without success (requires more work and digging in), so I will leave it as-is.

Well actually, I did get it to work if I added timeout to wait() but the only reason why that worked is because that introduced the delay in the same way as time.sleep() does and not because of the actual event signaling itself.

@Kami Kami added this to the v3.7.0 milestone Jan 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OAuth2 Desktop Client login is broken for base google driver
3 participants