Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async database pool does not return connections #285

Closed
felindev opened this issue Jan 23, 2024 · 38 comments
Closed

Async database pool does not return connections #285

felindev opened this issue Jan 23, 2024 · 38 comments
Labels
bug Something isn't working patch available

Comments

@felindev
Copy link

  1. What versions are you using?
    oracledb 2.0.1

Give your database version.
Oracle Database 19

Also run Python and show the output [...]
platform.platform: Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.36
sys.maxsize > 2**32: True
platform.python_version: 3.12.1
oracledb.version: 2.0.1

  1. Is it an error or a hang or a crash?
    Hang - oracledb.create_pool_async() does not return connection.

  2. What error(s) or behavior you are seeing?
    There's no error, application does not return connection and does not continue in that function.
    I'm using fastapi - that connection does not respond with anything, while app overall remains functional.

  3. Does your application call init_oracle_client()?
    No.

  4. Include a runnable Python script that shows the problem.

from os import getenv
from typing import Any

import fastapi
import oracledb

dbpool = oracledb.create_pool_async(
    user=getenv("db_user"),
    password=getenv("db_password"),
    port=getenv("db_port", "1521"),
    service_name=getenv("db_service"),
    host=getenv("db_host"),
    min=0,
    max=3,
    increment=1,
)

app = fastapi.FastAPI()
@app.get("/test")
async def testpoint() -> Any:
    print("Starting test")
    async with dbpool.acquire() as con:
        print("DB aquired!")
        with con.cursor() as cursor:
            print("Got cursor!")
            await cursor.exeucte("SELECT SYSDATE FROM DUAL")
            d = await cursor.fetchone()
            return d


@app.get("/test2")
async def testpoint2() -> Any:
    print("Starting test")
    async with oracledb.connect_async(
        user=getenv("db_user"),
        password=getenv("db_password"),
        port=getenv("db_port", "1521"),
        service_name=getenv("db_service"),
        host=getenv("db_host"),
    ) as con:
        print("DB aquired!")
        with con.cursor() as cursor:
            print("Got cursor!")
            await cursor.execute("SELECT SYSDATE FROM DUAL")
            d = await cursor.fetchone()
            return d

/test endpoint hangs while /test2 returns successfully.

My environment is docker container generated from following test dockerfile:

FROM python:3.12

WORKDIR /app/

COPY api/requirements.txt /app/requirements.txt
RUN python3 -m venv .venv && \
    /app/.venv/bin/pip3 install -r requirements.txt

COPY api/ /app/api/

EXPOSE 8000
CMD ["/app/.venv/bin/uvicorn", "bslapi:app", "--host", "0.0.0.0", "--port", "8000", "--proxy-headers"]

and requirements.txt:

annotated-types==0.6.0
anyio==4.2.0
cffi==1.16.0
click==8.1.7
colorama==0.4.6
cryptography==42.0.0
fastapi==0.109.0
h11==0.14.0
idna==3.6
oracledb==2.0.1
pycparser==2.21
pydantic==2.5.3
pydantic_core==2.14.6
sniffio==1.3.0
starlette==0.35.1
typing_extensions==4.9.0
uvicorn==0.27.0
@felindev felindev added the bug Something isn't working label Jan 23, 2024
@anthony-tuininga
Copy link
Member

Looks to me like this is a bug in your code! When I used your code I got the following:

    await cursor.exeucte("SELECT SYSDATE FROM DUAL")
          ^^^^^^^^^^^^^^
AttributeError: 'AsyncCursor' object has no attribute 'exeucte'. Did you mean: 'execute'?

When I replaced exeucte() with execute() it worked just fine. Can you confirm?

@citreae535
Copy link

I‘ve encountered the same problem when using async connection pools. Strangely, the code works fine with our test database (12c) and dev database (18). It only hangs with our production database (12c).

import asyncio

import oracledb


async def test():
    pool = oracledb.create_pool_async(
        user="user",
        password="password",
        service_name="service_name",
        host="host",
    )
    async with pool.acquire() as connection:
        print("DB aquired!")
        with connection.cursor() as cursor:
            print("Got cursor!")
            await cursor.execute("SELECT SYSDATE FROM DUAL")
            d = await cursor.fetchone()
            print(d)


async def main():
    await test()


asyncio.run(main())

If I press Ctrl + C, the exception output is

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 684, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/app/test_async_pool/__main__.py", line 23, in main
    await test()
  File "/app/test_async_pool/__main__.py", line 13, in test
    async with pool.acquire() as connection:
  File "/app/.venv/lib/python3.12/site-packages/oracledb/connection.py", line 1398, in __aenter__
    await self._connect_coroutine
  File "/app/.venv/lib/python3.12/site-packages/oracledb/connection.py", line 1440, in _connect
    impl = await pool_impl.acquire(params_impl)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/oracledb/impl/thin/pool.pyx", line 875, in acquire
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
  File "src/oracledb/impl/thin/pool.pyx", line 689, in _acquire_helper
  File "src/oracledb/impl/thin/pool.pyx", line 704, in oracledb.thin_impl.AsyncThinPoolImpl._acquire_helper
  File "/usr/local/lib/python3.12/asyncio/locks.py", line 293, in wait_for
    await self.wait()
  File "/usr/local/lib/python3.12/asyncio/locks.py", line 266, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/app/test_async_pool/__main__.py", line 26, in <module>
    asyncio.run(main())
  File "/usr/local/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/runners.py", line 123, in run
    raise KeyboardInterrupt()
KeyboardInterrupt

@anthony-tuininga
Copy link
Member

Interesting. Can you set the environment variable PYO_DEBUG_PACKETS to any value before running your script? Are you using version 2.0.1?

@citreae535
Copy link

Yes, I'm using version 2.0.1. Running docker run -e PYO_DEBUG_PACKETS=true test-async-pool prints an infinite loop of sending and receiving packets.

@citreae535
Copy link

It seems that the code can log on to the server, but disconnects immediately after a session is established.

@anthony-tuininga
Copy link
Member

Can you try without the pool -- just a standalone connection? Do you get an error? Can you share some packets (send me an e-mail with the packet logs -- just the first 15-20 should be fine).

@citreae535
Copy link

A standalone connection works fine. It's just the pool that hangs. I've captured some packets in docker logs. Where should I send them?

@anthony-tuininga
Copy link
Member

You can e-mail them to me. See the PyPI record for the e-mail address: https://pypi.org/project/oracledb/

@felindev
Copy link
Author

felindev commented Feb 5, 2024

Looks to me like this is a bug in your code! When I used your code I got the following:
When I replaced exeucte() with execute() it worked just fine. Can you confirm?

Oops. It seems that I did have a typo, but sadly issue wasn't that, connection pool still fails to establish, with exact same results as @citreae535 got.

Additionally, running test code provided by them raises exception on my windows host:

Exception in callback _ProactorReadPipeTransport._loop_reading(<_OverlappedF...hed result=10>)
handle: <Handle _ProactorReadPipeTransport._loop_reading(<_OverlappedF...hed result=10>)>
Traceback (most recent call last):
  File "C:\Program Files\Python311\Lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Program Files\Python311\Lib\asyncio\proactor_events.py", line 324, in _loop_reading      
    self._data_received(data, length)
  File "C:\Program Files\Python311\Lib\asyncio\proactor_events.py", line 274, in _data_received     
    self._protocol.data_received(data)
  File "src\\oracledb\\impl/thin/protocol.pyx", line 895, in oracledb.thin_impl.BaseAsyncProtocol.data_received
TypeError: Expected bytes, got bytearray
Exception in callback _ProactorReadPipeTransport._loop_reading(<_OverlappedF...ed result=372>)      
handle: <Handle _ProactorReadPipeTransport._loop_reading(<_OverlappedF...ed result=372>)>
Traceback (most recent call last):
  File "C:\Program Files\Python311\Lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Program Files\Python311\Lib\asyncio\proactor_events.py", line 324, in _loop_reading      
    self._data_received(data, length)
  File "C:\Program Files\Python311\Lib\asyncio\proactor_events.py", line 274, in _data_received     
    self._protocol.data_received(data)
  File "src\\oracledb\\impl/thin/protocol.pyx", line 895, in oracledb.thin_impl.BaseAsyncProtocol.data_received
TypeError: Expected bytes, got bytearray`
(with the same results under WSL/Docker - program loops)

@citreae535
Copy link

You can e-mail them to me. See the PyPI record for the e-mail address: https://pypi.org/project/oracledb/

I sent the packet log and relevant code just now. Thanks for helping us troubleshoot this bug!

@anthony-tuininga
Copy link
Member

@felindev, the error you got is a bug in the Python asyncio implementation on Windows. This was fixed in either Python 3.11 or Python 3.12.

@citreae535 , the packet output looks very much like a bug that was squashed earlier (#275). Can you verify that you are indeed using python-oracledb 2.0.1?

@citreae535
Copy link

Just checked again: the version of oracledb in poetry.lock is indeed 2.0.1.

@anthony-tuininga
Copy link
Member

Can you check by actually printing out the value of oracledb.__version__? Stray installations have been known to happen! If that doesn't solve the problem then I can suggest some changes to the code to help debug the issue. Are you able to build from source?

@citreae535
Copy link

Yes. I can build from source. Just pulled the master branch and added print(oracledb.__version__). The async pool version still hangs.

2.1.0b1
DB aquired!
Got cursor!
(datetime.datetime(2024, 2, 18, 8, 53, 9),)
^CTraceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 685, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/app/test_async_pool/__main__.py", line 40, in main
    await test_pool()
  File "/app/test_async_pool/__main__.py", line 28, in test_pool
    async with pool.acquire() as connection:
  File "/app/.venv/lib/python3.12/site-packages/oracledb/connection.py", line 1413, in __aenter__
    await self._connect_coroutine
  File "/app/.venv/lib/python3.12/site-packages/oracledb/connection.py", line 1455, in _connect
    impl = await pool_impl.acquire(params_impl)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/oracledb/impl/thin/pool.pyx", line 875, in acquire
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
  File "src/oracledb/impl/thin/pool.pyx", line 689, in _acquire_helper
  File "src/oracledb/impl/thin/pool.pyx", line 704, in oracledb.thin_impl.AsyncThinPoolImpl._acquire_helper
  File "/usr/local/lib/python3.12/asyncio/locks.py", line 293, in wait_for
    await self.wait()
  File "/usr/local/lib/python3.12/asyncio/locks.py", line 266, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/app/test_async_pool/__main__.py", line 43, in <module>
    asyncio.run(main())
  File "/usr/local/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/runners.py", line 123, in run
    raise KeyboardInterrupt()
KeyboardInterrupt

@anthony-tuininga
Copy link
Member

I'm assuming that you are doing something different than the code posted earlier. Are you running a standalone connection version which performs the query as expected followed by the pool version which hangs?

Can you add some debugging lines in this function and this function to find out whether or not a connection is being established and the waiting coroutine notified?

@citreae535
Copy link

Yes. A standalone connection is used first and it works fine.

Debug output:

2.1.0b1
DB aquired!
Got cursor!
(datetime.datetime(2024, 2, 18, 10, 16, 27),)
bg task will create connections:  1
bg task to create connections:  1
helper is creating a connection
helper added a connection to the pool
bg task will create connections:  1
bg task to create connections:  1
helper is creating a connection
helper added a connection to the pool
<... infinite loop ...>
bg task will create connections:  1
bg task to create connections:  1
helper is creating a connection
^CTraceback (most recent call last):
<... same exception ...>

Here's the code:

    async def _bg_task_func(self):
        cdef:
            uint32_t open_count, num_to_create
            list conn_impls_to_drop
            bint wait

        # create connections and close connections as needed
        while True:
            conn_impls_to_drop = []
            wait = True

            # determine if there is any work to do
            async with self._condition:
                open_count = self.get_open_count()
                if self._open and self._num_to_create == 0:
                    if open_count < self.min:
                        self._num_to_create = self.min - open_count
                    elif self._num_waiters > 0:
                        self._num_to_create = min(self.increment,
                                              self.max - open_count)
                if not self._open or self._bg_exc is None:
                    conn_impls_to_drop = self._conn_impls_to_drop
                    self._conn_impls_to_drop = []

            # create connections, if needed
            print("bg task will create connections: ", self._num_to_create, flush=True)
            if self._num_to_create > 0:
                wait = False
                num_to_create = self._num_to_create
                print("bg task to create connections: ", self._num_to_create, flush=True)
                await self._create_conn_impls_helper()
                if num_to_create > 1:
                    self._check_timeout()

            # close connections, if needed
            if conn_impls_to_drop:
                wait = False
                self._drop_conn_impls_helper(conn_impls_to_drop)

            # if pool has closed and no connections to drop, stop thread!
            if not self._open and not self._conn_impls_to_drop:
                break

            # otherwise, if nothing needed to be done, wait for notification
            if wait:
                print("bg task is waiting", flush=True)
                async with self._bg_task_condition:
                    await self._bg_task_condition.wait()
                print("bg task received notification", flush=True)

    async def _create_conn_impls_helper(self):
        cdef:
            AsyncThinConnImpl conn_impl
            object exc
        while self._num_to_create > 0:
            conn_impl = None
            try:
                print("helper is creating a connection", flush=True)
                conn_impl = await self._create_conn_impl()
            except Exception as e:
                print("an exception occurred when creating a connection:", e, flush=True)
                exc = e
            async with self._condition:
                if conn_impl is not None:
                    if self._open:
                        self._free_new_conn_impls.append(conn_impl)
                        self._num_to_create -= 1
                        print("helper added a connection to the pool", flush=True)
                    else:
                        conn_impl._force_close()
                        break
                else:
                    self._bg_exc = exc
                self._condition.notify()
                if conn_impl is None:
                    break

@anthony-tuininga
Copy link
Member

Can you instrument this function as well? In particular whether line 102 is being called? That was the problem in the previous issue -- which suggests there is another cases that causes it to occur!

@anthony-tuininga
Copy link
Member

Can you try this patch:

diff --git a/src/oracledb/impl/thin/protocol.pyx b/src/oracledb/impl/thin/protocol.pyx
index fc5bb7b0..0486aef6 100644
--- a/src/oracledb/impl/thin/protocol.pyx
+++ b/src/oracledb/impl/thin/protocol.pyx
@@ -651,6 +651,7 @@ cdef class BaseAsyncProtocol(BaseProtocol):
         # mark protocol to indicate that connect is no longer in progress; this
         # allows the normal break/reset mechanism to fire
         conn_impl.warning = auth_message.warning
+        self._read_buf._session_needs_to_be_closed = False
         self._in_connect = False
 
     async def _connect_tcp(self, ConnectParamsImpl params,

It looks like only the synchronous version was corrected, not the asynchronous version!

@citreae535
Copy link

Can you instrument this function as well? In particular whether line 102 is being called? That was the problem in the previous issue -- which suggests there is another cases that causes it to occur!

Yeah, line 102 is being called in every loop. I'll try the patch now.

@anthony-tuininga
Copy link
Member

I have a better patch coming up -- avoids duplicate code. Give me a few more minutes!

@anthony-tuininga
Copy link
Member

anthony-tuininga commented Feb 18, 2024

Try this one:

diff --git a/src/oracledb/impl/thin/protocol.pyx b/src/oracledb/impl/thin/protocol.pyx
index fc5bb7b0..15e4dcf1 100644
--- a/src/oracledb/impl/thin/protocol.pyx
+++ b/src/oracledb/impl/thin/protocol.pyx
@@ -89,6 +89,20 @@ cdef class BaseProtocol:
             self._write_buf._transport = None
             transport.disconnect()
 
+    cdef int _post_connect(self, BaseThinConnImpl conn_impl,
+                           AuthMessage auth_message) except -1:
+        """"
+        Performs activities after the connection has completed. The protocol
+        must be marked to indicate that the connect is no longer in progress,
+        which allows the normal break/reset mechanism to fire. The session must
+        also be marked as not needing to be closed since for listener redirects
+        the packet may indicate EOF for the initial connection that is
+        established.
+        """
+        conn_impl.warning = auth_message.warning
+        self._read_buf._session_needs_to_be_closed = False
+        self._in_connect = False
+
     cdef int _release_drcp_session(self, BaseThinConnImpl conn_impl,
                                    uint32_t release_mode) except -1:
         """
@@ -297,14 +311,8 @@ cdef class Protocol(BaseProtocol):
         if auth_message.resend:
             self._process_message(auth_message)
 
-        # mark protocol to indicate that connect is no longer in progress; this
-        # allows the normal break/reset mechanism to fire; also mark the
-        # session as not needing to be closed since for listener redirects
-        # the packet may indicate EOF for the initial connection that is
-        # established
-        conn_impl.warning = auth_message.warning
-        self._read_buf._session_needs_to_be_closed = False
-        self._in_connect = False
+        # perform post connect activities
+        self._post_connect(conn_impl, auth_message)
 
     cdef int _connect_tcp(self, ConnectParamsImpl params,
                           Description description, Address address, str host,
@@ -648,10 +656,9 @@ cdef class BaseAsyncProtocol(BaseProtocol):
         if auth_message.resend:
             await self._process_message(auth_message)
 
-        # mark protocol to indicate that connect is no longer in progress; this
-        # allows the normal break/reset mechanism to fire
-        conn_impl.warning = auth_message.warning
-        self._in_connect = False
+        # perform post connect activities
+        self._post_connect(conn_impl, auth_message)
+
 
     async def _connect_tcp(self, ConnectParamsImpl params,
                            Description description, Address address, str host,

@anthony-tuininga
Copy link
Member

I am reasonably confident so I pushed the patch to make it easier for you. Let me know if that works for you!

@citreae535
Copy link

Yeah, it works now. Thanks a lot!

@anthony-tuininga
Copy link
Member

Good to hear! Thanks for your help in diagnosing the issue!

@WolfEYc
Copy link

WolfEYc commented Feb 21, 2024

I have the same issue, how do i get this patch?

i can connect_async and form an AsyncConnection and get data back,
but an AsyncConnectionPool.aquire() hangs indefinitely.

@felindev
Copy link
Author

felindev commented Feb 22, 2024

Issue has been marked as "patch available", it should be solved in next release. @WolfEYc

@WolfEYc
Copy link

WolfEYc commented Feb 22, 2024

Cool thanks! Looking forward to using some async pooling!

@anthony-tuininga
Copy link
Member

@WolfEYc, if you can build from source you can make use of this patch before the release, too.

@WolfEYc
Copy link

WolfEYc commented Feb 25, 2024

I suppose i could but i dont mind waiting.

Rn i just make an async connection per sql query (its fine for now).

Easy switch once it is out since I abstract oracle interaction quite nicely in one place using dataframes.

So ill only have to change it in one spot.

@anthony-tuininga
Copy link
Member

This patch was included in version 2.1.0 which was just released.

@WolfEYc
Copy link

WolfEYc commented Mar 11, 2024

@anthony-tuininga can you reopen this as the same bug still occurs (infinite hang) in 2.1.0

@WolfEYc
Copy link

WolfEYc commented Mar 11, 2024

its when you try to async with pool.aquire()

@anthony-tuininga
Copy link
Member

Are you sure you're using the new version? If so, can you e-mail me the packet trace so I can take a look at it?

@WolfEYc
Copy link

WolfEYc commented Mar 12, 2024

I think its working now (ill post any updates here if it stops for any reason)

@WolfEYc
Copy link

WolfEYc commented Mar 12, 2024

its a real pain that create_pool_async is not marked as an async method, but will fail if not run inside an async method / with an event loop running.

causes issues in testing.

@anthony-tuininga
Copy link
Member

How does it cause issues in testing? It is not an asynchronous method so marking it as such would also be confusing, I think!

@WolfEYc
Copy link

WolfEYc commented Mar 12, 2024

if it is not called when there is not a global event loop then it fails (ie if it is called in an async function it works)

so when trying to do tests it requires many workarounds to try to get it working (still havent gotten it working yet with pytest)

but in normal use it is fine

for context im using FastAPI

@anthony-tuininga
Copy link
Member

Yes, it will not work outside of asyncio but it still is a synchronous function. I'm not sure what difficulties you are having with pytest. I know my own tests (using unittest) had to use a different base class for testing async workloads. Is it related to that? Maybe start a new discussion instead of hijacking this closed issue! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working patch available
Projects
None yet
Development

No branches or pull requests

4 participants