Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Awaiting multiple execute within a transaction seems to block forever #68

Closed
seeker89 opened this issue Jan 20, 2017 · 22 comments
Closed

Comments

@seeker89
Copy link

seeker89 commented Jan 20, 2017

I'm trying to execute a couple of sql queries with arguments, in a context of a transaction.

I used this code:

 async def execute_many_in_transaction(self, queries_and_args):
    pool = await get_pool() # returns a pool of clients
    async with pool.acquire() as connection:
        async with connection.transaction():
            for query, args in queries_and_args:
                await connection.execute(
                    query, *args
                )

When I call it with a list with one tuple [("SQL STATEMENT", [arg1, arg2])] all is good.

The moment I try to execute more than one, it just hangs there forever. For example:

[
    ("SQL STATEMENT 1", [arg1, arg2]),
    ("SQL STATEMENT 2", [arg1, arg3]),
]

I know that there is execute_many method, but that only seems to work on the same sql statement with multiple lists of arguments.

Is there something I'm misunderstanding ?

Thanks !

@elprans
Copy link
Member

elprans commented Jan 20, 2017

Are you by any chance using pgbouncer?

@seeker89
Copy link
Author

seeker89 commented Jan 20, 2017

@elprans No, I'm not.

I'm creating the pool with this:

import asyncpg

async def get_pool(all_the_args)
    return await asyncpg.create_pool(**all_the_args)

Just to make sure - that is supposed to work just fine, right ?

@1st1
Copy link
Member

1st1 commented Jan 20, 2017

Can you give us a complete snippet of code to reproduce?

@elprans
Copy link
Member

elprans commented Jan 20, 2017

@seeker89 your code should work, I tested on a simple query and could not reproduce the issue. Which version of asyncpg, Python and PostgreSQL are you running?

@seeker89
Copy link
Author

That's what I'm running:

import asyncio
import asyncpg
import os

async def get_pool(host, port, database, user, password):
    return await asyncpg.create_pool(host=host,
                                          port=port,
                                          database=database,
                                          user=user,
                                          password=password)

async def execute_many_in_transaction(details, queries_and_args):
    pool = await get_pool(**details) # returns a pool of clients
    async with pool.acquire() as connection:
        async with connection.transaction():
            for query, args in queries_and_args:
                await connection.execute(
                    query, *args
                )

details = dict(
    host=os.environ.get("HOST"),
    port=os.environ.get("PORT"),
    database=os.environ.get("DB"),
    user=os.environ.get("USER"),
    password=os.environ.get("PASS"),
)

loop = asyncio.get_event_loop()
tasks = [
    execute_many_in_transaction(details, [
        ("SELECT * FROM something LIMIT 1;", []),
    ]),
    execute_many_in_transaction(details, [
        ("SELECT * FROM something LIMIT 1;", []),
        ("SELECT * FROM something LIMIT 1;", [])
    ]),
]
loop.run_until_complete(asyncio.wait(tasks))
asyncpg (0.8.4)
Python 3.5.1
Postgres 9.4

If that's supposed to work fine, the problem must be somewhere else in my setup.

I'll look into it. Many thanks @elprans and @1st1 for super fast response times. Kudos 👍 !

@1st1
Copy link
Member

1st1 commented Jan 20, 2017

Can you try this with uvloop?

@seeker89
Copy link
Author

Will do.

@skamboj
Copy link

skamboj commented Jan 20, 2017

Using uvloop did not help.

We tracked it down to using custom types, for example, we have the following:

CREATE TYPE my_type AS ENUM ('abc', 'def', 'ghi');
CREATE TABLE my_table (
  timestamp timestamptz,
  col1 varchar(256),
  col2 my_type,
  PRIMARY KEY (col1)
);
INSERT  INTO my_table (col1, col2) VALUES ('foo','abc');
INSERT  INTO my_table (col1, col2) VALUES ('bar','def'); 

And in the test script below, the first query works while the second one fails:

import asyncio
import asyncpg
import os

async def get_pool(host, port, database, user, password):
    return await asyncpg.create_pool(host=host,
                                     port=port,
                                     database=database,
                                     user=user,
                                     password=password)

async def run_query(details, query):
    pool = await get_pool(**details) # returns a pool of clients                                                                                                                                                                                              
    async with pool.acquire() as connection:
        async with connection.transaction():
            results = await connection.fetch(query)
            print(query, results)
            return results

details = dict(
    host=os.environ.get("HOST"),
    port=os.environ.get("PORT"),
    database=os.environ.get("DB"),
    user=os.environ.get("USER"),
    password=os.environ.get("PASS"),
)

loop = asyncio.get_event_loop()
tasks = [
    run_query(details, "SELECT col1 FROM my_table"), # Works                                                                                                                                                                                                  
    run_query(details, "SELECT col1, col2 FROM my_table"), # Fails; hangs forever                                                                                                                                                                                            
]
loop.run_until_complete(asyncio.wait(tasks))

@elprans
Copy link
Member

elprans commented Jan 20, 2017

Is the above literally what fails for you? I still can't reproduce:

$ HOST=localhost DB=postgres USER=postgres python3 1.py
SELECT col1 FROM my_table [<Record col1='foo'>, <Record col1='bar'>]
SELECT col1, col2 FROM my_table [<Record col1='foo' col2='abc'>, <Record col1='bar' col2='def'>]

@elprans
Copy link
Member

elprans commented Jan 20, 2017

I added a test for ENUM types, the buildbots also seem to be happy.

@seeker89
Copy link
Author

I made a PR to illustrate what exactly hangs forever in our setup: #69

Can you confirm that test passes fine for you ?

@seeker89
Copy link
Author

Ok, so your CI setup proved me wrong via #69, so that probably means, that we have a problem elsewhere in our setup. Will continue debugging on Monday, and I'll report back.

@seeker89
Copy link
Author

What's the best way to see (print, debug) what queries are actually being executed on the postgres instance ?

Searching for logging only brings one result in the repo, and that makes it difficult.

@skamboj
Copy link

skamboj commented Jan 23, 2017

Is the above literally what fails for you? I still can't reproduce:

That is literally what's failing for us:

$ cat test.py 
import logging
import asyncio
import asyncpg
import os

async def get_pool(host, port, database, user, password):
    return await asyncpg.create_pool(host=host,
                                     port=port,
                                     database=database,
                                     user=user,
                                     password=password)

async def run_query(details, query):
    pool = await get_pool(**details) # returns a pool of clients
    async with pool.acquire() as connection:
        async with connection.transaction():
            results = await connection.fetch(query)
            print(query, results)
            return results

details = dict(
    host=os.environ.get("HOST"),
    port=os.environ.get("PORT"),
    database=os.environ.get("DB"),
    user=os.environ.get("USER"),
    password=os.environ.get("PASS"),
)

# logging.basicConfig(level=logging.DEBUG)

loop = asyncio.get_event_loop()
loop.set_debug(True)
tasks = [
    run_query(details, "SELECT col1 FROM my_table"), # Works
    run_query(details, "SELECT col1, col2 FROM my_table"), # Fails
]
loop.run_until_complete(asyncio.wait(tasks))
$ python test.py 
SELECT col1 FROM my_table [<Record col1='bar'>, <Record col1='foo'>]
^CTraceback (most recent call last):
  File "test.py", line 37, in <module>
    loop.run_until_complete(asyncio.wait(tasks))
  File "/opt/rh/rh-python35/root/usr/lib64/python3.5/asyncio/base_events.py", line 325, in run_until_complete
    self.run_forever()
  File "/opt/rh/rh-python35/root/usr/lib64/python3.5/asyncio/base_events.py", line 295, in run_forever
    self._run_once()
  File "/opt/rh/rh-python35/root/usr/lib64/python3.5/asyncio/base_events.py", line 1199, in _run_once
    event_list = self._selector.select(timeout)
  File "/opt/rh/rh-python35/root/usr/lib64/python3.5/selectors.py", line 432, in select
    fd_event_list = self._epoll.poll(timeout, max_ev)
KeyboardInterrupt

In case it helps, here's our python version and postgres version.

$ python --version
Python 3.5.1

sqld=> select version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
(1 row)

sqld=> 

@elprans
Copy link
Member

elprans commented Jan 23, 2017

What's the best way to see (print, debug) what queries are actually being executed on the postgres instance?

@seeker89 You want SELECT * FROM pg_stat_activity

@elprans
Copy link
Member

elprans commented Jan 23, 2017

@skamboj Are you on RHEL6?

@seeker89
Copy link
Author

@skamboj Are you on RHEL6?

We're on RHEL7.

@elprans
Copy link
Member

elprans commented Jan 23, 2017

We would need help recreating your environment. The easiest would probably be a Dockerfile or a VM image.

@seeker89
Copy link
Author

Amazingly, it turns out that the query wasn't hanging forever... It was hanging for a couple of minutes.

When we dug down to see what postgres was actually executing, it turned out that it transformed any simple query taking into account a custom enum into a super complex monster, despite the fact, that the exact query ent through psql was straightforwards. Will continue to investigate.

@elprans
Copy link
Member

elprans commented Feb 2, 2017

@seeker89 @skamboj Any updates on this? Safe to close? Thanks.

@seeker89
Copy link
Author

seeker89 commented Feb 2, 2017

Yup, it turned out to be a problem somewhere else in the system. Thanks for your time !

@seeker89 seeker89 closed this as completed Feb 2, 2017
@MichaelDBA
Copy link

Be nice if folks elaborate a bit more on what the "other" problem was that caused this issue in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants