Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enum and array in preparated statement hammer performance #782

Closed
samuelcolvin opened this issue Jul 14, 2021 · 6 comments
Closed

enum and array in preparated statement hammer performance #782

samuelcolvin opened this issue Jul 14, 2021 · 6 comments

Comments

@samuelcolvin
Copy link
Contributor

  • asyncpg version: asyncpg==0.23.0 (also tested with asyncpg==0.21.0
  • PostgreSQL version: PostgreSQL 13.3 (Ubuntu 13.3-0ubuntu0.21.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0, 64-bit
  • Python version: Python 3.8.10
  • Platform: Ubuntu 21.04
  • Do you use pgbouncer?: no
  • Did you install asyncpg with pip?: yes
  • Can the issue be reproduced under both asyncio and
    uvloop?
    : yes, makes no different

I have no idea what's going on here, there's a very weird (and big!) performance impact sometimes when you use an enum and a list/array together as prepared arguments.

I had this on a production system, but here is a minimal reproduction of the problem (debug is just here to test performance, it makes not difference:

import asyncio
import asyncpg

from devtools import debug


async def main():
    conn = await asyncpg.connect('postgresql://postgres@localhost/test')
    try:
        await conn.execute(
            """
            drop table if exists users;
            drop type if exists user_spam;
            create type user_spam as enum ('foo', 'bar');
            
            create table users(
                id serial primary key,
                spam_enum user_spam default 'foo',
                spam_varchar varchar(5) default 'foo',
                str_list varchar(31)[]
            )
            """
        )

        with debug.timer('spam_enum,variables,two-fields'):
            await conn.execute(
                'insert into users (spam_enum, str_list) values ($1, $2) returning id',
                'bar', ('foo', 'bar')
            )
        with debug.timer('spam_enum,variables,only-spam_enum'):
            await conn.execute('insert into users (spam_enum) values ($1) returning id', 'bar')

        with debug.timer('spam_enum,variables,only-str_list'):
            await conn.execute('insert into users (str_list) values ($1) returning id', ('foo', 'bar'))

        with debug.timer('spam_varchar,variables,two-fields'):
            await conn.execute(
                'insert into users (spam_varchar, str_list) values ($1, $2) returning id',
                'bar', ('foo', 'bar')
            )

        with debug.timer('spam_enum,static,two-fields'):
            await conn.execute(
                "insert into users (spam_enum, str_list) values ('bar', '{foo,bar}'::varchar[]) returning id",
            )

        with debug.timer('spam_varchar,static,two-fields'):
            await conn.execute(
                "insert into users (spam_varchar, str_list) values ('bar', '{foo,bar}'::varchar[]) returning id",
            )

        print('number of users created:', await conn.fetchval('select count(*) from users'))
    finally:
        await conn.close()

if __name__ == '__main__':
    asyncio.run(main())

Output:

spam_enum,variables,two-fields:      0.679s elapsed
spam_enum,variables,only-spam_enum:  0.000s elapsed
spam_enum,variables,only-str_list:   0.000s elapsed
spam_varchar,variables,two-fields:   0.000s elapsed
spam_enum,static,two-fields:         0.000s elapsed
spam_varchar,static,two-fields:      0.000s elapsed
number of users created:             6

For now my work around is to use simple varchar fields instead of enums, but I'd love to know why this is happening.

@samuelcolvin
Copy link
Contributor Author

a bit more investigating and this is not related to insert, you can get it with select too, it also only happens the first time the query is run on a connection.

If I run the following:

        with debug.timer('select'):
            await conn.execute(
                'select $1::user_spam, $2::varchar[]',
                'bar', ('foo', 'bar')
            )
        with debug.timer('select'):
            await conn.execute(
                'select $1::user_spam, $2::varchar[]',
                'bar', ('foo', 'bar')
            )

Only the first query is slow.

@elprans
Copy link
Member

elprans commented Jul 14, 2021

Try disabling jit in Postgres config or in connect(server_settings={'jit': 'off'})

@samuelcolvin
Copy link
Contributor Author

Yes, that solves the problem. Thanks so much.

Is there any concensus on whether to disable the jit in production?

E.g., if I reuse a connection so first queries aren't significant does anyone know if the jit really improves performance?

@elprans
Copy link
Member

elprans commented Jul 14, 2021

jit really improves performance?

It only makes sense for queries that do lots of computation in complex expressions over a large number of rows. Unfortunately, Postgres deems asyncpg's introspection query as "complex" and triggers JIT compilation on it, even though the query isn't that complicated, just large.

@samuelcolvin
Copy link
Contributor Author

thanks.

@elprans
Copy link
Member

elprans commented Jul 14, 2021

This is a frequent source of grief, actually, there's been a bunch of reports like this, so we might want to put a mitigation in place and disable jit locally when running the introspection query. Unfortunately, that'll increase its overhead when jit is off, although not as dramatically as you were seeing here.

elprans added a commit that referenced this issue Sep 23, 2023
The misapplication of JIT to asyncpg introspection queries has been a
constant source of user complaints.

Closes: #530
Closes: #1078
Previously: #875, #794, #782, #741, #727 (and probably more).
elprans added a commit that referenced this issue Oct 7, 2023
The misapplication of JIT to asyncpg introspection queries has been a
constant source of user complaints.

Closes: #530
Closes: #1078
Previously: #875, #794, #782, #741, #727 (and probably more).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants