Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lazy evaluation of server_version_info #371

Merged
merged 1 commit into from
May 15, 2023

Conversation

mdesmet
Copy link
Contributor

@mdesmet mdesmet commented May 7, 2023

Description

Non-technical explanation

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label May 7, 2023
@mdesmet mdesmet requested a review from aalbu May 7, 2023 20:12
Copy link
Member

@aalbu aalbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's cool. Have you checked whether the getter is ever called in the process of creating a connection/running some SQL/closing the connection? It seemed to me like the property is never accessed, but I'm not positive.

@hashhar
Copy link
Member

hashhar commented May 8, 2023

This is something that the SQLA engine interface provides so user scripts/programs can make use of it. Core itself doesn't use it other than for populating a server_version_info attribute.

Some dialects also use it internally for version dependent logic, an example is Oracle - https://github.com/zzzeek/sqlalchemy/blob/671647c8574bd6ff4c39fe503a2b1958a802772d/lib/sqlalchemy/dialects/oracle/base.py#L1487

@hashhar
Copy link
Member

hashhar commented May 8, 2023

How much does this really help, it's only called once per connection. So only if someone is using NullPool or creating new engine everytime.

@mdesmet
Copy link
Contributor Author

mdesmet commented May 8, 2023

Well, it is called once per connection and not used in 99.999% of the cases, so making it lazily evaluated makes definitely a difference as it removes these series of requests from EVERY sqlalchemy usage.

@hashhar
Copy link
Member

hashhar commented May 10, 2023

tests are failing

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comment.

tests/integration/test_sqlalchemy_integration.py Outdated Show resolved Hide resolved
trino/sqlalchemy/dialect.py Outdated Show resolved Hide resolved
@hashhar
Copy link
Member

hashhar commented May 10, 2023

Well, it is called once per connection and not used in 99.999% of the cases, so making it lazily evaluated makes definitely a difference as it removes these series of requests from EVERY sqlalchemy usage.

I was thinking in terms of number of requests.
This function isn't executed per-query, so it only reduces 1 call for entire application/connection (which is generally much lower than number of queries).

i.e. for someone who uses only a few connections there is no benefit of this lazy evaluation.

return None

# We make the server_version_info be evaluated lazily
cls.server_version_info = property(get_server_version_info, lambda instance, value: None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the decorator @property?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because server_version_info is not in our codebase but in sqlalchemy's, so we need to resort to this type of instantiation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean why use the property() function instead of a decorator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at sqlalchemy's code. Simply adding the property wouldn't work as it would be overwritten.

Copy link
Member

@nineinchnick nineinchnick May 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I get it now. But it seems we're doing a hack here where we override the server info property with one that has a getter for lazy evaluation, but a noop setter. Then we return a None, which sqlalchemy tries to assign to that prop, but that'll get ignored because go the setter. Is it worth adding a comment explaining this, if I got it right?

@nineinchnick
Copy link
Member

When working with a server that's geographically far away, or over a bad network connection, the overhead of the extra query can be significant, especially when executing short/simple queries. IMO, it is very valuable to avoid this unnecessary query.

@nineinchnick
Copy link
Member

Note that I tested today that getting the server version info is done only once after creating the engine, even if connections are not reused (using a NullPool and closing the connection after every query).

@hashhar hashhar merged commit 12e6f44 into trinodb:master May 15, 2023
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants