
NamedTupleCursor is inefficient #838

Closed
Changaco opened this Issue Feb 1, 2019 · 10 comments


Changaco commented Feb 1, 2019

Calling the standard library's namedtuple function over and over again to create ephemeral classes isn't efficient. (You may want to take a look at namedtuple's source code if you're not familiar with it.)

Instead of doing that, a single Record class could be created and used as the row_factory. It could be similar to DictRow, but with a __getattr__ method in addition to __getitem__.

The current NamedTupleCursor implementation also degrades performance by calling re.compile() and re.sub() repeatedly in its _make_nt() method. The Record class could avoid that most of the time by only generating the Python-compatible names when an attribute lookup fails.
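The idea above could look roughly like this; a minimal sketch, where the Record name comes from this proposal but the constructor signature and the lazy name munging are assumptions, not an actual implementation:

```python
import re


class Record(tuple):
    """Sketch of the proposed single row class: a tuple that also
    resolves column names via __getattr__ (details are hypothetical)."""

    def __new__(cls, values, column_names):
        self = super().__new__(cls, values)
        self._column_names = column_names
        return self

    def __getattr__(self, name):
        if name.startswith("_"):
            # Avoid recursing before _column_names is set.
            raise AttributeError(name)
        # Plain lookup first; munge the column names with re.sub()
        # only when the plain lookup fails.
        if name in self._column_names:
            return self[self._column_names.index(name)]
        for i, col in enumerate(self._column_names):
            if re.sub(r"\W", "_", col) == name:
                return self[i]
        raise AttributeError(name)
```

A row such as Record(("a", 1), ("name", "odd-name")) would then answer both r.name and r.odd_name without a new class being generated per query.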

Would a Pull Request implementing those changes be merged?

dvarrazzo commented Feb 1, 2019

The namedtuple() function should be called once per execute, not once per record. Do you confirm this is what happens?

Changaco commented Feb 1, 2019

Yes. I also confirm that I was aware of that caching mechanism when I wrote my previous comment.

dvarrazzo commented Feb 1, 2019

So we are talking about a single namedtuple() call for every execute(), which implies a roundtrip to the server, the server running the query, the data coming back over the network, and then all the Postgres-to-Python data type adaptation. To me it doesn't sound like a worthwhile optimisation. Do you have numbers to back its need?

I'm more OK with micro-optimisations such as avoiding the repeated re.compile() call; they are very probably not measurable either, but at least they fit a pattern of correct usage and still produce a named tuple from a thing called NamedTupleCursor.

If we have to update the named tuple to "a thing with __getattr__" it should be an entirely different object (it could even be DictRow itself as far as I'm concerned), or calling it NamedTupleCursor would be a lie. The cost of doing so also includes documentation and an upgrade path for the users; and of course NamedTupleCursor should stay where it is, with unchanged semantics, since many users rely on it returning precisely named tuples.

Changaco commented Feb 2, 2019

Do you have numbers to back its need?

Yes, I've been profiling a web app. With Python 3.6 the calls to namedtuple make up 14% of the time needed to process a thousand requests for a page that executes 3 SQL queries, even surpassing the 13% taken by the actual processing of the queries (the execute method). In Python 3.7 namedtuple was optimized and the calls to it consume a more reasonable 4.5% of the time, versus 15% for the query processing. However, that's still a significant overhead.

(The sub-call to the builtin exec function is the most expensive part of namedtuple.)
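The gap is easy to reproduce outside the profiler; a minimal timing sketch (not the app's actual workload) comparing per-call class creation against row instantiation from one pre-built class:

```python
import timeit
from collections import namedtuple

# Cost of building a fresh class on every call, as an uncached
# _make_nt effectively does once per execute().
create = timeit.timeit(
    "namedtuple('Record', ('a', 'b', 'c'))",
    setup="from collections import namedtuple",
    number=1000,
)

# Cost of merely instantiating rows from a single reused class.
Record = namedtuple("Record", ("a", "b", "c"))
instantiate = timeit.timeit(
    "Record(1, 2, 3)",
    globals={"Record": Record},
    number=1000,
)

print(create / instantiate)  # class creation is far slower per call
```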

If we have to update the named tuple to "a thing with __getattr__" it should be an entirely different object (it could even be DictRow itself as far as I'm concerned), or calling it NamedTupleCursor would be a lie.

A single Record class could mimic the classes created by namedtuple in almost every way: it could be a tuple subclass and have similar methods (__repr__, _asdict, etc.). Consequently I don't think NamedTupleCursor would be a lie, but if you still prefer that a new cursor factory class be created instead then I'm fine with that.

dvarrazzo commented Feb 2, 2019

Interesting to know, thank you.

A much simpler optimisation would be to cache the namedtuple classes by their field names, keeping an LRU cache of e.g. 100-1000 of them.
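That suggestion can be sketched with functools.lru_cache, keyed on the tuple of column names (the function name and cache size here are illustrative, not psycopg2's actual code):

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=1024)
def make_nt(column_names):
    # column_names must be a hashable tuple so it can serve as
    # the cache key; the expensive namedtuple() call runs only
    # on a cache miss.
    return namedtuple("Record", column_names)
```

Two queries returning the same column names then share a single class, and the exec-based class generation happens at most once per distinct set of names.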

Changaco commented Feb 2, 2019

Caching the classes would speed things up, but it would also cost extra RAM, and it only works well as long as the cache size remains larger than the number of classes used regularly by the application.

dvarrazzo commented Feb 2, 2019

Caching 100-1000 of them is not that much RAM. I don't believe your application uses so many different ones. If your page runs 3 SQL queries, that's the same 3 classes built over and over...

Changaco commented Feb 2, 2019

Obviously the web app has more than one page. A cache size of 100 would most likely be too low since we have more than 500 SQL queries in our app.

I'm not going to write or use a caching version of the current NamedTupleCursor, because in my opinion it's not the correct solution to this issue.

dvarrazzo added a commit that referenced this issue Feb 2, 2019

dvarrazzo commented Feb 2, 2019

I have implemented caching for named tuples. The cache size is 1024, and can be tweaked.

Even if you have more than 2000 queries in your app, not all will be called with the same frequency; the hot ones will remain in the cache. Also note that the cache is keyed not on the queries but on the tuple of names returned: queries that return nothing don't affect the cache, and different queries returning the same column names share a single cache entry.

If you can test the above patch I would be thankful, and I am pretty sure it would suit your case (even if the cache size were lower than the default). If you prefer otherwise, you are free to write your own specialised cursor class, and if you write anything of general usefulness we would be happy to use it.

dvarrazzo commented Feb 2, 2019

In branch https://github.com/psycopg/psycopg2/tree/fast-namedtuple:

Moved to use a proper LRU cache: the stats can be accessed by NamedTupleCursor._cached_make_nt.cache_info(). Interested to know the hit/miss ratio after your app has been running for a while, thank you.
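cache_info() is the standard functools.lru_cache interface, so the hit/miss ratio can be read off directly; a standalone illustration (the decorated function here is a stand-in, not psycopg2's _cached_make_nt):

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def lookup(key):
    return key.upper()


lookup("a")
lookup("a")  # served from the cache
lookup("b")
info = lookup.cache_info()
print(info.hits, info.misses)  # 1 hit (the repeated "a"), 2 misses
```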

dvarrazzo closed this in f1e7350 Feb 6, 2019
