Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make use of pypika objects being immutable, and pre-build as much of the query as possible. #64

Merged
merged 12 commits into from
Nov 13, 2018

Conversation

grigi
Copy link
Member

@grigi grigi commented Nov 5, 2018

Make use of pypika objects being immutable, and pre-build as much of the query as possible.
Also pre-build DB-specific filter override list. (avoids a transaction lookup which is quite expensive)
Also optimised queryset cloning.,

This gives us a further 21-48% speed up for small queries. 😁

This actually slowed down tests marginally, as in the tests we re-build all the query caches for each test. 🤷‍♂️

@coveralls
Copy link

coveralls commented Nov 5, 2018

Pull Request Test Coverage Report for Build 281

  • 18 of 20 (90.0%) changed or added relevant lines in 4 files are covered.
  • 327 unchanged lines in 10 files lost coverage.
  • Overall coverage increased (+0.01%) to 91.548%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tortoise/models.py 10 12 83.33%
Files with Coverage Reduction New Missed Lines %
tortoise/init.py 2 96.42%
tortoise/backends/asyncpg/init.py 2 100.0%
tortoise/backends/mysql/init.py 2 100.0%
tortoise/backends/asyncpg/executor.py 6 100.0%
tortoise/backends/base/client.py 8 91.38%
tortoise/backends/asyncpg/schema_generator.py 8 100.0%
tortoise/backends/mysql/schema_generator.py 13 100.0%
tortoise/backends/mysql/executor.py 21 100.0%
tortoise/backends/asyncpg/client.py 131 84.39%
tortoise/backends/mysql/client.py 134 85.08%
Totals Coverage Status
Change from base Build 276: 0.01%
Covered Lines: 2092
Relevant Lines: 2226

💛 - Coveralls

1 similar comment
@coveralls
Copy link

Pull Request Test Coverage Report for Build 281

  • 18 of 20 (90.0%) changed or added relevant lines in 4 files are covered.
  • 327 unchanged lines in 10 files lost coverage.
  • Overall coverage increased (+0.01%) to 91.548%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tortoise/models.py 10 12 83.33%
Files with Coverage Reduction New Missed Lines %
tortoise/init.py 2 96.42%
tortoise/backends/asyncpg/init.py 2 100.0%
tortoise/backends/mysql/init.py 2 100.0%
tortoise/backends/asyncpg/executor.py 6 100.0%
tortoise/backends/base/client.py 8 91.38%
tortoise/backends/asyncpg/schema_generator.py 8 100.0%
tortoise/backends/mysql/schema_generator.py 13 100.0%
tortoise/backends/mysql/executor.py 21 100.0%
tortoise/backends/asyncpg/client.py 131 84.39%
tortoise/backends/mysql/client.py 134 85.08%
Totals Coverage Status
Change from base Build 276: 0.01%
Covered Lines: 2092
Relevant Lines: 2226

💛 - Coveralls

@coveralls
Copy link

coveralls commented Nov 5, 2018

Pull Request Test Coverage Report for Build 304

  • 88 of 92 (95.65%) changed or added relevant lines in 10 files are covered.
  • 309 unchanged lines in 9 files lost coverage.
  • Overall coverage increased (+0.05%) to 91.586%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tortoise/backends/mysql/client.py 0 2 0.0%
tortoise/models.py 11 13 84.62%
Files with Coverage Reduction New Missed Lines %
tortoise/backends/asyncpg/init.py 2 100.0%
tortoise/backends/mysql/init.py 2 100.0%
tortoise/backends/asyncpg/executor.py 6 100.0%
tortoise/backends/base/client.py 8 92.42%
tortoise/backends/asyncpg/schema_generator.py 8 100.0%
tortoise/backends/mysql/schema_generator.py 13 100.0%
tortoise/backends/mysql/executor.py 21 100.0%
tortoise/backends/asyncpg/client.py 124 83.54%
tortoise/backends/mysql/client.py 125 84.3%
Totals Coverage Status
Change from base Build 276: 0.05%
Covered Lines: 2092
Relevant Lines: 2225

💛 - Coveralls

@grigi grigi changed the title Make use of pypika objecte being immutable, and pre-build as much of the query as possible. Make use of pypika objects being immutable, and pre-build as much of the query as possible. Nov 5, 2018
@grigi
Copy link
Member Author

grigi commented Nov 6, 2018

@Zeliboba5 Please review.

tortoise/models.py Outdated Show resolved Hide resolved
@abondar
Copy link
Member

abondar commented Nov 7, 2018

Also, regarding README, I think we should add info about mysql and pg drivers availability, because right now that info isn't available in README.
Also we could remove defining of id field in examples, because at the moment it is still non-configurable, so people could just use predefined id field, without giving them hope that they can change it :)

@abondar
Copy link
Member

abondar commented Nov 7, 2018

Also pypika resolved issue with boolean casting kayak/pypika#171
Could you test if it resolves #62 ?

Added __slots__ to fields to reduce memory usage a little.
Renamed `generate_filters` for clarity.
@grigi
Copy link
Member Author

grigi commented Nov 7, 2018

I'm in two minds about removing id from all examples...
Do you have another suggestion?

@abondar
Copy link
Member

abondar commented Nov 7, 2018

May be remove them but leave a comment on first model class occurrence saying that id field will be automatically generated?
Both in README and docs.
I think that way it will be quite clear and won't cause misunderstanding for users

@grigi
Copy link
Member Author

grigi commented Nov 9, 2018

So, @Zeliboba5 the new pypika doesn't fix it: https://travis-ci.com/tortoise/tortoise-orm/jobs/157473493

However, on my local system the tests passed, before and after the pypika upgrade.
So I suspect that this is fixed in a later version of SQLite...
But most people don't run the latest SQLite, so we need to support the alias of True=1 and False=0.

@abondar
Copy link
Member

abondar commented Nov 9, 2018

Sorry if I was misleading, but new new version of PyPika meant not only just version, but also using SQLiteQuery from this commit kayak/pypika@d14965c
Could you add it and retest please?

@grigi
Copy link
Member Author

grigi commented Nov 9, 2018

Unforutnately the implementation is not complete:

>>> from pypika import Table
>>> from pypika.dialects import SQLLiteQuery
>>> table_abc = Table('abc')
>>> SQLLiteQuery.from_('abc').select('a').where(table_abc.a == True)
SELECT "a" FROM "abc" WHERE "a"=true

But we have our own to_db_value functions, and I see it is configured for inserts, but not filters. Should we not do that instead?

@grigi
Copy link
Member Author

grigi commented Nov 11, 2018

@Zeliboba5 I Fixed #62 by using the overrideable to_db functions in filtering, instead of just insert.
In the end I did not use the new SQLLiteQuery dialect, as it gained me nothing. And we now consistently handle per-DB field type overrides.

Please review.

self._joined_tables = [] # type: List[Table]
self.model = model
self.query = model._meta.basequery # type: Query
self._db = db if db else model._meta.db # type: BaseDBAsyncClient
Copy link
Member

@abondar abondar Nov 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember I specifically removed db field from instancing in __init__, because it brings trouble, where you define your query outside of transaction and then run it inside transaction, which leads to query being executed outside of transaction, cause it's fetched connection before transaction even started.
So that's why connection should be fetched only after query being awaited

However, as I see lower, I didn't remove it in DeleteQuery and some other queries, which seems wrong to me now. Is there some case that I am missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was valid before the transaction rewrite, because now the connection is actually stored in a context variable?

Also, I was toying with separating the DB instance from the actual "connection", I was trying it for #56 but then realised the test runner was causing chaos, hence I deferred that until after that was done. (Since we are an async-only ORM, why can't we always manage connections as a pool?) (Currently all that PR does is remove all the scattered connection logic, just to simplify things, so we can try and re-add it, and lazy evaluation makes debugging that super complicated)

And then there is the case of allowing a single Model that would be available on different databases (of potentially different types). I considered that, and the whole way we manage QuerySet assumes it is a 1:1 relationship, and it will require a large rewrite to make it, so decided that it is an OK limitation for now, as I never needed to do that, and it will complicate everything too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was valid before the transaction rewrite, because now the connection is actually stored in a context variable?

Yeah, but then, after refactoring, I encountered the case I described above and fixed it. But you are actually right, that it looks scattered for now. How about making db property for AwaitableQuery, so code will still look better, but actual value will be evaluated only on call?

Regarding allowing using of single model on different databases - it's seems to me that implementation that allows it will bring much more complexity that we want to handle at the moment, though it doesn't bring benefits for most of users, cause I really think that in most cases you would want to just create two separate models to avoid confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, too much complexity, in both interface and implementation. I like the suggestion of two separate models in that case. Allows you to manage migrations separately.

I'll look at moving some of that logic to awaitablequery tomorrow. Just concerned with the todb overrides.
Just for clarity, the issue was specific to pooled connections?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK with you after I get this merged, then change #56 to be a PR to remove all the pooled logic that isn't working. Then a new PR to try and add it back in a controlled way. Then the main purpose of #56 would be to simplify the codebase?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the following tests, expecting them to fail:

async def test_await_across_transaction_fail(self):
    tournament = Tournament(name='Test')
    query = tournament.save()

    try:
        async with in_transaction() as connection:
            result = await query
            raise Exception('moo')
    except Exception:
        pass

    self.assertEqual(await Tournament.all(), [])

async def test_await_across_transaction_success(self):
    tournament = Tournament(name='Test')
    query = tournament.save()

    async with in_transaction() as connection:
        result = await query

    self.assertEqual(await Tournament.all(), [tournament])

But they pass. Which means to me that if I prepare OUTSIDE the transaction but await INSIDE a transaction, the transaction isolation then works the same as if everything was inside a transaction.

If that is the case, then the issue you were experiencing is now not valid anymore?
Makes sense to me, because QuerySet._execute() only runs when you await it. so it inherits the state of where it gets awaited.

Which seems correct to me, so can we leave it as is?

Copy link
Member

@abondar abondar Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue I described doesn't apply to .save(), cause when you call .save() without awaiting nothing occurs, it's only executed when it's awaited.

Problem should occur when you build queryset, and await it later, like that:

    async def test_await_across_transaction(self):
        tournament_query = Tournament.filter(name='Test')

        async with in_transaction() as connection:
            await Tournament.create(name='Test')
            result = await tournament_query
            self.assertEqual(len(result), 1)

But after actually writing that test I understood that you were right, that problem occurs only in connection pool context and when we deal with single connection it doesn't matter because everything after transaction start will fall into it transaction.

Sorry for confusion. If we are cutting out pools for now we can leave _db as it is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let me also do tests for .update() and then we see.
Only .save() and .update() modify contents, so those should be the only queries applicable to transactions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, transactions are not only about writing, it's also about transaction isolation.
In test that I wrote in my previous message - if there was connection pool and tournament_query was executed outside of transaction - it won't see changes brought by .create() which leads to errors.
And if we talk about modification - .delete() also modifies content.

So I think all operations are equally important in terms of their interactions with transactions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I added tests for insert/update/delete/select and they all work as I expected: https://github.com/tortoise/tortoise-orm/pull/64/files#diff-12a2be3f481afdc9a647b690a7a299f9
So, for now I think this is OK.

And it becomes the problem of when the connection pooling happens.

@abondar
Copy link
Member

abondar commented Nov 13, 2018

So far this PR seems complete to me.
Are you going to add anything or we are good to merge it?

@grigi grigi merged commit 7e94fa1 into master Nov 13, 2018
@abondar abondar deleted the perf branch November 23, 2018 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants