
New pool (WIP) #69

Closed
wants to merge 3 commits into from

Conversation

@grigi
Member

grigi commented Nov 26, 2018

Now that the code base is simpler and the test runner should be more sane, this is the third attempt at adding connection pooling.

The plan is to change to connection pooling ONLY, as we currently implement persistent connections, but only a single persistent connection.
A connection pool should:

  • add robustness (if a connection dies, reconnect)
  • allow multiple DB clients to operate at the same time (up to maxsize)
  • allow more conflicts to occur, so we need to handle rollbacks/retries explicitly

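For illustration only, a minimal sketch of what MySQL pooling via aiomysql could look like (host, credentials and maxsize are placeholder values, not this PR's actual configuration):

```python
import asyncio
import aiomysql

async def main():
    # Placeholder connection settings; up to `maxsize` connections can be
    # checked out concurrently.
    pool = await aiomysql.create_pool(
        host="127.0.0.1", port=3306,
        user="root", password="", db="test_tortoise",
        minsize=1, maxsize=5, autocommit=True,
    )
    try:
        async with pool.acquire() as conn:       # borrow a connection from the pool
            async with conn.cursor() as cur:
                await cur.execute("SELECT 1")
                print(await cur.fetchone())
    finally:
        pool.close()
        await pool.wait_closed()

asyncio.get_event_loop().run_until_complete(main())
```
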
Things done:

  • Change to a connection pooling system for MySQL.
  • Add tests for concurrency
  • Add tests for robustness (hackery allowed)
  • Add tests for handling conflicts.

Concerns:

  • Can SQLite be concurrent at all? If it is difficult, should we limit it?
  • We need to add concurrency to the benchmarks so we can track the performance impact.

@coveralls

Pull Request Test Coverage Report for Build 349

  • 23 of 69 (33.33%) changed or added relevant lines in 7 files are covered.
  • 241 unchanged lines in 10 files lost coverage.
  • Overall coverage decreased (-6.3%) to 88.032%

Changes Missing Coverage (Covered Lines / Changed+Added Lines / %):

  • tortoise/models.py: 2 / 3 / 66.67%
  • tortoise/backends/asyncpg/client.py: 0 / 7 / 0.0%
  • tortoise/backends/mysql/client.py: 0 / 38 / 0.0%

Files with Coverage Reduction (New Missed Lines / %):

  • tortoise/backends/asyncpg/__init__.py: 2 / 100.0%
  • tortoise/__init__.py: 2 / 99.32%
  • tortoise/backends/mysql/__init__.py: 2 / 0.0%
  • tortoise/models.py: 3 / 93.17%
  • tortoise/backends/asyncpg/executor.py: 6 / 100.0%
  • tortoise/backends/asyncpg/schema_generator.py: 8 / 100.0%
  • tortoise/backends/mysql/schema_generator.py: 12 / 0.0%
  • tortoise/backends/mysql/executor.py: 21 / 0.0%
  • tortoise/backends/mysql/client.py: 89 / 0.0%
  • tortoise/backends/asyncpg/client.py: 96 / 100.0%

Totals:

  • Change from base Build 347: -6.3%
  • Covered Lines: 1896
  • Relevant Lines: 2145

💛 - Coveralls

@grigi
Member Author

grigi commented Dec 22, 2018

There are several useful changes in this PR that I'm going to pull out into their own PR:

  • contextvars fixes
  • autocommit for MySQL
  • Less confusing logs
  • Isolation fixes in test runner
  • db_url improvements
  • minor test enhancements
  • dependency updates

@grigi
Member Author

grigi commented Dec 22, 2018

@abondar Also whilst debugging the idiotic autocommit issue, I found that we don't have a clear expectation of how transactions should operate.

I'm proposing this:

  • Root level: auto-commit
  • First transaction: full isolation
  • Nested transaction: no-op. Nested transactions are handled differently depending on the DB version, the connector used, etc., and with the current implementation they may just create a separate transaction on a different connection, causing even more confusion.
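
Roughly, the proposed behaviour (a sketch only; `Tournament` is a placeholder model and `in_transaction()` is tortoise's transaction context manager):

```python
from tortoise import fields
from tortoise.models import Model
from tortoise.transactions import in_transaction

class Tournament(Model):  # placeholder model; Tortoise.init()/schema setup omitted
    id = fields.IntField(pk=True)
    name = fields.TextField()

async def proposed_semantics():
    # Root level: each statement auto-commits on its own (pooled) connection.
    await Tournament.create(name="root level")

    # First transaction: fully isolated; commits or rolls back as a unit.
    async with in_transaction():
        await Tournament.create(name="inside the transaction")

        # Nested transaction: proposed to be a no-op, i.e. it simply reuses the
        # outer transaction's connection rather than starting anything new.
        async with in_transaction():
            await Tournament.create(name="nested, still the same transaction")
```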

grigi added a commit that referenced this pull request Dec 24, 2018
This takes the useful parts from #69:
* contextvars update & testrunner fixes
* db_url improvements
* minor test enhancements
* dependency updates
* autocommit for MySQL
* re-ordered execute-SQL log to be inside connection context manager for less confusing logs.
@grigi grigi mentioned this pull request Dec 24, 2018
grigi added a commit that referenced this pull request Dec 24, 2018
This takes the useful parts from #69:
* contextvars update & testrunner fixes
* db_url improvements
* minor test enhancements
* dependency updates
* autocommit for MySQL
* re-ordered execute-SQL log to be inside connection context manager for less confusing logs.
grigi added a commit that referenced this pull request Dec 25, 2018
This takes the useful parts from #69:
* contextvars update & testrunner fixes
* db_url improvements
* minor test enhancements
* dependency updates
* autocommit for MySQL
* re-ordered execute-SQL log to be inside connection context manager for less confusing logs.
@grigi
Member Author

grigi commented Dec 25, 2018

@abondar I rebased and simplified this connection pooling PR.
Right now only the *await_across_transaction* tests are failing. You are welcome to have a go at it yourself.

@abondar
Member

abondar commented Dec 27, 2018

@grigi For some reason the python3.7/mysql build failed badly, but the other builds are now fine. I'll look into the failed build later if you don't manage to get to it before me.

@grigi
Member Author

grigi commented Dec 27, 2018

My first thought was "Probably contextvars", as the backport aiocontextvars isn't exactly the same as the py3.7 version.
Looking at the test results, it looks like test isolation completely failed. I get more errors with fewer processes and fewer errors with more processes, and if I run a test by itself it always passes...

@grigi
Member Author

grigi commented Dec 28, 2018

It doesn't use the transaction's connection, but a different one:

2018-12-28 08:43:32     DEBUG Acquired connection for transaction <aiomysql.connection.Connection object at 0x7fcbf47fbb00>
2018-12-28 08:43:32     DEBUG Acquired connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG INSERT INTO `tournament` (`name`,`created`) VALUES (%s,%s): ['Test', datetime.datetime(2018, 12, 28, 6, 43, 32, 189123)]
2018-12-28 08:43:32     DEBUG Released connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG Acquired connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG UPDATE `tournament` SET `name`='Updated name' WHERE `id`=1
2018-12-28 08:43:32     DEBUG Released connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG Acquired connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG SELECT `created`,`name`,`id` FROM `tournament` WHERE `name`='Updated name' LIMIT 1
2018-12-28 08:43:32     DEBUG Released connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG Acquired connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG INSERT INTO `tournament` (`name`,`created`) VALUES (%s,%s): ['Test 2', datetime.datetime(2018, 12, 28, 6, 43, 32, 195122)]
2018-12-28 08:43:32     DEBUG Released connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG Acquired connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG SELECT `id` `0` FROM `tournament`
2018-12-28 08:43:32     DEBUG Released connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG Acquired connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG SELECT `id` `id`,`name` `name` FROM `tournament`
2018-12-28 08:43:32     DEBUG Released connection <aiomysql.connection.Connection object at 0x7fcbf4b24ac8>
2018-12-28 08:43:32     DEBUG Released connection for rolled back transaction <aiomysql.connection.Connection object at 0x7fcbf47fbb00>

It creates a transaction on connection 0x7fcbf47fbb00, but then runs all the SQL on connection 0x7fcbf4b24ac8.
To me that confirms a contextvars-related issue, as that is where the transaction's connection should be stored.
I'm sure I fixed this before, but maybe I broke something in the last rebase?

This should be unrelated to your queryset changes.
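
For illustration, a standalone sketch (not tortoise code, assuming the transaction connection is stored in a module-level ContextVar): on py3.7 a value set inside a separate Task lives in a copy of the context, so the caller never sees it, which is exactly this kind of "transaction on one connection, SQL on another" mismatch:

```python
import asyncio
import contextvars

# Hypothetical stand-in for the "current transaction connection" variable.
current_conn = contextvars.ContextVar("current_conn", default="fresh pool connection")

async def start_transaction():
    # Scheduled as its own Task below, so it runs in a *copy* of the caller's
    # context; this set() is invisible to the caller afterwards.
    current_conn.set("transaction connection 0x7fcbf47fbb00")

async def main():
    await asyncio.ensure_future(start_transaction())
    # Still the default: subsequent SQL would acquire a pool connection
    # instead of using the transaction's connection.
    print(current_conn.get())  # -> "fresh pool connection"

asyncio.get_event_loop().run_until_complete(main())
```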

@grigi
Member Author

grigi commented Dec 28, 2018

Yup, it was already broken by my rebase; it was working before then.
Meh, so much changed :-(

@grigi
Member Author

grigi commented Jan 1, 2019

Ok, I did some more digging and found that I was mistaken: py37+mysql (pooling) never isolated correctly.
And if I use contextvars with reset() I always get an error about the context having changed.
Meaning the context somehow changes, so the value we set is not seen where we think it should be.
So we may have to manage the context manually? Or I may be missing something.

There is some documentation on the differences here: https://pypi.org/project/aiocontextvars/
I feel that if we can make the "different Context" complaint from reset() go away, we will probably have contextvars working properly on 3.7.
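
That reset() error can be reproduced in isolation (a minimal sketch, no tortoise involved): a Token can only be used in the Context where set() created it:

```python
import contextvars

conn = contextvars.ContextVar("conn", default=None)

def enter_transaction():
    # set() returns a Token bound to whatever Context is current right now.
    return conn.set("transaction connection")

# Run the set() inside a copied Context, then try to reset from the outer one.
token = contextvars.copy_context().run(enter_transaction)

try:
    conn.reset(token)
except ValueError as exc:
    print(exc)  # <Token ...> was created in a different Context
```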

@grigi grigi mentioned this pull request Feb 1, 2019
@grigi
Member Author

grigi commented Mar 5, 2019

Currently the client looks up a ContextVar that exists globally, outside of its own scope. I think the common issues with globals may actually be relevant here. So I'm going to consider removing the global state tracking and try to simplify the transaction handling in the Client, to make it easier to reason about.

It is easy to trigger the current issue; it appears to be a straightforward race condition. Unfortunately the code isn't that easy to reason about, hence my attempt at simplifying it some more.

@abondar
Member

abondar commented Mar 6, 2019

Currently the client looks up a ContextVar that exists globally, outside of its own scope. I think the common issues with globals may actually be relevant here.

I don't really understand why there is a problem with contextvars. Shouldn't it be okay for them to be global? That is what the examples in the docs show: https://docs.python.org/3/library/contextvars.html
And what is the alternative way to store those variables?

@grigi
Member Author

grigi commented Mar 6, 2019

Yes, the docs talk of globals, but the way I see it being used in Gino is as part of an instance variable.
There is a behaviour difference between the backport and what ships with 3.7, and I don't really know why 3.7 fails for us.

Also, the way transactions sort-of replace the class just doesn't feel clean, hence my considering a refactor.
e.g. the Client contains the pool, and the entirety of transaction management could be a ContextVar inside the class.
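
A very rough sketch of that shape (all names are hypothetical, not the actual tortoise client; the pool is assumed to be aiomysql-style):

```python
import contextvars
from typing import Any

class PooledClient:
    """Hypothetical client: it owns the pool, and the 'current transaction
    connection' belongs to the client rather than being a module-level global."""

    def __init__(self, pool: Any) -> None:
        self.pool = pool
        # Note: the contextvars docs recommend creating ContextVars at module
        # level; a per-client variable is shown here only to match the idea above.
        self._txn_conn = contextvars.ContextVar("txn_conn", default=None)

    async def execute_query(self, sql: str) -> Any:
        conn = self._txn_conn.get()
        if conn is not None:
            # Inside a transaction: reuse the pinned connection.
            async with conn.cursor() as cur:
                return await cur.execute(sql)
        # Outside a transaction: borrow a connection from the pool.
        async with self.pool.acquire() as pooled:
            async with pooled.cursor() as cur:
                return await cur.execute(sql)
```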

@zhoufeng1989

Hi guys. I am using Tortoise ORM (PostgreSQL backend) in a project, and I am now considering adding database pooling (an asyncpg pool) to tortoise.

It seems that in the 'new_pool' branch you have already implemented pooling for MySQL.
When I run the tests in the 'new_pool' branch, four of the test cases fail:

  • tortoise.tests.test_capabilities.TestCapabilities.test_actually_runs
  • tortoise.tests.test_capabilities.TestCapabilities.test_connection_name
  • tortoise.tests.test_tester.TestTesterASync.test_fail
  • tortoise.tests.test_tester.TestTesterASync.TestTesterSync.test_fail

But these failures seem unrelated to database pooling. I would like to know the progress of this feature. Are the problems discussed in this issue still unsolved? If I add pooling for PostgreSQL, is there anything tortoise-specific I need to know about? Can I start from this branch?

Thanks for your excellent work on this library!

@grigi @abondar

@grigi
Member Author

grigi commented Aug 22, 2019

Hi @zhoufeng1989
Thank you for your interest! We have done a bit more work on the problem, and understand the issues a bit better.

So, for a short history as to why this has been sitting around so long:
At first we wanted to push for this to get gather() operations working without issues, somewhat naïvely, since pooling is how one solves concurrency in a threaded environment. But we ran into these odd, hard-to-debug issues. (There is another PR where I essentially pulled out all the bits that I found helped, and this is essentially the second attempt.)

We then had a few concurrency-related bug reports, and I managed to fix all of them quite easily except for one: putting a mutex around a transaction. The failure was identical to this one.
I shelved that here: 51b8d20. Now that I had a small diff, I could try to understand why I could get this to work on Py3.5 & Py3.6 but it would fail on Py3.7.

The obvious difference is that async contextvars is built-in on Py3.7, and a monkey-patch on 3.5/3.6.

It turns out that the context gets applied at a higher resolution on py3.7, so it is tied to stack level instead of to co-routine scheduling. We need to control the stack a bit better so that we can enable this reliably. I suppose the "working way" on 3.5/3.6 was probably not entirely reliable either.

Then life interfered (kid got very sick), but that is gradually returning to normal. For a while most of the dev work was done by contributors, and only recently (the last 2 months or so) have I done real work on this again.

If you read #141, Andrey re-attempted it but ultimately ran into the same issue, and after discussing the core issue as I see it (and I may be wrong), he agreed that we may have to do a large refactor to fix the stacking issue.

So, yes, if you want to contribute in any way to help us reach this milestone reliably, I would be super grateful 😀

I would start with the commit I linked, and see how restructuring the code could enable the stack-level resolution of the built-in contextvars to work.
We now have a relaxed requirement in that we just deprecated py3.5, so any work on a large feature can happily break py3.5 compatibility.
That means we can use cleaner, simpler async generators (https://docs.python.org/3.7/whatsnew/3.6.html#whatsnew36-pep525), which I think could really help simplify the flow of co-routines and might allow us to fix this issue without a massive refactor that may or may not fix it (a rough sketch follows at the end of this comment).

Wow, that was a lot longer than I intended, but I hope it gives enough context.

I want that issue resolved, because once we get the locking around transactions done, we will have perfect isolation.
Then adding pools, or different behaviour for in-or-out-of-transaction work (like for bulk operations), would be trivial.
And that would get us very close to where we consider ourselves fully production-ready. YAY!
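
The async-generator idea mentioned above could look roughly like this (a sketch under assumptions: `current_conn` is a hypothetical module-level ContextVar, `pool` is an aiomysql-style pool, and `contextlib.asynccontextmanager` requires py3.7):

```python
import contextvars
from contextlib import asynccontextmanager  # Python 3.7+

# Hypothetical module-level variable for the current transaction connection.
current_conn = contextvars.ContextVar("current_conn", default=None)

@asynccontextmanager
async def connection_in_context(pool):
    # PEP 525 async generator: acquire, set(), yield, reset() and release all
    # happen in one frame, so as long as the block is entered and exited in the
    # same task, the Token is reset in the same Context that created it.
    async with pool.acquire() as conn:
        token = current_conn.set(conn)
        try:
            yield conn
        finally:
            current_conn.reset(token)

# Usage sketch:
#   async with connection_in_context(pool) as conn:
#       ...  # queries in this block can look up current_conn.get() and find `conn`
```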

@zhoufeng1989

Hi @grigi, thanks for your informative reply, but I am still somewhat confused about the problem that fails on Py3.7 but works on Py3.5 & Py3.6.

I am trying to reproduce the problem, but ran into some other issues. All my work is based on the latest release, 0.13.0.

  • Since I only have PostgreSQL available right now, can I run the unit tests against PostgreSQL only? I tried setting the environment variable TORTOISE_TEST_DB to something like postgres://postgres:@127.0.0.1:5432/test_tortoise and also reading db_url in conftest.py from this environment variable, but it seems that some test cases still use sqlite (while debugging, the current_transaction_map is something like {'default': <ContextVar name='default' default=<tortoise.backends.sqlite.client.SqliteClient object at 0x10b60e9b0> at 10b60e978>}).

  • Would you please point me to the test cases that fail on Py3.7 but pass on Py3.5 & 3.6? Maybe I can understand the problem better with those test cases.

Thanks!

@grigi
Member Author

grigi commented Aug 27, 2019

Yes, some tests manually create DB contexts, e.g. for testing db-specific DDL generation. (I think those should not actually create a DB instance, just a client object.) It should be limited to test_connection_params.py and test_generate_schema.py.

I can't remember exactly which ones were failing on py3.7, but the most important tests for pooling/concurrency are test_concurrency.py and test_transactions.py.
Also, if you can get this one working fine: 51b8d20#diff-1ac94b73e8bd2ab0fdcd9d23877a3665R30-R38
then we are probably all good to go.

@grigi grigi changed the base branch from master to develop September 21, 2019 17:31
@abondar abondar closed this Sep 23, 2019
@hqsz hqsz deleted the new_pool branch May 14, 2020 08:16