This repository has been archived by the owner on Feb 22, 2020. It is now read-only.

Parallel queries #3

Closed
dimaqq opened this issue Aug 18, 2011 · 12 comments

@dimaqq

dimaqq commented Aug 18, 2011

I am unable to run queries against the same database in parallel over separate connections.
amysql + pthreads somehow serializes the queries (perhaps it doesn't release the GIL?)
amysql + gevent crashes (python: PacketWriter.cpp:93: void PacketWriter::pull(size_t): Assertion `m_writeCursor - m_readCursor <= cbSize' failed.)

For comparison, MySQLdb + pthreads runs the same queries in parallel just fine.

version info: x86_64, amysql git, python 2.7.1, gevent 0.13.6

@mthurlin
Member

Could you provide a minimal failing example?

@jskorpan

Are you using the same Connection object in several greenlets (gevent) or threads (CPython)?

@dimaqq
Author

dimaqq commented Aug 18, 2011

A separate connection object per green thread or real thread.
I'll write up a smaller test case than what I have now and post it here.

@yashh

yashh commented Aug 18, 2011

I am using amysql + greenlets to query in parallel and haven't seen any issues. Maybe an exact snippet would help replicate this. I think the greenlets are sharing the connection.

@dimaqq
Author

dimaqq commented Aug 18, 2011

== amysql+gevent ==

Tracked down the assertion crash - apparently amysql doesn't like long lines, e.g. those produced by mysqldump. This only happens for amysql built against gevent.
$ wc test.sql
1 5 959312 test.sql

#!/usr/bin/env python
import sys, gevent, gevent.queue, amysql, time, logging, random
# test.sql is the single ~959 KB line shown by wc above
SQL = open("test.sql").read()

slave = amysql.Connection()
slave.connect("xxx", 3306, "xxx", "xxx", "test", True, 'utf8')
slave.query(SQL)  # aborts with the assertion below

python: PacketWriter.cpp:93: void PacketWriter::pull(size_t): Assertion `m_writeCursor - m_readCursor <= cbSize' failed.
Aborted

Once I worked around this by using short SQL statements, I could test amysql+gevent concurrency, and it works fine after all.

== amysql+pthreads ==

Long SQL is OK here, however there's no concurrency.
-- use test;
-- create table test ( test float );
-- load a bunch of data into test so that select sum(test) from test is slow

#!/usr/bin/env python
import sys, threading, amysql, time, logging, random

PASSWORD="xxx"
USER="xxx"
HOST="xxx"

class DThread(threading.Thread):
    def run(self):
        try:
            # one connection per thread, never shared
            slave = amysql.Connection()
            slave.connect(HOST, 3306, USER, PASSWORD, "test", True, 'utf8')
            time.sleep(1)  # allow other threads to start
            while True:
                slave.query("select sum(test) from test")
        except Exception:
            logging.exception("th")
            raise

threads = [DThread() for i in range(4)]
for t in threads: t.setDaemon(True)
for t in threads: t.start()

time.sleep(30)

show processlist shows this:
| 227541 | root | pillow:59470 | test | Sleep | 13 | | NULL |
| 227542 | root | pillow:59471 | test | Query | 1 | Sending data | select sum(test) from test |
| 227543 | root | pillow:59472 | test | Sleep | 13 | | NULL |
| 227544 | root | pillow:59473 | test | Sleep | 13 | | NULL |

which basically means that the other threads don't get to run.
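A rough way to check this from the client side, without watching show processlist, is to time the same blocking call run once and then run in several threads at the same time. The sketch below is only an illustration: run_parallel and blocking_call are made-up names, and time.sleep stands in for slave.query(...) so that it runs without a database.

```python
import threading
import time


def run_parallel(blocking_call, n_threads):
    """Run blocking_call once in each of n_threads threads; return wall-clock seconds."""
    threads = [threading.Thread(target=blocking_call) for _ in range(n_threads)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


def blocking_call():
    # Stand-in (assumption) for slave.query("select sum(test) from test").
    # time.sleep releases the GIL, so this models a well-behaved driver.
    time.sleep(0.2)


single = run_parallel(blocking_call, 1)
four = run_parallel(blocking_call, 4)
# Ratio near 1: the calls overlapped. Ratio near 4: the calls were serialized.
ratio = four / single
print("1 thread: %.2fs, 4 threads: %.2fs, ratio %.1f" % (single, four, ratio))
```

With a real query swapped in for the sleep, a ratio close to the thread count would match the processlist output above, where only one connection is ever in the Query state.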

@jskorpan

Please be aware that you have to build amysql with setup_gevent.py to get a gevent build. Anything built with the normal setup.py will not produce a binary compatible with gevent I/O (and the other way around). If anyone has a better way to solve this, such as packaging separate builds for gevent and "normal" Python, please feel free to contribute.

Let's move any discussion about long queries crashing amysql to the new issue that I've created.

@dimaqq
Author

dimaqq commented Aug 19, 2011

That's exactly how I've done it. setup_gevent.py for amysql+gevent and setup.py for amysql+pthreads.

@jskorpan

Does tests.py pass for gevent build?

@dimaqq
Author

dimaqq commented Aug 19, 2011

all tests pass (after manual setup of testdb.sql, which is confusing, as tests.py doesn't check that)

..........................

Ran 26 tests in 22.189s

OK

@jskorpan

Then please fork amysql and add a test of your own to tests.py that fails on the parallel issue as you describe it above.

Thanks!

@dimaqq
Author

dimaqq commented Aug 19, 2011

To recap, as far as parallel requests go, amysql+gevent is OK and amysql+pthreads is buggy.
A simple grep -ri thread amysql suggests that the GIL is not released anywhere in the extension code.

Btw., right now tests.py requires gevent, so it can't be used to test amysql+pthreads as is.

@jskorpan

Ahh, that makes more sense. I think you're onto the GIL issue.

I guess releasing the GIL in io_cpython.c right before we do select(), and reacquiring it right after, would do the trick?
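For reference, the behaviour being asked for is what Python's own select.select already gives: it drops the GIL while it waits, so other threads keep running. A small sketch of that, using only the standard library (nothing here is amysql code; waiter and ticks are names made up for the illustration):

```python
import select
import socket
import threading
import time

# A connected socket pair; nothing is ever written, so select() just waits.
a, b = socket.socketpair()
done = threading.Event()


def waiter():
    # Blocks in select() for up to 0.5s. CPython releases the GIL around
    # the underlying system call, so the main thread is not held up.
    select.select([a], [], [], 0.5)
    done.set()


t = threading.Thread(target=waiter)
start = time.time()
t.start()

ticks = 0
while not done.is_set():
    ticks += 1  # main thread keeps making progress during the wait

t.join()
elapsed = time.time() - start
a.close()
b.close()
print("main thread ticked %d times in %.2fs" % (ticks, elapsed))
```

In a C extension the usual way to get the same effect is to wrap the blocking call in Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS. Whether that fits io_cpython.c as it stands is an assumption, since that file isn't shown in this thread.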
