install apsw with pypy3 #323

Closed
prRZ5F4LXZ opened this issue Feb 22, 2022 · 3 comments

Comments

@prRZ5F4LXZ

I am not able to install apsw with pypy3. Can this support be added?

I suppose pypy3 would run a Python script that uses apsw faster than vanilla CPython does? If so, would it be beneficial for apsw to support pypy3?

@rogerbinns
Owner

Note that APSW is C code that bridges the C-based API of CPython and the C-based API of SQLite. pypy focuses on Python code and provides some compatibility code for C extensions. I quickly tried again, and pypy3 doesn't implement PyUnicode_CopyCharacters, which was added to CPython 10 years ago. Reimplementing that sort of thing is extra work, though it may eventually happen.

But if you are using apsw, you'll find that the vast majority of time is spent in SQLite itself, especially waiting for storage to confirm that data is really written and can survive a power failure. That is, if you want things to run faster, your effort is best spent optimising the SQL side and paying very careful attention to transaction boundaries, since that is where SQLite has to wait on storage.
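As a rough illustration of the transaction-boundary point, here is a minimal sketch that inserts the same rows either inside one transaction or with one transaction per row. It uses the standard library sqlite3 module rather than APSW, and the table name `t` and the helper names `make_db` and `insert_rows` are made up for this example. With an on-disk database, the per-row variant typically has to wait on storage at every COMMIT.

```python
import sqlite3


def make_db(path=":memory:"):
    # isolation_level=None puts the sqlite3 module in autocommit mode,
    # so the explicit BEGIN/COMMIT statements below control transactions.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS t(x INTEGER)")
    return conn


def insert_rows(conn, values, single_transaction=True):
    """Insert values into table t, committing once or once per row."""
    if single_transaction:
        # One transaction boundary for the whole batch.
        conn.execute("BEGIN")
        conn.executemany("INSERT INTO t(x) VALUES (?)", ((v,) for v in values))
        conn.execute("COMMIT")
    else:
        # One transaction boundary per row: each COMMIT is a point
        # where SQLite may have to wait for storage.
        for v in values:
            conn.execute("BEGIN")
            conn.execute("INSERT INTO t(x) VALUES (?)", (v,))
            conn.execute("COMMIT")
```

Passing a file path to `make_db` instead of the default in-memory database is what makes the difference between the two modes visible in timings.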

I am busy updating APSW and will try pypy again when done.

@rogerbinns
Owner

I've tried again and these are the compile errors:

$ pypy3 setup.py build_ext --inplace --force
running build_ext
SQLite: Using amalgamation /space/apsw/sqlite3/sqlite3.c
SQLite: Using configure generated /space/apsw/sqlite3/sqlite3config.h
building 'apsw' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/src
gcc -pthread -DNDEBUG -O2 -fPIC -DNDEBUG=1 -DAPSW_FORK_CHECKER=1 -DAPSW_USE_SQLITE_AMALGAMATION=1 -DAPSW_USE_SQLITE_CONFIG=1 -I/space/apsw/sqlite3 -Isrc -I/usr/lib/pypy3/include -c src/apsw.c -o build/temp.linux-x86_64-3.7/src/apsw.o
In file included from src/apsw.c:79:
/usr/lib/pypy3/include/Python.h:6: warning: "_GNU_SOURCE" redefined
    6 | #define _GNU_SOURCE 1
      | 
In file included from src/apsw.c:49:
/space/apsw/sqlite3/sqlite3.c:242: note: this is the location of the previous definition
  242 | # define _GNU_SOURCE
      | 
In file included from src/apsw.c:136:
src/util.c: In function ‘apsw_write_unraiseable’:
src/util.c:116:30: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘frame’
  116 |   frame = PyThreadState_GET()->frame;
      |                              ^~
src/util.c:120:18: error: ‘PyFrameObject’ {aka ‘struct _frame’} has no member named ‘f_back’
  120 |     frame = frame->f_back;
      |                  ^~
In file included from src/apsw.c:139:
src/statementcache.c: In function ‘statementcache_prepare_internal’:
src/statementcache.c:139:12: warning: implicit declaration of function ‘_Py_HashBytes’ [-Wimplicit-function-declaration]
  139 |     hash = _Py_HashBytes(utf8, utf8size);
      |            ^~~~~~~~~~~~~
src/apsw.c: In function ‘formatsqlvalue’:
src/apsw.c:1155:7: warning: implicit declaration of function ‘PyUnicode_CopyCharacters’ [-Wimplicit-function-declaration]
 1155 |       PyUnicode_CopyCharacters(strres, 1, value, 0, input_length);
      |       ^~~~~~~~~~~~~~~~~~~~~~~~
error: command 'gcc' failed with exit status 1

The first two errors, both in apsw_write_unraiseable, only affect the reporting of unraisable exceptions, and that section could simply be omitted under pypy.

_Py_HashBytes isn't officially documented in CPython, but it is the underlying function used for hashing strings and bytes. APSW only uses it in the statement cache, and any hash function would work there, even one that always returns 0 (the worst case). PyUnicode_CopyCharacters is a strange one to be missing, but it can be replaced with a loop of PyUnicode_ReadChar/PyUnicode_WriteChar.
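To see why a degenerate hash only costs speed and not correctness, here is a small Python analogy. It is not APSW's statement cache, which is C code; the `CacheKey` class and the dict standing in for the cache are hypothetical. Lookups still succeed because equality is checked after the hash, so a constant hash just makes every key collide.

```python
class CacheKey:
    """Hypothetical cache key whose hash is always 0 (the worst case)."""

    def __init__(self, sql):
        self.sql = sql

    def __hash__(self):
        # Every key lands in the same bucket.
        return 0

    def __eq__(self, other):
        # Correctness comes from comparing the actual SQL text.
        return isinstance(other, CacheKey) and self.sql == other.sql


# A plain dict standing in for a statement cache.
cache = {
    CacheKey("SELECT 1"): "prepared statement A",
    CacheKey("SELECT 2"): "prepared statement B",
}
```

Looking up `cache[CacheKey("SELECT 2")]` still finds the right entry; it just has to compare against colliding keys on the way.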

So it wouldn't take much to get this to compile. Do note that APSW depends on the GIL and does not separately protect its data structures. However, I doubt there would be any performance difference.

@rogerbinns
Owner

rogerbinns commented Mar 31, 2022

I just did quick compliant implementations of the missing functions, and can compile and run with pypy3.

There are lots of these:

Exception ignored in: weakref callback <bound method list.remove of []>
ValueError: list.remove(): <weakref at 0x000055b526854680; dead> is not in list 

The root cause is APSW's tracking of cursors, blobs, etc. that belong to a Connection. In order to call Connection.close, each of those dependents has to be remembered and have close called on it. Weak references are used to do so, but pypy performs weak reference handling and garbage collection in a different order than CPython, which causes the above to happen. It doesn't affect correctness.
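The tracking pattern described above can be sketched in pure Python. This is an illustration, not APSW's actual C implementation, and the class and method names are assumptions made for the example. The weak reference callback here tolerates the reference already being gone from the list, which is the situation behind the ValueError messages.

```python
import weakref


class Cursor:
    """Stand-in for a dependent object such as a cursor or blob."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Connection:
    def __init__(self):
        # Weak references to dependents, so tracking does not keep them alive.
        self._dependents = []

    def _forget(self, ref):
        # Called when a dependent is garbage collected. The entry may
        # already have been removed, depending on collection order.
        try:
            self._dependents.remove(ref)
        except ValueError:
            pass

    def cursor(self):
        cur = Cursor()
        self._dependents.append(weakref.ref(cur, self._forget))
        return cur

    def close(self):
        # Close every dependent that is still alive, then drop the tracking.
        for ref in list(self._dependents):
            obj = ref()
            if obj is not None:
                obj.close()
        self._dependents.clear()
```

Using `list.remove` directly as the callback would raise whenever the entry has already been removed, whereas the `_forget` wrapper makes the order of events irrelevant.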

Other than that, most of the test suite passes; the failures are in various edge cases not normally encountered in regular code.

I also ran a speedtest. pypy's timing compared to CPython was very erratic: sometimes faster than CPython, but usually not. Running the exact same speedtest item multiple times in a row under pypy gives numbers that vary by up to 50%, whereas CPython varies by only a millisecond or two between iterations.
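For anyone wanting to check run-to-run variation themselves, a minimal sketch of the measurement follows. It is not APSW's speedtest; `time_repeats` and `spread` are hypothetical helpers, and the spread is defined here as (max - min) / median, which is only one of several reasonable choices.

```python
import statistics
import time


def time_repeats(func, repeats=5):
    """Run func several times in a row and return each wall-clock duration."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def spread(timings):
    """Relative spread of the timings: (max - min) / median."""
    return (max(timings) - min(timings)) / statistics.median(timings)
```

Running the same workload through `time_repeats` under each interpreter and comparing `spread` gives a single number for how erratic the timings are.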

The bigstmt speedtest takes ~2 seconds to execute with APSW and sqlite3 on CPython. Under pypy3 APSW still takes 2 seconds, while sqlite3 is over 10 minutes and still going!

(The --correctness flag is also suspicious of sqlite3 under pypy).

Edit: pypy's sqlite3 speedtest on bigstmt took 30 mins, 46 mins, and 43 mins to complete!
