Description
I'm doing some data processing in pandas, "pivoting" the dataframe and saving it to a SQLite table.
The data processing is rather expensive and, for performance reasons, I'd like to process it once with pandas and write the cross-tabulated data frame into a SQLite table for later queries. This strategy works well until I get over 1000 columns, at which point I run into the following SQLite limitation (see section 9, "Maximum Number Of Host Parameters In A Single SQL Statement", at http://www.sqlite.org/limits.html):
To prevent excessive memory allocations, the maximum value of a host parameter number
is SQLITE_MAX_VARIABLE_NUMBER, which defaults to 999.
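The limit bites here because the generated INSERT uses one `?` host parameter per column. A minimal sketch of why a 1000+ column frame trips it (the table name `cond` is taken from the traceback; the column names are made up for illustration):

```python
# Build an INSERT with one placeholder per column, roughly the shape of
# the insert_query that pandas' sql.write_frame hands to executemany.
columns = ['c%d' % j for j in range(1500)]   # hypothetical wide frame
placeholders = ', '.join(['?'] * len(columns))
insert_query = 'INSERT INTO cond VALUES (%s)' % placeholders

# 1500 host parameters in a single statement, well past the default
# SQLITE_MAX_VARIABLE_NUMBER of 999 quoted above.
print(insert_query.count('?'))
```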
The full traceback is:

Traceback (most recent call last):
  File "create_sqlite.py", line 28, in <module>
    sql.write_frame(final, name='cond', con=conn)
  File "/usr/local/apps/murdock/env/lib/python2.6/site-packages/pandas/io/sql.py", line 210, in write_frame
    func(frame, name, safe_names, cur)
  File "/usr/local/apps/murdock/env/lib/python2.6/site-packages/pandas/io/sql.py", line 222, in _write_sqlite
    cur.executemany(insert_query, data)
sqlite3.OperationalError: too many SQL variables
Do you think it would be possible to work around the issue by chunking the insert into smaller statements? Any other ideas for a workaround (aside from recompiling SQLite with a larger SQLITE_MAX_VARIABLE_NUMBER)?
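Note that since each row's INSERT carries one parameter per column, chunking by *rows* doesn't help; the chunking has to happen across *columns*. One possible workaround sketch, outside of pandas: split the frame's columns into groups that each fit under the limit and write each group to its own table, joined by a shared row key. The `write_frame_chunked` helper and the `name_0`, `name_1`, ... table scheme here are my own invention, not a pandas API:

```python
import sqlite3

SQLITE_MAX_VARS = 999  # default compile-time limit quoted above

def write_frame_chunked(conn, name, columns, rows):
    """Split a wide table's columns into groups that each fit under
    SQLite's host-parameter limit, writing each group to its own table
    (name_0, name_1, ...) linked by a shared row_key column."""
    chunk = SQLITE_MAX_VARS - 1  # reserve one parameter for row_key
    cur = conn.cursor()
    for t, i in enumerate(range(0, len(columns), chunk)):
        cols = columns[i:i + chunk]
        table = '%s_%d' % (name, t)
        cur.execute('CREATE TABLE %s (row_key INTEGER, %s)'
                    % (table, ', '.join('"%s"' % c for c in cols)))
        insert = ('INSERT INTO %s VALUES (%s)'
                  % (table, ', '.join(['?'] * (len(cols) + 1))))
        # Prefix every row slice with its row key so the chunk tables
        # can be joined back together at query time.
        cur.executemany(insert,
                        [(k,) + tuple(r[i:i + chunk])
                         for k, r in enumerate(rows)])
    conn.commit()

# Usage with a hypothetical 2500-column frame: three tables are created
# (cond_0, cond_1, cond_2), each within the 999-parameter budget.
conn = sqlite3.connect(':memory:')
columns = ['c%d' % j for j in range(2500)]
rows = [list(range(k, k + 2500)) for k in range(3)]
write_frame_chunked(conn, 'cond', columns, rows)
```

Queries that need columns from several chunks can JOIN the chunk tables on `row_key`. It's more bookkeeping than a single table, but it avoids recompiling SQLite.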