Modifying SQLITE_MAX_VARIABLE_NUMBER #124

Open
C2H6O opened this issue Jan 10, 2020 · 6 comments

C2H6O commented Jan 10, 2020

Hello!

First, thanks for all the work being put into this library.

Problem: We keep bumping into the 999 default limit (https://www.sqlite.org/limits.html) and would like to increase it.

My question is about adding a SQLITE_MAX_VARIABLE_NUMBER flag to the SQLite compilation process (sqlite-android/sqlite-android/src/main/jni/sqlite/Android.mk). Currently I see no way this can be customized on the consumer side. For now I plan to create a fork that adds the flag for our purposes, but perhaps there's a way to make this more customizable?
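Concretely, the change I'd make in the fork is just a compile-time define in that Android.mk (a sketch, assuming it appends to LOCAL_CFLAGS like a typical ndk-build setup; 32768 is only the value we'd pick for ourselves, not a recommendation):

```makefile
# Sketch: raise the host-parameter limit when SQLite is compiled.
# 32768 is an assumed value for our use case, not a suggested default.
LOCAL_CFLAGS += -DSQLITE_MAX_VARIABLE_NUMBER=32768
```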

mikehardy (Contributor) commented Jan 13, 2020

The impact appears to be per-statement memory usage:

SQLite allocates space to hold all host parameters between 1 and the largest host parameter number used. Hence, an SQL statement that contains a host parameter like ?1000000000 would require gigabytes of storage. This could easily overwhelm the resources of the host machine. To prevent excessive memory allocations, the maximum value of a host parameter number is SQLITE_MAX_VARIABLE_NUMBER, which defaults to 999.

...so what's the per-variable-number impact (or the total impact of going from 999 to 32k)? Knowing that would tell us whether we're in territory where we might "overwhelm the resources of the host machine".

I'm just another user of the library, but if it isn't much memory this seems fine. Although I personally struggle to see a use for more than 999 named variables in a query, I gave up predicting API usage for others' projects. I assume there is no way to redesign the queries to stay below the limit?

C2H6O (Author) commented Jan 13, 2020


@mikehardy Thanks for the input!

If I'm not mistaken, that will depend on the type of the variable being used as a parameter. In our case, we use UUIDs, which get converted to Strings. That's 36 chars, so if my math is correct, the worst case would be 36 chars * 16 bits/char * 32768 items ≈ 18 Mbits, or roughly 2.25 MB. With the old 999 limit, that would correspond to 999/32768 * 2.25 ≈ 0.07 MB. Is this math correct?

Reworking the queries is possible, but it would require a lot of engineering time, plus ongoing effort to ensure that future CRUD operations that take a variable number of parameters follow the same pattern.
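For context, the rework we would need to apply everywhere looks roughly like this (just a sketch against the stock android.database.sqlite API; the table, column, and db/uuids names are made-up placeholders):

```kotlin
import android.database.sqlite.SQLiteDatabase

// Sketch: split a large IN (...) query into chunks of at most 999 bind arguments,
// so each statement stays under the default SQLITE_MAX_VARIABLE_NUMBER.
fun queryNamesByUuids(db: SQLiteDatabase, uuids: List<String>): List<String> {
    val names = mutableListOf<String>()
    for (chunk in uuids.chunked(999)) {
        val placeholders = chunk.joinToString(",") { "?" }
        db.rawQuery(
            "SELECT name FROM items WHERE uuid IN ($placeholders)",
            chunk.toTypedArray()
        ).use { cursor ->
            while (cursor.moveToNext()) names.add(cursor.getString(0))
        }
    }
    return names
}
```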

mikehardy (Contributor) commented:

I would never trust the math without somehow figuring out how to actually measure it while running. It might be that the variables for your app are 36-char strings, but the pre-allocation they talk about could be just an int (64 bits? plus some padding for structural things?) that stays at that size while unused. Meaning, if the slots were (for example) 128 bits each, your change is just from 999 * 128 bits ≈ 16 KB to 32768 * 128 bits = 512 KB. I have no idea if that matches reality at all though. The sqlite mailing lists might be able to solve this immediately?
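Something like this rough sketch is what I mean by measuring it at runtime (all of it is assumptions: an in-memory database, a build where the limit has already been raised, and that Debug.getNativeHeapAllocatedSize() captures SQLite's native allocations; the framework SQLiteDatabase API stands in for whichever implementation you test):

```kotlin
import android.database.sqlite.SQLiteDatabase
import android.os.Debug

// Rough sketch: compare native heap usage before and after preparing a statement
// that references a single very high-numbered parameter; per the SQLite docs,
// that forces allocation of slots for every parameter number up to it.
fun approximateParameterCost(highestParam: Int): Long {
    val db = SQLiteDatabase.create(null) // in-memory database
    return try {
        val before = Debug.getNativeHeapAllocatedSize()
        val stmt = db.compileStatement("SELECT ?$highestParam")
        val after = Debug.getNativeHeapAllocatedSize()
        stmt.close()
        after - before
    } finally {
        db.close()
    }
}
```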

C2H6O (Author) commented Jan 13, 2020

@mikehardy That's a good idea. I'll send an email to that mailing list.

C2H6O (Author) commented Jan 15, 2020

@mikehardy @npurushe Reading the mailing list archives (https://www.mail-archive.com/search?q=SQLITE_MAX_VARIABLE_NUMBER&l=sqlite-users%40mailinglists.sqlite.org) leads me to conclude that there's no memory penalty from simply bumping the limit to some higher value (https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg65410.html and related messages).

The only downside that I see is how the system behaves when a statement is executed with a large number of parameters.

In the case where the limit is X and the number of parameters is greater than X, the process throws a nice exception with a clear explanation of what happened.

In the case where the limit is raised and the system then runs out of memory, we'll probably get a vague OOM exception, and it won't be clear why those OOMs happen.

mikehardy (Contributor) commented:

Before I say anything, I just want to note that I view myself as a user, so this is free opinion, worth what you paid ;-)

Great links from the list; I take those as authoritative since they're from the sqlite author :-). So it's a lazy allocation, but it grows linearly up to the highest-numbered parameter you reference. Makes sense.

For low-level libraries like this one, I'm usually of the opinion that if the choice is between limiting functionality and letting users shoot their own foot off, I choose more power + possible foot pain. Perhaps there is a way to warn the API consumer if they create a statement with a really high variable count? Like a debug-only warning log if it is over XX (to be determined, but maybe around 10000 - 2 orders of magnitude higher and maybe around 9 MB based on the above link).

As an API user, I'd be fine with that anyway, and when I scanned the log I'd see that warning right above the OOM...
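To be concrete, the check I'm imagining is no more than this (purely hypothetical; the threshold, the function name, and where it would hook into the library are all to be determined):

```kotlin
import android.util.Log

// Hypothetical debug-only guard; 10_000 is just the ballpark mentioned above,
// not a measured threshold.
private const val WARN_BIND_ARG_COUNT = 10_000

fun warnIfManyBindArgs(sql: String, bindArgCount: Int, debugBuild: Boolean) {
    if (debugBuild && bindArgCount > WARN_BIND_ARG_COUNT) {
        Log.w(
            "SQLiteStatement",
            "Statement binds $bindArgCount variables; SQLite allocates a slot per " +
                "variable number, so very large counts can risk OOM: ${sql.take(80)}"
        )
    }
}
```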
